Commit Graph

13 Commits

Author SHA1 Message Date
CyberSecurityUP 761d3df444 feat: whitebox/greybox/repl accept a GitHub URL (auto-clone)
`whitebox <arg>`, `greybox --repo <arg>`, `tui --repo`, and the REPL `/repo`
now accept a git URL (https://github.com/owner/repo[.git], git@…, ssh://, *.git)
or an `owner/repo` shorthand. A new resolve_source() shallow-clones it into
<base>/repos/<name> (cached, .gitignored) and reviews it; existing local paths
are used unchanged. Works identically with API-key (--model) and --subscription.

Verified: `neurosploit whitebox https://github.com/digininja/DVWA --offline`
clones DVWA and runs the 78 code agents over 120KB of source.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 13:52:51 -03:00
CyberSecurityUP eb4e13efea v3.5.1: live findings + /finding + Ctrl+O/expand + 3-way /stop (soft validate) + report URL + structured Typst + IIS/CMS/CVE agents
REPL interactivity & findings:
- Live findings registered during a run: /results shows them accumulating;
  /finding opens a selection menu with FULL details (PoC, command, evidence,
  CVSS, OWASP/CWE, remediation). Past runs too.
- /expand (and Ctrl+O) dump the last full, untruncated commands.
- Findings colored by severity in the feed (not all-yellow); confirmed vote = green.

Stop & report:
- CRITICAL: /stop no longer kills validation. New SOFT stop (pool.soft) halts
  launching new agents but lets in-flight + VALIDATION finish — so confirmed
  findings are kept. /stop now asks 3 ways: [1] validate then report,
  [2] report raw (no validation), [3] discard.
- Report file:// URL printed on completion/stop.

Report:
- Typst report restructured: executive summary, a Vulnerability Summary TABLE
  (#, vuln, severity, CVSS, OWASP/CWE), and per-finding sections with criticality,
  CVSS, OWASP/CWE, description/impact, PoC, evidence, remediation. owasp passed through.

Agents: +14 app-stack/CVE (IIS tilde/WebDAV/ViewState/debug/handler-bypass,
CMS fingerprint + WordPress/Joomla/Drupal/default-admin, app-server consoles,
exposed VCS, known-CVE & outdated-component exploitation) → 343 total.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 23:21:43 -03:00
CyberSecurityUP e8df48af9e v3.5.0: orchestration chaining + rich REPL (rustyline, model arrow-select, persistent history) + model-aware /key
Harness:
- Exploit-chaining round: after validation, chain confirmed findings into deeper
  impact (SSRF→metadata, SQLi→dump→reuse, IDOR→ATO, file-read→secrets→RCE),
  validate the new findings, merge. Wired into black-box and greybox.
- Latest top models surfaced: claude-opus-4-8, gpt-5.1/gpt-5.1-codex, gemini-3-pro.

REPL:
- Real line editing via rustyline: ↑/↓ command-history recall, Ctrl-A/E/K, paste;
  Ctrl-C cancels the line, Ctrl-D exits. Command history persists to
  data/repl_history.txt. Graceful plain-stdin fallback when not a TTY.
- /model with no arg → arrow-key multi-select (dialoguer); with arg accepts any
  provider:model names.
- /key is model-aware: lists the providers your selected models need (set/missing)
  and prompts for the missing keys; /key <prov> <key> still works.
- Run history persists to data/repl_runs.json and reloads across sessions
  (/runs lists past + current; /results /report /status by run number).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 20:33:13 -03:00
CyberSecurityUP 3ca3f269ee v3.4.x: intelligent agent selection, whitebox, recon/code agents, Gemini, artifacts, RL, XBOW GUI
Harness intelligence:
- After recon, the model SELECTS which specialist agents match the target
  (select_agents) — runs the relevant subset, not blindly top-N
- RL reward store (rl.rs): per-agent weights persist to data/rl_state_rs.json,
  reward validated findings (severity-weighted), decay idle, bias next run
- Run artifacts persisted as JSON + MD (recon, exploitation transcript,
  findings, html report) under runs/<target>-<ts>/ for reuse by other AIs

Whitebox mode:
- run_whitebox: walks a repo, builds bounded source context, runs code agents,
  validates by adversarial vote. CLI `whitebox <path>` + web "White-box" mode

Agents: +12 recon (subdomain/tech/js/api/secrets/dns/content/param/waf/cloud/
graphql/osint) and +24 code SAST reviewers (sqli/cmdi/path/ssrf/xss/deser/
secrets/crypto/authz/idor/xxe/redirect/ssti/race/eval/csrf/random/logging/
upload/mass-assign/jwt/cors). Loader gains recon/ + code/ categories → 249 total

Models: +Google Gemini provider (API + gemini CLI subscription); installed_cli_
backends now detects gemini; chat_cli handles gemini/codex/grok + optional
Playwright MCP (.mcp.json) on the subscription path with autonomy flags

GUI: full XBOW-style redesign — sidebar (Operate/Library), topbar status, mode
segment (black-box/white-box), model panel, live console, severity cards,
agent browser with category filters, models view; responsive + aligned

Verified: cargo build --release clean; CLI agents/whitebox; LIVE subscription
run shows model selecting 23→4 agents, RL update, artifacts written; GUI +
white-box toggle in Playwright.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-23 11:39:56 -03:00
CyberSecurityUP 9c4f912323 chore: stop tracking generated report_rs.html
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 21:33:42 -03:00
CyberSecurityUP 56d3f0c723 NeuroSploit v3.4.0 — Rust multi-model harness + Axum dashboard
New cargo workspace `neurosploit-rs/` (single `neurosploit` binary):

harness crate:
- models.rs: 11 OpenAI-compatible providers / 31 models (Claude, GPT, Grok,
  NVIDIA NIM, DeepSeek, Mistral, Qwen, Groq, Together, OpenRouter, Ollama)
- pool.rs: ModelPool with bounded concurrency, provider failover, and N-model
  validator voting (the panel doubles as the jury)
- agents.rs: loads the existing agents_md/ library (213 agents)
- pipeline.rs: recon → parallel exploit (semaphore-bounded) → N-model
  adversarial vote → score; streams live progress over a channel
- report.rs: HTML report
- tokio + reqwest(rustls); offline mode runs the pipeline without API keys

app binary:
- clap CLI: serve | run | agents | models  (run supports --model x N, --vote-n,
  --max-agents, --offline)
- axum web dashboard with multi-model panel, live console, findings, agent
  browser, embedded report; single binary serves the SPA (no npm/build)

Verified: cargo build clean; agents/models/offline-run CLI; server endpoints
(/api/info, /api/run lifecycle, /report); dashboard + live run in Playwright.

Docs: README v3.4.0 callout + RELEASE.md notes. target/ gitignored.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 19:58:43 -03:00
CyberSecurityUP a5badefc29 v3.3.0 GUI dashboard + reports + model expansion + root fix
Engine:
- Fix: inject IS_SANDBOX=1 so Claude Code's --dangerously-skip-permissions
  works under root (real backend runs were exiting rc=1 immediately)
- models: expand to 40 models / 13 providers, tagged CLI vs API
  (NVIDIA NIM, DeepSeek, Mistral, Qwen/DashScope, Groq, Together, OpenRouter,
  Ollama, Gemini) — Qwen/DeepSeek/Llama usable via API
- backends: on_start callback surfaces the exact argv ("what runs behind it")
- orchestrator: require a Playwright screenshot per confirmed finding; collect
  results/activity.json; auto-generate reports after a run
- report.py: HTML always + PDF via Typst engine (.typ source emitted too)

Web dashboard (webgui/, stdlib only — no npm/build):
- Sidebar dashboard (PentAGI-style): Run / Agents / Insights / Reports / Settings
- Multi-target runs; live execution console + per-task activity; finding cards
  with screenshots; backend+provider+model pickers (CLI & API)
- Agents tab: browse 213 + add new .md agents from the UI
- Insights: interactive RL-weight + severity charts
- Reports: download/preview PDF + HTML
- Settings/API: execution mode, per-provider API keys, orchestrator, verbosity
- Endpoints: /api/agents (GET/POST), /api/rl, /api/config, /api/reports,
  /reports/* + /shots/* static serving

Cleanup: retire replaced web stack (frontend React, FastAPI backend, core
orchestration, old test) to legacy/. Active engine + GUI are fully standalone.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 23:26:11 -03:00
CyberSecurityUP 22a7302a35 Add minimalist web GUI for the v3.3.0 engine
Zero-dependency (stdlib http.server) front-end exposing only the essential
options — URL, backend, model, collaborator, RL + Playwright-MCP toggles — with
a live progress console. Calls neurosploit_agent directly; no npm/build.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 22:33:12 -03:00
CyberSecurityUP 55af0d4634 NeuroSploit v3.3.0 — Autonomous MD-Agent Engine
Re-model the pentest agent into an autonomous, markdown-driven engine that
turns a URL into a full engagement and delegates execution to a locally
installed agentic CLI backend.

Engine (neurosploit_agent/ + ./neurosploit launcher):
- orchestrator composes ONE master prompt from the agent library + RL weights
- backends: auto-detect & drive Claude Code / Codex / Grok CLI (+ Claude
  subscription); headless, autonomous, isolated workdir
- mcp: Playwright MCP (.mcp.json) for browser-based proof-of-execution
- rl: bounded per-agent reinforcement-learning weights w/ per-tech affinity,
  persisted to data/rl_state.json
- models: latest registry incl. NVIDIA NIM provider (PR #28)
- cli: interactive URL prompt + one-shot `run`, `backends`, `agents`, --dry-run

Agent library (agents_md/, 213 total):
- 196 vuln specialists incl. modern LLM/AI, cloud/K8s, API/auth, advanced
  injection, protocol smuggling, logic/crypto/supply-chain classes
- 17 meta-agents: orchestrator, recon, exploit_validator,
  false_positive_filter, severity_assessor, impact_evaluator, reporter,
  rl_feedback + migrated expert roles
- scripts/build_agents.py data-driven builder; REGISTRY.md index

Docs: rewritten README.md, v3.3.0 RELEASE.md, .env.example (NVIDIA NIM, xAI,
engine vars).

Retire legacy Python orchestration (neurosploit.py + agent classes) to legacy/.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 20:57:38 -03:00
CyberSecurityUP 59f8f42d80 NeuroSploit v3.2.4 - MD Agent Orchestrator Overhaul + Claude 4.6 + SmartRouter Failover
- MD Agent system restructured: real HTTP exploitation, retry with exponential backoff, reduced concurrency (2 parallel, 2s stagger)
- Claude 4.6 model support (Opus/Sonnet) with corrected API version headers
- SmartRouter true failover with provider preference cascade
- WAFResult attribute error fix in autonomous_agent.py
- CVSS data sanitization for all vulnerability database saves
- AI recon JSON parsing robustness improvements
- rebuild.sh simplified from 714 to 196 lines
- Frontend: removed unused routes, simplified Auto Pentest page
- Agent grid: reduced max tests per agent (8→5), condensed recon prompts

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 20:25:01 -03:00
CyberSecurityUP 7563260b2b NeuroSploit v3.2.3 - Multi-Agent Security Testing Framework
- Added 107 specialized MD-based security testing agents (per-vuln-type)
- New MdAgentLibrary + MdAgentOrchestrator for parallel agent dispatch
- Agent selector UI with category-based filtering on AutoPentestPage
- Azure OpenAI provider support in LLM client
- Gemini API key error message corrections
- Pydantic settings hardened (ignore extra env vars)
- Updated .gitignore for runtime data artifacts

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-16 18:59:22 -03:00
CyberSecurityUP 79acfe04a3 NeuroSploit v3.2.1 - AI-Everywhere Auto Pentest + Container Fix + Deep Recon Overhaul
## AI-Everywhere Auto Pentest
- Pre-stream AI master planning (_ai_master_plan) runs before parallel streams
- Stream 1 AI recon analysis (Phase 9: hidden endpoint probing, priority routing)
- Stream 2 AI payload generation (replaces hardcoded payloads with context-aware AI)
- Stream 3 AI tool output analysis (real findings vs noise classification)
- 4 new prompt builders in ai_prompts.py (master_plan, junior_ai_test, tool_analysis, recon_analysis)

## LLM-as-VulnEngine: AI Deep Testing
- New _ai_deep_test() iterative loop: OBSERVE→PLAN→EXECUTE→ANALYZE→ADAPT (3 iterations max)
- AI-first for top 15 injection types, hardcoded fallback for rest
- Per-endpoint AI testing in Phase C instead of single _ai_dynamic_test()
- New system prompt context: deep_testing + iterative_testing
- Token budget adaptive: 15 normal, 5 when <50k tokens remain

## Container Fix (Critical)
- Fixed ENTRYPOINT ["/bin/bash", "-c"] → CMD ["bash"] in Dockerfile.kali
- Root cause: Docker ran /bin/bash -c "sleep" "infinity" → missing operand → container exit
- All Kali sandbox tools (nuclei, naabu, etc.) now start and execute correctly

## Deep Recon Overhaul
- JS analysis: 10→30 files, 11 regex patterns, source map parsing, parameter extraction
- Sitemaps: recursive index following (depth 3), 8 candidates, 500 URL cap
- API discovery: 7→20 Swagger/OpenAPI paths, 1→6 GraphQL paths, request body schema extraction
- Framework detection: 9 frameworks (WordPress, Laravel, Django, Spring, Express, ASP.NET, Rails, Next.js, Flask)
- 40+ common hidden/sensitive paths checked (.env, .git, /actuator, /debug, etc.)
- API pattern fuzzing: infers endpoints from discovered patterns, batch existence checks
- HTTP method discovery via OPTIONS probing
- URL normalization and deduplication

## Frontend Fixes
- Elapsed time now works for completed scans (computed from started_at→completed_at)
- Container telemetry: exit -1 shows "ERR" (yellow), duration shows "N/A" on failure
- HTML report rewrite: professional pentest report with cover page, risk gauge, ToC, per-finding cards, print CSS

## Other
- Updated rebuild.sh summary and validation
- Bug bounty training datasets added

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 17:55:28 -03:00
CyberSecurityUP e0935793c5 NeuroSploit v3.2 - Autonomous AI Penetration Testing Platform
116 modules | 100 vuln types | 18 API routes | 18 frontend pages

Major features:
- VulnEngine: 100 vuln types, 526+ payloads, 12 testers, anti-hallucination prompts
- Autonomous Agent: 3-stream auto pentest, multi-session (5 concurrent), pause/resume/stop
- CLI Agent: Claude Code / Gemini CLI / Codex CLI inside Kali containers
- Validation Pipeline: negative controls, proof of execution, confidence scoring, judge
- AI Reasoning: ReACT engine, token budget, endpoint classifier, CVE hunter, deep recon
- Multi-Agent: 5 specialists + orchestrator + researcher AI + vuln type agents
- RAG System: BM25/TF-IDF/ChromaDB vectorstore, few-shot, reasoning templates
- Smart Router: 20 providers (8 CLI OAuth + 12 API), tier failover, token refresh
- Kali Sandbox: container-per-scan, 56 tools, VPN support, on-demand install
- Full IA Testing: methodology-driven comprehensive pentest sessions
- Notifications: Discord, Telegram, WhatsApp/Twilio multi-channel alerts
- Frontend: React/TypeScript with 18 pages, real-time WebSocket updates
2026-02-22 17:59:28 -03:00