116 Commits

Author SHA1 Message Date
CyberSecurityUP 669ab44cef ci: cross-build macOS x64 on Apple-Silicon runner (avoid scarce macos-13)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-27 09:53:30 -03:00
CyberSecurityUP 64decada3e v3.5.3 — Integrations (GitHub · GitLab · Jira)
New harness module `integrations` (+ app commands) wiring NeuroSploit into the
SDLC. Config persists per-project to .neurosploit/integrations.json; secrets are
NEVER stored — only the env-var name is saved, values read from the environment.
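
Illustrative sketch of the "only the env-var name is saved" rule above. This is not the project's code: `SecretRef`, `resolve`, and `resolve_with` are hypothetical names, and the lookup closure is added here only so the behaviour can be exercised without touching the real environment.

```rust
use std::env;

/// Hypothetical config entry: persists only the NAME of an environment
/// variable, never the secret value itself.
#[derive(Debug, Clone, PartialEq)]
pub struct SecretRef {
    pub env_var: String,
}

impl SecretRef {
    /// Resolve the secret through an arbitrary lookup function.
    /// Unset or empty values are treated as "missing".
    pub fn resolve_with<F>(&self, lookup: F) -> Option<String>
    where
        F: Fn(&str) -> Option<String>,
    {
        lookup(&self.env_var).filter(|v| !v.is_empty())
    }

    /// Resolve the secret from the process environment at use time.
    pub fn resolve(&self) -> Option<String> {
        self.resolve_with(|name| env::var(name).ok())
    }
}
```

With this shape, a file such as .neurosploit/integrations.json would only ever need to hold the variable name; the value exists in memory only for the duration of a call.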

GitHub:
- private-repo clone (token injected into the clone URL for whitebox/greybox/tui)
- `neurosploit pr <owner/repo> <n>`: clone the PR head (refs/pull/N/head),
  white-box review, optional `--comment` (PR summary) and `--jira` (cards)
- `neurosploit watch <owner/repo> --branch --interval`: re-review on each new commit
GitLab:
- private-repo clone (oauth2 token) for whitebox/greybox (gitlab.com or self-hosted)
Jira:
- `--jira` on any engagement opens one card per finding (REST /issue, basic auth)

Control:
- `/integrations` (REPL): show · enable/disable · setup jira|gitlab|github
- `neurosploit integrations [show|enable|disable] [github|gitlab|jira]` (CLI)

Docs: README "Integrations" section + new TUTORIAL-INTEGRATION.md (per-tool setup,
scopes, recipes, troubleshooting). Version bumped 3.5.2 → 3.5.3.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
v3.5.3
2026-06-27 01:56:49 -03:00
CyberSecurityUP ae5bb247a3 ci: cross-platform release builds (linux x64/arm64, macos x64/arm64, windows)
GitHub Actions workflow that, on a pushed v* tag (or manual dispatch), builds a
self-contained NeuroSploit (binary + agents_md/) for every OS/arch and uploads
the archives to the matching release. macOS builds are also attached manually.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
v3.5.2
2026-06-26 14:25:22 -03:00
CyberSecurityUP d957429c09 feat(models): add Azure OpenAI provider + GOOGLE_API_KEY alias for Gemini
Resolves the only two open issues that still apply to the Rust build:
- #21 Azure OpenAI: new `azure` provider (OpenAI-compatible). Endpoint comes
  from AZURE_OPENAI_ENDPOINT, api-version from AZURE_OPENAI_API_VERSION
  (default 2024-10-21); the model name is the Azure deployment; auth uses the
  `api-key` header instead of Bearer. Use `--model azure:<deployment>`.
- #25 Gemini key confusion: GEMINI_API_KEY now also accepts GOOGLE_API_KEY
  (Google's standard env var) as an alias; local providers (ollama/litellm)
  require no key. .env.example documents both.

Kept under the v3.5.2 line (additive provider support).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 14:17:25 -03:00
CyberSecurityUP 761d3df444 feat: whitebox/greybox/repl accept a GitHub URL (auto-clone)
`whitebox <arg>`, `greybox --repo <arg>`, `tui --repo`, and the REPL `/repo`
now accept a git URL (https://github.com/owner/repo[.git], git@…, ssh://, *.git)
or an `owner/repo` shorthand. A new resolve_source() shallow-clones it into
<base>/repos/<name> (cached, .gitignored) and reviews it; existing local paths
are used unchanged. Works identically with API-key (--model) and --subscription.

Verified: `neurosploit whitebox https://github.com/digininja/DVWA --offline`
clones DVWA and runs the 78 code agents over 120KB of source.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 13:52:51 -03:00
Joas A Santos 489b3abd3f Merge pull request #31 from leebaird/main
Revise DNS reconnaissance methodology details.
2026-06-26 13:39:28 -03:00
CyberSecurityUP e4efa9bbb0 v3.5.2 — Exploitation Depth & Report Hygiene
Distilled from reviewing real AI-pentest output that kept stopping at "exposed"
instead of "exploited". Pure-additive, back-compatible.

Behavior (injected into black/grey/chain exploit prompts via DEPTH_DOCTRINE):
- Exposed → exploited: any info-disclosure / exposed service/WSDL / leaked
  credential|token / reachable dev host MUST be used before it's a finding;
  otherwise it's a lead, not a confirmed High/Critical.
- Chain across modules: reuse obtained session/JWT/cookie/credential and pivot
  to IDOR/privesc/exfil; report the chain, not isolated parts.
- Decode & fingerprint → CVE; audit tokens (alg-confusion/none/kid/JWKS, weak
  HS256 secret cracking, lifecycle).

Deterministic post-pass (new crates/harness/src/hygiene.rs, wired into finish()):
- calibrate severity to PROVEN impact — unproven High/Critical (hedged, no
  payload, thin evidence) capped to Medium and re-titled "(potential)";
- depth_audit — flag exposures on a host with no real exploit;
- hygiene_summary — advise consolidating hygiene classes repeated across assets.
Unit tests cover calibration + depth audit.
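
Illustrative sketch of the severity calibration described above, not the contents of hygiene.rs. The `Finding` fields, the `is_proven` heuristic, and the `MIN_EVIDENCE_LEN` threshold are assumptions; the detection of hedged wording mentioned in the list is omitted here.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Severity {
    Info,
    Low,
    Medium,
    High,
    Critical,
}

#[derive(Debug, Clone)]
pub struct Finding {
    pub title: String,
    pub severity: Severity,
    /// The exploit input that was actually sent, if any.
    pub payload: Option<String>,
    pub evidence: String,
}

/// Assumed threshold below which evidence counts as "thin".
const MIN_EVIDENCE_LEN: usize = 40;

/// Assumed proof heuristic: a non-empty payload plus non-thin evidence.
fn is_proven(f: &Finding) -> bool {
    let has_payload = f.payload.as_deref().map_or(false, |p| !p.trim().is_empty());
    has_payload && f.evidence.trim().len() >= MIN_EVIDENCE_LEN
}

/// Cap unproven High/Critical findings to Medium and mark the title.
pub fn calibrate(mut f: Finding) -> Finding {
    if f.severity >= Severity::High && !is_proven(&f) {
        f.severity = Severity::Medium;
        if !f.title.ends_with("(potential)") {
            f.title.push_str(" (potential)");
        }
    }
    f
}
```

Because the pass depends only on the finding's own fields, it is deterministic and can run after all model calls have finished.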

5 new doctrine meta-agents (scripts/build_methodology_v352.py → agents_md/meta/):
exploit_depth_doctrine, finding_chainer, artifact_decoder, token_auditor,
report_calibrator (meta 17→22, total 343→348).

Version bumped 3.5.1 → 3.5.2 across crates/app/installers/docs; RELEASE/README
updated.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 11:31:11 -03:00
Lee Baird 9d261b45e7 Revise DNS reconnaissance methodology details.
Updated methodology section for DNS reconnaissance.
2026-06-25 20:22:45 -05:00
CyberSecurityUP ac84db024c docs: add v3.5.1 release notes to RELEASE.md
Prepend the 3.5.x entry: interactive REPL, POMDP belief/grounding, infra/host
(SSH + Windows/AD), attack-chain & app-stack/CVE agents, LiteLLM, Mission-Control
TUI, structured Typst report, and the new run control (background /run, 3-way
/stop, crash recovery, pause-on-quota /continue).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
v3.5.1
2026-06-25 09:28:16 -03:00
CyberSecurityUP 734af8d839 chore: stop tracking per-project .neurosploit/ test state
These session/runs/history files are runtime state generated during local
testing; .neurosploit/ is already in .gitignore. Untrack them so the repo
doesn't carry test artifacts.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 09:26:09 -03:00
CyberSecurityUP 49dde7c637 feat(repl): pause-on-exhaustion + live findings checkpoint + instant stop
Token/quota exhaustion no longer silently drops agents. When every candidate
model is rate-limited / out of quota, the run PARKS (keeping all state) and
prints "⏸ token/quota exhausted … PAUSED". The user can:
  - wait for renewal and /continue (retry same model), or
  - /model <provider:model> (or the /model selector) then /continue to switch.
Implemented via ModelPool: is_exhaustion() detection, park_exhausted() that
awaits a resume Notify, and a fallback-model slot tried first on retry. /model
queues the chosen models into a paused run's fallback so a plain /continue
resumes on them.

Findings now survive a crash/quit: each finding is checkpointed live to
.neurosploit/active_run.json; on next launch an interrupted run is recovered
into /runs (a raw report is materialized) so /results, /finding and /report
keep working.

/stop now actually halts immediately on raw/discard: one() races the in-flight
model call against the hard-cancel flag, so the CLI child (kill_on_drop) is
terminated at once instead of finishing its whole command sequence. The
validate path still soft-stops (lets validation run).

Docs: TUTORIAL documents the 3-way /stop, crash recovery and pause/continue;
/help lists /continue and the new behaviors.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 00:41:22 -03:00
CyberSecurityUP 7dba912d3f chore: slim .env.example to the v3.5.1 Rust providers
Drop the legacy Python-stack settings (DATABASE_URL, HOST/PORT, RAG,
Kali sandbox, Discord/Telegram/Twilio, feature flags) that no longer
exist in the Rust harness. Keep only the provider API-key env vars the
model pool actually reads, plus the Ollama/LiteLLM base-URL overrides.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 23:59:25 -03:00
CyberSecurityUP 79f20b1456 docs: detailed white-box & grey-box instructions (TUTORIAL + README + /help)
- TUTORIAL 5.2 white-box: how source review works (context collection, agent
  selection, source→sink dataflow, file:line symbolic grounding, validation),
  examples and tips.
- TUTORIAL 5.3 grey-box: code review leads → live exploitation flow, auth via
  creds.yaml, MCP, REPL repo+target = greybox.
- README quick-start gains white-box / grey-box / host one-liners + tutorial link.
- REPL /help shows the MODES line (black/white/grey/host) and Ctrl-O hint.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 23:26:57 -03:00
CyberSecurityUP c69546c145 v3.5.1: LiteLLM support (OpenAI-compatible proxy)
- New `litellm` provider (kind=api). Use `litellm:<model>` — model names pass
  through to your gateway. No hardcoded key required (proxy may be open).
- Env-configurable base URL: LITELLM_BASE_URL (default http://localhost:4000/v1),
  LITELLM_API_KEY. OLLAMA_BASE_URL override added too.
- TUTORIAL documents the LiteLLM env config.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 23:24:16 -03:00
CyberSecurityUP eb4e13efea v3.5.1: live findings + /finding + Ctrl+O/expand + 3-way /stop (soft validate) + report URL + structured Typst + IIS/CMS/CVE agents
REPL interactivity & findings:
- Live findings registered during a run: /results shows them accumulating;
  /finding opens a selection menu with FULL details (PoC, command, evidence,
  CVSS, OWASP/CWE, remediation). Past runs too.
- /expand (and Ctrl+O) dump the last full, untruncated commands.
- Findings colored by severity in the feed (not all-yellow); confirmed vote = green.

Stop & report:
- CRITICAL: /stop no longer kills validation. New SOFT stop (pool.soft) halts
  launching new agents but lets in-flight + VALIDATION finish — so confirmed
  findings are kept. /stop now asks 3 ways: [1] validate then report,
  [2] report raw (no validation), [3] discard.
- Report file:// URL printed on completion/stop.

Report:
- Typst report restructured: executive summary, a Vulnerability Summary TABLE
  (#, vuln, severity, CVSS, OWASP/CWE), and per-finding sections with criticality,
  CVSS, OWASP/CWE, description/impact, PoC, evidence, remediation. owasp passed through.

Agents: +14 app-stack/CVE (IIS tilde/WebDAV/ViewState/debug/handler-bypass,
CMS fingerprint + WordPress/Joomla/Drupal/default-admin, app-server consoles,
exposed VCS, known-CVE & outdated-component exploitation) → 343 total.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 23:21:43 -03:00
CyberSecurityUP df73c0e134 v3.5.1 fix: critical char-boundary panic (was dropping findings) + background runs, progress bar, severity colors, /help
CRITICAL BUG: truncate()/source-context slices cut strings by BYTE, panicking on
a multibyte char (e.g. '—'). The panic crashed agent tasks → task.await returned
JoinError → unwrap_or_default() → empty RunOutput. Result: real confirmed findings
(win.ini traversal, HTML injection) were silently lost, workdir was empty, report
missing. Now all string truncation is char-safe (models.rs, pipeline.rs, repl.rs).
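
Illustrative sketch of char-safe truncation as described above. The function names are hypothetical and this is not the code from models.rs, pipeline.rs, or repl.rs; it only shows two standard-library ways to avoid slicing inside a multibyte character.

```rust
/// Truncate to at most `max_chars` characters. The cut always falls on a
/// character boundary because the byte index comes from `char_indices`.
pub fn truncate_chars(s: &str, max_chars: usize) -> &str {
    match s.char_indices().nth(max_chars) {
        Some((byte_idx, _)) => &s[..byte_idx],
        None => s, // the string is already short enough
    }
}

/// Truncate to at most `max_bytes` bytes, backing off to the nearest
/// preceding character boundary instead of panicking.
pub fn truncate_bytes(s: &str, max_bytes: usize) -> &str {
    if s.len() <= max_bytes {
        return s;
    }
    let mut end = max_bytes;
    // Index 0 is always a boundary, so this loop terminates.
    while !s.is_char_boundary(end) {
        end -= 1;
    }
    &s[..end]
}
```

A plain `&s[..n]` panics when `n` lands inside a character such as U+2014, which occupies three bytes in UTF-8; both helpers above return a shorter, valid slice instead.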

Also:
- Background runs: /run now runs in the BACKGROUND via rustyline's ExternalPrinter
  — the REPL keeps accepting commands while the engagement streams live. New
  /status (live phase + progress bar + findings) and /stop (graceful). Findings
  persist to history + report on completion (finalize_run ensures workdir is set
  even on abort, fixing "no report file in ").
- Progress bar: agents-done/total with %, shown in /status.
- Severity colors in the live feed (Critical=red…Info=grey); confirmed vote = green.
- /help reformatted into clear aligned sections.
- TUTORIAL: document non-blocking runs, /status progress, /stop, colors.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 23:04:50 -03:00
CyberSecurityUP ab0161ee53 v3.5.1 fix: view inserted config + clear REPL run boundaries
- BUG: /auth (and /creds /focus /target /repo) with no argument CLEARED the value
  instead of showing it — so typing /auth to view wiped your credential. Now no-arg
  prints the current value; clear only with an explicit `clear`.
- /show now also displays API-key status (set/missing) for the selected models'
  providers, and a hint of which commands edit config.
- REPL /run prints a clear "▶ RUNNING (prompt returns when done; use tui for live)"
  banner before and "◀ back to the NeuroSploit REPL" after, so it's obvious the
  REPL didn't disappear during a run.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 22:45:39 -03:00
CyberSecurityUP 16e45eb0a3 v3.5.1: robust README + detailed TUTORIAL.md + cross-platform install (Linux/macOS/Windows · x64/arm64)
- README rewritten: engagement-modes table, highlights, supported-platforms
  matrix, agents 329, links to the tutorial.
- TUTORIAL.md: full user guide — concepts, install, auth (API/subscription),
  models, all modes (black/white/grey/host), REPL, TUI, creds.yaml, steering,
  outputs/reports, per-project memory, POMDP/grounding/chaining, agent library,
  MCP, troubleshooting, command/flag reference.
- setup.sh: detect OS (Linux/macOS/Windows) + arch (x64/arm64); v3.5.1 banner.
- install.ps1: native Windows PowerShell one-liner (winget/rustup, build, PATH).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 22:39:10 -03:00
CyberSecurityUP 3f78a2b686 v3.5.1: REPL quick-wins — @ list-completion menu, /diff (what-changed), /retest
- Claude-Code-style @ menu: rustyline CompletionType::List so @path shows a
  file/folder selection list (Tab), not inline cycling.
- /diff (/changed): shows new (+) / gone (-) findings between the last two runs.
- /retest [n]: loads a past run's target/repo and seeds a re-verify focus on its
  findings → /run to check if they're fixed.
- Both added to Tab-complete and /help.
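
Illustrative sketch of the /diff idea above (new versus gone findings between two runs). It assumes findings can be compared by a string key such as a title; `diff_runs` is a hypothetical name and the real comparison key is not stated in this log.

```rust
use std::collections::BTreeSet;

/// Compare two runs by finding key and return (`+` lines, `-` lines).
/// Output is sorted because `BTreeSet` iterates in key order.
pub fn diff_runs(previous: &[&str], latest: &[&str]) -> (Vec<String>, Vec<String>) {
    let prev: BTreeSet<&str> = previous.iter().copied().collect();
    let last: BTreeSet<&str> = latest.iter().copied().collect();

    // Present in the latest run only: newly found.
    let added = last.difference(&prev).map(|s| format!("+ {}", s)).collect();
    // Present in the previous run only: gone (possibly fixed).
    let gone = prev.difference(&last).map(|s| format!("- {}", s)).collect();

    (added, gone)
}
```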

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 22:32:26 -03:00
CyberSecurityUP 639c2209f7 v3.5.1: attack-chain agents (12) + per-project .neurosploit/ persistence & resume
Chaining:
- agents_md/chains/ (12 multi-stage exploitation playbooks): SQLi→RCE→LPE,
  SSRF→AWS-creds, SSRF→RCE, upload→RCE, upload→LFI→RCE→LPE, XSS→ATO, IDOR→ATO,
  SSTI→RCE→cloud, default-creds→domain, deserialization→RCE, exposed-git→RCE,
  subdomain-takeover→trusted-abuse. Each stage proven by a tool receipt before
  advancing; reports chains_from edges.
- Loaded as a `chains` category (→ 329 agents). chain_round now injects the chain
  recipes as a menu so the LLM applies proven multi-stage paths.

Persistence (no DB — structured state):
- Per-project `<cwd>/.neurosploit/` holding session.json (config), runs.json
  (history), history.txt (readline). REPL resumes target/repo/auth/focus/models
  on reopen; saves on /run and /quit.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 22:30:22 -03:00
CyberSecurityUP f8d70ce9c5 v3.5.1: infra/host engagements — IP + SSH/Windows-AD creds + Linux/Win/AD agents + REPL context bar
Infra:
- creds.yaml gains `ssh:` (host/port/user/password/key) and `windows:`/`ad:`
  (host/user/password/domain/ntlm-hash) blocks; multi-block YAML parser.
  host_instruction() tells agents how to authenticate to the host.
- 14 infra agents (agents_md/infra/): port/service scan, SMB enum, Linux privesc/
  sudo/cron/SSH, Windows privesc/SMB-signing/WinRM, AD kerberoast/asreproast/ACL/
  DCSync/default-creds. Loader gains `infra` category → 317 agents total.
- run_host pipeline + `neurosploit host <ip> --creds creds.yaml` (and Mode::Host
  in run_mode/TUI): host recon (nmap/netexec) → infra agent selection → test →
  validate → chain → report, with host tooling doctrine + supplied creds.

REPL:
- Context/status bar above the prompt: "model auth · cwd · mode▸target"
  (e.g. claude-opus-4-8 sub · /opt/projeto · black-box▸app.acme.com).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 22:17:14 -03:00
CyberSecurityUP 969af20a8e v3.5.1: Mission Control TUI (ratatui) — concurrent panels + composer active during run
- `neurosploit tui <url> [--repo ..] [--model ..] [--subscription] [--mcp] [--focus ..]`
- Concurrent ratatui UI driven by the engagement's live event stream:
  * fixed status header: target · mode · model · phase · elapsed · token/cost · findings · ⏸
  * live activity feed (color-coded: commands, recon, findings, errors)
  * live Findings panel (severity-styled) and a Targets map (hosts → state)
  * composer input that stays active WHILE the runner streams — local, non-blocking
    answers: `summary`/`what` (partial summary), `pause` (graceful stop), `errors`
    (filter), `clear`, or free-text notes.
- Engagement runs as a tokio task; UI drains an mpsc channel each ~120ms tick.
  Esc/Ctrl-C requests a graceful stop; report is generated on exit (status stopped/complete).
- Terminal setup before task spawn → clean error on non-TTY, no detached run.
- README documents the TUI mode.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 21:52:53 -03:00
CyberSecurityUP 78653e45cd v3.5.1: live findings feed + 🔔 notifications + automatic partial summary
- Live findings feed: each candidate is surfaced (✦ possible finding [sev] title
  @ endpoint) the moment an agent returns it, not only at the end.
- 🔔 notifications in the feed: evidence saved, phase complete (with severity
  breakdown = automatic partial summary). Renderer styles notify/finding tags.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 21:43:24 -03:00
CyberSecurityUP a8676fee0a v3.5.1: POMDP belief-state + value-of-information planner + grounded anti-hallucination
Partial observability is now first-class:

- belief.rs — property-graph world model; nodes (host/service/vuln/exploit/cred)
  carry a probability, not a boolean. Bayesian observation updates; per-node
  Shannon entropy; mean-uncertainty + recon-frontier. Black-box = diffuse priors
  that sharpen with observation; white-box collapses toward deterministic (MDP).
- pomdp.rs — value_of_information(), decide() (recon vs exploit falls out of
  belief entropy), and may_assert() — the mathematical anti-hallucination gate:
  no exploitability claim while the belief is diffuse (high entropy) → observe first.
- grounding.rs — verification engine, hard rule "no claim without a tool receipt":
  empirical grounding for black-box (raw HTTP/OOB/error markers), symbolic for
  white-box (file:line into reviewed source). Ungrounded claims demoted + flagged
  receipt_missing (feeds future reward shaping).
- pipeline.finish(): grounding gate before reporting + belief-uncertainty readout.
- bump 3.5.0 → 3.5.1; README documents the v3.5.1 belief/grounding architecture
  and the infra/bandit/reward roadmap.
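
Illustrative sketch of the belief-entropy gate described above, not the contents of belief.rs or pomdp.rs. It assumes each node holds a single probability of being true; the signatures, the `p > 0.5` condition, and the entropy threshold parameter are assumptions, and only the name `may_assert` comes from the commit message.

```rust
/// Shannon entropy, in bits, of a yes/no belief held with probability `p`.
/// 0.0 means certain; 1.0 means maximally uncertain (p = 0.5).
pub fn binary_entropy(p: f64) -> f64 {
    let p = p.clamp(0.0, 1.0);
    // By convention 0 * log2(0) is taken as 0.
    let term = |x: f64| if x > 0.0 { -x * x.log2() } else { 0.0 };
    term(p) + term(1.0 - p)
}

/// Bayesian update of P(hypothesis) after one observation.
/// `l_true`  = P(observation | hypothesis true)
/// `l_false` = P(observation | hypothesis false)
pub fn bayes_update(prior: f64, l_true: f64, l_false: f64) -> f64 {
    let numerator = l_true * prior;
    let denominator = numerator + l_false * (1.0 - prior);
    if denominator == 0.0 {
        prior // the observation was impossible under both cases; keep the prior
    } else {
        numerator / denominator
    }
}

/// Assumed form of the gate: allow an exploitability claim only when the
/// belief leans true AND its entropy is at or below `max_entropy_bits`.
pub fn may_assert(p: f64, max_entropy_bits: f64) -> bool {
    p > 0.5 && binary_entropy(p) <= max_entropy_bits
}
```

Under these assumptions a diffuse black-box prior of 0.5 has one full bit of entropy and is rejected by the gate, while a belief sharpened to 0.99 by observations passes a 0.5-bit threshold.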

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 21:41:18 -03:00
CyberSecurityUP d4bd6d4877 v3.5.0: per-agent attribution + token/cost telemetry + graceful Ctrl-C (stop → generate/discard)
- Streamed Claude events now tagged with the agent label (@name) so every
  command/tool/file is attributable to the agent that ran it.
- Token/cost telemetry: parse usage from the stream-json result event; feed shows
  per-call in/out/cost and a running total in the run summary.
- Ctrl-C during a run no longer hard-kills: it cancels cooperatively (no new
  agents launch, in-flight bounded), then asks "generate report from partial
  results? [Y/n]" — discard removes the run dir. Second Ctrl-C aborts.
- pool: cancel handle + is_cancelled; one()/complete_routed/chat_cli carry a label.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 21:36:23 -03:00
CyberSecurityUP 702f22a87a v3.5.0: REPL quick-wins (Tab-complete, @file/@dir/@line, multiline, /theme, /attach, /context) + installer + README
REPL (rustyline Helper):
- Tab autocomplete for /commands and @filesystem-paths.
- @path attach: @file, @folder, @file:LINE / @file:START-END fold scope files /
  stack traces into the agent context; /attach <path> and /context to manage.
- Multiline input: end a line with `\` to continue (validator-driven).
- /theme color|mono, /config (=/show); history (↑/↓) persists as before.
- Attachments are merged into the run's instruction context.
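
Illustrative sketch of parsing the `@file`, `@file:LINE`, and `@file:START-END` forms listed above. `Attachment` and `parse_attachment` are hypothetical names, and the fallback rule (an unparsable suffix is treated as part of the path) is an assumption, not documented behaviour.

```rust
#[derive(Debug, PartialEq)]
pub struct Attachment {
    pub path: String,
    /// Inclusive 1-based line range, if one was given.
    pub lines: Option<(usize, usize)>,
}

/// Parse an `@path[:LINE | :START-END]` token. Returns None for tokens
/// that do not start with `@` or have an empty path.
pub fn parse_attachment(token: &str) -> Option<Attachment> {
    let spec = token.strip_prefix('@')?;
    if spec.is_empty() {
        return None;
    }
    // Split on the LAST ':' so paths that contain ':' still work.
    if let Some((path, range)) = spec.rsplit_once(':') {
        let parsed = match range.split_once('-') {
            Some((a, b)) => a.parse::<usize>().ok().zip(b.parse::<usize>().ok()),
            None => range.parse::<usize>().ok().map(|n| (n, n)),
        };
        if let Some((start, end)) = parsed {
            if !path.is_empty() && start >= 1 && start <= end {
                return Some(Attachment {
                    path: path.to_string(),
                    lines: Some((start, end)),
                });
            }
        }
    }
    // No valid line suffix: attach the whole file or folder.
    Some(Attachment { path: spec.to_string(), lines: None })
}
```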

Install:
- setup.sh: `curl … | bash` — auto-installs Rust, clones to ~/.neurosploit,
  builds release, links neurosploit into ~/.local/bin; idempotent; env-tunable.

README: v3.5.0, 🧠 (back to "neuro"), one-line install section, neurosploit-on-PATH usage.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
v3.5.0
2026-06-24 21:19:56 -03:00
CyberSecurityUP 1be053c4a2 v3.5.0: attack graph + kill chain (OWASP/CWE/MITRE) + GPT 5.5/5.4/5.3-codex/5.2 + report graph
- Finding enriched with owasp / mitre / kill-chain stage / exploitability /
  business_impact / chains_from (attack-path edges).
- attack_graph module: derive OWASP Top 10 + MITRE ATT&CK technique + kill-chain
  stage from CWE (heuristic, no extra model call); render a Mermaid attack-path
  flowchart (findings grouped by stage, explicit + implicit edges) and an ASCII
  kill chain for the REPL.
- enrich() runs in finish() for every engagement.
- HTML report gains an "Attack Path & Kill Chain" section (Mermaid via CDN, dark)
  plus a stage/sev/OWASP/MITRE/exploitability table.
- REPL print_findings shows the ASCII kill-chain + severity summary after a run.
- models: add GPT-5.5, GPT-5.4, GPT-5.4-mini, GPT-5.3-codex, GPT-5.2.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 21:14:06 -03:00
CyberSecurityUP d864ea8b8a v3.5.0: structured activity feed — stream Claude tool/command/file events as a categorized REPL conversation
Harness:
- ModelPool gains a progress channel (set_progress); chat_cli forwards it.
- New chat_claude_stream: drives Claude Code with --output-format stream-json and
  parses the event stream live — assistant text, and tool_use blocks categorized
  into tagged events (exec/danger command, read/edit file, net request/browser,
  grep/glob tool). 900s bound; clear error surfacing.
- Wired set_progress into run / whitebox / greybox.

REPL renderer (render_line):
- Tagged events render as the conversation feed: tool/command/network as compact
  CARDS (tool-runner visual), files/edits/AI text/states as iconized lines.
- Clear "what the AI is doing" states: reconning, planning, testing, validating,
  chaining, report, complete — plus a ⚠ DANGEROUS marker for risky commands.
- Untagged harness lines mapped to the same state vocabulary.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 21:04:51 -03:00
CyberSecurityUP e8df48af9e v3.5.0: orchestration chaining + rich REPL (rustyline, model arrow-select, persistent history) + model-aware /key
Harness:
- Exploit-chaining round: after validation, chain confirmed findings into deeper
  impact (SSRF→metadata, SQLi→dump→reuse, IDOR→ATO, file-read→secrets→RCE),
  validate the new findings, merge. Wired into black-box and greybox.
- Latest top models surfaced: claude-opus-4-8, gpt-5.1/gpt-5.1-codex, gemini-3-pro.

REPL:
- Real line editing via rustyline: ↑/↓ command-history recall, Ctrl-A/E/K, paste;
  Ctrl-C cancels the line, Ctrl-D exits. Command history persists to
  data/repl_history.txt. Graceful plain-stdin fallback when not a TTY.
- /model with no arg → arrow-key multi-select (dialoguer); with arg accepts any
  provider:model names.
- /key is model-aware: lists the providers your selected models need (set/missing)
  and prompts for the missing keys; /key <prov> <key> still works.
- Run history persists to data/repl_runs.json and reloads across sessions
  (/runs lists past + current; /results /report /status by run number).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 20:33:13 -03:00
CyberSecurityUP f21b96e8c1 v3.5.0: complete REPL — run history, /results, /report, /status, /offline
- RunOutput exposes `workdir` so the session can locate reports.
- Session now records every run (RunRecord: id, mode, target, workdir, findings).
- New commands:
    /runs            list runs done this session (mode, target, severity counts)
    /results [n]     show findings of run n (default last), severity-sorted
    /report [n]      open the PDF/HTML report (open/xdg-open)
    /status [n]      print the run's status.json
    /offline on|off  pipeline self-test toggle (no model calls)
- Each /run prints "saved as run #n" with the quick commands.
- Verified offline: run → /runs → /results → /status all work.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 20:21:35 -03:00
CyberSecurityUP ae3e49f133 v3.5.0: automated login — execute the login flow and capture the live session
- harness/creds::login(): performs the real HTTP login (POST/GET form), captures
  a session Cookie from Set-Cookie or a Bearer token from the JSON body, with a
  soft success check (no hard fail on 302). Redirects not followed so Set-Cookie
  is visible.
- apply_creds is now async: direct material (jwt/header/cookie) used as-is; a
  `login:` flow is EXECUTED to obtain a live session; on failure, falls back to
  instructing the agents to log in themselves.
- --creds + --focus added to `run` (authenticated black-box) too.
- Verified live against a local mock: POST /login → 302 + Set-Cookie captured as
  the auth header used on subsequent requests.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 20:14:58 -03:00
CyberSecurityUP 7b1be0b424 v3.5.0: greybox (code + live) pipeline + credentials (creds.yaml / JWT / auth)
- New GREYBOX mode: review a repo's source AND exploit the running app in one
  pipeline — code-review findings become LEADS injected into live exploitation.
  CLI: `neurosploit greybox <repo> --url <app> [--creds creds.yaml] [--focus ...]`
  REPL: set both /repo and /target → greybox auto-selected.
- Credentials (harness/src/creds.rs, dependency-free YAML subset): jwt / header /
  cookie, or an automated `login:` flow. Derives an auth header and/or an
  "authenticate first via curl" directive injected into prompts so agents test
  authenticated. --creds flag + /creds command + creds.example.yaml.
- RunConfig gains `repo`; run_engagement refactored to a Mode enum (Black/White/Grey).
- Verified offline: greybox loads creds, combines repo+URL, runs pipeline, writes report.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 20:11:39 -03:00
CyberSecurityUP 435463979b v3.5.0: Claude-Code-style interactive harness (REPL) + instruction-steered testing
- New persistent interactive session (app/src/repl.rs), launched when run with no args:
  banner, model selection, API-key config (/key) or subscription (/sub), then a live
  session to set /target, /repo, /auth, and free-text /focus instructions (or just type
  them) that STEER which agents run and how.
- Slash-commands: /model /providers /key /sub /target /repo /auth /focus /mcp /votes
  /agents /show /run /quit  (+ bare text = focus).
- RunConfig gains `instructions` and `auth`:
  * instructions bias both LLM agent-selection and the heuristic (focus keywords →
    injection/access-control/etc. agents get a strong boost)
  * operator directives (focus + auth) injected into recon and exploit prompts so agents
    test as an authenticated user and prioritise the requested vuln classes
- bump 3.4.1 → 3.5.0 (CLI, harness, reports, credits)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 19:58:35 -03:00
CyberSecurityUP 5d83e8848e v3.4.1: harness intelligence — router, ReAct, dedup, token-trim, configurable MCP, +54 code agents, credits
- Task-based model ROUTER (recon/select prefer a fast model; exploit prefers primary; validate uses a different model than the finder)
- ReAct doctrine injected into exploit prompts (Thought→Action→Observation, token-efficient)
- Dedup: unique agents per run + findings deduped by CWE/endpoint/title (highest confidence kept)
- Token economy: recon blob capped for selector + per-agent context
- Configurable MCP: merge user mcp.servers.json into the pipeline's .mcp.json
- +54 white-box/code-analysis agents (NoSQLi, LDAP/XPath, JWT-none, Java/.NET/PHP/Go/Node/Python
  specifics, SSTI, ReDoS, deserialization, etc.) → 303 agents total (78 code)
- Credits: Joas A Santos & Red Team Leaders (CLI banner, interactive header, HTML+Typst report)
- README: GitHub stars/forks badges, 60-second quick start, full API config steps, intuitive layout
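
Illustrative sketch of the finding dedup rule above (same CWE, endpoint, and title; highest confidence kept). The struct fields and the case-insensitive key normalisation are assumptions, and `dedup_findings` is a hypothetical name.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub struct Finding {
    pub cwe: String,
    pub endpoint: String,
    pub title: String,
    pub confidence: f64,
}

/// Collapse findings that share (CWE, endpoint, title), keeping the one
/// with the highest confidence. First-seen order of keys is preserved.
pub fn dedup_findings(findings: Vec<Finding>) -> Vec<Finding> {
    let mut best: HashMap<(String, String, String), Finding> = HashMap::new();
    let mut order: Vec<(String, String, String)> = Vec::new();

    for f in findings {
        // Assumed normalisation: CWE and title compare case-insensitively.
        let key = (
            f.cwe.to_ascii_uppercase(),
            f.endpoint.clone(),
            f.title.to_lowercase(),
        );
        let keep_existing = best
            .get(&key)
            .map_or(false, |existing| existing.confidence >= f.confidence);
        if keep_existing {
            continue;
        }
        if !best.contains_key(&key) {
            order.push(key.clone());
        }
        best.insert(key, f);
    }

    order.into_iter().filter_map(|k| best.remove(&k)).collect()
}
```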

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 19:49:01 -03:00
CyberSecurityUP deca20d11f docs: README — how to run via API (keys, provider→env→endpoint table) + subscription
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
v3.4.1
2026-06-24 19:40:00 -03:00
CyberSecurityUP 0a2cf58d9e v3.4.1: slim Rust-only branch
Keep only the Rust harness (neurosploit-rs/) + the agent library (agents_md/) it
loads at runtime, plus docs. Remove the Python engine, web GUIs, legacy stack,
docker, build scripts and scratch test files from THIS branch only (other
branches keep everything). Rust-focused README with Kali/Docker + tool-install
guidance and testphp/DVWA usage examples.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 19:36:16 -03:00
CyberSecurityUP 96f00c1c68 v3.4.1: CLI-only Rust harness — interactive wizard, smart selection, tool doctrine, Typst, status
- Remove Rust web server (axum/tower-http); CLI-only binary
- Verbose logging (-v) + unique run-id output folder runs/ns-<ts>-<target>/
- status.json lifecycle (running → complete) + ✓ COMPLETE summary
- Interactive wizard when run with no args; detailed --help with testphp/DVWA examples + Kali tip
- Tool-usage doctrine injected into recon/exploit prompts: curl + rustscan/nmap
  (apt/brew/cargo install guidance) + browser via Playwright when present, else curl
- Smart recon-aware selection: map recon signals → agent categories, only run
  matching agents; heuristic fallback when LLM selection is empty
- Cross-model false-positive validation: voting prefers a model other than the finder
- Playwright MCP auto-provision (npx) + per-backend support (claude/codex; gemini/grok degrade)
- Gemini provider (API + gemini CLI subscription)
- Typst report (report.typ + compiled report.pdf) via blank structured template
- Lenient finding parsing (confidence as word/number) — fixes empty-results bug
- bump version 3.4.0 -> 3.4.1

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 19:34:13 -03:00
CyberSecurityUP e565270f43 fix: lenient finding parsing — models return confidence as words/strings
Root cause of empty results: models emit findings with confidence as a string
('High') or cvss as a number, but the Finding struct typed confidence as f64, so
serde failed the ENTIRE array on any mismatch -> 0 findings every run.

extract_findings now parses into serde_json::Value and coerces each field
(string/number/word), normalizes severity, and accepts qualitative confidence
(High->0.9 etc). Verified live: whitebox on a vulnerable sample now yields
validated findings (IDOR confirmed by vote).
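
Illustrative sketch of the confidence coercion described above. The commit says the real code works on serde_json::Value; this standalone version takes a raw text token instead so it needs only the standard library. Only the High->0.9 mapping comes from the commit message; the Medium and Low values and the name `coerce_confidence` are assumptions.

```rust
/// Coerce a confidence token that may be a number ("0.75") or a word
/// ("High", possibly quoted) into a value in [0.0, 1.0].
pub fn coerce_confidence(raw: &str) -> Option<f64> {
    let token = raw.trim().trim_matches('"').trim();

    // Numeric form first. `parse` also accepts "NaN" and "inf", so
    // non-finite values are rejected explicitly.
    if let Ok(n) = token.parse::<f64>() {
        if !n.is_finite() {
            return None;
        }
        return Some(n.clamp(0.0, 1.0));
    }

    // Qualitative form.
    match token.to_ascii_lowercase().as_str() {
        "high" => Some(0.9),   // stated in the commit message
        "medium" => Some(0.6), // assumed value
        "low" => Some(0.3),    // assumed value
        _ => None,
    }
}
```

Coercing field by field means one oddly typed value drops at most that field, rather than failing deserialization of the whole findings array.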

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-23 19:49:37 -03:00
CyberSecurityUP c6fd5d6ac8 fix: resilient subscription CLI calls (retry, richer errors, capped concurrency)
The 'recon failed (claude subscription CLI failed: )' error was a transient CLI
failure (rate limit / cold start) reported with a blank message and never retried.

- chat_cli: on non-zero exit, surface exit code + stdout (CLI writes the real
  reason there, not stderr); treat empty output as an error
- pool.one(): retry up to 3x with backoff for transient failures (both
  subscription and API paths)
- with_auth: cap concurrency to 3 on the subscription path — spawning many
  parallel CLI processes itself trips provider rate limits
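
The retry and error-surfacing behaviour listed above can be sketched roughly as
follows. This is a synchronous, std-only simplification, not the actual
`pool.one()` or `chat_cli` code: `retry_with_backoff` and `check_cli_output`
are hypothetical names, the doubling delay is an assumed backoff schedule, and
the sketch retries every error rather than classifying transient ones.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Run `op` until it succeeds or `max_attempts` is reached, sleeping
/// base_delay, 2*base_delay, 4*base_delay, ... between failed attempts.
/// `op` receives the 1-based attempt number and always runs at least once.
fn retry_with_backoff<T, E, F>(
    max_attempts: u32,
    base_delay: Duration,
    mut op: F,
) -> Result<T, E>
where
    F: FnMut(u32) -> Result<T, E>,
{
    let mut attempt = 1;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            // Out of attempts: hand the last error back to the caller.
            Err(err) if attempt >= max_attempts => return Err(err),
            Err(_) => {
                sleep(base_delay * 2u32.pow(attempt - 1));
                attempt += 1;
            }
        }
    }
}

/// Turn a finished CLI call into a Result: a non-zero exit reports the exit
/// code plus stdout, and empty output counts as an error too.
fn check_cli_output(exit_code: i32, stdout: &str) -> Result<String, String> {
    if exit_code != 0 {
        return Err(format!(
            "CLI exited with code {}: {}",
            exit_code,
            stdout.trim()
        ));
    }
    if stdout.trim().is_empty() {
        return Err("CLI returned empty output".to_string());
    }
    Ok(stdout.to_string())
}
```

A caller would wrap the CLI invocation in the closure, for example
`retry_with_backoff(3, Duration::from_secs(1), |_| call_cli())`, where
`call_cli` is whatever function spawns the process and returns a `Result`.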

Verified: live subscription run recovers and completes recon → select → exploit
→ vote → artifacts.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-23 13:07:55 -03:00
CyberSecurityUP 9dfcea87bc docs: update README for v3.4.0 (Rust harness, whitebox, 249 agents, Gemini, intelligent selection)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-23 11:51:07 -03:00
CyberSecurityUP 3ca3f269ee v3.4.x: intelligent agent selection, whitebox, recon/code agents, Gemini, artifacts, RL, XBOW GUI
Harness intelligence:
- After recon, the model SELECTS which specialist agents match the target
  (select_agents) — runs the relevant subset, not blindly top-N
- RL reward store (rl.rs): per-agent weights persist to data/rl_state_rs.json,
  reward validated findings (severity-weighted), decay idle, bias next run
- Run artifacts persisted as JSON + MD (recon, exploitation transcript,
  findings, html report) under runs/<target>-<ts>/ for reuse by other AIs
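
One way the reward store's update step could look, as a rough sketch only. The
function names, the severity rewards, the decay factor, and the weight bounds
are all assumptions for illustration; this message does not state the values
used in rl.rs, and persistence to JSON is left out.

```rust
use std::collections::HashMap;

// All constants below are assumed values, not taken from rl.rs.
const DEFAULT_WEIGHT: f64 = 1.0;
const MIN_WEIGHT: f64 = 0.1;
const MAX_WEIGHT: f64 = 5.0;
const IDLE_DECAY: f64 = 0.9;

/// Severity-weighted reward for one validated finding (assumed scale).
fn severity_reward(severity: &str) -> f64 {
    match severity.to_ascii_lowercase().as_str() {
        "critical" => 1.0,
        "high" => 0.5,
        "medium" => 0.25,
        "low" => 0.125,
        _ => 0.0,
    }
}

/// Apply one run's results: agents with validated findings gain weight,
/// every other known agent decays toward the default, and all weights
/// stay inside MIN_WEIGHT..=MAX_WEIGHT.
/// `validated` holds (agent name, severity) pairs.
fn update_weights(weights: &mut HashMap<String, f64>, validated: &[(&str, &str)]) {
    let mut earned: HashMap<&str, f64> = HashMap::new();
    for (agent, severity) in validated {
        *earned.entry(*agent).or_insert(0.0) += severity_reward(severity);
        weights
            .entry((*agent).to_string())
            .or_insert(DEFAULT_WEIGHT);
    }
    for (agent, weight) in weights.iter_mut() {
        let reward = earned.get(agent.as_str()).copied().unwrap_or(0.0);
        let updated = if reward > 0.0 {
            *weight + reward
        } else {
            // Idle agent: shrink the distance to the default weight.
            DEFAULT_WEIGHT + (*weight - DEFAULT_WEIGHT) * IDLE_DECAY
        };
        *weight = updated.clamp(MIN_WEIGHT, MAX_WEIGHT);
    }
}
```

Sorting agents by the resulting weight before selection would be one simple way
to bias the next run toward agents that previously produced validated findings.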

Whitebox mode:
- run_whitebox: walks a repo, builds bounded source context, runs code agents,
  validates by adversarial vote. CLI `whitebox <path>` + web "White-box" mode

Agents: +12 recon (subdomain/tech/js/api/secrets/dns/content/param/waf/cloud/
graphql/osint) and +24 code SAST reviewers (sqli/cmdi/path/ssrf/xss/deser/
secrets/crypto/authz/idor/xxe/redirect/ssti/race/eval/csrf/random/logging/
upload/mass-assign/jwt/cors). Loader gains recon/ + code/ categories → 249 total

Models: +Google Gemini provider (API + gemini CLI subscription);
installed_cli_backends now detects gemini; chat_cli handles gemini/codex/grok +
optional Playwright MCP (.mcp.json) on the subscription path with autonomy flags

GUI: full XBOW-style redesign — sidebar (Operate/Library), topbar status, mode
segment (black-box/white-box), model panel, live console, severity cards,
agent browser with category filters, models view; responsive + aligned

Verified: cargo build --release clean; CLI agents/whitebox; LIVE subscription
run shows model selecting 23→4 agents, RL update, artifacts written; GUI +
white-box toggle in Playwright.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-23 11:39:56 -03:00
CyberSecurityUP bf56184912 Merge v3.4.0 subscription backend into main 2026-06-22 16:59:38 -03:00
CyberSecurityUP d59f28f36d v3.4.0: subscription backend (Claude Code / Codex / Grok logins)
The Rust harness can now use models two ways:
- API: provider API key (OpenAI-compatible HTTP) — existing path
- Subscription: drive the locally-installed agentic CLI login directly, no API
  key (anthropic→claude, openai→codex, xai→grok)

- models.rs: ChatClient::chat_cli spawns the CLI (stdin prompt), cli_binary_for
  + installed_cli_backends + binary_in_path PATH detection
- pool.rs: ModelPool::with_auth(subscription); one() routes per model
- types/CLI: RunConfig.subscription + `run --subscription` flag
- web: /api/run honors "subscription"; /api/info reports detected cli_backends;
  SPA gets a "Use subscription" toggle

Verified live: `run --subscription --model anthropic:claude-haiku-4-5` drove the
Claude subscription end-to-end (recon + agent + vote) with no API key set.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 16:59:35 -03:00
CyberSecurityUP 9c4f912323 chore: stop tracking generated report_rs.html
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 21:33:42 -03:00
CyberSecurityUP a05a99e0f6 Merge NeuroSploit v3.4.0 — Rust multi-model harness into main 2026-06-21 19:59:33 -03:00
CyberSecurityUP 56d3f0c723 NeuroSploit v3.4.0 — Rust multi-model harness + Axum dashboard
New cargo workspace `neurosploit-rs/` (single `neurosploit` binary):

harness crate:
- models.rs: 11 OpenAI-compatible providers / 31 models (Claude, GPT, Grok,
  NVIDIA NIM, DeepSeek, Mistral, Qwen, Groq, Together, OpenRouter, Ollama)
- pool.rs: ModelPool with bounded concurrency, provider failover, and N-model
  validator voting (the panel doubles as the jury)
- agents.rs: loads the existing agents_md/ library (213 agents)
- pipeline.rs: recon → parallel exploit (semaphore-bounded) → N-model
  adversarial vote → score; streams live progress over a channel
- report.rs: HTML report
- tokio + reqwest(rustls); offline mode runs the pipeline without API keys

app binary:
- clap CLI: serve | run | agents | models  (run supports --model x N, --vote-n,
  --max-agents, --offline)
- axum web dashboard with multi-model panel, live console, findings, agent
  browser, embedded report; single binary serves the SPA (no npm/build)

Verified: cargo build clean; agents/models/offline-run CLI; server endpoints
(/api/info, /api/run lifecycle, /report); dashboard + live run in Playwright.

Docs: README v3.4.0 callout + RELEASE.md notes. target/ gitignored.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 19:58:43 -03:00
CyberSecurityUP a5badefc29 v3.3.0 GUI dashboard + reports + model expansion + root fix
Engine:
- Fix: inject IS_SANDBOX=1 so Claude Code's --dangerously-skip-permissions
  works under root (real backend runs were exiting rc=1 immediately)
- models: expand to 40 models / 13 providers, tagged CLI vs API
  (NVIDIA NIM, DeepSeek, Mistral, Qwen/DashScope, Groq, Together, OpenRouter,
  Ollama, Gemini) — Qwen/DeepSeek/Llama usable via API
- backends: on_start callback surfaces the exact argv ("what runs behind it")
- orchestrator: require a Playwright screenshot per confirmed finding; collect
  results/activity.json; auto-generate reports after a run
- report.py: HTML always + PDF via Typst engine (.typ source emitted too)

Web dashboard (webgui/, stdlib only — no npm/build):
- Sidebar dashboard (PentAGI-style): Run / Agents / Insights / Reports / Settings
- Multi-target runs; live execution console + per-task activity; finding cards
  with screenshots; backend+provider+model pickers (CLI & API)
- Agents tab: browse 213 + add new .md agents from the UI
- Insights: interactive RL-weight + severity charts
- Reports: download/preview PDF + HTML
- Settings/API: execution mode, per-provider API keys, orchestrator, verbosity
- Endpoints: /api/agents (GET/POST), /api/rl, /api/config, /api/reports,
  /reports/* + /shots/* static serving

Cleanup: retire replaced web stack (frontend React, FastAPI backend, core
orchestration, old test) to legacy/. Active engine + GUI are fully standalone.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 23:26:11 -03:00
CyberSecurityUP 22a7302a35 Add minimalist web GUI for the v3.3.0 engine
Zero-dependency (stdlib http.server) front-end exposing only the essential
options — URL, backend, model, collaborator, RL + Playwright-MCP toggles — with
a live progress console. Calls neurosploit_agent directly; no npm/build.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 22:33:12 -03:00
CyberSecurityUP 3de357bf18 Merge NeuroSploit v3.3.0 — Autonomous MD-Agent Engine into main
# Conflicts:
#	prompts/task_library.json
v3.3.0
2026-06-14 21:41:26 -03:00
CyberSecurityUP 55af0d4634 NeuroSploit v3.3.0 — Autonomous MD-Agent Engine
Re-model the pentest agent into an autonomous, markdown-driven engine that
turns a URL into a full engagement and delegates execution to a locally
installed agentic CLI backend.

Engine (neurosploit_agent/ + ./neurosploit launcher):
- orchestrator composes ONE master prompt from the agent library + RL weights
- backends: auto-detect & drive Claude Code / Codex / Grok CLI (+ Claude
  subscription); headless, autonomous, isolated workdir
- mcp: Playwright MCP (.mcp.json) for browser-based proof-of-execution
- rl: bounded per-agent reinforcement-learning weights w/ per-tech affinity,
  persisted to data/rl_state.json
- models: latest registry incl. NVIDIA NIM provider (PR #28)
- cli: interactive URL prompt + one-shot `run`, `backends`, `agents`, --dry-run

Agent library (agents_md/, 213 total):
- 196 vuln specialists incl. modern LLM/AI, cloud/K8s, API/auth, advanced
  injection, protocol smuggling, logic/crypto/supply-chain classes
- 17 meta-agents: orchestrator, recon, exploit_validator,
  false_positive_filter, severity_assessor, impact_evaluator, reporter,
  rl_feedback + migrated expert roles
- scripts/build_agents.py data-driven builder; REGISTRY.md index

Docs: rewritten README.md, v3.3.0 RELEASE.md, .env.example (NVIDIA NIM, xAI,
engine vars).

Retire legacy Python orchestration (neurosploit.py + agent classes) to legacy/.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 20:57:38 -03:00