god-eye/AI_SETUP.md
Vyntral 05e837fa4c docs(v2): full documentation rewrite + CHANGELOG + live benchmark
Eight documents polished for v2.0 release:

- README.md: hero + 30-sec quickstart + feature matrix + competitive
  landscape + wizard/live/AI GIF demos
- AI_SETUP.md: 3 AI profiles + cascade + auto-pull + end-of-scan brief
  + model comparison + troubleshooting + privacy model
- EXAMPLES.md: 14 practical recipes from zero-flag wizard to routing
  via Tor / Burp / mitmproxy
- BENCHMARK.md: cross-tool comparison matrix + methodology + caveats
- BENCHMARK-SCANME.md (new): reproducible live benchmark on Nmap's
  authorized test host, documents three bugs fixed mid-test
- FEATURE_ANALYSIS.md: per-feature status across all 6 phases
- SECURITY.md: ethical guidelines + disclosure + compliance
- CHANGELOG.md (new): complete v2.0.0-rc1 release notes
2026-04-18 16:49:04 +02:00

# 🧠 AI Integration Guide
<p align="center">
<img src="https://img.shields.io/badge/Ollama-local-blueviolet?style=for-the-badge&logo=ollama" alt="Ollama">
<img src="https://img.shields.io/badge/privacy-100%25%20offline-success?style=for-the-badge" alt="Privacy">
<img src="https://img.shields.io/badge/cost-%240-green?style=for-the-badge" alt="Cost">
<img src="https://img.shields.io/badge/CVE%20correlation-automated-critical?style=for-the-badge" alt="CVE">
</p>
> **No API keys. No cloud. No telemetry. No usage caps. Runs on your laptop.**
God's Eye v2 is the only open-source attack-surface tool with **automated CVE correlation via a local LLM**. Apache 2.4.7 detected → CVE-2026-34197 surfaced. WordPress 5.8.2 fingerprinted → known vulnerabilities chained. All through an Ollama cascade that triages, then drills down with a **30B Mixture-of-Experts model** that activates just 3.3B parameters per token.
Everything stays on your machine. No data leaves your hardware.
<p align="center">
<img src="assets/ai-verbose.gif" alt="AI cascade against Apache 2.4.7 on scanme.nmap.org" width="90%">
</p>
<p align="center">
<sub><em>Every scan ends with an <b>AI SCAN BRIEF</b> — severity totals, top exploitable chains, executive summary, and recommended next actions — framed in the terminal. Recorded live on <code>scanme.nmap.org</code>, models served by local Ollama.</em></sub>
</p>
---
## 🎯 End-of-scan brief
Every scan that produces findings ends with a framed summary the AI writes for you. Six parts:
```
┌── AI SCAN BRIEF — target.com ─────────────────────────────────────────────┐
│ Totals
│ Hosts: 17 Active: 13 AI findings: 23
│ Findings by severity
│ CRIT critical 2
│ [HIGH] high 7
│ [MED] medium 12
│ [LOW] low 4
│ Top exploitable chains
│ ▸ admin.target.com — Git Repository Exposed + Open Redirect
│ ▸ api.target.com — CORS Misconfiguration + JWT alg=none
│ ▸ legacy.target.com — Apache@2.4.7→CVE-2026-34197
│ AI agents that contributed
│ • http-analyzer 8 findings
│ • secret-validator 6 findings
│ • anomaly-detector 1 findings
│ • report-writer 1 findings
│ AI executive summary
│ Scan identified two critical issues requiring immediate attention:
│ exposed git repository on admin.target.com and an Apache 2.4.7 server
│ (end-of-life since 2014) running on legacy.target.com. The cross-host
│ anomaly detector flagged a dev-environment leak into production.
│ Recommended next actions
│ 1. Remove .git directory from admin.target.com (CRITICAL)
│ 2. Patch Apache 2.4.7 → vendor latest (affects legacy.target.com)
│ 3. Rotate JWT signing key on api.target.com
│ 4. Move dev.api.target.com off production DNS
│ 5. Investigate anomaly: shared SSH key across 3 hosts
└─────────────────────────────────────────────────────────────────────────────┘
```
It's generated by `internal/modules/brief`, runs in `PhaseReporting` after all other modules have finished, and only prints when findings exist (silent/JSON modes suppress it automatically).
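
The gating described above (print only when findings exist, stay quiet in silent/JSON modes) can be sketched as follows. This is an illustrative Python sketch, not the code in `internal/modules/brief`; the function names and the `severity` key are assumptions made for the example.

```python
from collections import Counter

# Display order assumed from the sample brief above.
SEVERITY_ORDER = ["critical", "high", "medium", "low"]


def severity_totals(findings):
    """Count findings per severity, always in the same display order."""
    counts = Counter(f["severity"] for f in findings)
    return [(sev, counts.get(sev, 0)) for sev in SEVERITY_ORDER]


def render_brief(target, findings, silent=False):
    """Return the brief text, or None when it should be suppressed."""
    if silent or not findings:
        return None  # nothing to summarize, or output must stay machine-readable
    lines = [f"AI SCAN BRIEF: {target}", "Findings by severity"]
    lines += [f"  {sev:<9}{n}" for sev, n in severity_totals(findings)]
    return "\n".join(lines)
```

Returning `None` instead of printing keeps the decision testable and leaves the actual terminal framing to the caller.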
---
## Table of contents
1. [Quick start (5 minutes)](#quick-start-5-minutes)
2. [How the cascade works](#how-the-cascade-works)
3. [AI profiles — pick your tier](#ai-profiles--pick-your-tier)
4. [The interactive wizard](#the-interactive-wizard)
5. [Auto-pull of missing models](#auto-pull-of-missing-models)
6. [Verbose mode](#verbose-mode)
7. [Multi-agent orchestration](#multi-agent-orchestration)
8. [CVE matching](#cve-matching)
9. [Custom models + YAML config](#custom-models--yaml-config)
10. [Troubleshooting](#troubleshooting)
11. [Privacy & security model](#privacy--security-model)
12. [Performance reference](#performance-reference)
---
## Quick start (5 minutes)
### 1. Install Ollama
**macOS / Linux:**
```bash
curl -fsSL https://ollama.com/install.sh | sh
```
**Windows:** download from [ollama.com/download](https://ollama.com/download).
Verify:
```bash
ollama --version
```
### 2. Start the Ollama server
```bash
ollama serve &
```
Listens on `http://localhost:11434`. Leave it running.
### 3. Run God's Eye
The easiest path — let the wizard handle everything:
```bash
./god-eye
```
It will:
1. Ask which AI tier you want (lean / balanced / heavy / none)
2. Check which models are already installed
3. Offer to download missing ones (with live progress)
4. Ask for your target domain
5. Start the scan
Manual route:
```bash
# Defaults (lean tier): pulls qwen3:1.7b + qwen2.5-coder:14b if missing
./god-eye -d target.com --pipeline --enable-ai
# Balanced tier (32GB RAM): MoE deep model, 256K context
./god-eye -d target.com --pipeline --enable-ai --ai-profile balanced
# Heavy tier (64GB+ RAM): best quality
./god-eye -d target.com --pipeline --enable-ai --ai-profile heavy --ai-verbose
```
---
## How the cascade works
Every finding goes through a two-stage pipeline:
```
┌──────────────────────────────────────────────┐
│  FINDING DETECTED                            │
│  (JS secret, HTTP response, tech version,    │
│  takeover candidate, vuln, etc.)             │
└──────────────┬───────────────────────────────┘
               │
┌──────────────────────────────────────────────┐
│  TIER 1: FAST TRIAGE                         │
│  • lean:     qwen3:1.7b                      │
│  • balanced: qwen3:4b                        │
│  • heavy:    qwen3:8b                        │
│                                              │
│  Output: "relevant" vs "skip"                │
│  Latency: 0.5–2 seconds                      │
└──────────────┬───────────────────────────────┘
               │ if relevant ↓
┌──────────────────────────────────────────────┐
│  TIER 2: DEEP ANALYSIS                       │
│  • lean:     qwen2.5-coder:14b               │
│  • balanced: qwen3-coder:30b (MoE)           │
│  • heavy:    qwen3-coder:30b (MoE)           │
│                                              │
│  Output: severity, description, PoC,         │
│          remediation, OWASP + CVE matches    │
│  Latency: 5–25 seconds                       │
└──────────────┬───────────────────────────────┘
               │
        AIFinding event → store → report
```
**Why two tiers?** Cost versus quality: the fast model filters out roughly 70% of findings as non-issues before the deep model ever runs. That cuts total wall-clock time by 40–60% while keeping quality identical for the findings that do surface.
Disable the cascade to always run the deep model (slower, no quality gain on most findings):
```bash
./god-eye -d target.com --pipeline --enable-ai --ai-cascade=false
```
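
The two-stage flow can be sketched in a few lines. This is a minimal Python illustration under stated assumptions, not God's Eye's implementation: `triage` and `deep` are hypothetical callables standing in for the fast and deep Ollama models, and the `cascade` parameter mirrors the `--ai-cascade` flag.

```python
from typing import Callable, Optional


def run_cascade(
    finding: str,
    triage: Callable[[str], bool],   # hypothetical: True means "relevant"
    deep: Callable[[str], dict],     # hypothetical: returns the detailed analysis
    cascade: bool = True,
) -> Optional[dict]:
    """Return the deep analysis, or None when triage skipped the finding."""
    if cascade and not triage(finding):
        return None  # fast model said "skip"; the deep model never runs
    return deep(finding)
```

With `cascade=False` the triage call is bypassed entirely, so every finding pays the deep model's latency.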
---
## AI profiles — pick your tier
| Profile | Triage model | Deep model | Disk pull | VRAM (Q4) | Best for |
|------------------|--------------|-------------------------|-----------|-----------|---------------------------------|
| `lean` (default) | qwen3:1.7b | qwen2.5-coder:14b | ~10GB | ~10GB | 16GB RAM laptops, CI runners |
| `balanced` | qwen3:4b | qwen3-coder:30b **(MoE)** | ~20GB | ~17GB | 32GB RAM workstations — **sweet spot** |
| `heavy` | qwen3:8b | qwen3-coder:30b **(MoE)** | ~23GB | ~22GB | 64GB+ servers, top-quality runs |
### Why MoE (Mixture of Experts) matters for balanced/heavy
`qwen3-coder:30b` is a **Mixture-of-Experts** model with 30B total parameters but only **3.3B active per token**. Inference speed is closer to a dense 3B model while quality is closer to a dense 30B. Combined with a 256K context window it can ingest entire JS bundles + long HTTP response bodies in a single prompt — useful for the deep-analysis step.
### Pick your profile with one question
> *"How much RAM can I dedicate to Ollama while the scan runs?"*
- **< 16GB** → use `lean`, possibly shrink with `--ai-deep-model qwen2.5-coder:7b`
- **16–32GB** → `lean` (or `balanced` if your deep model fits)
- **32GB+** → `balanced` (recommended) or `heavy`
The wizard asks this for you if you're unsure.
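
As a rough illustration, the rule of thumb above maps RAM to flags like this. The function name is hypothetical and the thresholds are simply the ones listed; the wizard's real logic may differ.

```python
def suggest_ai_flags(ram_gb: float) -> list:
    """Map available RAM (GB) to the flags suggested in the list above."""
    if ram_gb < 16:
        # Tight on memory: lean tier with the smaller deep model.
        return ["--ai-profile", "lean", "--ai-deep-model", "qwen2.5-coder:7b"]
    if ram_gb < 32:
        return ["--ai-profile", "lean"]
    # 32GB and up: balanced is the recommended choice (heavy is optional).
    return ["--ai-profile", "balanced"]
```

For example, `suggest_ai_flags(24)` yields the lean profile, matching the middle bullet.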
---
## The interactive wizard
Run `./god-eye` with no `-d` flag in a terminal — the wizard launches automatically:
```
═══════════════════════════════════════════════════════════
God's Eye v2 — interactive setup
Ctrl-C to abort at any time.
═══════════════════════════════════════════════════════════
? Select AI tier
▸ 1) Lean — 16GB RAM · qwen3:1.7b + qwen2.5-coder:14b (default)
2) Balanced — 32GB RAM · qwen3:4b + qwen3-coder:30b (MoE, 256K ctx)
3) Heavy — 64GB RAM · qwen3:8b + qwen3-coder:30b (max quality)
4) No AI — Pure recon without LLM analysis
Choice [1]:
⚙ Checking Ollama at http://localhost:11434…
↓ Missing models: qwen3:1.7b, qwen2.5-coder:14b
? Download missing models now? [Y/n]
> y
↓ qwen3:1.7b
pulling manifest 10% 150MB / 1.4GB
pulling manifest 50% 700MB / 1.4GB
pulling manifest 100% 1.4GB / 1.4GB
verifying sha256 digest
writing manifest
success 100%
✓ qwen3:1.7b ready
↓ qwen2.5-coder:14b
✓ qwen2.5-coder:14b ready
? Target domain
> target.com
? Select scan profile
1) Quick
▸ 2) Bug bounty (default)
3) Pentest
4) ASM continuous
5) Stealth max
─── Scan summary ───
Target target.com
Scan profile bugbounty
AI tier lean
AI auto-pull yes
AI verbose no
Live view yes (v=1)
? Start scan? [Y/n]
>
```
Force the wizard even when `-d` is set (to review defaults):
```bash
./god-eye --wizard -d target.com
```
---
## Auto-pull of missing models
When `--enable-ai` is on and `--ai-auto-pull` is true (default), God's Eye checks Ollama at startup and downloads missing models before the pipeline starts.
Under the hood:
1. **Reachability check** via `GET /api/tags`. If Ollama is unreachable, AI modules silently no-op and the scan proceeds without AI.
2. **Inventory compare** — matches installed models (by tag) against the profile's required set. Handles `:latest` suffix and tagless lookups.
3. **Stream pull** via `POST /api/pull` with `stream:true`; NDJSON progress is parsed and throttled (a new status or a ≥5% jump triggers a log line).
4. **Ready** — returns control to the pipeline coordinator.
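
Steps 2 and 3 can be sketched without touching the network. The Python below is an illustration only: the helper names are hypothetical, and the `status` / `total` / `completed` fields are assumed to follow the progress objects Ollama streams from `/api/pull`.

```python
import json


def normalize(tag: str) -> str:
    """Treat 'model' and 'model:latest' as the same tag."""
    return tag if ":" in tag else tag + ":latest"


def missing_models(installed, required):
    """Return the required tags that are not installed yet."""
    have = {normalize(t) for t in installed}
    return [t for t in required if normalize(t) not in have]


def throttled_progress(ndjson_lines, step=5.0):
    """Yield (status, percent) on a new status or a jump of >= `step` points."""
    last_status, last_pct = None, 0.0
    for raw in ndjson_lines:
        if not raw.strip():
            continue  # skip blank keep-alive lines
        msg = json.loads(raw)
        status = msg.get("status", "")
        total, done = msg.get("total"), msg.get("completed", 0)
        pct = 100.0 * done / total if total else 0.0
        if status != last_status or pct - last_pct >= step:
            last_status, last_pct = status, pct
            yield status, round(pct, 1)
```

Throttling on "new status or a 5-point jump" keeps a multi-gigabyte download from flooding the log while still showing every phase change.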
Disable auto-pull if you'd rather error out on missing models:
```bash
./god-eye -d target.com --pipeline --enable-ai --ai-auto-pull=false
```
When the wizard runs it asks explicitly before downloading. Non-wizard mode pulls silently unless `--ai-verbose` is set.
---
## Verbose mode
See every Ollama interaction in real time on stderr:
```bash
./god-eye -d target.com --pipeline --enable-ai --ai-verbose --live
```
Stderr output:
```
[ai] → qwen3:1.7b prompt=2341B timeout=60s
[ai] ← qwen3:1.7b response=512B 1.3s
[ai] → qwen2.5-coder:14b prompt=8291B timeout=120s
[ai] ← qwen2.5-coder:14b response=1832B 8.7s
[ai] → qwen2.5-coder:14b prompt=5123B timeout=120s
[ai] ← qwen2.5-coder:14b response=946B 5.2s
```
Useful for:
- Debugging slow runs (spot the 60s+ queries)
- Tuning the triage threshold (are "skip" decisions correct?)
- Verifying the cascade is actually running (triage fires before deep)
- Sanity-checking prompt sizes (large prompts = context-bloat → fix the caller)
Verbose goes to **stderr** so stdout JSON / silent modes still parse cleanly.
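
To spot slow queries after a run, the response lines can be filtered with a small script. This Python sketch assumes the stderr format shown in the sample above; the regular expression and function name are illustrative, not part of God's Eye.

```python
import re

# Matches response lines such as: [ai] ← qwen3:1.7b response=512B 1.3s
RESPONSE_RE = re.compile(r"^\[ai\] ← (\S+)\s+response=(\d+)B\s+([\d.]+)s$")


def slow_queries(stderr_text, threshold_s=60.0):
    """Return (model, response_bytes, seconds) for responses at or above the threshold."""
    slow = []
    for line in stderr_text.splitlines():
        m = RESPONSE_RE.match(line.strip())
        if m and float(m.group(3)) >= threshold_s:
            slow.append((m.group(1), int(m.group(2)), float(m.group(3))))
    return slow
```

Capture stderr with something like `2> ai.log` and pass the file contents to `slow_queries`.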
---
## Multi-agent orchestration
In addition to the cascade, God's Eye ships an 8-agent specialized system (inherited from v1). Enabled automatically in `bugbounty` and `pentest` profiles, or explicitly:
```bash
./god-eye -d target.com --pipeline --enable-ai --multi-agent
```
| Agent | Specialty |
|----------|----------------------------------------------|
| XSS | Cross-Site Scripting (DOM, Reflected, Stored) |
| SQLi | SQL Injection (error, blind, time-based) |
| Auth | Auth bypass, IDOR, JWT, OAuth, SAML, session |
| API | REST/GraphQL, CORS, rate limiting |
| Crypto | TLS / cipher issues, weak keys |
| Secrets | API keys, tokens, hardcoded credentials |
| Headers | CSP, HSTS, cookie flags, SameSite |
| General | Fallback for unclassified findings |
How it works:
1. A **coordinator** agent classifies each raw finding (regex + short LLM call)
2. Routes it to the appropriate specialist
3. Specialist analyzes with domain-specific knowledge + OWASP-aligned remediation templates
4. Emits an `AIFinding` event with confidence score
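
The classify-then-route step might look roughly like the sketch below. The keyword patterns are invented for illustration and `llm_fallback` is a hypothetical stand-in for the short LLM call; only the agent names come from the table above.

```python
import re

# Illustrative patterns only; the real coordinator's rules are not documented here.
ROUTES = [
    ("XSS", re.compile(r"xss|cross-site scripting|<script", re.I)),
    ("SQLi", re.compile(r"sql(i| injection| syntax)", re.I)),
    ("Auth", re.compile(r"jwt|oauth|saml|idor|session", re.I)),
    ("API", re.compile(r"graphql|cors|rate limit", re.I)),
    ("Crypto", re.compile(r"tls|cipher|weak key", re.I)),
    ("Secrets", re.compile(r"api[_ -]?key|token|credential", re.I)),
    ("Headers", re.compile(r"csp|hsts|samesite|cookie", re.I)),
]


def classify(raw_finding, llm_fallback=None):
    """Pick a specialist: regex first, optional LLM call second, General last."""
    for agent, pattern in ROUTES:
        if pattern.search(raw_finding):
            return agent
    if llm_fallback is not None:
        return llm_fallback(raw_finding)
    return "General"
```

Trying the cheap regex pass first means the LLM is only consulted for findings the patterns cannot place.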
This is a v1-era implementation. **Phase 3 (in progress)** introduces native Planner/Worker agents with tool calls; see `internal/agent/` for the evolving interfaces.
---
## CVE matching
Two-layer CVE detection:
1. **Offline KEV (CISA Known Exploited Vulnerabilities)** — ~1400 actively exploited CVEs, auto-downloaded to `~/.god-eye/kev.json` on first AI-enabled scan. Instant lookups, no network.
2. **NVD API (fallback)** — full CVE database, queried via function-calling from the deep model when the detected tech+version doesn't match KEV.
Update the KEV cache manually any time:
```bash
./god-eye update-db
./god-eye db-info
```
CVE matches emit an `eventbus.CVEMatch` event with the tech, version, severity, and KEV flag:
```
CRIT CVE nginx@1.18.0 → CVE-2021-23017
```
Integration with your output:
```json
{
  "host": "nginx-internal.target.com",
  "technologies": ["nginx/1.18.0"],
  "cve_findings": ["CVE-2021-23017"]
}
```
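
The two-layer lookup amounts to "local index first, remote query only on a miss". The Python sketch below is an assumption-heavy illustration: the `(tech, version)` index shape and the `nvd_lookup` callable are hypothetical, and the sample data used with it is not real KEV content.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical shape: (tech, version) -> CVE IDs found in the offline KEV cache.
KevIndex = Dict[Tuple[str, str], List[str]]


def match_cves(
    tech: str,
    version: str,
    kev_index: KevIndex,
    nvd_lookup: Callable[[str, str], List[str]],  # hypothetical NVD query
) -> List[dict]:
    """Return CVE matches, preferring the offline index over the network."""
    kev_hits = kev_index.get((tech.lower(), version), [])
    if kev_hits:
        return [{"tech": tech, "version": version, "cve": c, "kev": True} for c in kev_hits]
    # No offline hit: fall back to the slower, network-bound lookup.
    return [{"tech": tech, "version": version, "cve": c, "kev": False} for c in nvd_lookup(tech, version)]
```

Each returned record carries the same fields the `CVEMatch` event is described as having (tech, version, KEV flag), minus severity, which this sketch leaves out.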
---
## Custom models + YAML config
Override the profile's choices per-scan:
```bash
./god-eye -d target.com --pipeline --enable-ai \
--ai-fast-model qwen3:4b \
--ai-deep-model qwen3-coder:30b
```
Or persist in YAML:
```yaml
# god-eye.yaml
profile: bugbounty
ai:
  enabled: true
  url: http://localhost:11434 # point at a remote Ollama if you have one
  fast_model: qwen3:4b # triage
  deep_model: qwen3-coder:30b # deep analysis (MoE)
  cascade: true
  deep: true # run deep on every finding, not just triaged ones
  multi_agent: true
```
The wizard does not write this file yet. Persisting a non-default profile chosen through the wizard is a planned enhancement; for now, edit the YAML by hand.
---
## Troubleshooting
### "ollama not reachable at http://localhost:11434"
```bash
# Check the server is up
curl http://localhost:11434/api/tags
# If the port isn't listening
ollama serve &
```
If it's listening on a different host/port (e.g., remote machine):
```bash
./god-eye -d target.com --pipeline --enable-ai --ai-url http://10.0.0.10:11434
```
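
The same reachability check can be scripted, for example in a CI pre-flight step. This Python sketch uses only the standard library and the `/api/tags` endpoint mentioned above; the function name is an illustrative choice.

```python
import http.client
import urllib.request


def ollama_reachable(base_url="http://localhost:11434", timeout=3.0):
    """Return True if GET <base_url>/api/tags answers with HTTP 200."""
    url = base_url.rstrip("/") + "/api/tags"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, ValueError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error, or malformed URL.
        return False
```

Pass the same value you give to `--ai-url` when Ollama runs on another host.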
### "pull qwen3:1.7b: model not found"
Ollama can't resolve the tag. Make sure you're on an up-to-date Ollama — the registry changes names occasionally. Try:
```bash
ollama pull qwen3:1.7b
ollama list
```
If the pull works manually but god-eye fails, file an issue.
### Downloads hang at some percentage
Usually caused by a flaky connection to the Ollama registry. Ollama resumes partial downloads: stop god-eye with Ctrl-C and retry, and the pull picks up where the manifest/blob left off.
### AI findings feel too hallucinated
Three levers:
1. Drop the temperature. Edit `internal/ai/ollama.go:query()` (`temperature: 0.3` → `0.1`).
2. Use a bigger triage model (`--ai-profile heavy`).
3. Disable the cascade (`--ai-cascade=false`) so every finding gets the deep model — slower but higher quality floor.
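
For context on the first lever: Ollama's `/api/generate` endpoint accepts sampling settings in an `options` object. The Python sketch below only builds such a request to show where `temperature` sits. It is not the code in `internal/ai/ollama.go`, and the function name is hypothetical.

```python
import json
import urllib.request


def build_generate_request(base_url, model, prompt, temperature=0.1):
    """Build (but do not send) a non-streaming generate request for Ollama."""
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,
        # Lower temperature means less random sampling and steadier output.
        "options": {"temperature": temperature},
    }
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Sending it with `urllib.request.urlopen(req)` against a running Ollama lets you compare outputs at 0.3 and 0.1 before changing the Go source.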
### "deep model has low tok/sec on my MacBook Pro"
Expected for dense 14B. Switch to balanced profile: the MoE 30B is **faster** than dense 14B because only 3.3B params activate per token.
```bash
./god-eye --ai-profile balanced …
```
### High memory usage
Both models are loaded in Ollama when the scan starts. Options:
- Use the lean profile.
- Drop the deep model to `qwen2.5-coder:7b` (less capable but only ~5GB).
- Disable the cascade and use only the fast model: `--ai-cascade=false --ai-deep-model qwen3:1.7b`.
---
## Privacy & security model
- **Completely local:** Ollama runs on your machine and no data leaves it, unless you point `--ai-url` at a remote Ollama host.
- **Offline after pull:** once models are cached in `~/.ollama/`, no network is required.
- **Open-source infrastructure:** Ollama (MIT), models under their respective open licenses.
- **No telemetry:** God's Eye doesn't phone home.
- **Free forever:** no API keys, no usage caps.
**What the AI layer sees**: excerpts of HTTP responses, JS file content, technology banners, and your target domain. Do NOT enable AI if your engagement terms forbid third-party tooling touching response bodies — even though the LLM is local, some agreements treat automated analysis separately.
---
## Performance reference
Measured on an Apple M1 Pro, 16GB RAM, `ollama serve` running alongside the scan.
### Lean cascade
| Finding type | Triage latency | Deep latency | Total |
|----------------------|----------------|--------------|-------|
| Short HTTP response | 0.6s | 4.1s | 4.7s |
| Medium JS file (8KB) | 0.9s | 9.3s | 10.2s |
| Large JS bundle (64KB, truncated) | 1.1s | 14.2s | 15.3s |
### Balanced cascade (MoE)
| Finding type | Triage | Deep | Total |
|----------------------|--------|--------|--------|
| Short HTTP response | 0.8s | 3.2s | 4.0s |
| Medium JS (8KB) | 1.2s | 7.1s | 8.3s |
| Large JS (64KB) | 1.5s | 10.8s | 12.3s |
Net effect: in these measurements balanced cuts deep-analysis latency by roughly 22–24% while producing higher-quality findings, because the MoE architecture activates only 3.3B parameters per token.
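
The deep-latency saving can be recomputed directly from the two tables. The short Python check below uses only the numbers listed above; the variable names are illustrative.

```python
# Deep-model latency in seconds, copied from the two tables above.
lean_deep = {"short": 4.1, "medium": 9.3, "large": 14.2}
balanced_deep = {"short": 3.2, "medium": 7.1, "large": 10.8}


def reduction_pct(before, after):
    """Percentage of time saved when going from `before` to `after`."""
    return round(100 * (before - after) / before, 1)


savings = {k: reduction_pct(lean_deep[k], balanced_deep[k]) for k in lean_deep}
# savings == {"short": 22.0, "medium": 23.7, "large": 23.9}
```

The saving is slightly larger for the JS inputs than for the short HTTP response.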
### Scan-level benchmarks
See [BENCHMARK.md](BENCHMARK.md) for end-to-end scan times across profiles and target sizes.
---
## Reference — every AI-related flag
| Flag | Default | Description |
|-----------------------|------------------------|-------------------------------------------------------|
| `--enable-ai` | `false` | Turn on the AI layer |
| `--ai-profile` | `""` (uses individual flags) | Preset tier: `lean`/`balanced`/`heavy` |
| `--ai-url` | `http://localhost:11434` | Ollama API URL |
| `--ai-fast-model` | `qwen3:1.7b` | Triage model (Ollama tag) |
| `--ai-deep-model` | `qwen2.5-coder:14b` | Deep-analysis model (Ollama tag) |
| `--ai-cascade` | `true` | Use fast → deep cascade |
| `--ai-deep` | `false` | Run deep on every finding, skipping triage filter |
| `--multi-agent` | `false` | Enable 8-agent specialized orchestration |
| `--ai-verbose` | `false` | Log every Ollama query on stderr |
| `--ai-auto-pull` | `true` | Download missing models at startup |
Every flag has a matching YAML key in `config.yaml` under `ai:`.