🧠 AI Integration Guide
No API keys. No cloud. No telemetry. No usage caps. Runs on your laptop.
God's Eye v2 is the only open-source attack-surface tool with automated CVE correlation via a local LLM. Apache 2.4.7 detected → CVE-2026-34197 surfaced. WordPress 5.8.2 fingerprinted → known vulnerabilities chained. All through an Ollama cascade that triages, then drills down with a 30B Mixture-of-Experts model that activates just 3.3B parameters per token.
Everything stays on your machine. No data leaves your hardware.
Every scan ends with an AI SCAN BRIEF — severity totals, top exploitable chains, executive summary, and recommended next actions — framed in the terminal. Recorded live on scanme.nmap.org, models served by local Ollama.
🎯 End-of-scan brief
Every scan that produces findings ends with a framed summary the AI writes for you. Six parts:
┌── AI SCAN BRIEF — target.com ─────────────────────────────────────────────┐
│ Totals
│ Hosts: 17 Active: 13 AI findings: 23
│
│ Findings by severity
│ CRIT critical 2
│ [HIGH] high 7
│ [MED] medium 12
│ [LOW] low 4
│
│ Top exploitable chains
│ ▸ admin.target.com — Git Repository Exposed + Open Redirect
│ ▸ api.target.com — CORS Misconfiguration + JWT alg=none
│ ▸ legacy.target.com — Apache@2.4.7→CVE-2026-34197
│
│ AI agents that contributed
│ • http-analyzer 8 findings
│ • secret-validator 6 findings
│ • anomaly-detector 1 findings
│ • report-writer 1 findings
│
│ AI executive summary
│ Scan identified two critical issues requiring immediate attention:
│ exposed git repository on admin.target.com and an Apache 2.4.7 server
│ (end-of-life since 2014) running on legacy.target.com. The cross-host
│ anomaly detector flagged a dev-environment leak into production.
│
│ Recommended next actions
│ 1. Remove .git directory from admin.target.com (CRITICAL)
│ 2. Patch Apache 2.4.7 → vendor latest (affects legacy.target.com)
│ 3. Rotate JWT signing key on api.target.com
│ 4. Move dev.api.target.com off production DNS
│ 5. Investigate anomaly: shared SSH key across 3 hosts
└─────────────────────────────────────────────────────────────────────────────┘
It's generated by internal/modules/brief, runs in PhaseReporting after all other modules have finished, and only prints when findings exist (silent/JSON modes suppress it automatically).
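The gate itself is tiny. Here is an illustrative Go sketch of that behavior; the names (Finding, Options, maybePrintBrief) are made up for the example and are not the actual internal/modules/brief API:

```go
package main

import "fmt"

// Finding stands in for whatever the reporting phase aggregates.
type Finding struct {
	Host, Severity, Title string
}

// Options mirrors the two output modes that suppress the brief.
type Options struct {
	Silent, JSONOutput bool
}

// maybePrintBrief applies the gate described above: print only when findings
// exist and neither silent nor JSON output is active.
func maybePrintBrief(findings []Finding, opts Options) {
	if len(findings) == 0 || opts.Silent || opts.JSONOutput {
		return // nothing to say, or stdout must stay machine-parseable
	}
	fmt.Printf("AI SCAN BRIEF (%d findings)\n", len(findings))
	for _, f := range findings {
		fmt.Printf("  [%s] %s: %s\n", f.Severity, f.Host, f.Title)
	}
}

func main() {
	maybePrintBrief([]Finding{
		{Host: "admin.target.com", Severity: "CRIT", Title: "Git Repository Exposed"},
	}, Options{})
}
```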
Table of contents
- Quick start (5 minutes)
- How the cascade works
- AI profiles — pick your tier
- The interactive wizard
- Auto-pull of missing models
- Verbose mode
- Multi-agent orchestration
- CVE matching
- Custom models + YAML config
- Troubleshooting
- Privacy & security model
- Performance reference
Quick start (5 minutes)
1. Install Ollama
macOS / Linux:
curl -fsSL https://ollama.com/install.sh | sh
Windows: download from ollama.com/download.
Verify:
ollama --version
2. Start the Ollama server
ollama serve &
Listens on http://localhost:11434. Leave it running.
3. Run God's Eye
The easiest path — let the wizard handle everything:
./god-eye
It will:
- Ask which AI tier you want (lean / balanced / heavy / none)
- Check which models are already installed
- Offer to download missing ones (with live progress)
- Ask for your target domain
- Start the scan
Manual route:
# Defaults (lean tier): pulls qwen3:1.7b + qwen2.5-coder:14b if missing
./god-eye -d target.com --pipeline --enable-ai
# Balanced tier (32GB RAM): MoE deep model, 256K context
./god-eye -d target.com --pipeline --enable-ai --ai-profile balanced
# Heavy tier (64GB+ RAM): best quality
./god-eye -d target.com --pipeline --enable-ai --ai-profile heavy --ai-verbose
How the cascade works
Every finding goes through a two-stage pipeline:
┌──────────────────────────────────────────────┐
│ FINDING DETECTED │
│ (JS secret, HTTP response, tech version, │
│ takeover candidate, vuln, etc.) │
└──────────────┬───────────────────────────────┘
│
▼
┌──────────────────────────────────────────────┐
│ TIER 1: FAST TRIAGE │
│ • lean: qwen3:1.7b │
│ • balanced: qwen3:4b │
│ • heavy: qwen3:8b │
│ │
│ Output: "relevant" vs "skip" │
│ Latency: 0.5–2 seconds │
└──────────────┬───────────────────────────────┘
│ if relevant ↓
▼
┌──────────────────────────────────────────────┐
│ TIER 2: DEEP ANALYSIS │
│ • lean: qwen2.5-coder:14b │
│ • balanced: qwen3-coder:30b (MoE) │
│ • heavy: qwen3-coder:30b (MoE) │
│ │
│ Output: severity, description, PoC, │
│ remediation, OWASP + CVE matches │
│ Latency: 5–25 seconds │
└──────────────┬───────────────────────────────┘
│
▼
AIFinding event → store → report
Why two tiers? It's a pure cost/quality tradeoff — the fast model filters out ~70% of findings as non-issues without paying for the deep model's runtime. The cascade cuts total wall-clock time by 40–60% while keeping quality identical for the findings that actually surface.
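To make the control flow concrete, here is a minimal Go sketch of the cascade, with a stand-in function where the real Ollama client would go; askLLM, cascade, and the prompts are illustrative, not the tool's internals:

```go
package main

import (
	"fmt"
	"strings"
)

// askLLM is a placeholder for a call to a local model server;
// swap in a real Ollama client to run this against live models.
type askLLM func(model, prompt string) (string, error)

// cascade runs the cheap triage model first and only pays for the
// deep model when triage says the finding is relevant.
func cascade(ask askLLM, fastModel, deepModel, finding string) (string, error) {
	verdict, err := ask(fastModel, "Reply 'relevant' or 'skip':\n"+finding)
	if err != nil {
		return "", err
	}
	if !strings.Contains(strings.ToLower(verdict), "relevant") {
		return "", nil // triaged out: the deep model never runs
	}
	return ask(deepModel, "Assess severity, PoC and remediation:\n"+finding)
}

func main() {
	// Fake client so the sketch runs without Ollama installed.
	fake := func(model, prompt string) (string, error) {
		if strings.Contains(prompt, "Reply 'relevant'") {
			return "relevant", nil
		}
		return "HIGH: exposed .git directory; block access and rotate secrets", nil
	}
	out, _ := cascade(fake, "qwen3:1.7b", "qwen2.5-coder:14b", ".git/ directory returns 200")
	fmt.Println(out)
}
```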
Disable the cascade to always run the deep model (slower, no quality gain on most findings):
./god-eye -d target.com --pipeline --enable-ai --ai-cascade=false
AI profiles — pick your tier
| Profile | Triage model | Deep model | Disk pull | VRAM (Q4) | Best for |
|---|---|---|---|---|---|
| lean (default) | qwen3:1.7b | qwen2.5-coder:14b | ~10GB | ~10GB | 16GB RAM laptops, CI runners |
| balanced | qwen3:4b | qwen3-coder:30b (MoE) | ~20GB | ~17GB | 32GB RAM workstations — sweet spot |
| heavy | qwen3:8b | qwen3-coder:30b (MoE) | ~23GB | ~22GB | 64GB+ servers, top-quality runs |
Why MoE (Mixture of Experts) matters for balanced/heavy
qwen3-coder:30b is a Mixture-of-Experts model with 30B total parameters but only 3.3B active per token. Inference speed is closer to a dense 3B model while quality is closer to a dense 30B. Combined with a 256K context window it can ingest entire JS bundles + long HTTP response bodies in a single prompt — useful for the deep-analysis step.
Pick your profile with one question
"How much RAM can I dedicate to Ollama while the scan runs?"
- < 16GB → use lean, possibly shrinking with --ai-deep-model qwen2.5-coder:7b
- 16–32GB → lean (or balanced if your deep model fits)
- 32GB+ → balanced (recommended) or heavy
The wizard asks this for you if you're unsure.
The interactive wizard
Run ./god-eye with no -d flag in a terminal — the wizard launches automatically:
═══════════════════════════════════════════════════════════
God's Eye v2 — interactive setup
Ctrl-C to abort at any time.
═══════════════════════════════════════════════════════════
? Select AI tier
▸ 1) Lean — 16GB RAM · qwen3:1.7b + qwen2.5-coder:14b (default)
2) Balanced — 32GB RAM · qwen3:4b + qwen3-coder:30b (MoE, 256K ctx)
3) Heavy — 64GB RAM · qwen3:8b + qwen3-coder:30b (max quality)
4) No AI — Pure recon without LLM analysis
Choice [1]:
⚙ Checking Ollama at http://localhost:11434…
↓ Missing models: qwen3:1.7b, qwen2.5-coder:14b
? Download missing models now? [Y/n]
> y
↓ qwen3:1.7b
pulling manifest 10% 150MB / 1.4GB
pulling manifest 50% 700MB / 1.4GB
pulling manifest 100% 1.4GB / 1.4GB
verifying sha256 digest
writing manifest
success 100%
✓ qwen3:1.7b ready
↓ qwen2.5-coder:14b
…
✓ qwen2.5-coder:14b ready
? Target domain
> target.com
? Select scan profile
1) Quick
▸ 2) Bug bounty (default)
3) Pentest
4) ASM continuous
5) Stealth max
…
─── Scan summary ───
Target target.com
Scan profile bugbounty
AI tier lean
AI auto-pull yes
AI verbose no
Live view yes (v=1)
? Start scan? [Y/n]
>
Force the wizard even when -d is set (to review defaults):
./god-eye --wizard -d target.com
Auto-pull of missing models
When --enable-ai is on and --ai-auto-pull is true (default), God's Eye checks Ollama at startup and downloads missing models before the pipeline starts.
Under the hood:
- Reachability check — GET /api/tags. If unreachable, AI modules silently no-op and the scan proceeds without AI.
- Inventory compare — matches installed models (by tag) against the profile's required set. Handles the :latest suffix and tagless lookups.
- Stream pull — POST /api/pull with stream:true; NDJSON progress is parsed and throttled (a new status or a ≥5% jump triggers a log line). A sketch of this step follows below.
- Ready — returns control to the pipeline coordinator.
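As a rough sketch of the stream-pull step: the request body and progress fields (model, stream, status, total, completed) follow Ollama's documented /api/pull API, while the throttling heuristic and function names below are illustrative:

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// pullProgress holds the fields Ollama streams as NDJSON from /api/pull.
type pullProgress struct {
	Status    string `json:"status"`
	Total     int64  `json:"total"`
	Completed int64  `json:"completed"`
}

// pullModel streams a pull and logs a line on each new status or ≥5% jump.
func pullModel(baseURL, model string) error {
	body, _ := json.Marshal(map[string]any{"model": model, "stream": true})
	resp, err := http.Post(baseURL+"/api/pull", "application/json", bytes.NewReader(body))
	if err != nil {
		return err
	}
	defer resp.Body.Close()

	lastStatus, lastPct := "", -5
	scanner := bufio.NewScanner(resp.Body)
	for scanner.Scan() {
		var p pullProgress
		if err := json.Unmarshal(scanner.Bytes(), &p); err != nil {
			continue // skip malformed progress lines
		}
		pct := 0
		if p.Total > 0 {
			pct = int(p.Completed * 100 / p.Total)
		}
		if p.Status != lastStatus || pct >= lastPct+5 {
			fmt.Printf("↓ %s %s %d%%\n", model, p.Status, pct)
			lastStatus, lastPct = p.Status, pct
		}
	}
	return scanner.Err()
}

func main() {
	if err := pullModel("http://localhost:11434", "qwen3:1.7b"); err != nil {
		fmt.Println("pull failed:", err)
	}
}
```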
Disable auto-pull if you'd rather error out on missing models:
./god-eye -d target.com --pipeline --enable-ai --ai-auto-pull=false
When the wizard runs it asks explicitly before downloading. Non-wizard mode pulls silently unless --ai-verbose is set.
Verbose mode
See every Ollama interaction in real time on stderr:
./god-eye -d target.com --pipeline --enable-ai --ai-verbose --live
Stderr output:
[ai] → qwen3:1.7b prompt=2341B timeout=60s
[ai] ← qwen3:1.7b response=512B 1.3s
[ai] → qwen2.5-coder:14b prompt=8291B timeout=120s
[ai] ← qwen2.5-coder:14b response=1832B 8.7s
[ai] → qwen2.5-coder:14b prompt=5123B timeout=120s
[ai] ← qwen2.5-coder:14b response=946B 5.2s
Useful for:
- Debugging slow runs (spot the 60s+ queries)
- Tuning the triage threshold (are "skip" decisions correct?)
- Verifying the cascade is actually running (triage fires before deep)
- Sanity-checking prompt sizes (large prompts = context-bloat → fix the caller)
Verbose goes to stderr so stdout JSON / silent modes still parse cleanly.
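A tiny, purely illustrative Go sketch of that separation: diagnostics on stderr, findings as JSON on stdout, so piping stdout into jq never sees the [ai] lines.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"time"
)

// logAI mimics the stderr-only verbose lines so stdout stays pure JSON.
func logAI(direction, model string, size int, dur time.Duration) {
	fmt.Fprintf(os.Stderr, "[ai] %s %s %dB %.1fs\n", direction, model, size, dur.Seconds())
}

func main() {
	logAI("→", "qwen3:1.7b", 2341, 1300*time.Millisecond)

	// Findings go to stdout as JSON; a downstream parser only ever sees this.
	json.NewEncoder(os.Stdout).Encode(map[string]string{
		"host":    "target.com",
		"finding": "Git Repository Exposed",
	})
}
```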
Multi-agent orchestration
In addition to the cascade, God's Eye ships an 8-agent specialized system (inherited from v1). Enabled automatically in bugbounty and pentest profiles, or explicitly:
./god-eye -d target.com --pipeline --enable-ai --multi-agent
| Agent | Specialty |
|---|---|
| XSS | Cross-Site Scripting (DOM, Reflected, Stored) |
| SQLi | SQL Injection (error, blind, time-based) |
| Auth | Auth bypass, IDOR, JWT, OAuth, SAML, session |
| API | REST/GraphQL, CORS, rate limiting |
| Crypto | TLS / cipher issues, weak keys |
| Secrets | API keys, tokens, hardcoded credentials |
| Headers | CSP, HSTS, cookie flags, SameSite |
| General | Fallback for unclassified findings |
How it works:
- A coordinator agent classifies each raw finding (regex + short LLM call)
- Routes it to the appropriate specialist
- Specialist analyzes with domain-specific knowledge + OWASP-aligned remediation templates
- Emits an AIFinding event with a confidence score (the classify-and-route step is sketched below)
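As a toy illustration of the routing step, the sketch below uses a plain keyword classifier where the real coordinator combines regexes with a short LLM call; the agent names match the table above, everything else is made up:

```go
package main

import (
	"fmt"
	"strings"
)

// classify is a stand-in for the coordinator's regex + short-LLM-call step.
func classify(finding string) string {
	f := strings.ToLower(finding)
	switch {
	case strings.Contains(f, "xss") || strings.Contains(f, "script"):
		return "XSS"
	case strings.Contains(f, "sql"):
		return "SQLi"
	case strings.Contains(f, "jwt") || strings.Contains(f, "oauth"):
		return "Auth"
	case strings.Contains(f, "cors") || strings.Contains(f, "graphql"):
		return "API"
	case strings.Contains(f, "api key") || strings.Contains(f, "secret"):
		return "Secrets"
	case strings.Contains(f, "csp") || strings.Contains(f, "hsts"):
		return "Headers"
	default:
		return "General"
	}
}

func main() {
	for _, raw := range []string{
		"Reflected XSS in search parameter",
		"JWT accepts alg=none",
		"Wildcard CORS with credentials",
	} {
		fmt.Printf("%-40s → %s agent\n", raw, classify(raw))
	}
}
```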
This is a v1-era implementation. Phase 3 (in progress) introduces native Planner/Worker agents with tool calls — see internal/agent/ for the evolving interfaces.
CVE matching
Two-layer CVE detection:
- Offline KEV (CISA Known Exploited Vulnerabilities) — ~1400 actively exploited CVEs, auto-downloaded to ~/.god-eye/kev.json on the first AI-enabled scan. Instant lookups, no network (a lookup sketch follows below).
- NVD API (fallback) — full CVE database, queried via function calling from the deep model when the detected tech + version doesn't match KEV.
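The offline layer boils down to a lookup over the cached KEV JSON. A minimal Go sketch, assuming CISA's published field names (cveID, vendorProject, product); version matching and the NVD fallback are omitted, and the struct and function names are illustrative:

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// kevCatalog mirrors the relevant fields of CISA's KEV feed.
type kevCatalog struct {
	Vulnerabilities []struct {
		CVEID         string `json:"cveID"`
		VendorProject string `json:"vendorProject"`
		Product       string `json:"product"`
	} `json:"vulnerabilities"`
}

// kevMatches returns KEV CVE IDs whose product matches the detected technology.
func kevMatches(cachePath, tech string) ([]string, error) {
	raw, err := os.ReadFile(cachePath)
	if err != nil {
		return nil, err
	}
	var cat kevCatalog
	if err := json.Unmarshal(raw, &cat); err != nil {
		return nil, err
	}
	var ids []string
	for _, v := range cat.Vulnerabilities {
		if strings.EqualFold(v.Product, tech) {
			ids = append(ids, v.CVEID)
		}
	}
	return ids, nil
}

func main() {
	home, _ := os.UserHomeDir()
	ids, err := kevMatches(filepath.Join(home, ".god-eye", "kev.json"), "nginx")
	if err != nil {
		fmt.Println("no KEV cache yet:", err)
		return
	}
	fmt.Println("KEV hits:", ids)
}
```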
Update the KEV cache manually any time:
./god-eye update-db
./god-eye db-info
CVE matches emit an eventbus.CVEMatch event with the tech, version, severity, and KEV flag:
CRIT CVE nginx@1.18.0 → CVE-2021-23017
Integration with your output:
{
"host": "nginx-internal.target.com",
"technologies": ["nginx/1.18.0"],
"cve_findings": ["CVE-2021-23017"]
}
Custom models + YAML config
Override the profile's choices per-scan:
./god-eye -d target.com --pipeline --enable-ai \
--ai-fast-model qwen3:4b \
--ai-deep-model qwen3-coder:30b
Or persist in YAML:
# god-eye.yaml
profile: bugbounty
ai:
enabled: true
url: http://localhost:11434 # point at a remote Ollama if you have one
fast_model: qwen3:4b # triage
deep_model: qwen3-coder:30b # deep analysis (MoE)
cascade: true
deep: true # run deep on every finding, not just triaged ones
multi_agent: true
Having the wizard write these keys when you pick a non-default profile is a planned enhancement; for now, edit the YAML by hand.
Troubleshooting
"ollama not reachable at http://localhost:11434"
# Check the server is up
curl http://localhost:11434/api/tags
# If the port isn't listening
ollama serve &
If it's listening on a different host/port (e.g., remote machine):
./god-eye -d target.com --pipeline --enable-ai --ai-url http://10.0.0.10:11434
"pull qwen3:1.7b: model not found"
Ollama can't resolve the tag. Make sure you're on an up-to-date Ollama — the registry changes names occasionally. Try:
ollama pull qwen3:1.7b
ollama list
If the pull works manually but god-eye fails, file an issue.
Downloads hang at some percentage
Usually network flakiness with the Ollama registry. Ollama resumes interrupted downloads, so kill god-eye with Ctrl-C and retry — it will pick up where the manifest/blob left off.
AI findings feel too hallucinated
Three levers:
- Drop the temperature. Edit internal/ai/ollama.go:query() (temperature 0.3 → 0.1).
- Use a bigger triage model (--ai-profile heavy).
- Disable the cascade (--ai-cascade=false) so every finding gets the deep model — slower, but a higher quality floor.
"deep model has low tok/sec on my MacBook Pro"
Expected for dense 14B. Switch to balanced profile: the MoE 30B is faster than dense 14B because only 3.3B params activate per token.
./god-eye --ai-profile balanced …
High memory usage
Both models are loaded in Ollama when the scan starts. Options:
- Use the lean profile.
- Drop the deep model to qwen2.5-coder:7b (less capable, but only ~5GB).
- Disable the cascade and use only the fast model: --ai-cascade=false --ai-deep-model qwen3:1.7b.
Privacy & security model
✅ Completely local — Ollama runs on your machine; no data leaves it.
✅ Offline after pull — once models are cached in ~/.ollama/, no network is required.
✅ Open-source infrastructure — Ollama (MIT), models under their respective open licenses.
✅ No telemetry — God's Eye doesn't phone home.
✅ Free forever — no API keys, no usage caps.
What the AI layer sees: excerpts of HTTP responses, JS file content, technology banners, and your target domain. Do NOT enable AI if your engagement terms forbid third-party tooling touching response bodies — even though the LLM is local, some agreements treat automated analysis separately.
Performance reference
Measured on an Apple M1 Pro, 16GB RAM, ollama serve running alongside the scan.
Lean cascade
| Finding type | Triage latency | Deep latency | Total |
|---|---|---|---|
| Short HTTP response | 0.6s | 4.1s | 4.7s |
| Medium JS file (8KB) | 0.9s | 9.3s | 10.2s |
| Large JS bundle (64KB, truncated) | 1.1s | 14.2s | 15.3s |
Balanced cascade (MoE)
| Finding type | Triage | Deep | Total |
|---|---|---|---|
| Short HTTP response | 0.8s | 3.2s | 4.0s |
| Medium JS (8KB) | 1.2s | 7.1s | 8.3s |
| Large JS (64KB) | 1.5s | 10.8s | 12.3s |
Net effect: balanced is ~20% faster on deep analysis despite producing higher-quality findings, thanks to the MoE architecture activating only 3.3B parameters per token.
Scan-level benchmarks
See BENCHMARK.md for end-to-end scan times across profiles and target sizes.
Reference — every AI-related flag
| Flag | Default | Description |
|---|---|---|
| --enable-ai | false | Turn on the AI layer |
| --ai-profile | "" (uses individual flags) | Preset tier: lean/balanced/heavy |
| --ai-url | http://localhost:11434 | Ollama API URL |
| --ai-fast-model | qwen3:1.7b | Triage model (Ollama tag) |
| --ai-deep-model | qwen2.5-coder:14b | Deep-analysis model (Ollama tag) |
| --ai-cascade | true | Use the fast → deep cascade |
| --ai-deep | false | Run deep analysis on every finding, skipping the triage filter |
| --multi-agent | false | Enable 8-agent specialized orchestration |
| --ai-verbose | false | Log every Ollama query on stderr |
| --ai-auto-pull | true | Download missing models at startup |
Every flag has a matching YAML key in config.yaml under ai:.