🧠 AI Integration Guide


No API keys. No cloud. No telemetry. No usage caps. Runs on your laptop.

God's Eye v2 is the only open-source attack-surface tool with automated CVE correlation via a local LLM. Apache 2.4.7 detected → CVE-2026-34197 surfaced. WordPress 5.8.2 fingerprinted → known vulnerabilities chained. All through an Ollama cascade that triages, then drills down with a 30B Mixture-of-Experts model that activates just 3.3B parameters per token.

Everything stays on your machine. No data leaves your hardware.

AI cascade against Apache 2.4.7 on scanme.nmap.org

Every scan ends with an AI SCAN BRIEF — severity totals, top exploitable chains, executive summary, and recommended next actions — framed in the terminal. Recorded live on scanme.nmap.org, models served by local Ollama.


🎯 End-of-scan brief

Every scan that produces findings ends with a framed summary the AI writes for you. Six parts:

┌──  AI SCAN BRIEF — target.com  ─────────────────────────────────────────────┐
│ Totals
│   Hosts: 17   Active: 13   AI findings: 23
│
│ Findings by severity
│    CRIT   critical   2
│   [HIGH]  high       7
│   [MED]   medium     12
│   [LOW]   low        4
│
│ Top exploitable chains
│   ▸ admin.target.com  — Git Repository Exposed + Open Redirect
│   ▸ api.target.com    — CORS Misconfiguration + JWT alg=none
│   ▸ legacy.target.com — Apache@2.4.7→CVE-2026-34197
│
│ AI agents that contributed
│   • http-analyzer       8 findings
│   • secret-validator    6 findings
│   • anomaly-detector    1 findings
│   • report-writer       1 findings
│
│ AI executive summary
│   Scan identified two critical issues requiring immediate attention:
│   exposed git repository on admin.target.com and an Apache 2.4.7 server
│   (end-of-life since 2014) running on legacy.target.com. The cross-host
│   anomaly detector flagged a dev-environment leak into production.
│
│ Recommended next actions
│   1. Remove .git directory from admin.target.com (CRITICAL)
│   2. Patch Apache 2.4.7 → vendor latest (affects legacy.target.com)
│   3. Rotate JWT signing key on api.target.com
│   4. Move dev.api.target.com off production DNS
│   5. Investigate anomaly: shared SSH key across 3 hosts
└─────────────────────────────────────────────────────────────────────────────┘

It's generated by internal/modules/brief, runs in PhaseReporting after all other modules have finished, and only prints when findings exist (silent/JSON modes suppress it automatically).
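
A minimal sketch of that gating, assuming a plain findings slice and an options struct; the package, type, and function names below are illustrative, not the actual internal/modules/brief API.

package brief

import "fmt"

// Finding and Options are placeholder types for this sketch.
type Finding struct {
    Severity string
    Title    string
}

type Options struct {
    Silent bool // silent mode: suppress human-readable output
    JSON   bool // JSON mode: keep stdout machine-parseable
}

// PrintBrief renders the framed summary only when there are findings and the
// output mode allows human-readable text, mirroring the behavior described above.
func PrintBrief(findings []Finding, opts Options) {
    if len(findings) == 0 || opts.Silent || opts.JSON {
        return
    }
    fmt.Println("┌── AI SCAN BRIEF ──────────────────────────┐")
    for _, f := range findings {
        fmt.Printf("│ [%s] %s\n", f.Severity, f.Title)
    }
    fmt.Println("└───────────────────────────────────────────┘")
}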


Table of contents

  1. Quick start (5 minutes)
  2. How the cascade works
  3. AI profiles — pick your tier
  4. The interactive wizard
  5. Auto-pull of missing models
  6. Verbose mode
  7. Multi-agent orchestration
  8. CVE matching
  9. Custom models + YAML config
  10. Troubleshooting
  11. Privacy & security model
  12. Performance reference

Quick start (5 minutes)

1. Install Ollama

macOS / Linux:

curl -fsSL https://ollama.com/install.sh | sh

Windows: download from ollama.com/download.

Verify:

ollama --version

2. Start the Ollama server

ollama serve &

Listens on http://localhost:11434. Leave it running.

3. Run God's Eye

The easiest path — let the wizard handle everything:

./god-eye

It will:

  1. Ask which AI tier you want (lean / balanced / heavy / none)
  2. Check which models are already installed
  3. Offer to download missing ones (with live progress)
  4. Ask for your target domain
  5. Start the scan

Manual route:

# Defaults (lean tier): pulls qwen3:1.7b + qwen2.5-coder:14b if missing
./god-eye -d target.com --pipeline --enable-ai

# Balanced tier (32GB RAM): MoE deep model, 256K context
./god-eye -d target.com --pipeline --enable-ai --ai-profile balanced

# Heavy tier (64GB+ RAM): best quality
./god-eye -d target.com --pipeline --enable-ai --ai-profile heavy --ai-verbose

How the cascade works

Every finding goes through a two-stage pipeline:

┌──────────────────────────────────────────────┐
│  FINDING DETECTED                            │
│  (JS secret, HTTP response, tech version,    │
│   takeover candidate, vuln, etc.)            │
└──────────────┬───────────────────────────────┘
               │
               ▼
┌──────────────────────────────────────────────┐
│  TIER 1: FAST TRIAGE                         │
│  • lean:     qwen3:1.7b                      │
│  • balanced: qwen3:4b                        │
│  • heavy:    qwen3:8b                        │
│                                              │
│  Output: "relevant" vs "skip"                │
│  Latency: 0.5–2 seconds                      │
└──────────────┬───────────────────────────────┘
               │  if relevant ↓
               ▼
┌──────────────────────────────────────────────┐
│  TIER 2: DEEP ANALYSIS                       │
│  • lean:     qwen2.5-coder:14b               │
│  • balanced: qwen3-coder:30b (MoE)           │
│  • heavy:    qwen3-coder:30b (MoE)           │
│                                              │
│  Output: severity, description, PoC,         │
│          remediation, OWASP + CVE matches    │
│  Latency: 5–25 seconds                       │
└──────────────┬───────────────────────────────┘
               │
               ▼
         AIFinding event → store → report

Why two tiers? Pure cost/quality — the fast model filters ~70% of findings as non-issues without paying for the deep model's runtime. Cascades reduce total wall-clock by 40–60% while keeping quality identical for what actually surfaces.
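
In code, the cascade is essentially a guard in front of the deep call. The sketch below assumes a tiny client interface with a Generate(model, prompt) method and the lean-tier models; the real pipeline in internal/ai is more involved.

package ai

import "strings"

// generator is a stand-in for the Ollama client used by the AI layer.
type generator interface {
    Generate(model, prompt string) (string, error)
}

// analyzeFinding runs the lean-profile cascade: triage first, deep analysis
// only when the triage verdict says the finding is relevant.
func analyzeFinding(llm generator, finding string) (string, error) {
    verdict, err := llm.Generate("qwen3:1.7b",
        "Answer exactly 'relevant' or 'skip': is this finding worth deep analysis?\n"+finding)
    if err != nil {
        return "", err
    }
    if !strings.Contains(strings.ToLower(verdict), "relevant") {
        return "", nil // triaged out: no deep-model cost for this finding
    }
    return llm.Generate("qwen2.5-coder:14b",
        "Report severity, description, PoC, remediation, OWASP and CVE matches for:\n"+finding)
}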

Disable the cascade to always run the deep model (slower, no quality gain on most findings):

./god-eye -d target.com --pipeline --enable-ai --ai-cascade=false

AI profiles — pick your tier

Profile          Triage model   Deep model              Disk pull   VRAM (Q4)   Best for
lean (default)   qwen3:1.7b     qwen2.5-coder:14b       ~10GB       ~10GB       16GB RAM laptops, CI runners
balanced         qwen3:4b       qwen3-coder:30b (MoE)   ~20GB       ~17GB       32GB RAM workstations — sweet spot
heavy            qwen3:8b       qwen3-coder:30b (MoE)   ~23GB       ~22GB       64GB+ servers, top-quality runs

Why MoE (Mixture of Experts) matters for balanced/heavy

qwen3-coder:30b is a Mixture-of-Experts model with 30B total parameters but only 3.3B active per token. Inference speed is closer to a dense 3B model while quality is closer to a dense 30B. Combined with a 256K context window it can ingest entire JS bundles + long HTTP response bodies in a single prompt — useful for the deep-analysis step.

Pick your profile with one question

"How much RAM can I dedicate to Ollama while the scan runs?"

  • < 16GB → use lean, possibly shrink with --ai-deep-model qwen2.5-coder:7b
  • 16–32GB → lean (or balanced if your deep model fits)
  • 32GB+ → balanced (recommended) or heavy

The wizard asks this for you if you're unsure.


The interactive wizard

Run ./god-eye with no -d flag in a terminal — the wizard launches automatically:

═══════════════════════════════════════════════════════════
  God's Eye v2 — interactive setup
  Ctrl-C to abort at any time.
═══════════════════════════════════════════════════════════

? Select AI tier
  ▸ 1) Lean     — 16GB RAM · qwen3:1.7b + qwen2.5-coder:14b (default)
    2) Balanced — 32GB RAM · qwen3:4b + qwen3-coder:30b (MoE, 256K ctx)
    3) Heavy    — 64GB RAM · qwen3:8b + qwen3-coder:30b (max quality)
    4) No AI    — Pure recon without LLM analysis
  Choice [1]:

⚙ Checking Ollama at http://localhost:11434…
  ↓ Missing models: qwen3:1.7b, qwen2.5-coder:14b
? Download missing models now? [Y/n]
  > y
↓ qwen3:1.7b
  pulling manifest         10%  150MB / 1.4GB
  pulling manifest         50%  700MB / 1.4GB
  pulling manifest        100%  1.4GB / 1.4GB
  verifying sha256 digest
  writing manifest
  success                 100%
✓ qwen3:1.7b ready
↓ qwen2.5-coder:14b
  …
✓ qwen2.5-coder:14b ready

? Target domain
  > target.com

? Select scan profile
    1) Quick
  ▸ 2) Bug bounty (default)
    3) Pentest
    4) ASM continuous
    5) Stealth max

…

─── Scan summary ───
  Target           target.com
  Scan profile     bugbounty
  AI tier          lean
  AI auto-pull     yes
  AI verbose       no
  Live view        yes (v=1)

? Start scan? [Y/n]
  >

Force the wizard even when -d is set (to review defaults):

./god-eye --wizard -d target.com

Auto-pull of missing models

When --enable-ai is on and --ai-auto-pull is true (default), God's Eye checks Ollama at startup and downloads missing models before the pipeline starts.

Under the hood:

  1. Reachability check — GET /api/tags. If unreachable, AI modules silently no-op and the scan proceeds without AI.
  2. Inventory compare — matches installed models (by tag) against the profile's required set. Handles :latest suffix and tagless lookups.
  3. Stream pull — POST /api/pull with stream:true, NDJSON progress parsed and throttled (new status or ≥5% jump triggers a log line); see the sketch after this list.
  4. Ready — returns control to the pipeline coordinator.
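
A condensed Go sketch of that handshake against Ollama's public HTTP API (GET /api/tags for the inventory, POST /api/pull for the streamed NDJSON download); error handling and the surrounding God's Eye plumbing are intentionally omitted.

package aipull

import (
    "bufio"
    "bytes"
    "encoding/json"
    "fmt"
    "net/http"
)

// ensureModel checks the local Ollama inventory and pulls the model if it is
// missing, printing throttle-worthy progress as it streams.
func ensureModel(base, model string) error {
    // Steps 1-2: reachability + inventory via GET /api/tags.
    resp, err := http.Get(base + "/api/tags")
    if err != nil {
        return fmt.Errorf("ollama not reachable at %s: %w", base, err)
    }
    var tags struct {
        Models []struct {
            Name string `json:"name"`
        } `json:"models"`
    }
    json.NewDecoder(resp.Body).Decode(&tags)
    resp.Body.Close()
    for _, m := range tags.Models {
        if m.Name == model || m.Name == model+":latest" {
            return nil // already installed
        }
    }

    // Step 3: stream pull via POST /api/pull; each response line is a JSON object.
    payload, _ := json.Marshal(map[string]any{"name": model, "stream": true})
    pull, err := http.Post(base+"/api/pull", "application/json", bytes.NewReader(payload))
    if err != nil {
        return err
    }
    defer pull.Body.Close()
    sc := bufio.NewScanner(pull.Body)
    for sc.Scan() {
        var p struct {
            Status    string `json:"status"`
            Completed int64  `json:"completed"`
            Total     int64  `json:"total"`
        }
        if json.Unmarshal(sc.Bytes(), &p) == nil && p.Total > 0 {
            fmt.Printf("%s %3d%%\n", p.Status, p.Completed*100/p.Total)
        }
    }
    return sc.Err()
}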

Disable auto-pull if you'd rather error out on missing models:

./god-eye -d target.com --pipeline --enable-ai --ai-auto-pull=false

When the wizard runs it asks explicitly before downloading. Non-wizard mode pulls silently unless --ai-verbose is set.


Verbose mode

See every Ollama interaction in real time on stderr:

./god-eye -d target.com --pipeline --enable-ai --ai-verbose --live

Stderr output:

[ai] → qwen3:1.7b  prompt=2341B timeout=60s
[ai] ← qwen3:1.7b  response=512B  1.3s
[ai] → qwen2.5-coder:14b  prompt=8291B timeout=120s
[ai] ← qwen2.5-coder:14b  response=1832B  8.7s
[ai] → qwen2.5-coder:14b  prompt=5123B timeout=120s
[ai] ← qwen2.5-coder:14b  response=946B  5.2s

Useful for:

  • Debugging slow runs (spot the 60s+ queries)
  • Tuning the triage threshold (are "skip" decisions correct?)
  • Verifying the cascade is actually running (triage fires before deep)
  • Sanity-checking prompt sizes (large prompts = context-bloat → fix the caller)

Verbose goes to stderr so stdout JSON / silent modes still parse cleanly.


Multi-agent orchestration

In addition to the cascade, God's Eye ships an 8-agent specialized system (inherited from v1). Enabled automatically in bugbounty and pentest profiles, or explicitly:

./god-eye -d target.com --pipeline --enable-ai --multi-agent

Agent     Specialty
XSS       Cross-Site Scripting (DOM, Reflected, Stored)
SQLi      SQL Injection (error, blind, time-based)
Auth      Auth bypass, IDOR, JWT, OAuth, SAML, session
API       REST/GraphQL, CORS, rate limiting
Crypto    TLS / cipher issues, weak keys
Secrets   API keys, tokens, hardcoded credentials
Headers   CSP, HSTS, cookie flags, SameSite
General   Fallback for unclassified findings

How it works:

  1. A coordinator agent classifies each raw finding (regex + short LLM call)
  2. Routes it to the appropriate specialist
  3. Specialist analyzes with domain-specific knowledge + OWASP-aligned remediation templates
  4. Emits an AIFinding event with confidence score

This is a v1-era implementation. Phase 3 (in progress) introduces native Planner/Worker agents with tool calls — see internal/agent/ for the evolving interfaces.
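
As a rough illustration of the classify-and-route step, the sketch below uses the agent names from the table; the types, the classifier heuristics, and the method names are placeholders rather than the real internal implementation.

package agents

import "strings"

// AIFinding mirrors the event described above: a specialist's verdict plus a
// confidence score (field names are assumptions, not the real event type).
type AIFinding struct {
    Severity   string
    Summary    string
    Confidence float64
}

// Agent is the interface every specialist implements in this sketch.
type Agent interface {
    Analyze(finding string) (AIFinding, error)
}

// classify stands in for the coordinator's regex heuristics plus short LLM call.
func classify(finding string) string {
    f := strings.ToLower(finding)
    switch {
    case strings.Contains(f, "select ") || strings.Contains(f, "sql"):
        return "SQLi"
    case strings.Contains(f, "<script"):
        return "XSS"
    default:
        return "General"
    }
}

// route sends the finding to the matching specialist, falling back to General.
func route(finding string, specialists map[string]Agent) (AIFinding, error) {
    agent, ok := specialists[classify(finding)]
    if !ok {
        agent = specialists["General"]
    }
    return agent.Analyze(finding)
}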


CVE matching

Two-layer CVE detection:

  1. Offline KEV (CISA Known Exploited Vulnerabilities) — ~1400 actively exploited CVEs, auto-downloaded to ~/.god-eye/kev.json on first AI-enabled scan. Instant lookups, no network.
  2. NVD API (fallback) — full CVE database, queried via function-calling from the deep model when the detected tech+version doesn't match KEV.
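
As a rough illustration of the offline-first layer, this snippet loads a cached KEV catalog and matches detected technology by product name. The JSON layout mirrors CISA's public feed; the actual kev.json format and matching logic in God's Eye may differ, and a miss here is what triggers the NVD fallback.

package kev

import (
    "encoding/json"
    "os"
    "strings"
)

// kevCatalog follows the field names of CISA's published KEV feed.
type kevCatalog struct {
    Vulnerabilities []struct {
        CVEID   string `json:"cveID"`
        Product string `json:"product"`
    } `json:"vulnerabilities"`
}

// matchKEV returns CVE IDs from the local catalog whose product matches the
// detected technology (e.g. "nginx"). An empty result would hand off to the
// NVD query driven by the deep model.
func matchKEV(path, tech string) ([]string, error) {
    raw, err := os.ReadFile(path) // e.g. ~/.god-eye/kev.json
    if err != nil {
        return nil, err
    }
    var cat kevCatalog
    if err := json.Unmarshal(raw, &cat); err != nil {
        return nil, err
    }
    var hits []string
    for _, v := range cat.Vulnerabilities {
        if strings.EqualFold(v.Product, tech) {
            hits = append(hits, v.CVEID)
        }
    }
    return hits, nil
}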

Update the KEV cache manually any time:

./god-eye update-db
./god-eye db-info

CVE matches emit an eventbus.CVEMatch event with the tech, version, severity, and KEV flag:

CRIT  CVE  nginx@1.18.0 → CVE-2021-23017

Integration with your output:

{
  "host": "nginx-internal.target.com",
  "technologies": ["nginx/1.18.0"],
  "cve_findings": ["CVE-2021-23017"]
}

Custom models + YAML config

Override the profile's choices per-scan:

./god-eye -d target.com --pipeline --enable-ai \
  --ai-fast-model qwen3:4b \
  --ai-deep-model qwen3-coder:30b

Or persist in YAML:

# god-eye.yaml
profile: bugbounty

ai:
  enabled: true
  url: http://localhost:11434       # point at a remote Ollama if you have one
  fast_model: qwen3:4b               # triage
  deep_model: qwen3-coder:30b        # deep analysis (MoE)
  cascade: true
  deep: true                         # run deep on every finding, not just triaged ones
  multi_agent: true

The wizard writes these when you pick a non-default profile through it (future enhancement; right now you edit YAML by hand).


Troubleshooting

"ollama not reachable at http://localhost:11434"

# Check the server is up
curl http://localhost:11434/api/tags

# If the port isn't listening
ollama serve &

If it's listening on a different host/port (e.g., remote machine):

./god-eye -d target.com --pipeline --enable-ai --ai-url http://10.0.0.10:11434

"pull qwen3:1.7b: model not found"

Ollama can't resolve the tag. Make sure you're on an up-to-date Ollama — the registry changes names occasionally. Try:

ollama pull qwen3:1.7b
ollama list

If the pull works manually but god-eye fails, file an issue.

Downloads hang at some percentage

Usually network flakiness with the Ollama registry. Ollama resumes; kill god-eye with Ctrl-C and retry — it will pick up where the manifest/blob left off.

AI findings feel too hallucinated

Three levers:

  1. Drop the temperature. Edit internal/ai/ollama.go:query() (temperature: 0.3 → 0.1); see the snippet after this list.
  2. Use a bigger triage model (--ai-profile heavy).
  3. Disable the cascade (--ai-cascade=false) so every finding gets the deep model — slower but higher quality floor.
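
For orientation, this is roughly where that knob lives in an Ollama /api/generate payload (the temperature option is part of Ollama's public API; how internal/ai/ollama.go actually builds its request is an assumption here):

package ai

import "encoding/json"

// buildGenerateRequest shows the temperature option in an Ollama /api/generate
// payload; lowering it from 0.3 to 0.1 makes the deep model less speculative.
func buildGenerateRequest(model, prompt string) ([]byte, error) {
    return json.Marshal(map[string]any{
        "model":  model,
        "prompt": prompt,
        "stream": false,
        "options": map[string]any{
            "temperature": 0.1,
        },
    })
}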

"deep model has low tok/sec on my MacBook Pro"

Expected for dense 14B. Switch to balanced profile: the MoE 30B is faster than dense 14B because only 3.3B params activate per token.

./god-eye --ai-profile balanced …

High memory usage

Both models are loaded in Ollama when the scan starts. Options:

  • Use the lean profile.
  • Drop the deep model to qwen2.5-coder:7b (less capable but only ~5GB).
  • Disable the cascade and use only the fast model: --ai-cascade=false --ai-deep-model qwen3:1.7b.

Privacy & security model

  • Completely local — Ollama runs on your machine; no data leaves it.
  • Offline after pull — once models are cached in ~/.ollama/, no network is required.
  • Open-source infrastructure — Ollama (MIT), models under their respective open licenses.
  • No telemetry — God's Eye doesn't phone home.
  • Free forever — no API keys, no usage caps.

What the AI layer sees: excerpts of HTTP responses, JS file content, technology banners, and your target domain. Do NOT enable AI if your engagement terms forbid third-party tooling touching response bodies — even though the LLM is local, some agreements treat automated analysis separately.


Performance reference

Measured on an Apple M1 Pro, 16GB RAM, ollama serve running alongside the scan.

Lean cascade

Finding type                        Triage latency   Deep latency   Total
Short HTTP response                 0.6s             4.1s           4.7s
Medium JS file (8KB)                0.9s             9.3s           10.2s
Large JS bundle (64KB, truncated)   1.1s             14.2s          15.3s

Balanced cascade (MoE)

Finding type          Triage   Deep    Total
Short HTTP response   0.8s     3.2s    4.0s
Medium JS (8KB)       1.2s     7.1s    8.3s
Large JS (64KB)       1.5s     10.8s   12.3s

Net effect: balanced is ~20% faster on deep analysis despite producing higher-quality findings, thanks to the MoE architecture activating only 3.3B parameters per token.

Scan-level benchmarks

See BENCHMARK.md for end-to-end scan times across profiles and target sizes.


Flag              Default                      Description
--enable-ai       false                        Turn on the AI layer
--ai-profile      "" (uses individual flags)   Preset tier: lean/balanced/heavy
--ai-url          http://localhost:11434       Ollama API URL
--ai-fast-model   qwen3:1.7b                   Triage model (Ollama tag)
--ai-deep-model   qwen2.5-coder:14b            Deep-analysis model (Ollama tag)
--ai-cascade      true                         Use fast → deep cascade
--ai-deep         false                        Run deep on every finding, skipping triage filter
--multi-agent     false                        Enable 8-agent specialized orchestration
--ai-verbose      false                        Log every Ollama query on stderr
--ai-auto-pull    true                         Download missing models at startup

Every flag has a matching YAML key in config.yaml under ai:.