god-eye/internal/modules/github/github.go
Vyntral 3a4c230aa7 feat: v2.0 full rewrite — event-driven pipeline, AI + Nuclei + proxy
Complete architectural overhaul. Replaces the v0.1 monolithic scanner
with an event-driven pipeline of auto-registered modules.

Foundation (internal/):
- eventbus: typed pub/sub, 20 event types, race-safe, drop counter
- module: registry with phase-based selection
- store: thread-safe host store with per-host locks + deep-copy reads
- pipeline: coordinator with phase barriers + panic recovery
- config: 5 scan profiles + 3 AI tiers + YAML loader + auto-discovery
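
The eventbus bullet above (pub/sub, race-safe, drop counter) can be pictured with a minimal sketch. This is not the real internal/eventbus API: the Event struct, Bus type, string type tags, and method names below are all hypothetical, chosen only to show a non-blocking publish that counts dropped deliveries.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// Event is a hypothetical minimal event: a type tag plus a payload.
type Event struct {
	Type    string
	Payload string
}

// Bus is an illustrative pub/sub bus, not the real internal/eventbus type.
type Bus struct {
	dropped uint64 // updated atomically; kept first for 64-bit alignment
	mu      sync.RWMutex
	subs    map[string][]chan Event
}

// NewBus returns an empty bus.
func NewBus() *Bus { return &Bus{subs: make(map[string][]chan Event)} }

// Subscribe returns a buffered channel that receives events of type t.
func (b *Bus) Subscribe(t string, buf int) <-chan Event {
	ch := make(chan Event, buf)
	b.mu.Lock()
	b.subs[t] = append(b.subs[t], ch)
	b.mu.Unlock()
	return ch
}

// Publish delivers ev to every subscriber of its type without blocking.
// If a subscriber's buffer is full, that delivery is dropped and counted.
func (b *Bus) Publish(ev Event) {
	b.mu.RLock()
	defer b.mu.RUnlock()
	for _, ch := range b.subs[ev.Type] {
		select {
		case ch <- ev:
		default:
			atomic.AddUint64(&b.dropped, 1)
		}
	}
}

// Dropped reports how many deliveries were discarded so far.
func (b *Bus) Dropped() uint64 { return atomic.LoadUint64(&b.dropped) }

func main() {
	bus := NewBus()
	ch := bus.Subscribe("subdomain.discovered", 1) // deliberately tiny buffer
	for _, s := range []string{"a.example.com", "b.example.com", "c.example.com"} {
		bus.Publish(Event{Type: "subdomain.discovered", Payload: s})
	}
	fmt.Println((<-ch).Payload, bus.Dropped())
}
```

Dropping instead of blocking keeps a slow subscriber from stalling publishers, and the counter makes the loss visible rather than silent.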

Modules (26 auto-registered across 6 phases):
- Discovery: passive (26 sources), bruteforce, recursive, AXFR, GitHub
  dorks, CT streaming, permutation, reverse DNS, vhost, ASN, supply
  chain (npm + PyPI)
- Enrichment: HTTP probe + tech fingerprint + TLS appliance ID, ports
- Analysis: security checks, takeover (110+ sigs), cloud, JavaScript,
  GraphQL, JWT, headers (OWASP), HTTP smuggling, AI cascade, Nuclei
- Reporting: TXT/JSON/CSV writer + AI scan brief

AI layer (internal/ai/ + internal/modules/ai/):
- Three profiles: lean (16 GB), balanced (32 GB MoE), heavy (64 GB)
- Six event-driven handlers: CVE, JS file, HTTP response, secret
  filter, multi-agent vuln enrichment, anomaly + executive report
- Content-hash cache dedups Ollama calls across hosts
- Auto-pull of missing models via /api/pull with streaming progress
- End-of-scan AI SCAN BRIEF in terminal with top chains + next actions
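
The content-hash cache bullet above can be sketched as follows, assuming the cache is keyed by a SHA-256 digest of the content sent for analysis. The hashCache type and getOrCompute method are hypothetical names, and the compute callback stands in for the actual Ollama call, which is not shown here.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sync"
)

// hashCache memoizes results keyed by the SHA-256 of the input content.
// It is an illustrative stand-in, not the cache used by internal/ai.
type hashCache struct {
	mu sync.Mutex
	m  map[string]string
}

func newHashCache() *hashCache { return &hashCache{m: make(map[string]string)} }

// getOrCompute returns the cached result for content, calling compute only
// the first time a given content hash is seen. The lock is held across
// compute, which serializes callers but guarantees a single call per hash.
func (c *hashCache) getOrCompute(content string, compute func(string) string) string {
	sum := sha256.Sum256([]byte(content))
	key := hex.EncodeToString(sum[:])

	c.mu.Lock()
	defer c.mu.Unlock()
	if v, ok := c.m[key]; ok {
		return v
	}
	v := compute(content)
	c.m[key] = v
	return v
}

func main() {
	cache := newHashCache()
	calls := 0
	analyze := func(body string) string { // placeholder for a model call
		calls++
		return fmt.Sprintf("analysis of %d bytes", len(body))
	}

	// Two hosts serving an identical response body trigger one analysis.
	cache.getOrCompute("<html>same page</html>", analyze)
	cache.getOrCompute("<html>same page</html>", analyze)
	fmt.Println("model calls:", calls)
}
```

Keying on the content rather than the host is what lets identical responses from different hosts share one result.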

Nuclei compat layer (internal/nucleitpl/):
- Executes ~13k community templates (HTTP subset)
- Auto-download of nuclei-templates ZIP to ~/.god-eye/nuclei-templates
- Scope filter rejects off-host templates (eliminates OSINT FPs)

Operations:
- Interactive wizard (internal/wizard/) — zero-flag launch
- LivePrinter (internal/tui/) — colorized event stream
- Diff engine + scheduler (internal/diff, internal/scheduler) for
  continuous ASM monitoring with webhook alerts
- Proxy support (internal/proxyconf/): http / https / socks5 / socks5h
  + basic auth
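
For the proxy bullet above, Go's net/http Transport accepts http, https, socks5, and socks5h proxy URLs through its Proxy field and takes credentials from the URL's userinfo. The sketch below shows one way a --proxy value could be turned into a client under that assumption. newProxyClient is a hypothetical helper, not the internal/proxyconf API, and the proxy address is a placeholder.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"time"
)

// newProxyClient builds an HTTP client that routes requests through
// rawProxy. An empty rawProxy yields a direct client. Hypothetical helper
// for illustration only.
func newProxyClient(rawProxy string, timeout time.Duration) (*http.Client, error) {
	if rawProxy == "" {
		return &http.Client{Timeout: timeout}, nil
	}
	u, err := url.Parse(rawProxy)
	if err != nil {
		return nil, fmt.Errorf("parse proxy URL: %w", err)
	}
	switch u.Scheme {
	case "http", "https", "socks5", "socks5h":
		// Schemes supported by http.Transport's Proxy field.
	default:
		return nil, fmt.Errorf("unsupported proxy scheme %q", u.Scheme)
	}
	if u.Host == "" {
		return nil, fmt.Errorf("proxy URL %q has no host", rawProxy)
	}
	// Credentials, if any, travel in the URL (user:pass@host:port).
	tr := &http.Transport{Proxy: http.ProxyURL(u)}
	return &http.Client{Transport: tr, Timeout: timeout}, nil
}

func main() {
	client, err := newProxyClient("socks5h://127.0.0.1:9050", 10*time.Second)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("proxied client ready, timeout:", client.Timeout)
}
```

Validating the scheme up front turns a mistyped proxy URL into an immediate error instead of a failure on the first request.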

Fixes #1 — native SOCKS5 / Tor compatibility via --proxy flag.

185 unit tests across 15 packages, all race-detector clean.
2026-04-18 16:48:41 +02:00

151 lines
3.8 KiB
Go

// Package github discovers subdomains from public GitHub code via dorks.
// It uses the v3 REST Search API (GET /search/code). A token in the
// GITHUB_TOKEN env var is sent when set. GitHub documents code search as
// requiring authentication, so without a token the module usually finds
// nothing and fails open.
//
// Dorks used (quoted phrases, sent without an in: qualifier):
//
//	"<domain>"
//	"api.<domain>"
//
// The module only emits subdomains that match the target domain suffix.
package github

import (
	"context"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/url"
	"os"
	"strings"
	"sync"
	"time"

	"god-eye/internal/eventbus"
	"god-eye/internal/module"
	"god-eye/internal/sources"
	"god-eye/internal/store"
)

// ModuleName is the name this module reports and stamps on its events.
const ModuleName = "discovery.github-dorks"

type ghModule struct{}

// Register registers the module with the module registry.
func Register() { module.Register(&ghModule{}) }

func (*ghModule) Name() string { return ModuleName }

func (*ghModule) Phase() module.Phase { return module.PhaseDiscovery }

func (*ghModule) Consumes() []eventbus.EventType { return nil }

func (*ghModule) Produces() []eventbus.EventType {
	return []eventbus.EventType{eventbus.EventSubdomainDiscovered}
}

// DefaultEnabled reports true so bug-bounty users get the module for free.
// The module degrades to a no-op when unauthenticated requests are rejected
// or rate-limited.
func (*ghModule) DefaultEnabled() bool { return true }

// Run executes both dorks concurrently, extracts in-scope subdomains from
// the results, and stores and publishes each new subdomain exactly once.
func (*ghModule) Run(mctx module.Context) error {
	target := mctx.Target
	if target == "" {
		return nil
	}

	token := os.Getenv("GITHUB_TOKEN")
	timeout := time.Duration(mctx.Config.Int("timeout", 10)) * time.Second
	client := &http.Client{Timeout: timeout}

	// Two dorks run in parallel; each fetches a single page of up to 100 results.
	dorks := []string{
		fmt.Sprintf(`"%s"`, target),
		fmt.Sprintf(`"api.%s"`, target),
	}

	// seen holds subdomains already emitted; seenMu guards it across goroutines.
	seen := make(map[string]struct{})
	var seenMu sync.Mutex
	var wg sync.WaitGroup

	for _, q := range dorks {
		q := q
		wg.Add(1)
		go func() {
			defer wg.Done()
			hits := searchCode(client, q, token)
			for _, text := range hits {
				for _, sub := range sources.ExtractSubdomains(text, target) {
					// Claim the subdomain under the lock so only one goroutine emits it.
					seenMu.Lock()
					if _, dup := seen[sub]; dup {
						seenMu.Unlock()
						continue
					}
					seen[sub] = struct{}{}
					seenMu.Unlock()

					_ = mctx.Store.Upsert(mctx.Ctx, sub, func(h *store.Host) {
						store.AddDiscoveryMethod(h, "github-dorks")
					})
					mctx.Bus.Publish(mctx.Ctx, eventbus.SubdomainDiscovered{
						EventMeta: eventbus.EventMeta{At: time.Now(), Source: ModuleName, Target: sub},
						Subdomain: sub,
						Method:    "github-dorks",
					})
				}
			}
		}()
	}
	wg.Wait()
	return nil
}

// searchCode queries GitHub's code-search endpoint and returns, for every
// hit, its html_url followed by its text_matches fragments (the snippets
// containing the dorked domain). Any failure (transport error, a non-200
// status such as 401, 403 or 429, or malformed JSON) yields nil, so the
// module fails open with zero hits.
func searchCode(client *http.Client, q, token string) []string {
	u := "https://api.github.com/search/code?q=" + url.QueryEscape(q) + "&per_page=100"
	req, err := http.NewRequest(http.MethodGet, u, nil)
	if err != nil {
		return nil
	}
	// The text-match media type asks GitHub to include text_matches snippets.
	req.Header.Set("Accept", "application/vnd.github.text-match+json")
	req.Header.Set("User-Agent", "god-eye-v2")
	if token != "" {
		req.Header.Set("Authorization", "Bearer "+token)
	}

	resp, err := client.Do(req)
	if err != nil {
		return nil
	}
	defer resp.Body.Close()

	// Rejected or rate-limited responses carry no items, so skip reading them.
	if resp.StatusCode != http.StatusOK {
		return nil
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil
	}

	var parsed struct {
		Items []struct {
			TextMatches []struct {
				Fragment string `json:"fragment"`
			} `json:"text_matches"`
			HTMLURL string `json:"html_url"`
		} `json:"items"`
	}
	if err := json.Unmarshal(body, &parsed); err != nil {
		return nil
	}

	var out []string
	for _, it := range parsed.Items {
		out = append(out, it.HTMLURL)
		for _, tm := range it.TextMatches {
			out = append(out, tm.Fragment)
		}
	}
	return out
}
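
// Blank references keep the otherwise-unused strings and context imports
// from breaking the build.
var _ = strings.TrimSpace
var _ = context.Canceled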
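
The file relies on sources.ExtractSubdomains to enforce the rule that only names under the target domain are emitted, but that helper is not shown here. The following is a minimal sketch of how such a suffix filter might work. extractSubdomains and isAlnum are hypothetical stand-ins, and the real implementation may differ in matching rules and normalization.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// isAlnum reports whether c is an ASCII letter or digit.
func isAlnum(c byte) bool {
	return c >= 'a' && c <= 'z' || c >= 'A' && c <= 'Z' || c >= '0' && c <= '9'
}

// extractSubdomains is a hypothetical stand-in for sources.ExtractSubdomains.
// It returns, in first-seen order, every lowercased hostname in text that
// ends in "."+target, skipping matches that continue into a longer name.
func extractSubdomains(text, target string) []string {
	target = strings.ToLower(strings.TrimSpace(target))
	if target == "" {
		return nil
	}
	// One or more DNS labels, each followed by a dot, then the target domain.
	re := regexp.MustCompile(`(?i)(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+` + regexp.QuoteMeta(target))

	seen := make(map[string]struct{})
	var out []string
	for _, loc := range re.FindAllStringIndex(text, -1) {
		end := loc[1]
		if end < len(text) {
			c := text[end]
			// Reject "api.example.community" and "api.example.com-x".
			if isAlnum(c) || c == '-' {
				continue
			}
			// Reject "api.example.com.evil.net" (match is only a prefix).
			if c == '.' && end+1 < len(text) && isAlnum(text[end+1]) {
				continue
			}
		}
		sub := strings.ToLower(text[loc[0]:end])
		if _, dup := seen[sub]; dup {
			continue
		}
		seen[sub] = struct{}{}
		out = append(out, sub)
	}
	return out
}

func main() {
	fragment := "curl https://api.example.com/v1 # staging: DEV.example.com, not api.example.com.evil.net"
	fmt.Println(extractSubdomains(fragment, "example.com"))
}
```

The trailing-character check matters because Go's regexp package has no lookahead, so a plain suffix match would otherwise accept hostnames that merely start with an in-scope name.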