mirror of
https://github.com/BigBodyCobain/Shadowbroker.git
synced 2026-05-27 01:22:27 +02:00
d00c63abed
External security audit by @tg12 (May 17, 2026) filed 11 issues against the backend. PR #227 (May 18, AI-generated) closed seven of them by adding require_local_operator to control-plane endpoints. Four remained live; this PR closes the rest.

#192: CCTV proxy followed redirects without re-validating host

Issue: /api/cctv/media validated only the caller-supplied URL host before passing it to requests.get(..., allow_redirects=True). A 302 to http://127.0.0.1 or any internal/disallowed host was silently followed, turning the proxy into an open-redirect-to-SSRF chain.

Fix in routers/cctv.py: replace the single allow_redirects=True call with a manual follow loop. Each hop's Location is parsed, the host is rerun through _cctv_host_allowed(), and non-HTTP schemes (file://, ftp://, etc.) are rejected. Cap chain length at 5 hops.

Test: backend/tests/test_cctv_redirect_ssrf.py covers
- redirect to disallowed host -> 502
- redirect to localhost -> 502
- redirect to another allowed host -> 200
- redirect chain length cap
- non-HTTP scheme rejected

#198: Gate introspection GETs were unauthenticated

Issue: /api/wormhole/gate/{gate_id}/{identity,personas,key} were callable with no auth dependency. Any caller that could reach the backend could dump the operator's active persona, persona inventory, and key status for any gate_id they knew. The wiki's privacy threat model explicitly markets gate personas as rotating, unlinkable pseudonyms; this leak defeated that property.

Fix in routers/wormhole.py: add dependencies=[Depends(require_local_operator)] to all three routes.

Test: backend/tests/test_control_surface_auth.py extended with three new parameterized cases (lines 75-77).

#199: GDELT military incident ingestion used plaintext HTTP

Issue: backend/services/geopolitics.py fetched http://data.gdeltproject.org/gdeltv2/lastupdate.txt and ~48 export archive URLs over plaintext HTTP. Passive observers could identify Shadowbroker nodes from the fetch pattern. Active MITM could inject doctored military incident records into the global map.

Fix in services/geopolitics.py: rewrite the lastupdate.txt fetch and the export download URL constructor to use https://. GDELT's data.gdeltproject.org serves the same content over HTTPS.

Test: backend/tests/test_gdelt_https.py asserts no plaintext HTTP URLs to data.gdeltproject.org remain in code (comments excluded) and that the HTTPS URLs we expect are present.

#200: Sentinel token cache lookup used client_id only

Issue: routers/tools.py kept a process-global cache of Copernicus bearer tokens. The lookup compared _sh_token_cache["client_id"] == client_id. A caller who knew a valid client_id but supplied any wrong client_secret hit the cache and reused the legitimate caller's bearer token, burning their quota and accessing imagery on their account.

Fix in routers/tools.py: replace the client_id field with credential_fp, an HMAC-SHA256 over (client_id, client_secret) under a per-process random key (_SH_TOKEN_CACHE_HMAC_KEY = os.urandom(32), regenerated at startup). A caller who doesn't know the secret cannot compute a matching fingerprint, so they miss the cache and hit the real Copernicus token endpoint, which will reject their wrong secret with a 401.

Test: backend/tests/test_sentinel_token_cache.py covers
- same client_id + different secrets => different fingerprints
- same credentials => same fingerprint (cache still works)
- different client_ids + same secret => different fingerprints
- cache no longer stores raw client_id (catches regression)
- attacker with wrong secret cannot reuse victim's token

Validation

pytest backend/tests/test_control_surface_auth.py backend/tests/test_cctv_redirect_ssrf.py backend/tests/test_gdelt_https.py backend/tests/test_sentinel_token_cache.py
-> 37 passed

Credit: @tg12 reported all four of these in their May 17 audit with correct line-number citations and accurate remediation recommendations.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
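
The manual follow loop described under #192 can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the code in routers/cctv.py: `follow_validated`, `host_allowed`, `RedirectRejected`, the `Fetch` callable and the example allowlist are all hypothetical names, and the fetcher is injected so the loop can be exercised without a network. Only the 5-hop cap and the reject-non-HTTP-scheme rule come from the commit message.

```python
from typing import Callable, Mapping, Tuple
from urllib.parse import urljoin, urlsplit

MAX_REDIRECT_HOPS = 5  # cap named in the commit message
ALLOWED_HOSTS = {"cams.example.org", "media.example.org"}  # hypothetical allowlist

# Hypothetical fetcher: performs ONE request without following redirects and
# returns (status_code, headers). Header lookup here is case-sensitive.
Fetch = Callable[[str], Tuple[int, Mapping[str, str]]]


class RedirectRejected(Exception):
    """Raised when a hop points somewhere the proxy must not go."""


def host_allowed(url: str) -> bool:
    """Stand-in for _cctv_host_allowed(): http(s) scheme plus allowlisted host."""
    parts = urlsplit(url)
    return parts.scheme in ("http", "https") and parts.hostname in ALLOWED_HOSTS


def follow_validated(url: str, fetch: Fetch) -> str:
    """Follow redirects by hand, re-validating every hop; return the final URL."""
    if not host_allowed(url):
        raise RedirectRejected(f"disallowed start URL: {url}")
    # One initial request plus at most MAX_REDIRECT_HOPS followed redirects.
    for _ in range(MAX_REDIRECT_HOPS + 1):
        status, headers = fetch(url)
        if status not in (301, 302, 303, 307, 308):
            return url
        location = headers.get("Location")
        if not location:
            raise RedirectRejected("redirect without Location header")
        # Resolve relative Locations against the current URL, then re-check.
        url = urljoin(url, location)
        if not host_allowed(url):
            raise RedirectRejected(f"redirect to disallowed target: {url}")
    raise RedirectRejected("too many redirects")
```

Injecting the fetcher keeps the validation logic separate from the HTTP client; in a real proxy it would wrap a single non-following request (for example `requests.get(url, allow_redirects=False)`), and the caller would map `RedirectRejected` to an error response.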
45 lines
1.7 KiB
Python
"""Issue #199 (tg12): GDELT military incident ingestion must use HTTPS.

The previous code fetched ``http://data.gdeltproject.org/gdeltv2/lastupdate.txt``
and ~48 export archives over plaintext HTTP, which let a passive observer
identify Shadowbroker nodes by their fetch pattern and let an active MITM
inject doctored export records into the global incident map.

These tests assert the URL constants and outbound URL constructor in
``services/geopolitics.py`` only use HTTPS.
"""
import re
from pathlib import Path


_GEOPOLITICS_SRC = Path(__file__).resolve().parent.parent / "services" / "geopolitics.py"


def _read_source() -> str:
    return _GEOPOLITICS_SRC.read_text(encoding="utf-8")


def test_geopolitics_does_not_use_plaintext_http_for_gdelt():
    """No string literal in geopolitics.py should fetch GDELT over plaintext HTTP."""
    src = _read_source()
    # Strings that would issue an HTTP request; comments are excluded because
    # comments include "http://" in example URLs even after the fix.
    code_lines = [
        ln for ln in src.split("\n")
        if "http://data.gdeltproject.org" in ln and not ln.lstrip().startswith("#")
    ]
    assert code_lines == [], (
        "Found plaintext http://data.gdeltproject.org usage in geopolitics.py:\n"
        + "\n".join(code_lines)
    )


def test_geopolitics_uses_https_for_gdelt():
    """The HTTPS URLs we expect must be present."""
    src = _read_source()
    assert "https://data.gdeltproject.org/gdeltv2/lastupdate.txt" in src
    # The download URL is constructed via f-string with {fname}
    assert re.search(
        r'https://data\.gdeltproject\.org/gdeltv2/\{fname\}', src
    ), "expected https URL template for individual GDELT export downloads"
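
The line filter in the first test skips only lines that begin with `#`, so a code line with a trailing comment that quotes the old URL would still be flagged. One possible alternative, sketched here as an assumption rather than as what this test file does, is to scan string tokens with the standard-library `tokenize` module, which reports comments as separate tokens. The helper name `plaintext_gdelt_literals` is hypothetical.

```python
import io
import tokenize
from typing import List

_NEEDLE = "http://data.gdeltproject.org"

# Before Python 3.12 an f-string is a single STRING token; from 3.12 on its
# literal parts arrive as FSTRING_MIDDLE tokens, so accept both when present.
_STRING_TYPES = {tokenize.STRING}
if hasattr(tokenize, "FSTRING_MIDDLE"):
    _STRING_TYPES.add(tokenize.FSTRING_MIDDLE)


def plaintext_gdelt_literals(source: str) -> List[str]:
    """Return string-literal tokens that mention the plaintext GDELT host."""
    hits = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        # Comments are COMMENT tokens, so full-line and trailing comments
        # are both ignored without any special casing.
        if tok.type in _STRING_TYPES and _NEEDLE in tok.string:
            hits.append(tok.string)
    return hits
```

A test could then assert `plaintext_gdelt_literals(_read_source()) == []`. The trade-off is that the scanned file must tokenize cleanly, whereas the plain line filter works on any text.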