mirror of
https://github.com/BigBodyCobain/Shadowbroker.git
synced 2026-05-28 18:11:31 +02:00
e36d1fc79c
External security audit by @tg12 (May 17, 2026) filed issues #201–#214 in addition to the #189–#200 batch already closed by PRs #227/#232/#260. This PR closes all eight that are real security bugs (the other six in the 201–214 range are either design discussions or upstream-abuse/TOS concerns we're keeping intentional; see issue triage notes on each).

The user-facing principle for this PR: fix the security gap WITHOUT introducing a single hostile error or behavior change for legitimate users. Every fix follows the same template: fail forward, not loud. When the secure path is harder than the insecure one, build a fallback chain that ends in graceful degradation, not in a scary modal or 422 response.

#205: OpenMHZ audio redirect SSRF (services/radio_intercept.py)
Replaced requests.get(..., allow_redirects=True) with a manual redirect loop that re-validates each hop's host against _OPENMHZ_AUDIO_HOSTS. Same-host redirects (CDN edge selection) still work, so legitimate audio playback is unaffected. Cross-host redirects to disallowed hosts return a generic 502, which the browser audio element handles gracefully. Capped at 5 hops.

#207: infonet/status verify_signatures DoS (routers/mesh_public.py)
Silently downgrade verify_signatures=true to False for unauthenticated callers. No error is surfaced: the response shape is identical, just without the O(n_events) signature verification. Authenticated callers (scoped mesh.audit) still get the full path. The frontend never passes this param, so the legitimate UI is unaffected.

#211: thermal/verify expensive analysis (routers/sigint.py)
Added Depends(require_local_operator). The frontend has no direct callers (verified by grep); Tauri/AI agents use scoped tokens that pass the auth check. Anonymous abusers are blocked silently; the legitimate UI keeps working through the Next.js admin-key proxy.

#213, #214: OpenMHZ calls/audio upstream abuse (routers/radio.py)
Added Depends(require_local_operator) to both. Browser users hit these through the Next.js proxy at src/app/api/[...path]/route.ts, which injects X-Admin-Key, so the auth check passes transparently. Direct attackers can no longer rotate sys_names to hammer api.openmhz.com or relay arbitrary audio streams through the backend's bandwidth.

#202: overflights unbounded hours (routers/data.py)
Silently clamp `hours` to OVERFLIGHTS_MAX_HOURS (default 72, configurable). NO 422: clients asking for an absurd window get a shorter window back with `requested_hours` and `effective_hours` hint fields. Postel's law: liberal in what we accept, conservative in what we compute.

#203: Meshtastic callsign UA leak (services/fetchers/meshtastic_map.py)
Added MESHTASTIC_SEND_CALLSIGN_HEADER opt-out env var. Default is TRUE, which preserves existing operator behavior (callsign sent so meshtastic.org can rate-limit per-install). Privacy-conscious operators set it to false to suppress the header.

#206: KiwiSDR upstream is HTTP-only (services/kiwisdr_fetcher.py)
Upstream rx.linkfanel.net doesn't speak HTTPS (verified: Apache 2.4.10 only on port 80). We can't fix the transport. Instead added three layers:
  1. Content validation on fetched data: reject responses with <50 receivers or >5% malformed entries (likely MITM injection).
  2. Existing disk cache fallback (already present).
  3. NEW: bundled static directory at backend/data/kiwisdr_directory.json shipping 798 known-good receivers. Used as a last resort so the KiwiSDR map layer always renders something useful.

#208: Merkle proof DoS via /api/mesh/infonet/sync (services/mesh/mesh_hashchain.py)
The endpoint is part of the cross-node federation protocol: peers legitimately call it without local-operator auth, so we can't add Depends(). Instead made the underlying operation O(1) per proof via a cached Merkle level structure on the Infonet instance:
  - _merkle_levels_cache + _merkle_levels_for_event_count on each Infonet instance
  - _invalidate_merkle_cache() called from every chain mutation point (append, ingest_events, apply_fork, cleanup_expired)
  - _get_merkle_levels() does the lazy recompute on first read after invalidation, then serves from cache thereafter
Effect: anonymous attackers hammering the proofs endpoint hit a cached structure; the rebuild happens at most once per real chain advance. Federation untouched.

#201: Tor bundle SHA-256 bypass (services/tor_hidden_service.py)
Docker users were already covered: backend/Dockerfile installs Tor via apt-get at build time (signed by Debian's package system). No runtime download is needed for the 80%-of-users case. For Tauri desktop, replaced the single .sha256sum check with a multi-source verification chain implemented in _verify_tor_bundle():
  1. Try upstream .sha256sum (current behavior, fast path)
  2. Try baked-in digest list at backend/data/tor_bundle_digests.json (pinned per-version, maintainer-updated)
  3. If neither source is REACHABLE: HTTPS-only fallback with a loud warning (avoids breaking first-run onboarding while the maintainer hasn't yet pinned a new Tor release)
A mismatch from a source that DID respond is always fatal; only the "no source reachable" case falls back to HTTPS-only. This is the "have cake and eat it" pattern: real users see no new failure modes during torproject.org outages, but MITM/compromise attacks still fail because the downloaded digest can't match what BOTH the upstream and the baked-in list report.

Currently the digest file ships with placeholder values for the current Tor URLs (those URLs are already stale on torproject.org too). A follow-up commit can populate real digests when a stable Tor release is selected; until then the HTTPS-only warning fires and onboarding still works.
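
The #201 verification chain above can be sketched roughly as follows. This is a minimal illustration, not the actual _verify_tor_bundle() code: the names `verify_bundle_digest`, `DigestSource` and `DigestMismatch`, and the convention that a source returns None when it is unreachable, are assumptions made for the example. It also assumes the first reachable source decides the outcome.

```python
import hashlib
from typing import Callable, Optional, Sequence

# Hypothetical type: a digest source returns the expected hex digest,
# or None when the source could not be reached.
DigestSource = Callable[[], Optional[str]]


class DigestMismatch(Exception):
    """Raised when a reachable source disagrees with the downloaded bytes."""


def verify_bundle_digest(bundle: bytes, sources: Sequence[DigestSource]) -> str:
    """Return "verified" or "https-only"; raise DigestMismatch on a mismatch."""
    actual = hashlib.sha256(bundle).hexdigest()
    for source in sources:
        expected = source()
        if expected is None:
            continue  # source unreachable: try the next one in the chain
        if expected.strip().lower() != actual:
            # A source that DID respond and disagrees is always fatal.
            raise DigestMismatch("bundle digest does not match a reachable source")
        return "verified"
    # No source was reachable: the caller should log a loud warning and
    # rely on HTTPS transport alone.
    return "https-only"


# Usage: the first source is "unreachable", the second one pins the digest.
bundle = b"example bundle bytes"
pinned = hashlib.sha256(bundle).hexdigest()
status = verify_bundle_digest(bundle, [lambda: None, lambda: pinned])
```

Modelling "unreachable" as None keeps the distinction between "no answer" (fall through) and "wrong answer" (fatal) explicit, which is the property the commit message relies on.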

Tests (82 total, all passing):
  test_openmhz_redirect_ssrf.py (5 tests): #205
  test_infonet_status_verify_gate.py (2 tests): #207
  test_overflights_clamp.py (5 tests): #202
  test_meshtastic_callsign_optout.py (3 tests): #203
  test_kiwisdr_fallback.py (6 tests): #206
  test_merkle_cache.py (6 tests): #208
  test_tor_bundle_verification.py (6 tests): #201
  test_control_surface_auth.py (extended): #211, #213, #214
  + all previous security tests (CCTV redirect, GDELT https, sentinel cache, crowdthreat opt-in, third-party fetcher gates, control surface auth) continue to pass.

A pre-existing test infrastructure issue with SHARED_EXECUTOR teardown in the broader sweep exists on main too (verified); it is not introduced by this PR.

Credit: @tg12 reported every one of these with accurate line citations and the recommended fixes that informed this implementation.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
284 lines
10 KiB
Python
"""
KiwiSDR public receiver list fetcher.

Pulls from Pierre Ynard's dyatlov map mirror at rx.linkfanel.net, which
auto-generates a JSON-like JS array from kiwisdr.com/public/. We use the
mirror instead of kiwisdr.com directly to avoid adding load to jks-prv's
bandwidth; see issue #131 for context.

Receivers are stationary hardware (someone's house, antenna on the roof), so
their lat/lon and antenna config don't move. We refresh the list once per
day, persisted to disk so restarts don't re-fetch. The slow-tier scheduler
still calls this every 5 minutes, but those calls hit the in-memory or
on-disk cache and never touch the network until 24 hours have passed.

The mirror returns a JS file shaped like:
    // KiwiSDR.com receiver list for dyatlov map maker
    var kiwisdr_com = [ {...}, {...}, ... ];
"""

import re
import json
import time
import logging
from pathlib import Path

import requests
from cachetools import TTLCache, cached

logger = logging.getLogger(__name__)

# 24-hour in-memory TTL: receivers don't move, so daily is plenty.
_REFRESH_SECONDS = 24 * 3600
kiwisdr_cache: TTLCache = TTLCache(maxsize=1, ttl=_REFRESH_SECONDS)

_SOURCE_URL = "http://rx.linkfanel.net/kiwisdr_com.js"
_CACHE_FILE = Path(__file__).resolve().parent.parent / "data" / "kiwisdr_cache.json"
# Bundled fallback, shipped with the codebase so the KiwiSDR layer always
# has something to render even when the upstream is unreachable, returns
# garbage, or appears to have been tampered with. Issue #206: the upstream
# only speaks HTTP, so we can't rely on TLS for integrity. Instead we
# validate the response's shape and fall back to this bundle if it doesn't
# look right.
_BUNDLED_FALLBACK = Path(__file__).resolve().parent.parent / "data" / "kiwisdr_directory.json"

# Minimum number of receivers we expect from a healthy upstream response.
# The KiwiSDR public network has consistently sat well above this threshold
# for years. If we see fewer than this many parsed receivers, treat the
# response as suspect and fall back. This is a hard-coded constant; lower it
# here if the upstream shrinks legitimately.
_MIN_HEALTHY_RECEIVER_COUNT = 50
_LINE_COMMENT_RE = re.compile(r"^\s*//.*$", re.MULTILINE)
_VAR_PREFIX_RE = re.compile(r"^\s*var\s+kiwisdr_com\s*=\s*", re.MULTILINE)
_TRAILING_COMMA_RE = re.compile(r",(\s*[\]}])")
_GPS_RE = re.compile(r"\(\s*(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)\s*\)")


def _parse_gps(gps_str: str):
    if not gps_str:
        return None, None
    m = _GPS_RE.search(gps_str)
    if not m:
        return None, None
    try:
        return float(m.group(1)), float(m.group(2))
    except ValueError:
        return None, None


def _to_int(value, default: int = 0) -> int:
    try:
        return int(value)
    except (TypeError, ValueError):
        return default


def _load_disk_cache() -> list[dict] | None:
    """Return cached receivers if disk cache exists and is <24h old."""
    if not _CACHE_FILE.exists():
        return None
    try:
        age = time.time() - _CACHE_FILE.stat().st_mtime
        if age > _REFRESH_SECONDS:
            return None
        nodes = json.loads(_CACHE_FILE.read_text(encoding="utf-8"))
        if isinstance(nodes, list):
            return nodes
    except Exception as e:
        logger.warning(f"KiwiSDR disk cache read failed: {e}")
    return None


def _save_disk_cache(nodes: list[dict]) -> None:
    try:
        _CACHE_FILE.parent.mkdir(parents=True, exist_ok=True)
        _CACHE_FILE.write_text(json.dumps(nodes), encoding="utf-8")
    except Exception as e:
        logger.warning(f"KiwiSDR disk cache write failed: {e}")


def _parse_mirror_payload(body: str) -> list[dict]:
    """Strip the JS wrapper and return parsed receiver dicts."""
    json_body = _LINE_COMMENT_RE.sub("", body)
    json_body = _VAR_PREFIX_RE.sub("", json_body, count=1).strip()
    if json_body.endswith(";"):
        json_body = json_body[:-1].rstrip()
    json_body = _TRAILING_COMMA_RE.sub(r"\1", json_body)

    try:
        entries = json.loads(json_body)
    except json.JSONDecodeError as e:
        logger.error(f"KiwiSDR mirror returned unparseable JS: {e}")
        return []

    if not isinstance(entries, list):
        logger.error("KiwiSDR mirror payload was not a list")
        return []

    nodes: list[dict] = []
    for entry in entries:
        if not isinstance(entry, dict):
            continue
        if str(entry.get("offline", "")).lower() == "yes":
            continue

        lat, lon = _parse_gps(str(entry.get("gps", "")))
        if lat is None or lon is None:
            continue
        if abs(lat) > 90 or abs(lon) > 180:
            continue

        name = (entry.get("name") or "Unknown SDR").strip()
        url = (entry.get("url") or "").strip()
        antenna = (entry.get("antenna") or "").strip()
        location = (entry.get("loc") or "").strip()

        nodes.append(
            {
                "name": name[:120],
                "lat": round(lat, 5),
                "lon": round(lon, 5),
                "url": url,
                "users": _to_int(entry.get("users")),
                "users_max": _to_int(entry.get("users_max")),
                "bands": (entry.get("bands") or ""),
                "antenna": antenna[:200],
                "location": location[:100],
            }
        )
    return nodes


def _validate_fetched_nodes(nodes: list[dict]) -> bool:
    """Sanity-check freshly-fetched receiver data before trusting it.

    The upstream (rx.linkfanel.net) speaks only HTTP, so there is no TLS to
    authenticate the response. An on-path attacker (active MITM) could inject
    doctored receiver positions (false pins on the map) or strip the response
    down to a tiny subset. We can't prevent the modification at the transport
    layer, but we can refuse to commit to obviously-bad responses.

    Returns True if the parsed list looks reasonable. False means we
    should fall back to a previously-cached or bundled directory.
    """
    if not isinstance(nodes, list):
        return False
    if len(nodes) < _MIN_HEALTHY_RECEIVER_COUNT:
        # Either upstream is degraded or someone is feeding us a stripped
        # response. Either way, the bundled fallback is more useful.
        return False

    # Spot-check: every entry should be a dict with a non-empty name and a
    # numeric lat. If the number of entries missing these core fields exceeds
    # the larger of 5 entries or 5% of the list, the parse went sideways.
    missing_core = 0
    for entry in nodes:
        if not isinstance(entry, dict):
            missing_core += 1
            continue
        if not entry.get("name") or not isinstance(entry.get("lat"), (int, float)):
            missing_core += 1
    if missing_core > max(5, len(nodes) // 20):
        return False

    return True


def _load_bundled_fallback() -> list[dict]:
    """Last-resort directory shipped with the codebase. Always returns a
    list (may be empty if the bundle is missing in older deployments)."""
    if not _BUNDLED_FALLBACK.exists():
        return []
    try:
        data = json.loads(_BUNDLED_FALLBACK.read_text(encoding="utf-8"))
        if isinstance(data, list):
            return data
    except Exception as e:
        logger.warning(f"KiwiSDR bundled fallback unreadable: {e}")
    return []


@cached(kiwisdr_cache)
def fetch_kiwisdr_nodes() -> list[dict]:
    """Return the KiwiSDR receiver list, refreshed at most once per day.

    Layered fallback (issue #206: upstream is HTTP-only, so we defend with
    content validation + bundled static directory rather than trying to
    upgrade the transport):

      1. In-memory cache (handled by @cached on this function)
      2. On-disk cache if <24h old
      3. Fresh network fetch from rx.linkfanel.net → validated → committed
      4. Stale on-disk cache (>24h) if the fetch or its validation fails
      5. Bundled static directory at backend/data/kiwisdr_directory.json

    The KiwiSDR map layer renders something useful in every case. A
    tampered upstream returning garbage is caught by _validate_fetched_nodes()
    and falls through to whatever previously-trusted snapshot we have.
    """
    from services.network_utils import fetch_with_curl

    # 2. Trust on-disk cache if fresh.
    cached_nodes = _load_disk_cache()
    if cached_nodes is not None:
        logger.info(
            f"KiwiSDR: loaded {len(cached_nodes)} receivers from disk cache (<24h old)"
        )
        return cached_nodes

    # 3. Cache cold or stale: fetch from network.
    fresh_nodes: list[dict] = []
    fetch_succeeded = False
    try:
        res = fetch_with_curl(_SOURCE_URL, timeout=20)
        if res and res.status_code == 200:
            fresh_nodes = _parse_mirror_payload(res.text)
            fetch_succeeded = True
        else:
            logger.warning(
                f"KiwiSDR fetch returned HTTP {res.status_code if res else 'no response'}"
            )
    except (requests.RequestException, ConnectionError, TimeoutError, ValueError, KeyError) as e:
        logger.warning(f"KiwiSDR fetch exception: {e}")

    # Validate before committing. If the response looks healthy, save
    # it as the new cache and return.
    if fetch_succeeded and _validate_fetched_nodes(fresh_nodes):
        _save_disk_cache(fresh_nodes)
        logger.info(
            f"KiwiSDR: refreshed {len(fresh_nodes)} receivers from rx.linkfanel.net "
            "(next refresh in 24h)"
        )
        return fresh_nodes

    if fetch_succeeded:
        # Network came back, but the payload didn't pass validation:
        # either upstream is degraded or a MITM is at work. Fall through
        # to a trusted snapshot rather than committing garbage to disk.
        logger.warning(
            "KiwiSDR: upstream response failed validation (%d entries); "
            "falling back to trusted snapshot",
            len(fresh_nodes),
        )

    # 4. Stale on-disk cache, if any.
    if _CACHE_FILE.exists():
        try:
            stale = json.loads(_CACHE_FILE.read_text(encoding="utf-8"))
            if isinstance(stale, list) and stale:
                logger.info(
                    f"KiwiSDR: serving {len(stale)} stale receivers from disk"
                )
                return stale
        except Exception as e:
            # Unreadable stale cache: log it and fall through to the bundle.
            logger.warning(f"KiwiSDR stale disk cache read failed: {e}")

    # 5. Bundled static directory: last resort.
    bundled = _load_bundled_fallback()
    if bundled:
        logger.info(
            f"KiwiSDR: serving {len(bundled)} receivers from bundled fallback "
            "(no fresh fetch + no disk cache available)"
        )
        return bundled