Commit Graph

6 Commits

Author SHA1 Message Date
Shadowbroker 76750caa92 Round 7a: per-operator outbound attribution + GDELT GCS-direct fix (#292)
== Per-install operator handle for every third-party API call ==

Before this PR, every Shadowbroker install identified itself to
Wikipedia, Wikidata, Nominatim, GDELT, OpenMHz, Broadcastify,
weather.gov, NUFORC, Sentinel/Planetary Computer, TinyGS / CelesTrak,
Shodan, Finnhub, and others with a single project-wide User-Agent
("Shadowbroker/1.0" or "ShadowBroker-OSINT/1.0"). From the upstream's
perspective every install in the world looked like one giant scraper.
If one install misbehaved, the upstream's only recourse was to block
"Shadowbroker" as a whole.

PR #284 inadvertently doubled down on this in the frontend by
introducing a shared `WIKIMEDIA_API_USER_AGENT` constant. This PR
retrofits both backends to per-operator attribution.

  New setting: OPERATOR_HANDLE (env var / settings UI / auto-gen)
  New helper:  network_utils.outbound_user_agent("purpose")

The handle is auto-generated as "operator-XXXXXX" on first call (the
"shadow-" prefix from earlier drafts was deliberately dropped — too
suspicious-looking for abuse-detection systems). Operators can
override via OPERATOR_HANDLE; the value is sanitized to lowercase
alphanumeric+dash+underscore and capped at 48 chars. Persisted to
backend/data/operator_handle.json so it survives container restarts.

Retrofitted call sites (every previously-MONSTER User-Agent):
  - services/region_dossier.py (Wikipedia + Wikidata + Nominatim)
  - services/geocode.py         (Nominatim)
  - services/sentinel_search.py (Microsoft Planetary Computer)
  - services/feed_ingester.py   (operator-curated RSS feeds)
  - services/fetchers/earth_observation.py (weather.gov, NUFORC)
  - services/fetchers/infrastructure.py
  - services/fetchers/aircraft_database.py
  - services/fetchers/route_database.py
  - services/fetchers/trains.py
  - services/fetchers/meshtastic_map.py
  - services/shodan_connector.py
  - services/unusual_whales_connector.py (Finnhub)
  - services/tinygs_fetcher.py            (CelesTrak + TinyGS)
  - services/sar/sar_products_client.py
  - services/geopolitics.py               (GDELT)
  - services/radio_intercept.py           (Broadcastify + OpenMHz)
  - routers/cctv.py + main.py             (CCTV proxy)
  - routers/ai_intel.py
  - scripts/convert_power_plants.py       (release-time data refresh)

Spoofed browser UAs removed (issues #289 / #290 / #291 — tg12 audit):
  - cloudscraper-based Chrome impersonation against api.openmhz.com
    -> replaced with honest requests + per-install UA
  - Mozilla/5.0 spoofed UA on Broadcastify scrape
    -> replaced with honest UA
  - Mozilla/5.0 + fake first-party Referer on OpenMHz audio relay
    -> replaced with honest UA
  - cloudscraper dependency dropped from pyproject.toml + uv.lock

Frontend retrofit:
  - new GET /api/settings/operator-handle endpoint (local-operator
    gated) returns the install's handle
  - frontend/src/lib/wikimediaClient.ts fetches the handle once on
    first use, caches it for page lifetime, embeds it in the
    Api-User-Agent for every Wikipedia / Wikidata browser-direct call

== GDELT GCS-direct fix ==

GDELT's data.gdeltproject.org is a CNAME to a Google Cloud Storage
bucket. GCS responds with the wildcard *.storage.googleapis.com cert
which legitimately does NOT cover the GDELT custom domain, so Python's
TLS verification correctly refuses the connection. Some networks
happen to route through a path where this works; many (notably Docker
Desktop's outbound NAT on local installs) do not. Verified on the
maintainer's local install: GDELT was unreachable; 1610 geopolitical
events / 48 export files were dropping silently.

Fix: services/geopolitics._gcs_direct_gdelt_url() rewrites any
data.gdeltproject.org URL to its GCS-direct equivalent
(storage.googleapis.com/data.gdeltproject.org/...) where the standard
GCS cert is genuinely valid. api.gdeltproject.org and every other host
are left untouched.

Confirmed live: backend log goes from
  GDELT lastupdate failed: 500
to
  Downloading 48 GDELT export files...
  Downloaded 48/48 GDELT exports
  GDELT parsed: 1610 conflict locations from 48 files

== Tests ==

  backend/tests/test_per_operator_outbound_attribution.py (12 tests)
  backend/tests/test_gdelt_gcs_direct_rewrite.py          (6 tests)
  backend/tests/test_region_dossier_wikimedia_ua.py       (updated to
    pin the helper + per-operator handle, not the old constant)
  frontend/src/__tests__/utils/wikimediaClient.test.ts    (rewritten
    to mock /api/settings/operator-handle and assert per-operator UA)

Local: backend 114/114 security+audit+round7a suite green;
       frontend 718/718 vitest suite green.

Credit: tg12 (external security audit, issues #289/#290/#291
relating to spoofed UAs); BigBodyCobain (operator-prefix call,
GDELT cloud-vs-local diagnosis).
2026-05-21 15:11:28 -06:00
Shadowbroker e36d1fc79c [security] Close tg12 audit issues #201–#214 seamlessly (#261)
External security audit by @tg12 (May 17, 2026) filed issues #201–#214
in addition to the #189–#200 batch already closed by PRs #227/#232/#260.
This PR closes all eight that are real security bugs (the other six in
the 201–214 range are either design discussions or upstream-abuse/TOS
concerns we're keeping intentional, see issue triage notes on each).

The user-facing principle for this PR: fix the security gap WITHOUT
introducing a single hostile error or behavior change for legitimate
users. Every fix follows the same template — fail forward, not loud.
When the secure path is harder than the insecure one, build a
fallback chain that ends in graceful degradation, not in a scary
modal or 422 response.

  #205 — OpenMHZ audio redirect SSRF (services/radio_intercept.py)

  Replaced requests.get(..., allow_redirects=True) with a manual
  redirect loop that re-validates each hop's host against
  _OPENMHZ_AUDIO_HOSTS. Same-host redirects (CDN edge selection)
  still work, so legitimate audio playback is unaffected. Cross-host
  redirects to disallowed hosts return a generic 502 which the
  browser audio element handles gracefully. Cap at 5 hops.

  #207 — infonet/status verify_signatures DoS (routers/mesh_public.py)

  Silently downgrade verify_signatures=true to False for
  unauthenticated callers. No error surfaced — the response shape is
  identical, just without the O(n_events) signature verification.
  Authenticated callers (scoped mesh.audit) still get the full path.
  The frontend never passes this param so legitimate UI is unaffected.

  #211 — thermal/verify expensive analysis (routers/sigint.py)

  Added Depends(require_local_operator). Frontend has no direct
  callers (verified by grep); Tauri/AI agents use scoped tokens that
  pass the auth check. Anonymous abusers blocked silently — the
  legitimate UI keeps working through the Next.js admin-key proxy.

  #213, #214 — OpenMHZ calls/audio upstream abuse (routers/radio.py)

  Added Depends(require_local_operator) to both. Browser users hit
  these through the Next.js proxy at src/app/api/[...path]/route.ts
  which injects X-Admin-Key, so the auth check passes transparently.
  Direct attackers can no longer rotate sys_names to hammer
  api.openmhz.com or relay arbitrary audio streams through the
  backend's bandwidth.

  #202 — overflights unbounded hours (routers/data.py)

  Silently clamp `hours` to OVERFLIGHTS_MAX_HOURS (default 72,
  configurable). NO 422 — clients asking for an absurd window get a
  shorter window back with `requested_hours` and `effective_hours`
  hint fields. Postel's law: liberal in what we accept, conservative
  in what we compute.

  #203 — Meshtastic callsign UA leak (services/fetchers/meshtastic_map.py)

  Added MESHTASTIC_SEND_CALLSIGN_HEADER opt-out env var. Default is
  TRUE — preserves existing operator behavior (callsign sent so
  meshtastic.org can rate-limit per-install). Privacy-conscious
  operators set it to false to suppress.

  #206 — KiwiSDR upstream is HTTP-only (services/kiwisdr_fetcher.py)

  Upstream rx.linkfanel.net doesn't speak HTTPS (verified — Apache
  2.4.10 only on port 80). We can't fix the transport. Instead added
  three layers:
    1. Content validation on fetched data — reject responses with
       <50 receivers or >5% malformed entries (likely MITM injection).
    2. Existing disk cache fallback (already present).
    3. NEW: bundled static directory at backend/data/kiwisdr_directory.json
       shipping 798 known-good receivers. Used as last resort so the
       KiwiSDR map layer always renders something useful.

  #208 — Merkle proof DoS via /api/mesh/infonet/sync (services/mesh/mesh_hashchain.py)

  The endpoint is part of the cross-node federation protocol — peers
  legitimately call it without local-operator auth, so we can't add
  Depends(). Instead made the underlying operation O(1) per proof
  via a cached Merkle level structure on the Infonet instance:
    - _merkle_levels_cache + _merkle_levels_for_event_count on each
      Infonet instance
    - _invalidate_merkle_cache() called from every chain mutation
      point (append, ingest_events, apply_fork, cleanup_expired)
    - _get_merkle_levels() does the lazy recompute on first read
      after invalidation, then serves from cache thereafter
  Effect: anonymous attackers hammering the proofs endpoint hit a
  cached structure; the rebuild happens at most once per real chain
  advance. Federation untouched.

  #201 — Tor bundle SHA-256 bypass (services/tor_hidden_service.py)

  Docker users were already covered — backend/Dockerfile installs
  Tor via apt-get at build time (signed by Debian's package system).
  No runtime download needed for the 80%-of-users case.

  For Tauri desktop, replaced the single .sha256sum check with a
  multi-source verification chain implemented in _verify_tor_bundle():
    1. Try upstream .sha256sum (current behavior — fast path)
    2. Try baked-in digest list at backend/data/tor_bundle_digests.json
       (pinned per-version, maintainer-updated)
    3. If neither source is REACHABLE: HTTPS-only fallback with a loud
       warning (avoids breaking first-run onboarding while the
       maintainer hasn't yet pinned a new Tor release)
  A mismatch from a source that DID respond is always fatal — only
  the "no source reachable" case falls back to HTTPS-only. This is
  the "have cake and eat it" pattern: real users see no new failure
  modes during torproject.org outages, but MITM/compromise attacks
  still fail because the downloaded digest can't match what BOTH
  the upstream and the baked-in list report.

  Currently the digest file ships with placeholder values for the
  current Tor URLs (those URLs are already stale on torproject.org
  too). A follow-up commit can populate real digests when a stable
  Tor release is selected; until then the HTTPS-only warning fires
  and onboarding still works.

Tests (82 total, all passing):
  test_openmhz_redirect_ssrf.py        (5 tests)  — #205
  test_infonet_status_verify_gate.py   (2 tests)  — #207
  test_overflights_clamp.py            (5 tests)  — #202
  test_meshtastic_callsign_optout.py   (3 tests)  — #203
  test_kiwisdr_fallback.py             (6 tests)  — #206
  test_merkle_cache.py                 (6 tests)  — #208
  test_tor_bundle_verification.py      (6 tests)  — #201
  test_control_surface_auth.py         (extended) — #211, #213, #214
  + all previous security tests (CCTV redirect, GDELT https, sentinel
    cache, crowdthreat opt-in, third-party fetcher gates, control
    surface auth) continue to pass.

Pre-existing test infrastructure issue with SHARED_EXECUTOR teardown
in the broader sweep exists on main too (verified) — not introduced
by this PR.

Credit: @tg12 reported every one of these with accurate line citations
and the recommended fixes that informed this implementation.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-20 19:57:06 -06:00
BigBodyCobain 28b3bd5ebf release: prepare v0.9.7 2026-05-01 22:56:50 -06:00
anoracleofra-code 668ce16dc7 v0.9.6: InfoNet hashchain, Wormhole gate encryption, mesh reputation, 16 community contributors
Gate messages now propagate via the Infonet hashchain as encrypted blobs — every node syncs them
through normal chain sync while only Gate members with MLS keys can decrypt. Added mesh reputation
system, peer push workers, voluntary Wormhole opt-in for node participation, fork recovery,
killwormhole scripts, obfuscated terminology, and hardened the self-updater to protect encryption
keys and chain state during updates.

New features: Shodan search, train tracking, Sentinel Hub imagery, 8 new intelligence layers,
CCTV expansion to 11,000+ cameras across 6 countries, Mesh Terminal CLI, prediction markets,
desktop-shell scaffold, and comprehensive mesh test suite (215 frontend + backend tests passing).

Community contributors: @wa1id, @AlborzNazari, @adust09, @Xpirix, @imqdcr, @csysp, @suranyami,
@chr0n1x, @johan-martensson, @singularfailure, @smithbh, @OrfeoTerkuci, @deuza, @tm-const,
@Elhard1, @ttulttul
2026-03-26 05:58:04 -06:00
anoracleofra-code fc9eff865e v0.9.0: in-app auto-updater, ship toggle split, stable entity IDs, performance fixes
New features:
- In-app auto-updater with confirmation dialog, manual download fallback,
  restart polling, and protected file safety net
- Ship layers split into 4 independent toggles (Military/Carriers, Cargo/Tankers,
  Civilian, Cruise/Passenger) with per-category counts
- Stable entity IDs using MMSI/callsign instead of volatile array indices
- Dismissible threat alert bubbles (session-scoped, survives data refresh)

Performance:
- GDELT title fetching is now non-blocking (background enrichment)
- Removed duplicate startup fetch jobs
- Docker healthcheck start_period 15s → 90s

Bug fixes:
- Removed fake intelligence assessment generator (OSINT-only policy)
- Fixed carrier tracker GDELT 429/TypeError crash
- Fixed ETag collision (full payload hash)
- Added concurrent /api/refresh guard

Contributors: @imqdcr (ship split + stable IDs), @csysp (dismissible alerts, PR #48)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Former-commit-id: a2c4c67da54345393f70a9b33b52e7e4fd6c049f
2026-03-13 11:32:16 -06:00
anoracleofra-code 362a6e2ceb Initial commit: ShadowBroker v0.1
Former-commit-id: 8ed321f2ba
2026-03-04 22:44:08 -07:00