Commit Graph

15 Commits

Author SHA1 Message Date
Shadowbroker 76750caa92 Round 7a: per-operator outbound attribution + GDELT GCS-direct fix (#292)
== Per-install operator handle for every third-party API call ==

Before this PR, every Shadowbroker install identified itself to
Wikipedia, Wikidata, Nominatim, GDELT, OpenMHz, Broadcastify,
weather.gov, NUFORC, Sentinel/Planetary Computer, TinyGS / CelesTrak,
Shodan, Finnhub, and others with a single project-wide User-Agent
("Shadowbroker/1.0" or "ShadowBroker-OSINT/1.0"). From the upstream's
perspective every install in the world looked like one giant scraper.
If one install misbehaved, the upstream's only recourse was to block
"Shadowbroker" as a whole.

PR #284 inadvertently doubled down on this in the frontend by
introducing a shared `WIKIMEDIA_API_USER_AGENT` constant. This PR
retrofits both backends to per-operator attribution.

  New setting: OPERATOR_HANDLE (env var / settings UI / auto-gen)
  New helper:  network_utils.outbound_user_agent("purpose")

The handle is auto-generated as "operator-XXXXXX" on first call (the
"shadow-" prefix from earlier drafts was deliberately dropped — too
suspicious-looking for abuse-detection systems). Operators can
override via OPERATOR_HANDLE; the value is sanitized to lowercase
alphanumeric+dash+underscore and capped at 48 chars. Persisted to
backend/data/operator_handle.json so it survives container restarts.

Retrofitted call sites (every previously-MONSTER User-Agent):
  - services/region_dossier.py (Wikipedia + Wikidata + Nominatim)
  - services/geocode.py         (Nominatim)
  - services/sentinel_search.py (Microsoft Planetary Computer)
  - services/feed_ingester.py   (operator-curated RSS feeds)
  - services/fetchers/earth_observation.py (weather.gov, NUFORC)
  - services/fetchers/infrastructure.py
  - services/fetchers/aircraft_database.py
  - services/fetchers/route_database.py
  - services/fetchers/trains.py
  - services/fetchers/meshtastic_map.py
  - services/shodan_connector.py
  - services/unusual_whales_connector.py (Finnhub)
  - services/tinygs_fetcher.py            (CelesTrak + TinyGS)
  - services/sar/sar_products_client.py
  - services/geopolitics.py               (GDELT)
  - services/radio_intercept.py           (Broadcastify + OpenMHz)
  - routers/cctv.py + main.py             (CCTV proxy)
  - routers/ai_intel.py
  - scripts/convert_power_plants.py       (release-time data refresh)

Spoofed browser UAs removed (issues #289 / #290 / #291 — tg12 audit):
  - cloudscraper-based Chrome impersonation against api.openmhz.com
    -> replaced with honest requests + per-install UA
  - Mozilla/5.0 spoofed UA on Broadcastify scrape
    -> replaced with honest UA
  - Mozilla/5.0 + fake first-party Referer on OpenMHz audio relay
    -> replaced with honest UA
  - cloudscraper dependency dropped from pyproject.toml + uv.lock

Frontend retrofit:
  - new GET /api/settings/operator-handle endpoint (local-operator
    gated) returns the install's handle
  - frontend/src/lib/wikimediaClient.ts fetches the handle once on
    first use, caches it for page lifetime, embeds it in the
    Api-User-Agent for every Wikipedia / Wikidata browser-direct call

== GDELT GCS-direct fix ==

GDELT's data.gdeltproject.org is a CNAME to a Google Cloud Storage
bucket. GCS responds with the wildcard *.storage.googleapis.com cert
which legitimately does NOT cover the GDELT custom domain, so Python's
TLS verification correctly refuses the connection. Some networks
happen to route through a path where this works; many (notably Docker
Desktop's outbound NAT on local installs) do not. Verified on the
maintainer's local install: GDELT was unreachable; 1610 geopolitical
events / 48 export files were dropping silently.

Fix: services/geopolitics._gcs_direct_gdelt_url() rewrites any
data.gdeltproject.org URL to its GCS-direct equivalent
(storage.googleapis.com/data.gdeltproject.org/...) where the standard
GCS cert is genuinely valid. api.gdeltproject.org and every other host
are left untouched.

Confirmed live: backend log goes from
  GDELT lastupdate failed: 500
to
  Downloading 48 GDELT export files...
  Downloaded 48/48 GDELT exports
  GDELT parsed: 1610 conflict locations from 48 files

== Tests ==

  backend/tests/test_per_operator_outbound_attribution.py (12 tests)
  backend/tests/test_gdelt_gcs_direct_rewrite.py          (6 tests)
  backend/tests/test_region_dossier_wikimedia_ua.py       (updated to
    pin the helper + per-operator handle, not the old constant)
  frontend/src/__tests__/utils/wikimediaClient.test.ts    (rewritten
    to mock /api/settings/operator-handle and assert per-operator UA)

Local: backend 114/114 security+audit+round7a suite green;
       frontend 718/718 vitest suite green.

Credit: tg12 (external security audit, issues #289/#290/#291
relating to spoofed UAs); BigBodyCobain (operator-prefix call,
GDELT cloud-vs-local diagnosis).
2026-05-21 15:11:28 -06:00
Shadowbroker 2b03b808ac Fix #279: add defusedxml to uv.lock so Docker image installs it (#282)
defusedxml is listed in backend/pyproject.toml line 18 but was missing
from uv.lock. The backend Dockerfile uses `uv sync --frozen --no-dev`,
which only installs packages pinned in the lockfile. As a result the
runtime image shipped without defusedxml even though pyproject declared
it, and any import path that touched it crashed at startup with:

    ModuleNotFoundError: No module named 'defusedxml'

Affected import sites:

- backend/services/psk_reporter_fetcher.py:10
- backend/services/fetchers/aircraft_database.py:21
- backend/services/cctv_pipeline.py:990
- backend/services/cctv_pipeline.py:1018

Fix: regenerate uv.lock so defusedxml v0.7.1 (matching the >=0.7.1
specifier in pyproject) is locked. No code changes -- only the lockfile.
Next image build picks it up via the existing `uv sync --frozen` step.

Reporter: external user. Thanks for catching the missing dep.
2026-05-21 10:18:40 -06:00
BigBodyCobain b86a258535 Release v0.9.79 runtime and messaging update
Ship the v0.9.79 runtime refresh with transport lane isolation, Infonet secure-message address management, MeshChat MQTT controls, selected asset trail behavior, telemetry panel refinements, onboarding updates, and desktop/package metadata alignment.

Also ignore local graphify work products so analysis folders do not leak into future commits.
2026-05-12 11:49:46 -06:00
BigBodyCobain b8ac0fb9e7 Harden v0.9.75 wormhole node sync and telemetry panels
Add Tor/onion runtime wiring and faster Infonet node status refresh.

Keep node bootstrap state clearer across Docker and local runtimes.

Use selected aircraft trail history for cumulative tracked-aircraft emissions.
2026-05-06 14:04:16 -06:00
BigBodyCobain 6ffd54931c Release v0.9.75 runtime and onboarding update
Ship the 0.9.75 source update with improved startup/runtime hardening, operator API key onboarding, Meshtastic MQTT controls, Infonet/MeshChat separation, desktop package versioning, and aircraft telemetry refinements.

Also updates focused backend/frontend tests for node settings, Meshtastic MQTT settings, and desktop runtime behavior.
2026-05-06 01:15:54 -06:00
BigBodyCobain 28b3bd5ebf release: prepare v0.9.7 2026-05-01 22:56:50 -06:00
anoracleofra-code 81b99c0571 fix: add meshtastic, PyNaCl, vaderSentiment to dependencies
Full import audit found these packages used but missing from
pyproject.toml — all silently broken in Docker:
- meshtastic: MQTT protobuf decode (why US/LongFast chat was empty)
- PyNaCl: DM sealed-box encryption
- vaderSentiment: oracle sentiment analysis (unguarded, would crash)
2026-03-26 16:19:24 -06:00
anoracleofra-code 6140e9b7da fix: pin paho-mqtt to v1.x (v2 broke callback API)
paho-mqtt v2 changed Client constructor and on_connect callback
signatures, breaking the Meshtastic MQTT bridge. Pin to <2.0.0
so the existing v1 code works correctly in Docker.
2026-03-26 15:57:14 -06:00
anoracleofra-code 12cf5c0824 fix: add paho-mqtt dependency + improve Infonet sync status labels
paho-mqtt was missing from pyproject.toml, causing the Meshtastic MQTT
bridge to silently disable itself in Docker — no live chat messages
could be received. Also improve Infonet node status labels: show
RETRYING when sync fails instead of misleading SYNCING, and WAITING
when node is enabled but no sync has run yet.
2026-03-26 15:45:11 -06:00
anoracleofra-code fb6d098adf fix: add missing orjson, beautifulsoup4, cryptography deps to pyproject.toml
Docker image was crash-looping with `ModuleNotFoundError: No module named 'orjson'`
because these packages were imported but not declared as dependencies.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 08:03:17 -06:00
anoracleofra-code 09e39de4ef fix: add dev dependency group to pyproject.toml for CI
CI runs `uv sync --group dev` but only a `test` group existed.
Renamed to `dev` and added ruff + black so Docker Publish can pass.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 06:33:35 -06:00
Orfeo Terkuci fa2d47ca66 Refactor project structure: separate backend dependencies into pyproject.toml 2026-03-24 20:03:51 +01:00
Orfeo Terkuci b87e9c36a6 Remove unused dependencies
Dependencies which are not used, such as geopy, legacy-cgi and lxml are removed.
Subdependencies such as beautifulsoup4 and pytz have been removed
2026-03-22 16:08:43 +01:00
Orfeo Terkuci edc22c6461 Remove duplicate pytest declaration 2026-03-22 15:54:42 +01:00
Orfeo Terkuci e7f96499b9 Create pyproject.toml file and import dependencies 2026-03-22 15:39:09 +01:00