Detected by Aeon + Semgrep (5x use-defused-xml ERROR).
Severity: medium
CWE-776 (billion laughs) / CWE-611 (XML external entity)
Five XML parse sites pass response bodies into the Python stdlib
xml.etree.ElementTree without protection against entity expansion
attacks. Python's ElementTree still permits internal entity references
by default (per the docs vulnerabilities table), so a malicious or
compromised upstream can ship a "billion laughs"-style payload that
expands to gigabytes in memory.
The user-controllable site is sb_monitor._parse_rss: the OpenClaw skill
exposes add_custom_feed(name, url, ...) to the agent, then
poll_custom_feeds fetches feed.url and passes the body to
xml.etree.ElementTree.fromstring with no host allowlist or
entity-bomb defence. The other four sites (psk_reporter_fetcher,
aircraft_database, cctv_pipeline x2) parse XML from hard-coded
upstreams (pskreporter.info, s3.opensky-network.org,
datos.madrid.es); hardening them is defence-in-depth against upstream
compromise or MITM.
Switch all five call sites to defusedxml.ElementTree. Same
fromstring/find/findall/iter/findtext API, but rejects entity
references by default (raises defusedxml.EntitiesForbidden).
Confirmed locally that a 4-deep billion-laughs payload that
expands to 3000 chars under stdlib ET is rejected by defusedxml.
Added defusedxml>=0.7.1 to backend/pyproject.toml dependencies.
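To make the payload class concrete, here is a stdlib-only sketch. It builds a four-level nested-entity document that expands to 3000 characters under xml.etree.ElementTree, then rejects it with an expat pre-scan that refuses any entity declaration. The pre-scan and the names EntityDeclarationRejected, build_nested_entity_payload and parse_untrusted are illustrative assumptions, not the defusedxml implementation; at the real call sites the defence is the import swap to defusedxml.ElementTree.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class EntityDeclarationRejected(ValueError):
    """Hypothetical stand-in for defusedxml.EntitiesForbidden."""


def build_nested_entity_payload() -> str:
    # Four entity levels: 3 chars -> 30 -> 300 -> 3000 once expanded.
    lines = [
        '<?xml version="1.0"?>',
        "<!DOCTYPE feed [",
        '<!ENTITY a "lol">',
        '<!ENTITY b "{}">'.format("&a;" * 10),
        '<!ENTITY c "{}">'.format("&b;" * 10),
        '<!ENTITY d "{}">'.format("&c;" * 10),
        "]>",
        "<feed>&d;</feed>",
    ]
    return "\n".join(lines)


def parse_untrusted(xml_text: str) -> ET.Element:
    """Pre-scan with expat and refuse any entity declaration, then parse."""

    def reject(name, *_details):
        raise EntityDeclarationRejected(f"entity declaration not allowed: {name}")

    scanner = expat.ParserCreate()
    scanner.EntityDeclHandler = reject  # fires on the first <!ENTITY ...>
    scanner.Parse(xml_text, True)
    # Only reached when the document declares no entities at all.
    return ET.fromstring(xml_text)
```

The pre-scan parses the document twice, which is acceptable for a demonstration but is one reason a maintained library is the better choice in production code.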
Co-authored-by: aeonframework <aeon-bot@aaronjmars.com>
External security audit by @tg12 (May 17, 2026) filed issues #201–#214
in addition to the #189–#200 batch already closed by PRs #227/#232/#260.
This PR closes all eight that are real security bugs (the other six in
the 201–214 range are either design discussions or upstream-abuse/TOS
concerns we're keeping intentional, see issue triage notes on each).
The user-facing principle for this PR: fix the security gap WITHOUT
introducing a single hostile error or behavior change for legitimate
users. Every fix follows the same template — fail forward, not loud.
When the secure path is harder than the insecure one, build a
fallback chain that ends in graceful degradation, not in a scary
modal or 422 response.
#205 — OpenMHZ audio redirect SSRF (services/radio_intercept.py)
Replaced requests.get(..., allow_redirects=True) with a manual
redirect loop that re-validates each hop's host against
_OPENMHZ_AUDIO_HOSTS. Same-host redirects (CDN edge selection)
still work, so legitimate audio playback is unaffected. Cross-host
redirects to disallowed hosts return a generic 502 which the
browser audio element handles gracefully. Cap at 5 hops.
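A minimal sketch of the manual redirect loop, assuming a caller-supplied fetch_once callable that stands in for a single non-following HTTP request. ALLOWED_HOSTS, fetch_once and fetch_with_validated_redirects are hypothetical names; the real allowlist is _OPENMHZ_AUDIO_HOSTS and the real request goes through requests.

```python
from typing import Callable, Optional, Tuple
from urllib.parse import urljoin, urlsplit

ALLOWED_HOSTS = frozenset({"media.example.test"})  # hypothetical allowlist
MAX_HOPS = 5
REDIRECT_CODES = {301, 302, 303, 307, 308}

# fetch_once(url) -> (status_code, Location header or None). A stand-in for
# one HTTP request made with redirects disabled.
FetchOnce = Callable[[str], Tuple[int, Optional[str]]]


def host_allowed(url: str) -> bool:
    parts = urlsplit(url)
    return parts.scheme in ("http", "https") and parts.hostname in ALLOWED_HOSTS


def fetch_with_validated_redirects(url: str, fetch_once: FetchOnce) -> Tuple[int, str]:
    """Return (status, final_url); status is 502 when any hop fails validation."""
    for _ in range(MAX_HOPS + 1):  # the initial request plus up to five redirects
        if not host_allowed(url):
            return 502, url
        status, location = fetch_once(url)
        if status not in REDIRECT_CODES or not location:
            return status, url
        url = urljoin(url, location)  # resolve relative Location headers
    return 502, url  # hop cap exceeded
```

Every hop is validated before it is requested, so a redirect to a disallowed host is refused without the backend ever contacting that host.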
#207 — infonet/status verify_signatures DoS (routers/mesh_public.py)
Silently downgrade verify_signatures=true to false for
unauthenticated callers. No error surfaced — the response shape is
identical, just without the O(n_events) signature verification.
Authenticated callers (scoped mesh.audit) still get the full path.
The frontend never passes this param so legitimate UI is unaffected.
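The silent downgrade can be sketched as below. The response fields and the helper names effective_verify_signatures and build_status are assumptions for illustration; only the mesh.audit scope name comes from the description above.

```python
from typing import Iterable, List

AUDIT_SCOPE = "mesh.audit"


def effective_verify_signatures(requested: bool, caller_scopes: Iterable[str]) -> bool:
    """Honor the expensive flag only for callers holding the audit scope."""
    return bool(requested) and AUDIT_SCOPE in set(caller_scopes)


def build_status(events: List[dict], requested_verify: bool, caller_scopes: Iterable[str]) -> dict:
    verify = effective_verify_signatures(requested_verify, caller_scopes)
    # Same keys whether or not verification runs, so callers see one shape.
    status = {"event_count": len(events), "signatures_verified": None}
    if verify:
        # Placeholder for the O(n_events) signature verification pass.
        status["signatures_verified"] = all(e.get("signature_ok", False) for e in events)
    return status
```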
#211 — thermal/verify expensive analysis (routers/sigint.py)
Added Depends(require_local_operator). Frontend has no direct
callers (verified by grep); Tauri/AI agents use scoped tokens that
pass the auth check. Anonymous abusers blocked silently — the
legitimate UI keeps working through the Next.js admin-key proxy.
#213, #214 — OpenMHZ calls/audio upstream abuse (routers/radio.py)
Added Depends(require_local_operator) to both. Browser users hit
these through the Next.js proxy at src/app/api/[...path]/route.ts
which injects X-Admin-Key, so the auth check passes transparently.
Direct attackers can no longer rotate sys_names to hammer
api.openmhz.com or relay arbitrary audio streams through the
backend's bandwidth.
#202 — overflights unbounded hours (routers/data.py)
Silently clamp `hours` to OVERFLIGHTS_MAX_HOURS (default 72,
configurable). NO 422 — clients asking for an absurd window get a
shorter window back with `requested_hours` and `effective_hours`
hint fields. Postel's law: liberal in what we accept, conservative
in what we compute.
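A small sketch of the clamp, assuming a lower bound of one hour (the description above only specifies the upper bound) and a hypothetical helper name clamp_overflight_hours:

```python
OVERFLIGHTS_MAX_HOURS = 72  # default from the description; configurable in the real code


def clamp_overflight_hours(requested_hours: int, max_hours: int = OVERFLIGHTS_MAX_HOURS) -> dict:
    """Clamp instead of rejecting, and report both values back to the client."""
    # The floor of 1 is an assumption made for this sketch.
    effective = max(1, min(requested_hours, max_hours))
    return {"requested_hours": requested_hours, "effective_hours": effective}
```

A client that asks for 1000 hours gets a 72-hour window and can tell from the two hint fields that the request was shortened.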
#203 — Meshtastic callsign UA leak (services/fetchers/meshtastic_map.py)
Added MESHTASTIC_SEND_CALLSIGN_HEADER opt-out env var. Default is
TRUE — preserves existing operator behavior (callsign sent so
meshtastic.org can rate-limit per-install). Privacy-conscious
operators set it to false to suppress.
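One way to read a default-on opt-out flag is sketched below. The accepted false spellings, the User-Agent format and the helper names are assumptions; only the variable name MESHTASTIC_SEND_CALLSIGN_HEADER comes from this change.

```python
from typing import Mapping, Optional

FALSE_VALUES = {"0", "false", "no", "off"}  # assumed spellings


def callsign_header_enabled(env: Mapping[str, str]) -> bool:
    """Default to True so existing installs keep their current behavior."""
    raw = env.get("MESHTASTIC_SEND_CALLSIGN_HEADER")
    if raw is None or not raw.strip():
        return True
    return raw.strip().lower() not in FALSE_VALUES


def build_headers(env: Mapping[str, str], callsign: Optional[str]) -> dict:
    headers = {"Accept": "application/json"}
    if callsign and callsign_header_enabled(env):
        # Hypothetical header format, for illustration only.
        headers["User-Agent"] = f"ShadowBroker ({callsign})"
    return headers
```

Passing the environment in as a mapping (os.environ in practice) keeps the helper easy to test.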
#206 — KiwiSDR upstream is HTTP-only (services/kiwisdr_fetcher.py)
Upstream rx.linkfanel.net doesn't speak HTTPS (verified — Apache
2.4.10 only on port 80). We can't fix the transport. Instead added
three layers:
1. Content validation on fetched data — reject responses with
<50 receivers or >5% malformed entries (likely MITM injection).
2. Existing disk cache fallback (already present).
3. NEW: bundled static directory at backend/data/kiwisdr_directory.json
shipping 798 known-good receivers. Used as last resort so the
KiwiSDR map layer always renders something useful.
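The three layers can be sketched as a fallback chain. The entry schema (name and url strings) and the loader callables are assumptions; the thresholds are the ones stated above.

```python
from typing import Callable, List, Optional, Tuple

MIN_RECEIVERS = 50
MAX_MALFORMED_PERCENT = 5


def is_well_formed(entry: object) -> bool:
    # Hypothetical schema: each receiver needs a string name and url.
    return (
        isinstance(entry, dict)
        and isinstance(entry.get("name"), str)
        and isinstance(entry.get("url"), str)
    )


def validate_directory(entries: object) -> bool:
    if not isinstance(entries, list) or len(entries) < MIN_RECEIVERS:
        return False
    malformed = sum(1 for e in entries if not is_well_formed(e))
    # Integer arithmetic avoids float rounding at the 5% boundary.
    return malformed * 100 <= MAX_MALFORMED_PERCENT * len(entries)


Loader = Callable[[], Optional[List[dict]]]


def load_directory(fetch_live: Loader, load_cache: Loader, load_bundled: Loader) -> Tuple[List[dict], str]:
    """Return (entries, source_label), trying live, then cache, then bundled."""
    try:
        live = fetch_live()
        if validate_directory(live):
            return live, "live"
    except Exception:
        pass  # a failed fetch is treated like an invalid response
    for label, loader in (("cache", load_cache), ("bundled", load_bundled)):
        try:
            entries = loader()
        except Exception:
            continue
        if entries:
            return entries, label
    return [], "empty"
```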
#208 — Merkle proof DoS via /api/mesh/infonet/sync (services/mesh/mesh_hashchain.py)
The endpoint is part of the cross-node federation protocol — peers
legitimately call it without local-operator auth, so we can't add
Depends(). Instead made the underlying operation O(1) per proof
via a cached Merkle level structure on the Infonet instance:
- _merkle_levels_cache + _merkle_levels_for_event_count on each
Infonet instance
- _invalidate_merkle_cache() called from every chain mutation
point (append, ingest_events, apply_fork, cleanup_expired)
- _get_merkle_levels() does the lazy recompute on first read
after invalidation, then serves from cache thereafter
Effect: anonymous attackers hammering the proofs endpoint hit a
cached structure; the rebuild happens at most once per real chain
advance. Federation untouched.
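The caching idea can be sketched with a toy chain. The class below is hypothetical and uses its own tree convention (SHA-256, odd nodes paired with themselves), which may differ from mesh_hashchain.py; it only shows that repeated proof requests reuse cached levels and that a mutation triggers one lazy rebuild.

```python
import hashlib
from typing import List, Optional


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(leaves: List[bytes]) -> List[List[bytes]]:
    """Return every tree level, leaves first and the root level last."""
    levels = [list(leaves)] if leaves else [[_h(b"")]]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        nxt = []
        for i in range(0, len(cur), 2):
            right = cur[i + 1] if i + 1 < len(cur) else cur[i]  # odd node pairs with itself
            nxt.append(_h(cur[i] + right))
        levels.append(nxt)
    return levels


class CachedMerkleChain:
    def __init__(self) -> None:
        self._leaves: List[bytes] = []
        self._levels_cache: Optional[List[List[bytes]]] = None
        self.rebuild_count = 0  # instrumentation for the sketch

    def append(self, event: bytes) -> None:
        self._leaves.append(_h(event))
        self._levels_cache = None  # invalidate on every mutation

    def _get_levels(self) -> List[List[bytes]]:
        if self._levels_cache is None:  # lazy rebuild on first read
            self._levels_cache = build_levels(self._leaves)
            self.rebuild_count += 1
        return self._levels_cache

    def leaf(self, index: int) -> bytes:
        return self._leaves[index]

    def root(self) -> bytes:
        return self._get_levels()[-1][0]

    def proof(self, index: int) -> List[bytes]:
        path = []
        for level in self._get_levels()[:-1]:
            sibling = index ^ 1
            if sibling >= len(level):
                sibling = index
            path.append(level[sibling])  # lookup only, no rehashing
            index //= 2
        return path


def verify_proof(leaf_hash: bytes, index: int, path: List[bytes], root: bytes) -> bool:
    node = leaf_hash
    for sibling in path:
        node = _h(node + sibling) if index % 2 == 0 else _h(sibling + node)
        index //= 2
    return node == root
```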
#201 — Tor bundle SHA-256 bypass (services/tor_hidden_service.py)
Docker users were already covered — backend/Dockerfile installs
Tor via apt-get at build time (signed by Debian's package system).
No runtime download needed for the 80%-of-users case.
For Tauri desktop, replaced the single .sha256sum check with a
multi-source verification chain implemented in _verify_tor_bundle():
1. Try upstream .sha256sum (current behavior — fast path)
2. Try baked-in digest list at backend/data/tor_bundle_digests.json
(pinned per-version, maintainer-updated)
3. If neither source is REACHABLE: HTTPS-only fallback with a loud
warning (avoids breaking first-run onboarding while the
maintainer hasn't yet pinned a new Tor release)
A mismatch from a source that DID respond is always fatal — only
the "no source reachable" case falls back to HTTPS-only. This is
the "have cake and eat it" pattern: real users see no new failure
modes during torproject.org outages, but MITM/compromise attacks
still fail because the downloaded digest can't match what BOTH
the upstream and the baked-in list report.
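One possible reading of the chain, sketched with hypothetical names: every source that responds must agree with the downloaded bundle, any disagreement is fatal, and the HTTPS-only fallback applies only when no source returns a digest. This is an assumption about _verify_tor_bundle(), not a copy of it.

```python
import hashlib
from enum import Enum
from typing import Callable, Iterable, Optional


class Verdict(Enum):
    VERIFIED = "verified"
    MISMATCH = "mismatch"      # always fatal
    HTTPS_ONLY = "https_only"  # no source reachable; caller should warn loudly


# A source returns a hex SHA-256 digest, or None when it is unreachable or
# has no pinned digest for this bundle.
DigestSource = Callable[[], Optional[str]]


def verify_bundle(bundle: bytes, sources: Iterable[DigestSource]) -> Verdict:
    actual = hashlib.sha256(bundle).hexdigest()
    matched = False
    for source in sources:
        expected = source()
        if expected is None:
            continue  # unreachable: try the next source
        if expected.strip().lower() != actual:
            return Verdict.MISMATCH
        matched = True
    return Verdict.VERIFIED if matched else Verdict.HTTPS_ONLY
```

With this shape, an attacker who blocks the digest sources can only force the warning path, while serving a tampered bundle alongside any honest digest still fails.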
Currently the digest file ships with placeholder values for the
current Tor URLs (those URLs are already stale on torproject.org
too). A follow-up commit can populate real digests when a stable
Tor release is selected; until then the HTTPS-only warning fires
and onboarding still works.
Tests (82 total, all passing):
test_openmhz_redirect_ssrf.py (5 tests) — #205
test_infonet_status_verify_gate.py (2 tests) — #207
test_overflights_clamp.py (5 tests) — #202
test_meshtastic_callsign_optout.py (3 tests) — #203
test_kiwisdr_fallback.py (6 tests) — #206
test_merkle_cache.py (6 tests) — #208
test_tor_bundle_verification.py (6 tests) — #201
test_control_surface_auth.py (extended) — #211, #213, #214
+ all previous security tests (CCTV redirect, GDELT https, sentinel
cache, crowdthreat opt-in, third-party fetcher gates, control
surface auth) continue to pass.
A test-infrastructure issue with SHARED_EXECUTOR teardown in the
broader sweep also exists on main (verified), so it is not introduced
by this PR.
Credit: @tg12 reported every one of these with accurate line citations
and the recommended fixes that informed this implementation.
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
External security audit by @tg12 (May 17, 2026) filed 11 issues against
the backend. PR #227 (May 18, AI-generated) closed seven of them by
adding require_local_operator to control-plane endpoints. Four remained
live; this PR closes the rest.
#192 — CCTV proxy followed redirects without re-validating host
Issue: /api/cctv/media validated only the caller-supplied URL host
before passing it to requests.get(..., allow_redirects=True). A 302
to http://127.0.0.1 or any internal/disallowed host was silently
followed, turning the proxy into an open-redirect-to-SSRF chain.
Fix in routers/cctv.py: replace the single allow_redirects=True call
with a manual follow loop. Each hop's Location is parsed, the host is
rerun through _cctv_host_allowed(), and non-HTTP schemes (file://,
ftp://, etc.) are rejected. Cap chain length at 5 hops.
Test: backend/tests/test_cctv_redirect_ssrf.py covers
- redirect to disallowed host -> 502
- redirect to localhost -> 502
- redirect to another allowed host -> 200
- redirect chain length cap
- non-HTTP scheme rejected
#198 — Gate introspection GETs were unauthenticated
Issue: /api/wormhole/gate/{gate_id}/{identity,personas,key} were
callable with no auth dependency. Any caller that could reach the
backend could dump the operator's active persona, persona inventory,
and key status for any gate_id they knew. The wiki's privacy threat
model explicitly markets gate personas as rotating, unlinkable
pseudonyms — this leak defeated that property.
Fix in routers/wormhole.py: add
dependencies=[Depends(require_local_operator)] to all three routes.
Test: backend/tests/test_control_surface_auth.py extended with
three new parameterized cases (lines 75-77).
#199 — GDELT military incident ingestion used plaintext HTTP
Issue: backend/services/geopolitics.py fetched
http://data.gdeltproject.org/gdeltv2/lastupdate.txt and ~48 export
archive URLs over plaintext HTTP. Passive observers could identify
Shadowbroker nodes from the fetch pattern. Active MITM could inject
doctored military incident records into the global map.
Fix in services/geopolitics.py: rewrite the lastupdate.txt fetch and
the export download URL constructor to use https://. GDELT's
data.gdeltproject.org serves the same content over HTTPS.
Test: backend/tests/test_gdelt_https.py asserts no plaintext HTTP
URLs to data.gdeltproject.org remain in code (comments excluded) and
that the HTTPS URLs we expect are present.
#200 — Sentinel token cache lookup used client_id only
Issue: routers/tools.py kept a process-global cache of Copernicus
bearer tokens. The lookup compared
_sh_token_cache["client_id"] == client_id. A caller who knew a valid
client_id but supplied any wrong client_secret hit the cache and
reused the legitimate caller's bearer token — burning their quota
and accessing imagery on their account.
Fix in routers/tools.py: replace the client_id field with
credential_fp, an HMAC-SHA256 over (client_id, client_secret) under
a per-process random key (_SH_TOKEN_CACHE_HMAC_KEY = os.urandom(32),
regenerated at startup). A caller who doesn't know the secret cannot
compute a matching fingerprint, so they miss the cache and hit the
real Copernicus token endpoint — which will reject their wrong
secret with a 401.
Test: backend/tests/test_sentinel_token_cache.py covers
- same client_id + different secrets => different fingerprints
- same credentials => same fingerprint (cache still works)
- different client_ids + same secret => different fingerprints
- cache no longer stores raw client_id (catches regression)
- attacker with wrong secret cannot reuse victim's token
Validation
pytest backend/tests/test_control_surface_auth.py
backend/tests/test_cctv_redirect_ssrf.py
backend/tests/test_gdelt_https.py
backend/tests/test_sentinel_token_cache.py
-> 37 passed
Credit: @tg12 reported all four of these in their May 17 audit with
correct line-number citations and accurate remediation recommendations.
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
PR #226 landed the i18n infrastructure and Chinese (zh-CN) translations.
This follow-up adds the safeguards that make accepting community
translations sustainable without exposing the project to subtle
state-aligned framing in future translation PRs.
Changes:
frontend/src/i18n/index.tsx (renamed from .ts)
- Add LOCALES registry: a single source of truth for available
languages and their NATIVE display names ("English", "中文 (简体)").
Adding a new language is now a one-entry change here plus a
JSON file.
- Add isLocale() guard so an unknown value in localStorage falls
through to navigator.language detection instead of corrupting
state.
- File renamed to .tsx because it contains JSX. Next.js tolerated
JSX in .ts but Vite/Oxc (used by vitest) does not.
frontend/src/components/SettingsPanel.tsx
Add a UI language picker to the Settings header — a small <select>
populated from LOCALES. Users no longer need the dev console to
switch languages. Locale change remains 100% client-side
(localStorage), no network call, no telemetry.
CONTRIBUTING.md (new)
Documents the translation-neutrality requirement that applies
symmetrically to all source countries:
- Translations must be technically faithful to the English source.
- Substitutions aligned with state propaganda from ANY country
(PRC, Russia, US, EU, etc.) will be rejected.
- The test is: "would a translator working strictly from the
English source produce this rendering?"
Also explains how translation PRs are reviewed and how to add
a new language.
.github/CODEOWNERS (new)
Auto-requests maintainer review on:
- /frontend/src/i18n/ (translation safety)
- /backend/auth.py, /backend/routers/wormhole.py,
/backend/services/mesh/, /backend/services/fetchers/
(the same paths recent security audits flagged as sensitive)
- /.github/workflows/, /.gitlab-ci.yml, /docker-compose*.yml,
/helm/ (build/deploy)
- /CONTRIBUTING.md, /.github/CODEOWNERS (policy itself)
frontend/src/__tests__/i18n/i18nProvider.test.tsx (new, 8 tests)
Locks in the i18n contract:
- LOCALES has both en and zh-CN with non-empty native labels
- Default English when navigator is English
- Auto-detect zh-CN when navigator language starts with "zh"
- localStorage preference overrides auto-detect
- setLocale persists to localStorage
- Unknown stored locale falls back to auto-detect
- Renders a real zh-CN translation (catches large-scale
translation removal in future PRs)
- Missing key falls back to the key itself
Note: i18n/index.tsx, the language toggle UI, the translation
policy, and the test suite together form a defense-in-depth setup.
The structural safety guarantee (no network calls, static JSON
bundled at build) is intact; this PR makes the social contract
around translations explicit and enforceable via branch
protection on CODEOWNERS-marked paths.
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Introduce a lightweight i18n system with auto-detection of browser
language and localStorage persistence. Add complete Chinese translations
for all major UI sections: navigation, controls, update dialogs, node
activation, terminal launcher, data layers, settings, filters, and more.
Technical terms (Wormhole, Infonet, Mesh, Shodan, SAR, etc.) are
intentionally kept in English. Falls back to English when Chinese
translation is not found.
Co-authored-by: wangsudong <wangsudong@kylinos.cn>
Brings the GitLab side to full parity with GitHub so users who prefer
gitlab.com get the same source, the same images, and the same install
paths. Until now, GitLab users could clone the source, but the Helm chart
and docker-compose paths only worked against GHCR.
What's new:
.gitlab-ci.yml
Multi-arch (amd64 + arm64) Docker builds on every push to main,
pushed to the project's GitLab Container Registry as:
registry.gitlab.com/bigbodycobain/shadowbroker/backend:latest
registry.gitlab.com/bigbodycobain/shadowbroker/frontend:latest
Plus a :$CI_COMMIT_SHORT_SHA tag for traceability. Uses
$CI_JOB_TOKEN — no credentials need to be configured.
Also adds a 'mirror-to-github' job that pushes main back to GitHub
via fast-forward-only `git push`. Skipped silently if the
GITHUB_MIRROR_TOKEN CI/CD variable isn't set. Setup instructions
are in the file header.
docker-compose.gitlab.yml
Override file that swaps the backend/frontend image: lines to the
GitLab registry. Used as:
docker compose -f docker-compose.yml -f docker-compose.gitlab.yml up -d
Verified with `docker compose config` — merges cleanly and emits
registry.gitlab.com/... image references.
helm/chart/values-gitlab.yaml
Helm values override that points the chart at the GitLab registry.
Used alongside the default values.yaml:
helm install ... -f helm/chart/values.yaml -f helm/chart/values-gitlab.yaml
README.md
Documents both install paths (GitHub default, GitLab override) for
both docker compose and Helm. Notes that both registries publish
identical images (same source, same CI matrix).
No credentials needed for the GitLab→GitLab side. The optional reverse
mirror requires a GitHub PAT (public_repo scope) added as the GitLab
CI/CD variable GITHUB_MIRROR_TOKEN — instructions in the .gitlab-ci.yml
header.
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
The chart referenced registry.gitlab.com/bigbodycobain/shadowbroker/{backend,frontend}:latest
as the primary image source, but two things made that path effectively
broken for new K8s installs:
1. No .gitlab-ci.yml has ever existed in this repo, so the GitLab
registry was never populated by automated builds. Any images there
would be stale or manually pushed.
2. The GitLab registry returns HTTP 401 on anonymous pulls, so even
if images existed, Helm-managed deployments without registry
credentials would fail.
GHCR, by contrast, is auto-built and pushed on every merge to main by
.github/workflows/docker-publish.yml, and ghcr.io allows anonymous pulls
for public images. It's also the registry that docker-compose.yml has
been using as primary all along, so this brings the Helm install path
to parity with the Docker Compose install path.
After this change:
- ghcr.io/bigbodycobain/shadowbroker-backend:latest <- now in chart
- ghcr.io/bigbodycobain/shadowbroker-frontend:latest <- now in chart
GitLab is preserved in the comments as a documented fallback for
operators who run private mirrors with their own CI.
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Each alert toast had a 5-second auto-dismiss timer that fired even
while the user was reading the card. This adds pause-on-hover: the
dismiss timer stops while the mouse is over a toast and restarts (full
lifetime) on mouse leave. The progress bar animation pauses with it,
so the visual matches the actual remaining time.
All other behavior is preserved: same cyber/mono styling, same spring
slide-in, same risk-color border + glow, same warning icon, same
LVL X/10 readout, same title/source layout, same click-to-fly + dismiss
on body click, same × dismiss button.
Implementation notes:
- Extract a ToastCard sub-component so each card can own its own
paused state (useState can't be array-indexed in the parent).
- Move the auto-dismiss timer out of useAlertToasts.ts and into
ToastCard. The hook previously scheduled the dismiss itself, which
meant the UI couldn't pause it — only the component knows whether
the user is interacting.
- Add tests covering: title/source/severity render, auto-dismiss
fires at 5s, hover pauses indefinitely, mouse-leave restarts the
full lifetime, × dismisses without flying, body-click flies +
dismisses.
This implements the genuine UX improvement that PR #234 was reaching
for, without #234's broken syntax, missing-field bug, duplicate
timer logic, or design regression.
Refs: #234
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
PR #227 hardened most Wormhole/Infonet control surfaces behind
require_local_operator and made the CrowdThreat fetcher opt-in. An
audit of the codebase against that PR's stated goals turned up four
classes of gap that the original change missed:
1. Two operator-only endpoints were left unprotected:
- POST /api/wormhole/join: calls bootstrap_wormhole_identity() and
flips the node into Tor mode, exactly the surface #227 hardened
on /api/wormhole/identity/bootstrap.
- POST /api/sigint/transmit: relays APRS-IS packets over radio
using operator-supplied credentials. Anything that reached the
API could transmit on the operator's authority.
Both now depend on require_local_operator. test_control_surface_auth.py
extended with regression coverage for both.
2. Five third-party fetchers were still default-on, phoning home to
politically/commercially sensitive upstreams on every poll cycle:
- fimi.py -> euvsdisinfo.eu -> FIMI_ENABLED
- prediction_markets -> Polymarket + Kalshi -> PREDICTION_MARKETS_ENABLED
- financial.py -> Finnhub / yfinance -> FINANCIAL_ENABLED or FINNHUB_API_KEY
- nuforc_enrichment -> huggingface.co -> NUFORC_ENABLED
- news.py -> configured RSS feeds -> NEWS_ENABLED (default on, kill switch)
These follow the CrowdThreat pattern: an explicit env-var gate, and when
the gate is off the fetcher empties the data slot and runs mark_fresh
instead of touching the network. New regression test
file test_third_party_fetchers_opt_in.py asserts each fetcher's
network entry point is not called when its gate is off.
3. The outbound User-Agent leaked both the operator's personal email
and a fork-specific GitHub URL on every fetcher request. Consolidated
to a single DEFAULT_USER_AGENT in network_utils.py, project-generic
by default (no contact info), overridable via SHADOWBROKER_USER_AGENT
for operators who want to identify themselves (e.g. for Nominatim or
weather.gov usage-policy compliance). Six call sites updated; the
Nominatim-specific override is preserved.
4. The same generic UA now also flows through the peer prekey lookup
in mesh_wormhole_prekey.py, so DM first-contact requests no longer
identify the caller as a Shadowbroker fork to the peer being
queried.
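The opt-in gate and the User-Agent override from items 2 and 3 can be sketched together. DataStore, the accepted true spellings and the generic User-Agent string are assumptions for illustration; FIMI_ENABLED and SHADOWBROKER_USER_AGENT are the variable names given above.

```python
from typing import Callable, Dict, List, Mapping, Set

TRUE_VALUES = {"1", "true", "yes", "on"}  # assumed spellings
GENERIC_USER_AGENT = "ShadowBroker"       # hypothetical project-generic default


class DataStore:
    """Toy stand-in for the backend's data slots and freshness tracking."""

    def __init__(self) -> None:
        self.slots: Dict[str, List[dict]] = {}
        self.fresh: Set[str] = set()

    def mark_fresh(self, slot: str) -> None:
        self.fresh.add(slot)


def gate_enabled(env: Mapping[str, str], flag: str) -> bool:
    # Opt-in: an unset or unrecognized value keeps the fetcher off.
    return env.get(flag, "").strip().lower() in TRUE_VALUES


def run_gated_fetch(
    env: Mapping[str, str],
    flag: str,
    slot: str,
    fetch: Callable[[], List[dict]],
    store: DataStore,
) -> bool:
    """Call fetch only when the gate is on; otherwise leave an empty, fresh slot."""
    if gate_enabled(env, flag):
        store.slots[slot] = fetch()
        ran = True
    else:
        store.slots[slot] = []
        ran = False
    store.mark_fresh(slot)
    return ran


def resolve_user_agent(env: Mapping[str, str]) -> str:
    return env.get("SHADOWBROKER_USER_AGENT", "").strip() or GENERIC_USER_AGENT
```

Marking the slot fresh even when the gate is off keeps downstream consumers from treating a disabled fetcher as a stale or failing one.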
.env.example updated to document all new opt-in env vars.
Tests: backend/tests/test_control_surface_auth.py (extended),
backend/tests/test_crowdthreat_opt_in.py (unchanged, still passes),
backend/tests/test_third_party_fetchers_opt_in.py (new, 7 tests).
All 31 tests pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Allow local-operator DM invite import without requiring a full admin session.
Prioritize bundled/bootstrap seed peers and shorten stale seed cooldowns for faster Infonet recovery.
Replace raw DM invite dumps with copyable signed-address controls, contact request handling, and safer sealed-send behavior while the private delivery route connects.
Ship the v0.9.79 runtime refresh with transport lane isolation, Infonet secure-message address management, MeshChat MQTT controls, selected asset trail behavior, telemetry panel refinements, onboarding updates, and desktop/package metadata alignment.
Also ignore local graphify work products so analysis folders do not leak into future commits.
Add Tor/onion runtime wiring and faster Infonet node status refresh.
Keep node bootstrap state clearer across Docker and local runtimes.
Use selected aircraft trail history for cumulative tracked-aircraft emissions.
Reduce cold-start stalls by raising the default backend memory limit, bounding heavy feed concurrency, preserving non-empty startup caches, and refreshing working news feeds. Fix the Next API proxy for Docker control-plane writes by stripping unsupported hop/body headers and forwarding small request bodies safely. Keep the dashboard dynamic so production users do not get stuck on a cached startup shell.
Let fresh Docker and local installs enter OpenSky, AIS, and other provider keys directly in onboarding or Settings without manually creating .env files. Persist keys server-side in the backend data store, keep them write-only from the browser, reload runtime settings, and retain local-operator access controls.
Allow the bundled Docker frontend proxy to reach local-operator endpoints through the private compose bridge without trusting LAN clients. This restores Time Machine, MeshChat key creation, AI pins/layers, and related local controls in Docker installs. Refresh first-run guidance so Docker users know to configure OpenSky and AIS keys through .env.
Render the app shell dynamically so Next can attach per-request CSP nonces to its production scripts, preventing Docker from serving a static shell that cannot hydrate. Also gives the first-contact warmup test enough time in CI.
Seed safe static backend data into fresh Docker volumes, tighten Docker build-context exclusions, avoid optional env warnings, and make the frontend healthcheck use the IPv4 loopback path that works inside the container.