Compare commits

76 Commits

Author SHA1 Message Date
BigBodyCobain 5ede669a12 Ship ShadowBroker v0.9.83 with live Infonet gate messaging and DM protocols.
Gate hashchain replication, Tor/SOCKS transport hardening, terminal session teardown, v0.9.83 UI/changelog, and release digest pins for seamless updater verification.
2026-06-15 15:37:29 -06:00
BigBodyCobain 8fcb01276c Participant compose: keep 1 CPU limit for single-vCPU VPS nodes.
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-15 02:34:03 -06:00
BigBodyCobain 10dc9450be Participant compose: 4G default RAM; fleet join opt-out via .env.
Dashboard VPS nodes can set MESH_INFONET_FLEET_JOIN=false to avoid Tor
manifest sync wedging the API during OSINT warmup.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-15 02:33:34 -06:00
BigBodyCobain bef462cdcf Restore full telemetry after E2E; make participant MESH_ONLY opt-in.
E2E harness recreates the full dashboard backend when a run ends so local
map layers are not left in lean MESH_ONLY mode. Participant compose no
longer forces MESH_ONLY=true — set it in .env only for lean DM-only nodes.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-15 02:19:08 -06:00
BigBodyCobain 5135b771f5 Fix fleet E2E for third participant and Tor-only shared DM delivery.
Step 8 uses live HTTP poll/decrypt instead of wedging remote python;
prime local wormhole before Tor warmup; auto-set MESH_RELAY_PEERS on
participant prime. Verified Extra run 119 and Pete Tor-only run 121.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-15 02:10:26 -06:00
BigBodyCobain 7151563a41 Fix Extra-participant E2E: live contact send and Tor prekey cache path.
Adds connect-contact HTTP endpoint with cached-bundle support, subprocess contact send via docker cp bundle file, and direct Tor prekey fetch to avoid wedging single-worker uvicorn.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-14 17:00:13 -06:00
BigBodyCobain 52a28967a0 Use direct Tor prekey fetch for third-party participant E2E lookups.
Avoids wedging single-worker local uvicorn on long /dm/pubkey aggregator calls when testing new fleet onions like vps-extra.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-14 13:25:37 -06:00
BigBodyCobain 96182fe66d Add git-deploy and skip-remote-prep options for fleet E2E harness.
Supports third-party participants deployed via compose pull; includes wormhole prime helper for fresh VPS nodes.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-14 12:53:36 -06:00
BigBodyCobain 174031479c Generalize E2E harness env for any fleet participant onion host.
REMOTE_PARTICIPANT_ONION aliases PETE_ONION so the same script can target a non-Pete peer once deployed.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-14 12:14:51 -06:00
BigBodyCobain f1cd9eb4b9 Pass Tor E2E shared DM flow and harden mesh relay for fleet participants.
MLS export/reset and accept use live HTTP so uvicorn privacy-core state stays consistent; relay persistence and sender_seal fixes enable invite-accept-shared decrypt across onion peers. Adds participant/e2e compose overlays and harness recovery with optional Tor-only replicate mode.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-14 11:26:57 -06:00
BigBodyCobain c266c5ff5e Close v1 swarm: fresh-participant smoke test, join retries, README fleet note.
Retry announce/manifest while Tor circuits warm on NODE and startup bootstrap.
Add verify_swarm_fresh_participant.py for empty-volume GHCR smoke tests.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-12 03:09:02 -06:00
BigBodyCobain 52a0968092 Fix MessagesView first-contact test for allowLegacyAgentId lookup option.
fetchDmPublicKey now passes allowLegacyAgentId: false for short-address
contact requests; update the assertion to match the new call signature.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-12 02:21:25 -06:00
BigBodyCobain 89d6bb8fb9 Ship DM connect delivery, fleet pubkey lookup, OpenClaw Infonet agent, and relay auto-wormhole.
Auto-relay connect DMs with End Contact severing, signed fleet prekey lookup,
OpenClaw private Infonet channel intents, headless relay Tor bootstrap on redeploy,
and swarm/DM live verification scripts.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-12 02:15:56 -06:00
BigBodyCobain d48a0cdace Use GHCR image for relay compose so seed VPS pulls published builds.
Seed relay nodes should track CI-published backend images instead of local builds that fail without full monorepo context.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-11 10:39:46 -06:00
BigBodyCobain df76f6f147 Enable zero-config Infonet fleet join for all participant nodes.
Ship sb-testnet fleet defaults, swarm/join API, NODE launcher registration step, and meshnode script defaults so users discover peers via the signed seed manifest without manual peer lists.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-11 10:25:48 -06:00
BigBodyCobain 776c89bfcf Add private Infonet swarm discovery and gate propagation.
Signed peer manifest pull/announce on the seed, immediate hashchain push for gate messages, seed-only Docker defaults, and stale-genesis sync diagnostics.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-11 03:15:25 -06:00
BigBodyCobain d3006df57a Fix frontend CI after Meshtastic Chat panel refactor.
Update gate-resync decomposition expectations for Infonet embed and harden GateView stream snapshot waits for slower CI runners.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-11 01:05:59 -06:00
BigBodyCobain e78e4d186d Ship Meshtastic Chat UX, embedded Infonet/SHELL panels, and Docker dev polish.
Rename Mesh Chat to Meshtastic Chat, embed the Infonet terminal with Arti/Tor warmup, improve the agent shell PTY (git in the backend image, operator PATH), and add docker-compose.override for local image builds. Gitignore Hermes Agent runtime installs.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-11 00:55:38 -06:00
BigBodyCobain d1e1be4016 Replace mock Agent Shell overlay with inline xterm PTY and dock/expand UX.
Uses a local-operator WebSocket bash session, keeps the map interactive, and SNAP docks the shell back into Mesh Chat instead of a floating blurred panel.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-10 11:30:50 -06:00
BigBodyCobain 0afb85e241 Fix MeshChat behavior tests after Agent Shell tab replaced dashboard Dead Drop UI.
Point trust and dm-add assertions at Infonet Messages and MeshTerminal where those flows now live.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-10 09:44:44 -06:00
BigBodyCobain 039a0f9d0c Remove Dead Drop dashboard UI so Agent Shell frontend build passes.
Dead Drop chat stays in Infonet Terminal; Mesh Chat dms tab is Agent Shell only.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-10 09:40:54 -06:00
BigBodyCobain b9b99c1fa8 Replace Mesh Chat Dead Drop tab with stretchable Agent Shell panel.
Anchors to the Mesh Chat box, stretches on tab enter, and supports user resize without changing the fixed left column width.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-10 00:26:58 -06:00
BigBodyCobain a8fd33a758 Add OpenClaw fast-path routing with playbooks and expensive-command gate.
Move intent routing into route_query/ask, short-circuit find_entity fuzzy search, and document the thin three-tool agent surface so Hermes avoids multi-second search_telemetry by default.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-09 21:32:08 -06:00
BigBodyCobain 7346129d0e Fix ChangelogModal TypeScript after contributor trim.
Declare optional pr on contributor entries so the build type-check passes with OSIRIS-only credits.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-09 00:14:09 -06:00
BigBodyCobain eb8f39f84e Fix v0.9.82 changelog credits: drop stale contributor tags.
Remove recycled names from older releases; keep only OSIRIS third-party attribution for this cycle.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-08 23:30:42 -06:00
BigBodyCobain 00f9e3f1fd Pin v0.9.82 release digests for updater integrity verification.
Carry SHA-256 hashes for the source zip, MSI, and setup EXE into release_digests.json while retaining prior release entries.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-08 23:13:34 -06:00
BigBodyCobain ffdfe0426b Prepare v0.9.82 release: bump versions and changelog UI.
Align backend, desktop, helm, and frontend package versions for the Telegram OSINT and OpenClaw recon release.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-08 23:05:26 -06:00
BigBodyCobain 1583fd5715 Expose new telemetry and recon toolkit to OpenClaw agents.
Wire telegram_osint, malware, cyber, and SCM into search/slow-tier helpers; add osint_lookup, entity_expand, and osint_sweep commands; update README and skill docs.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-08 22:44:16 -06:00
BigBodyCobain af9b3d08cc feat: Telegram OSINT map layer, Osiris intel ports, and maritime settings
Add Telegram OSINT with hourly incremental t.me scraping, metro geocoding
separate from news centroids, threat-intercept popup UI with inline media,
and HTML markers above alert boxes so pins stay clickable. Expose GFW_API_TOKEN
in onboarding and Settings Maritime; harden GFW/CCTV/geo fetchers. Port Osiris-
derived recon, SCM, entity graph, malware/cyber feeds, sanctions, and submarine
cable layers with tests and documentation.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-08 21:04:08 -06:00
BigBodyCobain b64b9e0962 Add Sentinel-2 road freight trends with Analyze Here UI.
Port DrishX truck-motion detection as an opt-in slow layer: on-demand map-center analysis, preset corridors, layer panel toggle, and Docker road-corridor extras.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-07 23:39:13 -06:00
BigBodyCobain 76f4deb3a7 test: remove dead _make_client helper from conftest (from PR #376 review).
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-06 20:40:29 -06:00
BigBodyCobain 49d90eaf69 Track production-hardening checklist in docs (gitignore exception).
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-06 20:23:11 -06:00
BigBodyCobain 079ff7b737 Harden production checklist: dedupe live-data routes and align serializers.
Pin Mathieu's data-path checklist in docs and PR template, remove dead main.py fast/slow handlers, unify orjson via _live_data_json_bytes, and bound LiveUAMap Playwright defaults.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-06 20:16:18 -06:00
BigBodyCobain bd81a940ff Follow up on #375 review: dedupe live-data route and harden serializers.
Align full /api/live-data with slow-tier orjson options, remove dead main.py duplicate, cap slow batches to pool size, cancel queued work on timeout, and stop retrying HTTP 4xx/5xx.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-06 20:10:59 -06:00
BigBodyCobain 9a0a9a116a Address #375 production-readiness: dev bind, live-data lock, heavy fetch pool.
Default python main.py to loopback, deep-copy dashboard snapshots outside the store lock with ETag on full live-data, and route GDELT/LiveUAMap/CCTV/slow-tier work through an isolated executor so Playwright jobs cannot starve fast-tier workers.
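
A minimal sketch of the "deep-copy outside the store lock" idea, assuming writers replace the whole snapshot object rather than mutating it in place. The class and method names are hypothetical and are not taken from the repository.

```python
import copy
import threading


class SnapshotStore:
    """Hypothetical store: writers swap in a new snapshot, readers copy it."""

    def __init__(self):
        self._lock = threading.Lock()
        self._snapshot = {}

    def replace(self, new_snapshot):
        # Assumption: writers never mutate a published snapshot in place.
        with self._lock:
            self._snapshot = new_snapshot

    def read_copy(self):
        # Hold the lock only long enough to grab the reference.
        with self._lock:
            ref = self._snapshot
        # The expensive deep copy runs without blocking writers.
        return copy.deepcopy(ref)
```

The copy is only safe outside the lock because of the replace-only assumption; a store that mutates snapshots in place would need a different approach.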

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-04 17:29:04 -06:00
BigBodyCobain 80a01275ff Add MKT opt-in on threat intercept, jittered market fetches, and Sentinel multi-scene dossier.
Operators enable Polymarket/Kalshi correlation from Global Threat Intercept with a consent dialog; polls use a jittered schedule separate from the slow tier. Right-click Sentinel imagery returns up to three signed scenes again.
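
A minimal sketch of a jittered poll delay. The 20% jitter fraction and the function name are illustrative assumptions; the commit does not state the actual interval or spread.

```python
import random


def next_poll_delay(base_s: float, jitter_fraction: float = 0.2, rng=random) -> float:
    """Return base_s shifted by a random amount within +/- jitter_fraction."""
    spread = base_s * jitter_fraction
    # Uniform jitter keeps many installs from polling in lockstep.
    return base_s + rng.uniform(-spread, spread)
```

Passing a seeded `random.Random` instance as `rng` makes the schedule reproducible in tests.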

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-04 09:01:21 -06:00
BigBodyCobain 3ac8442e4b fix(uap): weekly live NUFORC refresh with 7-day cache for operators
Each install pulls ~60-day sightings from nuforc.org every Monday; disk cache
matches weekly cadence so users keep current pins between restarts.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-03 18:41:28 -06:00
BigBodyCobain 5f322b0a79 fix(uap): enforce 60-day window, refresh daily, live NUFORC on Windows
Filter stale rows out of nuforc_recent_sightings.json on load; add requests-based
live scrape when curl is disabled; daily scheduler rebuild instead of weekly-only.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-03 18:27:30 -06:00
BigBodyCobain 363b5a49c8 Close tg12 outbound audit (#348-#366): operator UA, opt-ins, docs
- User-Agent is per-install handle only (no Shadowbroker product token)
- LiveUAMap: Windows UI consent when enabling Global Incidents; env override
- Meshtastic callsign upstream header off by default (opt-in true)
- Expanded docs/OUTBOUND_DATA.md and README link for CCTV, basemap, Broadcastify

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-03 15:01:32 -06:00
BigBodyCobain a3e5c98cd0 test(cctv): Madrid KML HTTPS-first fallback; clarify KiwiSDR #364 docs
Adds unit coverage for MadridCityIngestor catalog fetch order.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-03 14:33:01 -06:00
BigBodyCobain 6a098e1c5f Pin DeepState mirror, prefer HTTPS for Madrid/KiwiSDR, document outbound data (#362–#364).
Operators can set DEEPSTATE_MIRROR_COMMIT for immutable frontline ingest; Madrid KML tries HTTPS then HTTP without changing camera image URLs or proxy Referers.
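
A minimal sketch of the HTTPS-then-HTTP fetch order, assuming the caller injects the actual fetch function. `fetch_https_first` is a hypothetical helper and is not the MadridCityIngestor code.

```python
from typing import Callable


def fetch_https_first(host_and_path: str, fetch: Callable[[str], bytes]) -> bytes:
    """Try https:// first and fall back to http:// on a network error."""
    last_error = None
    for scheme in ("https", "http"):
        try:
            return fetch(f"{scheme}://{host_and_path}")
        except OSError as exc:  # urllib.error.URLError is an OSError subclass
            last_error = exc
    raise last_error
```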

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-03 14:31:31 -06:00
BigBodyCobain f08781bdc9 Route dossier, geocode, and Wikimedia through the backend (#351, #352, #360)
Proxy region dossier, Sentinel search, Wikipedia, and Wikidata via self-hosted
APIs; remove LocateBar client-side Nominatim fallback; migrate legacy shadow-
operator handles to operator- prefix.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-02 15:20:44 -06:00
BigBodyCobain c3dd95f6a9 Address remaining safe security hardening
2026-06-02 13:34:11 -06:00
BigBodyCobain 10a8c7b5be Apply non-disruptive security hardening
2026-06-02 12:50:41 -06:00
BigBodyCobain f03ebbba11 Clarify OpenClaw HMAC agent credentials
2026-05-30 13:52:01 -06:00
BigBodyCobain a16f22ed34 Cover AI and SAR proxy auth routes
2026-05-29 08:15:06 -06:00
BigBodyCobain 41e35e4da2 Fail fast on short admin keys
2026-05-28 15:02:40 -06:00
BigBodyCobain be3ab5823a Fix self-host API key proxy auth
2026-05-28 01:54:23 -06:00
BigBodyCobain ef52bd03d2 Harden private Infonet host checks
2026-05-28 01:26:48 -06:00
BigBodyCobain 017f383096 Fix BadHost path handling
2026-05-28 01:24:33 -06:00
Shadowbroker 41799f9891 feat(ci): switch GitLab mirror-to-github job to per-repo SSH deploy key (#331)
* feat(ci): switch mirror-to-github job from PAT to per-repo SSH deploy key

GitHub fine-grained PATs are capped at 366 days, and classic PATs would
need 'public_repo' (a broader scope than needed). Per-repo SSH deploy
keys are tighter:
- Can ONLY push to BigBodyCobain/Shadowbroker (no access to anything
  else, not even other repos owned by the same account).
- Never expire.
- Rotating == one-click delete on github.com/.../settings/keys.

Changes:
- New CI/CD variable GITHUB_MIRROR_SSH_KEY (File, Protected) holding
  the ed25519 private half. Public half lives on the repo's deploy
  keys with write access enabled.
- mirror-to-github before_script writes the key to ~/.ssh/id_ed25519,
  pins github.com host fingerprints (ed25519 + ecdsa + rsa from the
  2023-03-24 rotation) into ~/.ssh/known_hosts so we never trust a
  MITM, then pushes via git@github.com:... instead of HTTPS.
- Job rule now gates on GITHUB_MIRROR_SSH_KEY (the new var) instead
  of GITHUB_MIRROR_TOKEN (which never existed).

After this lands, every commit pushed directly to GitLab main will
mirror back to GitHub main automatically — closing the loop on
bi-directional sync.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(secret-scan): exempt SSH known_hosts entries from leaked-key detection

PR #331 introduced github.com host fingerprints pinned in
.gitlab-ci.yml's mirror-to-github before_script. The scanner flagged
them as embedded secrets and blocked CI:

  BLOCKED: Embedded secrets/tokens found in:
    .gitlab-ci.yml
      133: github.com ssh-ed25519 AAAA...
      135: github.com ssh-rsa AAAA...

These are PUBLIC host keys — the whole point of pinning known_hosts is
to publish the fingerprint widely so a MITM is detectable. They are
documented at https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/githubs-ssh-key-fingerprints
and committing them is the correct, secure practice.

Fix: add a KNOWN_HOSTS_LINE regex to the content-scan block that
recognizes `<host-or-ip> [salt] <algo> AAAA...` shape lines (the
exact format used in ~/.ssh/known_hosts) and filters them out before
flagging the file. Bare `ssh-rsa AAAA...` lines without a host prefix
are still caught — only the host-key shape is exempt.
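
A minimal sketch of the exemption, assuming a Python scanner. The pattern below is illustrative and is not the scanner's actual KNOWN_HOSTS_LINE regex; it covers the plain `<host-or-ip> <algo> AAAA...` shape only and leaves out hashed or salted host fields.

```python
import re

# Hypothetical pattern: host field, then key algorithm, then base64 key body.
KNOWN_HOSTS_LINE = re.compile(
    r"^\s*[A-Za-z0-9.\-\[\]:,]+\s+"
    r"(?:ssh-ed25519|ssh-rsa|ecdsa-sha2-nistp(?:256|384|521))\s+"
    r"AAAA[A-Za-z0-9+/]+=*\s*$"
)


def is_known_hosts_entry(line: str) -> bool:
    """True only when a host prefix precedes the algorithm and key."""
    return KNOWN_HOSTS_LINE.match(line) is not None
```

A bare `ssh-rsa AAAA...` line fails the match because the token after the first field must be a key algorithm, so such lines would still be flagged.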

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 04:22:09 -06:00
Shadowbroker a1af9c3595 fix(ci): wrap GitLab dind TLS env in docker context so buildx accepts it (#330)
The build-backend and build-frontend jobs were failing immediately after
identity verification finally allocated runners:

    $ docker buildx create --use --name multiarch --driver docker-container
    ERROR: could not create a builder instance with TLS data loaded from
    environment. Please use `docker context create <context-name>` to create
    a context for current environment and then create a builder instance
    with context set to <context-name>

The dind service exports DOCKER_HOST=tcp://docker:2376 +
DOCKER_TLS_CERTDIR=/certs, but buildx --driver docker-container doesn't
read TLS from those env vars directly. Documented GitLab fix: create an
empty `docker context` (which inherits the current TLS env), then bind
buildx to that context name as a positional arg.

After this lands, the multi-arch buildx jobs should actually build and
push amd64 + arm64 images to
  registry.gitlab.com/bigbodycobain/shadowbroker/backend:latest
  registry.gitlab.com/bigbodycobain/shadowbroker/frontend:latest

Surfaced by the post-verification pipeline at
  https://gitlab.com/bigbodycobain/Shadowbroker/-/pipelines/2550501798

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 02:04:53 -06:00
Shadowbroker c8a8fc56f8 chore(ci): bump comment in .gitlab-ci.yml to verify post-verification runner allocation (#329)
Pipelines on the GitLab mirror have been instant-failing with 0 jobs and
no started_at since the project was created — classic "shared runners
not allocated to unverified free-tier accounts" pattern. The account is
now identity-verified; this trivial comment bump exists solely to fire a
fresh pipeline that confirms runners now pick up the build-backend and
build-frontend jobs.

If the resulting pipeline produces real jobs that build the multi-arch
images and push them to registry.gitlab.com/bigbodycobain/shadowbroker/{backend,frontend},
the GitLab install path is at full parity with the GitHub one.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 01:54:08 -06:00
Shadowbroker e6aba86ce1 chore(release): update v0.9.81 SHA256 digests after rebuild (#328)
Re-cut v0.9.81 binaries from current main (which now includes the
private gate + DM hashchain spool from #326 and the gate-directory
test from #327). All three artifacts were signed with the same
minisign updater key as the original v0.9.81 release, so existing
v0.9.81 installs on Tauri auto-update accept the new bundles.

Updated hashes (verified against released assets):
- ShadowBroker_v0.9.81.zip      f81f454bdc88e9a32c351df38212b8cfa624704d65764b971bb091eef62259c6
- ShadowBroker_0.9.81_x64-setup.exe   25e9a95d0d8ce959a7d08fe8e7406772ae24b596652793e81d1de5d02510a5a6
- ShadowBroker_0.9.81_x64_en-US.msi   34e655fc0c0f195ee4ac978f228a4b2b9d5565253b8771aca9ef4693409e9e70
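
A minimal sketch of checking a downloaded artifact against a pinned digest such as the ones listed above. The helper names are hypothetical, and how the updater actually reads release_digests.json is not shown here.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large installers are not loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pinned(path: Path, pinned_hex: str) -> bool:
    """Compare the computed digest with the pinned hex string."""
    return hmac.compare_digest(sha256_of(path), pinned_hex.strip().lower())
```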

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 01:16:12 -06:00
Shadowbroker d5609ac02f test(infonet): cover gate directory renderer (landing + command variants) (#327)
Adds the focused test Codex wrote alongside the gate-directory UI work
that already shipped in #326 (the `renderGateDirectory` helper used
both under the Infonet logo on the landing screen and as the output of
the `gates` command in the terminal).

The renderer itself is already on origin/main; this PR just ships the
test so CI catches regressions to the dual-variant render.

Verified locally:
- frontend npm run test:ci -- src/__tests__/mesh/infonetShellGateDirectory.test.tsx → 1/1 pass

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-24 21:55:54 -06:00
Shadowbroker 1d7fa5185a feat(infonet): private gate + DM hashchain spool with hardened propagation (#326)
Private gate messages and offline DMs now ride the Infonet hashchain
as ciphertext-only events, replicated across nodes via private
transports (Tor onion / RNS / loopback) and decrypted only by parties
holding the gate or recipient keys.

Hashchain core (mesh_hashchain.py)
----------------------------------

* New ``append_private_gate_message`` and ``append_private_dm_message``
  append paths with full signature verification, public-key binding,
  revocation check, and replay protection in a dedicated sequence
  domain (so a gate post does not consume the author's public broadcast
  sequence, and a DM cannot replay-block a public message at sequence=1).
* Fork validation and full-chain validation now accept the gate
  signature compatibility variants — older signatures that canonicalize
  with/without epoch or reply_to still verify, so a re-sync from an
  older peer doesn't reject still-valid history.
* DM hashchain spool: capped at 2 active sealed offline DMs per
  recipient mailbox, plus a per-(sender, recipient) cap so one prolific
  sender can't consume both slots. 1-hour TTL on the cap counter.
  Spool intentionally small — it's an offline bootstrap channel,
  not a persistent mailbox.
* Rebuild-state preserves the gate sequence domain across reloads so
  a chain reload doesn't accidentally let an old gate sequence
  replay-collide on next append.
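
A minimal sketch of the DM spool caps described above: 2 active entries per recipient, 1 per (sender, recipient) pair as stated in the test list, and a 1-hour TTL. The class is a hypothetical in-memory tracker and is not the code in mesh_hashchain.py.

```python
import time
from collections import defaultdict

MAX_PER_RECIPIENT = 2   # from the commit message
MAX_PER_PAIR = 1        # from the test description ("1-per-pair")
CAP_TTL_S = 3600        # 1-hour TTL on the cap counter


class DmSpoolCaps:
    """Hypothetical cap tracker keyed by recipient mailbox."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = defaultdict(list)  # recipient -> [(sender, accepted_at)]

    def try_admit(self, sender: str, recipient: str) -> bool:
        now = self._clock()
        # Drop entries whose TTL has elapsed before counting.
        live = [(s, t) for s, t in self._entries[recipient] if now - t < CAP_TTL_S]
        self._entries[recipient] = live
        if len(live) >= MAX_PER_RECIPIENT:
            return False
        if sum(1 for s, _ in live if s == sender) >= MAX_PER_PAIR:
            return False
        live.append((sender, now))
        return True
```

With these limits a single sender can hold at most one of a recipient's two slots, which matches the stated goal that one prolific sender cannot consume both.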

Schema enforcement (mesh_schema.py)
-----------------------------------

* Private gate + DM payloads have closed allowlists of fields.
  Plaintext keys (``message``, ``plaintext``, ``_local_plaintext``,
  ``_local_reply_to``) are explicit rejection-bait — they raise before
  the event ever touches the chain.
* DM ciphertext + nonce must look like base64-ish sealed bytes;
  obvious base64-encoded plaintext shapes are rejected.
* ``transport_lock`` required: DM hashchain spool requires
  ``private_strong``; gate accepts ``private``/``private_strong``/
  ``rns``/``onion``.
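
A minimal sketch of closed-allowlist validation for a private DM payload. The forbidden keys and the transport_lock rule come from the text above; the allowed field set and the function name are assumptions, since the real allowlist lives in mesh_schema.py.

```python
FORBIDDEN_PLAINTEXT_KEYS = {"message", "plaintext", "_local_plaintext", "_local_reply_to"}
# Hypothetical allowlist for illustration only.
ALLOWED_DM_FIELDS = {"ciphertext", "nonce", "recipient_id", "transport_lock"}
DM_TRANSPORT_LOCKS = {"private_strong"}


def validate_private_dm_payload(payload: dict) -> None:
    """Raise ValueError before the payload can reach the chain."""
    leaked = payload.keys() & FORBIDDEN_PLAINTEXT_KEYS
    if leaked:
        raise ValueError(f"plaintext fields are not allowed: {sorted(leaked)}")
    unknown = payload.keys() - ALLOWED_DM_FIELDS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    if payload.get("transport_lock") not in DM_TRANSPORT_LOCKS:
        raise ValueError("dm_message requires transport_lock=private_strong")
```

Checking the forbidden keys first gives a specific error for plaintext leaks, even though the allowlist check alone would also reject them.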

Defense-in-depth at the network layer (main.py + mesh_public.py)
----------------------------------------------------------------

* ``_infonet_sync_response_events`` now silently redacts private events
  (gate_message + dm_message) unless the request looks like a loopback /
  onion / RNS / private transport caller. If an operator accidentally
  exposes :8000 to the public internet, an external puller gets
  public events only — never ciphertext.
* ``_sync_from_peer`` raises ``PeerSyncRateLimited`` for 429 (handled
  as 4-tuple return with retry_after_s) and ``PeerSyncHTTPError`` for
  other non-200 statuses (handled by ``_run_public_sync_cycle`` to
  honor server cooldown hints even outside the 429 path).
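
A minimal sketch of the sync-response redaction, assuming each event is a dict with an `event_type` key and that the caller's transport has already been classified. Both the field name and the transport labels are assumptions, not the actual `_infonet_sync_response_events` logic.

```python
PRIVATE_EVENT_TYPES = {"gate_message", "dm_message"}
# Hypothetical labels for callers treated as private.
PRIVATE_TRANSPORTS = {"loopback", "onion", "rns", "private"}


def filter_sync_events(events, caller_transport: str):
    """Return all events to private callers, public events to everyone else."""
    if caller_transport in PRIVATE_TRANSPORTS:
        return list(events)
    # Silent redaction: no error, the private events are simply omitted.
    return [e for e in events if e.get("event_type") not in PRIVATE_EVENT_TYPES]
```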

DM relay hydration (main.py)
-----------------------------

* New ``_hydrate_dm_relay_from_chain``: when accepted dm_message chain
  events arrive on a node, they get deposited into the local DM relay
  store with a deterministic sender_token_hash so re-sync of the same
  event is idempotent. Recipients see the ciphertext as a normal DM
  on their next poll and decrypt with their existing recipient key.
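
A minimal sketch of why a deterministic token makes hydration idempotent. The hash inputs, field names, and store class are hypothetical; the commit does not say how sender_token_hash is actually derived.

```python
import hashlib


def sender_token_hash(event_id: str, recipient_id: str) -> str:
    """Hypothetical derivation: the same event always yields the same token."""
    material = f"dm-hydrate|{event_id}|{recipient_id}".encode("utf-8")
    return hashlib.sha256(material).hexdigest()


class DmRelayStore:
    """Hypothetical relay store keyed by the deterministic token."""

    def __init__(self):
        self._by_token = {}

    def deposit(self, event: dict) -> bool:
        token = sender_token_hash(event["event_id"], event["recipient_id"])
        if token in self._by_token:
            return False  # re-sync of the same event is a no-op
        self._by_token[token] = event["ciphertext"]
        return True

    def __len__(self):
        return len(self._by_token)
```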

Other surfaces
--------------

* meshnode.bat / meshnode.sh now set
  ``MESH_INFONET_ALLOW_CLEARNET_SYNC=false`` and the participant runtime
  flags by default so a freshly spun-up node defaults to private-only sync.
* InfonetTerminal/InfonetShell.tsx adds a gate directory renderer for
  the new private-gate workflow.
* docker-compose.relay.yml binds the relay backend to 127.0.0.1:8000
  only; Tor's hidden service forwards onion traffic into 127.0.0.1.
  Public clearnet :8000 stays off the network edge.

Tests
-----

* 7 new tests in test_private_gate_hashchain.py +
  test_private_dm_hashchain.py covering: gate fork accepts ciphertext
  propagation, gate fork rejects plaintext, append rejects plaintext
  before normalize, append requires private_strong, append rejects
  non-sealed ciphertext shape, DM spool 2-per-recipient + 1-per-pair
  cap, DM hydration delivers to poll/claim.
* Updated test_mesh_node_bootstrap_runtime.py covers 429 backoff via
  PeerSyncRateLimited 4-tuple AND PeerSyncHTTPError exception.
* Updated test_s14b_public_sync_gate_filter.py +
  test_s9b_gate_store_hydration.py + test_gate_write_cutover.py cover
  the new private redaction on public sync responses.
* test_private_gate_hashchain.py + test_private_dm_hashchain.py:
  10 passed locally.
* Combined mesh-relevant suite (the 5 modified existing tests +
  2 new): 17 passed.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-24 21:25:18 -06:00
Shadowbroker fb97042c01 Update README.md
Elaborated on Tor and Reticulum usage.
2026-05-24 11:08:05 -06:00
Shadowbroker 2616a6c9e3 Update README.md
2026-05-24 11:06:40 -06:00
Shadowbroker a930497e14 fix(start-scripts): find bundled privacy_core.dll next to script (#319) (#324)
* fix(start-scripts): find bundled privacy_core.dll next to script

start.bat and start.sh only checked the source-tree DLL path
(``privacy-core/target/release/privacy_core.dll``), not the bundled
location where MSI/AppImage/DMG installers stage the library directly
next to the script in backend-runtime/.

Users running start.bat from inside an MSI install dir (a documented
workaround when the desktop shell crashes) saw a scary "install Rust"
warning even though the DLL was sitting right next to them. See issue
#319 for the user-reported confusion.

Fix: add a fallback check for the bundled location before falling
through to the "build privacy-core from source" warning. Source-tree
behavior unchanged — the source path is still preferred when present.
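
The lookup order can be sketched in Python, although the real logic lives in start.bat and start.sh. The function name is hypothetical, and the sketch assumes both candidate paths are resolved relative to the script's own directory.

```python
from pathlib import Path
from typing import Optional


def find_privacy_core(script_dir: Path) -> Optional[Path]:
    """Prefer the source-tree build, then fall back to the bundled DLL."""
    candidates = [
        # Source-tree location, still preferred when present.
        script_dir / "privacy-core" / "target" / "release" / "privacy_core.dll",
        # Bundled location: staged directly next to the script.
        script_dir / "privacy_core.dll",
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None  # caller falls through to the build-from-source warning
```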

Also re-stamps the v0.9.81 source archive: ``release_digests.json``
v0.9.81 zip hash updated to point at the rebuilt source archive that
contains these script changes. MSI/EXE/sig hashes are unchanged (the
scripts live at the repo root, not inside the desktop bundle).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(#319): bundle start.bat + start.sh into the MSI/EXE installers

Follow-up to the start-script DLL fallback fix in the prior commit.

ChrisMTheMan's report on #319 made it clear the workaround flow was:

  1. MSI install crashes on launch (different bug, fixed in v0.9.81)
  2. User goes looking for start.bat to launch the backend manually
  3. start.bat isn't in their install dir, so they go fetch it from GitHub
  4. They get a working script but it doesn't know about the bundled
     privacy_core.dll layout, so they see a scary "install Rust" warning

The prior commit fixed step 4. This commit fixes step 3 — start.bat and
start.sh now ship inside the MSI/EXE installers (staged into
backend-runtime/ next to the privacy_core.dll they expect to find).
After the rebuild lands, an MSI user looking for these scripts finds
them right inside their install dir, already pointing at the correct
bundled DLL location.

What changed
------------

* ``build-backend-runtime.cjs`` now has a ``stageStartScripts()`` step
  that copies start.bat and start.sh from the repo root into the
  staged backend-runtime/. Preserves the executable bit on .sh under
  POSIX.

* ``release_digests.json`` v0.9.81 block hashes refreshed for the
  rebuilt MSI / EXE / source-zip (the scripts being bundled changed
  the MSI/EXE contents; the source zip also includes the start-script
  fix from the prior commit).

  ShadowBroker_v0.9.81.zip                  6.06 MB
    af8c87ccdece8fbb9aadc6be63cce10d3fcba74e6d87ef83289dda6d555fd270
  ShadowBroker_0.9.81_x64_en-US.msi       122.4 MB
    8977c9a1c54e1f0d030436be9c4e3d81d766cc0080699eb747649095f360c7ff
  ShadowBroker_0.9.81_x64-setup.exe        76.5 MB
    4e866fa0423c0c2470ed32f4809167a7815dc23ee7762b69e95681c1f3a28250

Post-merge plan
---------------

Force-move the v0.9.81 tag to this commit and replace ALL release
assets on the GitHub release: zip, msi, exe, both .sig files,
latest.json, SHA256SUMS.txt, release-manifest.json.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-23 21:34:59 -06:00
Shadowbroker 2dc1fcc778 release: v0.9.81 — signed auto-update + admin_session race fix (#323)
What this release does
----------------------

1. Establishes a fresh Tauri updater signing keypair. The previous keypair
   (pubkey baked into v0.9.79 / v0.9.8) had no matching private key on
   any maintainer-controlled machine — every prior release shipped
   without signatures, so auto-update has never actually worked. v0.9.81
   rotates to a new pubkey and ships signed installers + latest.json so
   every release from here is a one-click upgrade.

2. Fixes the ``admin_session_required`` race in TopRightControls.tsx.
   The updateAction state used to default to ``auto_apply`` at React-init
   time. A click on the Update button before the async runtime probe
   completed went down the auto_apply path (POST /api/system/update),
   which throws ``admin_session_required`` on fresh sessions. Desktop
   installs now default to ``manual_download`` based on synchronous
   ``window.__TAURI__`` detection at useState init.

One-time cost for current installs
----------------------------------

Anyone on v0.9.79 or v0.9.8 will see the in-app Update button still
trigger the broken path on their existing install (the fix only takes
effect once they're ON v0.9.81). The MANUAL DOWNLOAD button in the
update dialog opens the GitHub release page, where they grab the .msi
and run it. After that one manual hop, all future updates are seamless.

Release artifacts
-----------------

  ShadowBroker_v0.9.81.zip                  6.06 MB
    42f8a51f9a5690d1e7349d90d8ecf2d163c9061d6cf90c69ee03647a785437ff
  ShadowBroker_0.9.81_x64_en-US.msi       122.4 MB
    a45b177c26c95d2b28d71592d7147e88ff4e104865f214fde11249d311ec9e25
  ShadowBroker_0.9.81_x64-setup.exe        76.5 MB
    eca884b9d37eeccd0f11c91dcc6f6ae1b3609d9dee72bd73c37c9a427babfef2

Plus .sig files for the .msi and .exe, plus a signed latest.json for
the Tauri updater endpoint.

Sizes match the v0.9.79 / v0.9.8 reference shape within drift for
the new TopRightControls patch.

release_digests.json keeps v0.9.79 + v0.9.8 blocks alongside v0.9.81
so operators still on those versions continue to validate cleanly
during the rollout transition.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-23 18:43:53 -06:00
Shadowbroker 896d1ae938 fix(#319,#296): v0.9.8 rebuild — bundle missing deps so backend launches (#322)
Issues #319 and #296 reported that the installed v0.9.79 Windows MSI/EXE
crashed on launch with:

    thread 'main' panicked ... failed to setup app: error encountered
    during setup hook: ShadowBroker cannot start: the bundled local
    backend failed to launch.
    technical detail: managed_backend_exited_early:exit code: 103

Root cause: ``backend/pyproject.toml`` declares ``defusedxml>=0.7.1`` and
``PySocks==1.7.1`` as runtime dependencies, but the venv used to build
v0.9.79 (and the initial v0.9.8 publish) had both missing. When
``services/fetchers/aircraft_database.py`` does
``import defusedxml.ElementTree`` at startup, Python raises
``ModuleNotFoundError`` and uvicorn exits, which Tauri reports as
``managed_backend_exited_early``.

Both packages now installed in the build venv. ``main.py`` imports
end-to-end with only the expected ``plane_alert_db.json not found``
warning (runtime-state file, populated on first launch).
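
A pre-build guard along these lines could catch the same class of failure before packaging. This is a sketch under assumptions: ``missing_modules`` is a hypothetical helper, and the names checked are import names (PySocks installs the ``socks`` module), not the distribution names from pyproject.toml.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Import names, not distribution names: PySocks installs as ``socks``.
    missing = missing_modules(["defusedxml", "socks"])
    if missing:
        raise SystemExit(f"build venv is missing: {', '.join(missing)}")
```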

Rebuilt artifacts on the maintainer's local machine:

    ShadowBroker_v0.9.8.zip                  6.06 MB
      183bb5cd62b9b9349d95df5ef7696cb6ca810ab4b991fa9dab6f898af4c7a175
    ShadowBroker_0.9.8_x64_en-US.msi       122.4 MB
      fe22f9d51e4360d74c18a7250c2fbb9ed4fa4c7a884b3ac0d04a21115466386b
    ShadowBroker_0.9.8_x64-setup.exe        76.5 MB
      94a0309862e9c81c92cdcbfea8eec9dbb97eef19ded82b26217b397defbc810c

After this merges, the v0.9.8 tag will be force-moved to this commit and
the GitHub release assets replaced so the integrity chain validates
against the working installers instead of the broken ones.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-23 16:48:45 -06:00
Shadowbroker 8dfa6a7199 release: v0.9.8 — Cumulative Fuel/CO2, AIS Resilience, Data-Layer Repair (#321)
Bumps every hardcoded 0.9.79 → 0.9.8 across backend, frontend,
desktop-shell, helm, lockfiles, test fixtures. Refreshes the in-app
ChangelogModal HEADLINE_FEATURES, NEW_FEATURES, and BUG_FIXES with the
v0.9.8 highlights.

Release artifacts built locally and hashed into release_digests.json:

  ShadowBroker_v0.9.8.zip                  6.06 MB
    d506f6b8462ccb12096f0cd9462233be58928094240416b65fb3127bdd1f3820
  ShadowBroker_0.9.8_x64_en-US.msi       122.4 MB
    d4be4cb68c3e6409fff54c225acdcdd08e27d5d6d2b31616d78d2a4f6812991d
  ShadowBroker_0.9.8_x64-setup.exe        76.5 MB
    1115d1f5cf37edd03ea2c21d821c7626e1bf3319c990402aaa0293bca46fea67

Sizes match the v0.9.79 reference shape (5.76 MB / 117 MB / 72.9 MB)
within expected drift for new code. The .zip is a `git archive` of the
v0.9.8 source tree (matching v0.9.79's approach).

Audit confirms no .env, .key, .venv-dir, or cache files leaked into the
backend-runtime bundle. Python 3.11.9 + 199 site-packages + privacy_core
all staged correctly.

Headline changes since v0.9.79:
* Cumulative fuel/CO2 per flight (#317) — running totals since first
  observation, not just per-hour rate.
* AIS maritime resilience (#314, #316) — outage banner + AISHub REST
  fallback when AISStream WebSocket primary is offline.
* Data-layer repair (#311, #312) — UAP fallback respects the 60-day
  cutoff; GPS jamming threshold tuning + nac_p=0 inclusion so the layer
  actually fires.
* Per-flight source attribution (#313) — source field on every record.
* Cross-node DM mailbox replication (#309).
* Infonet sync HTTP 429 honored (#310).

Test fixtures updated:
* test_per_operator_outbound_attribution.py — added v0.9.8 UA strings
  to the banned-aggregate-literals list (alongside v0.9.79).
* updateRuntime.test.ts — bumped asset filename fixtures to v0.9.8.

release_digests.json keeps the v0.9.79 block alongside v0.9.8 so
operators still on 0.9.79 validate cleanly during the rollout.

The accent narrowing fix in ChangelogModal (one feature uses 'purple',
two use 'cyan' so the renderer's `accent === 'purple'` comparison
still type-checks) is included.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-23 16:24:20 -06:00
Shadowbroker ef6b8ec181 fix(desktop-build): strip layout.tsx force-dynamic on CRLF checkouts too (#320)
build-frontend-export.cjs stages a desktop-only frontend export tree and
strips the ``force-dynamic`` + ``revalidate`` directives from
``frontend/src/app/layout.tsx`` so Next's ``output: "export"`` can
prerender every route.

The strip regexes only matched LF (``\n``). Any Windows checkout without
``core.autocrlf=input`` has CRLF line endings, the strip silently
no-op'd, and the desktop build failed at the static-export step:

    Error: Page with `dynamic = "force-dynamic"` couldn't be exported.
    `output: "export"` requires all pages be renderable statically
    because there is no runtime server to dynamically render routes
    in this output format.
    Export encountered an error on /_not-found/page: /_not-found

This affects every Windows contributor who hasn't normalized line endings
locally. Replacing each ``\n`` in the strip regexes with ``\r?\n``
makes the strip CRLF-tolerant; LF behavior is unchanged.

Verified by running both regexes against the actual layout.tsx (302
bytes removed, force-dynamic + revalidate both gone) and against a
synthetic LF input (296 bytes removed, same outcome).
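
The effect can be reproduced in isolation. The sketch below uses Python's ``re`` for illustration only; the real script is JavaScript and its actual regexes are not shown here, so both patterns are hypothetical stand-ins.

```python
import re

# Hypothetical stand-ins for the strip regexes (the real ones live in
# build-frontend-export.cjs and are not reproduced here).
LF_ONLY = re.compile(r'export const dynamic = "force-dynamic";\n')
CRLF_TOLERANT = re.compile(r'export const dynamic = "force-dynamic";\r?\n')


def strip_directive(pattern, source):
    """Remove every match of the directive pattern from the source text."""
    return pattern.sub("", source)


crlf_source = (
    'import "./globals.css";\r\n'
    'export const dynamic = "force-dynamic";\r\n'
    "export default function Layout() {}\r\n"
)
lf_source = crlf_source.replace("\r\n", "\n")
```

On ``crlf_source`` the LF-only pattern leaves the text untouched, while the ``\r?\n`` variant removes the directive; on ``lf_source`` both behave the same.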

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-23 16:07:11 -06:00
Shadowbroker dcea325fba Merge pull request #317 from BigBodyCobain/feat/cumulative-fuel-burn
feat(flights): cumulative fuel burned + CO2 emitted per flight
2026-05-23 08:09:34 -06:00
BigBodyCobain 03b8053617 feat(flights): cumulative fuel burned + CO2 emitted per flight
Pre-fix, the emissions tooltip only showed the per-hour *rate*, but what most
users actually want is the cumulative *amount* burned. This adds running
totals computed by multiplying the model-based rate by the elapsed
observation time since we first saw the airframe.

New module ``flight_observations.py``:
* Tracks first_seen_at + last_seen_at per icao24 hex.
* Re-opens a fresh session when an aircraft is unseen for > 15 min
  (treated as a new flight — landed and took off, or transited a dead
  zone). Prevents the cumulative counter from resetting mid-flight if
  the trail-rendering cache prunes the trail.
* Clamps elapsed time to 24h max so clock skew can't produce comically
  large numbers.
* Pruned every 5 min via a new scheduler job (mirrors ais_prune cadence).

flights.py + military.py emission enrichment now also attaches:
* observed_seconds — how long we've been tracking this airframe.
* fuel_gallons_burned — rate * elapsed_h.
* co2_kg_emitted — rate * elapsed_h.
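
The session rules and the rate-times-elapsed arithmetic can be sketched as follows. This is an illustration, not the contents of ``flight_observations.py``: the class, method, and constant names are assumptions, and only the 15-minute gap, the 24h clamp, and the case-insensitive key come from the description above.

```python
from dataclasses import dataclass

SESSION_GAP_S = 15 * 60      # unseen longer than this starts a new session
MAX_ELAPSED_S = 24 * 3600    # clamp so clock skew cannot inflate totals


@dataclass
class Observation:
    first_seen_at: float
    last_seen_at: float


class ObservationTracker:
    """Hypothetical tracker keyed by lowercase icao24 hex."""

    def __init__(self):
        self._by_hex = {}

    def record(self, icao24, now):
        """Record a sighting and return observed seconds for this session."""
        key = icao24.strip().lower()
        obs = self._by_hex.get(key)
        if obs is None or now - obs.last_seen_at > SESSION_GAP_S:
            obs = Observation(first_seen_at=now, last_seen_at=now)
            self._by_hex[key] = obs
        else:
            obs.last_seen_at = now
        elapsed = obs.last_seen_at - obs.first_seen_at
        return min(max(elapsed, 0.0), MAX_ELAPSED_S)


def cumulative(rate_per_hour, observed_seconds):
    """Running total: per-hour rate multiplied by elapsed hours."""
    return rate_per_hour * (observed_seconds / 3600.0)
```

For example, a rate of 600 gal/h observed for 1800 seconds gives ``cumulative(600.0, 1800.0)``, i.e. half an hour of burn.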

The existing per-hour rate fields stay in the dict for backward compat
and are shown as small secondary context in the tooltip.

Frontend EmissionsEstimateBlock (NewsFeed.tsx) now prominently shows
the cumulative totals with the rate as smaller context underneath plus
"Observed in flight for Xh Ym". When observed_seconds is 0 (first refresh)
it renders "Just observed · totals will appear on next refresh" instead
of a misleading "0 gal".

12 backend tests cover record/accumulate/reset, the 24h clamp, prune,
case-insensitive key normalization, and end-to-end emission integration
in _classify_and_publish.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-23 07:56:23 -06:00
Shadowbroker 20807a2d62 Merge pull request #316 from BigBodyCobain/feat/aishub-fallback
feat(ais): AISHub REST fallback when AISStream is offline (20-min polling)
2026-05-23 07:42:56 -06:00
Shadowbroker 79fbf9741b Merge pull request #314 from BigBodyCobain/feat/ais-upstream-health
feat(ais): surface AISStream upstream outage instead of failing silently
2026-05-23 07:12:37 -06:00
BigBodyCobain a2f5d62926 feat(ais): AISHub REST fallback when AISStream WebSocket is offline
When stream.aisstream.io is unreachable (cert outage, server down — see
2026-05-20 and 2026-05-23 events) the ships layer goes empty. This adds
a slow REST fallback to data.aishub.net so the layer stays populated in
degraded mode.

Behavior:

* Opt-in via AISHUB_USERNAME (free registration at aishub.net/api).
  Without the env var the fetcher is a no-op.
* Default poll cadence 20 min — well inside their free-tier limits, gives
  ships time to move enough to look "alive". Configurable via
  AISHUB_POLL_INTERVAL_MINUTES, clamped to [1, 360].
* Internal gate: skips the poll entirely when the WebSocket primary is
  currently connected. Stomping fresh live data with 20-min-old REST
  data would be worse than leaving it alone.
* Vessels merge into the shared _vessels dict with source="aishub" so
  the existing UI / health tooling can attribute the provider.
* Live data wins races: if a WebSocket update for the same MMSI lands in
  the last 1s, we don't overwrite with the slower REST record.

Scheduler job runs every AISHUB_POLL_INTERVAL_MINUTES minutes alongside
the existing ais_prune job in data_fetcher.py.
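
The gating and merge rules above can be sketched as three small helpers. All function names and the record keys (``source``, ``updated_at``) are assumptions for illustration; falling back to the 20-minute default on an unparsable interval is also an assumption.

```python
def clamp_poll_interval(raw, default=20, lo=1, hi=360):
    """Parse AISHUB_POLL_INTERVAL_MINUTES and clamp it to [lo, hi]."""
    try:
        minutes = int(raw)
    except (TypeError, ValueError):
        return default  # assumed behaviour for unset or malformed values
    return max(lo, min(hi, minutes))


def should_poll(username, primary_connected):
    """Poll only when opted in and the WebSocket primary is not connected."""
    return bool(username) and not primary_connected


def merge_rest_record(vessels, mmsi, record, now, live_grace_s=1.0):
    """Merge a REST record unless a live update landed within the grace window.

    Returns True when the REST record was written.
    """
    existing = vessels.get(mmsi)
    if (
        existing is not None
        and existing.get("source") != "aishub"
        and now - existing.get("updated_at", 0.0) <= live_grace_s
    ):
        return False  # live data wins the race
    vessels[mmsi] = {**record, "source": "aishub", "updated_at": now}
    return True
```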

24 tests cover gating (no-username, primary-connected), response parsing
(success / error / empty / malformed / unexpected shape), record
normalization (sentinels, missing fields, range checks, AIS @ padding),
poll interval clamping, and end-to-end merge with live-data-wins.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-23 07:00:32 -06:00
BigBodyCobain 5e0b2c037e feat(ais): surface upstream outage instead of failing silently
On 2026-05-23, stream.aisstream.io went fully offline (TCP timeouts on port
443). The backend kept respawning the node WebSocket proxy every few
seconds with nothing arriving. From the operator's POV the ships layer
silently went empty — no banner, no log surfacing, no way to tell whether
it was their config / network / viewport filter / upstream.

Backend:
* ais_proxy_status() now also returns:
  - connected (bool): true when a vessel message arrived in last 60s
  - last_msg_age_seconds (int | None)
  - proxy_spawn_count (int): proxy respawns — sustained growth without
    connected means upstream is dead
* /api/health escalates top status to "degraded" when AIS_API_KEY is set
  but the proxy is currently disconnected. Existing degraded_tls signal
  preserved.
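
A sketch of how the new fields and the escalation could be derived, assuming the backend keeps a timestamp of the last vessel message. The helper names and the ``"ok"`` baseline status are hypothetical; only the 60s window and the ``"degraded"`` escalation come from the description above.

```python
def derive_connected(last_msg_ts, now, window_s=60.0):
    """Return (connected, last_msg_age_seconds) from the last message time."""
    if last_msg_ts is None:
        return False, None
    age = max(0, int(now - last_msg_ts))
    return age <= window_s, age


def top_level_status(ais_key_set, connected, base="ok"):
    """Escalate to "degraded" only when AIS is configured but disconnected."""
    if ais_key_set and not connected:
        return "degraded"
    return base
```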

Frontend:
* useAisUpstreamHealth hook polls /api/health every 30s, derives the
  outage state. Defensively only reports outage once spawn_count > 0 so
  operators who haven't opted in don't see the banner.
* AisUpstreamBanner component renders a dismissible amber notice
  "Ship data temporarily unavailable — AISStream upstream is offline"
  mounted on the main app shell.

7 backend tests pin the status-shape contract and the /api/health
escalation behavior in both with-key and without-key configurations.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-23 06:38:05 -06:00
Shadowbroker 69ef231e5a Merge pull request #313 from BigBodyCobain/feat/flight-source-attribution
feat(flights): stamp source attribution on every flight record
2026-05-23 06:29:31 -06:00
Shadowbroker 7a5f47ca9e Merge pull request #312 from BigBodyCobain/fix/gps-jamming-thresholds
fix(gps-jamming): count nac_p=0 + lower thresholds so layer actually fires
2026-05-23 06:29:20 -06:00
Shadowbroker 5cd49542bf Merge pull request #311 from BigBodyCobain/fix/uap-fallback-cutoff
fix(uap): stop HF fallback from serving 3-year-old NUFORC sightings
2026-05-23 06:29:08 -06:00
BigBodyCobain f14d4feb6d feat(flights): stamp source attribution on every flight record
Pre-fix, adsb.lol records (the primary source for most flights) carried
no source marker. OpenSky records got is_opensky: True and supplementals
got supplemental_source, so any UI inspecting source labels saw
OpenSky/airplanes.live records as explicitly tagged and adsb.lol records
as "unlabeled" — making it look like adsb.lol wasn't being used at all
even though it's the primary source.

Changes:

* _fetch_adsb_lol_regions stamps source="adsb.lol" on each aircraft
  before returning, so the tag survives the OpenSky dedupe-by-hex merge.
* OpenSky records get source="OpenSky" (alongside is_opensky=True for
  back-compat).
* military fetcher tags source on both adsb.lol and airplanes.live
  records before they're merged, and propagates source into the
  military_flights and uavs output dicts.
* _classify_and_publish promotes the explicit source field into the
  published flight dict. Falls back to legacy supplemental_source if
  source is absent. Final fallback "adsb.lol" preserves prior behavior
  for any caller synthesizing records without going through a fetcher.
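
The precedence described in the last bullet amounts to a three-step fallback. A minimal sketch, with ``resolve_source`` as a hypothetical name for whatever ``_classify_and_publish`` does internally:

```python
def resolve_source(record):
    """Explicit source wins, then legacy supplemental_source, then adsb.lol."""
    return record.get("source") or record.get("supplemental_source") or "adsb.lol"
```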

8 new tests cover the published-dict propagation, OpenSky tagging,
supplemental fallback, explicit-wins precedence, default behavior, the
adsb.lol regional fetcher tagging, and the military output dict.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-23 06:14:39 -06:00
BigBodyCobain 19a8560a80 fix(gps-jamming): count nac_p=0 + lower thresholds so the layer actually fires
Three stacked filters meant the gps_jamming layer almost never lit up:

1. nac_p == 0 aircraft were dropped on the theory that "0 = old transponder."
   That's only half right — modern Mode-S Enhanced Surveillance transponders
   also fall back to nac_p=0 when they lose GPS lock entirely, which IS the
   jamming signature we want to catch. Discarding them was discarding the
   strongest signal. None (no field at all — typical for OpenSky-sourced
   records) is still skipped because absence-of-data isn't evidence.
2. GPS_JAMMING_MIN_AIRCRAFT was 5 per 1°x1° cell. Jamming hotspots
   (eastern Med, Russia/Ukraine border, Iran/Iraq) tend to have sparser
   traffic because pilots avoid them. Lowered to 3.
3. GPS_JAMMING_MIN_RATIO was 0.30. Combined with the (preserved) -1 noise
   cushion that made the effective bar high. Lowered to 0.20.

The 1-aircraft noise cushion is intact so a single quirky transponder
still can't flag a zone alone.

Also extracted the detector loop into a pure ``detect_gps_jamming_zones()``
function at module scope so it's testable in isolation (was previously
inlined inside ``_classify_and_publish``). The public signature accepts
threshold overrides for ad-hoc re-tuning without code edits.
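
A rough sketch of a pure detector with these thresholds follows. Several details are assumptions rather than the real ``detect_gps_jamming_zones()``: the nac_p cutoff for a "degraded" fix, the exact form of the noise cushion (one degraded aircraft subtracted before the ratio), and the input and output dict keys.

```python
import math
from collections import defaultdict

GPS_JAMMING_MIN_AIRCRAFT = 3
GPS_JAMMING_MIN_RATIO = 0.20
DEGRADED_NAC_P_BELOW = 8  # hypothetical cutoff, not taken from the source


def detect_zones(
    aircraft,
    min_aircraft=GPS_JAMMING_MIN_AIRCRAFT,
    min_ratio=GPS_JAMMING_MIN_RATIO,
    degraded_below=DEGRADED_NAC_P_BELOW,
):
    """Flag 1x1 degree cells where enough aircraft report degraded nac_p."""
    cells = defaultdict(lambda: [0, 0])  # cell -> [total, degraded]
    for ac in aircraft or []:
        nac_p = ac.get("nac_p")
        lat = ac.get("lat")
        lon = ac.get("lon", ac.get("lng"))
        if nac_p is None or lat is None or lon is None:
            continue  # absence of data is not evidence
        cell = (math.floor(lat), math.floor(lon))
        cells[cell][0] += 1
        if nac_p < degraded_below:  # nac_p == 0 counts as degraded
            cells[cell][1] += 1
    zones = []
    for cell, (total, degraded) in sorted(cells.items()):
        if total < min_aircraft:
            continue
        # Noise cushion: a single quirky transponder cannot flag a cell.
        ratio = max(degraded - 1, 0) / total
        if ratio >= min_ratio:
            zones.append({"cell": cell, "total": total, "degraded": degraded})
    return zones
```

Passing ``min_aircraft`` or ``min_ratio`` explicitly mirrors the threshold overrides mentioned above.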

16 new tests cover nac_p=0 inclusion, None-skip preservation, MIN_AIRCRAFT
lowering, MIN_RATIO lowering, noise cushion preservation, constant pinning,
override behavior, lon/lng key compatibility, and robustness to empty/None
inputs.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-22 23:40:18 -06:00
BigBodyCobain 0d0e009867 fix(uap): stop HF fallback from serving 3-year-old NUFORC sightings
The UAP sightings layer is sourced from a live scrape of nuforc.org with a
static Hugging Face CSV mirror (kcimc/NUFORC) as a fallback. The fallback
parsed every row, sorted by occurred-desc, and took the top 250 — with no
date cutoff. The HF mirror is a third-party snapshot that hasn't been
refreshed in years, so the "newest 250" rows it returns are from ~2022-23.
When the live path fails (Cloudflare 403, curl disabled on Windows, wdtNonce
regex stale, etc.) users see a map full of sightings from 3 years ago,
labeled as the "last 60 days" layer.

Changes:

* HF fallback now applies the same 60-day cutoff the live path uses. Rows
  outside the window are dropped before take-top-N. If the mirror has
  nothing inside the window the fallback returns [] (don't serve stale).
* When the HF mirror is fully stale a loud ERROR log fires with the count
  of dropped rows so the operator can tell the mirror's the problem, not
  a network issue.
* When BOTH live AND HF fallback produce 0 rows, fetch_uap_sightings now
  trips assert_canary("uap_sightings", 0) so the health registry shows
  the layer as broken instead of "fresh and empty for days."
* Scheduler moved from daily 12:00 UTC to weekly Mondays 12:00 UTC. The
  layer is a rolling 60-day digest; refreshing once a week is enough
  cadence for human-readable map exploration and keeps nuforc.org load
  light.
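
The cutoff-before-top-N ordering in the first bullet can be sketched like this, assuming each parsed row carries a timezone-aware ``occurred`` datetime. The function name and row shape are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def filter_recent(rows, now, window_days=60, limit=250):
    """Drop rows older than the window, then take the newest `limit` rows."""
    cutoff = now - timedelta(days=window_days)
    fresh = [row for row in rows if row["occurred"] >= cutoff]
    fresh.sort(key=lambda row: row["occurred"], reverse=True)
    return fresh[:limit]


if __name__ == "__main__":
    now = datetime(2026, 5, 22, tzinfo=timezone.utc)
    stale_only = [{"occurred": now - timedelta(days=1000)}]
    # A fully stale mirror yields an empty list instead of old sightings.
    print(filter_recent(stale_only, now))
```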

6 new tests cover the cutoff filter, the doomsday-log path, the mixed-age
path, the both-paths-empty health failure, the positive fallback path, and
the scheduler cadence.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-22 23:27:12 -06:00
Shadowbroker febcce9125 Merge pull request #310 from BigBodyCobain/fix/infonet-sync-429-backoff
Infonet sync: honor HTTP 429 Retry-After + exponential backoff
2026-05-22 23:11:00 -06:00
305 changed files with 30301 additions and 5089 deletions
+38 -2
@@ -10,6 +10,23 @@ OPENSKY_CLIENT_ID=
OPENSKY_CLIENT_SECRET=
AIS_API_KEY=
# Global Fishing Watch — fishing vessel activity events (Fishing Activity map layer).
# Free API token from https://globalfishingwatch.org/our-apis/tokens
# Without this the fishing_activity layer stays empty.
# GFW_API_TOKEN=
# Optional tuning — GFW can return 40k+ global events; defaults cap fetch for map paint.
# GFW_EVENTS_PAGE_SIZE=500
# GFW_EVENTS_MAX_PAGES=10
# GFW_EVENTS_LOOKBACK_DAYS=7
# GFW_EVENTS_TIMEOUT_S=90
# Windy Webcams global CCTV layer — free key from https://api.windy.com/webcams/docs
# WINDY_API_KEY=
# Telegram OSINT map layer — scrapes public t.me/s channel previews (no bot token).
# TELEGRAM_OSINT_ENABLED=true
# TELEGRAM_OSINT_CHANNELS=osintdefender,insiderpaper,aljazeeraenglish,nexta_live,war_monitor
# Admin key to protect sensitive endpoints (settings, updates).
# If blank, loopback/localhost requests still work for local single-host dev.
# Remote/non-loopback admin access requires ADMIN_KEY, or ALLOW_INSECURE_ADMIN=true in debug-only setups.
@@ -39,8 +56,8 @@ ADMIN_KEY=
# NUFORC_MAPBOX_TOKEN=
# Optional startup-risk controls.
# On Windows, external curl fallback and the Playwright LiveUAMap scraper are
# disabled by default so blocked upstream feeds cannot interrupt start.bat.
# On Windows, external curl fallback is off by default. LiveUAMap uses UI consent
# when you enable Global Incidents (or set SHADOWBROKER_ENABLE_LIVEUAMAP_SCRAPER=true).
# SHADOWBROKER_ENABLE_WINDOWS_CURL_FALLBACK=false
# SHADOWBROKER_ENABLE_LIVEUAMAP_SCRAPER=false
# AIS starts by default when AIS_API_KEY is set. Set to 0/false to force-disable.
@@ -77,6 +94,19 @@ ADMIN_KEY=
# pip install earthengine-api
# GEE_SERVICE_ACCOUNT_KEY=
# Copernicus CDSE — Sentinel-2 imagery (Settings → Imagery, or backend .env).
# Free OAuth app at https://dataspace.copernicus.eu/
# SENTINEL_CLIENT_ID=
# SENTINEL_CLIENT_SECRET=
# Sentinel-2 road corridor freight trends (DrishX engine port — opt-in slow layer).
# pip install -e backend[road-corridor] (or uv sync --extra road-corridor)
# ROAD_CORRIDOR_SAT_ENABLED=false
# ROAD_CORRIDOR_SCHEDULED_PRESETS=laredo_i35
# ROAD_CORRIDOR_MONTHS=2
# ROAD_CORRIDOR_MAX_FRAMES=6
# ROAD_CORRIDOR_REFRESH_HOURS=24
# Override the backend URL the frontend uses (leave blank for auto-detect)
# NEXT_PUBLIC_API_URL=http://192.168.1.50:8000
@@ -128,8 +158,14 @@ ADMIN_KEY=
# MESH_DM_ROOT_TRANSPARENCY_LEDGER_READBACK_URI=backend/../ops/root_transparency_ledger.json
# ── Self Update ────────────────────────────────────────────────
# Optional ZIP updater digest pin. The updater checks this first, then
# backend/data/release_digests.json, then the release SHA256SUMS.txt asset.
# MESH_UPDATE_SHA256=
# Optional strict nonce-only frontend CSP. Leave unset unless the exact build
# has been verified to hydrate cleanly in your deployment.
# SHADOWBROKER_STRICT_CSP=1
# ── Wormhole (Local Agent) ─────────────────────────────────────
# WORMHOLE_URL=http://127.0.0.1:8787
# WORMHOLE_TRANSPORT=direct
+13
@@ -0,0 +1,13 @@
## Summary
<!-- What changed and why (1-3 bullets). -->
## Test plan
- [ ] <!-- How you verified the change -->
## Production hardening (data path / fetchers / unattended deploys only)
If this PR touches the data path, fetchers, or live-data APIs, walk through [docs/production-hardening.md](https://github.com/BigBodyCobain/Shadowbroker/blob/main/docs/production-hardening.md) and note any N/A items here.
- [ ] Checklist reviewed (or N/A — explain why)
+12
@@ -109,6 +109,9 @@ backend/data/*
# release. Used ONLY on first-ever startup to bootstrap carrier_cache.json;
# after that the cache reflects this install's own GDELT observations.
!backend/data/carrier_seed.json
# DrishX RF model weights (MIT — see backend/third_party/drishx/NOTICE.md)
!backend/data/drishx/
!backend/data/drishx/rf_model.pickle
# OS generated files
.DS_Store
@@ -174,6 +177,8 @@ frontend/eslint-report.json
.git_backup/
local-artifacts/
release-secrets/
release-staging/
.tmp-release-inspect/
shadowbroker_repo/
frontend/src/components.bak/
frontend/src/components/map/icons/backups/
@@ -198,6 +203,8 @@ graphify-out/
# Internal docs & brainstorming (never commit)
# ========================
docs/*
!docs/OUTBOUND_DATA.md
!docs/production-hardening.md
!docs/mesh/
docs/mesh/*
!docs/mesh/threat-model.md
@@ -256,6 +263,11 @@ frontend/.desktop-export-stash-*/
backend/data/wormhole_stderr.log
backend/data/wormhole_stdout.log
# Hermes Agent (operator-local runtime install — not project source)
.hermes/
**/.hermes/
hermes-agent/
# Runtime caches that already slip through the backend/data/* blanket
# (these are caught by the wildcard but listing for clarity)
+42 -12
@@ -13,13 +13,22 @@
# 2. Reverse-mirrors main back to GitHub (only if commits land directly
# on GitLab) so the two sources stay in sync.
#
# Pipelines on this repo were instant-failing for free-tier accounts until
# identity verification was added — the May 2026 bump in this comment is
# the marker commit that confirms runner allocation after verification.
#
# Auth notes:
# - The image build/push uses $CI_JOB_TOKEN, which GitLab provides
# automatically. No credentials need to be configured.
# - The reverse mirror requires a GitHub personal access token stored
# as the GitLab CI/CD variable GITHUB_MIRROR_TOKEN (Protected + Masked).
# Scope: public_repo (or repo for private). If the variable isn't
# set the mirror job is skipped — image builds still run.
# - The reverse mirror authenticates to GitHub via a per-repo SSH
# deploy key. The private half is stored as the File-type GitLab
# CI/CD variable GITHUB_MIRROR_SSH_KEY (Protected). The matching
# public key is added to github.com/BigBodyCobain/Shadowbroker/
# settings/keys with write access. This is a tighter-scoped
# replacement for a personal access token: it can ONLY push to
# Shadowbroker, never expires, and rotating it is a one-click
# delete on GitHub's deploy-keys page. If the variable isn't set,
# the mirror job is skipped — image builds still run.
stages:
- build
@@ -48,7 +57,11 @@ variables:
- docker info
- docker login -u "$CI_REGISTRY_USER" -p "$CI_JOB_TOKEN" "$CI_REGISTRY"
- docker run --privileged --rm tonistiigi/binfmt --install all
- docker buildx create --use --name multiarch --driver docker-container
# buildx --driver docker-container can't read TLS from the env vars
# the GitLab dind service exports. Wrap them in a docker context and
# bind buildx to it. See https://docs.gitlab.com/ee/ci/docker/using_docker_build.html#use-docker-buildx
- docker context create tls-env
- docker buildx create --use --name multiarch --driver docker-container tls-env
# ── Backend image ────────────────────────────────────────────────────────
build-backend:
@@ -93,18 +106,35 @@ build-frontend:
- .gitlab-ci.yml
# ── Reverse mirror to GitHub ─────────────────────────────────────────────
# Pushes refs/heads/main to github.com/BigBodyCobain/Shadowbroker.
# Fast-forward-only — if GitLab main and GitHub main have diverged, this
# fails loudly rather than silently overwriting either side.
# Pushes refs/heads/main to github.com/BigBodyCobain/Shadowbroker via SSH
# using a per-repo deploy key. Fast-forward-only by default — if GitLab
# main and GitHub main have diverged, the push fails loudly rather than
# silently overwriting either side.
#
# Only runs if GITHUB_MIRROR_TOKEN is set as a CI/CD variable. See the
# header comment of this file for setup instructions.
# Only runs if GITHUB_MIRROR_SSH_KEY is set as a File-type CI/CD variable.
# See the header comment of this file for setup instructions.
mirror-to-github:
stage: mirror
image: alpine:3.20
needs: []
before_script:
- apk add --no-cache git openssh-client ca-certificates
- mkdir -p ~/.ssh
- chmod 700 ~/.ssh
# Install the deploy key. File-type CI variable exposes the path; copy
# to ~/.ssh/id_ed25519 with restrictive perms so ssh accepts it.
- cp "$GITHUB_MIRROR_SSH_KEY" ~/.ssh/id_ed25519
- chmod 600 ~/.ssh/id_ed25519
# Pin github.com's current host keys so we never trust a man-in-the-
# middle. Sourced from https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/githubs-ssh-key-fingerprints
# (rotated 2023-03-24 after the previous RSA key leak).
- |
cat > ~/.ssh/known_hosts <<'EOF'
github.com ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIOMqqnkVzrm0SdG6UOoqKLsabgH5C9okWi0dh2l9GKJl
github.com ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBEmKSENjQEezOmxkZMy7opKgwFB9nkt5YRrYMjNuG5N87uRgg6CLrbo5wAdT/y6v0mKV0U2w0WZ2YB/++Tpockg=
github.com ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQCj7ndNxQowgcQnjshcLrqPEiiphnt+VTTvDP6mHBL9j1aNUkY4Ue1gvwnGLVlOhGeYrnZaMgRK6+PKCUXaDbC7qtbW8gIkhL7aGCsOr/C56SJMy/BCZfxd1nWzAOxSDPgVsmerOBYfNqltV9/hWCqBywINIR+5dIg6JTJ72pcEpEjcYgXkE2YEFXV1JHnsKgbLWNlhScqb2UmyRkQyytRLtL+38TGxkxCflmO+5Z8CSSNY7GidjMIZ7Q4zMjA2n1nGrlTDkzwDCsw+wqFPGQA179cnfGWOWRVruj16z6XyvxvjJwbz0wQZ75XK5tKSb7FNyeIEs4TT4jk+S4dhPeAUC5y+bDYirYgM4GC7uEnztnZyaVWQ7B381AK4Qdrwt51ZqExKbQpTUNn+EjqoTwvqNj4kqx5QUCI0ThS/YkOxJCXmPUWZbhjpCg56i+2aB6CmK2JGhn57K5mj0MNdBXA4/WnwH6XoPWJzK5Nyu2zB3nAZp+S5hpQs+p1vN1/wsjk=
EOF
- chmod 644 ~/.ssh/known_hosts
script:
- git config --global user.email "ci-mirror@gitlab.com"
- git config --global user.name "GitLab CI Mirror"
@@ -115,7 +145,7 @@ mirror-to-github:
- cd repo
- >
git push
"https://x-access-token:${GITHUB_MIRROR_TOKEN}@github.com/BigBodyCobain/Shadowbroker.git"
"git@github.com:BigBodyCobain/Shadowbroker.git"
"${CI_COMMIT_SHA}:refs/heads/main"
rules:
- if: $CI_COMMIT_BRANCH == "main" && $GITHUB_MIRROR_TOKEN
- if: $CI_COMMIT_BRANCH == "main" && $GITHUB_MIRROR_SSH_KEY
+2 -1
@@ -44,7 +44,8 @@ These sources have their own terms; consult each link before redistributing.
| aisstream.io | https://aisstream.io | Free-tier API terms (attribution required) | AIS vessel positions |
| Global Fishing Watch | https://globalfishingwatch.org | CC BY 4.0 (for public data) | Fishing activity events |
| Microsoft Planetary Computer | https://planetarycomputer.microsoft.com | Sentinel-2 / ESA Copernicus terms | Sentinel-2 imagery |
| Copernicus CDSE (Sentinel Hub) | https://dataspace.copernicus.eu | ESA Copernicus open data terms | SAR + optical imagery |
| Copernicus CDSE (Sentinel Hub) | https://dataspace.copernicus.eu | ESA Copernicus open data terms | SAR + optical imagery, optional road-corridor truck trends |
| DrishX / Fisser et al. 2022 | https://github.com/sparkyniner/DRISH-X-Satellite-powered-freight-intelligence- | MIT (engine); research methodology attribution | Sentinel-2 motion-smear truck detection on major roads (opt-in) |
| Shodan | https://www.shodan.io | Operator-supplied API key, Shodan ToS | Internet device search |
| Smithsonian GVP | https://volcano.si.edu | Attribution required | Volcanoes |
| OpenAQ | https://openaq.org | CC BY 4.0 | Air quality stations |
+136 -21
@@ -19,7 +19,7 @@
**ShadowBroker** is a decentralized intelligence platform that aggregates real-time, multi-domain OSINT telemetry from 60+ live intelligence feeds into a single dark-ops map interface. Aircraft, ships, satellites, conflict zones, CCTV networks, GPS jamming, internet-connected devices, police scanners, mesh radio nodes, and breaking geopolitical events — all updating in real time on one screen as well as an obfuscated communications protocol and information exchange infrastructure.
Built with **Next.js**, **MapLibre GL**, **FastAPI**, and **Python**. 35+ toggleable data layers, including SAR ground-change detection. Multiple visual modes (DEFAULT / SATELLITE / FLIR / NVG / CRT). Right-click any point on Earth for a country dossier, head-of-state lookup, and the latest Sentinel-2 satellite photo. No user data is collected or transmitted — the dashboard runs entirely in your browser against a self-hosted backend.
Built with **Next.js**, **MapLibre GL**, **FastAPI**, and **Python**. 40+ toggleable data layers, including SAR ground-change detection, **Telegram OSINT** (public channel previews geoparsed onto the map), a **server-side recon toolkit** (DNS, WHOIS, sanctions, BGP, IP sweep, and more), supply-chain risk overlays, and malware/C2 + CISA KEV cyber threat feeds. Multiple visual modes (DEFAULT / SATELLITE / FLIR / NVG / CRT). Right-click any point on Earth for a country dossier, head-of-state lookup, entity-graph expansion, and the latest Sentinel-2 satellite photo. ShadowBroker has no accounts, product telemetry, or analytics; the dashboard talks to your self-hosted backend. Sensitive recon and Shodan queries never hit third-party APIs from the browser — they are proxied through the backend with SSRF guards and local-operator auth. The **OpenClaw / agent command channel** exposes the same recon backends plus full telemetry search — no separate API integration required.
Designed for analysts, researchers, radio operators, and anyone who wants to see what the world looks like when every public signal is on the same map.
@@ -28,18 +28,20 @@ Designed for analysts, researchers, radio operators, and anyone who wants to see
A surprising amount of global telemetry is already public — aircraft ADS-B broadcasts, maritime AIS signals, satellite orbital data, earthquake sensors, mesh radio networks, police scanner feeds, environmental monitoring stations, internet infrastructure telemetry, and more. This data is scattered across dozens of tools and APIs. ShadowBroker combines all of it into a single interface.
The project does not introduce new surveillance capabilities — it aggregates and visualizes existing public datasets. It is fully open-source so anyone can audit exactly what data is accessed and how. No user data is collected or transmitted — everything runs locally against a self-hosted backend. No telemetry, no analytics, no accounts.
The project does not introduce new surveillance capabilities — it aggregates and visualizes existing public datasets. It is fully open-source so anyone can audit exactly what data is accessed and how. ShadowBroker does not include product telemetry, analytics, or accounts. Operator-supplied keys stay in your local deployment, but live OSINT features necessarily make outbound requests to the public data providers you enable or query.
### Shodan Connector
### Shodan & Recon (security-first)
ShadowBroker includes an optional Shodan connector for operator-supplied API access. Shodan results are fetched with your own `SHODAN_API_KEY`, rendered as a local investigative overlay (not merged into core feeds), and remain subject to Shodan's terms of service.
ShadowBroker includes an optional **Shodan connector** for operator-supplied API access (`SHODAN_API_KEY`) and a **Recon Toolkit** panel for keyless OSINT lookups. Both run **server-side only**: the browser calls your self-hosted `/api/osint/*` and `/api/tools/shodan/*` routes; outbound requests are made by the backend after SSRF validation. Recon requires **local-operator** access (same trust model as layer toggles and admin routes). Shodan results render as a separate map overlay and remain subject to Shodan's terms of service.
> **Not included:** embedded live-news YouTube grids or a built-in Gemini AI analyst panel — use the **OpenClaw / agent channel** for AI-assisted analysis instead.
---
## Interesting Use Cases
* **Track Air Force One**, the private jets of billionaires and dictators, and every military tanker, ISR, and fighter broadcasting ADS-B. Air Force One and all of the accompanying Presidential/Vice Presidential planes are highlighted and monitored from the moment they leave the ground.
* **Connect an AI agent as a co-analyst** through ShadowBroker's HMAC-signed agentic command channel — supports OpenClaw and any other agent that speaks the protocol (Claude, GPT, LangChain, custom). The agent gets full read/write access to all 35+ data layers, pin placement, map control, SAR ground-change, mesh networking, and alert delivery. It sees everything the operator sees and can take actions on the map in real time.
* **Connect an AI agent as a co-analyst** through ShadowBroker's HMAC-signed agentic command channel — supports OpenClaw and any other agent that speaks the protocol (Claude, GPT, LangChain, custom). The agent gets full read/write access to all 40+ data layers, compact cross-layer search (`search_telemetry`, `search_news`), the full recon toolkit (`osint_lookup` for IP/DNS/WHOIS/sanctions/CVE/etc.), entity-graph expansion, pin placement, map control, SAR ground-change, mesh networking, and alert delivery. It sees everything the operator sees and can take actions on the map in real time.
* **Communicate on the InfoNet testnet** — The first decentralized intelligence mesh built into an OSINT tool. Obfuscated messaging with gate personas, Dead Drop peer-to-peer exchange, and a built-in terminal CLI. No accounts, no signup. Privacy is not guaranteed yet — this is an experimental testnet — but the protocol is live and being hardened.
* **Right-click anywhere on Earth** for a country dossier (head of state, population, languages), Wikipedia summary, and the latest Sentinel-2 satellite photo at 10m resolution
* **Click a KiwiSDR node** and tune into live shortwave radio directly in the dashboard. Click a police scanner feed and eavesdrop in one click.
@@ -55,6 +57,12 @@ ShadowBroker includes an optional Shodan connector for operator-supplied API acc
* **Track trains** across the US (Amtrak) and Europe (DigiTraffic) in real time
* **Estimate where US aircraft carriers are** using automated GDELT news scraping — no other open tool does this
* **Search internet-connected devices worldwide** via Shodan — cameras, SCADA systems, databases — plotted as a live overlay on the map
* **Run a full recon toolkit** from the left sidebar — IP geolocation, DNS, RDAP/WHOIS, certificate transparency, BGP/ASN, OFAC sanctions search, CVE lookup, Tor/OTX threat checks, and subnet sweeps (InternetDB proxied server-side)
* **Expand an entity graph** when you select an aircraft, vessel, company, or IP — Wikidata + OFAC + live store cross-links rendered in the Entity Graph panel
* **Monitor supply-chain risk** — Tier 1/2 semiconductor and battery fabs scored against nearby earthquakes, wildfires, and conflict events (SCM panel)
* **Toggle malware C2 hotspots** — abuse.ch Feodo Tracker + URLhaus feeds mapped by country (opt-in layer)
* **Monitor Telegram OSINT channels** — public `t.me/s` war/conflict feeds (OSINTdefender, NEXTA, etc.) scraped hourly, risk-scored, geoparsed to metro anchors, and plotted as clickable map pins with inline media
* **Overlay global submarine cables** — static TeleGeography-derived cable routes (opt-in layer)
---
@@ -83,6 +91,8 @@ Both paths produce identical containers — same source, same CI, same images by
Open `http://localhost:3000` to view the dashboard! *(Requires [Docker Desktop](https://www.docker.com/products/docker-desktop/) or Docker Engine)*
> **Join the private InfoNet swarm (sb-testnet-0):** Click **NODE** in the dashboard, or run `./meshnode.sh` for a headless participant. No manual peer list — fleet defaults discover the seed and pull the signed manifest automatically. Set `MESH_INFONET_FLEET_JOIN=false` in `.env` for a private solo node.
> **Backend port already in use?** The browser only needs port `3000`, but the backend API is also published on host port `8000` for local diagnostics. If another app already uses `8000`, create or edit `.env` next to `docker-compose.yml` and set `BACKEND_PORT=8001`, then run `docker compose up -d`.
> **Blank news/UAP/bases/wastewater after several minutes?** Check for backend OOM restarts with `docker events --since 30m --filter container=shadowbroker-backend --filter event=oom`. The default compose file gives the backend 4GB; if your host has less memory, reduce enabled feeds or set `BACKEND_MEMORY_LIMIT=3G` and expect slower/heavier layers to warm more gradually.
@@ -113,6 +123,20 @@ That's it. `pull` grabs the latest images, `up -d` restarts the containers.
>
> Podman users should run the equivalent provider command, for example `podman-compose pull` and `podman-compose up -d`, or use `./compose.sh --engine podman pull` and `./compose.sh --engine podman up -d` from a bash-compatible shell.
### Update Integrity
Docker updates are delivered through signed container registries. The legacy ZIP self-updater verifies release archives through this chain, in order:
* `MESH_UPDATE_SHA256` when an operator pins a digest explicitly.
* `backend/data/release_digests.json` for bundled release pins.
* The release `SHA256SUMS.txt` asset on GitHub when a bundled pin is not present.
Release maintainers should run `python backend/scripts/release_helper.py hash <ShadowBroker_vX.Y.Z.zip>` before publishing, then publish `SHA256SUMS.txt` and update `backend/data/release_digests.json` when shipping a ZIP updater target. The updater keeps the operator override path intact instead of failing closed on missing bundled digests, so existing installs do not get stranded by a release-process mistake.
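
The resolution order above can be sketched roughly as follows. This is an illustration under stated assumptions, not the updater's actual code: the function names `resolve_expected_digest` and `verify_archive` are hypothetical, `release_digests.json` is assumed to be a flat archive-name to hex-digest mapping, and `SHA256SUMS.txt` is assumed to use the common `<digest>  <filename>` line format.

```python
import hashlib
import json
import os
from pathlib import Path
from typing import Optional


def resolve_expected_digest(
    archive_name: str,
    pins_path: Path,
    sums_text: Optional[str] = None,
) -> Optional[str]:
    """Return the expected SHA-256 hex digest, or None if no source provides one."""
    # 1. An explicit operator pin takes priority over everything else.
    override = os.environ.get("MESH_UPDATE_SHA256", "").strip().lower()
    if override:
        return override

    # 2. Bundled release pins (assumed layout: {"<archive name>": "<hex digest>"}).
    if pins_path.is_file():
        pins = json.loads(pins_path.read_text(encoding="utf-8"))
        if isinstance(pins, dict) and pins.get(archive_name):
            return str(pins[archive_name]).lower()

    # 3. Fall back to the published SHA256SUMS.txt contents, if supplied.
    for line in (sums_text or "").splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].lstrip("*") == archive_name:
            return parts[0].lower()
    return None


def verify_archive(archive: Path, expected: str) -> bool:
    """Hash the archive and compare it with the expected digest."""
    actual = hashlib.sha256(archive.read_bytes()).hexdigest()
    return actual == expected.strip().lower()
```

Because the operator override is checked first, a missing or stale bundled pin never blocks an install that pins its own digest, which matches the behaviour described above.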
### CSP Hardening
The production frontend ships with a hydration-compatible CSP and a strict nonce-only CSP in `Content-Security-Policy-Report-Only`. Set `SHADOWBROKER_STRICT_CSP=1` only after verifying the exact build hydrates correctly in your deployment. Runtime Google Fonts are not required; the bundled Next font pipeline serves the dashboard font from the app build.
### ⚠️ **Stuck on the old version?**
**If `git pull` fails or `docker compose up` keeps building from source instead of pulling images**, your clone predates a March 2026 repository migration that rewrote commit history. A normal `git pull` cannot fix this. Run:
@@ -174,7 +198,7 @@ ShadowBroker v0.9.7 ships **InfoNet** (decentralized intelligence mesh + Soverei
| Channel | Privacy Status | Details |
|---|---|---|
| **Meshtastic / APRS** | **PUBLIC** | RF radio transmissions are public and interceptable by design. |
| **InfoNet Gate Chat** | **OBFUSCATED** | Messages are obfuscated with gate personas and canonical payload signing, but NOT end-to-end encrypted. Metadata is not hidden. |
| **InfoNet Gate Chat** | **OBFUSCATED** | Messages are obfuscated with gate personas and canonical payload signing, but NOT end-to-end encrypted. Metadata is not yet hidden, even though the transport is designed to run over Tor and Reticulum (work in progress). |
| **Dead Drop DMs** | **STRONGEST CURRENT LANE** | Token-based epoch mailbox with SAS word verification. Strongest lane in this build, but not yet confidently private. |
| **Sovereign Shell governance** | **PUBLIC LEDGER** | Petitions, votes, upgrade hashes, and dispute stakes are signed events on a public hashchain. Pseudonymous via gate persona, but governance actions are intentionally observable. |
| **Privacy primitives (RingCT / stealth / DEX)** | **NOT YET WIRED** | Locked Protocol contracts are in place, but the cryptographic scheme has not been chosen. The privacy-core Rust crate is the integration target for a future sprint. |
@@ -199,7 +223,7 @@ The first decentralized intelligence communication and governance layer built di
**Communication layer (since v0.9.6):**
* **InfoNet Experimental Testnet** — A global, obfuscated message relay. Anyone running ShadowBroker can transmit and receive on the InfoNet. Messages pass through a Wormhole relay layer with gate personas, Ed25519 canonical payload signing, and transport obfuscation.
* **InfoNet Experimental Testnet** — A global, obfuscated message relay using Tor and Reticulum. Anyone running ShadowBroker can transmit and receive on the InfoNet. Messages pass through a Wormhole relay layer with gate personas, Ed25519 canonical payload signing, and transport obfuscation.
* **Mesh Chat Panel** — Three-tab interface: **INFONET** (gate chat with obfuscated transport), **MESH** (Meshtastic radio integration), **DEAD DROP** (peer-to-peer message exchange with token-based epoch mailboxes — strongest current lane).
* **Gate Persona System** — Pseudonymous identities with Ed25519 signing keys, prekey bundles, SAS word contact verification, and abuse reporting.
* **Mesh Terminal** — Built-in CLI: `send`, `dm`, market commands, gate state inspection. Draggable panel, minimizes to the top bar. Type `help` to see all commands.
@@ -219,17 +243,34 @@ The first decentralized intelligence communication and governance layer built di
**Privacy primitive runway (NEW in v0.9.7):**
* **Function Keys — Anonymous Citizenship Proof** — A citizen proves "I am an Infonet citizen" without revealing their Infonet identity. 5 of 6 pieces shipped: nullifiers, challenge-response, two-phase commit receipts, enumerated denial codes, batched settlement. Issuance via blind signatures waits on a primitive decision (RSA blind sigs vs BBS+ vs U-Prove vs Idemix).
* **Function Keys — Anonymous Credential Scaffolding** — The plumbing is in place for nullifiers, challenge-response, two-phase commit receipts, enumerated denial codes, and batched settlement. Today's challenge-response is an HMAC-based placeholder for integration testing, not a production anonymous or zero-knowledge citizenship proof. True unlinkable issuance still waits on a primitive decision (RSA blind sigs vs BBS+ vs U-Prove vs Idemix).
* **Locked Protocol Contracts** — Stable interfaces in `services/infonet/privacy/contracts.py` for ring signatures, stealth addresses, Pedersen commitments, range proofs, and DEX matching. The `privacy-core` Rust crate is the integration target — no caller of the privacy module needs to know which scheme is active.
* **Sprint 11+ Path** — When the cryptographic scheme is chosen, primitives wire into the locked Protocols without API churn.
> **Experimental Testnet — No Privacy Guarantee:** InfoNet messages are obfuscated but NOT end-to-end encrypted. The Mesh network (Meshtastic/APRS) is NOT private — radio transmissions are inherently public. The privacy primitive contracts are scaffolded but not yet wired. Do not send anything sensitive on any channel. Treat all channels as open and public for now.
### 🔍 Shodan Device Search (NEW in v0.9.6)
### 🔍 Recon Toolkit & Shodan (Osiris-derived, security-first)
* **Internet Device Search** — Query Shodan directly from ShadowBroker. Search by keyword, CVE, port, or service — results plotted as a live overlay on the map
Adapted from the [OSIRIS](https://github.com/simplifaisoul/osiris) recon stack (MIT) with ShadowBroker's proxy model. Attribution: `backend/third_party/osiris/NOTICE.md`.
**Recon Toolkit** (left sidebar — local operator only):
* **IP / DNS / WHOIS** — ip-api.com geolocation, Google DNS-over-HTTPS, RDAP registrant data with optional HTTP security header scoring
* **Certificates & BGP** — crt.sh subdomain discovery, bgpview.io ASN/prefix lookups
* **Threat intel** — AlienVault OTX pulses, Tor exit-node checks, optional per-IP/domain reputation
* **Sanctions** — OpenSanctions `us_ofac_sdn` index (CC-BY); cross-checks on WHOIS entities and IP ISP/org strings
* **CVE / MAC / GitHub / leaks** — MITRE CVE API, MAC vendor lookup, GitHub profile recon, public breach checks
* **IP sweep** — `/api/osint/sweep/scan` geolocates a target /24/32 and proxies Shodan InternetDB host discovery server-side (browser never contacts InternetDB directly)
* **SSRF guard** — Private, loopback, link-local, and metadata hostnames are blocked before any user-supplied fetch
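
The kind of check the SSRF guard bullet describes can be sketched with the standard library alone. This is not the contents of `ssrf_guard.py`: the function name `is_fetch_allowed`, the `BLOCKED_HOSTNAMES` set, and the injectable `resolve` parameter are assumptions made for illustration.

```python
import ipaddress
import socket
from typing import Callable, Iterable, List
from urllib.parse import urlsplit

# Illustrative hostname blocklist (an assumption, not the project's list).
BLOCKED_HOSTNAMES = {"localhost", "metadata.google.internal"}


def _resolve(host: str) -> List[str]:
    """Resolve a hostname to IP address strings using the system resolver."""
    return [info[4][0] for info in socket.getaddrinfo(host, None)]


def is_fetch_allowed(
    url: str,
    resolve: Callable[[str], Iterable[str]] = _resolve,
) -> bool:
    """Allow only http(s) URLs whose host maps to globally routable addresses."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    host = parts.hostname.rstrip(".")
    if host in BLOCKED_HOSTNAMES:
        return False
    try:
        # IP literals are checked directly, without a DNS lookup.
        addresses = [ipaddress.ip_address(host)]
    except ValueError:
        try:
            addresses = [ipaddress.ip_address(a) for a in resolve(host)]
        except (OSError, ValueError):
            return False
    # Private, loopback, link-local, and reserved ranges all fail is_global.
    return bool(addresses) and all(a.is_global for a in addresses)
```

A check like this only covers the moment of validation. A production guard would typically also connect to the address it validated, so that a second DNS answer cannot redirect the fetch to an internal host.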
**Entity graph** — Select any map entity to open the Entity Graph panel (`GET /api/entity/expand`). Resolves aircraft, vessels, companies, persons, IPs, and countries into a node/link graph (Wikidata SPARQL + OFAC + in-memory flight/ship store).
**OpenClaw / agent access** — The same recon backends are available on the HMAC command channel (no browser local-operator gate): `osint_lookup` (passive IP/DNS/WHOIS/certs/BGP/sanctions/CVE/MAC/GitHub/leaks/threats), `entity_expand` (relationship graph), and `osint_sweep` (active subnet scan — **full** access tier only). Call `osint_tools` to list supported lookup types. Skill package: `openclaw-skills/shadowbroker/` (`SKILL.md` + `sb_query.py`).
**Shodan overlay** (unchanged):
* **Internet Device Search** — Query Shodan with your own API key; results plotted as a live overlay
* **Configurable Markers** — Shape, color, and size customization for Shodan results
* **Operator-Supplied API** — Uses your own `SHODAN_API_KEY`; results rendered as a local investigative overlay
### 🛩️ Aviation Tracking
@@ -317,11 +358,12 @@ The first decentralized intelligence communication and governance layer built di
### 📷 Surveillance
* **CCTV Mesh** — 11,000+ live traffic cameras from 13 sources across 6 countries:
* **CCTV Mesh** — 22,000+ live traffic cameras from 21 ingestors across 10 countries (US, UK, Canada, Australia, Austria, Spain, Singapore, Netherlands when NDW feed is up, plus OSM):
* 🇬🇧 Transport for London JamCams
* 🇺🇸 NYC DOT, Austin TX (TxDOT)
* 🇺🇸 California (12 Caltrans districts), Washington State (WSDOT), Georgia DOT, Illinois DOT, Michigan DOT
* 🇪🇸 Spain DGT National (20 cities), Madrid City (357 cameras via KML)
* 🇦🇹 Austria ASFINAG motorway webcams
* 🇸🇬 Singapore LTA
* 🌍 Windy Webcams
* **Feed Rendering** — Automatic detection & rendering of video, MJPEG, HLS, embed, satellite tile, and image feeds
@@ -342,6 +384,12 @@ The first decentralized intelligence communication and governance layer built di
* **Data Center Mapping** — 2,000+ global data centers plotted from a curated dataset. Clustered purple markers with server-rack icons. Click for operator, location, and automatic internet outage cross-referencing by country.
* **Military Bases** — Global military installation and missile facility database (NEW)
* **Power Plants** — 35,000+ global power plants from the WRI database (NEW)
* **Submarine Cables** — Global undersea cable routes from static TeleGeography-derived GeoJSON (`frontend/public/data/submarine-cables.json`). Opt-in line overlay.
* **Malware C2 Layer** — Botnet C2 servers (Feodo Tracker) and recent malware URLs (URLhaus) from abuse.ch, refreshed on the slow tier when the layer is enabled.
* **SCM Supplier Risk** — Tier 1/2 fabs and battery plants (TSMC, Samsung, CATL, etc.) cross-referenced against earthquakes, FIRMS fires, and GDELT conflict proximity. Alerts in the SCM panel; optional map layer.
* **Cyber Threats Feed** — Recent CISA Known Exploited Vulnerabilities (KEV) entries exposed via `/api/cyber-threats` and the layer toggle.
* **Country Risk Index** — Static geopolitical risk scores with USGS earthquake enrichment via `/api/country-risk`.
* **Telegram OSINT** — Public channel web previews (`t.me/s/*`) from configurable war/OSINT feeds. Hourly incremental merge (no redundant re-scrape), keyword risk scoring, Cyrillic/Arabic place aliases, metro-anchor geocoding (separate from news centroids), inline photo/video via `/api/telegram/media` proxy. Layer key: `telegram_osint`.
### 🌐 Additional Layers & Tools
@@ -367,7 +415,9 @@ v0.9.7 turns ShadowBroker from a dashboard a human watches into an intelligence
**Capabilities:**
* **Full Telemetry Access** — The agent queries all 35+ data layers: flights, ships, satellites, SIGINT, conflict events, earthquakes, fires, wastewater, prediction markets, and more. Fast and slow tier endpoints return enriched data with geographic coordinates, timestamps, and source attribution.
* **Full Telemetry Access** — The agent queries all 40+ data layers: flights, ships, satellites, SIGINT, conflict events, earthquakes, fires, wastewater, **Telegram OSINT**, malware/C2, **CISA KEV cyber threats**, SCM overlays, fishing activity (GFW), prediction markets, and more. Fast and slow tier endpoints return enriched data with geographic coordinates, timestamps, and source attribution.
* **Compact Search (preferred over full dumps)** — `get_summary` and `get_layer_slice` with per-layer `since_layer_versions` (SSE `layer_changed` push tells the agent exactly which layers updated). `search_telemetry` is the Google-style cross-layer keyword index. `search_news` covers news, GDELT, CrowdThreat, LiveUAMap, frontlines, and Telegram posts. `entities_near`, `brief_area`, `find_flights`/`find_ships`/`find_entity`, and `correlate_entity` answer targeted questions without multi-megabyte pulls.
* **Recon Toolkit on the Channel** — `osint_lookup` runs the same SSRF-guarded backends as the Recon panel (`ip`, `dns`, `whois`, `certs`, `bgp`, `sanctions`, `cve`, `mac`, `github`, `leaks`, `threats`, `sweep_init`). `entity_expand` builds Wikidata + OFAC relationship graphs. `osint_sweep` runs Shodan InternetDB subnet discovery (**full** tier). Layer aliases: `telegram`, `malware`/`botnet`, `cyber`/`cisa`/`kev`, `scm`/`suppliers`, `gfw`/`fishing`.
* **AI Intel Pins** — Place color-coded investigation markers directly on the operator's map. 14 pin categories (threat, anomaly, military, maritime, aviation, SIGINT, infrastructure, etc.) with confidence scores, TTL expiry, source URLs, and batch placement up to 100 pins at once.
* **Map Control** — Fly the operator's map view to any coordinate, trigger satellite imagery lookups, and open region dossiers. The agent can direct the operator's attention to specific locations in real time.
* **SAR Ground-Change** — Query SAR anomaly feeds, inspect pin details, manage AOIs, and fly the map to watch areas. The agent can monitor for ground deformation, flood extent, or damage and promote anomalies to pins.
@@ -380,7 +430,7 @@ v0.9.7 turns ShadowBroker from a dashboard a human watches into an intelligence
* **Intelligence Reports** — Generate structured reports with summary stats, top military flights, correlations, earthquake activity, SIGINT counts, and pin inventories.
* **Auditable** — Every channel call is logged; the operator can introspect what the agent has done.
**Connect an agent:** Open the AI Intel panel in the left sidebar, click **Connect Agent**, and copy the HMAC secret. From there, point any compatible agent at the channel — for OpenClaw, import `ShadowBrokerClient` from the OpenClaw skill package; for any other agent, use the same HMAC contract documented above (timestamp + nonce + body digest, tier-gated). The channel is the protocol, not the agent.
**Connect an agent:** Open the AI Intel panel in the left sidebar, click **Connect Agent**, and copy the HMAC secret. From there, point any compatible agent at the channel — for OpenClaw, import `ShadowBrokerClient` from `openclaw-skills/shadowbroker/sb_query.py` (see `SKILL.md` for examples); for any other agent, use the same HMAC contract documented above (timestamp + nonce + body digest, tier-gated). Discovery: `GET /api/ai/tools` and `GET /api/ai/capabilities`. The channel is the protocol, not the agent.
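
For agents other than OpenClaw, a signing scheme built from a timestamp, a nonce, and a body digest could look like the sketch below. The header names (`X-SB-Timestamp`, `X-SB-Nonce`, `X-SB-Signature`), the canonical string layout, and the 300-second skew window are all hypothetical; the real contract is whatever the channel documentation and `sb_query.py` define.

```python
import hashlib
import hmac
import secrets
import time
from typing import Dict, Optional, Set


def sign_request(
    secret: str,
    method: str,
    path: str,
    body: bytes,
    timestamp: Optional[int] = None,
    nonce: Optional[str] = None,
) -> Dict[str, str]:
    """Build signature headers over method, path, timestamp, nonce, and body digest."""
    ts = str(int(time.time()) if timestamp is None else timestamp)
    nonce = nonce or secrets.token_hex(16)
    body_digest = hashlib.sha256(body).hexdigest()
    # Hypothetical canonical string; the real channel defines its own layout.
    message = "\n".join([method.upper(), path, ts, nonce, body_digest]).encode()
    signature = hmac.new(secret.encode(), message, hashlib.sha256).hexdigest()
    return {"X-SB-Timestamp": ts, "X-SB-Nonce": nonce, "X-SB-Signature": signature}


def verify_request(
    secret: str,
    method: str,
    path: str,
    body: bytes,
    headers: Dict[str, str],
    seen_nonces: Set[str],
    max_skew_s: int = 300,
    now: Optional[float] = None,
) -> bool:
    """Reject stale timestamps, replayed nonces, and mismatched signatures."""
    now = time.time() if now is None else now
    try:
        ts = int(headers["X-SB-Timestamp"])
        nonce = headers["X-SB-Nonce"]
        supplied = headers["X-SB-Signature"]
    except (KeyError, ValueError):
        return False
    if abs(now - ts) > max_skew_s or nonce in seen_nonces:
        return False
    expected = sign_request(secret, method, path, body, ts, nonce)["X-SB-Signature"]
    if not hmac.compare_digest(expected, supplied):
        return False
    seen_nonces.add(nonce)  # remember the nonce so the same request cannot be replayed
    return True
```

The timestamp bounds how long a captured request stays valid, the nonce blocks replays inside that window, and the body digest ties the signature to the exact payload.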
### ⏱️ Time Machine — Snapshot Playback (NEW in v0.9.7)
@@ -529,9 +579,20 @@ ShadowBroker v0.9.7 is composed of three vertically-stacked planes — the **Ope
| [GDELT Project](https://www.gdeltproject.org) | Global conflict events | ~6h | No |
| [DeepState Map](https://deepstatemap.live) | Ukraine frontline | ~30min | No |
| [Shodan](https://www.shodan.io) | Internet-connected device search | On-demand | **Yes** |
| [OpenSanctions](https://www.opensanctions.org) | OFAC SDN sanctions index (recon + entity graph) | 24h cache | No |
| [abuse.ch Feodo + URLhaus](https://abuse.ch) | Malware C2 / distribution URLs | ~5min (opt-in layer) | No |
| [CISA KEV](https://www.cisa.gov/known-exploited-vulnerabilities-catalog) | Known exploited CVEs | ~5min (opt-in layer) | No |
| [ip-api.com](https://ip-api.com) | IP geolocation (recon, entity graph) | On-demand | No |
| [Google Public DNS](https://dns.google) | DNS-over-HTTPS lookups (recon) | On-demand | No |
| [RDAP.org](https://rdap.org) | Domain registration data (recon) | On-demand | No |
| [crt.sh](https://crt.sh) | Certificate transparency (recon) | On-demand | No |
| [bgpview.io](https://bgpview.io) | BGP/ASN routing (recon) | On-demand | No |
| TeleGeography (static) | Submarine cable routes | Static | No |
| [ASFINAG](https://www.asfinag.at) | Austria motorway webcams | ~10min | No |
| [Amtrak](https://www.amtrak.com) | US train positions | ~60s | No |
| [DigiTraffic](https://www.digitraffic.fi) | European rail positions | ~60s | No |
| [Global Fishing Watch](https://globalfishingwatch.org) | Fishing vessel activity events | ~10min | No |
| [Global Fishing Watch](https://globalfishingwatch.org) | Fishing vessel activity events | ~1hr | **Yes** (`GFW_API_TOKEN`) |
| [Telegram public previews](https://t.me/s) | War/OSINT channel posts (`telegram_osint`) | ~1hr | No (optional `TELEGRAM_OSINT_CHANNELS`) |
| Transport for London, NYC DOT, TxDOT | CCTV cameras (UK, US) | ~10min | No |
| Caltrans, WSDOT, GDOT, IDOT, MDOT | CCTV cameras (5 US states) | ~10min | No |
| Spain DGT, Madrid City | CCTV cameras (Spain) | ~10min | No |
@@ -563,6 +624,8 @@ ShadowBroker v0.9.7 is composed of three vertically-stacked planes — the **Ope
| [OSM Nominatim](https://nominatim.openstreetmap.org) | Place name geocoding (LOCATE bar) | On-demand | No |
| [CARTO Basemaps](https://carto.com) | Dark map tiles | Continuous | No |
**Outbound privacy & audit (#348#366):** Each self-hosted install uses its own backend IP and per-install User-Agent handle. See [docs/OUTBOUND_DATA.md](docs/OUTBOUND_DATA.md) for what contacts third parties, opt-in/env controls, and accepted tradeoffs (CCTV Referer, basemap CDN, LiveUAMap, etc.).
---
## 🚀 Getting Started
@@ -584,9 +647,16 @@ Open `http://localhost:3000` to view the dashboard.
> **Deploying publicly or on a LAN?** No configuration needed for most setups.
> The frontend proxies all API calls through the Next.js server to `BACKEND_URL`,
> which defaults to `http://backend:8000` (Docker internal networking).
> Host port `8000` is only published for local API/debug access. If it conflicts
> with another service, set `BACKEND_PORT=8001` in `.env`; leave `BACKEND_URL`
> as `http://backend:8000` because that is the Docker-internal port.
> Host port `8000` is only published for local API/debug access (`127.0.0.1:8000`
> in `docker-compose.yml`). If it conflicts with another service, set
> `BACKEND_PORT=8001` in `.env`; leave `BACKEND_URL` as `http://backend:8000`
> because that is the Docker-internal port.
>
> **Running the backend outside Docker** (`cd backend && python main.py`):
> the dev server binds **loopback only** (`127.0.0.1:8000`) so other machines on
> your LAN cannot hit admin/local-trust routes with an empty `ADMIN_KEY`. Set
> `SHADOWBROKER_DEV_BIND_ALL=true` in `.env` only when you deliberately need
> `0.0.0.0` and use a strong `ADMIN_KEY` for any non-local callers.
> The backend memory cap is controlled by `BACKEND_MEMORY_LIMIT` and defaults
> to `4G`. If Docker reports OOM events, the backend will restart and slow
> layers can look empty until they repopulate.
@@ -798,7 +868,7 @@ AIS-catcher decodes VHF radio signals on 161.975 MHz and 162.025 MHz and POSTs d
## 🎛️ Data Layers
All 37 layers are independently toggleable from the left panel:
All 41 layers are independently toggleable from the left panel:
| Layer | Default | Description |
|---|---|---|
@@ -840,6 +910,24 @@ All 37 layers are independently toggleable from the left panel:
| VIIRS Nightlights | ❌ OFF | Night-time light change detection |
| Power Plants | ❌ OFF | 35,000+ global power plants |
| Shodan Overlay | ❌ OFF | Internet device search results |
| Road Freight Trends | ❌ OFF | Sentinel-2 truck-motion trends on major highways (Analyze Here) |
| Submarine Cables | ❌ OFF | Global undersea cable routes (static GeoJSON) |
| Malware C2 | ❌ OFF | abuse.ch Feodo + URLhaus threat points |
| SCM Suppliers | ❌ OFF | Tier 1/2 supply-chain risk markers + panel alerts |
| Cyber Threats | ❌ OFF | Recent CISA KEV entries (stats in slow-tier payload) |
| Telegram OSINT | ✅ ON | Public war/OSINT Telegram channels — hourly scrape, geoparsed pins |
| SAR | ✅ ON | Synthetic aperture radar catalog + anomaly alerts |
**Recon & entity tools** (not map layers — left sidebar / selection):
| Tool | Dashboard access | OpenClaw command | Description |
|---|---|---|---|
| Recon Toolkit | Local operator (`/api/osint/*`) | `osint_lookup`, `osint_sweep`† | IP, DNS, WHOIS, certs, BGP, sanctions, CVE, MAC, GitHub, leaks, threats, subnet sweep |
| Entity Graph | Local operator (`/api/entity/expand`) | `entity_expand` | Wikidata + OFAC + live-store relationship graph |
| SCM Risk panel | Local operator (`/api/scm-suppliers`) | `get_layer_slice(["scm_suppliers"])` | Supplier threat rollup + map markers |
| Tool discovery | — | `osint_tools` | Lists recon lookup types and entity-expand schemas |
† `osint_sweep` (active InternetDB scan) requires `OPENCLAW_ACCESS_TIER=full`.
---
@@ -863,6 +951,7 @@ The platform is optimized for handling massive real-time datasets:
```
Shadowbroker/
├── openclaw-skills/shadowbroker/ # OpenClaw skill — SKILL.md, sb_query.py client, alerts/monitor helpers
├── backend/
│ ├── main.py # FastAPI app, middleware, API routes (~4,000 lines)
│ ├── cctv.db # SQLite CCTV camera database (auto-generated)
@@ -872,7 +961,18 @@ Shadowbroker/
│ │ ├── data_fetcher.py # Core scheduler — orchestrates all data sources
│ │ ├── ais_stream.py # AIS WebSocket client (25K+ vessels)
│ │ ├── carrier_tracker.py # OSINT carrier position estimator (GDELT news scraping)
│ │ ├── cctv_pipeline.py # 13-source CCTV camera ingestion pipeline
│ │ ├── cctv_pipeline.py # 14-source CCTV camera ingestion pipeline
│ │ ├── ssrf_guard.py # SSRF validation for operator recon fetches
│ │ ├── sanctions/ofac.py # OpenSanctions OFAC SDN index
│ │ ├── osint/lookups.py # Server-side recon lookups (Osiris port)
│ │ ├── osint/openclaw_recon.py # OpenClaw dispatch for recon + entity_expand
│ │ ├── osint_intel/resolve.py # Entity graph resolver (Wikidata + OFAC)
│ │ ├── scm/suppliers.py # Supply-chain risk overlay
│ │ ├── intel_feeds/ # Country risk index helpers
│ │ ├── fetchers/malware.py # abuse.ch Feodo + URLhaus
│ │ ├── fetchers/cyber_status.py # CISA KEV feed
│ │ ├── fetchers/telegram_osint.py # Public Telegram channel scrape + geoparse
│ │ ├── third_party/osiris/ # MIT attribution for Osiris-derived code
│ │ ├── geopolitics.py # GDELT + Ukraine frontline + air alerts
│ │ ├── region_dossier.py # Right-click country/city intelligence
│ │ ├── radio_intercept.py # Police scanner feeds + OpenMHZ
@@ -910,7 +1010,14 @@ Shadowbroker/
│ │ ├── mesh_reputation.py # Node reputation scoring
│ │ ├── mesh_oracle.py # Oracle consensus protocol
│ │ └── mesh_secure_storage.py # Secure credential storage
│ ├── routers/
│ │ ├── osint.py # /api/osint/* recon routes (local operator)
│ │ ├── entity_graph.py # /api/entity/expand
│ │ ├── scm.py # /api/scm-suppliers
│ │ └── intel_feeds.py # /api/malware, /api/cyber-threats, /api/telegram-feed, /api/country-risk
├── frontend/
│ ├── public/data/
│ │ └── submarine-cables.json # Static undersea cable GeoJSON
│ ├── src/
│ │ ├── app/
│ │ │ └── page.tsx # Main dashboard — state, polling, layout
@@ -919,7 +1026,12 @@ Shadowbroker/
│ │ ├── MeshChat.tsx # InfoNet / Mesh / Dead Drop chat panel
│ │ ├── MeshTerminal.tsx # Draggable CLI terminal
│ │ ├── NewsFeed.tsx # SIGINT feed + entity detail panels
│ │ ├── WorldviewLeftPanel.tsx # Data layer toggles (35+ layers)
│ │ ├── WorldviewLeftPanel.tsx # Data layer toggles (40+ layers)
│ │ ├── ShodanPanel.tsx # Shodan device search overlay
│ │ ├── ReconPanel.tsx # Server-side OSINT recon toolkit
│ │ ├── ScmPanel.tsx # Supply-chain risk command panel
│ │ ├── EntityGraphPanel.tsx # Entity graph on map selection
│ │ ├── MaplibreViewer/popups/TelegramOsintPopup.tsx # Threat-intercept styled Telegram pin popups
│ │ ├── WorldviewRightPanel.tsx # Search + filter sidebar
│ │ ├── AdvancedFilterModal.tsx # Airport/country/owner filtering
│ │ ├── MapLegend.tsx # Dynamic legend with all icons
@@ -956,6 +1068,9 @@ MESH_SAR_EARTHDATA_TOKEN= # NASA Earthdata token (paired wit
MESH_SAR_COPERNICUS_USER= # Copernicus Data Space user (SAR Mode B — EGMS / EMS)
MESH_SAR_COPERNICUS_TOKEN= # Copernicus token (paired with user above)
OPENCLAW_ACCESS_TIER=restricted # OpenClaw agent tier: "restricted" (read-only) or "full"
GFW_API_TOKEN=your_gfw_token # Global Fishing Watch — fishing_activity layer (Settings → Maritime)
TELEGRAM_OSINT_ENABLED=true # Telegram OSINT layer (default on)
TELEGRAM_OSINT_CHANNELS=osintdefender,... # Comma-separated public channel slugs (see .env.example)
# Private-lane privacy-core pinning (required when Arti or RNS is enabled)
PRIVACY_CORE_MIN_VERSION=0.1.0
+80 -14
@@ -11,6 +11,22 @@ AIS_API_KEY= # https://aisstream.io/ — free tier WebSocket key
# ── Optional ───────────────────────────────────────────────────
# AISHub REST fallback. Used when stream.aisstream.io is unreachable
# (e.g. their cert expires or server goes offline). Free tier requires
# registration at https://www.aishub.net/api. Poll cadence defaults to
# 20 min to stay courteous; tunable via AISHUB_POLL_INTERVAL_MINUTES.
# AISHUB_USERNAME=
# AISHUB_POLL_INTERVAL_MINUTES=20
# `python main.py` (uvicorn reload) binds 127.0.0.1:8000 by default so LAN clients
# cannot reach a dev server with empty ADMIN_KEY (#375). Set true only when you
# intentionally need 0.0.0.0 and understand the local-trust implications.
# SHADOWBROKER_DEV_BIND_ALL=false
#
# Thread pool for GDELT, LiveUAMap, CCTV ingest, and slow-tier refresh batches.
# Keeps heavy jobs from starving fast flight/ship workers (default 2).
# SHADOWBROKER_HEAVY_FETCH_WORKERS=2
# Override allowed CORS origins (comma-separated). Defaults to localhost + LAN auto-detect.
# CORS_ORIGINS=http://192.168.1.50:3000,https://my-domain.com
@@ -24,11 +40,9 @@ AIS_API_KEY= # https://aisstream.io/ — free tier WebSocket key
# Requires MESH_DEBUG_MODE=true; do not enable this for ordinary use.
# ALLOW_INSECURE_ADMIN=false
# Per-install operator handle. Round 7a: every outbound third-party API
# call (Wikipedia, Wikidata, Nominatim, GDELT, OpenMHz, Broadcastify,
# weather.gov, NUFORC, etc.) includes this handle in the User-Agent so
# upstreams can rate-limit / contact the specific install instead of
# treating every Shadowbroker user as one entity.
# Per-install operator handle. Round 7a: outbound third-party API calls send
# this handle as the User-Agent (e.g. operator-7f3a92), not a shared app name,
# so upstreams rate-limit one install instead of blocking every user.
#
# Default empty -> a stable pseudonymous handle (e.g. "operator-7f3a92") is
# auto-generated on first run and persisted to backend/data/operator_handle.json.
@@ -36,10 +50,8 @@ AIS_API_KEY= # https://aisstream.io/ — free tier WebSocket key
# set it here. Special characters are sanitized to dashes.
# OPERATOR_HANDLE=
# Default outbound User-Agent for all third-party HTTP fetchers. Operators
# who run a public relay and want a completely custom UA can set this; it
# bypasses the per-operator helper entirely. Most installs should leave it
# unset and use OPERATOR_HANDLE instead.
# Full User-Agent override (replaces the operator handle entirely). Rare;
# most installs should use OPERATOR_HANDLE only.
# SHADOWBROKER_USER_AGENT=
# Nominatim-specific User-Agent override (OSM usage policy). Leave unset to
@@ -59,20 +71,48 @@ AIS_API_KEY= # https://aisstream.io/ — free tier WebSocket key
# FIMI_ENABLED=false
#
# Polymarket + Kalshi — US political/election prediction markets.
# Default off; enable from Global Threat Intercept (MKT toggle) or set true here.
# PREDICTION_MARKETS_ENABLED=false
# When enabled, polls use a jittered schedule (not the fixed 5-minute slow tier):
# PREDICTION_MARKETS_INTERVAL_MINUTES=7
# PREDICTION_MARKETS_SCHEDULER_JITTER_S=240
# PREDICTION_MARKETS_INITIAL_DELAY_MAX_S=180
# PREDICTION_MARKETS_PRE_FETCH_JITTER_S=90
# PREDICTION_MARKETS_PROVIDER_GAP_JITTER_S=45
# MESH_POLYMARKET_PAGE_DELAY_JITTER_S=0.08
# MESH_KALSHI_PAGE_DELAY_JITTER_S=0.2
#
# Finnhub fallback / yfinance — financial market data.
# Set FINNHUB_API_KEY to enable Finnhub, or set FINANCIAL_ENABLED=true to allow
# the unauthenticated yfinance fallback to call Yahoo Finance.
# FINANCIAL_ENABLED=false
#
# NUFORC UAP sightings — huggingface.co dataset download.
# NUFORC UAP map layer — live scrape from nuforc.org (rolling window, default 60 days).
# Refreshed weekly (Mon 12:00 UTC); cache reused for up to 7 days between runs.
# NUFORC_RECENT_DAYS=60
# NUFORC_CACHE_TTL_HOURS=168
# On Windows, live scrape uses Python requests by default; optional:
# SHADOWBROKER_ENABLE_WINDOWS_CURL_FALLBACK=true
# NUFORC enrichment index (HF dataset) is separate — opt-in only:
# NUFORC_ENABLED=false
#
# News RSS aggregator — defaults ON. Set to "false" to disable all
# configured news feeds (kill switch for the news layer).
# NEWS_ENABLED=true
# Global Fishing Watch — fishing vessel activity events (Fishing Activity map layer).
# Free API token from https://globalfishingwatch.org/our-apis/tokens
# Without this the fishing_activity layer stays empty.
# GFW_API_TOKEN=
# Optional tuning — GFW can return 40k+ global events; defaults cap fetch for map paint.
# GFW_EVENTS_PAGE_SIZE=500
# GFW_EVENTS_MAX_PAGES=10
# GFW_EVENTS_LOOKBACK_DAYS=7
# GFW_EVENTS_TIMEOUT_S=90
# Windy Webcams global CCTV layer — free key from https://api.windy.com/webcams/docs
# WINDY_API_KEY=
# LTA Singapore traffic cameras — leave blank to skip this data source.
# LTA_ACCOUNT_KEY=
@@ -80,6 +120,12 @@ AIS_API_KEY= # https://aisstream.io/ — free tier WebSocket key
# Free MAP_KEY from https://firms.modaps.eosdis.nasa.gov/map/#d:24hrs;@0.0,0.0,3.0z
# FIRMS_MAP_KEY=
# Ukraine frontline mirror (GitHub). Default follows cyterat/deepstate-map-data@main.
# Pin an immutable commit SHA so ingest cannot silently change if main is force-pushed (#362).
# Example (verify on GitHub before use): main @ b479954e94696bc5622c7818fd20a64a699f4fe8
# DEEPSTATE_MIRROR_COMMIT=b479954e94696bc5622c7818fd20a64a699f4fe8
# DEEPSTATE_MIRROR_REPO=cyterat/deepstate-map-data
# Ukraine air raid alerts from alerts.in.ua — free token from https://alerts.in.ua/
# ALERTS_IN_UA_TOKEN=
@@ -109,12 +155,16 @@ AIS_API_KEY= # https://aisstream.io/ — free tier WebSocket key
# can identify per-install traffic instead of aggregated "ShadowBroker" hits.
# Leave blank to send a generic UA. If you set MESHTASTIC_OPERATOR_CALLSIGN,
# it is included in outbound headers to meshtastic.org by default so they
# can rate-limit per-operator. Set MESHTASTIC_SEND_CALLSIGN_HEADER=false to
# suppress the callsign while still using it locally (e.g. for APRS).
# can rate-limit per-operator. Callsign is NOT sent upstream unless you opt in.
# MESHTASTIC_OPERATOR_CALLSIGN=
# MESHTASTIC_SEND_CALLSIGN_HEADER=true
# MESHTASTIC_SEND_CALLSIGN_HEADER=false
# MESH_MQTT_PSK= # hex-encoded, empty = default LongFast key
# LiveUAMap Playwright scraper (#348). Linux/macOS: on by default when Global
# Incidents layer is active. Windows: off until the operator enables Global
# Incidents in the UI (consent dialog) or sets SHADOWBROKER_ENABLE_LIVEUAMAP_SCRAPER=true.
# SHADOWBROKER_ENABLE_LIVEUAMAP_SCRAPER=false forces off on all platforms.
# ── Mesh / Reticulum (RNS) ─────────────────────────────────────
# Full-node / participant-node posture for public Infonet sync.
# MESH_NODE_MODE=participant # participant | relay | perimeter
@@ -177,7 +227,23 @@ AIS_API_KEY= # https://aisstream.io/ — free tier WebSocket key
# MESH_GATE_SESSION_STREAM_MAX_GATES=16
# MESH_BOOTSTRAP_DISABLED=false
# MESH_BOOTSTRAP_MANIFEST_PATH=data/bootstrap_peers.json
# MESH_BOOTSTRAP_SIGNER_PUBLIC_KEY=
# Swarm discovery (signed peer manifest). Participants need only the public key;
# the seed operator sets MESH_BOOTSTRAP_SIGNER_PRIVATE_KEY (never commit it).
# Generate a fleet keypair: uv run python backend/scripts/bootstrap_manifest_helper.py generate-keypair
# Public sb-testnet fleet defaults (auto-used when MESH_INFONET_FLEET_JOIN=true).
# MESH_BOOTSTRAP_SIGNER_PUBLIC_KEY=ul1d0kj/ODPIp0OhHzX8eLAVXzJ3CVvzW1vn2IC6q3I=
# MESH_INFONET_FLEET_JOIN=true
# MESH_INFONET_FLEET_JOIN_DISABLED=false
# MESH_BOOTSTRAP_SIGNER_PRIVATE_KEY= # seed only
# MESH_BOOTSTRAP_SIGNER_ID=shadowbroker-seed
# MESH_PEER_REGISTRY_ENABLED=true # seed only (auto-enabled when private key is set)
# Headless relay compose sets MESH_INFONET_RELAY_AUTO_WORMHOLE=true; seed nodes with
# MESH_BOOTSTRAP_SIGNER_PRIVATE_KEY also auto-enable Tor wormhole on startup.
# MESH_INFONET_RELAY_AUTO_WORMHOLE=false
# MESH_INFONET_RELAY_AUTO_WORMHOLE_DISABLED=false
# MESH_SWARM_MANIFEST_TTL_S=14400
# MESH_SWARM_MANIFEST_PULL_INTERVAL_S=300
# MESH_PEER_REGISTRY_STALE_S=604800
# Infonet/Wormhole fails closed to onion/RNS by default. Only enable clearnet
# sync for local relay development or an explicitly public testnet.
# MESH_INFONET_ALLOW_CLEARNET_SYNC=false
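Note on the PREDICTION_MARKETS_* settings above: the comments say polls run on a jittered schedule rather than a fixed slow tier. The sketch below shows one way such a delay could be computed. It is only an illustration: the function names are hypothetical, and the assumption that jitter is added uniformly in [0, jitter] on top of the base interval is not confirmed by this diff.

```python
import os
import random
from typing import Optional


def _env_float(name: str, default: float) -> float:
    """Read a float from the environment, falling back to the default on bad input."""
    raw = os.environ.get(name, "").strip()
    try:
        return float(raw) if raw else default
    except ValueError:
        return default


def next_poll_delay_s(
    interval_minutes: float,
    jitter_s: float,
    rng: Optional[random.Random] = None,
) -> float:
    """Return seconds until the next poll: base interval plus uniform jitter.

    ASSUMPTION: additive uniform jitter in [0, jitter_s]; the real scheduler
    may apply jitter differently.
    """
    rng = rng or random.Random()
    base = max(interval_minutes, 0.0) * 60.0
    return base + rng.uniform(0.0, max(jitter_s, 0.0))


if __name__ == "__main__":
    interval = _env_float("PREDICTION_MARKETS_INTERVAL_MINUTES", 7.0)
    jitter = _env_float("PREDICTION_MARKETS_SCHEDULER_JITTER_S", 240.0)
    print(f"next poll in {next_poll_delay_s(interval, jitter):.0f}s")
```

With the example values shown in the comments (7 minutes, 240 seconds), this sketch yields a delay between 420 and 660 seconds, so successive polls do not land on a predictable cadence.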
+3 -2
@@ -27,6 +27,7 @@ WORKDIR /app
RUN apt-get update && apt-get install -y --no-install-recommends \
ca-certificates \
curl \
git \
tor \
&& curl -fsSL https://deb.nodesource.com/setup_20.x | bash - \
&& apt-get install -y --no-install-recommends nodejs \
@@ -45,7 +46,7 @@ COPY uv.lock /workspace/uv.lock
COPY backend/pyproject.toml /workspace/backend/pyproject.toml
# Install Python dependencies using the lockfile
RUN cd /workspace/backend && uv sync --frozen --no-dev \
RUN cd /workspace/backend && uv sync --frozen --no-dev --extra road-corridor \
&& playwright install --with-deps chromium
# Copy backend source code
@@ -72,7 +73,7 @@ ENV PRIVACY_CORE_LIB=/app/libprivacy_core.so
# Create a non-root user for security
# Grant write access to /app so the auto-updater can extract files
# Pre-create /app/data so mounted volumes inherit correct ownership
RUN adduser --system --uid 1001 backenduser \
RUN adduser --system --uid 1001 --home /app backenduser \
&& mkdir -p /app/data \
&& chown -R backenduser /app \
&& chmod -R u+w /app
+54 -30
@@ -113,8 +113,14 @@ def _scoped_admin_tokens() -> dict[str, list[str]]:
return normalized
def _request_scope_path(request: Request) -> str:
"""Return the ASGI request-line path, not the Host-derived URL path."""
scope = getattr(request, "scope", {}) or {}
return str(scope.get("path") or "")
def _required_scope_for_request(request: Request) -> str:
path = str(request.url.path or "")
path = _request_scope_path(request)
if path.startswith("/api/wormhole/gate/"):
return "gate"
if path.startswith("/api/wormhole/dm/"):
@@ -443,7 +449,7 @@ async def _verify_openclaw_hmac(request: Request) -> bool:
# Compute expected signature: HMAC-SHA256(secret, METHOD|path|ts|nonce|body_digest)
method = str(request.method or "").upper()
path = str(request.url.path or "")
path = _request_scope_path(request)
message = f"{method}|{path}|{ts_str}|{nonce}|{body_digest}"
expected = hmac.new(
secret.encode("utf-8"),
@@ -515,33 +521,32 @@ _KNOWN_COMPROMISED_PEER_PUSH_SECRET_SHA256 = (
def _validate_admin_startup() -> None:
admin_key = _current_admin_key()
if not admin_key or len(admin_key) < 32:
import secrets
if not admin_key:
logger.warning(
"ADMIN_KEY is not set. Local-operator/admin endpoints will reject "
"remote callers until ADMIN_KEY is configured."
)
return
reason = "not set" if not admin_key else f"too short ({len(admin_key)} chars, minimum 32)"
new_key = secrets.token_hex(32) # 64-char hex string
if len(admin_key) < 32:
reason = f"too short ({len(admin_key)} chars, minimum 32)"
try:
from routers.ai_intel import _write_env_value
_write_env_value("ADMIN_KEY", new_key)
os.environ["ADMIN_KEY"] = new_key
logger.info(
"ADMIN_KEY was %s — auto-generated a strong 64-character key and "
"saved it to .env. Admin/mesh endpoints are now secured.",
reason,
)
# Clear settings cache so the rest of startup picks up the new key
try:
get_settings.cache_clear()
except Exception:
pass
except Exception as exc:
debug_mode = bool(getattr(get_settings(), "MESH_DEBUG_MODE", False))
except Exception:
debug_mode = False
if debug_mode:
logger.warning(
"ADMIN_KEY is %s and could not auto-generate: %s. "
"Admin/mesh endpoints may be unavailable.",
"ADMIN_KEY is %s. Debug mode is enabled, so startup will continue, "
"but production deployments must use a 32+ character key.",
reason,
exc,
)
return
logger.error(
"ADMIN_KEY is %s. Refusing to start because auto-generating a backend-only "
"replacement would desynchronize the frontend and backend containers.",
reason,
)
raise SystemExit(1)
def _validate_insecure_admin_startup() -> None:
@@ -744,8 +749,7 @@ def _is_debug_test_request(request: Request) -> bool:
if not _debug_mode_enabled():
return False
client_host = (request.client.host or "").lower() if request.client else ""
url_host = (request.url.hostname or "").lower() if request.url else ""
return client_host == "test" or url_host == "test"
return client_host == "test"
# ---------------------------------------------------------------------------
@@ -858,7 +862,9 @@ _ROUTE_TRANSPORT_POLICY: dict[tuple[str, str], RouteTransportPolicy] = {
("POST", "/api/wormhole/gate/messages/decrypt"): _local_only_route_policy("private_control_only"),
# ── Wormhole DM (strong) ──────────────────────────────────────────
("POST", "/api/wormhole/dm/compose"): _local_only_route_policy("private_control_only"),
("POST", "/api/wormhole/dm/connect-contact"): _local_only_route_policy("private_control_only"),
("POST", "/api/wormhole/dm/decrypt"): _local_only_route_policy("private_control_only"),
("POST", "/api/wormhole/dm/mls-key-package"): _local_only_route_policy("private_control_only"),
("POST", "/api/wormhole/dm/register-key"): _local_only_route_policy("private_control_only"),
("POST", "/api/wormhole/dm/prekey/register"): _local_only_route_policy("private_control_only"),
("POST", "/api/wormhole/dm/bootstrap-encrypt"): _local_only_route_policy("private_control_only"),
@@ -1397,10 +1403,28 @@ def _peer_hmac_url_from_request(request: Request) -> str:
header_url = normalize_peer_url(str(request.headers.get("x-peer-url", "") or ""))
if header_url:
return header_url
if not request.url:
return ""
base_url = f"{request.url.scheme}://{request.url.netloc}".rstrip("/")
return normalize_peer_url(base_url)
return ""
def _verify_peer_transport_hmac(request: Request, body_bytes: bytes) -> bool:
"""Verify HMAC-SHA256 peer authentication without an allowlist check."""
provided = str(request.headers.get("x-peer-hmac", "") or "").strip()
if not provided:
return False
peer_url = _peer_hmac_url_from_request(request)
if not peer_url:
return False
peer_key = resolve_peer_key_for_url(peer_url)
if not peer_key:
return False
expected = _hmac_mod.new(
peer_key,
body_bytes,
_hashlib_mod.sha256,
).hexdigest()
return _hmac_mod.compare_digest(provided.lower(), expected.lower())
def _verify_peer_push_hmac(request: Request, body_bytes: bytes) -> bool:
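Note on _verify_peer_transport_hmac above: the check is an HMAC-SHA256 over the raw request body, compared in constant time against the x-peer-hmac header. A minimal standalone sketch of the same pattern follows. The helper names sign_body and verify_body are hypothetical and the key lookup is replaced by a plain argument, so this is not the project's actual API.

```python
import hashlib
import hmac


def sign_body(peer_key: bytes, body: bytes) -> str:
    """Hex HMAC-SHA256 of the raw body, as a sending peer would compute it."""
    return hmac.new(peer_key, body, hashlib.sha256).hexdigest()


def verify_body(peer_key: bytes, body: bytes, provided: str) -> bool:
    """Fail closed on a missing key or signature, then compare in constant time."""
    provided = (provided or "").strip()
    if not peer_key or not provided:
        return False
    expected = sign_body(peer_key, body)
    # Compare as bytes so a non-ASCII header value cannot raise TypeError.
    return hmac.compare_digest(
        provided.lower().encode("utf-8"),
        expected.encode("ascii"),
    )


if __name__ == "__main__":
    key = b"example-shared-peer-key"  # placeholder, not a real secret
    body = b'{"event": "ping"}'
    signature = sign_body(key, body)
    print(verify_body(key, body, signature))         # signature matches
    print(verify_body(key, body + b" ", signature))  # body was altered
```

Signing the exact bytes received, before any JSON parsing or re-serialization, keeps sender and receiver in agreement about what was authenticated.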
+2 -2
@@ -7,7 +7,7 @@
},
{
"name": "BBC",
"url": "http://feeds.bbci.co.uk/news/world/rss.xml",
"url": "https://feeds.bbci.co.uk/news/world/rss.xml",
"weight": 3
},
{
@@ -47,7 +47,7 @@
},
{
"name": "Xinhua",
"url": "http://www.news.cn/english/rss/worldrss.xml",
"url": "https://www.news.cn/english/rss/worldrss.xml",
"weight": 2
},
{
+3
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:72b69418aa860a0d92ccae398a08722bc85e64a992b5515dd7bf9ae9f79f2fd1
size 107194128
+20
@@ -36,5 +36,25 @@
"ShadowBroker_v0.9.79.zip": "f6877c1d66614525315ea82636ce9f7b41178332c4dbf90d27431a1ea1d9cd47",
"ShadowBroker_0.9.79_x64-setup.exe": "f7b676ada45cac7da05868b0a353678c9ee700e3abcf456a7c0c038c36da446f",
"ShadowBroker_0.9.79_x64_en-US.msi": "e0713c3cdda184cfbea750bfac0d62a35678fec00847e6476f2cac8e7e42046e"
},
"v0.9.8": {
"ShadowBroker_v0.9.8.zip": "183bb5cd62b9b9349d95df5ef7696cb6ca810ab4b991fa9dab6f898af4c7a175",
"ShadowBroker_0.9.8_x64-setup.exe": "94a0309862e9c81c92cdcbfea8eec9dbb97eef19ded82b26217b397defbc810c",
"ShadowBroker_0.9.8_x64_en-US.msi": "fe22f9d51e4360d74c18a7250c2fbb9ed4fa4c7a884b3ac0d04a21115466386b"
},
"v0.9.81": {
"ShadowBroker_v0.9.81.zip": "f81f454bdc88e9a32c351df38212b8cfa624704d65764b971bb091eef62259c6",
"ShadowBroker_0.9.81_x64-setup.exe": "25e9a95d0d8ce959a7d08fe8e7406772ae24b596652793e81d1de5d02510a5a6",
"ShadowBroker_0.9.81_x64_en-US.msi": "34e655fc0c0f195ee4ac978f228a4b2b9d5565253b8771aca9ef4693409e9e70"
},
"v0.9.82": {
"ShadowBroker_v0.9.82.zip": "202ab043465741dcc06de57c19ec8314904332f8e818b891d7174655719d084c",
"ShadowBroker_0.9.82_x64-setup.exe": "0eb9f2bda02ab691b39687641abc97e6bfb507b42f48de21970ad7dfb4ea15fc",
"ShadowBroker_0.9.82_x64_en-US.msi": "ced08f930171c0c08009a958cc30b0171a09f982230fc217c6808c2ed7ab2e30"
},
"v0.9.83": {
"ShadowBroker_v0.9.83.zip": "53f56631731ad3cdc7be68df09bedd6570ed91ecda6fa57c39651098e15666c7",
"ShadowBroker_0.9.83_x64-setup.exe": "d62170af4b9df0b190832b7bb3ad6bfe8a7ac01472f2c7b39cf2a1b61edc7492",
"ShadowBroker_0.9.83_x64_en-US.msi": "b664cc0003a29f7ce88b04c2b425643dbe7ed897342fc6e9a2378bc1910c6850"
}
}
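Note on the release digest pins above: each version maps asset file names to SHA-256 hex digests. The sketch below shows how a downloaded asset could be checked against such a map. The function names and the fail-closed behavior for unknown assets are assumptions for illustration, not the updater's confirmed implementation.

```python
import hashlib
import hmac
from pathlib import Path
from typing import Dict


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file in chunks so large installers are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_release_asset(path: Path, version: str, pins: Dict[str, Dict[str, str]]) -> bool:
    """Return True only if the file's digest matches the pin for this version and name.

    ASSUMPTION: an asset with no pinned digest is rejected (fail closed).
    """
    expected = pins.get(version, {}).get(path.name)
    if not expected:
        return False
    return hmac.compare_digest(sha256_file(path), expected.strip().lower())
```

A caller would load the JSON pin file with json.load and pass the resulting dict as pins, together with the version tag (for example "v0.9.83") and the path of the downloaded asset.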
+941 -565
File diff suppressed because it is too large
+18 -6
@@ -7,15 +7,15 @@ py-modules = []
[project]
name = "backend"
version = "0.9.79"
version = "0.9.83"
requires-python = ">=3.10"
dependencies = [
"apscheduler==3.10.3",
"beautifulsoup4>=4.9.0",
"cachetools==5.5.2",
"cryptography>=41.0.0",
"cryptography>=46.0.7",
"defusedxml>=0.7.1",
"fastapi==0.115.12",
"fastapi==0.136.3",
"feedparser==6.0.10",
"httpx==0.28.1",
"playwright==1.59.0",
@@ -24,7 +24,7 @@ dependencies = [
"pydantic-settings==2.8.1",
"pystac-client==0.8.6",
"python-dotenv==1.2.2",
"requests==2.31.0",
"requests==2.33.0",
"PySocks==1.7.1",
"reverse-geocoder==1.5.1",
"sgp4==2.25",
@@ -33,17 +33,29 @@ dependencies = [
"paho-mqtt>=1.6.0,<2.0.0",
"PyNaCl>=1.5.0",
"slowapi==0.1.9",
"starlette==1.0.1",
"vaderSentiment>=3.3.0",
"uvicorn==0.34.0",
"yfinance==1.3.0",
]
[project.optional-dependencies]
road-corridor = [
"geopandas>=1.0.0",
"imageio>=2.34.0",
"osmnx>=2.0.0",
"rasterio>=1.4.0",
"scikit-learn>=1.5.0",
"sentinelhub>=3.10.0",
"shapely>=2.0.0",
]
[dependency-groups]
dev = ["pytest>=8.3.4", "pytest-asyncio==0.25.0", "ruff>=0.9.0", "black>=24.0.0"]
dev = ["pytest>=9.0.3", "pytest-asyncio>=1.4.0", "ruff>=0.9.0", "black>=24.0.0"]
[tool.ruff.lint]
# The current backend carries historical style debt in large legacy modules.
# Keep CI focused on actionable correctness checks for the v0.9.79 release.
# Keep CI focused on actionable correctness checks for the v0.9.82 release.
ignore = ["E401", "E402", "E701", "E731", "E741", "F401", "F402", "F541", "F811", "F841"]
[tool.black]
+230
@@ -0,0 +1,230 @@
"""Local-operator PTY WebSocket for the Mesh Chat agent shell."""
from __future__ import annotations
import asyncio
import fcntl
import hmac
import json
import logging
import os
import pty
import select
import signal
import struct
import sys
import termios
from typing import Any
from fastapi import APIRouter, Depends, HTTPException, Query, WebSocket, WebSocketDisconnect
from pydantic import BaseModel, Field
from auth import (
_current_admin_key,
_debug_mode_enabled,
_is_trusted_local_runtime_host,
require_local_operator,
)
from services.agent_shell_settings import (
get_agent_shell_settings,
set_agent_shell_working_directory,
)
logger = logging.getLogger(__name__)
router = APIRouter(tags=["agent-shell"])
class AgentShellSettingsUpdate(BaseModel):
working_directory: str = Field(min_length=1)
def _set_winsize(fd: int, rows: int, cols: int) -> None:
winsize = struct.pack("HHHH", rows, cols, 0, 0)
fcntl.ioctl(fd, termios.TIOCSWINSZ, winsize)
def _published_local_dashboard_ws(ws: WebSocket) -> bool:
"""Browser → published Docker port appears as a bridge IP, not loopback.
For the operator shell only, also accept when the upgrade request clearly
targets the local dashboard (Host/Origin on localhost).
"""
host_header = str(ws.headers.get("host") or "").strip().lower()
host_name = host_header.split(":", 1)[0]
if host_name in {"127.0.0.1", "localhost", "::1"}:
return True
origin = str(ws.headers.get("origin") or "").strip().lower()
if origin.startswith("http://127.0.0.1:") or origin.startswith("http://localhost:"):
return True
if origin.startswith("https://127.0.0.1:") or origin.startswith("https://localhost:"):
return True
return False
async def _authorize_agent_shell_ws(ws: WebSocket, admin_key_query: str = "") -> None:
host = (ws.client.host or "").lower() if ws.client else ""
if (
_is_trusted_local_runtime_host(host)
or _published_local_dashboard_ws(ws)
or (_debug_mode_enabled() and host == "test")
):
return
admin_key = _current_admin_key()
presented = str(admin_key_query or ws.headers.get("x-admin-key", "") or "").strip()
if admin_key and presented and hmac.compare_digest(presented.encode(), admin_key.encode()):
return
await ws.close(code=4403, reason="local operator access only")
raise WebSocketDisconnect()
def _resolve_shell_cwd(requested: str) -> str:
requested = str(requested or "").strip()
if requested:
resolved = os.path.abspath(os.path.expanduser(requested))
if os.path.isdir(resolved):
return resolved
return get_agent_shell_settings()["working_directory"]
def _default_shell() -> str:
if sys.platform == "win32":
return os.environ.get("COMSPEC", "cmd.exe")
return os.environ.get("SHELL", "/bin/bash")
async def _relay_pty(master_fd: int, proc: asyncio.subprocess.Process, ws: WebSocket) -> None:
loop = asyncio.get_running_loop()
while True:
if proc.returncode is not None:
break
try:
readable, _, _ = await loop.run_in_executor(
None, lambda: select.select([master_fd], [], [], 0.05)
)
except Exception:
break
if master_fd in readable:
try:
chunk = os.read(master_fd, 4096)
except OSError:
break
if not chunk:
break
await ws.send_bytes(chunk)
try:
message = await asyncio.wait_for(ws.receive(), timeout=0.05)
except asyncio.TimeoutError:
continue
if message.get("type") == "websocket.disconnect":
break
if message.get("type") != "websocket.receive":
continue
if message.get("bytes"):
os.write(master_fd, message["bytes"])
continue
text = message.get("text")
if not text:
continue
try:
payload = json.loads(text)
except json.JSONDecodeError:
os.write(master_fd, text.encode("utf-8", errors="replace"))
continue
if payload.get("type") == "resize":
rows = int(payload.get("rows") or 24)
cols = int(payload.get("cols") or 80)
_set_winsize(master_fd, max(rows, 2), max(cols, 2))
@router.get("/api/agent-shell/settings", dependencies=[Depends(require_local_operator)])
async def read_agent_shell_settings() -> dict[str, Any]:
return get_agent_shell_settings()
@router.put("/api/agent-shell/settings", dependencies=[Depends(require_local_operator)])
async def write_agent_shell_settings(body: AgentShellSettingsUpdate) -> dict[str, Any]:
try:
return set_agent_shell_working_directory(body.working_directory)
except ValueError as exc:
detail = str(exc)
if detail == "working_directory_not_found":
raise HTTPException(status_code=400, detail="Working directory does not exist") from exc
raise HTTPException(status_code=400, detail="Working directory is required") from exc
@router.websocket("/api/agent-shell/ws")
async def agent_shell_websocket(
ws: WebSocket,
cwd: str = Query(default=""),
cols: int = Query(default=80),
rows: int = Query(default=24),
admin_key: str = Query(default=""),
) -> None:
await ws.accept()
try:
await _authorize_agent_shell_ws(ws, admin_key)
except WebSocketDisconnect:
return
if sys.platform == "win32":
await ws.send_text(
json.dumps(
{
"type": "error",
"message": "Host PTY is not available on Windows backend builds yet. Use the ShadowBroker desktop app or run the backend in Docker/Linux for an embedded shell.",
}
)
)
await ws.close(code=1011)
return
shell_cwd = _resolve_shell_cwd(cwd)
shell = _default_shell()
master_fd, slave_fd = pty.openpty()
_set_winsize(master_fd, max(rows, 2), max(cols, 2))
env = os.environ.copy()
env.setdefault("TERM", "xterm-256color")
env.setdefault("COLORTERM", "truecolor")
home = shell_cwd if os.path.isdir(shell_cwd) else "/app"
env["HOME"] = home
env["USER"] = env.get("USER") or "operator"
path_prefixes = [
os.path.join(home, ".local", "bin"),
os.path.join(home, ".hermes", "bin"),
]
path = env.get("PATH", "")
for prefix in path_prefixes:
if os.path.isdir(prefix):
path = f"{prefix}:{path}" if path else prefix
env["PATH"] = path
proc = await asyncio.create_subprocess_exec(
shell,
stdin=slave_fd,
stdout=slave_fd,
stderr=slave_fd,
cwd=shell_cwd,
env=env,
preexec_fn=os.setsid,
)
os.close(slave_fd)
try:
await _relay_pty(master_fd, proc, ws)
finally:
try:
os.close(master_fd)
except OSError:
pass
if proc.returncode is None:
try:
os.killpg(proc.pid, signal.SIGHUP)
except ProcessLookupError:
pass
try:
await asyncio.wait_for(proc.wait(), timeout=2.0)
except asyncio.TimeoutError:
proc.kill()
await proc.wait()
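Note on the resize handling in _relay_pty above: text frames that parse as JSON with type "resize" update the PTY window size, and anything else is written through as input. The sketch below isolates that parsing and the winsize packing as pure functions so they can be exercised without a PTY. The names parse_resize and pack_winsize are hypothetical, and the extra guards for non-object JSON and non-numeric sizes are assumptions about desirable behavior rather than what the module does.

```python
import json
import struct
from typing import Optional, Tuple


def parse_resize(text: str, min_rows: int = 2, min_cols: int = 2) -> Optional[Tuple[int, int]]:
    """Return (rows, cols) for a resize frame, or None if the text is not one."""
    try:
        payload = json.loads(text)
    except json.JSONDecodeError:
        return None  # plain keystrokes, not a control message
    if not isinstance(payload, dict) or payload.get("type") != "resize":
        return None
    try:
        rows = int(payload.get("rows") or 24)
        cols = int(payload.get("cols") or 80)
    except (TypeError, ValueError):
        return None
    return max(rows, min_rows), max(cols, min_cols)


def pack_winsize(rows: int, cols: int) -> bytes:
    """Pack rows/cols as four unsigned shorts, the layout passed to TIOCSWINSZ."""
    # Clamp to the unsigned short range so struct.pack cannot raise on bad input.
    rows = min(max(rows, 0), 0xFFFF)
    cols = min(max(cols, 0), 0xFFFF)
    return struct.pack("HHHH", rows, cols, 0, 0)
```

On a POSIX host the packed bytes would be handed to fcntl.ioctl(fd, termios.TIOCSWINSZ, ...), as _set_winsize does in the module above.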
+93 -8
@@ -1590,7 +1590,7 @@ async def agent_tool_manifest(request: Request):
return {
"ok": True,
"version": "0.9.79",
"version": "0.9.82",
"access_tier": access_tier,
"available_commands": available_commands,
"transport": {
@@ -1705,11 +1705,12 @@ async def agent_tool_manifest(request: Request):
{
"name": "search_news",
"type": "read",
"description": "Search news and event layers server-side by keyword. Includes news, GDELT, CrowdThreat, and major incident/event feeds without pulling the full slow telemetry feed.",
"description": "Search news and event layers server-side by keyword. Includes news, GDELT, CrowdThreat, Telegram OSINT, and major incident/event feeds without pulling the full slow telemetry feed.",
"parameters": {
"query": {"type": "string", "required": True, "description": "Keyword or phrase to search for"},
"limit": {"type": "integer", "required": False, "description": "Max results (default 10, max 50)"},
"include_gdelt": {"type": "boolean", "required": False, "description": "Include GDELT matches (default true)"},
"include_telegram": {"type": "boolean", "required": False, "description": "Include Telegram OSINT channel posts (default true)"},
"compact": {"type": "boolean", "required": False, "description": "If true, strips empty/None fields from each result and rounds lat/lng to 3 decimals. Response includes format: 'compressed_v1'."},
},
"returns": "{results: [{source_layer, title, summary, source, link, lat, lng, risk_score}], version: int, truncated: bool}",
@@ -1743,6 +1744,55 @@ async def agent_tool_manifest(request: Request):
},
"returns": "{center, radius_km, nearby, topic_news, context_layers}",
},
{
"name": "osint_lookup",
"type": "read",
"description": "Run a passive OSINT recon lookup server-side (same backends as the Recon panel). SSRF-guarded outbound proxies for IP geolocation, DNS, WHOIS, certs, BGP/ASN, sanctions, CVE, MAC vendor, GitHub profile, breach checks, and threat feeds.",
"parameters": {
"tool": {"type": "string", "required": True, "description": "Lookup type: ip, dns, whois, certs, threats, bgp, sanctions, cve, mac, github, leaks, sweep_init"},
"ip": {"type": "string", "required": False, "description": "IPv4/IPv6 for ip or sweep_init"},
"domain": {"type": "string", "required": False, "description": "Domain for dns, whois, certs"},
"query": {"type": "string", "required": False, "description": "Generic query (BGP ASN, sanctions name, optional threats filter)"},
"cve": {"type": "string", "required": False, "description": "CVE id for cve lookup"},
"mac": {"type": "string", "required": False, "description": "MAC address for mac lookup"},
"username": {"type": "string", "required": False, "description": "GitHub username"},
"email": {"type": "string", "required": False, "description": "Email for breach/leak lookup"},
"schema": {"type": "string", "required": False, "description": "Sanctions schema filter: Person, Organization, Company, Vessel, Airplane, LegalEntity"},
"limit": {"type": "integer", "required": False, "description": "Sanctions result cap (default 25, max 100)"},
"cidr": {"type": "integer", "required": False, "description": "CIDR mask for sweep_init (24-32, default 24)"},
},
"returns": "Tool-specific JSON (geo, DNS records, WHOIS, sanctions hits, CVE details, etc.)",
},
{
"name": "osint_tools",
"type": "read",
"description": "List available OSINT recon tools, entity-expand types, and sanctions schemas.",
"parameters": {},
"returns": "{tools: [...], entity_types: [...], sanctions_schemas: [...], notes: {...}}",
},
{
"name": "entity_expand",
"type": "read",
"description": "Expand an entity relationship graph around an aircraft, vessel, IP, company, person, or country. Same backend as /api/entity/expand.",
"parameters": {
"type": {"type": "string", "required": True, "description": "Entity type: aircraft, vessel, company, person, ip, country"},
"id": {"type": "string", "required": True, "description": "Entity identifier (tail number, MMSI, IP, company name, etc.)"},
"registration": {"type": "string", "required": False, "description": "Aircraft registration hint"},
"model": {"type": "string", "required": False, "description": "Aircraft model hint"},
"icao24": {"type": "string", "required": False, "description": "ICAO24 hex for aircraft"},
},
"returns": "{nodes: [...], links: [...]}",
},
{
"name": "osint_sweep",
"type": "write",
"description": "Active subnet device discovery via Shodan InternetDB (ports, vulns, hostnames). Requires full OpenClaw access tier. Private/reserved IPs blocked.",
"parameters": {
"ip": {"type": "string", "required": True, "description": "Public IPv4 anchor for the sweep"},
"cidr": {"type": "integer", "required": False, "description": "Subnet size /24-/32 (default 24)"},
},
"returns": "{center, target_ip, cidr, subnet, devices, summary, sweep_time_ms}",
},
{
"name": "what_changed",
"type": "read",
@@ -2194,6 +2244,11 @@ async def agent_tool_manifest(request: Request):
"Prefer compact lookups first: search_telemetry, find_flights, find_ships, search_news, entities_near, get_layer_slice. Use get_telemetry/get_slow_telemetry/get_report only when focused commands are insufficient.",
"ShadowBroker does expose UAP sightings, wastewater, and tracked_flights/VIP aircraft when those layers are populated. Verify with get_summary or get_layer_slice before claiming a layer is absent.",
"ShadowBroker also exposes fishing_activity, which is the fishing-vessel activity layer backed by Global Fishing Watch data when GFW_API_TOKEN is configured. Do not confuse it with the AIS ships layer.",
"telegram_osint, malware_threats, cyber_threats, and scm_suppliers are live map layers. Use get_summary or get_layer_slice(['telegram_osint']) before claiming they are absent. Aliases: telegram, malware/botnet, cyber/cisa/kev, scm/suppliers.",
"search_telemetry and search_news both index Telegram OSINT posts. For malware C2, botnet IPs, CISA KEV CVEs, or semiconductor suppliers, use search_telemetry or get_layer_slice on the matching layer.",
"The Recon toolkit is available via osint_lookup: IP geolocation, DNS, WHOIS, certs, BGP, sanctions, CVE, MAC vendor, GitHub, breach checks, threat feeds. Call osint_tools first to list supported tools.",
"entity_expand builds relationship graphs for aircraft, vessels, IPs, companies, people, and countries — use after resolving an entity from telemetry or osint_lookup.",
"osint_sweep runs active subnet discovery (Shodan InternetDB) and requires full OpenClaw access tier. Use osint_lookup tool=sweep_init for passive geolocation context only.",
"Use search_telemetry as the Google-style entry point whenever the user gives you a person, place, company, topic, owner, nickname, or natural-language phrase and you do not already know the source layer.",
"Example: for 'Where is Jerry Jones yacht?' search 'Jerry Jones' across all telemetry first, identify the ship match, then refine with find_ships or raw layer context only if needed.",
"For fuzzy natural-language lookups like 'Patriots jet' or 'Jerry Jones yacht', use search_telemetry first and inspect the ranked candidate list before making a hard claim.",
@@ -2221,12 +2276,14 @@ async def agent_tool_manifest(request: Request):
async def api_capabilities(request: Request):
"""Return full API manifest so the agent knows every available endpoint."""
from services.openclaw_channel import READ_COMMANDS, WRITE_COMMANDS, detect_tier
from services.openclaw_routing import routing_manifest
from services.config import get_settings
tier = detect_tier()
access_tier = str(get_settings().OPENCLAW_ACCESS_TIER or "restricted").strip().lower()
return {
"ok": True,
"version": "0.9.79",
"version": "0.9.82",
"routing": routing_manifest(),
"auth": {
"method": "HMAC-SHA256",
"headers": ["X-SB-Timestamp", "X-SB-Nonce", "X-SB-Signature"],
@@ -2342,8 +2399,16 @@ async def api_capabilities(request: Request):
"description": "Compact server-side ship search by MMSI/IMO/name/query, including yacht-owner enrichment.",
},
"find_entity": {
"args": {"query": "str (optional)", "entity_type": "aircraft|ship|person|event|infrastructure (optional)", "callsign": "str (optional)", "registration": "str (optional)", "icao24": "str (optional)", "mmsi": "str (optional)", "imo": "str (optional)", "name": "str (optional)", "owner": "str (optional)", "layers": "list[str] (optional)", "limit": "int (default 10)"},
"description": "Exact-first resolver for planes, ships, operators, callsigns, registrations, MMSI/IMO, and named entities. Use before tracking to avoid fuzzy prompt matching.",
"args": {"query": "str (optional)", "entity_type": "aircraft|ship|person|event|infrastructure (optional)", "callsign": "str (optional)", "registration": "str (optional)", "icao24": "str (optional)", "mmsi": "str (optional)", "imo": "str (optional)", "name": "str (optional)", "owner": "str (optional)", "layers": "list[str] (optional)", "limit": "int (default 10)", "fallback_search": "bool (default false)", "confirm_fuzzy": "bool (alias for fallback_search)"},
"description": "Exact-first resolver for planes, ships, operators, callsigns, registrations, MMSI/IMO, and named entities. Skips fuzzy search unless fallback_search=true or no exact match.",
},
"route_query": {
"args": {"text": "str", "lat": "float (optional)", "lng": "float (optional)", "radius_km": "float (default 50)", "compact": "bool (default true)"},
"description": "Deterministic intent router — returns recommended fast command, alternates, and latency estimate. Preferred entry for natural-language reads.",
},
"run_playbook": {
"args": {"name": "str", "query": "str (optional)", "lat": "float (optional)", "lng": "float (optional)"},
"description": "Execute a named batch plan (hot_snapshot, morning_brief, monitor_heartbeat, track_snapshot, area_brief, entity_recon).",
},
"correlate_entity": {
"args": {"query": "str (optional)", "entity_type": "str (optional)", "callsign": "str (optional)", "registration": "str (optional)", "icao24": "str (optional)", "mmsi": "str (optional)", "imo": "str (optional)", "name": "str (optional)", "owner": "str (optional)", "radius_km": "float (default 100)", "limit": "int (default 10)"},
@@ -2354,13 +2419,29 @@ async def api_capabilities(request: Request):
"description": "Universal compact search across telemetry when the entity type or source layer is not obvious.",
},
"search_news": {
"args": {"query": "str", "limit": "int (default 10)", "include_gdelt": "bool (default true)"},
"description": "Search news and event layers by keyword without pulling the whole slow feed.",
"args": {"query": "str", "limit": "int (default 10)", "include_gdelt": "bool (default true)", "include_telegram": "bool (default true)"},
"description": "Search news and event layers by keyword without pulling the whole slow feed. Includes Telegram OSINT when include_telegram is true.",
},
"entities_near": {
"args": {"lat": "float", "lng": "float", "radius_km": "float (default 50)", "entity_types": "list[str] (optional)", "limit": "int (default 25)"},
"description": "Compact proximity search around a point across selected layers.",
},
"osint_lookup": {
"args": {"tool": "str (ip|dns|whois|certs|threats|bgp|sanctions|cve|mac|github|leaks|sweep_init)", "...": "tool-specific params"},
"description": "Passive OSINT recon lookup — same backends as the Recon panel.",
},
"osint_tools": {
"args": {},
"description": "List available recon tools and entity-expand types.",
},
"entity_expand": {
"args": {"type": "str", "id": "str", "registration": "str (optional)", "icao24": "str (optional)"},
"description": "Entity relationship graph expansion.",
},
"osint_sweep": {
"args": {"ip": "str", "cidr": "int (default 24)"},
"description": "Active subnet scan — requires full access tier.",
},
"brief_area": {
"args": {"lat": "float", "lng": "float", "radius_km": "float (default 50)", "entity_types": "list[str] (optional)", "query": "str (optional)", "limit": "int (default 25)", "context_limit": "int (default 10)"},
"description": "One compact area brief: nearby aircraft/ships/entities, optional topic news, and selected context layers.",
@@ -2507,7 +2588,8 @@ async def api_capabilities(request: Request):
"layers are serialized, unchanged layers transfer zero bytes. The client tracks versions "
"automatically from SSE events and previous responses. "
"3) Pass compact=true on every read command for compressed_v1 responses (~60-90% smaller). "
"4) Use targeted commands first (find_flights, search_telemetry, entities_near). "
"4) Use route_query / find_entity / run_playbook before search_telemetry. "
"Expensive commands require confirm_expensive=true. "
"Reserve get_telemetry/get_slow_telemetry for rare full-context pulls.",
"pins": "Pins are server-side, NOT localStorage. Use place_pin command or POST /api/ai/pins. The agent can place and delete pins.",
"tracking": "To track a specific aircraft without polling: use add_watch with track_callsign or track_registration. Over SSE, you'll get instant push alerts.",
@@ -2637,6 +2719,7 @@ def _connect_info_metadata(settings) -> dict:
"get_telemetry", "get_pins", "satellite_images",
"news_near", "ai_summary", "ai_report",
"timemachine_list", "timemachine_view",
"infonet_status", "list_gates", "read_gate_messages", "poll_dms",
],
},
"full": {
@@ -2647,6 +2730,8 @@ def _connect_info_metadata(settings) -> dict:
"satellite_images", "news_near", "data_injection",
"ai_summary", "ai_report", "timemachine_snapshot",
"timemachine_list", "timemachine_view", "timemachine_diff",
"ensure_infonet_ready", "join_infonet_swarm",
"post_gate_message", "cast_vote", "send_dm",
],
},
},
+36 -2
@@ -47,6 +47,8 @@ _CCTV_PROXY_ALLOWED_HOSTS = {
"www.tripcheck.com",
"infocar.dgt.es",
"informo.madrid.es",
"webcams2.asfinag.at",
"odo.asfinag.at",
"www.windy.com",
"imgproxy.windy.com",
"www.lakecountypassage.com",
@@ -55,6 +57,14 @@ _CCTV_PROXY_ALLOWED_HOSTS = {
"www.nps.gov",
"home.lewiscounty.com",
"www.seattle.gov",
"511on.ca",
"511.alberta.ca",
"fl511.com",
"www.fl511.com",
"webcams.transport.nsw.gov.au",
"www.livetraffic.com",
"livetraffic.com",
"opendata.ndw.nu",
}
@@ -120,7 +130,7 @@ def _cctv_proxy_profile_for_url(target_url: str) -> _CCTVProxyProfile:
read_timeout = 18.0 if "/snapshots/" in path else 12.0
return _CCTVProxyProfile(name="gdot-snapshot", timeout=(_CCTV_PROXY_CONNECT_TIMEOUT_S, read_timeout), cache_seconds=15,
headers={"Accept": "image/avif,image/webp,image/apng,image/*,*/*;q=0.8",
"Referer": "http://navigator-c2c.dot.ga.gov/"})
"Referer": "https://navigator-c2c.dot.ga.gov/"})
if host == "511ga.org":
return _CCTVProxyProfile(name="gdot-511ga-image", timeout=(_CCTV_PROXY_CONNECT_TIMEOUT_S, 12.0), cache_seconds=15,
headers={"Accept": "image/avif,image/webp,image/apng,image/*,*/*;q=0.8",
@@ -128,7 +138,7 @@ def _cctv_proxy_profile_for_url(target_url: str) -> _CCTVProxyProfile:
if host.startswith("vss") and host.endswith("dot.ga.gov"):
return _CCTVProxyProfile(name="gdot-hls", timeout=(_CCTV_PROXY_CONNECT_TIMEOUT_S, 20.0), cache_seconds=10,
headers={"Accept": "application/vnd.apple.mpegurl,application/x-mpegURL,video/*,*/*;q=0.8",
"Referer": "http://navigator-c2c.dot.ga.gov/"})
"Referer": "https://navigator-c2c.dot.ga.gov/"})
if host in {"gettingaroundillinois.com", "cctv.travelmidwest.com"}:
return _CCTVProxyProfile(name="illinois-dot", timeout=(_CCTV_PROXY_CONNECT_TIMEOUT_S, 12.0), cache_seconds=30,
headers={"Accept": "image/avif,image/webp,image/apng,image/*,*/*;q=0.8"})
@@ -156,10 +166,34 @@ def _cctv_proxy_profile_for_url(target_url: str) -> _CCTVProxyProfile:
return _CCTVProxyProfile(name="madrid-city", timeout=(_CCTV_PROXY_CONNECT_TIMEOUT_S, 12.0), cache_seconds=30,
headers={"Accept": "image/avif,image/webp,image/apng,image/*,*/*;q=0.8",
"Referer": "https://informo.madrid.es/"})
if host in {"webcams2.asfinag.at", "odo.asfinag.at"}:
return _CCTVProxyProfile(name="asfinag-austria", timeout=(_CCTV_PROXY_CONNECT_TIMEOUT_S, 15.0), cache_seconds=60,
headers={"Accept": "image/avif,image/webp,image/apng,image/*,*/*;q=0.8",
"Referer": "https://www.asfinag.at/"})
if host in {"www.windy.com", "imgproxy.windy.com"}:
return _CCTVProxyProfile(name="windy-webcams", timeout=(_CCTV_PROXY_CONNECT_TIMEOUT_S, 12.0), cache_seconds=60,
headers={"Accept": "image/avif,image/webp,image/apng,image/*,*/*;q=0.8",
"Referer": "https://www.windy.com/"})
if host == "511on.ca":
return _CCTVProxyProfile(name="ontario-511", timeout=(_CCTV_PROXY_CONNECT_TIMEOUT_S, 15.0), cache_seconds=30,
headers={"Accept": "image/avif,image/webp,image/apng,image/*,*/*;q=0.8",
"Referer": "https://511on.ca/"})
if host == "511.alberta.ca":
return _CCTVProxyProfile(name="alberta-511", timeout=(_CCTV_PROXY_CONNECT_TIMEOUT_S, 15.0), cache_seconds=30,
headers={"Accept": "image/avif,image/webp,image/apng,image/*,*/*;q=0.8",
"Referer": "https://511.alberta.ca/"})
if host in {"fl511.com", "www.fl511.com"}:
return _CCTVProxyProfile(name="florida-511", timeout=(_CCTV_PROXY_CONNECT_TIMEOUT_S, 15.0), cache_seconds=30,
headers={"Accept": "image/avif,image/webp,image/apng,image/*,*/*;q=0.8",
"Referer": "https://fl511.com/"})
if host == "webcams.transport.nsw.gov.au":
return _CCTVProxyProfile(name="nsw-live-traffic", timeout=(_CCTV_PROXY_CONNECT_TIMEOUT_S, 12.0), cache_seconds=60,
headers={"Accept": "image/avif,image/webp,image/apng,image/*,*/*;q=0.8",
"Referer": "https://www.livetraffic.com/"})
if host in {"opendata.ndw.nu", "www.ndw.nu"}:
return _CCTVProxyProfile(name="ndw-netherlands", timeout=(_CCTV_PROXY_CONNECT_TIMEOUT_S, 12.0), cache_seconds=120,
headers={"Accept": "image/avif,image/webp,image/apng,image/*,*/*;q=0.8",
"Referer": "https://www.ndw.nu/"})
return _CCTVProxyProfile(name="generic-cctv", timeout=(_CCTV_PROXY_CONNECT_TIMEOUT_S, 8.0), cache_seconds=30,
headers={"Accept": "*/*"})
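Note: the per-host branches above all follow the same shape (profile name, read timeout, cache lifetime, Referer). As a rough sketch only, the same dispatch could be expressed as a lookup table. `ProxyProfile`, `_PROFILES`, and `profile_for_host` are hypothetical names for illustration and are not part of this change; the timeout and cache values are copied from the hunk above.

```python
from typing import NamedTuple, Optional


class ProxyProfile(NamedTuple):
    """Hypothetical stand-in for _CCTVProxyProfile (illustration only)."""
    name: str
    read_timeout: float
    cache_seconds: int
    referer: Optional[str] = None


_FL511 = ProxyProfile("florida-511", 15.0, 30, "https://fl511.com/")

# Host -> profile table; values mirror the branches added in this diff.
_PROFILES = {
    "511on.ca": ProxyProfile("ontario-511", 15.0, 30, "https://511on.ca/"),
    "511.alberta.ca": ProxyProfile("alberta-511", 15.0, 30, "https://511.alberta.ca/"),
    "fl511.com": _FL511,
    "www.fl511.com": _FL511,
}

# Fallback used when no host-specific profile matches.
_GENERIC = ProxyProfile("generic-cctv", 8.0, 30)


def profile_for_host(host: str) -> ProxyProfile:
    """Return the profile for a host, falling back to the generic one."""
    return _PROFILES.get(host.strip().lower(), _GENERIC)
```

A table keeps each new camera source to a single entry, at the cost of not covering prefix or suffix matches such as the `vss*.dot.ga.gov` branch, which would still need explicit code.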
+167 -7
@@ -1,6 +1,7 @@
import asyncio
import logging
import math
import os
import threading
from typing import Any
from fastapi import APIRouter, Request, Response, Query, Depends
@@ -8,7 +9,7 @@ from fastapi.responses import JSONResponse
from pydantic import BaseModel
from limiter import limiter
from auth import require_admin, require_local_operator
from services.data_fetcher import get_latest_data, update_all_data
from services.data_fetcher import update_all_data
import orjson
import json as json_mod
@@ -30,6 +31,14 @@ class LayerUpdate(BaseModel):
layers: dict[str, bool]
class LiveUamapOptInUpdate(BaseModel):
opted_in: bool
class PredictionMarketsOptInUpdate(BaseModel):
opted_in: bool
_LAST_VIEWPORT_UPDATE: tuple | None = None
_LAST_VIEWPORT_UPDATE_TS = 0.0
_VIEWPORT_UPDATE_LOCK = threading.Lock()
@@ -202,6 +211,15 @@ def _sanitize_payload(value):
return value
def _live_data_json_bytes(payload: dict) -> bytes:
"""Serialize dashboard payloads with the same defensive orjson options everywhere."""
return orjson.dumps(
_sanitize_payload(payload),
default=str,
option=orjson.OPT_NON_STR_KEYS,
)
def _bbox_filter(items: list, s: float, w: float, n: float, e: float,
lat_key: str = "lat", lng_key: str = "lng") -> list:
pad_lat = (n - s) * 0.2
@@ -386,6 +404,95 @@ async def update_viewport(vp: ViewportUpdate, request: Request): # noqa: ARG001
return {"status": "ok"}
@router.get("/api/liveuamap/scraper-status", dependencies=[Depends(require_local_operator)])
async def api_liveuamap_scraper_status():
"""Whether LiveUAMap Playwright may run (Windows needs UI opt-in unless env forces)."""
from services.liveuamap_settings import liveuamap_scraper_status
return liveuamap_scraper_status()
@router.post("/api/liveuamap/scraper-opt-in", dependencies=[Depends(require_local_operator)])
@limiter.limit("10/minute")
async def api_liveuamap_scraper_opt_in(body: LiveUamapOptInUpdate, request: Request):
"""Persist operator consent for LiveUAMap scraper (#348)."""
from services.liveuamap_settings import liveuamap_scraper_status, set_liveuamap_ui_opt_in
set_liveuamap_ui_opt_in(body.opted_in)
if body.opted_in:
from services.fetchers._store import is_any_active
if is_any_active("global_incidents"):
threading.Thread(target=_run_liveuamap_refresh, daemon=True).start()
return liveuamap_scraper_status()
def _run_liveuamap_refresh() -> None:
try:
from services.fetchers.geo import update_liveuamap
update_liveuamap()
except Exception as e:
logger.warning("LiveUAMap refresh after opt-in failed: %s", e)
@router.get("/api/prediction-markets/status", dependencies=[Depends(require_local_operator)])
async def api_prediction_markets_status():
"""Whether Polymarket/Kalshi fetches and news market correlation are enabled."""
from services.prediction_markets_settings import prediction_markets_status
return prediction_markets_status()
@router.post("/api/prediction-markets/opt-in", dependencies=[Depends(require_local_operator)])
@limiter.limit("10/minute")
async def api_prediction_markets_opt_in(body: PredictionMarketsOptInUpdate, request: Request):
"""Enable or disable prediction market fetches + intercept story correlation."""
from services.config import get_settings
from services.prediction_markets_settings import (
prediction_markets_status,
set_prediction_markets_ui_opt_in,
)
from routers.ai_intel import _write_env_value
set_prediction_markets_ui_opt_in(body.opted_in)
_write_env_value("PREDICTION_MARKETS_ENABLED", "true" if body.opted_in else "false")
os.environ["PREDICTION_MARKETS_ENABLED"] = "true" if body.opted_in else "false"
get_settings.cache_clear()
if body.opted_in:
threading.Thread(target=_run_prediction_markets_refresh, daemon=True).start()
else:
threading.Thread(target=_run_prediction_markets_disable, daemon=True).start()
return prediction_markets_status()
def _run_prediction_markets_refresh() -> None:
try:
from services.fetchers.prediction_markets import fetch_prediction_markets
from services.fetchers.news import fetch_news
fetch_prediction_markets()
fetch_news()
except Exception as e:
logger.warning("Prediction markets refresh after opt-in failed: %s", e)
def _run_prediction_markets_disable() -> None:
try:
from services.fetchers._store import _data_lock, _mark_fresh, latest_data
from services.fetchers.news import fetch_news
with _data_lock:
latest_data["prediction_markets"] = []
latest_data["trending_markets"] = []
_mark_fresh("prediction_markets")
fetch_news()
except Exception as e:
logger.warning("Prediction markets disable cleanup failed: %s", e)
@router.post("/api/layers", dependencies=[Depends(require_local_operator)])
@limiter.limit("30/minute")
async def update_layers(update: LayerUpdate, request: Request):
@@ -395,6 +502,8 @@ async def update_layers(update: LayerUpdate, request: Request):
old_mesh = is_any_active("sigint_meshtastic")
old_aprs = is_any_active("sigint_aprs")
old_viirs = is_any_active("viirs_nightlights")
old_datacenters = is_any_active("datacenters")
old_fishing = is_any_active("fishing_activity")
changed = False
for key, value in update.layers.items():
if key in active_layers:
@@ -407,6 +516,8 @@ async def update_layers(update: LayerUpdate, request: Request):
new_mesh = is_any_active("sigint_meshtastic")
new_aprs = is_any_active("sigint_aprs")
new_viirs = is_any_active("viirs_nightlights")
new_datacenters = is_any_active("datacenters")
new_fishing = is_any_active("fishing_activity")
if old_ships and not new_ships:
from services.ais_stream import stop_ais_stream
stop_ais_stream()
@@ -450,13 +561,33 @@ async def update_layers(update: LayerUpdate, request: Request):
if not old_viirs and new_viirs:
_queue_viirs_change_refresh()
logger.info("VIIRS change refresh queued (layer enabled)")
if not old_datacenters and new_datacenters:
from services.fetchers.infrastructure import fetch_datacenters
fetch_datacenters()
logger.info("Datacenters loaded (layer enabled)")
if not old_fishing and new_fishing:
from services.fetchers.geo import fetch_fishing_activity
fetch_fishing_activity()
logger.info("Fishing activity refresh queued (layer enabled)")
return {"status": "ok"}
@router.get("/api/live-data")
@limiter.limit("120/minute")
async def live_data(request: Request):
return get_latest_data()
etag = _current_etag(prefix="live|full|")
if request.headers.get("if-none-match") == etag:
return Response(status_code=304, headers={"ETag": etag, "Cache-Control": "no-cache"})
from services.fetchers._store import get_latest_data_deepcopy_snapshot
payload = get_latest_data_deepcopy_snapshot()
return Response(
content=_live_data_json_bytes(payload),
media_type="application/json",
headers={"ETag": etag, "Cache-Control": "no-cache"},
)
@router.get("/api/bootstrap/critical")
@@ -551,7 +682,7 @@ async def bootstrap_critical(request: Request):
"bootstrap_payload": True,
}
return Response(
content=orjson.dumps(_sanitize_payload(payload), default=str, option=orjson.OPT_NON_STR_KEYS),
content=_live_data_json_bytes(payload),
media_type="application/json",
headers={"ETag": etag, "Cache-Control": "no-cache"},
)
@@ -613,8 +744,11 @@ async def live_data_fast(
# to the pre-#288 implementation.
if _has_full_bbox(s, w, n, e):
payload = _apply_bbox_to_payload(payload, _FAST_BBOX_HEAVY_KEYS, s, w, n, e)
return Response(content=orjson.dumps(_sanitize_payload(payload)), media_type="application/json",
headers={"ETag": etag, "Cache-Control": "no-cache"})
return Response(
content=_live_data_json_bytes(payload),
media_type="application/json",
headers={"ETag": etag, "Cache-Control": "no-cache"},
)
@router.get("/api/live-data/slow")
@@ -638,7 +772,8 @@ async def live_data_slow(
"firms_fires", "datacenters", "military_bases", "power_plants", "viirs_change_nodes",
"scanners", "weather_alerts", "ukraine_alerts", "air_quality", "volcanoes",
"fishing_activity", "psk_reporter", "correlations", "uap_sightings", "wastewater",
"crowdthreat", "threat_level", "trending_markets",
"crowdthreat", "threat_level", "trending_markets", "road_corridor_trends",
"malware_threats", "cyber_threats", "scm_suppliers", "telegram_osint",
)
freshness = get_source_timestamps_snapshot()
payload = {
@@ -679,6 +814,31 @@ async def live_data_slow(
"uap_sightings": (d.get("uap_sightings") or []) if active_layers.get("uap_sightings", True) else [],
"wastewater": (d.get("wastewater") or []) if active_layers.get("wastewater", True) else [],
"crowdthreat": (d.get("crowdthreat") or []) if active_layers.get("crowdthreat", True) else [],
"road_corridor_trends": (
d.get("road_corridor_trends") or {"updated_at": None, "corridors": []}
)
if active_layers.get("road_corridor_trends", False)
else {"updated_at": None, "corridors": []},
"malware_threats": (
d.get("malware_threats") or {"threats": [], "total": 0}
)
if active_layers.get("malware_c2", False)
else {"threats": [], "total": 0},
"cyber_threats": (
d.get("cyber_threats") or {"threats": [], "stats": {}}
)
if active_layers.get("cyber_threats", False)
else {"threats": [], "stats": {}},
"scm_suppliers": (
d.get("scm_suppliers") or {"suppliers": [], "total": 0, "critical_count": 0}
)
if active_layers.get("scm_suppliers", False)
else {"suppliers": [], "total": 0, "critical_count": 0},
"telegram_osint": (
d.get("telegram_osint") or {"posts": [], "total": 0, "geolocated": 0}
)
if active_layers.get("telegram_osint", True)
else {"posts": [], "total": 0, "geolocated": 0},
"freshness": freshness,
}
# Issue #288: bbox filter heavy/dense layers only when all four bounds
@@ -688,7 +848,7 @@ async def live_data_slow(
if _has_full_bbox(s, w, n, e):
payload = _apply_bbox_to_payload(payload, _SLOW_BBOX_HEAVY_KEYS, s, w, n, e)
return Response(
content=orjson.dumps(_sanitize_payload(payload), default=str, option=orjson.OPT_NON_STR_KEYS),
content=_live_data_json_bytes(payload),
media_type="application/json",
headers={"ETag": etag, "Cache-Control": "no-cache"},
)
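Note: the reworked `/api/live-data` handler answers a matching `If-None-Match` header with `304` and an empty body, and otherwise serializes a snapshot. The sketch below shows that conditional-response pattern without FastAPI. It assumes a content-hash ETag purely for illustration; the real `_current_etag` helper is not shown in this diff and may be derived differently (for example from data versions). `make_etag` and `conditional_response` are hypothetical names.

```python
import hashlib
import json
from typing import Optional, Tuple


def make_etag(payload: dict, prefix: str = "live|full|") -> str:
    """Hypothetical ETag: short hash of the prefix plus canonical JSON."""
    body = json.dumps(payload, sort_keys=True, default=str).encode()
    digest = hashlib.sha256(prefix.encode() + body).hexdigest()[:16]
    return f'"{digest}"'


def conditional_response(
    payload: dict, if_none_match: Optional[str]
) -> Tuple[int, dict, bytes]:
    """Return (status, headers, body), skipping the body on an ETag match."""
    etag = make_etag(payload)
    headers = {"ETag": etag, "Cache-Control": "no-cache"}
    if if_none_match == etag:
        # Client already holds this representation: send no body.
        return 304, headers, b""
    return 200, headers, json.dumps(payload, default=str).encode()
```

As in the handler, the comparison is an exact string match, so weak validators (`W/"..."`) or comma-separated lists in `If-None-Match` would not match in this sketch.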
+30
@@ -0,0 +1,30 @@
"""Entity graph expansion (intel layer)."""
from __future__ import annotations
from fastapi import APIRouter, Depends, HTTPException, Query, Request
from auth import require_local_operator
from limiter import limiter
from services.osint_intel.resolve import resolve_entity
router = APIRouter()
@router.get("/api/entity/expand")
@limiter.limit("30/minute")
async def entity_expand(
request: Request,
_: None = Depends(require_local_operator),
type: str = Query(..., min_length=3, max_length=32),
id: str = Query(..., min_length=2, max_length=200),
registration: str | None = Query(default=None, max_length=32),
model: str | None = Query(default=None, max_length=64),
icao24: str | None = Query(default=None, max_length=16),
) -> dict:
props = {"label": id, "registration": registration, "model": model, "icao24": icao24}
try:
return resolve_entity(type, id, props)
except ValueError as exc:
raise HTTPException(status_code=400, detail=str(exc)) from exc
except Exception as exc:
raise HTTPException(status_code=502, detail="Intelligence layer unavailable") from exc
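Note: the new route maps a `ValueError` from the resolver to a client error (400) and any other failure to a gateway-style error (502) with a fixed message. A framework-free sketch of that mapping follows; `call_with_status` is a hypothetical helper and does not exist in the codebase.

```python
from typing import Any, Callable, Tuple


def call_with_status(fn: Callable[..., Any], *args: Any) -> Tuple[int, Any]:
    """Run fn and translate failures into (status_code, detail) pairs."""
    try:
        return 200, fn(*args)
    except ValueError as exc:
        # Bad input from the caller: expose the validation message.
        return 400, str(exc)
    except Exception:
        # Anything else is treated as an upstream failure; the internal
        # error text is not returned to the client.
        return 502, "Intelligence layer unavailable"
```

Ordering matters here: `ValueError` must be caught before the broad `Exception` clause, otherwise validation errors would also be reported as 502.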
+16 -1
@@ -8,7 +8,7 @@ from services.data_fetcher import get_latest_data
from services.schemas import HealthResponse
import os
APP_VERSION = os.environ.get("_HEALTH_APP_VERSION", "0.9.79")
APP_VERSION = os.environ.get("_HEALTH_APP_VERSION", "0.9.82")
router = APIRouter()
@@ -59,6 +59,12 @@ async def health_check(request: Request):
# when the SPKI-pinned fallback is in effect. The data plane keeps
# flowing (this is by design — see ais_proxy.js comments) but observers
# who care about MITM-protection posture deserve a visible signal.
#
# Plus connectivity health (added 2026-05-23 when stream.aisstream.io
# went fully offline): ``connected`` tells the frontend whether ship
# data is actually flowing. When false, a banner explains that ships
# are unavailable due to an upstream outage — better than the user
# silently seeing an empty ocean and assuming we broke something.
ais_status: dict = {}
try:
from services.ais_stream import ais_proxy_status
@@ -69,6 +75,15 @@ async def health_check(request: Request):
# Don't override a worse top-level status if SLOs already failed,
# but escalate ok -> degraded so the field surfaces in dashboards.
top_status = "degraded"
# AIS_API_KEY not configured is "feature off", not "system broken" —
# so we only escalate when the operator opted into AIS (key set) AND
# the stream is currently offline.
if (
os.environ.get("AIS_API_KEY")
and ais_status.get("connected") is False
and top_status == "ok"
):
top_status = "degraded"
return {
"status": top_status,
+122
@@ -0,0 +1,122 @@
"""Malware, cyber threats, and country risk feeds."""
from __future__ import annotations
import logging
from urllib.parse import urlparse
import requests
from fastapi import APIRouter, HTTPException, Query, Request
from fastapi.responses import StreamingResponse
from starlette.background import BackgroundTask
from limiter import limiter
from services.fetchers._store import get_latest_data_subset_refs
from services.fetchers.telegram_osint import telegram_media_host_allowed
from services.intel_feeds.country_risk import build_country_risk_payload
from services.network_utils import outbound_user_agent
logger = logging.getLogger(__name__)
router = APIRouter()
@router.get("/api/malware")
@limiter.limit("60/minute")
async def malware_feed(request: Request) -> dict:
snap = get_latest_data_subset_refs("malware_threats")
payload = snap.get("malware_threats")
if isinstance(payload, dict) and payload.get("threats") is not None:
return payload
return {"threats": [], "total": 0, "timestamp": None, "source": "abuse.ch"}
@router.get("/api/cyber-threats")
@limiter.limit("60/minute")
async def cyber_threats(request: Request) -> dict:
snap = get_latest_data_subset_refs("cyber_threats")
return snap.get("cyber_threats") or {"threats": [], "stats": {}}
@router.get("/api/country-risk")
@limiter.limit("30/minute")
async def country_risk(request: Request) -> dict:
return build_country_risk_payload()
@router.get("/api/telegram-feed")
@limiter.limit("30/minute")
async def telegram_feed(request: Request) -> dict:
snap = get_latest_data_subset_refs("telegram_osint")
payload = snap.get("telegram_osint")
if isinstance(payload, dict) and payload.get("posts") is not None:
return payload
return {"posts": [], "total": 0, "geolocated": 0, "timestamp": None}
def _infer_telegram_media_type(target_url: str, content_type: str) -> str:
clean_type = str(content_type or "").split(";", 1)[0].strip().lower()
if clean_type and clean_type not in {"application/octet-stream", "binary/octet-stream"}:
return content_type
path = str(urlparse(target_url).path or "").lower()
if path.endswith((".jpg", ".jpeg")):
return "image/jpeg"
if path.endswith(".png"):
return "image/png"
if path.endswith(".webp"):
return "image/webp"
if path.endswith(".gif"):
return "image/gif"
if path.endswith(".mp4"):
return "video/mp4"
if path.endswith(".webm"):
return "video/webm"
return content_type or "application/octet-stream"
@router.get("/api/telegram/media")
@limiter.limit("60/minute")
async def telegram_media_proxy(request: Request, url: str = Query(...)) -> StreamingResponse:
"""Stream Telegram CDN media for in-app playback (host allowlist only)."""
parsed = urlparse(url)
if parsed.scheme not in ("http", "https"):
raise HTTPException(status_code=400, detail="Invalid scheme")
if not telegram_media_host_allowed(parsed.hostname):
raise HTTPException(status_code=403, detail="Host not allowed")
headers = {
"User-Agent": (
f"Mozilla/5.0 (compatible; {outbound_user_agent('telegram-media')}) "
"AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
),
"Accept": "*/*",
}
if range_header := request.headers.get("range"):
headers["Range"] = range_header
try:
resp = requests.get(url, stream=True, timeout=(3, 45), headers=headers)
except requests.RequestException as exc:
logger.warning("Telegram media upstream failure %s: %s", url, exc)
raise HTTPException(status_code=502, detail="Upstream fetch failed") from exc
if resp.status_code >= 400:
resp.close()
raise HTTPException(status_code=int(resp.status_code), detail=f"Upstream returned {resp.status_code}")
media_type = _infer_telegram_media_type(url, resp.headers.get("Content-Type", "application/octet-stream"))
response_headers = {
"Cache-Control": "private, max-age=300",
"Accept-Ranges": resp.headers.get("Accept-Ranges", "bytes"),
}
if content_length := resp.headers.get("Content-Length"):
response_headers["Content-Length"] = content_length
if content_range := resp.headers.get("Content-Range"):
response_headers["Content-Range"] = content_range
return StreamingResponse(
resp.iter_content(chunk_size=65536),
status_code=resp.status_code,
media_type=media_type,
headers=response_headers,
background=BackgroundTask(resp.close),
)
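Note: the media proxy relies on `telegram_media_host_allowed`, whose body is not part of this diff. The sketch below shows one common way such a check is written, using suffix matching on a normalized hostname. The allowed suffix is a placeholder, not Telegram's actual CDN list, and `media_host_allowed` is a hypothetical name.

```python
from typing import Optional

# Placeholder suffix for illustration; the real allowlist is not shown here.
ALLOWED_SUFFIXES = ("media.example",)


def media_host_allowed(hostname: Optional[str]) -> bool:
    """Allow a host only if it equals, or is a subdomain of, a listed suffix."""
    host = (hostname or "").strip().lower().rstrip(".")
    if not host:
        return False
    # Require a dot boundary so "evilmedia.example" does not match.
    return any(host == s or host.endswith("." + s) for s in ALLOWED_SUFFIXES)
```

A host check like this validates only the initial URL. `requests.get` follows redirects by default, so a stricter proxy might also pass `allow_redirects=False` or re-check the final host.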
+15
@@ -55,10 +55,20 @@ def _hydrate_gate_store_from_chain(events: list) -> int:
return count
def _hydrate_dm_relay_from_chain(events: list) -> int:
import main as _m
return int(_m._hydrate_dm_relay_from_chain(events))
@router.post("/api/mesh/infonet/peer-push")
@limiter.limit("30/minute")
async def infonet_peer_push(request: Request):
"""Accept pushed Infonet events from relay peers (HMAC-authenticated)."""
from services.mesh.mesh_fleet_defaults import infonet_fleet_join_enabled
if not infonet_fleet_join_enabled():
return {"ok": True, "accepted": 0, "duplicates": 0, "rejected": [], "skipped": "fleet_join_disabled"}
content_length = request.headers.get("content-length")
if content_length:
try:
@@ -82,6 +92,7 @@ async def infonet_peer_push(request: Request):
return {"ok": True, "accepted": 0, "duplicates": 0, "rejected": []}
result = infonet.ingest_events(events)
_hydrate_gate_store_from_chain(events)
_hydrate_dm_relay_from_chain(events)
return {"ok": True, **result}
@@ -147,6 +158,10 @@ async def dm_replicate_envelope(request: Request):
@limiter.limit("30/minute")
async def gate_peer_push(request: Request):
"""Accept pushed gate events from relay peers (private plane)."""
from services.mesh.mesh_fleet_defaults import infonet_fleet_join_enabled
if not infonet_fleet_join_enabled():
return {"ok": True, "accepted": 0, "duplicates": 0, "skipped": "fleet_join_disabled"}
content_length = request.headers.get("content-length")
if content_length:
try:
+33 -8
@@ -65,6 +65,7 @@ from services.mesh.mesh_signed_events import (
logger = logging.getLogger(__name__)
router = APIRouter()
_INFONET_SYNC_RATE_LIMIT = "600/minute"
def _signed_body(request: Request) -> dict[str, Any]:
@@ -263,6 +264,19 @@ def _redact_public_event(event: dict) -> dict:
return _redact_vote_gate(_redact_key_rotate_payload(_redact_gate_metadata(event)))
def _infonet_private_transport_required() -> bool:
import main as _m
return bool(_m._infonet_private_transport_required())
def _infonet_sync_response_events(events: list[dict], request=None) -> list[dict]:
"""Build the sync event surface for the current transport policy."""
import main as _m
return _m._infonet_sync_response_events(events, request=request)
def _trusted_gate_reply_to(event: dict) -> str:
if not isinstance(event, dict):
return ""
@@ -574,6 +588,12 @@ def _hydrate_gate_store_from_chain(events: list[dict]) -> int:
pass
return count
def _hydrate_dm_relay_from_chain(events: list[dict]) -> int:
import main as _m
return int(_m._hydrate_dm_relay_from_chain(events))
# --- Safe type helpers ---
def _safe_int(val, default=0):
@@ -1531,7 +1551,7 @@ async def infonet_locator(request: Request, limit: int = Query(32, ge=4, le=128)
@router.post("/api/mesh/infonet/sync")
@limiter.limit("30/minute")
@limiter.limit(_INFONET_SYNC_RATE_LIMIT)
@mesh_write_exempt(MeshWriteExemption.PEER_GOSSIP)
async def infonet_sync_post(
request: Request,
@@ -1584,8 +1604,7 @@ async def infonet_sync_post(
elif matched_hash == GENESIS_HASH and len(locator) > 1:
forked = True
# Filter out legacy gate_message events — not part of the public sync surface.
events = [_redact_public_event(e) for e in events if e.get("event_type") != "gate_message"]
events = _infonet_sync_response_events(events, request=request)
response = {
"events": events,
@@ -1646,7 +1665,7 @@ async def mesh_rns_status(request: Request):
@router.get("/api/mesh/infonet/sync")
@limiter.limit("30/minute")
@limiter.limit(_INFONET_SYNC_RATE_LIMIT)
async def infonet_sync(
request: Request,
after_hash: str = "",
@@ -1684,8 +1703,7 @@ async def infonet_sync(
)
base = after_hash or GENESIS_HASH
events = infonet.get_events_after(base, limit=limit)
# Filter out legacy gate_message events — not part of the public sync surface.
events = [_redact_public_event(e) for e in events if e.get("event_type") != "gate_message"]
events = _infonet_sync_response_events(events, request=request)
return {
"events": events,
"after_hash": base,
@@ -1724,6 +1742,7 @@ async def infonet_ingest(request: Request):
result = infonet.ingest_events(events)
_hydrate_gate_store_from_chain(events)
_hydrate_dm_relay_from_chain(events)
return {"ok": True, **result}
@@ -2279,6 +2298,12 @@ async def infonet_event(request: Request, event_id: str):
)
return _strip_gate_for_access(evt, access)
return {"ok": False, "detail": "Event not found"}
if evt.get("event_type") == "dm_message":
return await _private_plane_refusal_response(
request,
status_code=403,
payload=_private_plane_access_denied_payload(),
)
if evt.get("event_type") == "gate_message":
gate_id = str(evt.get("payload", {}).get("gate", "") or evt.get("gate", "") or "").strip()
access = _verify_gate_access(request, gate_id) if gate_id else ""
@@ -2303,7 +2328,7 @@ async def infonet_node_events(
from services.mesh.mesh_hashchain import infonet
events = infonet.get_events_by_node(node_id, limit=limit)
events = [e for e in events if e.get("event_type") != "gate_message"]
events = [e for e in events if e.get("event_type") not in {"gate_message", "dm_message"}]
events = [_redact_public_event(e) for e in infonet.decorate_events(events)]
events = _redact_public_node_history(
events,
@@ -2328,7 +2353,7 @@ async def infonet_events_by_type(
else:
events = list(reversed(infonet.events))
events = events[offset : offset + limit]
events = [e for e in events if e.get("event_type") != "gate_message"]
events = [e for e in events if e.get("event_type") not in {"gate_message", "dm_message"}]
events = [_redact_public_event(e) for e in infonet.decorate_events(events)]
return {
"events": events,
+151
@@ -0,0 +1,151 @@
"""Operator OSINT recon routes (server-side proxies, SSRF guarded)."""
from __future__ import annotations
from fastapi import APIRouter, Depends, HTTPException, Query, Request
from pydantic import BaseModel, Field
from auth import require_local_operator
from limiter import limiter
from services.osint import lookups
router = APIRouter(dependencies=[Depends(require_local_operator)])
_ALLOWED_SCHEMAS = {
"Person",
"Organization",
"Company",
"Vessel",
"Airplane",
"LegalEntity",
}
class SweepScanRequest(BaseModel):
ip: str = Field(min_length=7, max_length=45)
cidr: int = Field(default=24, ge=24, le=32)
def _bad_request(exc: ValueError) -> HTTPException:
return HTTPException(status_code=400, detail=str(exc))
@router.get("/api/osint/ip")
@limiter.limit("20/minute")
async def osint_ip(request: Request, ip: str = Query(..., min_length=7, max_length=45)) -> dict:
try:
return lookups.lookup_ip(ip)
except ValueError as exc:
raise _bad_request(exc) from exc
@router.get("/api/osint/dns")
@limiter.limit("20/minute")
async def osint_dns(request: Request, domain: str = Query(..., min_length=4, max_length=253)) -> dict:
try:
return lookups.lookup_dns(domain)
except ValueError as exc:
raise _bad_request(exc) from exc
@router.get("/api/osint/whois")
@limiter.limit("20/minute")
async def osint_whois(request: Request, domain: str = Query(..., min_length=4, max_length=253)) -> dict:
try:
return lookups.lookup_whois(domain)
except ValueError as exc:
raise _bad_request(exc) from exc
@router.get("/api/osint/certs")
@limiter.limit("20/minute")
async def osint_certs(request: Request, domain: str = Query(..., min_length=4, max_length=253)) -> dict:
try:
return lookups.lookup_certs(domain)
except ValueError as exc:
raise _bad_request(exc) from exc
@router.get("/api/osint/threats")
@limiter.limit("20/minute")
async def osint_threats(request: Request, query: str | None = Query(default=None, max_length=253)) -> dict:
return lookups.lookup_threats(query)
@router.get("/api/osint/bgp")
@limiter.limit("20/minute")
async def osint_bgp(request: Request, query: str = Query(..., min_length=2, max_length=64)) -> dict:
try:
return lookups.lookup_bgp(query)
except ValueError as exc:
raise _bad_request(exc) from exc
@router.get("/api/osint/sanctions")
@limiter.limit("20/minute")
async def osint_sanctions(
request: Request,
query: str = Query(..., min_length=4, max_length=200),
schema: str | None = Query(default=None),
limit: int = Query(default=25, ge=1, le=100),
) -> dict:
if schema and schema not in _ALLOWED_SCHEMAS:
raise HTTPException(status_code=400, detail=f"Invalid schema. Allowed: {', '.join(sorted(_ALLOWED_SCHEMAS))}")
return lookups.lookup_sanctions(query, schema=schema, limit=limit)
@router.get("/api/osint/cve")
@limiter.limit("30/minute")
async def osint_cve(request: Request, cve: str = Query(..., min_length=10, max_length=32)) -> dict:
try:
return lookups.lookup_cve(cve)
except ValueError as exc:
raise HTTPException(status_code=404 if "not found" in str(exc).lower() else 400, detail=str(exc)) from exc
@router.get("/api/osint/mac")
@limiter.limit("20/minute")
async def osint_mac(request: Request, mac: str = Query(..., min_length=5, max_length=32)) -> dict:
return lookups.lookup_mac(mac)
@router.get("/api/osint/github")
@limiter.limit("20/minute")
async def osint_github(request: Request, username: str = Query(..., min_length=1, max_length=64)) -> dict:
try:
return lookups.lookup_github(username)
except ValueError as exc:
raise HTTPException(status_code=404, detail=str(exc)) from exc
@router.get("/api/osint/leaks")
@limiter.limit("10/minute")
async def osint_leaks(request: Request, email: str = Query(..., min_length=5, max_length=254)) -> dict:
try:
return lookups.lookup_leaks(email)
except ValueError as exc:
raise _bad_request(exc) from exc
@router.get("/api/osint/sweep")
@limiter.limit("5/minute")
async def osint_sweep_init(
request: Request,
ip: str = Query(..., min_length=7, max_length=45),
cidr: int = Query(default=24, ge=24, le=32),
) -> dict:
try:
return lookups.sweep_init(ip, cidr)
except ValueError as exc:
raise _bad_request(exc) from exc
@router.post("/api/osint/sweep/scan")
@limiter.limit("3/minute")
async def osint_sweep_scan(request: Request, payload: SweepScanRequest) -> dict:
try:
subnet = lookups.subnet_start_for(payload.ip, payload.cidr)
scan = lookups.sweep_scan(subnet, payload.cidr)
init = lookups.sweep_init(payload.ip, payload.cidr)
return {**init, **scan, "subnet": f"{subnet}/{payload.cidr}"}
except ValueError as exc:
raise _bad_request(exc) from exc
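The sweep endpoints above bound `cidr` to the range 24 to 32, which caps a single scan at 256 addresses. A minimal sketch of that arithmetic with the standard `ipaddress` module follows. The helper names `subnet_start` and `host_count` are illustrative assumptions, not the project's `lookups.subnet_start_for` (whose implementation is not shown in this diff), and the sketch assumes IPv4 input only.

```python
import ipaddress


def subnet_start(ip: str, cidr: int) -> str:
    """Return the network address of the IPv4 subnet containing `ip`.

    Hypothetical stand-in for a helper such as lookups.subnet_start_for.
    """
    if not 24 <= cidr <= 32:
        raise ValueError("cidr must be between 24 and 32")
    # strict=False accepts a host address and masks off the host bits.
    network = ipaddress.IPv4Network(f"{ip}/{cidr}", strict=False)
    return str(network.network_address)


def host_count(cidr: int) -> int:
    """Number of addresses covered by an IPv4 prefix of the given length."""
    return 2 ** (32 - cidr)


if __name__ == "__main__":
    print(subnet_start("192.168.1.77", 24), host_count(24))
    print(subnet_start("10.0.0.77", 26), host_count(26))
```

Rejecting prefixes shorter than /24 at the query layer keeps the worst-case scan small before any network work starts, which fits the tight `3/minute` limit on the scan route.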
+105
@@ -0,0 +1,105 @@
"""Road corridor Sentinel-2 freight trend endpoints (opt-in slow layer)."""

from fastapi import APIRouter, HTTPException, Query, Request
from pydantic import BaseModel, Field

from limiter import limiter
from services.road_corridor_sat.config import optional_deps_available, road_corridor_sat_enabled
from services.road_corridor_sat.credentials import sentinel_credentials_configured
from services.road_corridor_sat.jobs import enqueue_analyze, get_job, get_latest_job, job_to_dict
from services.road_corridor_sat.presets import CORRIDOR_PRESETS, get_preset
from services.road_corridor_sat.storage import build_trends_payload, preset_metadata

router = APIRouter()


def _status_payload() -> dict:
    latest = get_latest_job()
    return {
        "enabled": road_corridor_sat_enabled(),
        "deps_installed": optional_deps_available(),
        "credentials_configured": sentinel_credentials_configured(),
        "preset_count": len(CORRIDOR_PRESETS),
        "attribution": "backend/third_party/drishx/NOTICE.md",
        "active_job": job_to_dict(latest) if latest and latest.status in {"queued", "running"} else None,
    }


def _require_analyze_ready() -> None:
    if not optional_deps_available():
        raise HTTPException(
            status_code=503,
            detail="Install optional road-corridor dependencies (uv sync --extra road-corridor)",
        )
    if not sentinel_credentials_configured():
        raise HTTPException(
            status_code=503,
            detail="Set SENTINEL_CLIENT_ID and SENTINEL_CLIENT_SECRET in Imagery settings",
        )


class AnalyzeRequest(BaseModel):
    lat: float = Field(ge=-90, le=90)
    lon: float = Field(ge=-180, le=180)
    label: str | None = Field(default=None, max_length=120)


@router.get("/api/road-corridors/status")
@limiter.limit("60/minute")
async def road_corridors_status(request: Request) -> dict:
    return {"ok": True, **_status_payload()}


@router.get("/api/road-corridors")
@limiter.limit("60/minute")
async def list_road_corridors(request: Request) -> dict:
    return {
        "ok": True,
        "status": _status_payload(),
        "presets": CORRIDOR_PRESETS,
        "trends": build_trends_payload(),
    }


@router.post("/api/road-corridors/analyze")
@limiter.limit("6/minute")
async def analyze_road_corridor_here(request: Request, payload: AnalyzeRequest) -> dict:
    """Start an on-demand Sentinel-2 corridor analysis at map center."""
    _require_analyze_ready()
    try:
        job = enqueue_analyze(payload.lat, payload.lon, payload.label)
    except RuntimeError as exc:
        if str(exc) == "analysis_already_running":
            active = get_latest_job()
            raise HTTPException(
                status_code=409,
                detail="Analysis already in progress",
                headers={"X-Job-Id": active.job_id if active else ""},
            ) from exc
        raise
    return {"ok": True, **job_to_dict(job)}


@router.get("/api/road-corridors/analyze/status")
@limiter.limit("120/minute")
async def analyze_road_corridor_status(
    request: Request,
    job_id: str | None = Query(default=None),
) -> dict:
    job = get_job(job_id) if job_id else get_latest_job()
    if job is None:
        return {"ok": True, "job": None}
    return {"ok": True, "job": job_to_dict(job)}


@router.get("/api/road-corridors/{preset_id}")
@limiter.limit("60/minute")
async def get_road_corridor(preset_id: str, request: Request) -> dict:
    meta = preset_metadata(preset_id)
    if meta is None:
        raise HTTPException(status_code=404, detail="Unknown corridor preset")
    preset = get_preset(preset_id)
    if preset is None:
        # Ad-hoc viewport runs are stored on disk but not in CORRIDOR_PRESETS.
        return {"ok": True, "preset": None, "result": meta, "status": _status_payload()}
    return {"ok": True, "preset": preset, "result": meta, "status": _status_payload()}
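The analyze route above turns a `RuntimeError("analysis_already_running")` into HTTP 409 and reports the active job in an `X-Job-Id` header. The job module itself is not part of this diff, so the following is only a sketch of one way a single-slot queue could produce that marker. `Job` and `SingleJobQueue` are hypothetical names, and the in-memory lock is an assumption rather than the project's actual storage.

```python
import threading
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Job:
    job_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    status: str = "queued"  # queued -> running -> done / failed


class SingleJobQueue:
    """Allow at most one queued or running job at a time (illustrative only)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._latest: Optional[Job] = None

    def latest(self) -> Optional[Job]:
        with self._lock:
            return self._latest

    def enqueue(self) -> Job:
        with self._lock:
            active = self._latest
            if active is not None and active.status in {"queued", "running"}:
                # A route handler can translate this marker into HTTP 409.
                raise RuntimeError("analysis_already_running")
            self._latest = Job()
            return self._latest
```

Checking and replacing the slot under one lock means two concurrent requests cannot both pass the "nothing is running" test, so exactly one of them gets a job and the other gets the conflict.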
+16
@@ -0,0 +1,16 @@
"""Supply-chain risk overlay."""

from __future__ import annotations

from fastapi import APIRouter, Depends, Request
from auth import require_local_operator
from limiter import limiter
from services.scm.suppliers import build_scm_payload

router = APIRouter()


@router.get("/api/scm-suppliers")
@limiter.limit("30/minute")
async def scm_suppliers(request: Request, _: None = Depends(require_local_operator)) -> dict:
    return build_scm_payload()
+33
@@ -85,6 +85,39 @@ async def api_geocode_reverse(
return await asyncio.to_thread(reverse_geocode, lat, lng, local_only)
# ── Wikimedia proxy (#360) — browser calls these instead of wikipedia.org ───
@router.get("/api/wikipedia/summary")
@limiter.limit("60/minute")
def api_wikipedia_summary(
request: Request,
title: str = Query(..., min_length=1, max_length=256),
):
"""Proxy Wikipedia REST summaries through the self-hosted backend."""
from services.region_dossier import fetch_wikipedia_page_summary
summary = fetch_wikipedia_page_summary(title)
if summary is None:
return JSONResponse(status_code=404, content={"detail": "not_found"})
return summary
class WikidataSparqlRequest(BaseModel):
query: str
@router.post("/api/wikidata/sparql")
@limiter.limit("30/minute")
def api_wikidata_sparql(request: Request, body: WikidataSparqlRequest):
"""Proxy Wikidata SPARQL so the browser never contacts query.wikidata.org."""
from services.region_dossier import fetch_wikidata_sparql_bindings
q = (body.query or "").strip()
if len(q) > 12_000:
raise HTTPException(400, "SPARQL query too large")
bindings = fetch_wikidata_sparql_bindings(q)
return {"bindings": bindings}
# ── Sentinel proxy routes (Issue #299/#300/#301, reported by tg12) ──────────
# These three endpoints relay external Sentinel / Planetary Computer
# requests through the backend to avoid browser CORS blocks. They are
+82 -2
@@ -308,6 +308,10 @@ class WormholeDmDecryptRequest(BaseModel):
session_welcome: str | None = None
class WormholeDmMlsKeyPackageRequest(BaseModel):
alias: str
class WormholeDmResetRequest(BaseModel):
peer_id: str | None = None
@@ -326,6 +330,14 @@ class WormholeDmBootstrapDecryptRequest(BaseModel):
ciphertext: str
class WormholeDmConnectContactRequest(BaseModel):
lookup_token: str = ""
peer_id: str = ""
note: str = ""
lookup_peer_url: str = ""
cached_prekey_bundle: dict[str, Any] | None = None
class WormholeDmInviteImportRequest(BaseModel):
invite: dict[str, Any]
alias: str = ""
@@ -1085,7 +1097,21 @@ async def api_wormhole_dm_bootstrap_decrypt(request: Request, body: WormholeDmBo
)
@router.post("/api/wormhole/dm/sender-token", dependencies=[Depends(require_admin)])
@router.post("/api/wormhole/dm/connect-contact", dependencies=[Depends(require_local_operator)])
@limiter.limit("30/minute")
async def api_wormhole_dm_connect_contact(request: Request, body: WormholeDmConnectContactRequest):
from services.openclaw_infonet import send_contact_request
return send_contact_request(
lookup_token=str(body.lookup_token or ""),
peer_id=str(body.peer_id or ""),
note=str(body.note or ""),
lookup_peer_url=str(body.lookup_peer_url or ""),
cached_prekey_bundle=dict(body.cached_prekey_bundle or {}) if body.cached_prekey_bundle else None,
)
@router.post("/api/wormhole/dm/sender-token", dependencies=[Depends(require_local_operator)])
@limiter.limit("60/minute")
async def api_wormhole_dm_sender_token(request: Request, body: WormholeDmSenderTokenRequest):
if _safe_int(body.count or 1, 1) > 1:
@@ -1228,6 +1254,23 @@ async def api_wormhole_dm_decrypt(request: Request, body: WormholeDmDecryptReque
)
@router.post("/api/wormhole/dm/mls-key-package", dependencies=[Depends(require_admin)])
@limiter.limit("60/minute")
async def api_wormhole_dm_mls_key_package(request: Request, body: WormholeDmMlsKeyPackageRequest):
from services.mesh.mesh_dm_mls import export_dm_key_package_for_alias
return export_dm_key_package_for_alias(str(body.alias or "").strip())
@router.post("/api/wormhole/dm/mls-reset", dependencies=[Depends(require_admin)])
@limiter.limit("30/minute")
async def api_wormhole_dm_mls_reset(request: Request):
from services.mesh.mesh_dm_mls import reset_dm_mls_state
reset_dm_mls_state(clear_privacy_core=True, clear_persistence=True)
return {"ok": True}
@router.post("/api/wormhole/dm/reset", dependencies=[Depends(require_admin)])
@limiter.limit("30/minute")
async def api_wormhole_dm_reset(request: Request, body: WormholeDmResetRequest):
@@ -1287,7 +1330,25 @@ async def api_wormhole_dm_contact_delete(request: Request, peer_id: str):
return {"ok": True, "peer_id": peer_id, "deleted": deleted}
_WORMHOLE_PUBLIC_FIELDS = {"installed", "configured", "running", "ready"}
@router.post("/api/wormhole/dm/contact/{peer_id}/sever", dependencies=[Depends(require_admin)])
@limiter.limit("60/minute")
async def api_wormhole_dm_contact_sever(request: Request, peer_id: str):
from services.mesh.mesh_wormhole_contacts import sever_wormhole_dm_contact
try:
body = await request.json()
except Exception:
body = {}
if not isinstance(body, dict):
body = {}
block = bool(body.get("block", False))
try:
return sever_wormhole_dm_contact(peer_id, block=block)
except ValueError as exc:
return {"ok": False, "detail": str(exc)}
_WORMHOLE_PUBLIC_FIELDS = {"installed", "configured", "running", "ready", "arti_ready"}
def _redact_wormhole_status(state: dict[str, Any], authenticated: bool) -> dict[str, Any]:
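The `sever` handler in the hunk above accepts a missing, malformed, or non-object JSON body and falls back to `block=False`. The sketch below isolates that tolerant parsing as a pure function so its edge cases are easy to see. `parse_block_flag` is a hypothetical name, and it works on raw bytes instead of a FastAPI `Request`, which is an assumption made to keep the example self-contained.

```python
import json
from typing import Any


def parse_block_flag(raw_body: bytes) -> bool:
    """Return the `block` flag from a JSON body, defaulting to False."""
    try:
        body: Any = json.loads(raw_body or b"{}")
    except (ValueError, UnicodeDecodeError):
        # Malformed JSON is treated the same as an empty body.
        body = {}
    if not isinstance(body, dict):
        # Arrays, strings, and numbers are valid JSON but carry no flag.
        body = {}
    return bool(body.get("block", False))


if __name__ == "__main__":
    for sample in (b'{"block": true}', b"", b"[1, 2]", b'{"block": "false"}'):
        print(sample, parse_block_flag(sample))
```

One consequence of applying `bool()` to the raw value is that a non-empty string such as `"false"` evaluates to true. Whether any client sends the flag as a string is not visible in this diff, so treat that as a point to verify, not a confirmed defect.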
@@ -1308,6 +1369,25 @@ async def api_wormhole_status(request: Request):
return await _m.api_wormhole_status(request)
@router.get(
"/api/wormhole/private-delivery/{item_id}",
dependencies=[Depends(require_local_operator)],
)
@limiter.limit("120/minute")
async def api_wormhole_private_delivery_item(request: Request, item_id: str):
from services.mesh.mesh_metadata_exposure import metadata_exposure_for_request
from services.mesh.mesh_private_outbox import private_delivery_outbox
exposure = metadata_exposure_for_request(
request,
authenticated=True,
)
item = private_delivery_outbox.get_item(item_id, exposure=exposure)
if item is None:
raise HTTPException(status_code=404, detail="private_delivery_item_not_found")
return {"ok": True, "item": item}
@router.post("/api/wormhole/private-delivery/{item_id}/action", dependencies=[Depends(require_local_operator)])
@limiter.limit("30/minute")
async def api_wormhole_private_delivery_action(
+1 -1
@@ -29,7 +29,7 @@ def main() -> None:
from services.network_utils import outbound_user_agent
ua = outbound_user_agent("release-script-power-plants")
except Exception:
ua = "Shadowbroker/0.9 (release-script-power-plants; +https://github.com/BigBodyCobain/Shadowbroker/issues)"
ua = "operator-release-script (purpose: power-plants)"
req = urllib.request.Request(CSV_URL, headers={"User-Agent": ua})
with urllib.request.urlopen(req, timeout=60) as resp:
raw = resp.read().decode("utf-8")
+5
@@ -167,6 +167,11 @@ def cmd_hash(args: argparse.Namespace) -> int:
print("")
print("Updater pin:")
print(f"MESH_UPDATE_SHA256={digest}")
print("")
print("Release checklist:")
print(" - add this digest to SHA256SUMS.txt for the GitHub release")
print(" - add/update backend/data/release_digests.json for bundled updater verification")
print(" - keep MESH_UPDATE_SHA256 available as the operator override path")
return 0 if asset_matches else 2
+28 -9
@@ -92,18 +92,37 @@ SECRET_REGEX+='pypi-[0-9a-zA-Z-]{50,}' # PyPI token
TEXT_FILES=$(grep -ivE '\.(png|jpg|jpeg|gif|ico|svg|woff2?|ttf|eot|pbf|zip|tar|gz|db|sqlite|xlsx|pdf|mp[34]|wav|ogg|webm|webp|avif)$' "$FILELIST" | grep -v 'scan-secrets\.sh$' || true)
if [[ -n "$TEXT_FILES" ]]; then
# Known-public exclusions: lines matching `<host-or-ip> ssh-<algo> <key>`
# are SSH known_hosts entries — the host's PUBLIC fingerprint, which is
# by definition safe to commit (the whole point of pinning known_hosts
# is to publish the fingerprint widely so MITM is detectable). Filter
# these out before flagging the file.
KNOWN_HOSTS_LINE='^[[:space:]]*[a-zA-Z0-9._:,*-]+([[:space:]]+[a-zA-Z0-9._:,*-]+)?[[:space:]]+(ssh-rsa|ssh-ed25519|ssh-dss|ecdsa-sha2-nistp256|ecdsa-sha2-nistp384|ecdsa-sha2-nistp521)[[:space:]]+AAAA'
# Use grep with file list, skip missing/binary, limit output
CONTENT_HITS=$(echo "$TEXT_FILES" | xargs grep -lE "$SECRET_REGEX" 2>/dev/null || true)
if [[ -n "$CONTENT_HITS" ]]; then
echo -e "\n${RED}BLOCKED: Embedded secrets/tokens found in:${NC}"
echo "$CONTENT_HITS" | while read -r f; do
echo -e " ${RED}$f${NC}"
# Show first matching line for context
grep -nE "$SECRET_REGEX" "$f" 2>/dev/null | head -2 | while read -r line; do
echo -e " ${YELLOW}$line${NC}"
done
done
FOUND=1
REAL_HITS=""
REAL_REPORT=""
while IFS= read -r f; do
[[ -z "$f" ]] && continue
# Re-grep this file, but filter out known_hosts-style lines.
FILE_HITS=$(grep -nE "$SECRET_REGEX" "$f" 2>/dev/null | grep -vE "$KNOWN_HOSTS_LINE" || true)
if [[ -n "$FILE_HITS" ]]; then
REAL_HITS+="$f"$'\n'
REAL_REPORT+=" ${RED}$f${NC}"$'\n'
# Show first 2 matching lines for context
while IFS= read -r line; do
[[ -z "$line" ]] && continue
REAL_REPORT+=" ${YELLOW}$line${NC}"$'\n'
done < <(echo "$FILE_HITS" | head -2)
fi
done <<< "$CONTENT_HITS"
if [[ -n "$REAL_HITS" ]]; then
echo -e "\n${RED}BLOCKED: Embedded secrets/tokens found in:${NC}"
echo -en "$REAL_REPORT"
FOUND=1
fi
fi
fi
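The shell hunk above drops `known_hosts`-style lines before a file is reported as containing secrets. The same filter can be sketched in Python for experimentation. `KNOWN_HOSTS_LINE` is a direct translation of the script's pattern with `[[:space:]]` written as `\s`. `SECRET_STAND_IN` is a hypothetical placeholder, because the full `SECRET_REGEX` is not shown in this diff, and `real_hits` is an illustrative helper name.

```python
import re
from typing import Iterable, List

# Translation of the script's KNOWN_HOSTS_LINE pattern.
KNOWN_HOSTS_LINE = re.compile(
    r"^\s*[a-zA-Z0-9._:,*-]+(\s+[a-zA-Z0-9._:,*-]+)?\s+"
    r"(ssh-rsa|ssh-ed25519|ssh-dss|ecdsa-sha2-nistp256"
    r"|ecdsa-sha2-nistp384|ecdsa-sha2-nistp521)"
    r"\s+AAAA"
)

# Hypothetical stand-in for SECRET_REGEX: any long base64-looking run.
SECRET_STAND_IN = re.compile(r"[A-Za-z0-9+/]{40,}")


def real_hits(lines: Iterable[str]) -> List[str]:
    """Keep lines that look like secrets and are not public host keys."""
    return [
        line
        for line in lines
        if SECRET_STAND_IN.search(line) and not KNOWN_HOSTS_LINE.search(line)
    ]


if __name__ == "__main__":
    sample = [
        "example.com ssh-ed25519 AAAA" + "C3Nza" * 10,
        "API_TOKEN=" + "abcdefghij" * 5,
    ]
    print(real_hits(sample))
```

Because the host field's character class includes digits and `:`, the pattern still matches when `grep -n` has prefixed a line number such as `3:` to the line, which is the form the script actually filters.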
+54
@@ -0,0 +1,54 @@
"""Operator settings for the embedded agent shell (working directory)."""

from __future__ import annotations

import json
import logging
import os
import threading
from pathlib import Path
from typing import Any

logger = logging.getLogger(__name__)

_SETTINGS_FILE = Path(__file__).resolve().parent.parent / "data" / "agent_shell_settings.json"
_LOCK = threading.Lock()


def _default_working_directory() -> str:
    explicit = str(os.environ.get("AGENT_SHELL_DEFAULT_CWD") or "").strip()
    if explicit and os.path.isdir(explicit):
        return explicit
    home = str(os.environ.get("HOME") or "").strip()
    if home and home != "/nonexistent" and os.path.isdir(home):
        return home
    return "/app"


def get_agent_shell_settings() -> dict[str, Any]:
    with _LOCK:
        if not _SETTINGS_FILE.exists():
            return {"working_directory": _default_working_directory()}
        try:
            payload = json.loads(_SETTINGS_FILE.read_text(encoding="utf-8"))
        except Exception:
            logger.warning("agent_shell_settings_unreadable")
            return {"working_directory": _default_working_directory()}
        cwd = str(payload.get("working_directory") or "").strip() or _default_working_directory()
        return {"working_directory": cwd}


def set_agent_shell_working_directory(path: str) -> dict[str, Any]:
    normalized = str(path or "").strip()
    if not normalized:
        raise ValueError("working_directory_required")
    resolved = os.path.abspath(os.path.expanduser(normalized))
    if not os.path.isdir(resolved):
        raise ValueError("working_directory_not_found")
    with _LOCK:
        _SETTINGS_FILE.parent.mkdir(parents=True, exist_ok=True)
        _SETTINGS_FILE.write_text(
            json.dumps({"working_directory": resolved}, indent=2) + "\n",
            encoding="utf-8",
        )
    return {"working_directory": resolved}
+54 -7
@@ -350,19 +350,58 @@ _proxy_process = None
# path during an upstream cert outage. Surfaced via ais_proxy_status() for
# /api/health.
_proxy_status: dict = {}
# Upstream-connectivity telemetry (added when stream.aisstream.io went fully
# offline on 2026-05-23). ``_last_msg_at`` is the unix timestamp of the most
# recent vessel message received from the proxy. ``_proxy_spawn_count`` is
# how many times we've started the node proxy; combined with no recent
# messages it tells us the proxy is respawning in a tight loop because the
# upstream is unreachable. Surfaced via ais_proxy_status() so the operator
# can see "AIS is dead" instead of guessing whether it's their map filter,
# their api key, or upstream.
_last_msg_at: float = 0.0
_proxy_spawn_count: int = 0
_VESSEL_TRAIL_INTERVAL_S = 120
_VESSEL_TRAIL_MAX_POINTS = 240
def ais_proxy_status() -> dict:
"""Return a copy of the latest ais_proxy.js status (issue #258).
# How stale "last vessel message" can be before we consider the stream
# disconnected. AISStream typically pushes multiple messages/sec, so a 60s
# gap means something's wrong upstream or in transit.
_AIS_CONNECTED_FRESHNESS_S = 60
Currently surfaces ``degraded_tls`` (bool) which is true when the
proxy is using SPKI-pinned fallback because AISStream's cert expired.
Returns an empty dict when no status has been received yet.
def ais_proxy_status() -> dict:
"""Return a copy of the latest ais_proxy.js status + connectivity health.
Fields:
* ``degraded_tls`` (bool, issue #258): true when the proxy is using
SPKI-pinned fallback because AISStream's cert expired.
* ``connected`` (bool): true when we received a vessel message in
the last ``_AIS_CONNECTED_FRESHNESS_S`` seconds.
* ``last_msg_age_seconds`` (int | None): seconds since the last
vessel message; None if we've never received one.
* ``proxy_spawn_count`` (int): how many times we've spawned the
node proxy. A sustained increase here without ``connected`` means
we're respawning in a tight loop because upstream is dead.
Returns an empty dict when called before the AIS subsystem starts
(e.g. during tests or when no API key is set).
"""
with _vessels_lock:
return dict(_proxy_status)
status = dict(_proxy_status)
last = _last_msg_at
spawns = _proxy_spawn_count
now = time.time()
if last > 0:
last_age = int(now - last)
status["last_msg_age_seconds"] = last_age
status["connected"] = last_age <= _AIS_CONNECTED_FRESHNESS_S
else:
status["last_msg_age_seconds"] = None
status["connected"] = False
status["proxy_spawn_count"] = spawns
return status
import os
@@ -588,8 +627,10 @@ def _ais_stream_loop():
env=proxy_env,
**popen_kwargs,
)
global _proxy_spawn_count
with _vessels_lock:
_proxy_process = process
_proxy_spawn_count += 1
# Drain stderr in a background thread to prevent deadlock
import threading
@@ -645,9 +686,15 @@ def _ais_stream_loop():
if not mmsi:
continue
# Telemetry: stamp the timestamp of the most recent real
# vessel message. ais_proxy_status() reads this to decide
# whether the stream is currently "connected" — i.e. has
# any data flowed in the last 60s.
global _last_msg_at
with _vessels_lock:
_last_msg_at = time.time()
if mmsi not in _vessels:
_vessels[mmsi] = {"_updated": time.time()}
_vessels[mmsi] = {"_updated": _last_msg_at}
vessel = _vessels[mmsi]
# Update position from PositionReport or StandardClassBPositionReport
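The connectivity fields added to `ais_proxy_status()` in the hunks above reduce to a small calculation over the last-message timestamp. The sketch below restates it as a pure function with an injectable clock so the threshold behaviour can be checked. `connectivity` and its `now` parameter are assumptions made for illustration; the real function reads module globals under `_vessels_lock`.

```python
import time
from typing import Optional

FRESHNESS_S = 60  # mirrors _AIS_CONNECTED_FRESHNESS_S in the diff


def connectivity(last_msg_at: float, spawn_count: int, now: Optional[float] = None) -> dict:
    """Derive connectivity fields from the last vessel-message timestamp."""
    now = time.time() if now is None else now
    status: dict = {"proxy_spawn_count": spawn_count}
    if last_msg_at > 0:
        age = int(now - last_msg_at)  # int() truncates fractional seconds
        status["last_msg_age_seconds"] = age
        status["connected"] = age <= FRESHNESS_S
    else:
        # No message has ever arrived, so there is no age to report.
        status["last_msg_age_seconds"] = None
        status["connected"] = False
    return status


if __name__ == "__main__":
    print(connectivity(1000.0, 2, now=1030.0))
    print(connectivity(0.0, 5, now=1030.0))
```

Since the age is truncated before the comparison, a gap of just under 61 seconds still reports `connected`. That is harmless for a health indicator, but it is worth knowing when reading the value next to the 60-second constant.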
+9
@@ -51,6 +51,15 @@ API_REGISTRY = [
"url": "https://aisstream.io/",
"required": True,
},
{
"id": "gfw_api_token",
"env_key": "GFW_API_TOKEN",
"name": "Global Fishing Watch",
"description": "Bearer token for Global Fishing Watch fishing-vessel activity events (Fishing Activity map layer). Free registration at globalfishingwatch.org.",
"category": "Maritime",
"url": "https://globalfishingwatch.org/our-apis/",
"required": False,
},
{
"id": "adsb_lol",
"env_key": None,
+419 -93
@@ -17,6 +17,9 @@ _KNOWN_CCTV_MEDIA_HOST_ALIASES = {
# Trusted upstream occasionally publishes a typo for this Georgia camera
# host. Normalize it at ingest so the proxy and client stay consistent.
"navigatos-c2c.dot.ga.gov": "navigator-c2c.dot.ga.gov",
# TravelIQ staging hosts occasionally appear in 511 catalog metadata.
"on.stage.traveliq.co": "511on.ca",
"ab.stage.traveliq.co": "511.alberta.ca",
}
_POINT_WKT_RE = re.compile(
@@ -40,6 +43,17 @@ def _normalize_cctv_media_url(raw_url: str) -> str:
return urlunparse(parsed._replace(netloc=netloc))
def _ensure_https_url(raw_url: str) -> str:
"""Upgrade http:// media/catalog URLs to https:// at ingest time."""
candidate = _normalize_cctv_media_url(str(raw_url or "").strip())
if not candidate:
return ""
parsed = urlparse(candidate)
if parsed.scheme.lower() == "http":
return urlunparse(parsed._replace(scheme="https"))
return candidate
def _looks_like_direct_cctv_media_url(url: str) -> bool:
candidate = str(url or "").strip().lower()
if not candidate.startswith(("http://", "https://")):
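The `_ensure_https_url` helper added in the hunk above rewrites only the scheme and leaves everything else in the URL alone. A self-contained sketch of that step, without the host-alias normalization it also performs, looks like the following. `ensure_https` is an illustrative name, not the project's function.

```python
from urllib.parse import urlparse, urlunparse


def ensure_https(raw_url: object) -> str:
    """Upgrade an http:// URL to https://; return other input unchanged."""
    candidate = str(raw_url or "").strip()
    if not candidate:
        return ""
    parsed = urlparse(candidate)
    if parsed.scheme.lower() == "http":
        # _replace swaps the scheme and keeps host, port, path, and query.
        return urlunparse(parsed._replace(scheme="https"))
    return candidate


if __name__ == "__main__":
    print(ensure_https("http://datos.madrid.es/egob/catalogo/202088-0-trafico-camaras.kml"))
    print(ensure_https("/relative/path.jpg"))
```

An explicit port survives the rewrite, so `http://host:8080/x` becomes `https://host:8080/x`. Whether such a port actually serves TLS depends on the upstream, which is one reason an upgrade at ingest can leave a camera URL unreachable.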
@@ -93,6 +107,165 @@ def _parse_wkt_point(raw_point: str) -> tuple[float | None, float | None]:
return lat, lon
def _fetch_traveliq_v2_cameras(
*,
api_url: str,
base_url: str,
id_prefix: str,
source_agency: str,
) -> List[Dict[str, Any]]:
"""Parse TravelIQ-style GET /api/v2/get/cameras feeds (Ontario, Alberta)."""
resp = fetch_with_curl(
api_url,
timeout=30,
headers={"Accept": "application/json"},
)
if not resp or resp.status_code != 200:
logger.error(
"%s CCTV fetch failed: HTTP %s",
source_agency,
resp.status_code if resp else "no response",
)
return []
data = resp.json()
if not isinstance(data, list):
return []
cameras: List[Dict[str, Any]] = []
for cam in data:
if not isinstance(cam, dict):
continue
try:
lat = float(cam.get("Latitude"))
lon = float(cam.get("Longitude"))
except (TypeError, ValueError):
continue
site_id = cam.get("Id")
location = str(cam.get("Location") or cam.get("Roadway") or "Camera")[:120]
views = cam.get("Views") or []
if not views:
continue
for view in views:
if not isinstance(view, dict):
continue
status = str(view.get("Status") or "enabled").strip().lower()
if status and status not in {"enabled", "active"}:
continue
media_url = _ensure_https_url(
urljoin(base_url, str(view.get("Url") or "").strip())
)
if not media_url:
continue
view_id = view.get("Id") or site_id
if site_id is None or view_id is None:
continue
label = str(view.get("Description") or location or "Camera")[:120]
cameras.append(
{
"id": f"{id_prefix}-{site_id}-{view_id}",
"source_agency": source_agency,
"lat": lat,
"lon": lon,
"direction_facing": label,
"media_url": media_url,
"media_type": "image",
"refresh_rate_seconds": 60,
}
)
return cameras
def _fetch_511_datatables_cameras(
*,
list_url: str,
base_url: str,
id_prefix: str,
source_agency: str,
referer: str,
page_size: int = 500,
) -> List[Dict[str, Any]]:
"""Parse 511 DataTables POST /List/GetData/Cameras feeds (Georgia, Florida)."""
cameras: List[Dict[str, Any]] = []
start = 0
draw = 1
while True:
resp = fetch_with_curl(
list_url,
method="POST",
json_data={"draw": draw, "start": start, "length": page_size},
timeout=30,
headers={
"Accept": "application/json",
"Referer": referer,
"Origin": base_url.rstrip("/"),
},
)
if not resp or resp.status_code != 200:
logger.error(
"%s CCTV fetch failed: HTTP %s",
source_agency,
resp.status_code if resp else "no response",
)
break
data = resp.json()
rows = data.get("data") or []
if not rows:
break
for row in rows:
if not isinstance(row, dict):
continue
site_id = row.get("id") or row.get("DT_RowId")
location = row.get("location") or row.get("roadway") or source_agency
lat_lng = row.get("latLng") or {}
geography = lat_lng.get("geography") if isinstance(lat_lng, dict) else {}
lat, lon = _parse_wkt_point(
geography.get("wellKnownText") if isinstance(geography, dict) else ""
)
images = row.get("images") or []
image = next(
(
candidate
for candidate in images
if str(candidate.get("imageUrl") or "").strip()
and not bool(candidate.get("blocked"))
),
None,
)
if not (site_id and image and lat is not None and lon is not None):
continue
media_url = _ensure_https_url(
urljoin(base_url, str(image.get("imageUrl") or "").strip())
)
if not media_url:
continue
cameras.append(
{
"id": f"{id_prefix}-{site_id}",
"source_agency": source_agency,
"lat": lat,
"lon": lon,
"direction_facing": str(location)[:120],
"media_url": media_url,
"media_type": "image",
"refresh_rate_seconds": 60,
}
)
start += len(rows)
draw += 1
total = int(data.get("recordsTotal") or 0)
if total and start >= total:
break
if not total and len(rows) < page_size:
break
return cameras
def init_db():
DB_PATH.parent.mkdir(parents=True, exist_ok=True)
conn = sqlite3.connect(str(DB_PATH))
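The pagination loop in `_fetch_511_datatables_cameras` above has three exits: an empty page, reaching `recordsTotal`, or a short page when no total is reported. The sketch below isolates that control flow behind a callable so it can be exercised without a network. `paginate` and `fetch_page` are hypothetical names, and the fake server in the usage section is an assumption for demonstration only.

```python
from typing import Any, Callable, Dict, List

Page = Dict[str, Any]


def paginate(fetch_page: Callable[[int, int, int], Page], page_size: int = 500) -> List[Any]:
    """Collect rows from a DataTables-style endpoint until it is exhausted."""
    collected: List[Any] = []
    start = 0
    draw = 1
    while True:
        data = fetch_page(draw, start, page_size)
        rows = data.get("data") or []
        if not rows:
            break  # exit 1: the server returned an empty page
        collected.extend(rows)
        start += len(rows)
        draw += 1
        total = int(data.get("recordsTotal") or 0)
        if total and start >= total:
            break  # exit 2: the reported total has been reached
        if not total and len(rows) < page_size:
            break  # exit 3: no total reported, but the page was short
    return collected


if __name__ == "__main__":
    items = list(range(1200))

    def fake_fetch(draw: int, start: int, length: int) -> Page:
        return {"data": items[start:start + length], "recordsTotal": len(items)}

    print(len(paginate(fake_fetch)))
```

When the server reports a total, the loop stops as soon as the count is reached. Without one, a result set that is an exact multiple of the page size costs one extra request, since only the empty page reveals the end.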
@@ -169,7 +342,7 @@ class BaseCCTVIngestor(ABC):
cam.get("lat"),
cam.get("lon"),
cam.get("direction_facing", "Unknown"),
cam.get("media_url"),
_ensure_https_url(cam.get("media_url", "")),
cam.get("media_type", _detect_media_type(cam.get("media_url", ""))),
cam.get("refresh_rate_seconds", 60),
),
@@ -454,77 +627,14 @@ class WSDOTIngestor(BaseCCTVIngestor):
class GeorgiaDOTIngestor(BaseCCTVIngestor):
"""Georgia cameras via the public 511GA list feed."""
URL = "https://511ga.org/List/GetData/Cameras"
BASE_URL = "https://511ga.org"
PAGE_SIZE = 500
def fetch_data(self) -> List[Dict[str, Any]]:
cameras = []
start = 0
draw = 1
while True:
resp = fetch_with_curl(
self.URL,
method="POST",
json_data={"draw": draw, "start": start, "length": self.PAGE_SIZE},
timeout=30,
headers={
"Accept": "application/json",
"Referer": "https://511ga.org/cctv",
"Origin": "https://511ga.org",
},
)
if not resp or resp.status_code != 200:
logger.error(
"Georgia CCTV fetch failed: HTTP %s",
resp.status_code if resp else "no response",
)
break
data = resp.json()
rows = data.get("data") or []
if not rows:
break
for row in rows:
site_id = row.get("id") or row.get("DT_RowId")
location = row.get("location") or row.get("roadway") or "GA Camera"
lat_lng = row.get("latLng") or {}
geography = lat_lng.get("geography") if isinstance(lat_lng, dict) else {}
lat, lon = _parse_wkt_point(geography.get("wellKnownText") if isinstance(geography, dict) else "")
images = row.get("images") or []
image = next(
(
candidate
for candidate in images
if str(candidate.get("imageUrl") or "").strip()
and not bool(candidate.get("blocked"))
),
None,
)
if not (site_id and image and lat is not None and lon is not None):
continue
media_url = _normalize_cctv_media_url(
urljoin(self.BASE_URL, str(image.get("imageUrl") or "").strip())
)
cameras.append(
{
"id": f"GDOT-{site_id}",
"source_agency": "Georgia DOT",
"lat": lat,
"lon": lon,
"direction_facing": str(location)[:120],
"media_url": media_url,
"media_type": "image",
"refresh_rate_seconds": 60,
}
)
start += len(rows)
draw += 1
total = int(data.get("recordsTotal") or 0)
if total and start >= total:
break
if not total and len(rows) < self.PAGE_SIZE:
break
return cameras
return _fetch_511_datatables_cameras(
list_url="https://511ga.org/List/GetData/Cameras",
base_url="https://511ga.org",
id_prefix="GDOT",
source_agency="Georgia DOT",
referer="https://511ga.org/cctv",
)
class IllinoisDOTIngestor(BaseCCTVIngestor):
@@ -1009,17 +1119,72 @@ def _extract_img_src(html_fragment: str):
return None
class AsfinagIngestor(BaseCCTVIngestor):
"""Austria ASFINAG motorway webcams (Osiris port)."""
API_URL = "https://odo.asfinag.at/odo/rest/sec/resource/001/json/webcams?language=atDE"
HEADERS = {
"User-Agent": "Shadowbroker-CCTV/1.0",
"Accept": "application/json",
"Referer": "https://www.asfinag.at/",
"Authorization": "Basic bWFwX3dpZGdldDp0ZWdkaXc=",
}
def fetch_data(self) -> List[Dict[str, Any]]:
try:
response = fetch_with_curl(self.API_URL, timeout=15, headers=self.HEADERS)
response.raise_for_status()
payload = response.json()
except Exception as exc:
logger.error("AsfinagIngestor: fetch failed: %s", exc)
return []
if not isinstance(payload, list):
return []
cameras: List[Dict[str, Any]] = []
for cam in payload:
cam_id = cam.get("wcs_id")
lat = cam.get("wgs84_lat")
lon = cam.get("wgs84_lon")
image_url = cam.get("url_campic")
if not cam_id or lat is None or lon is None or not image_url:
continue
if str(cam_id).startswith("Utinform"):
continue
label = cam.get("position_txt") or cam.get("direction_txt") or "ASFINAG Webcam"
secure_url = _ensure_https_url(image_url)
if not secure_url:
continue
cameras.append(
{
"id": f"ASFINAG-{cam_id}",
"source_agency": "ASFINAG Austria",
"lat": float(lat),
"lon": float(lon),
"direction_facing": label,
"media_url": secure_url,
"media_type": "image",
"refresh_rate_seconds": 300,
}
)
logger.info("AsfinagIngestor: parsed %s cameras", len(cameras))
return cameras
class MadridCityIngestor(BaseCCTVIngestor):
"""Madrid City Hall traffic cameras from datos.madrid.es KML feed."""
KML_URL = "http://datos.madrid.es/egob/catalogo/202088-0-trafico-camaras.kml"
KML_URL = "https://datos.madrid.es/egob/catalogo/202088-0-trafico-camaras.kml"
def _fetch_kml(self):
response = fetch_with_curl(self.KML_URL, timeout=20)
response.raise_for_status()
return response
def fetch_data(self) -> List[Dict[str, Any]]:
import defusedxml.ElementTree as ET
try:
response = fetch_with_curl(self.KML_URL, timeout=20)
response.raise_for_status()
response = self._fetch_kml()
except Exception as e:
logger.error(f"MadridCityIngestor: failed to fetch KML: {e}")
return []
@@ -1055,6 +1220,9 @@ class MadridCityIngestor(BaseCCTVIngestor):
if desc_el is not None and desc_el.text:
image_url = _extract_img_src(desc_el.text)
if not image_url:
continue
image_url = _ensure_https_url(image_url)
if not image_url:
continue
@@ -1076,6 +1244,153 @@ class MadridCityIngestor(BaseCCTVIngestor):
return cameras
class Ontario511Ingestor(BaseCCTVIngestor):
"""Ontario highway cameras via 511on.ca TravelIQ API."""
def fetch_data(self) -> List[Dict[str, Any]]:
return _fetch_traveliq_v2_cameras(
api_url="https://511on.ca/api/v2/get/cameras",
base_url="https://511on.ca",
id_prefix="ON511",
source_agency="511 Ontario",
)
class Alberta511Ingestor(BaseCCTVIngestor):
"""Alberta highway cameras via 511 Alberta TravelIQ API."""
def fetch_data(self) -> List[Dict[str, Any]]:
return _fetch_traveliq_v2_cameras(
api_url="https://511.alberta.ca/api/v2/get/cameras",
base_url="https://511.alberta.ca",
id_prefix="AB511",
source_agency="511 Alberta",
)
class Florida511Ingestor(BaseCCTVIngestor):
"""Florida cameras via FL511 DataTables feed (~4,800 sites)."""
def fetch_data(self) -> List[Dict[str, Any]]:
return _fetch_511_datatables_cameras(
list_url="https://fl511.com/List/GetData/Cameras",
base_url="https://fl511.com",
id_prefix="FL511",
source_agency="Florida 511",
referer="https://fl511.com/",
)
class AustraliaLiveTrafficIngestor(BaseCCTVIngestor):
"""NSW / Australia live traffic cameras via Transport for NSW JSON feed."""
URL = "https://www.livetraffic.com/datajson/all-feeds-web.json"
def fetch_data(self) -> List[Dict[str, Any]]:
resp = fetch_with_curl(self.URL, timeout=35, headers={"Accept": "application/json"})
if not resp or resp.status_code != 200:
logger.error(
"Australia Live Traffic CCTV fetch failed: HTTP %s",
resp.status_code if resp else "no response",
)
return []
data = resp.json()
if not isinstance(data, list):
return []
cameras: List[Dict[str, Any]] = []
for item in data:
if not isinstance(item, dict) or item.get("eventType") != "liveCams":
continue
geometry = item.get("geometry") if isinstance(item.get("geometry"), dict) else {}
coords = geometry.get("coordinates") if isinstance(geometry.get("coordinates"), list) else []
if len(coords) < 2:
continue
try:
lon = float(coords[0])
lat = float(coords[1])
except (TypeError, ValueError):
continue
props = item.get("properties") if isinstance(item.get("properties"), dict) else {}
media_url = _ensure_https_url(str(props.get("href") or "").strip())
if not media_url:
continue
cam_id = str(item.get("path") or props.get("id") or len(cameras)).strip("/")
label = str(props.get("title") or props.get("headline") or "Australia Camera")[:120]
cameras.append(
{
"id": f"AUS-{cam_id}",
"source_agency": "NSW Live Traffic",
"lat": lat,
"lon": lon,
"direction_facing": label,
"media_url": media_url,
"media_type": "image",
"refresh_rate_seconds": 120,
}
)
logger.info("AustraliaLiveTrafficIngestor: parsed %s cameras", len(cameras))
return cameras
class NetherlandsRWSIngestor(BaseCCTVIngestor):
"""Netherlands Rijkswaterstaat cameras from legacy NDW open-data JSON.
The opendata.ndw.nu/cameras.json feed Osiris used is often offline; when
unavailable this ingestor returns an empty set and logs a warning.
"""
URL = "https://opendata.ndw.nu/cameras.json"
MAX_CAMERAS = 1200
def fetch_data(self) -> List[Dict[str, Any]]:
resp = fetch_with_curl(self.URL, timeout=25, headers={"Accept": "application/json"})
if not resp or resp.status_code != 200:
logger.warning(
"Netherlands RWS cameras.json unavailable (HTTP %s) — "
"NDW retired this open-data endpoint; no cameras ingested",
resp.status_code if resp else "no response",
)
return []
data = resp.json()
if not isinstance(data, list):
return []
cameras: List[Dict[str, Any]] = []
for i, cam in enumerate(data[: self.MAX_CAMERAS]):
if not isinstance(cam, dict):
continue
lat = cam.get("lat") if cam.get("lat") is not None else cam.get("latitude")
lon = cam.get("lng") if cam.get("lng") is not None else cam.get("longitude")
media_url = _ensure_https_url(
str(cam.get("imageUrl") or cam.get("feed_url") or cam.get("url") or "").strip()
)
if lat is None or lon is None or not media_url:
continue
try:
lat_f, lon_f = float(lat), float(lon)
except (TypeError, ValueError):
continue
cameras.append(
{
"id": f"NLRWS-{cam.get('id') or i}",
"source_agency": "Rijkswaterstaat",
"lat": lat_f,
"lon": lon_f,
"direction_facing": str(cam.get("name") or "Netherlands Camera")[:120],
"media_url": media_url,
"media_type": "image",
"refresh_rate_seconds": 120,
}
)
logger.info("NetherlandsRWSIngestor: parsed %s cameras", len(cameras))
return cameras
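The fallback key lookup and float coercion used by the ingestor above can be sketched in isolation. This is a minimal illustration only; `first_present` and `parse_point` are hypothetical helper names that do not exist in the pipeline.

```python
from typing import Any, Mapping, Optional, Tuple


def first_present(record: Mapping[str, Any], *keys: str) -> Any:
    """Return the first value that is not None among the candidate keys."""
    for key in keys:
        value = record.get(key)
        if value is not None:
            return value
    return None


def parse_point(record: Mapping[str, Any]) -> Optional[Tuple[float, float]]:
    """Return (lat, lon) as floats, or None when either is missing or malformed."""
    lat = first_present(record, "lat", "latitude")
    lon = first_present(record, "lng", "longitude")
    if lat is None or lon is None:
        return None
    try:
        return float(lat), float(lon)
    except (TypeError, ValueError):
        return None
```

Testing with `is not None` rather than truthiness keeps a legitimate coordinate of `0` from being treated as missing.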
def _detect_media_type(url: str) -> str:
"""Detect the media type from a camera URL for proper frontend rendering."""
if not url:
@@ -1094,29 +1409,40 @@ def _detect_media_type(url: str) -> str:
return "image"
def scheduled_cctv_ingestors() -> List[tuple["BaseCCTVIngestor", str]]:
"""Canonical list of CCTV ingestors for startup, scheduler, and DB seeding."""
return [
(TFLJamCamIngestor(), "cctv_tfl"),
(LTASingaporeIngestor(), "cctv_lta"),
(AustinTXIngestor(), "cctv_atx"),
(NYCDOTIngestor(), "cctv_nyc"),
(CaltransIngestor(), "cctv_caltrans"),
(ColoradoDOTIngestor(), "cctv_codot"),
(WSDOTIngestor(), "cctv_wsdot"),
(GeorgiaDOTIngestor(), "cctv_gdot"),
(IllinoisDOTIngestor(), "cctv_idot"),
(MichiganDOTIngestor(), "cctv_mdot"),
(WindyWebcamsIngestor(), "cctv_windy"),
(DGTNationalIngestor(), "cctv_dgt"),
(MadridCityIngestor(), "cctv_madrid"),
(OSMTrafficCameraIngestor(), "cctv_osm"),
(AsfinagIngestor(), "cctv_asfinag"),
(OSMALPRCameraIngestor(), "cctv_osm_alpr"),
(Ontario511Ingestor(), "cctv_on511"),
(Alberta511Ingestor(), "cctv_ab511"),
(Florida511Ingestor(), "cctv_fl511"),
(AustraliaLiveTrafficIngestor(), "cctv_australia"),
(NetherlandsRWSIngestor(), "cctv_nl_rws"),
]
def run_all_ingestors():
"""Run all CCTV ingestors synchronously. Used for first-run DB seeding."""
ingestors = [
TFLJamCamIngestor(),
LTASingaporeIngestor(),
AustinTXIngestor(),
NYCDOTIngestor(),
CaltransIngestor(),
ColoradoDOTIngestor(),
WSDOTIngestor(),
GeorgiaDOTIngestor(),
IllinoisDOTIngestor(),
MichiganDOTIngestor(),
WindyWebcamsIngestor(),
OSMTrafficCameraIngestor(),
DGTNationalIngestor(),
MadridCityIngestor(),
]
for ing in ingestors:
for ingestor, _name in scheduled_cctv_ingestors():
try:
ing.ingest()
ingestor.ingest()
except Exception as e:
logger.warning(f"Ingestor {ing.__class__.__name__} failed during seed: {e}")
logger.warning(f"Ingestor {ingestor.__class__.__name__} failed during seed: {e}")
def get_all_cameras() -> List[Dict[str, Any]]:
+23 -1
@@ -30,8 +30,13 @@ class Settings(BaseSettings):
MESH_MQTT_INCLUDE_DEFAULT_ROOTS: bool = True
MESH_RNS_ENABLED: bool = False
MESH_ARTI_ENABLED: bool = False
# When true, trust wormhole_status.json ready bit if the child process is
# alive — avoids transport-tier flapping when /api/health probes time out
# under Tor load (common during live DM E2E).
MESH_WORMHOLE_TRUST_FILE_READY: bool = False
MESH_ARTI_SOCKS_PORT: int = 9050
MESH_RELAY_PEERS: str = ""
MESH_PUBLIC_PEER_URL: str = ""
# Bootstrap seeds are discovery hints, not authoritative network roots.
# Nodes promote healthy discovered peers from the store/manifest over time.
MESH_BOOTSTRAP_SEED_PEERS: str = "http://gqpbunqbgtkcqilvclm3xrkt3zowjyl3s62kkktvojgvxzizamvbrqid.onion:8000"
@@ -42,7 +47,24 @@ class Settings(BaseSettings):
MESH_INFONET_ALLOW_CLEARNET_SYNC: bool = False
MESH_BOOTSTRAP_DISABLED: bool = False
MESH_BOOTSTRAP_MANIFEST_PATH: str = "data/bootstrap_peers.json"
MESH_BOOTSTRAP_SIGNER_PUBLIC_KEY: str = ""
# Public sb-testnet-0 fleet signer (participants). Seed operator holds the private key.
MESH_BOOTSTRAP_SIGNER_PUBLIC_KEY: str = (
"ul1d0kj/ODPIp0OhHzX8eLAVXzJ3CVvzW1vn2IC6q3I="
)
MESH_BOOTSTRAP_SIGNER_PRIVATE_KEY: str = ""
# When true, empty MESH_PEER_PUSH_SECRET uses the public fleet HMAC for seed join/announce.
MESH_INFONET_FLEET_JOIN: bool = True
MESH_INFONET_FLEET_JOIN_DISABLED: bool = False
# Headless relay/seed compose: auto-enable Tor wormhole on startup so
# docker compose redeploys keep the fleet onion reachable.
MESH_INFONET_RELAY_AUTO_WORMHOLE: bool = False
MESH_INFONET_RELAY_AUTO_WORMHOLE_DISABLED: bool = False
MESH_BOOTSTRAP_SIGNER_ID: str = ""
MESH_PEER_REGISTRY_ENABLED: bool = False
MESH_PEER_REGISTRY_DISABLED: bool = False
MESH_PEER_REGISTRY_STALE_S: int = 604800
MESH_SWARM_MANIFEST_TTL_S: int = 14400
MESH_SWARM_MANIFEST_PULL_INTERVAL_S: int = 300
MESH_NODE_MODE: str = "participant"
MESH_SYNC_INTERVAL_S: int = 300
MESH_SYNC_FAILURE_BACKOFF_S: int = 60
+7 -2
@@ -11,8 +11,13 @@ DEFAULT_TRAIL_TTL_S = 300 # 5 min - trail TTL for non-tracked flights
HOLD_PATTERN_DEGREES = 300 # Total heading change to flag holding pattern
GPS_JAMMING_NACP_THRESHOLD = 8 # NACp below this = degraded GPS signal
GPS_JAMMING_GRID_SIZE = 1.0 # 1 degree grid for aggregation
GPS_JAMMING_MIN_RATIO = 0.30 # 30% degraded aircraft to flag zone
GPS_JAMMING_MIN_AIRCRAFT = 5 # Min aircraft in grid cell for statistical significance
# Tuned 2026-05: the previous 0.30 ratio / 5 aircraft, combined with the
# -1 noise cushion in the detector and the pre-fix nac_p==0 filter that
# discarded jamming victims, meant the layer almost never lit up.
# Lower the bar so genuine jamming zones with sparser ADS-B coverage
# clear the threshold (eastern Med, Russia/Ukraine border, Iran/Iraq).
GPS_JAMMING_MIN_RATIO = 0.20 # 20% degraded aircraft to flag zone
GPS_JAMMING_MIN_AIRCRAFT = 3 # Min aircraft in grid cell for statistical significance
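A rough sketch of how the two thresholds interact for a single grid cell. This is an assumption about the shape of the check, not the detector itself: `is_jamming_cell` is a hypothetical name, and the noise cushion mentioned in the comment above is deliberately left out.

```python
def is_jamming_cell(degraded: int, total: int, min_ratio: float, min_aircraft: int) -> bool:
    """Flag a cell when enough aircraft are present and enough of them report degraded GPS."""
    if total <= 0 or total < min_aircraft:
        return False
    return degraded / total >= min_ratio


# A sparse cell: 4 aircraft, 1 of them degraded (25%).
old_result = is_jamming_cell(1, 4, min_ratio=0.30, min_aircraft=5)
new_result = is_jamming_cell(1, 4, min_ratio=0.20, min_aircraft=3)
```

Under the old values the cell fails both the aircraft-count and the ratio test; under the new values it passes both.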
# ─── Network & Circuit Breaker ──────────────────────────────────────────────
CIRCUIT_BREAKER_TTL_S = 120 # Skip domain for 2 min after total failure
+172 -58
@@ -19,6 +19,7 @@ import concurrent.futures
import json
import math
import os
import random
import threading
import time
from datetime import datetime, timedelta
@@ -75,6 +76,7 @@ from services.fetchers.infrastructure import ( # noqa: F401
fetch_tinygs,
fetch_psk_reporter,
)
from services.fetchers.road_corridor_sat import fetch_road_corridor_trends # noqa: F401
from services.fetchers.geo import ( # noqa: F401
fetch_ships,
fetch_airports,
@@ -99,6 +101,10 @@ from services.fetchers.crowdthreat import fetch_crowdthreat # noqa: F401
from services.fetchers.wastewater import fetch_wastewater # noqa: F401
from services.fetchers.sar_catalog import fetch_sar_catalog # noqa: F401
from services.fetchers.sar_products import fetch_sar_products # noqa: F401
from services.fetchers.malware import fetch_malware_threats # noqa: F401
from services.fetchers.telegram_osint import fetch_telegram_osint # noqa: F401
from services.fetchers.cyber_status import fetch_cyber_threats # noqa: F401
from services.scm.suppliers import fetch_scm_suppliers # noqa: F401
from services.ais_stream import prune_stale_vessels # noqa: F401
logger = logging.getLogger(__name__)
@@ -144,13 +150,18 @@ _STARTUP_HEAVY_REFRESH_DELAY_S = float(os.environ.get("SHADOWBROKER_STARTUP_HEAV
_STARTUP_HEAVY_REFRESH_STARTED = False
_STARTUP_HEAVY_REFRESH_LOCK = threading.Lock()
_FETCH_WORKERS = int(os.environ.get("SHADOWBROKER_FETCH_WORKERS", "8"))
_HEAVY_FETCH_WORKERS = int(os.environ.get("SHADOWBROKER_HEAVY_FETCH_WORKERS", "2"))
_SLOW_FETCH_CONCURRENCY = int(os.environ.get("SHADOWBROKER_SLOW_FETCH_CONCURRENCY", "4"))
_STARTUP_HEAVY_CONCURRENCY = int(os.environ.get("SHADOWBROKER_STARTUP_HEAVY_CONCURRENCY", "2"))
# Shared thread pool — reused across all fetch cycles instead of creating/destroying per tick
# Fast-tier pool (flights, ships, sigint, …). Slow / heavy work uses a separate pool
# so Playwright, GDELT, CCTV ingest, etc. cannot starve the 60s refresh path (#375).
_SHARED_EXECUTOR = concurrent.futures.ThreadPoolExecutor(
max_workers=max(2, _FETCH_WORKERS), thread_name_prefix="fetch"
)
_SLOW_EXECUTOR = concurrent.futures.ThreadPoolExecutor(
max_workers=max(1, _HEAVY_FETCH_WORKERS), thread_name_prefix="fetch-slow"
)
def _cache_json_safe(value):
@@ -319,10 +330,49 @@ def seed_startup_caches() -> None:
# ---------------------------------------------------------------------------
# Scheduler & Orchestration
# ---------------------------------------------------------------------------
def _executor_for_task_label(label: str) -> concurrent.futures.ThreadPoolExecutor:
if label.startswith(("slow-tier", "startup-heavy")):
return _SLOW_EXECUTOR
return _SHARED_EXECUTOR
def _run_task_with_health_on_executor(
executor: concurrent.futures.ThreadPoolExecutor,
func,
name: str | None = None,
) -> None:
"""Run a scheduled job on the given pool so it cannot starve fast-tier workers."""
task_name = name or getattr(func, "__name__", "task")
future = executor.submit(func)
start = time.perf_counter()
try:
future.result(timeout=_TASK_HARD_TIMEOUT_S)
duration = time.perf_counter() - start
from services.fetch_health import record_success
record_success(task_name, duration_s=duration)
if duration > _SLOW_FETCH_S:
logger.warning("task slow: %s took %.2fs", task_name, duration)
except concurrent.futures.TimeoutError:
future.cancel()
duration = time.perf_counter() - start
from services.fetch_health import record_failure
record_failure(task_name, error=TimeoutError(f"{task_name} timed out"), duration_s=duration)
logger.error("task timed out: %s (%.2fs)", task_name, duration)
except Exception as e:
duration = time.perf_counter() - start
from services.fetch_health import record_failure
record_failure(task_name, error=e, duration_s=duration)
logger.exception("task failed: %s", task_name)
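One property of the timeout branch above is worth seeing in isolation: `Future.cancel()` cannot stop a job that a worker thread has already started, so a timed-out task keeps occupying its worker until it returns. The sketch below is a standalone demonstration with made-up names (`slow_job`, the `demo` pool); it is not code from the scheduler.

```python
import concurrent.futures
import threading
import time

pool = concurrent.futures.ThreadPoolExecutor(max_workers=1, thread_name_prefix="demo")
started = threading.Event()
finished = threading.Event()


def slow_job() -> str:
    started.set()
    time.sleep(0.3)
    finished.set()
    return "done"


future = pool.submit(slow_job)
started.wait()  # make sure the worker has picked the job up

timed_out = False
cancelled = None
try:
    future.result(timeout=0.05)
except concurrent.futures.TimeoutError:
    timed_out = True
    cancelled = future.cancel()  # returns False for a job that is already running

# The worker thread keeps going and eventually completes.
completed_anyway = finished.wait(timeout=2.0)
pool.shutdown(wait=True)
```

This is one reason to keep slow jobs on their own pool: a hung job still ties up a worker, but only in the pool it was submitted to.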
def _run_tasks(label: str, funcs: list, *, max_concurrency: int | None = None):
"""Run tasks concurrently and log any exceptions (do not fail silently)."""
if not funcs:
return
executor = _executor_for_task_label(label)
if max_concurrency is None:
if label.startswith("slow-tier"):
max_concurrency = _SLOW_FETCH_CONCURRENCY
@@ -330,12 +380,13 @@ def _run_tasks(label: str, funcs: list, *, max_concurrency: int | None = None):
max_concurrency = _STARTUP_HEAVY_CONCURRENCY
else:
max_concurrency = len(funcs)
max_concurrency = max(1, min(max_concurrency, len(funcs)))
pool_workers = getattr(executor, "_max_workers", len(funcs))
max_concurrency = max(1, min(max_concurrency, len(funcs), pool_workers))
remaining_funcs = list(funcs)
while remaining_funcs:
batch, remaining_funcs = remaining_funcs[:max_concurrency], remaining_funcs[max_concurrency:]
futures = {_SHARED_EXECUTOR.submit(func): (func.__name__, time.perf_counter()) for func in batch}
futures = {executor.submit(func): (func.__name__, time.perf_counter()) for func in batch}
_drain_task_futures(label, futures)
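The batching above can be sketched as a pure function. `plan_batches` is a hypothetical name used only for illustration; it mirrors the clamp of the requested concurrency to both the number of tasks and the pool size.

```python
from typing import List, Sequence


def plan_batches(names: Sequence[str], max_concurrency: int, pool_workers: int) -> List[List[str]]:
    """Split task names into batches no wider than the request, the task count, or the pool."""
    width = max(1, min(max_concurrency, len(names), pool_workers))
    return [list(names[i : i + width]) for i in range(0, len(names), width)]


# Requested concurrency 4, but the pool only has 2 workers.
batches = plan_batches(["a", "b", "c", "d", "e"], max_concurrency=4, pool_workers=2)
```

With the defaults shown in this diff (slow-tier concurrency 4, heavy pool of 2 workers), the pool size is the binding limit, so slow-tier tasks run two at a time.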
@@ -352,6 +403,13 @@ def _drain_task_futures(label: str, futures: dict):
record_success(name, duration_s=duration)
if duration > _SLOW_FETCH_S:
logger.warning(f"{label} task slow: {name} took {duration:.2f}s")
except concurrent.futures.TimeoutError:
future.cancel()
duration = time.perf_counter() - start
from services.fetch_health import record_failure
record_failure(name, error=TimeoutError(f"{name} timed out"), duration_s=duration)
logger.error("%s task timed out: %s (%.2fs)", label, name, duration)
except Exception as e:
duration = time.perf_counter() - start
from services.fetch_health import record_failure
@@ -405,7 +463,6 @@ def update_slow_data():
logger.info("Slow-tier data update starting...")
slow_funcs = [
fetch_news,
fetch_prediction_markets,
fetch_earthquakes,
fetch_firms_fires,
fetch_firms_country_fires,
@@ -427,6 +484,9 @@ def update_slow_data():
fetch_fishing_activity,
fetch_power_plants,
fetch_ukraine_air_raid_alerts,
fetch_malware_threats,
fetch_cyber_threats,
fetch_scm_suppliers,
]
_run_tasks("slow-tier", slow_funcs)
# Run correlation engine after all data is fresh
@@ -470,6 +530,15 @@ def _load_cctv_cache_for_startup() -> None:
logger.warning("Startup CCTV cache load failed (non-fatal): %s", e)
def _load_static_infrastructure_for_startup() -> None:
"""Disk-backed reference layers — instant, no network."""
for func in (fetch_datacenters, fetch_military_bases, fetch_power_plants):
try:
func()
except Exception as e:
logger.warning("Startup static infrastructure load failed for %s: %s", func.__name__, e)
def _run_delayed_startup_heavy_refresh() -> None:
if _STARTUP_HEAVY_REFRESH_DELAY_S > 0:
logger.info(
@@ -482,6 +551,7 @@ def _run_delayed_startup_heavy_refresh() -> None:
"startup-heavy",
[
update_slow_data,
fetch_telegram_osint,
fetch_volcanoes,
fetch_viirs_change_nodes,
fetch_unusual_whales,
@@ -520,6 +590,7 @@ def update_all_data(*, startup_mode: bool = False):
logger.info("Full data update starting (parallel)...")
# Preload Meshtastic map cache immediately (instant, from disk)
seed_startup_caches()
_load_static_infrastructure_for_startup()
with _data_lock:
meshtastic_seeded = bool(latest_data.get("meshtastic_map_nodes"))
if startup_mode:
@@ -596,22 +667,9 @@ def update_all_data(*, startup_mode: bool = False):
# (the scheduled job also runs every 10 min for ongoing refresh).
if startup_mode:
try:
from services.cctv_pipeline import (
TFLJamCamIngestor, LTASingaporeIngestor, AustinTXIngestor,
NYCDOTIngestor, CaltransIngestor, ColoradoDOTIngestor,
WSDOTIngestor, GeorgiaDOTIngestor, IllinoisDOTIngestor,
MichiganDOTIngestor, WindyWebcamsIngestor, DGTNationalIngestor,
MadridCityIngestor, OSMTrafficCameraIngestor, get_all_cameras,
)
from services.cctv_pipeline import OSMALPRCameraIngestor
_startup_ingestors = [
TFLJamCamIngestor(), LTASingaporeIngestor(), AustinTXIngestor(),
NYCDOTIngestor(), CaltransIngestor(), ColoradoDOTIngestor(),
WSDOTIngestor(), GeorgiaDOTIngestor(), IllinoisDOTIngestor(),
MichiganDOTIngestor(), WindyWebcamsIngestor(), DGTNationalIngestor(),
MadridCityIngestor(), OSMTrafficCameraIngestor(),
OSMALPRCameraIngestor(),
]
from services.cctv_pipeline import get_all_cameras, scheduled_cctv_ingestors
_startup_ingestors = [ing for ing, _name in scheduled_cctv_ingestors()]
logger.info("Running CCTV ingest at startup (%d ingestors)...", len(_startup_ingestors))
ingest_futures = {
_SHARED_EXECUTOR.submit(ing.ingest): ing.__class__.__name__
@@ -747,6 +805,39 @@ def start_scheduler():
misfire_grace_time=120,
)
# Telegram OSINT — hourly t.me/s channel scrape (kept off the 5-minute slow tier).
_telegram_interval_m = max(15, int(os.environ.get("TELEGRAM_OSINT_INTERVAL_MINUTES", "60")))
_scheduler.add_job(
lambda: _run_task_with_health(fetch_telegram_osint, "fetch_telegram_osint"),
"interval",
minutes=_telegram_interval_m,
next_run_time=datetime.utcnow() + timedelta(seconds=45),
id="telegram_osint",
max_instances=1,
misfire_grace_time=600,
)
# Prediction markets — own jittered cadence (Polymarket/Kalshi clearnet egress).
# Kept off the fixed 5-minute slow tier so poll timing is less fingerprintable.
from services.fetchers.prediction_markets import fetch_prediction_markets
_pm_interval_m = max(5, int(os.environ.get("PREDICTION_MARKETS_INTERVAL_MINUTES", "7")))
_pm_jitter_s = max(0, int(os.environ.get("PREDICTION_MARKETS_SCHEDULER_JITTER_S", "240")))
_pm_initial_max_s = max(0, int(os.environ.get("PREDICTION_MARKETS_INITIAL_DELAY_MAX_S", "180")))
_pm_first_run = datetime.utcnow() + timedelta(
seconds=random.randint(30, max(30, _pm_initial_max_s))
)
_scheduler.add_job(
lambda: _run_task_with_health(fetch_prediction_markets, "fetch_prediction_markets"),
"interval",
minutes=_pm_interval_m,
jitter=_pm_jitter_s,
next_run_time=_pm_first_run,
id="prediction_markets",
max_instances=1,
misfire_grace_time=300,
)
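To illustrate why a jittered cadence is harder to fingerprint than a fixed one, here is a small stdlib-only model. It assumes each run is shifted by a uniform random offset within plus or minus the jitter window around its nominal tick; the scheduler's actual jitter behaviour may differ, and `jittered_run_times` is a hypothetical helper.

```python
import random
from datetime import datetime, timedelta, timezone
from typing import List


def jittered_run_times(
    start: datetime, interval_s: int, jitter_s: int, count: int, rng: random.Random
) -> List[datetime]:
    """Nominal fixed-interval ticks, each shifted by an offset in [-jitter_s, +jitter_s]."""
    runs = []
    for n in range(1, count + 1):
        nominal = start + timedelta(seconds=n * interval_s)
        runs.append(nominal + timedelta(seconds=rng.uniform(-jitter_s, jitter_s)))
    return runs


start = datetime(2026, 6, 15, tzinfo=timezone.utc)
runs = jittered_run_times(start, interval_s=7 * 60, jitter_s=240, count=5, rng=random.Random(0))
```

With the default 7-minute interval and 240-second jitter, each run in this model lands within four minutes of its nominal tick instead of exactly on it.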
# Weather alerts — every 5 minutes (time-critical, separate from slow tier)
_scheduler.add_job(
lambda: _run_task_with_health(fetch_weather_alerts, "fetch_weather_alerts"),
@@ -777,6 +868,39 @@ def start_scheduler():
misfire_grace_time=60,
)
# Flight observation pruning — drops icao24 → first_seen_at entries we
# haven't seen in an hour. Same cadence as AIS prune for symmetry; the
# per-tick scan is O(in-flight aircraft) so it's cheap.
from services.fetchers.flight_observations import prune as _prune_flight_observations
_scheduler.add_job(
lambda: _run_task_with_health(_prune_flight_observations, "prune_flight_observations"),
"interval",
minutes=5,
id="flight_observation_prune",
max_instances=1,
misfire_grace_time=60,
)
# AISHub REST fallback — slow polling when the AISStream WebSocket
# primary is offline. Configurable interval via
# AISHUB_POLL_INTERVAL_MINUTES env (default 20 min). Operator must
# set AISHUB_USERNAME to opt in. The fetcher is gated internally on
# the primary being disconnected, so this job is cheap when the
# WebSocket is healthy (early-returns after a status check).
from services.fetchers.aishub_fallback import (
aishub_poll_interval_minutes,
fetch_aishub_vessels,
)
_aishub_interval = aishub_poll_interval_minutes()
_scheduler.add_job(
lambda: _run_task_with_health(fetch_aishub_vessels, "fetch_aishub_vessels"),
"interval",
minutes=_aishub_interval,
id="aishub_fallback",
max_instances=1,
misfire_grace_time=120,
)
# Route database — bulk refresh from vrs-standing-data.adsb.lol every 5
# days. Replaces the legacy /api/0/routeset POST (blocked under our UA,
# and broken upstream). Airline schedules change on a quarterly cycle,
@@ -811,7 +935,7 @@ def start_scheduler():
# GDELT — every 30 minutes (downloads 32 ZIP files per call, avoid rate limits)
_scheduler.add_job(
lambda: _run_task_with_health(fetch_gdelt, "fetch_gdelt"),
lambda: _run_task_with_health_on_executor(_SLOW_EXECUTOR, fetch_gdelt, "fetch_gdelt"),
"interval",
minutes=30,
id="gdelt",
@@ -819,7 +943,9 @@ def start_scheduler():
misfire_grace_time=120,
)
_scheduler.add_job(
lambda: _run_task_with_health(update_liveuamap, "update_liveuamap"),
lambda: _run_task_with_health_on_executor(
_SLOW_EXECUTOR, update_liveuamap, "update_liveuamap"
),
"interval",
minutes=30,
id="liveuamap",
@@ -829,39 +955,9 @@ def start_scheduler():
# CCTV pipeline refresh — runs all ingestors, then refreshes in-memory data.
# Delay the first run slightly so startup serves cached/DB-backed data first.
from services.cctv_pipeline import (
TFLJamCamIngestor,
LTASingaporeIngestor,
AustinTXIngestor,
NYCDOTIngestor,
CaltransIngestor,
ColoradoDOTIngestor,
WSDOTIngestor,
GeorgiaDOTIngestor,
IllinoisDOTIngestor,
MichiganDOTIngestor,
WindyWebcamsIngestor,
DGTNationalIngestor,
MadridCityIngestor,
OSMTrafficCameraIngestor,
)
from services.cctv_pipeline import scheduled_cctv_ingestors
_cctv_ingestors = [
(TFLJamCamIngestor(), "cctv_tfl"),
(LTASingaporeIngestor(), "cctv_lta"),
(AustinTXIngestor(), "cctv_atx"),
(NYCDOTIngestor(), "cctv_nyc"),
(CaltransIngestor(), "cctv_caltrans"),
(ColoradoDOTIngestor(), "cctv_codot"),
(WSDOTIngestor(), "cctv_wsdot"),
(GeorgiaDOTIngestor(), "cctv_gdot"),
(IllinoisDOTIngestor(), "cctv_idot"),
(MichiganDOTIngestor(), "cctv_mdot"),
(WindyWebcamsIngestor(), "cctv_windy"),
(DGTNationalIngestor(), "cctv_dgt"),
(MadridCityIngestor(), "cctv_madrid"),
(OSMTrafficCameraIngestor(), "cctv_osm"),
]
_cctv_ingestors = scheduled_cctv_ingestors()
def _run_cctv_ingest_cycle():
from services.fetchers._store import is_any_active
@@ -880,7 +976,9 @@ def start_scheduler():
logger.warning(f"CCTV post-ingest refresh failed: {e}")
_scheduler.add_job(
_run_cctv_ingest_cycle,
lambda: _run_task_with_health_on_executor(
_SLOW_EXECUTOR, _run_cctv_ingest_cycle, "cctv_ingest_cycle"
),
"interval",
minutes=10,
id="cctv_ingest",
@@ -950,6 +1048,16 @@ def start_scheduler():
misfire_grace_time=600,
)
# Sentinel-2 road corridor freight trends — daily (opt-in, heavy CDSE usage)
_scheduler.add_job(
lambda: _run_task_with_health(fetch_road_corridor_trends, "fetch_road_corridor_trends"),
"interval",
hours=24,
id="road_corridor_trends",
max_instances=1,
misfire_grace_time=3600,
)
# FIMI disinformation index — every 12 hours (weekly editorial feed)
_scheduler.add_job(
lambda: _run_task_with_health(fetch_fimi, "fetch_fimi"),
@@ -960,16 +1068,19 @@ def start_scheduler():
misfire_grace_time=600,
)
# UAP sightings (NUFORC) — daily at 12:00 UTC
# UAP sightings (NUFORC) — weekly Mondays 12:00 UTC. Rolling ~60-day window;
# each self-hosted install pulls live nuforc.org so operators see current
# reports (typically ~400-500 mappable pins). Disk cache TTL defaults to 7d.
_scheduler.add_job(
lambda: _run_task_with_health(
lambda: fetch_uap_sightings(force_refresh=True),
"fetch_uap_sightings",
),
"cron",
day_of_week="mon",
hour=12,
minute=0,
id="uap_sightings_daily",
id="uap_sightings_weekly",
max_instances=1,
misfire_grace_time=3600,
)
@@ -1094,7 +1205,10 @@ def start_scheduler():
def stop_scheduler():
if _scheduler:
_scheduler.shutdown(wait=False)
_SLOW_EXECUTOR.shutdown(wait=False, cancel_futures=True)
def get_latest_data():
return get_latest_data_subset(*latest_data.keys())
from services.fetchers._store import get_latest_data_deepcopy_snapshot
return get_latest_data_deepcopy_snapshot()
+1
@@ -46,6 +46,7 @@ _CRITICAL_WARN = {
_OPTIONAL = {
"AIS_API_KEY": "AIS vessel streaming (ships layer will be empty without it)",
"GFW_API_TOKEN": "Global Fishing Watch fishing-vessel activity (fishing_activity layer)",
"LTA_ACCOUNT_KEY": "Singapore LTA traffic cameras (CCTV layer)",
"PUBLIC_API_KEY": "Optional client auth for public endpoints (recommended for exposed deployments)",
}
+34 -10
@@ -69,6 +69,11 @@ class DashboardData(TypedDict, total=False):
sar_scenes: List[Dict[str, Any]]
sar_anomalies: List[Dict[str, Any]]
sar_aoi_coverage: List[Dict[str, Any]]
road_corridor_trends: Dict[str, Any]
malware_threats: Dict[str, Any]
cyber_threats: Dict[str, Any]
scm_suppliers: Dict[str, Any]
telegram_osint: Dict[str, Any]
# In-memory store
@@ -119,6 +124,11 @@ latest_data: DashboardData = {
"sar_scenes": [],
"sar_anomalies": [],
"sar_aoi_coverage": [],
"road_corridor_trends": {"updated_at": None, "corridors": []},
"malware_threats": {"threats": [], "total": 0, "timestamp": None},
"cyber_threats": {"threats": [], "stats": {}},
"scm_suppliers": {"suppliers": [], "total": 0, "critical_count": 0},
"telegram_osint": {"posts": [], "total": 0, "geolocated": 0, "timestamp": None},
}
# Per-source freshness timestamps
@@ -230,27 +240,35 @@ _active_layers_version: int = 0
def bump_active_layers_version() -> None:
"""Increment the active-layer version when frontend toggles change response shape."""
global _active_layers_version
_active_layers_version += 1
with _data_lock:
_active_layers_version += 1
def get_active_layers_version() -> int:
"""Return the current active-layer version (for ETag generation)."""
return _active_layers_version
with _data_lock:
return _active_layers_version
def get_latest_data_subset(*keys: str) -> DashboardData:
"""Return a deep snapshot of only the requested top-level keys.
This avoids cloning the entire dashboard store for endpoints that only need
a small tier-specific subset. Deep copy ensures callers cannot mutate
nested structures (e.g. individual flight dicts) and affect the live store.
Grabs references under the lock, then deep-copies outside it so fetcher
writers are not blocked for the duration of a large clone (#375).
"""
with _data_lock:
snap: DashboardData = {}
for key in keys:
value = latest_data.get(key)
snap[key] = copy.deepcopy(value)
return snap
items = [(key, latest_data.get(key)) for key in keys]
snap: DashboardData = {}
for key, value in items:
snap[key] = copy.deepcopy(value)
return snap
def get_latest_data_deepcopy_snapshot() -> DashboardData:
"""Deep-copy the full dashboard for legacy /api/live-data consumers."""
with _data_lock:
items = list(latest_data.items())
return {key: copy.deepcopy(value) for key, value in items}
def get_latest_data_subset_refs(*keys: str) -> DashboardData:
@@ -320,6 +338,12 @@ active_layers: dict[str, bool] = {
"ai_intel": True,
"crowdthreat": False,
"sar": True,
"road_corridor_trends": False,
"malware_c2": False,
"submarine_cables": False,
"scm_suppliers": False,
"cyber_threats": False,
"telegram_osint": True,
}
@@ -38,8 +38,6 @@ _S3_NS = "{http://s3.amazonaws.com/doc/2006-03-01/}"
_REFRESH_INTERVAL_S = 5 * 24 * 3600
_LIST_TIMEOUT_S = 30
_DOWNLOAD_TIMEOUT_S = 600
from services.network_utils import DEFAULT_USER_AGENT as _USER_AGENT
_lock = threading.RLock()
_aircraft_by_hex: dict[str, dict[str, str]] = {}
_last_refresh = 0.0
@@ -0,0 +1,290 @@
"""AISHub REST fallback for ship tracking when AISStream is unreachable.
Background
----------
On 2026-05-23 ``stream.aisstream.io`` (the primary live AIS WebSocket feed)
went fully offline. The backend's only ship signal vanished. This module polls
``data.aishub.net``'s free REST API on a slow cadence (default 20 min) when
the WebSocket primary is disconnected, so the ships layer doesn't go fully
dark during upstream outages.
Why 20 minutes
--------------
AISHub's free tier is rate-limited and explicitly asks consumers to be
courteous. 20 minutes is well inside their limits, gives ships time to
move enough to look "alive" on the map, and won't drain their service.
Configurable via the ``AISHUB_POLL_INTERVAL_MINUTES`` env var (clamped to
[1, 360]).
Why slow vs primary
-------------------
This is degraded mode, not a replacement. A ship at 20 knots moves about
6.7 nautical miles in 20 minutes: visible on the map, but coarser than the
real-time WebSocket signal. When AISStream comes back online, the
WebSocket data will overwrite these records via the same ``_vessels``
dict and ``source`` will flip from ``"aishub"`` back to upstream-live.
Opt-in
------
Operator must set ``AISHUB_USERNAME`` (free registration at
https://www.aishub.net/api). If unset, this fetcher is a no-op.
"""
from __future__ import annotations
import json
import logging
import os
import time
from typing import Any
from services.network_utils import fetch_with_curl
logger = logging.getLogger(__name__)
AISHUB_URL = "https://data.aishub.net/ws.php"
def aishub_username() -> str:
return str(os.environ.get("AISHUB_USERNAME", "")).strip()
def aishub_fallback_enabled() -> bool:
"""Returns True only when the operator has registered with AISHub and
set ``AISHUB_USERNAME``. The presence of the username is the opt-in."""
return bool(aishub_username())
def aishub_poll_interval_minutes() -> int:
"""Default 20 minutes. Clamped to [1, 360] so a hostile or
misconfigured env var can neither hammer the upstream nor silence the
fallback for a day."""
raw = os.environ.get("AISHUB_POLL_INTERVAL_MINUTES", "20")
try:
value = int(str(raw).strip())
except (TypeError, ValueError):
value = 20
return max(1, min(360, value))
def _should_run_fallback() -> bool:
"""Only run when the primary WebSocket is disconnected. Avoids stomping
over fresher live data when AISStream is healthy.
Returns False if:
* AISHub isn't configured (no username)
* AISStream primary is currently connected (recent vessel messages)
Returns True whenever the primary is not reporting a live connection.
That includes the case where the user set AISHUB_USERNAME but never set
AIS_API_KEY: the primary never runs, so AISHub serves as the only
source on its own slow cadence.
"""
if not aishub_fallback_enabled():
return False
try:
from services.ais_stream import ais_proxy_status
status = ais_proxy_status() or {}
except Exception:
return True # ais_stream not importable? still try AISHub.
# If the WebSocket primary is connected, skip the fallback — fresher
# data is already flowing.
if status.get("connected") is True:
return False
return True
def _parse_aishub_response(payload: str) -> list[dict]:
"""Parse the AISHub JSON response into a list of vessel records.
Successful response shape::
[
{"ERROR": false, "USERNAME": "...", "FORMAT": "1", "RECORDS": N},
[{"MMSI": ..., "LATITUDE": ..., "LONGITUDE": ..., ...}, ...]
]
Error response shape::
[{"ERROR": true, "ERROR_MESSAGE": "..."}]
Empty payload (e.g. silent rate-limit drop) returns ``[]``.
"""
if not payload or not payload.strip():
return []
try:
data = json.loads(payload)
except json.JSONDecodeError as e:
logger.warning("AISHub: response is not JSON: %s", e)
return []
if not isinstance(data, list) or not data:
return []
header = data[0] if isinstance(data[0], dict) else {}
if header.get("ERROR") is True:
logger.warning(
"AISHub: upstream error: %s",
header.get("ERROR_MESSAGE", "<unspecified>"),
)
return []
if len(data) < 2 or not isinstance(data[1], list):
return []
return [row for row in data[1] if isinstance(row, dict)]
def _normalize_record(row: dict) -> dict | None:
"""Map an AISHub vessel record to our internal vessel schema.
Returns None when the record can't be used (no MMSI, bad position,
sentinel "not available" lat/lng).
"""
try:
mmsi = int(row.get("MMSI") or 0)
except (TypeError, ValueError):
return None
if not mmsi:
return None
try:
lat = float(row.get("LATITUDE"))
lng = float(row.get("LONGITUDE"))
except (TypeError, ValueError):
return None
# AIS uses 91/181 as "no position available" sentinels. The range check
# rejects them along with any other out-of-range value, and also NaN.
if not (-90.0 <= lat <= 90.0 and -180.0 <= lng <= 180.0):
return None
# SOG raw 102.3 is "speed not available"; sanitize to 0.
try:
sog_raw = float(row.get("SOG") or 0)
except (TypeError, ValueError):
sog_raw = 0.0
sog = 0.0 if sog_raw >= 102.2 else sog_raw
try:
cog = float(row.get("COG") or 0)
except (TypeError, ValueError):
cog = 0.0
try:
heading_raw = int(row.get("HEADING") or 511)
except (TypeError, ValueError):
heading_raw = 511
# AIS heading sentinel 511 = "not available" — fall back to COG.
heading = heading_raw if heading_raw != 511 else cog
try:
ais_type = int(row.get("TYPE") or 0)
except (TypeError, ValueError):
ais_type = 0
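# Guard the IMO conversion like the other fields so one malformed value
# cannot raise out of the merge loop and drop the whole batch.
try:
imo = int(row.get("IMO") or 0)
except (TypeError, ValueError):
imo = 0
return {
"mmsi": mmsi,
"lat": lat,
"lng": lng,
"sog": sog,
"cog": cog,
"heading": heading,
"name": str(row.get("NAME") or "").strip() or "UNKNOWN",
"callsign": str(row.get("CALLSIGN") or "").strip(),
"destination": str(row.get("DEST") or "").strip().replace("@", "") or "",
"imo": imo,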
"ais_type_code": ais_type,
}
def fetch_aishub_vessels() -> int:
"""Poll AISHub and merge vessels into the shared ``_vessels`` store.
Returns the number of vessels updated (0 on skip, error, or no data).
Designed to be called by the APScheduler tier; see
``data_fetcher.py`` for the 20-minute interval job that wraps this.
"""
if not _should_run_fallback():
logger.debug("AISHub fallback skipped: primary connected or not configured")
return 0
username = aishub_username()
url = (
f"{AISHUB_URL}?username={username}&format=1&output=json"
f"&compress=0"
)
try:
response = fetch_with_curl(url, timeout=30)
except Exception as e:
logger.warning("AISHub fetch failed: %s", e)
return 0
if not response or response.status_code != 200:
logger.warning(
"AISHub HTTP %s",
getattr(response, "status_code", "None"),
)
return 0
rows = _parse_aishub_response(getattr(response, "text", "") or "")
if not rows:
return 0
# Inline imports to avoid a circular dependency at module load time
# (ais_stream imports lots of things and is loaded by main.py).
from services.ais_stream import (
_vessels,
_vessels_lock,
_record_vessel_trail_locked,
classify_vessel,
get_country_from_mmsi,
)
now = time.time()
count = 0
with _vessels_lock:
for row in rows:
normalized = _normalize_record(row)
if normalized is None:
continue
mmsi = normalized["mmsi"]
vessel = _vessels.setdefault(mmsi, {"mmsi": mmsi})
# Don't overwrite fresher live data: if the WebSocket pushed an
# update for this MMSI more recently than now-1s (race during
# the brief reconnection window) keep the live one.
last = float(vessel.get("_updated") or 0)
if last > now - 1:
continue
vessel.update(
{
"lat": normalized["lat"],
"lng": normalized["lng"],
"sog": normalized["sog"],
"cog": normalized["cog"],
"heading": normalized["heading"],
"_updated": now,
"source": "aishub",
}
)
if normalized["name"] and normalized["name"] != "UNKNOWN":
vessel["name"] = normalized["name"]
if normalized["callsign"]:
vessel["callsign"] = normalized["callsign"]
if normalized["destination"]:
vessel["destination"] = normalized["destination"]
if normalized["imo"]:
vessel["imo"] = normalized["imo"]
if normalized["ais_type_code"]:
vessel["ais_type_code"] = normalized["ais_type_code"]
vessel["type"] = classify_vessel(normalized["ais_type_code"], mmsi)
if not vessel.get("country"):
vessel["country"] = get_country_from_mmsi(mmsi)
_record_vessel_trail_locked(
mmsi,
normalized["lat"],
normalized["lng"],
normalized["sog"],
now,
)
count += 1
if count:
logger.info(
"AISHub fallback: merged %d vessels (poll interval %d min)",
count,
aishub_poll_interval_minutes(),
)
return count
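The sentinel handling in this module, and the distance figure quoted in its docstring, can be sketched as small standalone helpers. The function names are hypothetical; the sentinel values (91/181 for position, 102.3 knots for speed, 511 for heading) are the standard AIS "not available" markers the code above already relies on.

```python
from typing import Tuple

SOG_CUTOFF = 102.2           # knots; anything at or above is treated as "not available"
HEADING_NOT_AVAILABLE = 511  # AIS true-heading sentinel


def valid_position(lat: float, lon: float) -> bool:
    """True for in-range coordinates; rejects the 91/181 sentinels and NaN."""
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0


def clean_motion(sog: float, cog: float, heading: int) -> Tuple[float, float]:
    """Return (speed_knots, heading_degrees) with sentinels replaced."""
    speed = 0.0 if sog >= SOG_CUTOFF else sog
    facing = cog if heading == HEADING_NOT_AVAILABLE else float(heading)
    return speed, facing


def distance_nm(speed_knots: float, minutes: float) -> float:
    """Distance at constant speed; one knot is one nautical mile per hour."""
    return speed_knots * minutes / 60.0
```

For example, `distance_nm(20, 20)` comes to roughly 6.7 nautical miles, which is the positional coarseness to expect between polls at the default interval.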
+62
@@ -0,0 +1,62 @@
"""CISA KEV + cyber threat stats (Osiris port)."""
from __future__ import annotations
import logging
from datetime import datetime, timezone
from typing import Any
from services.fetchers._store import _data_lock, _mark_fresh, is_any_active, latest_data
from services.network_utils import fetch_with_curl
logger = logging.getLogger(__name__)
def fetch_cyber_threats() -> dict[str, Any]:
if not is_any_active("cyber_threats"):
return latest_data.get("cyber_threats") or {"threats": [], "stats": {}}
results: dict[str, Any] = {"threats": [], "stats": {}, "timestamp": datetime.now(timezone.utc).isoformat()}
try:
resp = fetch_with_curl(
"https://www.cisa.gov/sites/default/files/feeds/known_exploited_vulnerabilities.json",
timeout=15,
)
if resp.status_code == 200:
data = resp.json()
vulns = data.get("vulnerabilities") or []
results["stats"]["cisa_total"] = len(vulns)
now = datetime.now(timezone.utc)
recent = []
for v in vulns:
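try:
added = datetime.fromisoformat(v.get("dateAdded", "").replace("Z", "+00:00"))
# A date-only "dateAdded" (YYYY-MM-DD) parses as a naive datetime;
# treat it as UTC so subtracting it from the aware ``now`` does not
# raise TypeError and silently skip every row.
if added.tzinfo is None:
added = added.replace(tzinfo=timezone.utc)
days = (now - added).total_seconds() / 86400
except Exception:
continue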
if days <= 30:
recent.append(v)
recent = recent[:10]
results["threats"] = [
{
"id": v.get("cveID"),
"name": v.get("vulnerabilityName"),
"vendor": v.get("vendorProject"),
"product": v.get("product"),
"severity": "CRITICAL",
"date": v.get("dateAdded"),
"due": v.get("dueDate"),
"source": "CISA KEV",
}
for v in recent
]
except Exception as exc:
logger.warning("CISA KEV fetch failed: %s", exc)
count = len(results["threats"])
results["stats"]["active_cves"] = count
results["stats"]["threat_level"] = "CRITICAL" if count >= 8 else "HIGH" if count >= 4 else "ELEVATED"
with _data_lock:
latest_data["cyber_threats"] = results
_mark_fresh("cyber_threats")
return results
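The age calculation and threat-level thresholds used by `fetch_cyber_threats` can be exercised on their own. The sketch below assumes that `dateAdded` may arrive as a date-only string (which parses as a naive datetime and cannot be subtracted from an aware one); `days_since_added` and `threat_level` are hypothetical helper names introduced for illustration only.

```python
from __future__ import annotations

from datetime import datetime, timezone


def days_since_added(date_added: str, now: datetime) -> float | None:
    """Age of a KEV entry in days, or None when the date cannot be parsed."""
    try:
        added = datetime.fromisoformat(date_added.replace("Z", "+00:00"))
    except ValueError:
        return None
    if added.tzinfo is None:
        # Date-only strings parse as naive; assume UTC for the comparison.
        added = added.replace(tzinfo=timezone.utc)
    return (now - added).total_seconds() / 86400


def threat_level(active_cves: int) -> str:
    """Same thresholds as the fetcher: >= 8 CRITICAL, >= 4 HIGH, else ELEVATED."""
    if active_cves >= 8:
        return "CRITICAL"
    if active_cves >= 4:
        return "HIGH"
    return "ELEVATED"


now = datetime(2026, 6, 15, tzinfo=timezone.utc)
age_date_only = days_since_added("2026-06-05", now)
age_with_zone = days_since_added("2026-06-05T00:00:00Z", now)
age_bad = days_since_added("", now)
```

Returning `None` for unparseable dates keeps the skip decision with the caller instead of hiding it inside a broad `except`.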
+241 -70
@@ -692,7 +692,8 @@ _NUFORC_TILESET = "nuforc.cmm18aqea06bu1mmselhpnano-0ce5v"
_NUFORC_TOKEN = os.environ.get("NUFORC_MAPBOX_TOKEN", "").strip()
_NUFORC_RADIUS_M = 200_000 # 200 km query radius
_NUFORC_LIMIT = 50 # max features per tilequery call
_NUFORC_RECENT_DAYS = int(os.environ.get("NUFORC_RECENT_DAYS", "60"))
# Rolling window shown on the map (~2 calendar months). Override via NUFORC_RECENT_DAYS.
_NUFORC_RECENT_DAYS = max(1, int(os.environ.get("NUFORC_RECENT_DAYS", "60")))
_NUFORC_HF_FALLBACK_LIMIT = max(25, int(os.environ.get("NUFORC_HF_FALLBACK_LIMIT", "250")))
_NUFORC_HF_GEOCODE_LIMIT = max(25, int(os.environ.get("NUFORC_HF_GEOCODE_LIMIT", "150")))
_NUFORC_GEOCODE_WORKERS = max(1, int(os.environ.get("NUFORC_GEOCODE_WORKERS", "1")))
@@ -700,6 +701,12 @@ _NUFORC_GEOCODE_WORKERS = max(1, int(os.environ.get("NUFORC_GEOCODE_WORKERS", "1
# practice, so a 0.3s spacing keeps us well under any soft throttle while
# still rebuilding a full 12-month window in ~10 minutes.
_NUFORC_GEOCODE_SPACING_S = float(os.environ.get("NUFORC_GEOCODE_SPACING_S", "0.3"))
# Disk cache TTL — match the weekly scheduler so restarts between fetches still
# serve the same rolling 60-day snapshot without hammering nuforc.org daily.
_NUFORC_CACHE_TTL_S = max(
3600,
int(os.environ.get("NUFORC_CACHE_TTL_HOURS", "168")) * 3600,
)
_NUFORC_DATA_DIR = Path(__file__).resolve().parent.parent.parent / "data"
_NUFORC_SIGHTINGS_CACHE_FILE = _NUFORC_DATA_DIR / "nuforc_recent_sightings.json"
_NUFORC_LOCATION_CACHE_FILE = _NUFORC_DATA_DIR / "nuforc_location_cache.json"
@@ -766,6 +773,35 @@ def _fetch_nuforc_tilequery(lng: float, lat: float) -> list[dict]:
return []
def _uap_cutoff_date_str() -> str:
return (datetime.utcnow() - timedelta(days=_NUFORC_RECENT_DAYS)).strftime("%Y-%m-%d")
def _uap_sighting_date_str(sighting: dict) -> str | None:
"""Normalize a sighting row to YYYY-MM-DD for window filtering."""
from services.fetchers.nuforc_enrichment import _parse_date
raw = str(sighting.get("date_time") or sighting.get("occurred") or "").strip()
if not raw:
return None
parsed = _parse_date(raw)
if parsed:
return parsed
if len(raw) >= 10 and raw[4] == "-" and raw[7] == "-":
return raw[:10]
return None
def _filter_uap_sightings_recent(sightings: list[dict]) -> list[dict]:
"""Drop anything outside the rolling NUFORC_RECENT_DAYS window."""
cutoff = _uap_cutoff_date_str()
return [
sighting
for sighting in sightings
if (_uap_sighting_date_str(sighting) or "") >= cutoff
]
def _parse_nuforc_tile_date(value: str) -> datetime | None:
raw = str(value or "").strip()
if not raw:
@@ -802,19 +838,41 @@ def _load_nuforc_sightings_cache(*, force_refresh: bool = False) -> list[dict] |
built_dt = datetime.fromisoformat(built) if built else None
if built_dt is None:
return None
if (datetime.utcnow() - built_dt).total_seconds() > 86400:
if (datetime.utcnow() - built_dt).total_seconds() > _NUFORC_CACHE_TTL_S:
return None
if raw.get("cutoff_days") != _NUFORC_RECENT_DAYS:
logger.info(
"UAP sightings: cache cutoff_days mismatch (%s != %s); rebuilding",
raw.get("cutoff_days"),
_NUFORC_RECENT_DAYS,
)
return None
sightings = raw.get("sightings")
if isinstance(sightings, list):
if len(sightings) <= 0:
logger.info("UAP sightings: cache is fresh but empty; rebuilding")
return None
filtered = _filter_uap_sightings_recent(sightings)
if not filtered:
logger.warning(
"UAP sightings: cache had %d rows but none within last %d days; rebuilding",
len(sightings),
_NUFORC_RECENT_DAYS,
)
return None
if len(filtered) < len(sightings):
logger.info(
"UAP sightings: dropped %d stale cached rows outside %d-day window",
len(sightings) - len(filtered),
_NUFORC_RECENT_DAYS,
)
logger.info(
"UAP sightings: loaded %d cached reports from %s",
len(sightings),
"UAP sightings: loaded %d cached reports from %s (within %d-day window)",
len(filtered),
built,
_NUFORC_RECENT_DAYS,
)
return sightings
return filtered
except Exception as e:
logger.warning("UAP sightings: cache load error: %s", e)
return None
@@ -828,6 +886,7 @@ def _save_nuforc_sightings_cache(sightings: list[dict]) -> None:
_NUFORC_DATA_DIR.mkdir(parents=True, exist_ok=True)
payload = {
"built": datetime.utcnow().isoformat(),
"cutoff_days": _NUFORC_RECENT_DAYS,
"count": len(sightings),
"sightings": sightings,
}
@@ -1035,27 +1094,128 @@ def _nuforc_months_for_window(days: int) -> list[str]:
return months
def _nuforc_fetch_month_live(yyyymm: str, cookie_jar: Path) -> list[dict]:
"""Pull one month of NUFORC sightings via the live wpDataTables AJAX.
Returns a list of raw row dicts with the fields we care about:
id, occurred (YYYY-MM-DD), posted (YYYY-MM-DD), city, state, country,
shape_raw, summary, explanation. Returns an empty list on any failure; the
caller decides whether a failure is fatal.
"""
def _parse_nuforc_live_datatables_rows(raw_rows: list) -> list[dict]:
"""Parse wpDataTables ``data`` array into normalized row dicts."""
from services.fetchers.nuforc_enrichment import _parse_date
curl_bin = shutil.which("curl") or "curl"
out: list[dict] = []
for raw in raw_rows:
if not isinstance(raw, list) or len(raw) < 8:
continue
link_html = str(raw[0] or "")
occurred_raw = str(raw[1] or "")
city = str(raw[2] or "").strip()
state = str(raw[3] or "").strip()
country = str(raw[4] or "").strip()
shape_raw = (str(raw[5] or "").strip() or "Unknown")
summary = str(raw[6] or "").strip()
reported_raw = str(raw[7] or "")
explanation = str(raw[9] or "").strip() if len(raw) > 9 and raw[9] else ""
occurred_ymd = _parse_date(occurred_raw)
if not occurred_ymd:
continue
if not city and not state and not country:
continue
id_match = _NUFORC_LIVE_SIGHTING_ID_RE.search(link_html)
if id_match:
sighting_id = f"NUFORC-{id_match.group(1)}"
else:
digest = hashlib.sha1(
f"{occurred_ymd}|{city}|{state}|{summary}".encode("utf-8", "ignore")
).hexdigest()[:12]
sighting_id = f"NUFORC-{digest}"
if summary and len(summary) > 280:
summary = summary[:277] + "..."
if not summary:
summary = "Sighting reported"
out.append({
"id": sighting_id,
"occurred": occurred_ymd,
"posted": _parse_date(reported_raw) or occurred_ymd,
"city": city,
"state": state,
"country": country,
"shape_raw": shape_raw,
"summary": summary,
"explanation": explanation,
})
return out
def _nuforc_fetch_month_live_requests(yyyymm: str) -> list[dict]:
"""Live NUFORC month fetch via requests (Windows-safe when curl is disabled)."""
import requests
index_url = _NUFORC_LIVE_INDEX_URL.format(yyyymm=yyyymm)
ajax_url = _NUFORC_LIVE_AJAX_URL.format(yyyymm=yyyymm)
if not external_curl_fallback_enabled():
headers = {"User-Agent": _nuforc_live_user_agent()}
session = requests.Session()
session.headers.update(headers)
try:
index_res = session.get(index_url, timeout=60)
except requests.RequestException as e:
logger.warning("NUFORC live (requests): index fetch failed for %s: %s", yyyymm, e)
return []
if index_res.status_code != 200 or not index_res.text:
logger.warning(
"NUFORC live: external curl disabled on Windows for %s; "
"set SHADOWBROKER_ENABLE_WINDOWS_CURL_FALLBACK=1 to opt in.",
"NUFORC live (requests): index HTTP %s for %s",
index_res.status_code,
yyyymm,
)
return []
nonce_match = _NUFORC_LIVE_NONCE_RE.search(index_res.text)
if not nonce_match:
logger.warning("NUFORC live (requests): wdtNonce not found for %s", yyyymm)
return []
nonce = nonce_match.group(1)
post_data = (
"draw=1"
"&columns%5B0%5D%5Bdata%5D=0&columns%5B0%5D%5Bsearchable%5D=true&columns%5B0%5D%5Borderable%5D=false"
"&columns%5B1%5D%5Bdata%5D=1&columns%5B1%5D%5Bsearchable%5D=true&columns%5B1%5D%5Borderable%5D=true"
"&order%5B0%5D%5Bcolumn%5D=1&order%5B0%5D%5Bdir%5D=desc"
"&start=0&length=-1"
"&search%5Bvalue%5D=&search%5Bregex%5D=false"
f"&wdtNonce={nonce}"
)
try:
ajax_res = session.post(
ajax_url,
data=post_data,
headers={
**headers,
"Referer": index_url,
"X-Requested-With": "XMLHttpRequest",
"Content-Type": "application/x-www-form-urlencoded",
},
timeout=120,
)
except requests.RequestException as e:
logger.warning("NUFORC live (requests): ajax failed for %s: %s", yyyymm, e)
return []
if ajax_res.status_code != 200 or not ajax_res.text:
logger.warning(
"NUFORC live (requests): ajax HTTP %s for %s",
ajax_res.status_code,
yyyymm,
)
return []
try:
payload = ajax_res.json()
except json.JSONDecodeError as e:
logger.warning("NUFORC live (requests): ajax JSON decode failed for %s: %s", yyyymm, e)
return []
return _parse_nuforc_live_datatables_rows(payload.get("data") or [])
def _nuforc_fetch_month_live_curl(yyyymm: str, cookie_jar: Path) -> list[dict]:
"""Pull one month of NUFORC sightings via curl + wpDataTables AJAX."""
curl_bin = shutil.which("curl") or "curl"
index_url = _NUFORC_LIVE_INDEX_URL.format(yyyymm=yyyymm)
ajax_url = _NUFORC_LIVE_AJAX_URL.format(yyyymm=yyyymm)
# Step 1: GET the month index to capture session cookies + fresh nonce.
try:
@@ -1125,65 +1285,27 @@ def _nuforc_fetch_month_live(yyyymm: str, cookie_jar: Path) -> list[dict]:
logger.warning("NUFORC live: ajax JSON decode failed for %s: %s", yyyymm, e)
return []
raw_rows = payload.get("data") or []
out: list[dict] = []
for raw in raw_rows:
if not isinstance(raw, list) or len(raw) < 8:
continue
link_html = str(raw[0] or "")
occurred_raw = str(raw[1] or "")
city = str(raw[2] or "").strip()
state = str(raw[3] or "").strip()
country = str(raw[4] or "").strip()
shape_raw = (str(raw[5] or "").strip() or "Unknown")
summary = str(raw[6] or "").strip()
reported_raw = str(raw[7] or "")
explanation = str(raw[9] or "").strip() if len(raw) > 9 and raw[9] else ""
return _parse_nuforc_live_datatables_rows(payload.get("data") or [])
occurred_ymd = _parse_date(occurred_raw)
if not occurred_ymd:
continue
if not city and not state and not country:
continue
id_match = _NUFORC_LIVE_SIGHTING_ID_RE.search(link_html)
if id_match:
sighting_id = f"NUFORC-{id_match.group(1)}"
else:
digest = hashlib.sha1(
f"{occurred_ymd}|{city}|{state}|{summary}".encode("utf-8", "ignore")
).hexdigest()[:12]
sighting_id = f"NUFORC-{digest}"
if summary and len(summary) > 280:
summary = summary[:277] + "..."
if not summary:
summary = "Sighting reported"
out.append({
"id": sighting_id,
"occurred": occurred_ymd,
"posted": _parse_date(reported_raw) or occurred_ymd,
"city": city,
"state": state,
"country": country,
"shape_raw": shape_raw,
"summary": summary,
"explanation": explanation,
})
return out
def _nuforc_fetch_month_live(yyyymm: str, cookie_jar: Path) -> list[dict]:
"""Pull one month of NUFORC sightings via live wpDataTables AJAX."""
if external_curl_fallback_enabled():
rows = _nuforc_fetch_month_live_curl(yyyymm, cookie_jar)
if rows:
return rows
return _nuforc_fetch_month_live_requests(yyyymm)
def _build_recent_uap_sightings() -> list[dict]:
"""Build the rolling 1-year UAP sightings layer from live NUFORC data.
"""Build the rolling UAP sightings layer from live NUFORC data.
Hits nuforc.org's public sub-index once per month in the window, drops
anything outside the exact day-precision cutoff, dedupes by sighting id,
geocodes city+state via the existing location cache, and returns rows
keyed to the same schema the frontend already renders.
"""
cutoff_dt = datetime.utcnow() - timedelta(days=_NUFORC_RECENT_DAYS)
cutoff_str = cutoff_dt.strftime("%Y-%m-%d")
cutoff_str = _uap_cutoff_date_str()
months = _nuforc_months_for_window(_NUFORC_RECENT_DAYS)
try:
@@ -1383,10 +1505,21 @@ def _build_uap_sightings_from_hf_mirror() -> list[dict]:
This is a resilience fallback for local/Windows runs where nuforc.org is
Cloudflare-gated and the Mapbox token is not configured. It is not as fresh
as the live NUFORC AJAX feed, but it keeps the layer visible and cached.
Date-cutoff guard: the kcimc/NUFORC HF dataset is a static snapshot whose
maintainer refreshes it sporadically. Without a cutoff, sorting by
occurred-desc and taking the top N rows returns whatever the mirror's
newest rows happen to be, which can be years old if the snapshot is
stale. We apply the same ``_NUFORC_RECENT_DAYS`` window the live path
uses (60 days by default). If the HF mirror has nothing inside the window we return
``[]`` rather than silently serving 3-year-old "newest" rows.
"""
from services.fetchers.nuforc_enrichment import _HF_CSV_URL, _parse_date
from services.geocode_validate import coord_in_country
cutoff_dt = datetime.utcnow() - timedelta(days=_NUFORC_RECENT_DAYS)
cutoff_str = cutoff_dt.strftime("%Y-%m-%d")
try:
response = fetch_with_curl(_HF_CSV_URL, timeout=180, follow_redirects=True)
if not response or response.status_code != 200:
@@ -1400,6 +1533,7 @@ def _build_uap_sightings_from_hf_mirror() -> list[dict]:
return []
candidates: list[dict] = []
stale_rows_dropped = 0
try:
reader = csv.DictReader(io.StringIO(response.text))
for row in reader:
@@ -1410,6 +1544,9 @@ def _build_uap_sightings_from_hf_mirror() -> list[dict]:
)
if not occurred:
continue
if occurred < cutoff_str:
stale_rows_dropped += 1
continue
raw_location = _normalize_uap_location(
row.get("Location", "")
or row.get("City", "")
@@ -1444,6 +1581,19 @@ def _build_uap_sightings_from_hf_mirror() -> list[dict]:
logger.warning("UAP sightings: HF fallback parse failed: %s", e)
return []
if not candidates:
# HF mirror returned rows, but none inside the rolling window. This is
# the smoking gun for "the public HF dataset hasn't been refreshed in
# years" — log loudly so the operator sees it instead of guessing.
logger.error(
"UAP sightings: HF fallback yielded 0 rows within last %d days "
"(dropped %d stale rows). HF mirror is likely stale; the layer "
"will be empty until the live NUFORC path recovers.",
_NUFORC_RECENT_DAYS,
stale_rows_dropped,
)
return []
candidates.sort(key=lambda row: (row["occurred"], row["posted"], row["id"]), reverse=True)
candidates = candidates[:_NUFORC_HF_FALLBACK_LIMIT]
@@ -1502,11 +1652,12 @@ def _build_uap_sightings_from_hf_mirror() -> list[dict]:
@with_retry(max_retries=1, base_delay=5)
def fetch_uap_sightings(*, force_refresh: bool = False):
"""Fetch last-year UAP sightings from NUFORC.
"""Fetch rolling-window UAP sightings from live NUFORC.
Startup reads the cached daily snapshot when it is still fresh. The daily
scheduler forces a rebuild so this layer updates once per day instead of
churning continuously.
Startup reads the cached snapshot when still within NUFORC_CACHE_TTL_HOURS
(default 168h / one week). The weekly scheduler forces a rebuild so every
install refreshes the same ~60-day layer without daily load on nuforc.org.
Operators can also POST /api/refresh (admin) to pull immediately.
"""
from services.fetchers._store import is_any_active
@@ -1515,13 +1666,32 @@ def fetch_uap_sightings(*, force_refresh: bool = False):
sightings = _load_nuforc_sightings_cache(force_refresh=force_refresh)
if sightings is None:
live_error: Exception | None = None
try:
sightings = _build_recent_uap_sightings()
except Exception as e:
live_error = e
logger.warning("UAP sightings: live NUFORC rebuild failed, using fallback: %s", e)
sightings = _build_uap_sightings_from_hf_mirror()
if sightings:
_save_nuforc_sightings_cache(sightings)
elif live_error is not None:
# Both paths failed: live raised AND HF fallback returned empty
# (either the HF mirror is stale beyond the cutoff or the network
# is gone entirely). The previous code silently set the layer to
# ``[]`` and kept marking it fresh; that masked the failure for
# days. Surface it via assert_canary so the health registry shows
# the layer as broken instead of "fresh and empty".
from services.slo import assert_canary
assert_canary("uap_sightings", 0)
logger.error(
"UAP sightings: both live NUFORC and HF fallback produced 0 "
"rows; layer is unavailable. Live error: %s",
live_error,
)
if sightings:
sightings = _filter_uap_sightings_recent(sightings)
with _data_lock:
latest_data["uap_sightings"] = sightings or []
@@ -1529,6 +1699,7 @@ def fetch_uap_sightings(*, force_refresh: bool = False):
_mark_fresh("uap_sightings")
return
# Unreachable legacy Mapbox tilequery path (kept for reference).
cutoff = datetime.utcnow() - timedelta(days=_NUFORC_RECENT_DAYS)
# Query the grid concurrently (up to 8 threads)
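The rolling-window filter in this file relies on the fact that `YYYY-MM-DD` strings sort lexicographically in date order, so the cutoff can be compared as a plain string. A minimal sketch of that idea follows; `cutoff_date_str` and `filter_recent` are hypothetical stand-ins for `_uap_cutoff_date_str` and `_filter_uap_sightings_recent`, and they assume rows already carry a normalized `occurred` field.

```python
from datetime import datetime, timedelta, timezone


def cutoff_date_str(recent_days, now=None):
    """Return the oldest date (YYYY-MM-DD) still inside the rolling window."""
    now = now or datetime.now(timezone.utc)
    return (now - timedelta(days=recent_days)).strftime("%Y-%m-%d")


def filter_recent(sightings, recent_days, now=None):
    """Keep rows whose ``occurred`` date is on or after the cutoff.

    Rows with a missing date compare as "" and are therefore dropped.
    """
    cutoff = cutoff_date_str(recent_days, now)
    return [s for s in sightings if (s.get("occurred") or "")[:10] >= cutoff]


now = datetime(2026, 6, 15, tzinfo=timezone.utc)
rows = [
    {"id": "a", "occurred": "2026-05-01"},
    {"id": "b", "occurred": "2026-04-16"},  # exactly on the cutoff
    {"id": "c", "occurred": "2023-01-01"},  # stale mirror row
    {"id": "d"},                            # no date at all
]
kept = [r["id"] for r in filter_recent(rows, 60, now)]
```

Applying the same filter on cache load, on the fallback path, and before publishing means a stale snapshot cannot leak old rows into the layer through any single path.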
@@ -0,0 +1,148 @@
"""Per-aircraft observation tracking for cumulative fuel/CO2 estimates.
Background
----------
The pre-existing emissions enrichment attached a *rate* to each flight
(GPH and kg/hr) based on aircraft model. Users reasonably wanted the
running total: how much fuel HAS this plane burned since we started
seeing it? Multiplying the rate by elapsed observation time gets us
there, but it requires somewhere to remember "when did this icao24
first appear on our radar?"
Why this lives outside ``flight_trails``
----------------------------------------
``flight_trails`` is sized and pruned aggressively for map rendering
(5-minute TTL for untracked aircraft, 200 trail points max). That's
wrong for cumulative burn: if a plane has been airborne 2 hours but
its trail was pruned 30 min in, the "first trail point" timestamp is
30 min ago, not 2h ago. Worse, when the trail expires and re-creates,
the cumulative counter would reset mid-flight.
This module tracks observation lifecycle separately:
* When a hex is first observed: start a new flight session.
* While observed regularly (gap < ``REOPEN_GAP_S``): keep accumulating.
* When unseen for longer than ``REOPEN_GAP_S``: treat next sighting as
a new session (the plane landed and took off again, or it's a
different leg). Reset ``first_seen_at``.
* Stale sessions are pruned every ``PRUNE_INTERVAL_S`` so memory stays
bounded.
The user explicitly asked for this counting semantic: "as soon as a
plane appears there should be a counter that keeps a running count of
the fuel being burned... If there is no estimate take off time then it
can just be from the time the server starts to keep a log of whats in
the air."
"""
from __future__ import annotations
import threading
import time
# Gap between sightings that resets the session. ADS-B refreshes the
# whole aircraft list every minute or two, so anything over a few
# minutes means the plane left our coverage window (landed, transit
# through dead zone, etc). 15 minutes is conservative.
REOPEN_GAP_S = 15 * 60
# Don't accumulate runaway memory: drop entries unseen for an hour.
PRUNE_AFTER_S = 60 * 60
# Cap on accumulated airtime per session so a single bug elsewhere
# (e.g. ts clock skew) can't produce comically large numbers.
MAX_SESSION_SECONDS = 24 * 3600 # 24h — longest realistic civilian leg
_observations: dict[str, dict[str, float]] = {}
_lock = threading.Lock()
_last_prune_at = 0.0
def record_observation(icao_hex: str, *, now: float | None = None) -> int:
"""Record a sighting of ``icao_hex`` and return airtime so far (seconds).
Returns 0 for the first-ever sighting (no elapsed time yet) or when
``icao_hex`` is falsy. The caller can multiply the returned seconds
by ``rate_per_hour / 3600`` to get cumulative consumption.
"""
if not icao_hex:
return 0
key = str(icao_hex).strip().lower()
if not key:
return 0
current = float(now if now is not None else time.time())
with _lock:
entry = _observations.get(key)
if entry is None:
_observations[key] = {"first_seen_at": current, "last_seen_at": current}
return 0
# Use explicit ``is None`` checks instead of ``or`` short-circuit:
# ``0.0`` is a legitimate timestamp value (e.g. test fixtures
# seeding a far-past first_seen_at to exercise the clamp) but
# ``0.0 or fallback`` collapses to ``fallback`` because 0.0 is
# falsy. Bit me on my own test — leaving the safer form here.
last_raw = entry.get("last_seen_at")
last_seen = float(last_raw) if last_raw is not None else current
gap = current - last_seen
if gap > REOPEN_GAP_S:
# Treat as a new flight session — the plane landed/disappeared
# long enough that the prior cumulative count is no longer
# the same flight.
_observations[key] = {"first_seen_at": current, "last_seen_at": current}
return 0
first_raw = entry.get("first_seen_at")
first = float(first_raw) if first_raw is not None else current
# Clamp absurd values from clock skew or bad input.
elapsed = max(0, min(int(current - first), MAX_SESSION_SECONDS))
entry["last_seen_at"] = current
return elapsed
def prune(*, now: float | None = None) -> int:
"""Drop entries we haven't seen in ``PRUNE_AFTER_S`` seconds.
Returns number of entries dropped. Safe to call from a scheduler tick;
cheap (single dict scan) so cadence doesn't matter much.
"""
current = float(now if now is not None else time.time())
dropped = 0
with _lock:
stale_keys = []
for k, v in _observations.items():
last_raw = v.get("last_seen_at")
last = float(last_raw) if last_raw is not None else 0.0
if current - last > PRUNE_AFTER_S:
stale_keys.append(k)
for k in stale_keys:
del _observations[k]
dropped += 1
return dropped
def get_session_seconds(icao_hex: str, *, now: float | None = None) -> int:
"""Read-only accessor: airtime for a known icao without bumping last-seen.
Used by tests and external consumers (e.g. when rendering a snapshot
of all in-flight aircraft, you want the current value, not to update
last_seen_at as a side effect).
"""
if not icao_hex:
return 0
key = str(icao_hex).strip().lower()
with _lock:
entry = _observations.get(key)
if entry is None:
return 0
current = float(now if now is not None else time.time())
first_raw = entry.get("first_seen_at")
first = float(first_raw) if first_raw is not None else current
return max(0, min(int(current - first), MAX_SESSION_SECONDS))
def _reset_for_tests() -> None:
"""Drop all observations. Test helper only."""
with _lock:
_observations.clear()
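The session semantics described in the module docstring can be illustrated with a stripped-down, lock-free sketch. It is an assumption-laden simplification: `observe` and `cumulative` are hypothetical names, the caller passes the session dict explicitly, and only the gap reset and clamp are reproduced (no pruning, no thread safety).

```python
REOPEN_GAP_S = 15 * 60          # gap that starts a new session
MAX_SESSION_SECONDS = 24 * 3600  # clamp against clock skew


def observe(sessions, icao_hex, now):
    """Record a sighting and return the airtime (seconds) of the current session."""
    key = icao_hex.strip().lower()
    entry = sessions.get(key)
    if entry is None or now - entry["last_seen_at"] > REOPEN_GAP_S:
        # First sighting, or unseen for too long: start a fresh session.
        sessions[key] = {"first_seen_at": now, "last_seen_at": now}
        return 0
    entry["last_seen_at"] = now
    return max(0, min(int(now - entry["first_seen_at"]), MAX_SESSION_SECONDS))


def cumulative(rate_per_hour, observed_seconds):
    """Convert an hourly rate into an amount consumed over the observed time."""
    return round(rate_per_hour * observed_seconds / 3600.0, 1)


sessions = {}
t0 = observe(sessions, "ABC123", 0.0)
t1 = observe(sessions, "abc123", 600.0)          # same airframe, key is case-insensitive
t2 = observe(sessions, "abc123", 1200.0)
t3 = observe(sessions, "abc123", 1200.0 + 901)   # gap over 15 min resets the session
burned = cumulative(800.0, 1800)                 # 800 gal/h observed for 30 min
```

Because the first sighting returns 0, cumulative values stay at zero until the second refresh, which matches the behavior noted where the flights fetcher attaches emissions data.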
+123 -50
@@ -17,6 +17,7 @@ from services.network_utils import fetch_with_curl
from services.fetchers._store import latest_data, _data_lock, _mark_fresh
from services.fetchers.plane_alert import enrich_with_plane_alert, enrich_with_tracked_names
from services.fetchers.emissions import get_emissions_info
from services.fetchers.flight_observations import record_observation as _record_flight_observation
from services.fetchers.retry import with_retry
from services.fetchers.route_database import lookup_route
from services.fetchers.aircraft_database import lookup_aircraft_type
@@ -29,6 +30,88 @@ _RE_AIRLINE_CODE_1 = re.compile(r"^([A-Z]{3})\d")
_RE_AIRLINE_CODE_2 = re.compile(r"^([A-Z]{3})[A-Z\d]")
def detect_gps_jamming_zones(
raw_flights: list[dict],
*,
min_aircraft: int | None = None,
min_ratio: float | None = None,
nacp_threshold: int | None = None,
) -> list[dict]:
"""Detect GPS interference zones from a snapshot of raw ADS-B aircraft.
Methodology mirrors GPSJam.org / Flightradar24: bin aircraft into 1°x1°
grid cells, flag cells where the fraction of aircraft reporting degraded
NACp clears a threshold.
Inputs
------
raw_flights:
Iterable of dicts. Each item is expected to carry ``lat``, ``lng``
(or ``lon``), and ``nac_p``. Records missing position OR missing
``nac_p`` entirely (typical for OpenSky-sourced flights) are
skipped: absence of data isn't evidence of anything.
nac_p == 0 IS counted as degraded. Pre-fix code skipped it on the theory
that "0 = old transponder, never computed accuracy." That's only half
right: modern Mode-S Enhanced Surveillance transponders also fall back
to nac_p=0 when they lose GPS lock entirely, which is exactly the
jamming signature we're trying to detect. Filtering 0 out was discarding
the strongest evidence.
Denoising:
1. Require ``min_aircraft`` per grid cell for statistical validity.
2. Subtract 1 from degraded count per cell (GPSJam's technique) so
a single quirky transponder can't flag an entire zone.
3. Require ratio ``adjusted_degraded / total > min_ratio``.
All thresholds default to the module-level constants but can be
overridden for testing.
"""
min_aircraft = GPS_JAMMING_MIN_AIRCRAFT if min_aircraft is None else int(min_aircraft)
min_ratio = GPS_JAMMING_MIN_RATIO if min_ratio is None else float(min_ratio)
nacp_threshold = (
GPS_JAMMING_NACP_THRESHOLD if nacp_threshold is None else int(nacp_threshold)
)
jamming_grid: dict[str, dict[str, int]] = {}
for rf in raw_flights or []:
rlat = rf.get("lat")
rlng = rf.get("lng") if rf.get("lng") is not None else rf.get("lon")
if rlat is None or rlng is None:
continue
nacp = rf.get("nac_p")
if nacp is None:
continue
grid_key = f"{int(rlat)},{int(rlng)}"
cell = jamming_grid.setdefault(grid_key, {"degraded": 0, "total": 0})
cell["total"] += 1
if nacp < nacp_threshold:
cell["degraded"] += 1
jamming_zones: list[dict] = []
for gk, counts in jamming_grid.items():
if counts["total"] < min_aircraft:
continue
adjusted_degraded = max(counts["degraded"] - 1, 0)
if adjusted_degraded == 0:
continue
ratio = adjusted_degraded / counts["total"]
if ratio > min_ratio:
lat_i, lng_i = gk.split(",")
severity = "low" if ratio < 0.5 else "medium" if ratio < 0.75 else "high"
jamming_zones.append(
{
"lat": int(lat_i) + 0.5,
"lng": int(lng_i) + 0.5,
"severity": severity,
"ratio": round(ratio, 2),
"degraded": counts["degraded"],
"total": counts["total"],
}
)
return jamming_zones
# ---------------------------------------------------------------------------
# OpenSky Network API Client (OAuth2)
# ---------------------------------------------------------------------------
@@ -459,6 +542,18 @@ def _classify_and_publish(all_adsb_flights):
ac_category = "heli" if model_upper in _HELI_TYPES_BACKEND else "plane"
# Source attribution: prefer the explicit ``source`` tag stamped
# at fetch time (adsb.lol, OpenSky). If absent, fall back to the
# legacy ``supplemental_source`` (airplanes.live, adsb.fi) so
# supplementals are still attributed without changing their
# tagger. Final fallback "adsb.lol" preserves prior behavior for
# any caller that synthesizes records without going through one
# of our fetchers (e.g. tests).
source = (
f.get("source")
or f.get("supplemental_source")
or "adsb.lol"
)
flights.append(
{
"callsign": flight_str,
@@ -480,6 +575,7 @@ def _classify_and_publish(all_adsb_flights):
"airline_code": airline_code,
"aircraft_category": ac_category,
"nac_p": f.get("nac_p"),
"source": source,
}
)
except (ValueError, TypeError, KeyError, AttributeError) as loop_e:
@@ -506,6 +602,22 @@ def _classify_and_publish(all_adsb_flights):
if model:
emi = get_emissions_info(model)
if emi:
# Cumulative fuel/CO2: multiply the per-hour rate by how
# long we've been observing this airframe. Users want to
# see the *amount* burned, not just the rate. If we've
# never seen this hex before, observed_seconds is 0 and
# the cumulative values are 0 until the next refresh —
# the rate is still useful info on its own.
observed_seconds = _record_flight_observation(
f.get("icao24") or ""
)
elapsed_h = observed_seconds / 3600.0
emi = {
**emi,
"observed_seconds": observed_seconds,
"fuel_gallons_burned": round(emi["fuel_gph"] * elapsed_h, 1),
"co2_kg_emitted": round(emi["co2_kg_per_hour"] * elapsed_h, 1),
}
f["emissions"] = emi
callsign = f.get("callsign", "").strip().upper()
@@ -724,56 +836,8 @@ def _classify_and_publish(all_adsb_flights):
latest_data["military_flights"] = military_snapshot
# --- GPS Jamming Detection ---
# Uses NACp (Navigation Accuracy Category Position) from ADS-B to infer
# GPS interference zones, similar to GPSJam.org / Flightradar24.
# NACp < 8 = position accuracy worse than the FAA-mandated 0.05 NM.
#
# Denoising (to suppress false positives from old GA transponders):
# 1. Skip nac_p == 0 ("unknown accuracy") — old transponders that never
# computed accuracy, NOT evidence of jamming. Real jamming shows 1-7.
# 2. Require minimum aircraft per grid cell for statistical validity.
# 3. Subtract 1 from degraded count per cell (GPSJam's technique) so a
# single quirky transponder can't flag an entire zone.
# 4. Require the adjusted ratio to exceed the threshold.
try:
jamming_grid = {}
raw_flights = raw_flights_snapshot
for rf in raw_flights:
rlat = rf.get("lat")
rlng = rf.get("lng") or rf.get("lon")
if rlat is None or rlng is None:
continue
nacp = rf.get("nac_p")
if nacp is None or nacp == 0:
continue
grid_key = f"{int(rlat)},{int(rlng)}"
if grid_key not in jamming_grid:
jamming_grid[grid_key] = {"degraded": 0, "total": 0}
jamming_grid[grid_key]["total"] += 1
if nacp < GPS_JAMMING_NACP_THRESHOLD:
jamming_grid[grid_key]["degraded"] += 1
jamming_zones = []
for gk, counts in jamming_grid.items():
if counts["total"] < GPS_JAMMING_MIN_AIRCRAFT:
continue
adjusted_degraded = max(counts["degraded"] - 1, 0)
if adjusted_degraded == 0:
continue
ratio = adjusted_degraded / counts["total"]
if ratio > GPS_JAMMING_MIN_RATIO:
lat_i, lng_i = gk.split(",")
severity = "low" if ratio < 0.5 else "medium" if ratio < 0.75 else "high"
jamming_zones.append(
{
"lat": int(lat_i) + 0.5,
"lng": int(lng_i) + 0.5,
"severity": severity,
"ratio": round(ratio, 2),
"degraded": counts["degraded"],
"total": counts["total"],
}
)
jamming_zones = detect_gps_jamming_zones(raw_flights_snapshot)
with _data_lock:
latest_data["gps_jamming"] = jamming_zones
if jamming_zones:
@@ -849,7 +913,15 @@ def _fetch_adsb_lol_regions():
res = fetch_with_curl(url, timeout=10)
if res.status_code == 200:
data = res.json()
return data.get("ac", [])
aircraft = data.get("ac", [])
# Stamp the source at the fetch site so attribution survives
# the OpenSky/supplemental dedupe-by-hex merge downstream.
# Previously adsb.lol records carried no marker while OpenSky
# records got ``is_opensky: True`` — which made flight tooltips
# look like everything came from OpenSky.
for a in aircraft:
a["source"] = "adsb.lol"
return aircraft
except (
requests.RequestException,
ConnectionError,
@@ -932,6 +1004,7 @@ def _enrich_with_opensky_and_supplemental(adsb_flights):
"gs": (s[9] * 1.94384) if s[9] else 0,
"t": "Unknown",
"is_opensky": True,
"source": "OpenSky",
}
)
elif os_res.status_code == 429:
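The per-cell arithmetic behind `detect_gps_jamming_zones` is easy to check by hand. The sketch below reproduces only the denoising steps for a single grid cell; `cell_ratio` and `severity` are hypothetical helpers, and the default thresholds (NACp below 8, at least 3 aircraft) are assumptions for illustration, since the module-level constants are not shown in this diff.

```python
def cell_ratio(nac_p_values, nacp_threshold=8, min_aircraft=3):
    """Return the adjusted degraded ratio for one grid cell, or None if not flaggable."""
    # Aircraft that report no nac_p at all carry no evidence either way.
    reported = [n for n in nac_p_values if n is not None]
    total = len(reported)
    if total < min_aircraft:
        return None
    # nac_p == 0 counts as degraded, matching the updated detector.
    degraded = sum(1 for n in reported if n < nacp_threshold)
    # Subtract one so a single quirky transponder cannot flag the cell.
    adjusted = max(degraded - 1, 0)
    if adjusted == 0:
        return None
    return round(adjusted / total, 2)


def severity(ratio):
    """Same bands as the detector: < 0.5 low, < 0.75 medium, otherwise high."""
    return "low" if ratio < 0.5 else "medium" if ratio < 0.75 else "high"


busy_cell = cell_ratio([0, 3, 5, 10, None])  # 4 reporting, 3 degraded, adjusted 2
lone_outlier = cell_ratio([0, 10, 10])       # 1 degraded, adjusted 0
sparse_cell = cell_ratio([0, 0])             # below min_aircraft
```

The caller would still compare the returned ratio against the minimum-ratio threshold before emitting a zone; that comparison is left out here to keep the arithmetic visible.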
+47 -17
@@ -20,17 +20,9 @@ def _env_flag(name: str) -> str:
def liveuamap_scraper_enabled() -> bool:
"""Return whether the Playwright-based LiveUAMap scraper should run.
from services.liveuamap_settings import liveuamap_scraper_enabled as _enabled
It is useful enrichment, but it starts a browser/Node driver and must not be
allowed to destabilize Windows local startup.
"""
setting = _env_flag("SHADOWBROKER_ENABLE_LIVEUAMAP_SCRAPER")
if setting in {"1", "true", "yes", "on"}:
return True
if setting in {"0", "false", "no", "off"}:
return False
return os.name != "nt"
return _enabled()
# ---------------------------------------------------------------------------
@@ -210,10 +202,17 @@ def update_liveuamap():
if not is_any_active("global_incidents"):
return
if not liveuamap_scraper_enabled():
logger.info(
"Liveuamap scraper disabled for this runtime; set "
"SHADOWBROKER_ENABLE_LIVEUAMAP_SCRAPER=1 to opt in."
)
from services.liveuamap_settings import liveuamap_requires_ui_opt_in
if liveuamap_requires_ui_opt_in():
logger.info(
"Liveuamap scraper disabled: enable Global Incidents in the UI to "
"consent, or set SHADOWBROKER_ENABLE_LIVEUAMAP_SCRAPER=1."
)
else:
logger.info(
"Liveuamap scraper disabled; set SHADOWBROKER_ENABLE_LIVEUAMAP_SCRAPER=1 to opt in."
)
return
logger.info("Running scheduled Liveuamap scraper...")
try:
@@ -279,6 +278,16 @@ _FISHING_FETCH_INTERVAL_S = 3600 # once per hour — GFW data has ~5 day lag
_last_fishing_fetch_ts: float = 0.0
def _gfw_int_env(name: str, default: int, *, minimum: int = 1, maximum: int | None = None) -> int:
try:
value = int(os.environ.get(name, str(default)) or default)
except (TypeError, ValueError):
value = default
if maximum is not None:
value = min(maximum, value)
return max(minimum, value)
@with_retry(max_retries=1, base_delay=5)
def fetch_fishing_activity():
"""Fetch recent fishing events from Global Fishing Watch (~5 day lag)."""
@@ -301,10 +310,16 @@ def fetch_fishing_activity():
try:
import datetime as _dt
# GFW publishes with ~5 day lag; windows shorter than ~7 days often return 0 events.
lookback_days = _gfw_int_env("GFW_EVENTS_LOOKBACK_DAYS", 7, minimum=1, maximum=14)
max_pages = _gfw_int_env("GFW_EVENTS_MAX_PAGES", 10, minimum=1, maximum=100)
timeout_s = _gfw_int_env("GFW_EVENTS_TIMEOUT_S", 90, minimum=30, maximum=180)
_end = _dt.date.today().isoformat()
_start = (_dt.date.today() - _dt.timedelta(days=7)).isoformat()
page_size = max(1, int(os.environ.get("GFW_EVENTS_PAGE_SIZE", "500") or "500"))
_start = (_dt.date.today() - _dt.timedelta(days=lookback_days)).isoformat()
page_size = _gfw_int_env("GFW_EVENTS_PAGE_SIZE", 500, minimum=1, maximum=1000)
offset = 0
pages_fetched = 0
total_available: int | None = None
seen_offsets: set[int] = set()
seen_ids: set[str] = set()
headers = {"Authorization": f"Bearer {token}"}
@@ -325,7 +340,7 @@ def fetch_fishing_activity():
}
)
url = f"https://gateway.api.globalfishingwatch.org/v3/events?{query}"
response = fetch_with_curl(url, timeout=30, headers=headers)
response = fetch_with_curl(url, timeout=timeout_s, headers=headers)
if response.status_code != 200:
logger.warning(
"Fishing activity fetch failed at offset=%s: HTTP %s",
@@ -335,10 +350,16 @@ def fetch_fishing_activity():
break
payload = response.json() or {}
if total_available is None:
try:
total_available = int(payload.get("total")) if payload.get("total") is not None else None
except (TypeError, ValueError):
total_available = None
entries = payload.get("entries", [])
if not entries:
break
pages_fetched += 1
added_this_page = 0
for e in entries:
pos = e.get("position", {})
@@ -373,6 +394,15 @@ def fetch_fishing_activity():
if len(entries) < page_size:
break
if pages_fetched >= max_pages:
logger.info(
"Fishing activity: capped at %s pages (%s events fetched; GFW total=%s)",
max_pages,
len(events),
total_available if total_available is not None else "unknown",
)
break
next_offset = payload.get("nextOffset")
if next_offset is None:
next_offset = (payload.get("pagination") or {}).get("nextOffset")
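The clamped environment parsing introduced in this hunk can be tried in isolation. The sketch below mirrors the body of `_gfw_int_env`; the name `clamped_int_env` and the explicit `env` mapping argument are assumptions added for illustration, so the example does not depend on the real process environment.

```python
from __future__ import annotations

from typing import Mapping


def clamped_int_env(
    env: Mapping[str, str],
    name: str,
    default: int,
    *,
    minimum: int = 1,
    maximum: int | None = None,
) -> int:
    """Parse an integer setting, fall back to `default`, then clamp to the bounds."""
    try:
        # An empty string is treated the same as an unset variable.
        value = int(env.get(name, str(default)) or default)
    except (TypeError, ValueError):
        # Non-numeric input falls back to the default instead of raising.
        value = default
    if maximum is not None:
        value = min(maximum, value)
    return max(minimum, value)


# Hypothetical operator settings: one too large, one malformed, one too small.
settings = {
    "GFW_EVENTS_LOOKBACK_DAYS": "30",
    "GFW_EVENTS_PAGE_SIZE": "abc",
    "GFW_EVENTS_TIMEOUT_S": "5",
}
lookback_days = clamped_int_env(settings, "GFW_EVENTS_LOOKBACK_DAYS", 7, minimum=1, maximum=14)
page_size = clamped_int_env(settings, "GFW_EVENTS_PAGE_SIZE", 500, minimum=1, maximum=1000)
timeout_s = clamped_int_env(settings, "GFW_EVENTS_TIMEOUT_S", 90, minimum=30, maximum=180)
```

With these inputs the oversized lookback is capped at the maximum, the malformed page size falls back to its default, and the too-small timeout is raised to the minimum.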
+4 -4
@@ -235,11 +235,11 @@ _DC_GEOCODED_PATH = Path(__file__).parent.parent.parent / "data" / "datacenters_
def fetch_datacenters():
"""Load geocoded data centers (5K+ street-level precise locations)."""
from services.fetchers._store import is_any_active
"""Load geocoded data centers (5K+ street-level precise locations).
if not is_any_active("datacenters"):
return
Always loads from disk; /api/live-data/slow gates the payload on the
datacenters layer toggle so enabling the layer can render immediately.
"""
dcs = []
try:
if not _DC_GEOCODED_PATH.exists():
+107
@@ -0,0 +1,107 @@
"""Malware C2 / URLhaus feed (abuse.ch, Osiris port)."""
from __future__ import annotations
import logging
from datetime import datetime, timezone
from typing import Any
from services.fetchers._store import _data_lock, _mark_fresh, is_any_active, latest_data
from services.network_utils import fetch_with_curl
logger = logging.getLogger(__name__)
COUNTRY_CENTROIDS: dict[str, tuple[float, float]] = {
"AF": (65, 33), "AL": (20, 41), "DZ": (3, 28), "AR": (-64, -34), "AU": (134, -25),
"AT": (14, 47.5), "BE": (4, 50.8), "BR": (-51, -10), "CA": (-96, 62), "CN": (105, 35),
"DE": (10, 51), "FR": (2, 46), "GB": (-2, 54), "IN": (79, 22), "IR": (53, 32),
"IT": (12.5, 42.8), "JP": (138, 36), "KR": (128, 36), "MX": (-102, 23.5), "NL": (5.5, 52.5),
"PL": (19.5, 52), "RU": (100, 60), "SG": (103.8, 1.35), "TW": (121, 23.7), "UA": (32, 49),
"US": (-97, 38), "VN": (106, 16),
}
def fetch_malware_threats() -> list[dict[str, Any]]:
if not is_any_active("malware_c2"):
return latest_data.get("malware_threats") or []
threats: list[dict[str, Any]] = []
threat_id = 0
try:
resp = fetch_with_curl(
"https://feodotracker.abuse.ch/downloads/ipblocklist.json",
timeout=10,
headers={"User-Agent": "Shadowbroker/1.0", "Accept": "application/json"},
)
if resp.status_code == 200:
entries = resp.json()
if not isinstance(entries, list):
entries = []
for entry in entries[:200]:
cc = entry.get("country")
if not cc or cc not in COUNTRY_CENTROIDS:
continue
lng, lat = COUNTRY_CENTROIDS[cc]
j_lng = ((threat_id * 173.7) % 200 - 100) / 100 * 4
j_lat = ((threat_id * 293.1) % 200 - 100) / 100 * 4
threats.append(
{
"id": f"feodo-{threat_id}",
"lat": lat + j_lat,
"lng": lng + j_lng,
"ip": entry.get("ip_address") or "unknown",
"port": entry.get("dst_port") or 0,
"malware": entry.get("malware") or "unknown",
"status": entry.get("status") or "active",
"first_seen": entry.get("first_seen"),
"last_online": entry.get("last_online"),
"country": cc,
"threat_type": "botnet_c2",
}
)
threat_id += 1
except Exception as exc:
logger.warning("Feodo fetch failed: %s", exc)
try:
resp = fetch_with_curl(
"https://urlhaus-api.abuse.ch/v1/urls/recent/limit/100/",
timeout=8,
)
if resp.status_code == 200:
urls = (resp.json() or {}).get("urls") or []
for u in urls:
cc = u.get("country")
if not cc or cc not in COUNTRY_CENTROIDS:
cc = next(iter(COUNTRY_CENTROIDS))
lng, lat = COUNTRY_CENTROIDS[cc]
j_lng = ((threat_id * 137.3) % 200 - 100) / 100 * 5
j_lat = ((threat_id * 211.7) % 200 - 100) / 100 * 5
threats.append(
{
"id": f"urlhaus-{threat_id}",
"lat": lat + j_lat,
"lng": lng + j_lng,
"ip": u.get("host") or "unknown",
"port": 0,
"malware": ", ".join(u.get("tags") or []) or u.get("threat") or "malware",
"status": u.get("url_status") or "online",
"first_seen": u.get("dateadded"),
"country": cc,
"threat_type": "malware_url",
}
)
threat_id += 1
except Exception as exc:
logger.debug("URLhaus supplement failed: %s", exc)
payload = {
"threats": threats,
"total": len(threats),
"timestamp": datetime.now(timezone.utc).isoformat(),
"source": "abuse.ch Feodo Tracker + URLhaus",
}
with _data_lock:
latest_data["malware_threats"] = payload
_mark_fresh("malware_threats")
return threats
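The per-entry offsets in `fetch_malware_threats` are deterministic rather than random: the running `threat_id` is multiplied by a fixed constant and folded into a bounded range, so the same index always lands on the same map position. A minimal sketch of that arithmetic follows; the helper name `deterministic_jitter` is an assumption, while the constants are the ones used for the Feodo entries above.

```python
def deterministic_jitter(index: int, multiplier: float, spread_deg: float) -> float:
    """Map an index to a repeatable offset in [-spread_deg, spread_deg)."""
    # (index * multiplier) % 200 lies in [0, 200); shifting and scaling
    # turns that into a fraction in [-1, 1) of the requested spread.
    return ((index * multiplier) % 200 - 100) / 100 * spread_deg


# Offsets for the first 200 Feodo entries (longitude, latitude), in degrees.
offsets = [
    (deterministic_jitter(i, 173.7, 4), deterministic_jitter(i, 293.1, 4))
    for i in range(200)
]
```

Because the offsets depend only on the position of an entry in the feed, a marker keeps its place across refreshes as long as the feed order is stable, but it moves if entries are inserted ahead of it.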
+2 -2
@@ -188,8 +188,8 @@ def fetch_meshtastic_nodes():
callsign = ""
send_callsign_header = str(
_os.environ.get("MESHTASTIC_SEND_CALLSIGN_HEADER", "true")
).strip().lower() not in {"0", "false", "no", "off", ""}
_os.environ.get("MESHTASTIC_SEND_CALLSIGN_HEADER", "false")
).strip().lower() in {"1", "true", "yes", "on"}
# Round 7a: outbound_user_agent already includes the per-install handle.
# The optional Meshtastic callsign is appended as additional context so
+18 -1
@@ -7,6 +7,7 @@ import requests
from services.network_utils import fetch_with_curl
from services.fetchers._store import latest_data, _data_lock, _mark_fresh
from services.fetchers.emissions import get_emissions_info
from services.fetchers.flight_observations import record_observation as _record_flight_observation
from services.fetchers.plane_alert import enrich_with_plane_alert
logger = logging.getLogger("services.data_fetcher")
@@ -171,6 +172,7 @@ def fetch_military_flights():
h = a.get("hex", "").lower()
if h and h not in seen_hex:
seen_hex.add(h)
a["source"] = "adsb.lol"
all_mil_ac.append(a)
except Exception as e:
logger.warning(f"adsb.lol mil fetch failed: {e}")
@@ -182,6 +184,7 @@ def fetch_military_flights():
h = a.get("hex", "").lower()
if h and h not in seen_hex:
seen_hex.add(h)
a["source"] = "airplanes.live"
all_mil_ac.append(a)
logger.info(f"airplanes.live mil: +{len(resp2.json().get('ac', []))} raw, {len(all_mil_ac)} total unique")
except Exception as e:
@@ -234,6 +237,7 @@ def fetch_military_flights():
"registration": f.get("r", "N/A"),
"icao24": icao_hex,
"squawk": f.get("squawk", ""),
"source": f.get("source") or "adsb.lol",
})
continue
@@ -258,7 +262,8 @@ def fetch_military_flights():
"model": f.get("t", "Unknown"),
"icao24": icao_hex,
"speed_knots": speed_knots,
"squawk": f.get("squawk", "")
"squawk": f.get("squawk", ""),
"source": f.get("source") or "adsb.lol",
})
except Exception as loop_e:
logger.error(f"Mil flight interpolation error: {loop_e}")
@@ -296,6 +301,18 @@ def fetch_military_flights():
if model:
emissions = get_emissions_info(model)
if emissions:
# Cumulative fuel/CO2 since first observation — mirrors
# the civilian path in flights._classify_and_publish.
observed_seconds = _record_flight_observation(
mf.get("icao24") or ""
)
elapsed_h = observed_seconds / 3600.0
emissions = {
**emissions,
"observed_seconds": observed_seconds,
"fuel_gallons_burned": round(emissions["fuel_gph"] * elapsed_h, 1),
"co2_kg_emitted": round(emissions["co2_kg_per_hour"] * elapsed_h, 1),
}
mf["emissions"] = emissions
if mf.get("alert_category"):
mf["type"] = "tracked_flight"
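The cumulative figures added in this hunk are hourly rates multiplied by the observed time. A worked sketch of that arithmetic is shown below; the function name `cumulative_emissions` and the example rates are hypothetical, and only the keys `fuel_gph` and `co2_kg_per_hour` come from the diff.

```python
def cumulative_emissions(rates: dict, observed_seconds: float) -> dict:
    """Extend per-hour emission rates with totals for the observed time span."""
    elapsed_h = observed_seconds / 3600.0
    return {
        **rates,
        "observed_seconds": observed_seconds,
        "fuel_gallons_burned": round(rates["fuel_gph"] * elapsed_h, 1),
        "co2_kg_emitted": round(rates["co2_kg_per_hour"] * elapsed_h, 1),
    }


# Hypothetical rates for illustration only; real values come from get_emissions_info.
example_rates = {"fuel_gph": 800.0, "co2_kg_per_hour": 7600.0}
result = cumulative_emissions(example_rates, observed_seconds=5400)  # 1.5 hours
```

For 5400 observed seconds (1.5 hours) these rates give 1200.0 gallons and 11400.0 kg, and the original rate keys are carried through unchanged.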
+14 -9
@@ -158,21 +158,26 @@ _KEYWORD_COORDS = {
_SORTED_KEYWORDS = sorted(_KEYWORD_COORDS.items(), key=lambda x: len(x[0]), reverse=True)
def resolve_coords_match(text: str) -> tuple[tuple[float, float], str] | None:
"""Return ((lat, lng), matched_keyword) for the most specific keyword hit."""
padded_text = f" {text} "
for kw, coords in _SORTED_KEYWORDS:
if kw.startswith(" ") or kw.endswith(" "):
if kw in padded_text:
return coords, kw
elif re.search(r"\b" + re.escape(kw) + r"\b", text):
return coords, kw
return None
def _resolve_coords(text: str) -> tuple[float, float] | None:
"""Return (lat, lng) for the most specific keyword match, or None.
Longer keywords are tried first. Space-padded keywords (" us ", " uk ")
use substring matching on padded text; all others use word-boundary regex.
"""
padded_text = f" {text} "
for kw, coords in _SORTED_KEYWORDS:
if kw.startswith(" ") or kw.endswith(" "):
if kw in padded_text:
return coords
else:
if re.search(r'\b' + re.escape(kw) + r'\b', text):
return coords
return None
match = resolve_coords_match(text)
return match[0] if match else None
@with_retry(max_retries=1, base_delay=2)
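The matching order in `resolve_coords_match` can be illustrated with a small table. In the sketch below the keyword table and its coordinates are hypothetical, since the real `_KEYWORD_COORDS` is not shown in this hunk; the matching logic follows the function above.

```python
import re

# Hypothetical keyword table for illustration; not the project's real data.
KEYWORD_COORDS = {
    "new york": (40.712, -74.006),
    "york": (53.959, -1.081),
    " us ": (38.9, -77.0),
}
# Longest keywords first, so "new york" is tried before "york".
SORTED_KEYWORDS = sorted(KEYWORD_COORDS.items(), key=lambda kv: len(kv[0]), reverse=True)


def resolve(text: str):
    """Return ((lat, lng), matched_keyword) for the most specific hit, else None."""
    padded = f" {text} "
    for kw, coords in SORTED_KEYWORDS:
        if kw.startswith(" ") or kw.endswith(" "):
            # Space-padded keywords use plain substring matching on padded text.
            if kw in padded:
                return coords, kw
        elif re.search(r"\b" + re.escape(kw) + r"\b", text):
            # All other keywords must match on word boundaries.
            return coords, kw
    return None


specific = resolve("protest in new york today")
padded_hit = resolve("us officials respond")
no_hit = resolve("campus in yorkshire")
```

Returning the matched keyword alongside the coordinates is what lets a caller, such as the Telegram fetcher later in this diff, swap in a different anchor for specific keywords.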
@@ -9,6 +9,7 @@ import json
import logging
import math
import os
import random
import threading
import time
from urllib.parse import urlencode
@@ -21,23 +22,34 @@ _prev_probabilities: dict[str, float] = {}
_market_cache = TTLCache(maxsize=1, ttl=300)
_POLYMARKET_PAGE_DELAY_S = float(os.environ.get("MESH_POLYMARKET_PAGE_DELAY_S", "0.02"))
_KALSHI_PAGE_DELAY_S = float(os.environ.get("MESH_KALSHI_PAGE_DELAY_S", "0.08"))
_POLYMARKET_PAGE_DELAY_JITTER_S = float(os.environ.get("MESH_POLYMARKET_PAGE_DELAY_JITTER_S", "0.08"))
_KALSHI_PAGE_DELAY_JITTER_S = float(os.environ.get("MESH_KALSHI_PAGE_DELAY_JITTER_S", "0.2"))
# Random delay before each full Polymarket+Kalshi cycle (decorrelates from other slow-tier jobs).
_PRE_FETCH_JITTER_S = float(os.environ.get("PREDICTION_MARKETS_PRE_FETCH_JITTER_S", "90"))
# Random pause between finishing Polymarket pagination and starting Kalshi.
_PROVIDER_GAP_JITTER_S = float(os.environ.get("PREDICTION_MARKETS_PROVIDER_GAP_JITTER_S", "45"))
_provider_pace_lock = threading.Lock()
_provider_last_request_at: dict[str, float] = {}
def prediction_markets_fetch_enabled() -> bool:
"""Return True only when the operator explicitly opts into Polymarket/Kalshi pulls."""
return str(os.environ.get("PREDICTION_MARKETS_ENABLED", "")).strip().lower() in {
"1",
"true",
"yes",
"on",
}
"""Return True when UI opt-in or PREDICTION_MARKETS_ENABLED enables pulls."""
from services.prediction_markets_settings import prediction_markets_fetch_enabled as _enabled
return _enabled()
def _pace_provider(provider: str, min_interval_s: float) -> None:
if min_interval_s <= 0:
return
jitter_s = (
_POLYMARKET_PAGE_DELAY_JITTER_S
if provider == "polymarket"
else _KALSHI_PAGE_DELAY_JITTER_S
if provider == "kalshi"
else 0.0
)
min_interval_s += random.uniform(0.0, jitter_s) if jitter_s > 0 else 0.0
with _provider_pace_lock:
now = time.monotonic()
wait_s = min_interval_s - (now - _provider_last_request_at.get(provider, 0.0))
@@ -47,6 +59,24 @@ def _pace_provider(provider: str, min_interval_s: float) -> None:
_provider_last_request_at[provider] = now
def _apply_pre_fetch_jitter() -> None:
if _PRE_FETCH_JITTER_S <= 0:
return
delay = random.uniform(0.0, _PRE_FETCH_JITTER_S)
if delay >= 1.0:
logger.debug("Prediction markets: pre-fetch jitter %.1fs", delay)
time.sleep(delay)
def _apply_provider_gap_jitter() -> None:
if _PROVIDER_GAP_JITTER_S <= 0:
return
delay = random.uniform(0.0, _PROVIDER_GAP_JITTER_S)
if delay >= 1.0:
logger.debug("Prediction markets: provider gap jitter %.1fs", delay)
time.sleep(delay)
def _finite_or_none(value):
try:
n = float(value)
@@ -750,7 +780,9 @@ def _merge_markets(poly_events: list[dict], kalshi_events: list[dict]) -> list[d
@cached(_market_cache)
def fetch_prediction_markets_raw() -> list[dict]:
"""Fetch and merge prediction markets from both sources. Cached 5 min."""
_apply_pre_fetch_jitter()
poly = _fetch_polymarket_events()
_apply_provider_gap_jitter()
kalshi = _fetch_kalshi_events()
merged = _merge_markets(poly, kalshi)
logger.info(
+9 -2
@@ -11,15 +11,20 @@ import random
import logging
import functools
import requests
from requests.exceptions import ChunkedEncodingError, ConnectionError as RequestsConnectionError
from requests.exceptions import Timeout as RequestsTimeout
logger = logging.getLogger(__name__)
# Only retry on transient network/OS errors — not on parse errors, key errors, etc.
# Only retry on transient network/OS errors — not parse/key errors or HTTP 4xx/5xx.
# requests.HTTPError (from raise_for_status) is intentionally excluded.
TRANSIENT_ERRORS = (
TimeoutError,
ConnectionError,
OSError,
requests.RequestException,
RequestsConnectionError,
RequestsTimeout,
ChunkedEncodingError,
)
@@ -43,6 +48,8 @@ def with_retry(max_retries: int = 3, base_delay: float = 2.0, max_delay: float =
for attempt in range(1 + max_retries):
try:
return func(*args, **kwargs)
except requests.HTTPError:
raise
except TRANSIENT_ERRORS as exc:
last_exc = exc
if attempt < max_retries:
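The explicit `except requests.HTTPError: raise` clause matters because `requests.HTTPError` derives from `RequestException`, which in turn derives from `OSError`, and `OSError` is still in `TRANSIENT_ERRORS`. A minimal sketch of the ordering follows. It uses a stand-in exception class so it runs without `requests` installed, and the exponential backoff is an assumption, since the delay calculation is not visible in this hunk.

```python
import functools
import time


class StandInHTTPError(OSError):
    """Hypothetical stand-in for requests.HTTPError, which also derives from OSError."""


TRANSIENT = (TimeoutError, ConnectionError, OSError)


def with_retry_sketch(max_retries: int = 3, base_delay: float = 0.0):
    """Retry transient errors, but let HTTP status errors propagate immediately."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            last_exc = None
            for attempt in range(1 + max_retries):
                try:
                    return func(*args, **kwargs)
                except StandInHTTPError:
                    # Listed first: without this clause the OSError entry
                    # below would catch the error and retry it.
                    raise
                except TRANSIENT as exc:
                    last_exc = exc
                    if attempt < max_retries:
                        # Assumed backoff policy, for illustration only.
                        time.sleep(base_delay * (2 ** attempt))
            raise last_exc

        return wrapper

    return decorator


calls = {"http": 0, "flaky": 0}


@with_retry_sketch(max_retries=2)
def always_http_error():
    calls["http"] += 1
    raise StandInHTTPError("HTTP 404")


@with_retry_sketch(max_retries=2)
def flaky():
    calls["flaky"] += 1
    if calls["flaky"] < 3:
        raise ConnectionError("connection reset")
    return "ok"
```

Calling `always_http_error()` raises after a single attempt, while `flaky()` succeeds on its third attempt after two retried connection errors.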
@@ -0,0 +1,84 @@
"""Scheduled Sentinel-2 road corridor freight trend fetcher (opt-in, slow tier)."""
from __future__ import annotations
import logging
import os
from datetime import datetime, timezone
from services.fetchers._store import _data_lock, _mark_fresh, is_any_active, latest_data
logger = logging.getLogger(__name__)
_REFRESH_HOURS = float(os.environ.get("ROAD_CORRIDOR_REFRESH_HOURS", "24"))
def _hours_since(iso_ts: str) -> float | None:
try:
dt = datetime.fromisoformat(iso_ts.replace("Z", "+00:00"))
if dt.tzinfo is None:
dt = dt.replace(tzinfo=timezone.utc)
return (datetime.now(timezone.utc) - dt).total_seconds() / 3600.0
except ValueError:
return None
def _feature_ready() -> bool:
from services.road_corridor_sat.config import optional_deps_available, road_corridor_sat_enabled
from services.road_corridor_sat.credentials import sentinel_credentials_configured
if not road_corridor_sat_enabled():
return False
if not optional_deps_available():
logger.debug("road_corridor_trends skipped — optional deps not installed")
return False
if not sentinel_credentials_configured():
logger.debug("road_corridor_trends skipped — Sentinel credentials missing")
return False
return True
def refresh_road_corridor_store() -> None:
from services.road_corridor_sat.storage import build_trends_payload
payload = build_trends_payload()
with _data_lock:
latest_data["road_corridor_trends"] = payload
_mark_fresh("road_corridor_trends")
def fetch_road_corridor_trends(force: bool = False) -> None:
"""Refresh scheduled corridor presets (default: laredo_i35 every 24h)."""
if not is_any_active("road_corridor_trends"):
return
if not _feature_ready():
return
from services.road_corridor_sat.config import SCHEDULED_PRESET_IDS
from services.road_corridor_sat.pipeline import analyze_preset
from services.road_corridor_sat.presets import get_preset
from services.road_corridor_sat.storage import load_refresh_state
state = load_refresh_state()
for preset_id in SCHEDULED_PRESET_IDS:
preset = get_preset(preset_id)
if preset is None:
logger.warning("Unknown scheduled road corridor preset: %s", preset_id)
continue
last = state.get(preset_id)
if last and not force:
age_h = _hours_since(last)
if age_h is not None and age_h < _REFRESH_HOURS:
logger.info(
"road_corridor %s fresh (%.1fh < %.1fh) — skipping",
preset_id,
age_h,
_REFRESH_HOURS,
)
continue
try:
logger.info("road_corridor analysis starting for %s", preset_id)
analyze_preset(preset_id)
except Exception as exc:
logger.exception("road_corridor analysis failed for %s: %s", preset_id, exc)
refresh_road_corridor_store()
@@ -30,8 +30,6 @@ _AIRPORTS_URL = "https://vrs-standing-data.adsb.lol/airports.csv.gz"
_REFRESH_INTERVAL_S = 5 * 24 * 3600
_HTTP_TIMEOUT_S = 60
from services.network_utils import DEFAULT_USER_AGENT as _USER_AGENT
_lock = threading.RLock()
_routes_by_callsign: dict[str, dict[str, Any]] = {}
_airports_by_icao: dict[str, dict[str, Any]] = {}
+381
@@ -0,0 +1,381 @@
"""Telegram OSINT — public channel web previews (t.me/s) with keyword geoparsing."""
from __future__ import annotations
import hashlib
import logging
import os
import re
from datetime import datetime, timezone
from typing import Any
from services.fetchers._store import _data_lock, _mark_fresh, is_any_active, latest_data
from services.fetchers.news import resolve_coords_match
from services.network_utils import fetch_with_curl, outbound_user_agent
logger = logging.getLogger(__name__)
_DEFAULT_CHANNELS = (
"osintdefender",
"insiderpaper",
"aljazeeraenglish",
"nexta_live",
"war_monitor",
"OSINTtechnical",
"Liveuamap",
)
_MESSAGE_BLOCK_RE = re.compile(
r'<div class="tgme_widget_message_wrap js-widget_message_wrap"[\s\S]*?</div>\s*</div>\s*</div>',
re.IGNORECASE,
)
_TEXT_RE = re.compile(
r'<div class="tgme_widget_message_text[^>]*>([\s\S]*?)</div>',
re.IGNORECASE,
)
_DATE_RE = re.compile(
r'<a class="tgme_widget_message_date" href="(https://t\.me/[^"]+)".*?<time datetime="([^"]+)"',
re.IGNORECASE,
)
_HAS_VIDEO_RE = re.compile(
r'tgme_widget_message_video|js-message_video|<video\s',
re.IGNORECASE,
)
_HAS_PHOTO_RE = re.compile(r'tgme_widget_message_photo_wrap', re.IGNORECASE)
_VIDEO_SRC_RE = re.compile(r'<video[^>]+src="([^"]+)"', re.IGNORECASE)
_BG_IMAGE_RE = re.compile(r"background-image:url\('([^']+)'\)", re.IGNORECASE)
_TELEGRAM_MEDIA_HOST_SUFFIXES = (".telesco.pe", ".telegram-cdn.org")
# Cyrillic / Arabic aliases for war-reporting channels (merged after English resolver).
_EXTRA_PLACE_KEYWORDS: dict[str, tuple[float, float]] = {
"киев": (50.450, 30.523),
"київ": (50.450, 30.523),
"харьков": (49.993, 36.231),
"харків": (49.993, 36.231),
"одесса": (46.482, 30.724),
"одеса": (46.482, 30.724),
"донецк": (48.015, 37.803),
"донецьк": (48.015, 37.803),
"луганск": (48.574, 39.307),
"луганськ": (48.574, 39.307),
"москва": (55.755, 37.617),
"крым": (45.000, 34.000),
"крим": (45.000, 34.000),
"бахмут": (48.595, 38.000),
"запорожье": (47.838, 35.139),
"запоріжжя": (47.838, 35.139),
"غزة": (31.416, 34.333),
"دمشق": (33.513, 36.276),
"بيروت": (33.893, 35.501),
"tel aviv": (32.085, 34.781),
"תל אביב": (32.085, 34.781),
}
# Country-level news geocodes sit on national centroids that stack with threat alerts.
# Telegram uses major metro anchors so pins land on a different map cell than news.
_TELEGRAM_ANCHOR_OVERRIDES: dict[str, tuple[float, float]] = {
"israel": (32.085, 34.781), # Tel Aviv (news uses central Israel ~Jerusalem corridor)
"middle east": (32.085, 34.781),
"china": (39.904, 116.407), # Beijing (news uses country centroid)
"united states": (40.712, -74.006), # New York (news uses Washington DC)
"usa": (40.712, -74.006),
"us": (40.712, -74.006),
"america": (40.712, -74.006),
"uk": (51.507, -0.127), # London
"iran": (35.689, 51.389), # Tehran
"russia": (55.755, 37.617), # Moscow
"ukraine": (50.450, 30.523), # Kyiv
"france": (48.856, 2.352), # Paris
"germany": (52.520, 13.405), # Berlin
"lebanon": (34.433, 35.844), # Tripoli (news uses Beirut corridor)
}
_RISK_KEYWORDS = (
"war",
"missile",
"strike",
"attack",
"crisis",
"tension",
"military",
"conflict",
"defense",
"clash",
"nuclear",
"invasion",
"bomb",
"drone",
"weapon",
"sanctions",
"ceasefire",
"escalation",
"killed",
"destroyed",
"operation",
"casualty",
"frontline",
"threat",
"explosion",
"shelling",
)
def telegram_osint_enabled() -> bool:
return str(os.environ.get("TELEGRAM_OSINT_ENABLED", "true")).strip().lower() not in {
"0",
"false",
"no",
"off",
"",
}
def _configured_channels() -> list[str]:
raw = str(os.environ.get("TELEGRAM_OSINT_CHANNELS", "")).strip()
if raw:
return [part.strip().lstrip("@") for part in raw.split(",") if part.strip()]
return list(_DEFAULT_CHANNELS)
def telegram_media_host_allowed(hostname: str | None) -> bool:
host = str(hostname or "").strip().lower()
if not host:
return False
return any(host.endswith(suffix) for suffix in _TELEGRAM_MEDIA_HOST_SUFFIXES)
def _extract_media(block: str, link: str) -> dict[str, Any]:
has_video = bool(_HAS_VIDEO_RE.search(block))
has_photo = bool(_HAS_PHOTO_RE.search(block))
media_type: str | None = None
media_url: str | None = None
if has_video:
media_type = "video"
video_match = _VIDEO_SRC_RE.search(block)
if video_match:
media_url = video_match.group(1).strip()
elif has_photo:
media_type = "photo"
photo_match = _BG_IMAGE_RE.search(block)
if photo_match:
media_url = photo_match.group(1).strip()
embed_url: str | None = None
if media_type and link:
embed_url = f"{link}?embed=1"
return {
"media_type": media_type,
"media_url": media_url,
"embed_url": embed_url,
}
def _strip_html(text: str) -> str:
cleaned = re.sub(r"<br\s*/?>", "\n", text, flags=re.IGNORECASE)
cleaned = re.sub(r"<[^>]+>", "", cleaned)
return (
cleaned.replace("&quot;", '"')
.replace("&amp;", "&")
.replace("&lt;", "<")
.replace("&gt;", ">")
.strip()
)
def _score_risk(text: str) -> int:
lower = text.lower()
score = 1
for kw in _RISK_KEYWORDS:
if kw in lower:
score += 2
return min(10, score)
def _refresh_post_coords(post: dict[str, Any]) -> dict[str, Any]:
"""Re-apply geoparsing so stored posts pick up anchor updates."""
text = "\n".join(
str(part).strip()
for part in (post.get("title"), post.get("description"))
if part and str(part).strip()
)
if not text:
return post
coords = _resolve_telegram_coords(text)
if not coords:
return post
updated = dict(post)
updated["coords"] = [coords[0], coords[1]]
return updated
def _resolve_telegram_coords(text: str) -> tuple[float, float] | None:
lower = text.lower()
match = resolve_coords_match(lower)
if match:
_coords, keyword = match
anchor = _TELEGRAM_ANCHOR_OVERRIDES.get(keyword.strip().lower())
if anchor:
return anchor
return _coords
for keyword, coords in sorted(_EXTRA_PLACE_KEYWORDS.items(), key=lambda x: len(x[0]), reverse=True):
if keyword in lower:
return coords
return None
def _post_link(post: dict[str, Any]) -> str:
return str(post.get("link") or "").strip()
def _extract_new_channel_posts(
html: str,
channel: str,
known_links: set[str],
*,
bootstrap_limit: int = 12,
) -> list[dict[str, Any]]:
"""Return unseen posts from a channel page; stop once we hit a stored link."""
parsed = parse_telegram_channel_html(html, channel)
if not parsed:
return []
if not known_links:
return parsed[-bootstrap_limit:]
fresh: list[dict[str, Any]] = []
for post in reversed(parsed):
link = _post_link(post)
if not link:
continue
if link in known_links:
break
fresh.append(post)
fresh.reverse()
return fresh
def _merge_telegram_posts(
existing: list[dict[str, Any]],
incoming: list[dict[str, Any]],
*,
max_posts: int = 120,
) -> tuple[list[dict[str, Any]], int]:
known_links = {_post_link(post) for post in existing if _post_link(post)}
added = 0
for post in incoming:
link = _post_link(post)
if not link or link in known_links:
continue
known_links.add(link)
existing.append(post)
added += 1
existing.sort(key=lambda p: str(p.get("published") or ""), reverse=True)
return existing[:max_posts], added
def parse_telegram_channel_html(html: str, channel: str) -> list[dict[str, Any]]:
"""Parse public t.me/s channel preview HTML into post dicts."""
posts: list[dict[str, Any]] = []
for block in _MESSAGE_BLOCK_RE.findall(html or ""):
text_match = _TEXT_RE.search(block)
if not text_match:
continue
text = _strip_html(text_match.group(1))
if len(text) < 10:
continue
date_match = _DATE_RE.search(block)
link = date_match.group(1) if date_match else f"https://t.me/{channel}"
published = date_match.group(2) if date_match else datetime.now(timezone.utc).isoformat()
title = text.split("\n", 1)[0][:160]
risk_score = _score_risk(text)
coords = _resolve_telegram_coords(text)
post_id = hashlib.sha1(f"{link}|{published}".encode("utf-8")).hexdigest()[:16]
media = _extract_media(block, link)
posts.append(
{
"id": post_id,
"title": title,
"description": text[:1200],
"link": link,
"published": published,
"source": f"t.me/{channel}",
"channel": channel,
"risk_score": risk_score,
"coords": [coords[0], coords[1]] if coords else None,
**media,
}
)
return posts
def fetch_telegram_osint() -> dict[str, Any]:
if not is_any_active("telegram_osint"):
return latest_data.get("telegram_osint") or {"posts": [], "total": 0, "timestamp": None}
if not telegram_osint_enabled():
with _data_lock:
latest_data["telegram_osint"] = {"posts": [], "total": 0, "timestamp": None, "disabled": True}
_mark_fresh("telegram_osint")
return latest_data["telegram_osint"]
headers = {
"User-Agent": (
f"Mozilla/5.0 (compatible; {outbound_user_agent('telegram-osint')}) "
"AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
),
"Accept": "text/html,application/xhtml+xml",
}
with _data_lock:
prior = latest_data.get("telegram_osint") or {}
existing_posts = list(prior.get("posts") or [])
known_links = {_post_link(post) for post in existing_posts if _post_link(post)}
incoming: list[dict[str, Any]] = []
for channel in _configured_channels():
url = f"https://t.me/s/{channel}"
try:
resp = fetch_with_curl(url, timeout=15, headers=headers)
if not resp or resp.status_code != 200:
logger.warning(
"Telegram channel %s fetch failed: HTTP %s",
channel,
resp.status_code if resp else "no response",
)
continue
channel_new = _extract_new_channel_posts(resp.text, channel, known_links)
for post in channel_new:
link = _post_link(post)
if not link or link in known_links:
continue
known_links.add(link)
incoming.append(post)
except Exception as exc:
logger.warning("Telegram channel %s parse failed: %s", channel, exc)
merged_posts, added = _merge_telegram_posts(existing_posts, incoming)
merged_posts = [_refresh_post_coords(post) for post in merged_posts]
geolocated = sum(1 for p in merged_posts if p.get("coords"))
payload = {
"posts": merged_posts,
"total": len(merged_posts),
"geolocated": geolocated,
"timestamp": datetime.now(timezone.utc).isoformat(),
"channels": _configured_channels(),
"last_fetch_new": added,
}
with _data_lock:
latest_data["telegram_osint"] = payload
_mark_fresh("telegram_osint")
logger.info(
"Telegram OSINT: +%s new, %s retained (%s geolocated)",
added,
len(merged_posts),
geolocated,
)
return payload
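The incremental scan in `_extract_new_channel_posts` walks the parsed page from its last post backwards and stops at the first link that is already stored. The sketch below reduces posts to bare link strings to show that logic; the name `unseen_links` and the example URLs are hypothetical, and it assumes, as the original's use of `reversed` suggests, that the page lists posts oldest first.

```python
def unseen_links(parsed_links: list[str], known_links: set[str], bootstrap_limit: int = 12) -> list[str]:
    """Return links not seen before, in page order; stop at the first known link."""
    if not parsed_links:
        return []
    if not known_links:
        # First run for this store: keep only the most recent posts.
        return parsed_links[-bootstrap_limit:]
    fresh: list[str] = []
    for link in reversed(parsed_links):
        if not link:
            continue
        if link in known_links:
            # Everything earlier on the page is assumed to be stored already.
            break
        fresh.append(link)
    fresh.reverse()
    return fresh


# Hypothetical channel page with 20 posts, oldest first.
page = [f"https://t.me/example_channel/{i}" for i in range(1, 21)]
first_run = unseen_links(page, set())
later_run = unseen_links(page, {"https://t.me/example_channel/17"})
```

Note that in the diff `known_links` is shared across all channels, so the bootstrap branch only applies while the store is completely empty.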
+64 -52
@@ -1,3 +1,4 @@
import os
import requests
import logging
import zipfile
@@ -20,6 +21,50 @@ logger = logging.getLogger(__name__)
# Cache Frontline data for 30 minutes, it doesn't move that fast
frontline_cache = TTLCache(maxsize=1, ttl=1800)
_DEFAULT_DEEPSTATE_MIRROR_REPO = "cyterat/deepstate-map-data"
def _deepstate_mirror_ref() -> tuple[str, str]:
"""Return (github_repo_slug, git_ref) for the DeepState mirror.
When ``DEEPSTATE_MIRROR_COMMIT`` is set, ingest is pinned to that immutable
SHA instead of following the mutable ``main`` branch (#362).
"""
repo = (os.environ.get("DEEPSTATE_MIRROR_REPO") or _DEFAULT_DEEPSTATE_MIRROR_REPO).strip()
if repo.count("/") != 1:
repo = _DEFAULT_DEEPSTATE_MIRROR_REPO
commit = (os.environ.get("DEEPSTATE_MIRROR_COMMIT") or "").strip()
ref = commit if commit else "main"
return repo, ref
def _latest_deepstate_geo_path(tree_items: list) -> str | None:
geo_files = [
item["path"]
for item in tree_items
if isinstance(item, dict)
and str(item.get("path", "")).startswith("data/deepstatemap_data_")
and str(item.get("path", "")).endswith(".geojson")
]
return sorted(geo_files)[-1] if geo_files else None
def _annotate_deepstate_geojson(data: dict) -> dict:
name_map = {
0: "Russian-occupied areas",
1: "Russian advance",
2: "Liberated area",
3: "Russian-occupied areas", # Crimea / LPR / DPR
4: "Directions of UA attacks",
}
if "features" in data:
for idx, feature in enumerate(data["features"]):
if "properties" not in feature or feature["properties"] is None:
feature["properties"] = {}
feature["properties"]["name"] = name_map.get(idx, "Russian-occupied areas")
feature["properties"]["zone_id"] = idx
return data
@cached(frontline_cache)
def fetch_ukraine_frontlines():
@@ -27,67 +72,34 @@ def fetch_ukraine_frontlines():
Fetches the latest GeoJSON data representing the Ukraine frontline.
We use the cyterat/deepstate-map-data github mirror since the public API is locked.
"""
repo, ref = _deepstate_mirror_ref()
try:
logger.info("Fetching DeepStateMap from GitHub mirror...")
logger.info("Fetching DeepStateMap from GitHub mirror (%s @ %s)...", repo, ref)
# First, query the repo tree to find the latest file name
tree_url = (
"https://api.github.com/repos/cyterat/deepstate-map-data/git/trees/main?recursive=1"
)
tree_url = f"https://api.github.com/repos/{repo}/git/trees/{ref}?recursive=1"
res_tree = requests.get(tree_url, timeout=10)
if res_tree.status_code == 200:
tree_data = res_tree.json().get("tree", [])
# Filter for geojson files in data folder
geo_files = [
item["path"]
for item in tree_data
if item["path"].startswith("data/deepstatemap_data_")
and item["path"].endswith(".geojson")
]
if geo_files:
# Get the alphabetically latest file (since it's named with YYYYMMDD)
latest_file = sorted(geo_files)[-1]
raw_url = f"https://raw.githubusercontent.com/cyterat/deepstate-map-data/main/{latest_file}"
logger.info(f"Downloading latest DeepStateMap: {raw_url}")
latest_file = _latest_deepstate_geo_path(res_tree.json().get("tree", []))
if latest_file:
raw_url = f"https://raw.githubusercontent.com/{repo}/{ref}/{latest_file}"
logger.info("Downloading DeepStateMap: %s", raw_url)
res_geo = requests.get(raw_url, timeout=20)
if res_geo.status_code == 200:
data = res_geo.json()
# The Cyterat GitHub mirror strips all properties and just provides a raw array of Feature polygons.
# Based on DeepStateMap's frontend mapping, the array index corresponds to the zone type:
# 0: Russian-occupied areas
# 1: Russian advance
# 2: Liberated area
# 3: Uncontested/Crimea (often folded into occupied)
name_map = {
0: "Russian-occupied areas",
1: "Russian advance",
2: "Liberated area",
3: "Russian-occupied areas", # Crimea / LPR / DPR
4: "Directions of UA attacks",
}
if "features" in data:
for idx, feature in enumerate(data["features"]):
if "properties" not in feature or feature["properties"] is None:
feature["properties"] = {}
feature["properties"]["name"] = name_map.get(
idx, "Russian-occupied areas"
)
feature["properties"]["zone_id"] = idx
return data
else:
logger.error(
f"Failed to fetch parsed Github Raw GeoJSON: {res_geo.status_code}"
)
return _annotate_deepstate_geojson(res_geo.json())
logger.error(
"Failed to fetch parsed Github Raw GeoJSON: %s", res_geo.status_code
)
else:
logger.error("No deepstatemap_data_*.geojson files in mirror tree at %s", ref)
else:
logger.error(f"Failed to fetch Github Tree for Deepstatemap: {res_tree.status_code}")
logger.error(
"Failed to fetch Github tree for Deepstatemap (%s @ %s): %s",
repo,
ref,
res_tree.status_code,
)
except (requests.RequestException, ConnectionError, TimeoutError, ValueError, KeyError) as e:
logger.error(f"Error fetching DeepStateMap: {e}")
return None
@@ -1,14 +1,20 @@
"""Function Keys — anonymous citizenship proof.
"""Function Keys — anonymous credential scaffolding.
Source of truth: ``infonet-economy/IMPLEMENTATION_PLAN.md`` §4.4,
``infonet-economy/BRAINDUMP.md`` §11 item 9.
A citizen should be able to prove "I am a UBI-eligible Infonet
citizen" to a real-world operator (food bank, community service)
**without revealing their Infonet identity**. The naive approach
(scramble a public key, record each redemption on chain) leaks
identity through metadata correlation (time, location, operator,
frequency).
A citizen should eventually be able to prove "I am a UBI-eligible
Infonet citizen" to a real-world operator (food bank, community
service) **without revealing their Infonet identity**. The current
Python implementation wires the accounting, nullifier, receipt, and
operator flows, but its HMAC challenge-response is a placeholder for
integration tests. It is not a production anonymous or zero-knowledge
citizenship proof until blind signatures or anonymous credentials are
selected and wired.
The naive approach (scramble a public key, record each redemption on
chain) leaks identity through metadata correlation (time, location,
operator, frequency).
The full design has six pieces; five are implemented in pure Python
here. The remaining piece, issuance via blind signatures or
@@ -27,7 +33,8 @@ Pieces:
operator: tracked via ``NullifierTracker``.
3. **Challenge-response** (`challenge_response.py`): operator
issues a fresh nonce, key-holder signs with the Function Key's
secret. Prevents screenshot attacks, key sharing, replay.
secret. This is HMAC placeholder plumbing for screenshot/replay
resistance, not the final anonymous credential proof.
4. **Two-phase commit receipts** (`receipt.py`): Phase 1
verification receipt (operator-signed, day-level date NOT
timestamp, no node_id). Phase 2 fulfillment receipt (citizen
@@ -0,0 +1,94 @@
"""Country risk index (static scores + USGS quake enrichment)."""
from __future__ import annotations

from datetime import datetime, timezone
from typing import Any
from zoneinfo import ZoneInfo

from services.network_utils import fetch_with_curl

# Static baseline risk (0-100) and descriptive tags, keyed by two-letter country code.
RISK_FACTORS: dict[str, dict[str, Any]] = {
    "UA": {"base": 85, "tags": ["active_conflict", "infrastructure_damage"]},
    "RU": {"base": 72, "tags": ["sanctions", "military_mobilization"]},
    "IL": {"base": 78, "tags": ["active_conflict", "regional_instability"]},
    "PS": {"base": 90, "tags": ["active_conflict", "humanitarian_crisis"]},
    "SY": {"base": 82, "tags": ["post_conflict", "infrastructure_damage"]},
    "YE": {"base": 88, "tags": ["active_conflict", "humanitarian_crisis"]},
    "MM": {"base": 76, "tags": ["civil_unrest", "military_junta"]},
    "SD": {"base": 84, "tags": ["active_conflict", "humanitarian_crisis"]},
    "AF": {"base": 80, "tags": ["post_conflict", "governance_collapse"]},
    "KP": {"base": 70, "tags": ["nuclear_risk", "isolation"]},
    "IR": {"base": 68, "tags": ["sanctions", "nuclear_program", "regional_proxy"]},
    "CN": {"base": 35, "tags": ["strategic_competition", "taiwan_tensions"]},
    "TW": {"base": 45, "tags": ["invasion_risk", "semiconductor_dependency"]},
    "VE": {"base": 60, "tags": ["economic_collapse", "political_instability"]},
    "HT": {"base": 85, "tags": ["gang_violence", "governance_collapse"]},
    "LB": {"base": 65, "tags": ["economic_crisis", "political_deadlock"]},
    "PK": {"base": 55, "tags": ["terrorism", "political_instability"]},
    "SO": {"base": 82, "tags": ["terrorism", "state_fragility"]},
    "LY": {"base": 72, "tags": ["divided_government", "militia_control"]},
    "ET": {"base": 62, "tags": ["ethnic_tensions", "regional_conflicts"]},
}

# Trading windows in local decimal hours (9.5 == 09:30).
EXCHANGES = [
    {"name": "NYSE", "tz": "America/New_York", "open": 9.5, "close": 16, "country": "US"},
    {"name": "NASDAQ", "tz": "America/New_York", "open": 9.5, "close": 16, "country": "US"},
    {"name": "LSE", "tz": "Europe/London", "open": 8, "close": 16.5, "country": "GB"},
    {"name": "TSE", "tz": "Asia/Tokyo", "open": 9, "close": 15, "country": "JP"},
    {"name": "SSE", "tz": "Asia/Shanghai", "open": 9.5, "close": 15, "country": "CN"},
    {"name": "HKEX", "tz": "Asia/Hong_Kong", "open": 9.5, "close": 16, "country": "HK"},
    {"name": "FRA", "tz": "Europe/Berlin", "open": 8, "close": 20, "country": "DE"},
    {"name": "TSX", "tz": "America/Toronto", "open": 9.5, "close": 16, "country": "CA"},
    {"name": "MOEX", "tz": "Europe/Moscow", "open": 10, "close": 18.5, "country": "RU"},
]


def _exchange_open(ex: dict[str, Any]) -> bool:
    # Weekday check plus a [open, close) window in the exchange's local time.
    try:
        now = datetime.now(ZoneInfo(ex["tz"]))
        if now.weekday() >= 5:
            return False
        decimal = now.hour + now.minute / 60
        return ex["open"] <= decimal < ex["close"]
    except Exception:
        return False


def build_country_risk_payload() -> dict[str, Any]:
    quake_risks: dict[str, float] = {}
    try:
        resp = fetch_with_curl(
            "https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/4.5_day.geojson",
            timeout=5,
        )
        if resp.status_code == 200:
            for f in resp.json().get("features") or []:
                place = (f.get("properties") or {}).get("place") or ""
                mag = (f.get("properties") or {}).get("mag") or 0
                # NOTE: this is a substring match of the two-letter code against
                # the free-text place string, so it is loose and can match
                # unrelated place names that merely contain those two letters.
                for code in RISK_FACTORS:
                    if code.lower() in place.lower():
                        quake_risks[code] = quake_risks.get(code, 0) + mag
    except Exception:
        # Quake enrichment is best-effort; fall back to static scores only.
        pass

    countries = []
    for code, data in RISK_FACTORS.items():
        base = data["base"]
        score = min(100, base + quake_risks.get(code, 0))
        countries.append(
            {
                "code": code,
                "risk_score": score,
                # risk_level is derived from the static base, not the quake-adjusted score.
                "risk_level": "CRITICAL" if base >= 80 else "HIGH" if base >= 60 else "ELEVATED" if base >= 40 else "LOW",
                "tags": data["tags"],
            }
        )
    countries.sort(key=lambda c: c["risk_score"], reverse=True)

    exchanges = [{"name": e["name"], "country": e["country"], "open": _exchange_open(e)} for e in EXCHANGES]
    return {
        "countries": countries,
        "exchanges": exchanges,
        "open_exchanges": sum(1 for e in exchanges if e["open"]),
        "total_exchanges": len(exchanges),
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
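The decimal-hour window used by `_exchange_open` can be exercised in isolation. The following is a minimal sketch, not part of the diff: `is_open` is a hypothetical helper that takes the clock as a parameter so the boundary behaviour is testable, and it assumes the same weekday rule and half-open `[open, close)` interval as the code above. Like the original, it does not model holidays or midday trading breaks.

```python
from datetime import datetime


def is_open(open_hour: float, close_hour: float, now: datetime) -> bool:
    """Return True when `now` is a weekday inside [open_hour, close_hour).

    Hypothetical helper for illustration; `now` is expected to already be
    expressed in the exchange's local time.
    """
    if now.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        return False
    decimal = now.hour + now.minute / 60  # 09:30 becomes 9.5
    return open_hour <= decimal < close_hour


# A 9.5 to 16 window, matching the NYSE row in the table above.
at_open = is_open(9.5, 16, datetime(2024, 1, 8, 9, 30))      # Monday 09:30
before_open = is_open(9.5, 16, datetime(2024, 1, 8, 9, 29))  # Monday 09:29
at_close = is_open(9.5, 16, datetime(2024, 1, 8, 16, 0))     # Monday 16:00
weekend = is_open(9.5, 16, datetime(2024, 1, 6, 12, 0))      # Saturday noon
```

Passing the clock in, rather than calling `datetime.now()` inside the function, is a design choice for the sketch only; it makes the inclusive open and exclusive close easy to check.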
+34 -16
@@ -32,14 +32,14 @@ logger = logging.getLogger(__name__)
_REFRESH_SECONDS = 24 * 3600
kiwisdr_cache: TTLCache = TTLCache(maxsize=1, ttl=_REFRESH_SECONDS)
_SOURCE_URL = "http://rx.linkfanel.net/kiwisdr_com.js"
_SOURCE_URL_HTTP = "http://rx.linkfanel.net/kiwisdr_com.js"
_SOURCE_URL_HTTPS = "https://rx.linkfanel.net/kiwisdr_com.js"
_CACHE_FILE = Path(__file__).resolve().parent.parent / "data" / "kiwisdr_cache.json"
# Bundled fallback — shipped with the codebase so the KiwiSDR layer always
# has something to render even when the upstream is unreachable, returns
# garbage, or appears to have been tampered with. Issue #206: the upstream
# only speaks HTTP, so we can't rely on TLS for integrity — instead we
# validate the response's shape and fall back to this bundle if it doesn't
# look right.
# garbage, or appears to have been tampered with. Issue #206 / #364: try HTTPS
# first, then HTTP; we still validate shape and fall back to this bundle if the
# payload does not look right.
_BUNDLED_FALLBACK = Path(__file__).resolve().parent.parent / "data" / "kiwisdr_directory.json"
# Minimum number of receivers we expect from a healthy upstream response.
@@ -184,6 +184,29 @@ def _validate_fetched_nodes(nodes: list[dict]) -> bool:
return True
def _fetch_mirror_payload_text() -> str | None:
"""Try HTTPS first, then HTTP. Shape validation still applies (#364)."""
from services.network_utils import fetch_with_curl
last_error: Exception | None = None
for url in (_SOURCE_URL_HTTPS, _SOURCE_URL_HTTP):
try:
res = fetch_with_curl(url, timeout=20)
if res and res.status_code == 200:
if url == _SOURCE_URL_HTTP:
logger.info(
"KiwiSDR: HTTPS mirror unavailable; using HTTP with shape validation"
)
return res.text
last_error = RuntimeError(f"HTTP {getattr(res, 'status_code', 'unknown')}")
except Exception as e:
last_error = e
logger.debug("KiwiSDR mirror fetch failed for %s: %s", url, e)
if last_error is not None:
logger.warning("KiwiSDR mirror fetch failed: %s", last_error)
return None
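The HTTPS-first, HTTP-second ordering in `_fetch_mirror_payload_text` can be sketched as a small pure function. This is an illustration under assumptions: `first_ok_body` and the injected `fetch` callable are hypothetical names, the example URLs are placeholders, and the real code additionally logs failures and relies on `_validate_fetched_nodes` afterwards.

```python
from __future__ import annotations

from typing import Callable, Iterable


def first_ok_body(
    urls: Iterable[str],
    fetch: Callable[[str], tuple[int, str]],
) -> str | None:
    """Return the body of the first URL that answers with status 200.

    `fetch` is a hypothetical stand-in for the real HTTP client; it returns
    (status_code, text) or raises on transport errors.
    """
    for url in urls:
        try:
            status, text = fetch(url)
        except Exception:
            # A transport failure on one URL should not stop the fallback chain.
            continue
        if status == 200:
            return text
    return None


def fake_fetch(url: str) -> tuple[int, str]:
    # Simulates a mirror whose HTTPS endpoint is unreachable.
    if url.startswith("https://"):
        raise ConnectionError("tls unavailable")
    return 200, "payload"


body = first_ok_body(
    ["https://mirror.example/list.js", "http://mirror.example/list.js"],
    fake_fetch,
)
```

Because the HTTP fallback carries no transport integrity, the returned body should still be treated as untrusted input, which is why the diff keeps shape validation in place.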
def _load_bundled_fallback() -> list[dict]:
"""Last-resort directory shipped with the codebase. Always returns a
list (may be empty if the bundle is missing in older deployments)."""
@@ -202,9 +225,8 @@ def _load_bundled_fallback() -> list[dict]:
def fetch_kiwisdr_nodes() -> list[dict]:
"""Return the KiwiSDR receiver list, refreshed at most once per day.
Layered fallback (issue #206 — upstream is HTTP-only, so we defend with
content validation + bundled static directory rather than trying to
upgrade the transport):
Layered fallback (issue #206 / #364 — HTTPS first, HTTP fallback, plus
content validation + bundled static directory):
1. In-memory cache (handled by @cached on this function)
2. On-disk cache if <24h old
@@ -216,8 +238,6 @@ def fetch_kiwisdr_nodes() -> list[dict]:
tampered upstream returning garbage is caught by _validate_fetched_nodes()
and falls through to whatever previously-trusted snapshot we have.
"""
from services.network_utils import fetch_with_curl
# 1. Trust on-disk cache if fresh.
cached_nodes = _load_disk_cache()
if cached_nodes is not None:
@@ -230,14 +250,12 @@ def fetch_kiwisdr_nodes() -> list[dict]:
fresh_nodes: list[dict] = []
fetch_succeeded = False
try:
res = fetch_with_curl(_SOURCE_URL, timeout=20)
if res and res.status_code == 200:
fresh_nodes = _parse_mirror_payload(res.text)
body = _fetch_mirror_payload_text()
if body:
fresh_nodes = _parse_mirror_payload(body)
fetch_succeeded = True
else:
logger.warning(
f"KiwiSDR fetch returned HTTP {res.status_code if res else 'no response'}"
)
logger.warning("KiwiSDR fetch returned no usable mirror payload")
except (requests.RequestException, ConnectionError, TimeoutError, ValueError, KeyError) as e:
logger.warning(f"KiwiSDR fetch exception: {e}")
+11 -1
@@ -27,11 +27,21 @@ def fetch_liveuamap():
browser = p.chromium.launch(
headless=True, args=["--disable-blink-features=AutomationControlled"]
)
from services.network_utils import outbound_user_agent
# Per-install handle (no shared Shadowbroker product token). Stealth remains
# for Turnstile; see docs/OUTBOUND_DATA.md #348.
playwright_ua = (
f"Mozilla/5.0 (compatible; {outbound_user_agent('liveuamap')})"
)
context = browser.new_context(
user_agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
user_agent=playwright_ua,
viewport={"width": 1920, "height": 1080},
color_scheme="dark",
)
# Bound navigation and script evaluation so a stuck region cannot hang the slow pool.
context.set_default_navigation_timeout(60_000)
context.set_default_timeout(30_000)
page = context.new_page()
stealth_sync(page)
+73
@@ -0,0 +1,73 @@
"""LiveUAMap Playwright scraper opt-in (#348) — UI consent on Windows."""
from __future__ import annotations

import json
import logging
import os
import threading
from pathlib import Path
from typing import Any

logger = logging.getLogger(__name__)

_OPT_IN_FILE = Path(__file__).resolve().parent.parent / "data" / "liveuamap_scraper_opt_in.json"
_OPT_IN_LOCK = threading.Lock()


def _env_flag(name: str) -> str:
    return str(os.getenv(name, "")).strip().lower()


def liveuamap_requires_ui_opt_in() -> bool:
    """Windows local installs need explicit consent before Playwright contacts LiveUAMap."""
    return os.name == "nt"


def get_liveuamap_ui_opt_in() -> bool:
    if not _OPT_IN_FILE.exists():
        return False
    try:
        payload = json.loads(_OPT_IN_FILE.read_text(encoding="utf-8"))
        return bool(payload.get("opted_in"))
    except (OSError, json.JSONDecodeError, TypeError) as e:
        logger.warning("LiveUAMap opt-in file unreadable: %s", e)
        return False


def set_liveuamap_ui_opt_in(opted_in: bool) -> None:
    _OPT_IN_FILE.parent.mkdir(parents=True, exist_ok=True)
    with _OPT_IN_LOCK:
        _OPT_IN_FILE.write_text(
            json.dumps({"opted_in": bool(opted_in)}, indent=2),
            encoding="utf-8",
        )


def liveuamap_scraper_enabled() -> bool:
    """Whether the Playwright LiveUAMap scraper may run on this backend."""
    setting = _env_flag("SHADOWBROKER_ENABLE_LIVEUAMAP_SCRAPER")
    if setting in {"1", "true", "yes", "on"}:
        return True
    if setting in {"0", "false", "no", "off"}:
        return False
    if not liveuamap_requires_ui_opt_in():
        return True
    return get_liveuamap_ui_opt_in()


def liveuamap_scraper_status() -> dict[str, Any]:
    setting = _env_flag("SHADOWBROKER_ENABLE_LIVEUAMAP_SCRAPER")
    env_override = None
    if setting in {"1", "true", "yes", "on"}:
        env_override = "on"
    elif setting in {"0", "false", "no", "off"}:
        env_override = "off"
    ui_opted_in = get_liveuamap_ui_opt_in()
    requires = liveuamap_requires_ui_opt_in()
    return {
        "platform_requires_opt_in": requires,
        "ui_opted_in": ui_opted_in,
        "scraper_enabled": liveuamap_scraper_enabled(),
        "env_override": env_override,
    }
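The precedence in `liveuamap_scraper_enabled` (explicit environment override, then platform default, then stored UI consent) can be restated as a pure function. This is a sketch only: `resolve_scraper_enabled` and its parameters are hypothetical, and it takes the environment value, platform flag, and consent flag as arguments instead of reading `os.environ`, `os.name`, or the opt-in file.

```python
def resolve_scraper_enabled(env_value: str, is_windows: bool, ui_opted_in: bool) -> bool:
    """Mirror the decision order of the opt-in module without touching global state."""
    setting = env_value.strip().lower()
    if setting in {"1", "true", "yes", "on"}:
        return True   # explicit "on" wins everywhere
    if setting in {"0", "false", "no", "off"}:
        return False  # explicit "off" wins everywhere
    if not is_windows:
        return True   # platforms that need no UI consent default to enabled
    return ui_opted_in  # Windows with no override: follow stored consent


# Unset variable on Windows without consent: the scraper stays off.
windows_default = resolve_scraper_enabled("", is_windows=True, ui_opted_in=False)
# An explicit "off" disables the scraper even where consent is not required.
forced_off = resolve_scraper_enabled("OFF", is_windows=False, ui_opted_in=True)
```

Unrecognised values such as `"maybe"` fall through to the platform default, the same as an unset variable.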
@@ -287,28 +287,18 @@ def write_signed_bootstrap_manifest(
return manifest
def load_bootstrap_manifest(
path: str | Path,
def parse_bootstrap_manifest_dict(
raw: dict[str, Any],
*,
signer_public_key_b64: str,
now: float | None = None,
) -> BootstrapManifest:
manifest_path = _resolve_manifest_path(str(path))
try:
raw = json.loads(manifest_path.read_text(encoding="utf-8"))
except FileNotFoundError as exc:
raise BootstrapManifestError(f"bootstrap manifest not found: {manifest_path}") from exc
except json.JSONDecodeError as exc:
raise BootstrapManifestError("bootstrap manifest is not valid JSON") from exc
if not isinstance(raw, dict):
raise BootstrapManifestError("bootstrap manifest root must be an object")
signature = str(raw.get("signature", "") or "").strip()
payload = {key: value for key, value in raw.items() if key != "signature"}
if not signature:
raise BootstrapManifestError("bootstrap manifest signature is required")
_verify_manifest_signature(
payload,
signature_b64=signature,
@@ -325,11 +315,36 @@ def load_bootstrap_manifest(
)
def load_bootstrap_manifest(
path: str | Path,
*,
signer_public_key_b64: str,
now: float | None = None,
) -> BootstrapManifest:
manifest_path = _resolve_manifest_path(str(path))
try:
raw = json.loads(manifest_path.read_text(encoding="utf-8"))
except FileNotFoundError as exc:
raise BootstrapManifestError(f"bootstrap manifest not found: {manifest_path}") from exc
except json.JSONDecodeError as exc:
raise BootstrapManifestError("bootstrap manifest is not valid JSON") from exc
if not isinstance(raw, dict):
raise BootstrapManifestError("bootstrap manifest root must be an object")
return parse_bootstrap_manifest_dict(
raw,
signer_public_key_b64=signer_public_key_b64,
now=now,
)
def load_bootstrap_manifest_from_settings(*, now: float | None = None) -> BootstrapManifest | None:
settings = get_settings()
if bool(getattr(settings, "MESH_BOOTSTRAP_DISABLED", False)):
return None
signer_public_key_b64 = str(getattr(settings, "MESH_BOOTSTRAP_SIGNER_PUBLIC_KEY", "") or "").strip()
from services.mesh.mesh_fleet_defaults import effective_bootstrap_signer_public_key_b64
signer_public_key_b64 = effective_bootstrap_signer_public_key_b64()
if not signer_public_key_b64:
return None
manifest_path = _resolve_manifest_path(str(getattr(settings, "MESH_BOOTSTRAP_MANIFEST_PATH", "") or ""))
+3 -3
@@ -168,9 +168,9 @@ def resolve_peer_key_for_url(peer_url: str) -> bytes:
try:
from services.config import get_settings
global_secret = str(
getattr(get_settings(), "MESH_PEER_PUSH_SECRET", "") or ""
).strip()
from services.mesh.mesh_fleet_defaults import effective_peer_push_secret
global_secret = effective_peer_push_secret()
except Exception:
return b""
if not global_secret:
@@ -0,0 +1,179 @@
"""Invite-scoped DM connect delivery: auto relay release and contact severance."""
from __future__ import annotations

from typing import Any

CONNECT_AUTO_RELEASE_INTENTS = frozenset(
    {
        "invite_short_address",
        "invite_import",
        "contact_request",
        "contact_accept",
        "contact_offer",
    }
)
INVITE_CONNECT_TRUST_LEVELS = frozenset({"invite_pinned", "sas_verified"})


def _release_profile() -> str:
    try:
        from services.release_profiles import current_release_profile

        return str(current_release_profile() or "dev")
    except Exception:
        return "dev"


def grant_connect_relay_policy(
    recipient_id: str,
    *,
    reason: str = "connect_scoped_auto_release",
) -> dict[str, Any]:
    """Pre-authorize hidden relay delivery for an explicit connect target."""
    peer_key = str(recipient_id or "").strip()
    if not peer_key:
        return {"ok": False, "detail": "recipient_id required"}
    try:
        from services.mesh.mesh_relay_policy import grant_relay_policy

        return grant_relay_policy(
            scope_type="dm_contact",
            scope_id=peer_key,
            profile=_release_profile(),
            hidden_transport_required=True,
            reason=str(reason or "connect_scoped_auto_release"),
        )
    except Exception as exc:
        return {"ok": False, "detail": str(exc) or type(exc).__name__}


def revoke_connect_relay_policy(recipient_id: str) -> dict[str, Any]:
    peer_key = str(recipient_id or "").strip()
    if not peer_key:
        return {"ok": False, "detail": "recipient_id required"}
    try:
        from services.mesh.mesh_relay_policy import revoke_relay_policy

        revoked = int(
            revoke_relay_policy(
                scope_type="dm_contact",
                scope_id=peer_key,
                profile=_release_profile(),
            )
            or 0
        )
        return {"ok": True, "revoked": revoked}
    except Exception as exc:
        return {"ok": False, "detail": str(exc) or type(exc).__name__}


def recipient_has_invite_connect_scope(recipient_id: str) -> bool:
    peer_key = str(recipient_id or "").strip()
    if not peer_key:
        return False
    try:
        from services.mesh.mesh_wormhole_contacts import get_wormhole_dm_contact

        contact = get_wormhole_dm_contact(peer_key) or {}
    except Exception:
        return False
    if str(contact.get("invitePinnedPrekeyLookupHandle", "") or "").strip():
        return True
    if str(contact.get("invitePinnedLookupPeerUrl", "") or "").strip():
        return True
    trust = str(contact.get("trust_level", "") or "").strip().lower()
    return trust in INVITE_CONNECT_TRUST_LEVELS


def relay_push_peer_urls_for_payload(payload: dict[str, Any]) -> list[str]:
    urls: list[str] = []
    for raw in list(payload.get("relay_push_peer_urls") or []):
        normalized = str(raw or "").strip().rstrip("/")
        if normalized and normalized not in urls:
            urls.append(normalized)
    lookup_peer_url = str(payload.get("lookup_peer_url", "") or "").strip().rstrip("/")
    if lookup_peer_url:
        urls = [url for url in urls if url != lookup_peer_url]
        urls.insert(0, lookup_peer_url)
    recipient_id = str(payload.get("recipient_id", "") or "").strip()
    if recipient_id and not urls:
        try:
            from services.mesh.mesh_wormhole_contacts import get_wormhole_dm_contact

            contact = get_wormhole_dm_contact(recipient_id) or {}
            pinned = str(contact.get("invitePinnedLookupPeerUrl", "") or "").strip().rstrip("/")
            if pinned:
                urls.append(pinned)
        except Exception:
            pass
    return urls


def should_auto_release_dm_payload(payload: dict[str, Any]) -> bool:
    if str(payload.get("delivery_class", "") or "").strip().lower() != "request":
        return False
    intent = str(payload.get("connect_intent", "") or "").strip().lower()
    if intent in CONNECT_AUTO_RELEASE_INTENTS:
        return True
    if str(payload.get("lookup_peer_url", "") or "").strip():
        return True
    recipient_id = str(payload.get("recipient_id", "") or "").strip()
    return bool(recipient_id and recipient_has_invite_connect_scope(recipient_id))


def enrich_connect_release_payload(payload: dict[str, Any]) -> dict[str, Any]:
    """Attach invite-owner relay hints used during private release."""
    enriched = dict(payload or {})
    recipient_id = str(enriched.get("recipient_id", "") or "").strip()
    lookup_peer_url = str(enriched.get("lookup_peer_url", "") or "").strip().rstrip("/")
    if not lookup_peer_url and recipient_id:
        try:
            from services.mesh.mesh_wormhole_contacts import get_wormhole_dm_contact

            contact = get_wormhole_dm_contact(recipient_id) or {}
            lookup_peer_url = str(contact.get("invitePinnedLookupPeerUrl", "") or "").strip().rstrip("/")
        except Exception:
            lookup_peer_url = ""
    if lookup_peer_url:
        enriched["lookup_peer_url"] = lookup_peer_url
    push_urls = relay_push_peer_urls_for_payload(enriched)
    if push_urls:
        enriched["relay_push_peer_urls"] = push_urls
    return enriched


def auto_release_connect_dm_outbox(*, outbox_id: str, payload: dict[str, Any]) -> dict[str, Any]:
    """Grant scoped relay policy and approve release for invite-scoped connect traffic."""
    normalized_outbox = str(outbox_id or "").strip()
    enriched = enrich_connect_release_payload(payload)
    if not normalized_outbox:
        return {"ok": False, "detail": "missing outbox_id"}
    if not should_auto_release_dm_payload(enriched):
        return {"ok": True, "skipped": True, "reason": "not_connect_scoped"}
    recipient_id = str(enriched.get("recipient_id", "") or "").strip()
    if not recipient_id:
        return {"ok": False, "detail": "missing recipient_id"}
    grant = grant_connect_relay_policy(recipient_id)
    try:
        from services.mesh.mesh_private_outbox import private_delivery_outbox
        from services.mesh.mesh_private_release_worker import private_release_worker

        private_delivery_outbox.approve_relay_release(normalized_outbox)
        private_release_worker.ensure_started()
        private_release_worker.wake()
    except Exception as exc:
        return {
            "ok": False,
            "detail": str(exc) or type(exc).__name__,
            "grant": grant,
        }
    return {
        "ok": True,
        "auto_released": True,
        "outbox_id": normalized_outbox,
        "recipient_id": recipient_id,
        "grant": grant,
        "relay_push_peer_urls": relay_push_peer_urls_for_payload(enriched),
    }
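The URL ordering performed by `relay_push_peer_urls_for_payload` (strip trailing slashes, drop duplicates, then move `lookup_peer_url` to the front) can be shown with a reduced version. This is a sketch under assumptions: `ordered_push_urls` is a hypothetical name, the example hosts are placeholders, and the contact-store fallback from the real function is deliberately left out.

```python
from typing import Any


def ordered_push_urls(payload: dict[str, Any]) -> list[str]:
    """Normalize and de-duplicate relay URLs, with the lookup peer first."""
    urls: list[str] = []
    for raw in payload.get("relay_push_peer_urls") or []:
        normalized = str(raw or "").strip().rstrip("/")
        if normalized and normalized not in urls:
            urls.append(normalized)
    lookup = str(payload.get("lookup_peer_url", "") or "").strip().rstrip("/")
    if lookup:
        # Remove any earlier occurrence so the lookup peer appears exactly once.
        urls = [url for url in urls if url != lookup]
        urls.insert(0, lookup)
    return urls


example = ordered_push_urls(
    {
        "relay_push_peer_urls": ["http://a.example/", "http://b.example", "http://a.example"],
        "lookup_peer_url": "http://b.example/",
    }
)
# example == ["http://b.example", "http://a.example"]
```

Putting the invite owner's relay first matters for the scoped path, since that is the node the recipient is expected to poll.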
+240 -105
@@ -1506,6 +1506,7 @@ class DMRelay:
sender_token_hash: str = "",
payload_format: str = "dm1",
session_welcome: str = "",
replication_peer_urls: list[str] | None = None,
) -> dict[str, Any]:
with self._lock:
self._refresh_from_shared_relay()
@@ -1573,46 +1574,214 @@ class DMRelay:
}
if not msg_id:
msg_id = f"dm_{int(time.time() * 1000)}_{secrets.token_hex(6)}"
elif any(m.msg_id == msg_id for m in self._mailboxes[mailbox_key]):
return {"ok": True, "msg_id": msg_id}
relay_sender_id = (
f"sender_token:{sender_token_hash}"
if sender_token_hash
else sender_id
)
self._mailboxes[mailbox_key].append(
DMMessage(
sender_id=relay_sender_id,
ciphertext=ciphertext,
timestamp=time.time(),
msg_id=msg_id,
delivery_class=delivery_class,
sender_seal=sender_seal,
sender_block_ref=sender_block_ref,
payload_format=str(payload_format or "dm1"),
session_welcome=str(session_welcome or ""),
duplicate_hit = any(m.msg_id == msg_id for m in self._mailboxes[mailbox_key])
if not duplicate_hit:
relay_sender_id = (
f"sender_token:{sender_token_hash}"
if sender_token_hash
else sender_id
)
)
self._stats["messages_in_memory"] = sum(len(v) for v in self._mailboxes.values())
self._save()
# Cross-node mailbox replication: push the freshly-stored
# envelope to every authenticated relay peer so the recipient
# can log into ANY node and find their messages. The push is
# async (fire-and-forget thread) so deposit() returns
# immediately — slow Tor peers can't block the sender's UX.
# Each receiving peer re-enforces the per-sender cap on
# acceptance, so hostile relays can't widen the cap.
self._mailboxes[mailbox_key].append(
DMMessage(
sender_id=relay_sender_id,
ciphertext=ciphertext,
timestamp=time.time(),
msg_id=msg_id,
delivery_class=delivery_class,
sender_seal=sender_seal,
sender_block_ref=sender_block_ref,
payload_format=str(payload_format or "dm1"),
session_welcome=str(session_welcome or ""),
)
)
self._stats["messages_in_memory"] = sum(len(v) for v in self._mailboxes.values())
self._save()
preferred_urls = list(replication_peer_urls or [])
envelope_for_push: dict[str, Any] | None = None
try:
envelope_for_push = self.envelope_for_replication(
mailbox_key=mailbox_key, msg_id=msg_id,
mailbox_key=mailbox_key,
msg_id=msg_id,
recipient_id=recipient_id,
recipient_token=recipient_token,
)
if envelope_for_push:
self._replicate_envelope_to_peers_async(
envelope=envelope_for_push,
)
except Exception:
metrics_inc("dm_replication_push_error")
return {"ok": True, "msg_id": msg_id}
deposit_result = {"ok": True, "msg_id": msg_id}
if duplicate_hit:
deposit_result["duplicate"] = True
if envelope_for_push:
# Invite-scoped connect traffic names an explicit recipient relay
# (lookup_peer_url). Block until that push completes so the
# recipient can poll their own node; fleet-wide fan-out stays
# async so dead manifest peers cannot wedge deposit().
if preferred_urls:
logger.info(
"DM deposit awaiting scoped replicate to %d peer(s)",
len(preferred_urls),
)
deposit_result["replicate"] = self._replicate_envelope_to_peers(
envelope=envelope_for_push,
preferred_peer_urls=preferred_urls,
)
else:
self._replicate_envelope_to_peers_async(
envelope=envelope_for_push,
preferred_peer_urls=[],
)
elif preferred_urls:
logger.warning(
"DM deposit skipped scoped replicate: envelope missing for msg_id=%s",
msg_id,
)
return deposit_result
def _replicate_envelope_to_peers(
self,
*,
envelope: dict[str, Any],
preferred_peer_urls: list[str] | None = None,
) -> dict[str, Any]:
"""Push an envelope to relay peers. Returns per-peer results."""
import hashlib
import hmac
import requests as _requests
from services.mesh.mesh_crypto import (
normalize_peer_url,
resolve_peer_key_for_url,
)
from services.mesh.mesh_router import authenticated_push_peer_urls
peers: list[str] = []
for raw_url in list(preferred_peer_urls or []):
normalized_preferred = normalize_peer_url(str(raw_url or "").strip())
if normalized_preferred and normalized_preferred not in peers:
peers.append(normalized_preferred)
if not peers:
for peer_url in authenticated_push_peer_urls():
normalized_peer = normalize_peer_url(str(peer_url or "").strip())
if normalized_peer and normalized_peer not in peers:
peers.append(normalized_peer)
if not peers:
return {"ok": False, "detail": "no_relay_peers", "pushed": [], "failed": []}
logger.info(
"DM replicate push starting for %d peer(s): %s",
len(peers),
", ".join(peers[:3]) + ("..." if len(peers) > 3 else ""),
)
payload = json.dumps(
{"envelope": envelope},
separators=(",", ":"),
ensure_ascii=False,
).encode("utf-8")
base_timeout = max(
1,
int(getattr(self._settings(), "MESH_RELAY_PUSH_TIMEOUT_S", 10) or 10),
)
from main import _infonet_peer_requests_proxies
preferred_set = {
normalize_peer_url(str(raw_url or "").strip())
for raw_url in list(preferred_peer_urls or [])
}
preferred_set.discard("")
pushed: list[str] = []
failed: list[dict[str, str]] = []
for peer_url in peers:
try:
normalized = normalize_peer_url(peer_url)
timeout = max(180 if ".onion" in normalized else 1, base_timeout)
headers = {"Content-Type": "application/json"}
peer_key = resolve_peer_key_for_url(normalized)
if peer_key:
headers["X-Peer-Url"] = normalized
headers["X-Peer-HMAC"] = hmac.new(
peer_key, payload, hashlib.sha256
).hexdigest()
url = f"{peer_url}/api/mesh/dm/replicate-envelope"
request_kwargs: dict[str, Any] = {
"data": payload,
"timeout": timeout,
"headers": headers,
}
proxies = _infonet_peer_requests_proxies(normalized)
if proxies:
request_kwargs["proxies"] = proxies
resp = None
max_attempts = 3 if normalized in preferred_set else 2
last_exc = ""
for attempt in range(max_attempts):
try:
resp = _requests.post(url, **request_kwargs)
break
except Exception as exc:
last_exc = str(exc) or type(exc).__name__
if attempt + 1 < max_attempts:
time.sleep(5.0 * (attempt + 1))
continue
logger.warning(
"DM replicate push to %s failed: %s",
peer_url,
last_exc,
)
metrics_inc("dm_replication_push_error")
resp = None
break
if resp is None:
failed.append({"url": peer_url, "detail": last_exc or "request_failed"})
continue
if resp.status_code == 200:
body_ok = True
detail = ""
try:
body = resp.json()
if isinstance(body, dict) and body.get("ok") is False:
body_ok = False
detail = str(body.get("detail", "") or "replicate rejected")[:200]
except Exception:
body_ok = True
if body_ok:
logger.info("DM replicate push to %s succeeded", peer_url)
metrics_inc("dm_replication_push_ok")
pushed.append(peer_url)
else:
logger.warning(
"DM replicate push to %s rejected: %s",
peer_url,
detail,
)
metrics_inc("dm_replication_push_rejected")
failed.append({"url": peer_url, "detail": detail or "replicate_rejected"})
else:
detail = (resp.text or "")[:200]
logger.warning(
"DM replicate push to %s -> %s: %s",
peer_url,
resp.status_code,
detail,
)
metrics_inc("dm_replication_push_rejected")
failed.append({"url": peer_url, "detail": f"http_{resp.status_code}: {detail}"})
except Exception as exc:
logger.warning("DM replicate push outer failure for %s: %s", peer_url, exc)
metrics_inc("dm_replication_push_error")
failed.append({"url": peer_url, "detail": str(exc) or type(exc).__name__})
scoped = bool(preferred_set)
ok = bool(pushed) if scoped else bool(pushed) or not failed
return {
"ok": ok,
"scoped": scoped,
"pushed": pushed,
"failed": failed,
}
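The `X-Peer-HMAC` header above is an HMAC-SHA256 hex digest over the exact request body bytes. The sketch below illustrates that signing step together with a matching check. `sign_body` and `verify_body` are hypothetical helpers, the key is a placeholder, and the verification side is an assumption, since the receiving endpoint is not part of this hunk.

```python
import hashlib
import hmac
import json


def sign_body(peer_key: bytes, body: bytes) -> str:
    """Hex HMAC-SHA256 over the serialized body, as placed in X-Peer-HMAC."""
    return hmac.new(peer_key, body, hashlib.sha256).hexdigest()


def verify_body(peer_key: bytes, body: bytes, presented_hex: str) -> bool:
    """Assumed receiver-side check; compares in constant time."""
    expected = sign_body(peer_key, body)
    return hmac.compare_digest(expected, presented_hex)


# Serialize once and sign those exact bytes; re-serializing on either side
# could change whitespace or key order and invalidate the digest.
body = json.dumps(
    {"envelope": {"msg_id": "dm_example"}},
    separators=(",", ":"),
    ensure_ascii=False,
).encode("utf-8")

key = b"placeholder-peer-key"  # illustrative only, not a real secret
signature = sign_body(key, body)
accepted = verify_body(key, body, signature)
rejected = verify_body(b"some-other-key", body, signature)
```

Using `hmac.compare_digest` rather than `==` avoids leaking how many leading characters of the digest matched.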
def accept_replica(
self,
@@ -1645,6 +1814,33 @@ class DMRelay:
mailbox_key = str(envelope.get("mailbox_key", "") or "").strip()
sender_block_ref = str(envelope.get("sender_block_ref", "") or "").strip()
ciphertext = str(envelope.get("ciphertext", "") or "")
delivery_class = str(envelope.get("delivery_class", "") or "").strip().lower()
recipient_id = str(envelope.get("recipient_id", "") or "").strip()
recipient_token = str(envelope.get("recipient_token", "") or "").strip()
if delivery_class not in ("request", "shared", "self"):
if recipient_id and not recipient_token:
delivery_class = "request"
elif recipient_token:
delivery_class = "shared"
if delivery_class == "request":
if not recipient_id:
try:
from services.mesh.mesh_wormhole_persona import get_dm_identity
recipient_id = str((get_dm_identity() or {}).get("node_id") or "").strip()
except Exception:
recipient_id = ""
if recipient_id:
mailbox_key = self.mailbox_key_for_delivery(
recipient_id=recipient_id,
delivery_class="request",
)
elif delivery_class == "shared" and recipient_token:
mailbox_key = self.mailbox_key_for_delivery(
recipient_id=recipient_id,
delivery_class="shared",
recipient_token=recipient_token,
)
if not msg_id or not mailbox_key or not sender_block_ref or not ciphertext:
return {"ok": False, "detail": "envelope missing required fields"}
@@ -1662,7 +1858,6 @@ class DMRelay:
# Same per-class cap as the deposit path — defense in depth
# against a peer that wraps a "deposit" as a "replica" to
# bypass the class limit.
delivery_class = str(envelope.get("delivery_class", "") or "")
if delivery_class in ("request", "shared", "self"):
class_limit = self._mailbox_limit_for_class(delivery_class)
else:
@@ -1716,82 +1911,18 @@ class DMRelay:
self,
*,
envelope: dict[str, Any],
preferred_peer_urls: list[str] | None = None,
) -> None:
"""Push an outbound DM envelope to every authenticated relay peer.
Fire-and-forget: spawned in a background thread so ``deposit``
returns to the caller immediately. Per-peer errors are logged
and swallowed; the sender's UX must not block on slow Tor
peers, and a peer that's down today gets the next message
whenever it comes back. Inbound recipient polling from a healthy
peer keeps the system functional during peer failures.
Each peer is authed with the existing per-peer HMAC pattern
(#256) — same headers and key resolver gate-message replication
uses, so a hostile node that doesn't know any peer's HMAC key
can't impersonate a legitimate relay.
"""
"""Fire-and-forget fleet-wide replicate push (non-scoped traffic)."""
import threading
def _do_push():
def _do_push() -> None:
try:
import hashlib
import hmac
import requests as _requests
from services.mesh.mesh_crypto import (
normalize_peer_url,
resolve_peer_key_for_url,
self._replicate_envelope_to_peers(
envelope=envelope,
preferred_peer_urls=preferred_peer_urls,
)
from services.mesh.mesh_router import (
authenticated_push_peer_urls,
)
peers = authenticated_push_peer_urls()
if not peers:
return
payload = json.dumps(
{"envelope": envelope},
separators=(",", ":"),
ensure_ascii=False,
).encode("utf-8")
timeout = max(
1,
int(getattr(self._settings(), "MESH_RELAY_PUSH_TIMEOUT_S", 10) or 10),
)
for peer_url in peers:
try:
normalized = normalize_peer_url(peer_url)
headers = {"Content-Type": "application/json"}
peer_key = resolve_peer_key_for_url(normalized)
if peer_key:
headers["X-Peer-Url"] = normalized
headers["X-Peer-HMAC"] = hmac.new(
peer_key, payload, hashlib.sha256
).hexdigest()
url = f"{peer_url}/api/mesh/dm/replicate-envelope"
resp = _requests.post(
url, data=payload, timeout=timeout, headers=headers,
)
if resp.status_code == 200:
metrics_inc("dm_replication_push_ok")
else:
# 4xx including the structured cap_violation
# rejection from accept_replica — sender's
# relay learns and stops retrying this msg_id.
metrics_inc("dm_replication_push_rejected")
except Exception:
# Per-peer failure is non-fatal — log to metrics
# but don't break the loop. Other peers and a
# future retry can still propagate the envelope.
metrics_inc("dm_replication_push_error")
continue
except Exception:
# Outer guard — never let replication errors propagate
# back to the sender's deposit() caller.
metrics_inc("dm_replication_push_error")
thread = threading.Thread(
@@ -1806,6 +1937,8 @@ class DMRelay:
*,
mailbox_key: str,
msg_id: str,
recipient_id: str = "",
recipient_token: str | None = None,
) -> dict[str, Any] | None:
"""Return the wire-form envelope for a stored message, suitable
for POSTing to a peer relay's replicate-envelope endpoint.
@@ -1822,6 +1955,8 @@ class DMRelay:
return {
"msg_id": m.msg_id,
"mailbox_key": mailbox_key,
"recipient_id": str(recipient_id or "").strip(),
"recipient_token": str(recipient_token or "").strip(),
"sender_id": m.sender_id,
"sender_block_ref": m.sender_block_ref,
"sender_seal": m.sender_seal,
@@ -0,0 +1,64 @@
"""Public Infonet fleet defaults for sb-testnet-0 participants.
Operators who run private single-node installs can set ``MESH_INFONET_FLEET_JOIN=false``
and provide their own signer keys / peer secrets.
"""
from __future__ import annotations
FLEET_NETWORK_ID = "sb-testnet-0"
FLEET_SEED_ONION_URL = (
"http://gqpbunqbgtkcqilvclm3xrkt3zowjyl3s62kkktvojgvxzizamvbrqid.onion:8000"
)
FLEET_BOOTSTRAP_SIGNER_PUBLIC_KEY_B64 = (
"ul1d0kj/ODPIp0OhHzX8eLAVXzJ3CVvzW1vn2IC6q3I="
)
# Shared fleet HMAC for sb-testnet peer announce/push/sync. Public testnet join model.
FLEET_PEER_PUSH_SECRET = "b7GoqsvoUD9MV7tyt0ZOzMptLA84QG6KCfaV9nDqz5Y"
def infonet_fleet_join_enabled() -> bool:
try:
from services.config import get_settings
if bool(getattr(get_settings(), "MESH_INFONET_FLEET_JOIN_DISABLED", False)):
return False
return bool(getattr(get_settings(), "MESH_INFONET_FLEET_JOIN", True))
except Exception:
return True
def effective_bootstrap_signer_public_key_b64() -> str:
try:
from services.config import get_settings
configured = str(getattr(get_settings(), "MESH_BOOTSTRAP_SIGNER_PUBLIC_KEY", "") or "").strip()
if configured:
return configured
except Exception:
pass
if infonet_fleet_join_enabled():
return FLEET_BOOTSTRAP_SIGNER_PUBLIC_KEY_B64
return ""
def effective_peer_push_secret() -> str:
try:
from services.config import get_settings
configured = str(getattr(get_settings(), "MESH_PEER_PUSH_SECRET", "") or "").strip()
if configured:
return configured
except Exception:
pass
if infonet_fleet_join_enabled():
return FLEET_PEER_PUSH_SECRET
return ""
def configured_bootstrap_seed_peers_with_fleet_default(peers: list[str]) -> list[str]:
if peers:
return peers
if infonet_fleet_join_enabled():
return [FLEET_SEED_ONION_URL]
return []
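The `effective_*` helpers above appear to share one precedence rule: an explicit operator setting wins, then the fleet default when fleet join is enabled, otherwise an empty value. A minimal sketch of that rule follows. It assumes a hypothetical `resolve_with_fleet_default` helper and plain arguments in place of `get_settings()`; neither is part of this commit.

```python
from __future__ import annotations


def resolve_with_fleet_default(
    configured: str | None,
    fleet_default: str,
    *,
    fleet_join: bool,
) -> str:
    """Hypothetical helper: operator value first, fleet default second, else empty."""
    value = str(configured or "").strip()
    if value:
        # An explicit operator setting always overrides the fleet default.
        return value
    # With fleet join disabled, private installs get no implicit key or secret.
    return fleet_default if fleet_join else ""
```

Passing `fleet_join=False` mirrors the `MESH_INFONET_FLEET_JOIN=false` opt-out described in the module docstring: nothing is inherited unless the operator configures it.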
+522 -83
@@ -33,8 +33,9 @@ Each event contains:
Persistence: JSON file at backend/data/infonet.json
Encrypted gate chat events are intentionally kept off the public chain and
persisted separately via GateMessageStore.
Encrypted gate chat events are private-chain ciphertext records. They are
excluded from public read surfaces and replicated only over private Infonet
transports.
"""
import json
@@ -64,6 +65,8 @@ from services.mesh.mesh_schema import (
ACTIVE_PUBLIC_LEDGER_EVENT_TYPES,
PUBLIC_LEDGER_EVENT_TYPES,
validate_event_payload,
validate_private_dm_ledger_payload,
validate_private_gate_ledger_payload,
validate_protocol_fields,
validate_public_ledger_payload,
)
@@ -127,6 +130,12 @@ GATE_SEGMENT_MAX_COMPRESSED_BYTES = max(
int(os.environ.get("MESH_GATE_SEGMENT_MAX_COMPRESSED_BYTES", str(2 * 1024 * 1024)) or str(2 * 1024 * 1024)),
)
GATE_SEGMENT_STORAGE_VERSION = 1
DM_HASHCHAIN_SPOOL_LIMIT = max(1, int(os.environ.get("MESH_DM_HASHCHAIN_SPOOL_LIMIT", "2") or "2"))
DM_HASHCHAIN_SPOOL_SENDER_LIMIT = max(
1,
int(os.environ.get("MESH_DM_HASHCHAIN_SPOOL_SENDER_LIMIT", "1") or "1"),
)
DM_HASHCHAIN_SPOOL_TTL_S = max(60, int(os.environ.get("MESH_DM_HASHCHAIN_SPOOL_TTL_S", "3600") or "3600"))
_PUBLIC_EVENT_APPEND_HOOKS: list[Any] = []
_PUBLIC_EVENT_APPEND_HOOKS_LOCK = threading.Lock()
@@ -340,6 +349,32 @@ def _private_gate_event_id(
).hexdigest()
def _private_gate_signature_payload_variants(gate_id: str, event: dict[str, Any]) -> list[dict[str, Any]]:
payload = _private_gate_signature_payload(gate_id, event)
variants: list[dict[str, Any]] = [payload]
event_payload = event.get("payload") if isinstance(event.get("payload"), dict) else {}
reply_to = str(event_payload.get("reply_to", "") or "").strip()
if reply_to:
variants.append(_private_gate_signature_payload(gate_id, event, include_reply_to=False))
if "epoch" in payload:
no_epoch = dict(payload)
no_epoch.pop("epoch", None)
variants.append(no_epoch)
if reply_to:
no_epoch_no_reply = _private_gate_signature_payload(gate_id, event, include_reply_to=False)
no_epoch_no_reply.pop("epoch", None)
variants.append(no_epoch_no_reply)
deduped: list[dict[str, Any]] = []
seen: set[str] = set()
for variant in variants:
material = json.dumps(variant, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
if material in seen:
continue
seen.add(material)
deduped.append(variant)
return deduped
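The deduplication step at the end of `_private_gate_signature_payload_variants` can be read in isolation: each candidate payload is serialized to canonical JSON (sorted keys, compact separators) and only the first occurrence of each serialization is kept, in order. The sketch below shows that technique on its own. The function name `dedupe_by_canonical_json` and the sample payloads are illustrative assumptions, not code from the commit.

```python
from __future__ import annotations

import json
from typing import Any


def dedupe_by_canonical_json(variants: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Keep the first occurrence of each distinct dict, preserving order."""
    seen: set[str] = set()
    unique: list[dict[str, Any]] = []
    for variant in variants:
        # sort_keys makes the serialization independent of key insertion order.
        material = json.dumps(
            variant, sort_keys=True, separators=(",", ":"), ensure_ascii=False
        )
        if material in seen:
            continue
        seen.add(material)
        unique.append(variant)
    return unique


# Hypothetical payloads: the first two differ only in key order.
candidates = [
    {"gate": "g1", "epoch": 3},
    {"epoch": 3, "gate": "g1"},
    {"gate": "g1"},
]
unique = dedupe_by_canonical_json(candidates)
```

Deduplicating before verification means each distinct signature payload is checked at most once.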
def _sanitize_private_gate_event(gate_id: str, event: dict[str, Any]) -> dict[str, Any]:
payload = event.get("payload") if isinstance(event.get("payload"), dict) else {}
sanitized = {
@@ -1568,11 +1603,18 @@ class Infonet:
def _rebuild_state(self) -> None:
self.event_index = {}
self.node_sequences = {}
# Keep private signed-write replay domains across public-chain
# rebuilds; these domains protect local side effects that are not
# represented as public Infonet events.
if not isinstance(getattr(self, "sequence_domains", None), dict):
self.sequence_domains = {}
# Keep private signed-write replay domains that are not represented
# on-chain, but rebuild the gate_message sequence domain from chain
# events so reloads/fork application do not mix it with public
# per-node message sequences.
preserved_domains = {}
if isinstance(getattr(self, "sequence_domains", None), dict):
preserved_domains = {
key: value
for key, value in self.sequence_domains.items()
if not str(key or "").endswith("|gate_message")
}
self.sequence_domains = dict(preserved_domains)
self.public_key_bindings = {}
self.revocations = {}
self._replay_filter = ReplayFilter()
@@ -1584,9 +1626,12 @@ class Infonet:
node_id = evt.get("node_id", "")
sequence = _safe_int(evt.get("sequence", 0) or 0, 0)
if node_id and sequence:
last = self.node_sequences.get(node_id, 0)
sequence_table, sequence_key = self._sequence_table_for_event(
evt.get("event_type", ""), node_id
)
last = sequence_table.get(sequence_key, 0)
if sequence > last:
self.node_sequences[node_id] = sequence
sequence_table[sequence_key] = sequence
public_key = str(evt.get("public_key", "") or "")
if public_key and node_id:
existing = self.public_key_bindings.get(public_key)
@@ -1898,6 +1943,295 @@ class Infonet:
self._save()
return True, "ok"
def _sequence_table_for_event(self, event_type: str, node_id: str) -> tuple[dict[str, int], str]:
normalized = str(event_type or "").strip().lower()
if normalized == "gate_message":
return self.sequence_domains, f"{node_id}|gate_message"
if normalized == "dm_message":
return self.sequence_domains, f"{node_id}|dm_message"
return self.node_sequences, node_id
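To illustrate why `_sequence_table_for_event` returns a table and a key, here is a minimal standalone sketch of per-domain replay protection. The `SequenceTracker` class and its `accept` method are hypothetical stand-ins for the relevant `Infonet` state; only the key format (`node_id|gate_message`, `node_id|dm_message`) is taken from the diff.

```python
from __future__ import annotations


class SequenceTracker:
    """Hypothetical stand-in for the sequence bookkeeping on Infonet."""

    def __init__(self) -> None:
        self.node_sequences: dict[str, int] = {}    # public per-node sequences
        self.sequence_domains: dict[str, int] = {}  # private per-domain sequences

    def _table_for(self, event_type: str, node_id: str) -> tuple[dict[str, int], str]:
        normalized = str(event_type or "").strip().lower()
        if normalized in {"gate_message", "dm_message"}:
            return self.sequence_domains, f"{node_id}|{normalized}"
        return self.node_sequences, node_id

    def accept(self, event_type: str, node_id: str, sequence: int) -> bool:
        """Return True and record the sequence if it advances its own domain."""
        table, key = self._table_for(event_type, node_id)
        if sequence <= 0 or sequence <= table.get(key, 0):
            return False  # missing sequence or replay within this domain
        table[key] = sequence
        return True


tracker = SequenceTracker()
results = [
    tracker.accept("message", "n1", 5),       # public domain
    tracker.accept("gate_message", "n1", 1),  # separate domain, not blocked by 5
    tracker.accept("gate_message", "n1", 1),  # replay inside the gate domain
    tracker.accept("dm_message", "n1", 1),    # DM domain is independent as well
]
```

Because each domain has its own counter, a gate post or DM with a low sequence number is not rejected just because the same node has already published public events with higher sequences.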
def _dm_spool_target_key(self, payload: dict[str, Any]) -> tuple[str, str]:
delivery_class = str(payload.get("delivery_class", "") or "").strip().lower()
if delivery_class == "shared":
key = str(payload.get("recipient_token", "") or "").strip()
else:
key = str(payload.get("recipient_id", "") or "").strip()
return delivery_class, key
def _dm_spool_active_counts(
self,
payload: dict[str, Any],
*,
sender_id: str = "",
now: float | None = None,
) -> tuple[int, int]:
delivery_class, key = self._dm_spool_target_key(payload)
if not key:
return 0, 0
sender_id = str(sender_id or "").strip()
current = time.time() if now is None else float(now)
total_count = 0
sender_count = 0
for evt in reversed(self.events):
if evt.get("event_type") != "dm_message":
continue
evt_payload = evt.get("payload") if isinstance(evt.get("payload"), dict) else {}
evt_delivery_class, evt_key = self._dm_spool_target_key(evt_payload)
if evt_delivery_class != delivery_class:
continue
if evt_key != key:
continue
evt_ts = float(evt_payload.get("timestamp", evt.get("timestamp", 0)) or 0)
if evt_ts > 0 and current - evt_ts > DM_HASHCHAIN_SPOOL_TTL_S:
continue
total_count += 1
if sender_id and str(evt.get("node_id", "") or "").strip() == sender_id:
sender_count += 1
if total_count >= DM_HASHCHAIN_SPOOL_LIMIT and (
not sender_id or sender_count >= DM_HASHCHAIN_SPOOL_SENDER_LIMIT
):
break
return total_count, sender_count
def _dm_spool_active_count(self, payload: dict[str, Any], *, now: float | None = None) -> int:
total_count, _sender_count = self._dm_spool_active_counts(payload, now=now)
return total_count
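The spool accounting above combines three limits: a total cap per mailbox target, a per-sender cap, and a TTL after which entries stop counting. A simplified sketch, assuming flat event dicts with hypothetical `recipient`, `sender`, and `timestamp` fields and using the default values from the `MESH_DM_HASHCHAIN_SPOOL_*` settings in this diff. Unlike the diff, it does not special-case missing timestamps or distinguish delivery classes.

```python
from __future__ import annotations

SPOOL_LIMIT = 2        # default of MESH_DM_HASHCHAIN_SPOOL_LIMIT
SENDER_LIMIT = 1       # default of MESH_DM_HASHCHAIN_SPOOL_SENDER_LIMIT
SPOOL_TTL_S = 3600     # default of MESH_DM_HASHCHAIN_SPOOL_TTL_S


def active_counts(
    events: list[dict], recipient: str, sender: str, now: float
) -> tuple[int, int]:
    """Count unexpired spool entries for a recipient, in total and for one sender."""
    total = 0
    from_sender = 0
    for evt in events:
        if evt["recipient"] != recipient:
            continue
        if now - evt["timestamp"] > SPOOL_TTL_S:
            continue  # expired entries no longer occupy a slot
        total += 1
        if evt["sender"] == sender:
            from_sender += 1
    return total, from_sender


def can_spool(events: list[dict], recipient: str, sender: str, now: float) -> bool:
    """Apply the per-sender cap first, then the total cap, as the diff does."""
    total, from_sender = active_counts(events, recipient, sender, now)
    return from_sender < SENDER_LIMIT and total < SPOOL_LIMIT


now = 10_000.0
events = [
    {"recipient": "bob", "sender": "alice", "timestamp": 9_000.0},
    {"recipient": "bob", "sender": "carol", "timestamp": 1_000.0},  # older than TTL
]
```

With these defaults a recipient holds at most two live spooled DMs, and no single sender can occupy more than one of those slots.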
def append_private_dm_message(
self,
*,
node_id: str,
payload: dict,
signature: str,
sequence: int,
public_key: str,
public_key_algo: str,
protocol_version: str = "",
timestamp: float = 0,
) -> dict:
"""Append an encrypted DM dead-drop message to the private Infonet ledger.
These events form a small offline spool, capped per mailbox target, so
the hashchain can carry a couple of sealed DMs without becoming an
unbounded global mailbox.
"""
event_type = "dm_message"
if sequence <= 0:
raise ValueError("sequence is required and must be > 0")
sequence_table, sequence_key = self._sequence_table_for_event(event_type, node_id)
last = sequence_table.get(sequence_key, 0)
if sequence <= last:
raise ValueError(f"Replay detected: sequence {sequence} <= last {last}")
raw_payload = dict(payload or {})
if "message" in raw_payload or "plaintext" in raw_payload or "_local_plaintext" in raw_payload:
raise ValueError("private DM ledger payload must not contain plaintext")
if str(raw_payload.get("transport_lock", "") or "").strip().lower() != "private_strong":
raise ValueError("DM hashchain spool requires private_strong transport_lock")
payload = normalize_payload(event_type, raw_payload)
ok, reason = validate_private_dm_ledger_payload(payload)
if not ok:
raise ValueError(reason)
total_count, sender_count = self._dm_spool_active_counts(payload, sender_id=node_id)
if sender_count >= DM_HASHCHAIN_SPOOL_SENDER_LIMIT:
raise ValueError("DM hashchain sender spool full for recipient")
if total_count >= DM_HASHCHAIN_SPOOL_LIMIT:
raise ValueError("DM hashchain spool full for recipient")
payload_json = json.dumps(payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
if len(payload_json.encode("utf-8")) > MAX_PAYLOAD_BYTES:
raise ValueError("payload exceeds max size")
protocol_version = str(protocol_version or PROTOCOL_VERSION)
ok, reason = validate_protocol_fields(protocol_version, NETWORK_ID)
if not ok:
raise ValueError(reason)
if not (signature and public_key and public_key_algo):
raise ValueError("Missing signature fields")
algo = parse_public_key_algo(public_key_algo)
if not algo:
raise ValueError("Unsupported public_key_algo")
if not verify_node_binding(node_id, public_key):
raise ValueError("node_id mismatch")
bound, bind_reason = self._bind_public_key(public_key, node_id)
if not bound:
raise ValueError(bind_reason)
sig_payload = build_signature_payload(
event_type=event_type,
node_id=node_id,
sequence=sequence,
payload=payload,
)
if not verify_signature(
public_key_b64=public_key,
public_key_algo=public_key_algo,
signature_hex=signature,
payload=sig_payload,
):
raise ValueError("Invalid signature")
revoked, _info = self._revocation_status(public_key)
if revoked:
raise ValueError("public key is revoked")
event = ChainEvent(
prev_hash=self.head_hash,
event_type=event_type,
node_id=node_id,
payload=payload,
timestamp=float(timestamp or time.time()),
sequence=sequence,
signature=signature,
public_key=public_key,
public_key_algo=public_key_algo,
protocol_version=protocol_version,
)
event_dict = event.to_dict()
self._write_wal(event_dict)
self.events.append(event_dict)
self.event_index[event.event_id] = len(self.events) - 1
self.head_hash = event.event_id
sequence_table[sequence_key] = sequence
self._replay_filter.add(event.event_id)
self._invalidate_merkle_cache()
self._update_counters_for_event(event_dict)
self._save()
try:
from services.mesh.mesh_rns import rns_bridge
rns_bridge.publish_event(event_dict)
except Exception:
pass
_notify_public_event_append_hooks(event_dict)
logger.info(
f"Infonet append [dm_message] by {_redact_node(node_id)} seq={sequence} "
f"id={event.event_id[:16]}..."
)
return event_dict
def append_private_gate_message(
self,
*,
node_id: str,
payload: dict,
signature: str,
sequence: int,
public_key: str,
public_key_algo: str,
protocol_version: str = "",
timestamp: float = 0,
) -> dict:
"""Append an encrypted gate message to the private Infonet ledger.
Gate messages use their own sequence domain so a gate post cannot
consume or replay-block the author's public broadcast sequence.
"""
event_type = "gate_message"
if sequence <= 0:
raise ValueError("sequence is required and must be > 0")
sequence_table, sequence_key = self._sequence_table_for_event(event_type, node_id)
last = sequence_table.get(sequence_key, 0)
if sequence <= last:
raise ValueError(f"Replay detected: sequence {sequence} <= last {last}")
raw_payload = dict(payload or {})
if "message" in raw_payload or "_local_plaintext" in raw_payload or "_local_reply_to" in raw_payload:
raise ValueError("private gate ledger payload must not contain plaintext")
if str(raw_payload.get("transport_lock", "") or "").strip().lower() != "private_strong":
raise ValueError("gate messages require private_strong transport_lock")
payload = normalize_payload(event_type, raw_payload)
ok, reason = validate_private_gate_ledger_payload(payload)
if not ok:
raise ValueError(reason)
payload_json = json.dumps(payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
if len(payload_json.encode("utf-8")) > MAX_PAYLOAD_BYTES:
raise ValueError("payload exceeds max size")
protocol_version = str(protocol_version or PROTOCOL_VERSION)
ok, reason = validate_protocol_fields(protocol_version, NETWORK_ID)
if not ok:
raise ValueError(reason)
if not (signature and public_key and public_key_algo):
raise ValueError("Missing signature fields")
algo = parse_public_key_algo(public_key_algo)
if not algo:
raise ValueError("Unsupported public_key_algo")
if not verify_node_binding(node_id, public_key):
raise ValueError("node_id mismatch")
bound, bind_reason = self._bind_public_key(public_key, node_id)
if not bound:
raise ValueError(bind_reason)
event_for_signature = {"payload": payload}
signature_ok = False
for signature_payload in _private_gate_signature_payload_variants(
str(payload.get("gate", "") or ""),
event_for_signature,
):
sig_payload = build_signature_payload(
event_type=event_type,
node_id=node_id,
sequence=sequence,
payload=signature_payload,
)
if verify_signature(
public_key_b64=public_key,
public_key_algo=public_key_algo,
signature_hex=signature,
payload=sig_payload,
):
signature_ok = True
break
if not signature_ok:
raise ValueError("Invalid signature")
revoked, _info = self._revocation_status(public_key)
if revoked:
raise ValueError("public key is revoked")
event = ChainEvent(
prev_hash=self.head_hash,
event_type=event_type,
node_id=node_id,
payload=payload,
timestamp=float(timestamp or time.time()),
sequence=sequence,
signature=signature,
public_key=public_key,
public_key_algo=public_key_algo,
protocol_version=protocol_version,
)
event_dict = event.to_dict()
self._write_wal(event_dict)
self.events.append(event_dict)
self.event_index[event.event_id] = len(self.events) - 1
self.head_hash = event.event_id
sequence_table[sequence_key] = sequence
self._replay_filter.add(event.event_id)
self._invalidate_merkle_cache()
self._update_counters_for_event(event_dict)
self._save()
try:
from services.mesh.mesh_rns import rns_bridge
rns_bridge.publish_event(event_dict)
except Exception:
pass
_notify_public_event_append_hooks(event_dict)
logger.info(
f"Infonet append [gate_message] by {_redact_node(node_id)} seq={sequence} "
f"id={event.event_id[:16]}..."
)
return event_dict
def append(
self,
event_type: str,
@@ -2078,6 +2412,18 @@ class Infonet:
if not event_id or not prev_hash:
rejected.append({"index": idx, "reason": "Missing event_id or prev_hash"})
continue
if event_id in self.event_index:
duplicates += 1
continue
if self._replay_filter.seen(event_id):
try:
from services.mesh.mesh_metrics import increment as metrics_inc
metrics_inc("ingest_replay_seen")
except Exception:
pass
duplicates += 1
continue
if prev_hash != expected_prev:
try:
from services.mesh.mesh_metrics import increment as metrics_inc
@@ -2096,25 +2442,14 @@ class Infonet:
pass
rejected.append({"index": idx, "reason": "network_id mismatch"})
continue
if event_id in self.event_index:
duplicates += 1
continue
if self._replay_filter.seen(event_id):
try:
from services.mesh.mesh_metrics import increment as metrics_inc
metrics_inc("ingest_replay_seen")
except Exception:
pass
duplicates += 1
continue
if prev_hash != self.head_hash:
rejected.append({"index": idx, "reason": "prev_hash does not match head"})
continue
if sequence <= 0:
rejected.append({"index": idx, "reason": "Invalid sequence"})
continue
last = self.node_sequences.get(node_id, 0)
sequence_table, sequence_key = self._sequence_table_for_event(event_type, node_id)
last = sequence_table.get(sequence_key, 0)
if sequence <= last:
rejected.append({"index": idx, "reason": "Replay detected"})
continue
@@ -2149,7 +2484,18 @@ class Infonet:
if not ok:
rejected.append({"index": idx, "reason": reason})
continue
ok, reason = validate_public_ledger_payload(event_type, payload)
if event_type == "gate_message":
ok, reason = validate_private_gate_ledger_payload(payload)
elif event_type == "dm_message":
ok, reason = validate_private_dm_ledger_payload(payload)
if ok:
total_count, sender_count = self._dm_spool_active_counts(payload, sender_id=str(evt.get("node_id", "") or ""))
if sender_count >= DM_HASHCHAIN_SPOOL_SENDER_LIMIT:
ok, reason = False, "DM hashchain sender spool full for recipient"
elif total_count >= DM_HASHCHAIN_SPOOL_LIMIT:
ok, reason = False, "DM hashchain spool full for recipient"
else:
ok, reason = validate_public_ledger_payload(event_type, payload)
if not ok:
rejected.append({"index": idx, "reason": reason})
continue
@@ -2225,7 +2571,7 @@ class Infonet:
pass
rejected.append({"index": idx, "reason": "public key is revoked"})
continue
last_seq = self.node_sequences.get(node_id, 0)
last_seq = sequence_table.get(sequence_key, 0)
if sequence <= last_seq:
try:
from services.mesh.mesh_metrics import increment as metrics_inc
@@ -2261,18 +2607,30 @@ class Infonet:
rejected.append({"index": idx, "reason": bind_reason})
continue
sig_payload = build_signature_payload(
event_type=event_type,
node_id=node_id,
sequence=sequence,
payload=payload,
)
if not verify_signature(
public_key_b64=public_key,
public_key_algo=public_key_algo,
signature_hex=signature,
payload=sig_payload,
):
if event_type == "gate_message":
signature_payloads = _private_gate_signature_payload_variants(
str(payload.get("gate", "") or ""),
evt,
)
else:
signature_payloads = [payload]
signature_ok = False
for signature_payload in signature_payloads:
sig_payload = build_signature_payload(
event_type=event_type,
node_id=node_id,
sequence=sequence,
payload=signature_payload,
)
if verify_signature(
public_key_b64=public_key,
public_key_algo=public_key_algo,
signature_hex=signature,
payload=sig_payload,
):
signature_ok = True
break
if not signature_ok:
try:
from services.mesh.mesh_metrics import increment as metrics_inc
@@ -2302,7 +2660,7 @@ class Infonet:
self.events.append(evt)
self.event_index[event_id] = len(self.events) - 1
self.head_hash = event_id
self.node_sequences[node_id] = sequence
sequence_table[sequence_key] = sequence
self._update_counters_for_event(evt)
accepted += 1
expected_prev = event_id
@@ -2365,6 +2723,7 @@ class Infonet:
verify_node_binding,
)
event_type = evt_dict.get("event_type", "")
node_id = evt_dict.get("node_id", "")
if not parse_public_key_algo(public_key_algo):
return False, f"Unsupported public_key_algo at index {i}"
@@ -2375,21 +2734,41 @@ class Infonet:
return False, f"public key binding conflict at index {i}"
seen_public_keys[public_key] = node_id
normalized = normalize_payload(
evt_dict.get("event_type", ""), evt_dict.get("payload", {})
)
sig_payload = build_signature_payload(
event_type=evt_dict.get("event_type", ""),
node_id=node_id,
sequence=_safe_int(evt_dict.get("sequence", 0) or 0, 0),
payload=normalized,
)
if not verify_signature(
public_key_b64=public_key,
public_key_algo=public_key_algo,
signature_hex=signature,
payload=sig_payload,
):
payload = evt_dict.get("payload", {})
if event_type == "gate_message":
ok, reason = validate_private_gate_ledger_payload(payload)
if not ok:
return False, f"Invalid gate_message payload at index {i}: {reason}"
signature_payloads = _private_gate_signature_payload_variants(
str(payload.get("gate", "") or ""),
evt_dict,
)
elif event_type == "dm_message":
ok, reason = validate_private_dm_ledger_payload(payload)
if not ok:
return False, f"Invalid dm_message payload at index {i}: {reason}"
signature_payloads = [normalize_payload(event_type, payload)]
else:
signature_payloads = [
normalize_payload(event_type, payload)
]
signature_ok = False
for signature_payload in signature_payloads:
sig_payload = build_signature_payload(
event_type=event_type,
node_id=node_id,
sequence=_safe_int(evt_dict.get("sequence", 0) or 0, 0),
payload=signature_payload,
)
if verify_signature(
public_key_b64=public_key,
public_key_algo=public_key_algo,
signature_hex=signature,
payload=sig_payload,
):
signature_ok = True
break
if not signature_ok:
return False, f"Invalid signature at index {i}"
prev = evt_dict["event_id"]
@@ -2454,27 +2833,48 @@ class Infonet:
verify_node_binding,
)
event_type = evt_dict.get("event_type", "")
node_id = evt_dict.get("node_id", "")
if not parse_public_key_algo(public_key_algo):
return False, f"Unsupported public_key_algo at index {i}"
if not verify_node_binding(node_id, public_key):
return False, f"node_id mismatch at index {i}"
normalized = normalize_payload(
evt_dict.get("event_type", ""), evt_dict.get("payload", {})
)
sig_payload = build_signature_payload(
event_type=evt_dict.get("event_type", ""),
node_id=node_id,
sequence=_safe_int(evt_dict.get("sequence", 0) or 0, 0),
payload=normalized,
)
if not verify_signature(
public_key_b64=public_key,
public_key_algo=public_key_algo,
signature_hex=signature,
payload=sig_payload,
):
payload = evt_dict.get("payload", {})
if event_type == "gate_message":
ok, reason = validate_private_gate_ledger_payload(payload)
if not ok:
return False, f"Invalid gate_message payload at index {i}: {reason}"
signature_payloads = _private_gate_signature_payload_variants(
str(payload.get("gate", "") or ""),
evt_dict,
)
elif event_type == "dm_message":
ok, reason = validate_private_dm_ledger_payload(payload)
if not ok:
return False, f"Invalid dm_message payload at index {i}: {reason}"
signature_payloads = [normalize_payload(event_type, payload)]
else:
signature_payloads = [
normalize_payload(event_type, payload)
]
signature_ok = False
for signature_payload in signature_payloads:
sig_payload = build_signature_payload(
event_type=event_type,
node_id=node_id,
sequence=_safe_int(evt_dict.get("sequence", 0) or 0, 0),
payload=signature_payload,
)
if verify_signature(
public_key_b64=public_key,
public_key_algo=public_key_algo,
signature_hex=signature,
payload=sig_payload,
):
signature_ok = True
break
if not signature_ok:
return False, f"Invalid signature at index {i}"
prev = evt_dict["event_id"]
@@ -2538,7 +2938,14 @@ class Infonet:
node_id = evt.get("node_id", "")
sequence = _safe_int(evt.get("sequence", 0) or 0, 0)
if node_id and sequence:
last_seq[node_id] = max(last_seq.get(node_id, 0), sequence)
sequence_key = (
f"{node_id}|gate_message"
if str(evt.get("event_type", "") or "").strip().lower() == "gate_message"
else f"{node_id}|dm_message"
if str(evt.get("event_type", "") or "").strip().lower() == "dm_message"
else node_id
)
last_seq[sequence_key] = max(last_seq.get(sequence_key, 0), sequence)
public_key = str(evt.get("public_key", "") or "")
if public_key and node_id:
seen_public_keys.setdefault(public_key, node_id)
@@ -2558,8 +2965,21 @@ class Infonet:
existing_idx = self.event_index.get(event_id)
if existing_idx is not None and existing_idx <= prev_index:
return False, "duplicate event_id"
payload = normalize_payload(event_type, dict(payload or {}))
if event_type == "gate_message":
payload = dict(payload or {})
elif event_type == "dm_message":
payload = normalize_payload(event_type, dict(payload or {}))
else:
payload = normalize_payload(event_type, dict(payload or {}))
ok, reason = validate_event_payload(event_type, payload)
if not ok:
return False, reason
if event_type == "gate_message":
ok, reason = validate_private_gate_ledger_payload(payload)
elif event_type == "dm_message":
ok, reason = validate_private_dm_ledger_payload(payload)
else:
ok, reason = validate_public_ledger_payload(event_type, payload)
if not ok:
return False, reason
proto = evt.get("protocol_version") or PROTOCOL_VERSION
@@ -2573,7 +2993,14 @@ class Infonet:
revoked, _info = self._revocation_status(public_key)
if revoked and event_type != "key_revoke":
return False, "public key revoked"
last = last_seq.get(node_id, 0)
sequence_key = (
f"{node_id}|gate_message"
if event_type == "gate_message"
else f"{node_id}|dm_message"
if event_type == "dm_message"
else node_id
)
last = last_seq.get(sequence_key, 0)
if sequence <= last:
return False, "sequence replay"
from services.mesh.mesh_crypto import (
@@ -2591,23 +3018,35 @@ class Infonet:
if existing and existing != node_id:
return False, "public key binding conflict"
seen_public_keys[public_key] = node_id
sig_payload = build_signature_payload(
event_type=event_type,
node_id=node_id,
sequence=sequence,
payload=payload,
)
if not verify_signature(
public_key_b64=public_key,
public_key_algo=public_key_algo,
signature_hex=signature,
payload=sig_payload,
):
if event_type == "gate_message":
signature_payloads = _private_gate_signature_payload_variants(
str(payload.get("gate", "") or ""),
evt,
)
else:
signature_payloads = [payload]
signature_ok = False
for signature_payload in signature_payloads:
sig_payload = build_signature_payload(
event_type=event_type,
node_id=node_id,
sequence=sequence,
payload=signature_payload,
)
if verify_signature(
public_key_b64=public_key,
public_key_algo=public_key_algo,
signature_hex=signature,
payload=sig_payload,
):
signature_ok = True
break
if not signature_ok:
return False, "invalid signature"
computed = ChainEvent.from_dict(evt).event_id
if computed != event_id:
return False, "event_id mismatch"
last_seq[node_id] = sequence
last_seq[sequence_key] = sequence
# Apply fork
self.events = prefix + ordered
@@ -0,0 +1,86 @@
"""Auto-enable Tor wormhole transport on Infonet relay/seed nodes."""
from __future__ import annotations
import logging
from typing import Any
from services.config import get_settings
from services.wormhole_settings import read_wormhole_settings, write_wormhole_settings
logger = logging.getLogger(__name__)
def infonet_relay_auto_wormhole_requested() -> bool:
settings = get_settings()
if bool(settings.MESH_INFONET_RELAY_AUTO_WORMHOLE_DISABLED):
return False
if bool(settings.MESH_INFONET_RELAY_AUTO_WORMHOLE):
return True
if str(settings.MESH_BOOTSTRAP_SIGNER_PRIVATE_KEY or "").strip():
return True
return False
def _relay_tor_wormhole_target_settings() -> dict[str, Any]:
settings = get_settings()
socks_port = int(settings.MESH_ARTI_SOCKS_PORT or 9050)
return {
"enabled": True,
"transport": "tor_arti",
"socks_proxy": f"socks5h://127.0.0.1:{socks_port}",
"socks_dns": True,
"anonymous_mode": True,
}
def _wormhole_settings_match(existing: dict[str, Any], target: dict[str, Any]) -> bool:
return (
bool(existing.get("enabled")) is bool(target["enabled"])
and str(existing.get("transport", "")) == str(target["transport"])
and str(existing.get("socks_proxy", "")) == str(target["socks_proxy"])
and bool(existing.get("socks_dns", True)) is bool(target["socks_dns"])
and bool(existing.get("anonymous_mode", False)) is bool(target["anonymous_mode"])
)
def ensure_infonet_relay_wormhole_ready(*, reason: str = "relay_auto") -> dict[str, Any]:
"""Persist Tor wormhole settings and connect on relay/seed startup."""
if not infonet_relay_auto_wormhole_requested():
return {"ok": True, "skipped": True, "reason": "not_requested"}
from routers.ai_intel import _write_env_value
from services.tor_hidden_service import tor_service
from services.wormhole_supervisor import connect_wormhole, restart_wormhole
existing = read_wormhole_settings()
target = _relay_tor_wormhole_target_settings()
settings_updated = not _wormhole_settings_match(existing, target)
updated = write_wormhole_settings(**target) if settings_updated else existing
tor_result: dict[str, Any] = {"ok": False, "detail": "not started"}
try:
tor_result = tor_service.start(target_port=8000)
if tor_result.get("ok"):
_write_env_value("MESH_ARTI_ENABLED", "true")
get_settings.cache_clear()
except Exception as exc:
tor_result = {"ok": False, "detail": str(exc or type(exc).__name__)}
runtime = (
restart_wormhole(reason=reason)
if settings_updated
else connect_wormhole(reason=reason)
)
if settings_updated:
logger.info("Infonet relay auto-wormhole enabled (%s)", reason)
return {
"ok": True,
"skipped": False,
"settings_updated": settings_updated,
"tor": tor_result,
"runtime": runtime,
"settings": updated,
}
@@ -276,5 +276,6 @@ def should_run_sync(
) -> bool:
current_time = int(now if now is not None else time.time())
if state.last_outcome == "running":
return False
started_at = int(state.last_sync_started_at or 0)
return started_at <= 0 or current_time - started_at >= 300
return int(state.next_sync_due_at or 0) <= current_time
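The change to `should_run_sync` replaces an unconditional `return False` for a `running` state with a staleness check, so a sync that was marked running but never finished can be retried after 300 seconds. A minimal sketch of the resulting behavior follows. The `SyncState` dataclass is a hypothetical stand-in with only the three fields the hunk reads, and the real function may take further parameters not shown in the diff.

```python
from __future__ import annotations

import time
from dataclasses import dataclass

STALE_RUNNING_AFTER_S = 300  # threshold used in the hunk above


@dataclass
class SyncState:
    """Hypothetical stand-in for the real sync state object."""

    last_outcome: str = ""
    last_sync_started_at: int = 0
    next_sync_due_at: int = 0


def should_run_sync(state: SyncState, now: float | None = None) -> bool:
    current_time = int(now if now is not None else time.time())
    if state.last_outcome == "running":
        started_at = int(state.last_sync_started_at or 0)
        # No recorded start, or a start older than the threshold, counts as stale.
        return started_at <= 0 or current_time - started_at >= STALE_RUNNING_AFTER_S
    return int(state.next_sync_due_at or 0) <= current_time
```

A sync that started 100 seconds ago still blocks a new run, while one that started 300 or more seconds ago, or has no recorded start time, no longer does.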
@@ -125,8 +125,8 @@ def dm_lookup_response_view(
view.pop("lookup_mode", None)
view.pop("removal_target", None)
return view
if invite_lookup:
view.pop("agent_id", None)
# Successful invite lookups keep agent_id: the handle is the capability and
# first-contact messaging needs a delivery target. Failures stay generic.
return view
+152
@@ -0,0 +1,152 @@
"""Operator-signed peer registry for private Infonet swarm discovery."""
from __future__ import annotations
import json
import time
from dataclasses import asdict, dataclass, field
from pathlib import Path
from typing import Any
from services.mesh.mesh_crypto import normalize_peer_url
from services.mesh.mesh_router import peer_transport_kind
BACKEND_DIR = Path(__file__).resolve().parents[2]
DATA_DIR = BACKEND_DIR / "data"
DEFAULT_PEER_REGISTRY_PATH = DATA_DIR / "peer_registry.json"
REGISTRY_VERSION = 1
ALLOWED_REGISTRY_ROLES = {"participant", "relay", "seed"}
@dataclass
class RegistryPeer:
peer_url: str
transport: str
role: str
node_id: str = ""
label: str = ""
announced_at: int = 0
last_seen_at: int = 0
failure_count: int = 0
def to_dict(self) -> dict[str, Any]:
return asdict(self)
def manifest_peer(self) -> dict[str, str]:
return {
"peer_url": self.peer_url,
"transport": self.transport,
"role": self.role,
"label": self.label or self.node_id[:16],
}
class PeerRegistry:
def __init__(self, path: str | Path = DEFAULT_PEER_REGISTRY_PATH):
self.path = Path(path)
self._peers: dict[str, RegistryPeer] = {}
def load(self) -> list[RegistryPeer]:
if not self.path.exists():
self._peers = {}
return []
raw = json.loads(self.path.read_text(encoding="utf-8"))
if not isinstance(raw, dict):
raise ValueError("peer registry root must be an object")
version = int(raw.get("version", 0) or 0)
if version != REGISTRY_VERSION:
raise ValueError(f"unsupported peer registry version: {version}")
entries = raw.get("peers", [])
if not isinstance(entries, list):
raise ValueError("peer registry peers must be a list")
peers: dict[str, RegistryPeer] = {}
for entry in entries:
if not isinstance(entry, dict):
continue
peer = self._normalize_entry(entry)
peers[peer.peer_url] = peer
self._peers = peers
return self.records()
def save(self) -> None:
self.path.parent.mkdir(parents=True, exist_ok=True)
payload = {
"version": REGISTRY_VERSION,
"updated_at": int(time.time()),
"peers": [peer.to_dict() for peer in self.records()],
}
self.path.write_text(
json.dumps(payload, sort_keys=True, indent=2) + "\n",
encoding="utf-8",
)
def records(self) -> list[RegistryPeer]:
return sorted(self._peers.values(), key=lambda item: (item.role, item.peer_url))
def upsert_announcement(
self,
*,
peer_url: str,
transport: str,
role: str,
node_id: str = "",
label: str = "",
now: float | None = None,
) -> RegistryPeer:
normalized = normalize_peer_url(peer_url)
if not normalized:
raise ValueError("peer_url is required")
resolved_transport = str(transport or "").strip().lower() or str(peer_transport_kind(normalized) or "")
if resolved_transport not in {"onion", "clearnet"}:
raise ValueError("unsupported peer transport")
resolved_role = str(role or "participant").strip().lower()
if resolved_role not in ALLOWED_REGISTRY_ROLES:
raise ValueError("unsupported peer role")
timestamp = int(now if now is not None else time.time())
existing = self._peers.get(normalized)
peer = RegistryPeer(
peer_url=normalized,
transport=resolved_transport,
role=resolved_role,
node_id=str(node_id or (existing.node_id if existing else "") or "").strip(),
label=str(label or (existing.label if existing else "") or "").strip(),
announced_at=int(existing.announced_at if existing and existing.announced_at else timestamp),
last_seen_at=timestamp,
failure_count=int(existing.failure_count if existing else 0),
)
self._peers[normalized] = peer
return peer
def prune_stale(self, *, max_age_s: int, now: float | None = None) -> int:
timestamp = int(now if now is not None else time.time())
removed = 0
for peer_url, peer in list(self._peers.items()):
if peer.role == "seed":
continue
last_seen = int(peer.last_seen_at or peer.announced_at or 0)
if last_seen > 0 and timestamp - last_seen > max(60, int(max_age_s or 0)):
del self._peers[peer_url]
removed += 1
return removed
def manifest_peers(self) -> list[dict[str, str]]:
return [peer.manifest_peer() for peer in self.records()]
def _normalize_entry(self, entry: dict[str, Any]) -> RegistryPeer:
peer_url = normalize_peer_url(str(entry.get("peer_url", "") or ""))
if not peer_url:
raise ValueError("registry peer_url is required")
transport = str(entry.get("transport", "") or peer_transport_kind(peer_url) or "").strip().lower()
role = str(entry.get("role", "participant") or "participant").strip().lower()
if role not in ALLOWED_REGISTRY_ROLES:
raise ValueError("registry role unsupported")
return RegistryPeer(
peer_url=peer_url,
transport=transport,
role=role,
node_id=str(entry.get("node_id", "") or "").strip(),
label=str(entry.get("label", "") or "").strip(),
announced_at=int(entry.get("announced_at", 0) or 0),
last_seen_at=int(entry.get("last_seen_at", 0) or entry.get("announced_at", 0) or 0),
failure_count=int(entry.get("failure_count", 0) or 0),
)
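The pruning rule in `prune_stale` can be looked at in isolation. The following is a minimal sketch, assuming a simplified `Peer` record (a hypothetical stand-in for `RegistryPeer`) and a plain dict in place of the registry. It mirrors the seed exemption and the 60-second floor on `max_age_s` seen above, but it is not the module's code.

```python
from dataclasses import dataclass


@dataclass
class Peer:  # hypothetical stand-in for RegistryPeer
    peer_url: str
    role: str
    announced_at: int = 0
    last_seen_at: int = 0


def prune_stale(peers: dict[str, Peer], *, max_age_s: int, now: int) -> int:
    """Drop non-seed peers that were last seen more than max_age_s ago."""
    limit = max(60, int(max_age_s or 0))  # ages below 60 s are raised to 60 s
    removed = 0
    for url, peer in list(peers.items()):
        if peer.role == "seed":
            continue  # seeds are never pruned
        last_seen = int(peer.last_seen_at or peer.announced_at or 0)
        # Peers with no timestamp at all (last_seen == 0) are kept.
        if last_seen > 0 and now - last_seen > limit:
            del peers[url]
            removed += 1
    return removed


peers = {
    "http://seed.onion": Peer("http://seed.onion", "seed", last_seen_at=1),
    "http://old.onion": Peer("http://old.onion", "participant", last_seen_at=100),
    "http://new.onion": Peer("http://new.onion", "participant", last_seen_at=990),
    "http://blank.onion": Peer("http://blank.onion", "relay"),
}
removed = prune_stale(peers, max_age_s=10, now=1000)
```

With `max_age_s=10` the effective limit is still 60 seconds, so only the peer last seen 900 seconds ago is dropped.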
+1 -1
@@ -16,7 +16,7 @@ DATA_DIR = BACKEND_DIR / "data"
DEFAULT_PEER_STORE_PATH = DATA_DIR / "peer_store.json"
PEER_STORE_VERSION = 1
ALLOWED_PEER_BUCKETS = {"bootstrap", "sync", "push"}
ALLOWED_PEER_SOURCES = {"bundle", "operator", "bootstrap_promoted", "runtime"}
ALLOWED_PEER_SOURCES = {"bundle", "operator", "bootstrap_promoted", "runtime", "swarm"}
ALLOWED_PEER_TRANSPORTS = {"clearnet", "onion"}
ALLOWED_PEER_ROLES = {"participant", "relay", "seed"}
+55 -4
@@ -140,10 +140,24 @@ def transport_tier_from_state(state: dict[str, Any] | None) -> str:
snapshot = state or {}
if not bool(snapshot.get("configured")):
return "public_degraded"
if not bool(snapshot.get("ready")):
return "public_degraded"
arti_ready = bool(snapshot.get("arti_ready"))
rns_ready = bool(snapshot.get("rns_ready"))
running = bool(snapshot.get("running"))
transport_usable = bool(snapshot.get("ready"))
if not transport_usable:
try:
from services.config import get_settings
if (
bool(getattr(get_settings(), "MESH_WORMHOLE_TRUST_FILE_READY", False))
and running
and arti_ready
):
transport_usable = True
except Exception:
pass
if not transport_usable:
return "public_degraded"
if arti_ready and rns_ready:
return "private_strong"
if arti_ready or rns_ready:
@@ -157,8 +171,45 @@ def transport_tier_is_sufficient(current_tier: str | None, required_tier: str |
return TRANSPORT_TIER_ORDER[current] >= TRANSPORT_TIER_ORDER[required]
def release_lane_required_tier(lane: str) -> str:
return network_release_required_tier(lane)
_DM_RUNTIME_ENFORCEMENT_ROUTES = {
("POST", "/api/mesh/dm/send"),
("POST", "/api/mesh/dm/poll"),
("GET", "/api/mesh/dm/poll"),
("GET", "/api/mesh/dm/count"),
("POST", "/api/mesh/dm/count"),
}
def runtime_route_enforcement_tier(path: str, method: str, *, static_tier: str) -> str:
"""Adjust static route tiers for Tor-only nodes that never reach private_strong."""
normalized_path = str(path or "").strip()
normalized_method = str(method or "").strip().upper()
static = normalize_transport_tier(static_tier)
if (normalized_method, normalized_path) not in _DM_RUNTIME_ENFORCEMENT_ROUTES:
return static
if static != "private_strong":
return static
return release_lane_required_tier("dm")
def release_lane_required_tier(lane: str, *, wormhole_state: dict[str, Any] | None = None) -> str:
normalized_lane = str(lane or "").strip().lower()
required = network_release_required_tier(normalized_lane)
if normalized_lane != "dm":
return required
state = wormhole_state
if state is None:
try:
from services.wormhole_supervisor import get_wormhole_state
state = get_wormhole_state()
except Exception:
state = {}
# Tor-only nodes never reach private_strong (needs Arti + RNS). Encrypted
# relay over Arti still preserves ciphertext privacy for offline delivery.
if not bool((state or {}).get("rns_enabled")):
return "private_transitional"
return required
def private_delivery_status(status_code: str, *, reason_code: str = "", plain_reason: str = "") -> dict[str, str]:
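The effect of the DM lane cap on a Tor-only node can be sketched as follows. This is an illustration under stated assumptions: the ordering in `TIER_ORDER` is assumed (the real `TRANSPORT_TIER_ORDER` is not shown in this diff), the tier returned when only one of Arti or RNS is ready is assumed to be `private_transitional`, and the trust-file fallback is left out. All function names here are hypothetical.

```python
# Assumed ordering; the real TRANSPORT_TIER_ORDER is not part of this diff.
TIER_ORDER = {"public_degraded": 0, "private_transitional": 1, "private_strong": 2}


def tier_is_sufficient(current: str, required: str) -> bool:
    return TIER_ORDER[current] >= TIER_ORDER[required]


def tier_from_state(state: dict) -> str:
    """Simplified tier derivation; ignores the trust-file fallback path."""
    if not state.get("configured") or not state.get("ready"):
        return "public_degraded"
    arti = bool(state.get("arti_ready"))
    rns = bool(state.get("rns_ready"))
    if arti and rns:
        return "private_strong"
    if arti or rns:
        return "private_transitional"  # assumed; this branch is cut off in the diff
    return "public_degraded"  # assumed fallthrough


def dm_lane_required_tier(state: dict, static_required: str = "private_strong") -> str:
    # Tor-only nodes (RNS disabled) are capped at private_transitional.
    if not state.get("rns_enabled"):
        return "private_transitional"
    return static_required


tor_only = {"configured": True, "ready": True, "arti_ready": True, "rns_enabled": False}
current = tier_from_state(tor_only)
required = dm_lane_required_tier(tor_only)
dm_allowed = tier_is_sufficient(current, required)
```

Without the cap, the same node would be compared against `private_strong` and the DM routes would stay blocked.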
@@ -386,6 +386,20 @@ def _dispatch_dm(
sampled=sampled,
)
replication_peer_urls: list[str] = []
try:
from services.mesh.mesh_dm_connect_delivery import relay_push_peer_urls_for_payload
replication_peer_urls = [
str(raw or "").strip().rstrip("/")
for raw in list(payload.get("relay_push_peer_urls") or [])
if str(raw or "").strip()
]
if not replication_peer_urls:
replication_peer_urls = relay_push_peer_urls_for_payload(payload)
except Exception:
replication_peer_urls = []
apply_dm_relay_jitter()
relay_result = dm_relay.deposit(
sender_id=relay_sender_id,
@@ -399,7 +413,25 @@ def _dispatch_dm(
sender_token_hash=sender_token_hash,
payload_format=payload_format,
session_welcome=session_welcome,
replication_peer_urls=replication_peer_urls,
)
replicate_info = dict(relay_result.get("replicate") or {})
if replication_peer_urls and not replicate_info.get("ok"):
return _dispatch_result(
ok=False,
lane="dm",
selected_transport="relay",
selected_carrier="relay",
dispatch_reason="scoped_relay_replicate_failed",
hidden_transport_effective=bool(hidden_relay),
no_acceptable_path=False,
detail=(
"Scoped relay replicate did not reach the recipient node: "
+ str(replicate_info.get("failed") or replicate_info.get("detail") or "unknown")
),
msg_id=msg_id,
replicate=replicate_info,
)
if not relay_result.get("ok"):
return _dispatch_result(
ok=False,
@@ -436,6 +468,7 @@ def _dispatch_dm(
else str(relay_result.get("detail", "") or "Delivered privately")
),
msg_id=str(relay_result.get("msg_id", "") or msg_id),
replicate=replicate_info,
)
@@ -600,8 +633,15 @@ def attempt_private_release(
policy_reason_code=str(decision.reason_code or ""),
)
if normalized_lane == "dm":
dm_payload = dict(payload or {})
try:
from services.mesh.mesh_dm_connect_delivery import enrich_connect_release_payload
dm_payload = enrich_connect_release_payload(dm_payload)
except Exception:
pass
return _dispatch_dm(
dict(payload or {}),
dm_payload,
secure_dm_enabled=secure_dm_enabled or _secure_dm_enabled,
rns_private_dm_ready=rns_private_dm_ready or _rns_private_dm_ready,
anonymous_dm_hidden_transport_enforced=(
+174
@@ -2,6 +2,9 @@
from __future__ import annotations
import base64
import binascii
import math
from dataclasses import dataclass
from typing import Any, Callable
@@ -33,6 +36,88 @@ def _require_fields(payload: dict[str, Any], fields: tuple[str, ...]) -> tuple[b
return True, "ok"
_SEALED_CIPHERTEXT_PREFIXES = ("x3dh1:", "dm1:", "mls1:", "sealed:")
def _strip_sealed_ciphertext_prefix(value: str) -> str:
lowered = value.lower()
for prefix in _SEALED_CIPHERTEXT_PREFIXES:
if lowered.startswith(prefix):
return value[len(prefix) :]
return value
def _sealed_ciphertext_has_known_prefix(value: str) -> bool:
lowered = str(value or "").strip().lower()
return any(lowered.startswith(prefix) for prefix in _SEALED_CIPHERTEXT_PREFIXES)
def _decode_base64ish(value: Any) -> bytes | None:
raw = str(value or "").strip()
if not raw or any(ch.isspace() for ch in raw):
return None
padded = raw + ("=" * (-len(raw) % 4))
for altchars in (None, b"-_"):
try:
return base64.b64decode(padded.encode("ascii"), altchars=altchars, validate=True)
except (binascii.Error, UnicodeEncodeError, ValueError):
continue
return None
def _decode_sealed_ciphertext_value(value: Any) -> bytes | None:
raw = str(value or "").strip()
if not raw:
return None
return _decode_base64ish(_strip_sealed_ciphertext_prefix(raw))
def _byte_entropy(data: bytes) -> float:
if not data:
return 0.0
counts = [0] * 256
for byte in data:
counts[byte] += 1
total = float(len(data))
return -sum((count / total) * math.log2(count / total) for count in counts if count)
def _validate_sealed_bytes_field(
payload: dict[str, Any],
field: str,
*,
min_bytes: int = 8,
entropy_floor: float = 2.5,
) -> tuple[bool, str]:
raw = str(payload.get(field, "") or "").strip()
prefixed = _sealed_ciphertext_has_known_prefix(raw)
data = _decode_sealed_ciphertext_value(raw)
if data is None:
return False, f"{field} must be base64-encoded sealed bytes"
if len(data) < min_bytes:
return False, f"{field} is too short"
# X3DH / MLS envelopes are structured JSON or ratchet frames — skip
# plaintext heuristics once a known wire prefix is present.
if prefixed:
return True, "ok"
# Short test vectors and compact envelopes can be low entropy; only apply
# heuristics once there is enough material to distinguish a sealed blob
# from accidental base64-encoded plaintext.
if len(data) >= 32:
printable = sum(1 for byte in data if 32 <= byte <= 126 or byte in (9, 10, 13))
if printable / len(data) > 0.9:
try:
data.decode("utf-8")
return False, f"{field} looks like encoded plaintext"
except UnicodeDecodeError:
pass
if _byte_entropy(data) < entropy_floor:
return False, f"{field} entropy is too low for sealed bytes"
return True, "ok"
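The plaintext heuristics above can be exercised with a small standalone sketch. It assumes standard base64 input only (no URL-safe alphabet, no wire prefixes) and uses the hypothetical names `byte_entropy` and `check_sealed`. The thresholds are copied from the defaults above, and `bytes(range(64))` merely stands in for ciphertext.

```python
import base64
import binascii
import math


def byte_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte."""
    if not data:
        return 0.0
    counts = [0] * 256
    for b in data:
        counts[b] += 1
    total = float(len(data))
    return -sum((c / total) * math.log2(c / total) for c in counts if c)


def check_sealed(b64_value: str, *, min_bytes: int = 8, entropy_floor: float = 2.5) -> tuple[bool, str]:
    try:
        data = base64.b64decode(b64_value.encode("ascii"), validate=True)
    except (binascii.Error, UnicodeEncodeError, ValueError):
        return False, "not base64"
    if len(data) < min_bytes:
        return False, "too short"
    if len(data) >= 32:  # heuristics only apply once there is enough material
        printable = sum(1 for b in data if 32 <= b <= 126 or b in (9, 10, 13))
        if printable / len(data) > 0.9:
            try:
                data.decode("utf-8")
                return False, "looks like encoded plaintext"
            except UnicodeDecodeError:
                pass
        if byte_entropy(data) < entropy_floor:
            return False, "entropy too low"
    return True, "ok"


plaintext = base64.b64encode(b"meet me at the usual place at nine tonight").decode()
zeros = base64.b64encode(b"\x00" * 40).decode()
varied = base64.b64encode(bytes(range(64))).decode()  # stand-in for ciphertext

plaintext_result = check_sealed(plaintext)
zeros_result = check_sealed(zeros)
varied_result = check_sealed(varied)
```

The entropy floor of 2.5 bits per byte is deliberately loose: it catches padding-like or constant blobs, while the printable-ratio check is what catches ordinary text.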
def _validate_message(payload: dict[str, Any]) -> tuple[bool, str]:
ok, reason = _require_fields(
payload, ("message", "destination", "channel", "priority", "ephemeral")
@@ -331,6 +416,7 @@ ACTIVE_PUBLIC_LEDGER_EVENT_TYPES: frozenset[str] = frozenset(
LEGACY_PUBLIC_LEDGER_EVENT_TYPES: frozenset[str] = frozenset(
{
"gate_message",
"dm_message",
}
)
"""Event types that exist historically on the public chain and must remain
@@ -425,6 +511,8 @@ def validate_event_payload(event_type: str, payload: dict[str, Any]) -> tuple[bo
def validate_public_ledger_payload(event_type: str, payload: dict[str, Any]) -> tuple[bool, str]:
if event_type == "gate_message":
return validate_private_gate_ledger_payload(payload)
if event_type not in PUBLIC_LEDGER_EVENT_TYPES and event_type not in _EXTENSION_VALIDATORS:
return False, f"{event_type} is not allowed on the public ledger"
forbidden = sorted(
@@ -441,6 +529,92 @@ def validate_public_ledger_payload(event_type: str, payload: dict[str, Any]) ->
return True, "ok"
_PRIVATE_GATE_LEDGER_ALLOWED_FIELDS: frozenset[str] = frozenset(
{
"gate",
"ciphertext",
"nonce",
"sender_ref",
"format",
"epoch",
"gate_envelope",
"envelope_hash",
"reply_to",
"transport_lock",
"signed_context",
}
)
def validate_private_gate_ledger_payload(payload: dict[str, Any]) -> tuple[bool, str]:
"""Validate ciphertext-only gate events for private Infonet replication."""
ok, reason = validate_event_payload("gate_message", payload)
if not ok:
return ok, reason
unexpected = sorted(
key
for key in payload.keys()
if str(key or "").strip().lower() not in _PRIVATE_GATE_LEDGER_ALLOWED_FIELDS
)
if unexpected:
return False, f"private gate ledger payload contains unsupported fields: {', '.join(unexpected)}"
if "message" in payload or "_local_plaintext" in payload or "_local_reply_to" in payload:
return False, "private gate ledger payload must not contain plaintext"
transport_lock = str(payload.get("transport_lock", "") or "").strip().lower()
if transport_lock and transport_lock not in {"private", "private_strong", "rns", "onion"}:
return False, "gate messages require private transport_lock"
ok, reason = _validate_sealed_bytes_field(payload, "ciphertext")
if not ok:
return ok, reason
ok, reason = _validate_sealed_bytes_field(payload, "nonce")
if not ok:
return ok, reason
return True, "ok"
_PRIVATE_DM_LEDGER_ALLOWED_FIELDS: frozenset[str] = frozenset(
{
"recipient_id",
"delivery_class",
"recipient_token",
"ciphertext",
"msg_id",
"timestamp",
"format",
"session_welcome",
"sender_seal",
"relay_salt",
"transport_lock",
"signed_context",
}
)
def validate_private_dm_ledger_payload(payload: dict[str, Any]) -> tuple[bool, str]:
"""Validate ciphertext-only DM dead-drop events for private Infonet replication."""
ok, reason = validate_event_payload("dm_message", payload)
if not ok:
return ok, reason
unexpected = sorted(
key
for key in payload.keys()
if str(key or "").strip().lower() not in _PRIVATE_DM_LEDGER_ALLOWED_FIELDS
)
if unexpected:
return False, f"private DM ledger payload contains unsupported fields: {', '.join(unexpected)}"
if "message" in payload or "plaintext" in payload or "_local_plaintext" in payload:
return False, "private DM ledger payload must not contain plaintext"
transport_lock = str(payload.get("transport_lock", "") or "").strip().lower()
if transport_lock != "private_strong":
return False, "DM hashchain spool requires private_strong transport_lock"
if not str(payload.get("ciphertext", "") or "").strip():
return False, "ciphertext cannot be empty"
ok, reason = _validate_sealed_bytes_field(payload, "ciphertext")
if not ok:
return ok, reason
return True, "ok"
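The field checks in `validate_private_dm_ledger_payload` follow an allowlist-first pattern. A minimal sketch, assuming a shortened allowlist for illustration and the hypothetical name `check_dm_ledger_fields`; it skips the `validate_event_payload` and sealed-bytes steps.

```python
# Shortened allowlist for illustration; the real set has more fields.
ALLOWED_FIELDS = frozenset(
    {"recipient_id", "ciphertext", "msg_id", "timestamp", "transport_lock"}
)
PLAINTEXT_KEYS = ("message", "plaintext", "_local_plaintext")


def check_dm_ledger_fields(payload: dict) -> tuple[bool, str]:
    unexpected = sorted(
        key for key in payload if str(key or "").strip().lower() not in ALLOWED_FIELDS
    )
    if unexpected:
        return False, "unsupported fields: " + ", ".join(unexpected)
    # Second line of defence in case the allowlist is ever widened.
    if any(key in payload for key in PLAINTEXT_KEYS):
        return False, "must not contain plaintext"
    lock = str(payload.get("transport_lock", "") or "").strip().lower()
    if lock != "private_strong":
        return False, "requires private_strong transport_lock"
    return True, "ok"


good = {"recipient_id": "r1", "ciphertext": "AAAA", "transport_lock": "private_strong"}
leaky = dict(good, message="hello")
unlocked = {"recipient_id": "r1", "ciphertext": "AAAA"}

good_result = check_dm_ledger_fields(good)
leaky_result = check_dm_ledger_fields(leaky)
unlocked_result = check_dm_ledger_fields(unlocked)
```

Because none of the plaintext keys are on the allowlist, a payload carrying `message` is reported as an unsupported field before the explicit plaintext check is reached. The same ordering appears to hold in the validator above.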
def validate_protocol_fields(protocol_version: str, network_id: str) -> tuple[bool, str]:
if protocol_version != PROTOCOL_VERSION:
return False, "Unsupported protocol_version"
+26 -3
@@ -38,6 +38,11 @@ _REVOCATION_TTL_CACHE: dict[str, dict[str, Any]] = {}
_REVOCATION_TTL_LOCK = threading.Lock()
_REVOCATION_REFRESH_LOCK = threading.Lock()
_REVOCATION_REFRESH_FAIL_FAST_WINDOW_S = 5.0
def _request_scope_path(request: Request) -> str:
scope = getattr(request, "scope", {}) or {}
return str(scope.get("path") or "")
_REVOCATION_REFRESH_RETRY_AFTER_S = 5
_REVOCATION_PRECHECK_UNAVAILABLE_DETAIL = "Signed event integrity preflight unavailable"
@@ -166,7 +171,7 @@ def _canonical_signed_write_retry_payload(
signed_context = build_signed_context(
event_type=prepared.event_type,
kind=prepared.kind.value,
endpoint=str(request.url.path or ""),
endpoint=_request_scope_path(request),
lane_floor=_content_private_required_transport_tier(prepared.kind),
sequence_domain=_signed_context_sequence_domain(prepared),
node_id=prepared.node_id,
@@ -458,8 +463,26 @@ def _apply_content_private_transport_lock_policy(prepared: "PreparedSignedWrite"
except Exception:
current_tier = "public_degraded"
lock_to_satisfy = normalized
if prepared.kind in {
SignedWriteKind.DM_POLL,
SignedWriteKind.DM_COUNT,
SignedWriteKind.DM_SEND,
SignedWriteKind.DM_REGISTER,
SignedWriteKind.DM_BLOCK,
SignedWriteKind.DM_WITNESS,
}:
from services.mesh.mesh_privacy_policy import release_lane_required_tier
lane_cap = release_lane_required_tier("dm")
# Clients sign private_strong; Tor-only nodes cap DM at
# private_transitional. Accept when live transport meets the
# strongest tier this node can offer on the DM lane.
if not transport_tier_is_sufficient(lane_cap, normalized):
lock_to_satisfy = lane_cap
if (
not transport_tier_is_sufficient(current_tier, normalized)
not transport_tier_is_sufficient(current_tier, lock_to_satisfy)
and prepared.kind not in _QUEUEABLE_CONTENT_PRIVATE_KINDS
):
metrics_inc("signed_write_transport_lock_tier_mismatch")
@@ -540,7 +563,7 @@ def _apply_signed_context_policy(prepared: "PreparedSignedWrite", request: Reque
ok, reason = validate_signed_context(
event_type=prepared.event_type,
kind=prepared.kind.value,
endpoint=str(request.url.path or ""),
endpoint=_request_scope_path(request),
lane_floor=_content_private_required_transport_tier(prepared.kind),
sequence_domain=_signed_context_sequence_domain(prepared),
node_id=prepared.node_id,
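The `lock_to_satisfy` adjustment in the transport-lock policy can be reduced to a few lines. This is a sketch under assumptions: the tier ordering is assumed (it is not shown in this diff), and `effective_lock` and `accepts` are hypothetical names rather than functions from the codebase.

```python
# Assumed ordering; not taken from the diff.
TIER_ORDER = {"public_degraded": 0, "private_transitional": 1, "private_strong": 2}


def sufficient(current: str, required: str) -> bool:
    return TIER_ORDER[current] >= TIER_ORDER[required]


def effective_lock(signed_lock: str, lane_cap: str) -> str:
    """Pick the tier the live transport must actually meet."""
    # If the node's DM lane cap is weaker than what the client signed,
    # the cap is the strongest tier this node can offer, so satisfy that.
    if not sufficient(lane_cap, signed_lock):
        return lane_cap
    return signed_lock


def accepts(current_tier: str, signed_lock: str, lane_cap: str) -> bool:
    return sufficient(current_tier, effective_lock(signed_lock, lane_cap))


# Tor-only node: client signed private_strong, lane is capped at transitional.
tor_only_ok = accepts("private_transitional", "private_strong", "private_transitional")
# Same node with degraded transport still fails the capped requirement.
degraded_ok = accepts("public_degraded", "private_strong", "private_transitional")
# Node with RNS enabled keeps the full private_strong requirement.
rns_node_ok = accepts("private_transitional", "private_strong", "private_strong")
```

The cap only ever lowers the requirement to what the node can offer; it never raises a weaker signed lock.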
+507
@@ -0,0 +1,507 @@
"""Private Infonet swarm discovery and immediate ledger propagation."""
from __future__ import annotations
import json
import logging
import threading
import time
from typing import Any
from services.config import get_settings
from services.mesh.mesh_bootstrap_manifest import (
BootstrapManifest,
BootstrapManifestError,
BootstrapPeer,
build_bootstrap_manifest_payload,
load_bootstrap_manifest,
parse_bootstrap_manifest_dict,
sign_bootstrap_manifest_payload,
write_signed_bootstrap_manifest,
)
from services.mesh.mesh_crypto import normalize_peer_url, resolve_peer_key_for_url
from services.mesh.mesh_peer_registry import DEFAULT_PEER_REGISTRY_PATH, PeerRegistry, RegistryPeer
from services.mesh.mesh_peer_store import (
DEFAULT_PEER_STORE_PATH,
PeerStore,
make_push_peer_record,
make_sync_peer_record,
)
from services.mesh.mesh_router import parse_configured_relay_peers, peer_transport_kind
logger = logging.getLogger(__name__)
_SWARM_LOCK = threading.Lock()
_LAST_MANIFEST_PULL_AT = 0.0
_LAST_ANNOUNCE_AT = 0.0
def peer_registry_enabled() -> bool:
settings = get_settings()
if bool(getattr(settings, "MESH_PEER_REGISTRY_DISABLED", False)):
return False
if str(getattr(settings, "MESH_BOOTSTRAP_SIGNER_PRIVATE_KEY", "") or "").strip():
return True
return bool(getattr(settings, "MESH_PEER_REGISTRY_ENABLED", False))
def _manifest_path() -> str:
return str(getattr(get_settings(), "MESH_BOOTSTRAP_MANIFEST_PATH", "") or "data/bootstrap_peers.json")
def _signer_public_key_b64() -> str:
from services.mesh.mesh_fleet_defaults import effective_bootstrap_signer_public_key_b64
return effective_bootstrap_signer_public_key_b64()
def _signer_private_key_b64() -> str:
settings = get_settings()
return str(getattr(settings, "MESH_BOOTSTRAP_SIGNER_PRIVATE_KEY", "") or "").strip()
def _signer_id() -> str:
configured = str(getattr(get_settings(), "MESH_BOOTSTRAP_SIGNER_ID", "") or "").strip()
return configured or "shadowbroker-seed"
def _private_transport_required() -> bool:
return not bool(getattr(get_settings(), "MESH_INFONET_ALLOW_CLEARNET_SYNC", False))
def _configured_seed_peer_urls() -> list[str]:
from services.mesh.mesh_fleet_defaults import configured_bootstrap_seed_peers_with_fleet_default
settings = get_settings()
primary = str(getattr(settings, "MESH_BOOTSTRAP_SEED_PEERS", "") or "").strip()
legacy = str(getattr(settings, "MESH_DEFAULT_SYNC_PEERS", "") or "").strip()
return configured_bootstrap_seed_peers_with_fleet_default(
parse_configured_relay_peers(primary or legacy)
)
def _seed_manifest_peers() -> list[dict[str, str]]:
peers: list[dict[str, str]] = []
for peer_url in _configured_seed_peer_urls():
transport = str(peer_transport_kind(peer_url) or "")
if _private_transport_required() and transport != "onion":
continue
peers.append(
{
"peer_url": peer_url,
"transport": transport,
"role": "seed",
"label": "ShadowBroker bootstrap seed",
}
)
return peers
def publish_registry_manifest(*, now: float | None = None, persist: bool = True) -> BootstrapManifest:
private_key = _signer_private_key_b64()
public_key = _signer_public_key_b64()
if not private_key or not public_key:
raise BootstrapManifestError("bootstrap signer keys are required to publish swarm manifest")
timestamp = int(now if now is not None else time.time())
registry = PeerRegistry(DEFAULT_PEER_REGISTRY_PATH)
try:
registry.load()
except Exception:
registry = PeerRegistry(DEFAULT_PEER_REGISTRY_PATH)
stale_s = int(getattr(get_settings(), "MESH_PEER_REGISTRY_STALE_S", 0) or 7 * 86400)
if stale_s > 0:
registry.prune_stale(max_age_s=stale_s, now=timestamp)
peers = _seed_manifest_peers() + registry.manifest_peers()
ttl_s = int(getattr(get_settings(), "MESH_SWARM_MANIFEST_TTL_S", 0) or 4 * 3600)
payload = build_bootstrap_manifest_payload(
signer_id=_signer_id(),
peers=peers,
issued_at=timestamp,
valid_until=timestamp + max(300, ttl_s),
)
signature = sign_bootstrap_manifest_payload(payload, signer_private_key_b64=private_key)
manifest = BootstrapManifest(
version=int(payload["version"]),
issued_at=int(payload["issued_at"]),
valid_until=int(payload["valid_until"]),
signer_id=str(payload["signer_id"]),
peers=tuple(BootstrapPeer(**dict(peer)) for peer in peers),
signature=signature,
)
if persist:
registry.save()
write_signed_bootstrap_manifest(
_manifest_path(),
signer_id=manifest.signer_id,
signer_private_key_b64=private_key,
peers=[peer.to_dict() for peer in manifest.peers],
issued_at=manifest.issued_at,
valid_until=manifest.valid_until,
)
return manifest
def load_live_bootstrap_manifest(*, now: float | None = None) -> BootstrapManifest | None:
public_key = _signer_public_key_b64()
if not public_key:
return None
if peer_registry_enabled():
try:
return publish_registry_manifest(now=now, persist=False)
except BootstrapManifestError:
logger.warning("live registry manifest unavailable", exc_info=True)
try:
return load_bootstrap_manifest(_manifest_path(), signer_public_key_b64=public_key, now=now)
except BootstrapManifestError:
return None
def _upsert_swarm_peer_into_store(
*,
peer_url: str,
transport: str,
role: str,
label: str = "",
signer_id: str = "",
now: float | None = None,
) -> None:
timestamp = int(now if now is not None else time.time())
if _private_transport_required() and transport != "onion":
return
store = PeerStore(DEFAULT_PEER_STORE_PATH)
try:
store.load()
except Exception:
store = PeerStore(DEFAULT_PEER_STORE_PATH)
store.upsert(
make_sync_peer_record(
peer_url=peer_url,
transport=transport,
role=role,
source="swarm",
label=label,
signer_id=signer_id,
now=timestamp,
)
)
store.upsert(
make_push_peer_record(
peer_url=peer_url,
transport=transport,
role=role if role != "seed" else "relay",
source="swarm",
label=label,
now=timestamp,
)
)
store.save()
def record_peer_announcement(body: dict[str, Any], *, now: float | None = None) -> RegistryPeer:
if not peer_registry_enabled():
raise ValueError("peer registry is not enabled on this node")
registry = PeerRegistry(DEFAULT_PEER_REGISTRY_PATH)
try:
registry.load()
except Exception:
registry = PeerRegistry(DEFAULT_PEER_REGISTRY_PATH)
peer = registry.upsert_announcement(
peer_url=str(body.get("peer_url", "") or ""),
transport=str(body.get("transport", "") or ""),
role=str(body.get("role", "participant") or "participant"),
node_id=str(body.get("node_id", "") or ""),
label=str(body.get("label", "") or ""),
now=now,
)
registry.save()
_upsert_swarm_peer_into_store(
peer_url=peer.peer_url,
transport=peer.transport,
role=peer.role,
label=peer.label,
signer_id=_signer_id(),
now=now,
)
try:
publish_registry_manifest(now=now, persist=True)
except Exception:
logger.warning("failed to republish swarm manifest after announce", exc_info=True)
return peer
def merge_manifest_into_peer_store(manifest: BootstrapManifest, *, now: float | None = None) -> int:
timestamp = int(now if now is not None else time.time())
merged = 0
for peer in manifest.peers:
if _private_transport_required() and peer.transport != "onion":
continue
_upsert_swarm_peer_into_store(
peer_url=peer.peer_url,
transport=peer.transport,
role=peer.role,
label=peer.label,
signer_id=manifest.signer_id,
now=timestamp,
)
merged += 1
return merged
def fetch_remote_bootstrap_manifest(seed_peer_url: str, *, now: float | None = None) -> BootstrapManifest | None:
import requests
public_key = _signer_public_key_b64()
if not public_key:
return None
normalized = normalize_peer_url(seed_peer_url)
if not normalized:
return None
from main import _infonet_peer_requests_proxies
proxies = _infonet_peer_requests_proxies(normalized)
timeout = int(getattr(get_settings(), "MESH_SYNC_TIMEOUT_S", 0) or 45)
request_kwargs: dict[str, Any] = {"timeout": timeout}
if proxies:
request_kwargs["proxies"] = proxies
try:
response = requests.get(f"{normalized}/api/mesh/infonet/bootstrap-manifest", **request_kwargs)
except Exception as exc:
logger.debug("swarm manifest fetch failed for %s: %s", normalized, exc)
return None
if response.status_code != 200:
return None
try:
raw = response.json()
except Exception:
return None
if not isinstance(raw, dict) or raw.get("ok") is False:
return None
manifest_body = dict(raw.get("manifest") or raw)
try:
return parse_bootstrap_manifest_dict(
manifest_body,
signer_public_key_b64=public_key,
now=now,
)
except BootstrapManifestError:
return None
def refresh_swarm_manifest_from_seeds(*, now: float | None = None, force: bool = False) -> dict[str, Any]:
global _LAST_MANIFEST_PULL_AT
interval_s = int(getattr(get_settings(), "MESH_SWARM_MANIFEST_PULL_INTERVAL_S", 0) or 300)
timestamp = float(now if now is not None else time.time())
with _SWARM_LOCK:
if not force and _LAST_MANIFEST_PULL_AT and timestamp - _LAST_MANIFEST_PULL_AT < max(30, interval_s):
return {"ok": True, "skipped": True, "reason": "manifest_pull_interval"}
_LAST_MANIFEST_PULL_AT = timestamp
if not _signer_public_key_b64():
return {"ok": False, "detail": "MESH_BOOTSTRAP_SIGNER_PUBLIC_KEY is not configured"}
last_error = "manifest fetch failed"
for seed_url in _configured_seed_peer_urls():
manifest = fetch_remote_bootstrap_manifest(seed_url, now=timestamp)
if manifest is None:
continue
try:
merged = merge_manifest_into_peer_store(manifest, now=timestamp)
return {
"ok": True,
"seed_peer_url": seed_url,
"peer_count": len(manifest.peers),
"merged_peer_count": merged,
}
except Exception as exc:
last_error = str(exc or type(exc).__name__)
return {"ok": False, "detail": last_error}
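The interval throttle used by `refresh_swarm_manifest_from_seeds` (a module-level timestamp guarded by `_SWARM_LOCK`) can be expressed as a small reusable gate. A minimal sketch, assuming a hypothetical `IntervalGate` class; the 30-second floor matches the `max(30, interval_s)` above.

```python
import threading


class IntervalGate:
    """Allow an action at most once per interval unless forced."""

    def __init__(self, min_interval_s: float, floor_s: float = 30.0) -> None:
        self._lock = threading.Lock()
        self._last = 0.0
        self._interval = max(floor_s, float(min_interval_s))

    def try_acquire(self, now: float, *, force: bool = False) -> bool:
        with self._lock:
            if not force and self._last and now - self._last < self._interval:
                return False
            # The timestamp is recorded before the work runs, so a failed
            # attempt is throttled just like a successful one.
            self._last = now
            return True


gate = IntervalGate(10)  # raised to the 30 s floor
first = gate.try_acquire(1000.0)
too_soon = gate.try_acquire(1020.0)
forced = gate.try_acquire(1020.0, force=True)
still_soon = gate.try_acquire(1049.0)
after_interval = gate.try_acquire(1050.0)
```

As in the function above, a forced call also resets the timer, which is why the check at 1049.0 is still rejected.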
def announce_local_peer_to_seeds(*, now: float | None = None, force: bool = False) -> dict[str, Any]:
global _LAST_ANNOUNCE_AT
import hashlib as _hashlib_mod
import hmac as _hmac_mod
import requests
from main import _infonet_peer_requests_proxies, _local_infonet_peer_url, _participant_node_enabled
if not _participant_node_enabled():
return {"ok": False, "detail": "participant node disabled"}
peer_url = _local_infonet_peer_url()
if not peer_url:
return {"ok": False, "detail": "local peer URL is not ready"}
peer_key = resolve_peer_key_for_url(peer_url)
if not peer_key:
return {"ok": False, "detail": "peer HMAC secret is not configured"}
timestamp = float(now if now is not None else time.time())
with _SWARM_LOCK:
if not force and _LAST_ANNOUNCE_AT and timestamp - _LAST_ANNOUNCE_AT < 300:
return {"ok": True, "skipped": True, "reason": "announce_interval"}
_LAST_ANNOUNCE_AT = timestamp
transport = str(peer_transport_kind(peer_url) or "onion")
body = {
"peer_url": peer_url,
"transport": transport,
"role": "participant",
"node_id": "",
"label": "",
"ts": int(timestamp),
}
body_bytes = json.dumps(body, sort_keys=True, separators=(",", ":"), ensure_ascii=False).encode("utf-8")
hmac_hex = _hmac_mod.new(peer_key, body_bytes, _hashlib_mod.sha256).hexdigest()
timeout = int(getattr(get_settings(), "MESH_RELAY_PUSH_TIMEOUT_S", 0) or 45)
results: list[dict[str, Any]] = []
for seed_url in _configured_seed_peer_urls():
normalized = normalize_peer_url(seed_url)
if not normalized:
continue
proxies = _infonet_peer_requests_proxies(normalized)
request_kwargs: dict[str, Any] = {
"data": body_bytes,
"headers": {
"Content-Type": "application/json",
"X-Peer-Url": peer_url,
"X-Peer-HMAC": hmac_hex,
},
"timeout": timeout,
}
if proxies:
request_kwargs["proxies"] = proxies
try:
response = requests.post(
f"{normalized}/api/mesh/infonet/peer-announce",
**request_kwargs,
)
results.append(
{
"seed_peer_url": normalized,
"status_code": int(response.status_code),
"ok": response.status_code == 200,
}
)
except Exception as exc:
results.append({"seed_peer_url": normalized, "ok": False, "detail": str(exc)})
ok = any(bool(item.get("ok")) for item in results)
return {"ok": ok, "peer_url": peer_url, "results": results}
def _announce_succeeded(announce: dict[str, Any]) -> bool:
if not bool(announce.get("ok")):
return False
results = announce.get("results") or []
return any(bool(item.get("ok")) and int(item.get("status_code") or 0) == 200 for item in results)
def _manifest_succeeded(manifest: dict[str, Any]) -> bool:
if not bool(manifest.get("ok")):
return False
peer_count = int(manifest.get("merged_peer_count") or manifest.get("peer_count") or 0)
return peer_count >= 1
def join_swarm_with_retries(
*,
attempts: int = 6,
delay_s: float = 15.0,
force: bool = True,
) -> dict[str, Any]:
"""Announce to seed and pull manifest, retrying while Tor circuits warm up."""
last_announce: dict[str, Any] = {"ok": False, "detail": "not attempted"}
last_manifest: dict[str, Any] = {"ok": False, "detail": "not attempted"}
tries = max(1, int(attempts))
pause_s = max(1.0, float(delay_s))
for attempt in range(tries):
last_announce = announce_local_peer_to_seeds(force=force)
last_manifest = refresh_swarm_manifest_from_seeds(force=force)
if _announce_succeeded(last_announce) and _manifest_succeeded(last_manifest):
return {
"ok": True,
"attempts": attempt + 1,
"announce": last_announce,
"manifest_pull": last_manifest,
}
if attempt + 1 < tries:
time.sleep(pause_s)
return {
"ok": False,
"attempts": tries,
"announce": last_announce,
"manifest_pull": last_manifest,
"detail": "swarm join incomplete after retries",
}
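The retry loop in `join_swarm_with_retries` can be tried out without a network by injecting the two steps and the sleep function. A minimal sketch, assuming hypothetical `announce` and `pull` callables that return dicts with an `ok` key; the success criteria are simplified compared with `_announce_succeeded` and `_manifest_succeeded`.

```python
import time
from typing import Any, Callable

Step = Callable[[], dict[str, Any]]


def join_with_retries(
    announce: Step,
    pull: Step,
    *,
    attempts: int = 6,
    delay_s: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    tries = max(1, int(attempts))
    pause_s = max(1.0, float(delay_s))  # never spin faster than once a second
    for attempt in range(tries):
        last_announce = announce()
        last_pull = pull()
        if last_announce.get("ok") and last_pull.get("ok"):
            return {"ok": True, "attempts": attempt + 1}
        if attempt + 1 < tries:  # no sleep after the final attempt
            sleep(pause_s)
    return {"ok": False, "attempts": tries}


# Simulated seed that only accepts the third announce.
outcomes = iter([False, False, True])
sleeps: list[float] = []
result = join_with_retries(
    lambda: {"ok": next(outcomes)},
    lambda: {"ok": True},
    attempts=5,
    delay_s=0.1,
    sleep=sleeps.append,
)
```

Passing `sleep` as a parameter is a testing convenience in this sketch; the function above calls `time.sleep` directly.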
def push_infonet_events_to_http_peers(events: list[dict[str, Any]]) -> dict[str, Any]:
import hashlib as _hashlib_mod
import hmac as _hmac_mod
import requests
from main import (
_filter_infonet_peer_urls,
_infonet_peer_requests_proxies,
_local_infonet_peer_url,
_participant_node_enabled,
_record_public_push_result,
)
from services.mesh.mesh_router import authenticated_push_peer_urls
if not _participant_node_enabled() or not events:
return {"ok": False, "detail": "nothing to push"}
peers = _filter_infonet_peer_urls(authenticated_push_peer_urls())
if not peers:
return {"ok": False, "detail": "no push peers configured"}
sender_url = _local_infonet_peer_url()
peer_key = resolve_peer_key_for_url(sender_url)
if not peer_key:
return {"ok": False, "detail": "peer HMAC secret is not configured"}
body_bytes = json.dumps(
{"events": events},
sort_keys=True,
separators=(",", ":"),
ensure_ascii=False,
).encode("utf-8")
hmac_hex = _hmac_mod.new(peer_key, body_bytes, _hashlib_mod.sha256).hexdigest()
timeout = int(getattr(get_settings(), "MESH_RELAY_PUSH_TIMEOUT_S", 0) or 45)
results: list[dict[str, Any]] = []
for peer_url in peers:
normalized = normalize_peer_url(peer_url)
if not normalized:
continue
proxies = _infonet_peer_requests_proxies(normalized)
request_kwargs: dict[str, Any] = {
"data": body_bytes,
"headers": {
"Content-Type": "application/json",
"X-Peer-Url": sender_url,
"X-Peer-HMAC": hmac_hex,
},
"timeout": timeout,
}
if proxies:
request_kwargs["proxies"] = proxies
try:
response = requests.post(f"{normalized}/api/mesh/infonet/peer-push", **request_kwargs)
results.append(
{
"peer_url": normalized,
"ok": response.status_code == 200,
"status_code": int(response.status_code),
}
)
except Exception as exc:
results.append({"peer_url": normalized, "ok": False, "detail": str(exc)})
ok = any(bool(item.get("ok")) for item in results)
event_id = str((events[-1] or {}).get("event_id", "") or "")
_record_public_push_result(
event_id,
ok=ok,
error="" if ok else "immediate peer push failed",
results=results,
)
return {"ok": ok, "results": results}
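Both `announce_local_peer_to_seeds` and `push_infonet_events_to_http_peers` sign a canonical JSON body with HMAC-SHA256 and send the hex digest in `X-Peer-HMAC`. The sketch below shows that signing step together with one plausible receiver-side check. The verification side is an assumption, since the receiving endpoints are not part of this diff, and `canonical_body`, `sign_body`, and `verify_body` are hypothetical names.

```python
import hashlib
import hmac
import json
from typing import Any


def canonical_body(obj: Any) -> bytes:
    """Serialize with sorted keys and no whitespace so both sides hash the same bytes."""
    return json.dumps(
        obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def sign_body(peer_key: bytes, body: bytes) -> str:
    return hmac.new(peer_key, body, hashlib.sha256).hexdigest()


def verify_body(peer_key: bytes, body: bytes, presented_hex: str) -> bool:
    # Assumed receiver logic: recompute over the raw request bytes and
    # compare in constant time.
    return hmac.compare_digest(sign_body(peer_key, body), presented_hex)


key = b"shared-peer-secret"  # placeholder secret for the example
body = canonical_body({"role": "participant", "peer_url": "http://example.onion"})
signature = sign_body(key, body)

valid = verify_body(key, body, signature)
tampered = verify_body(key, body.replace(b"participant", b"seed"), signature)
wrong_key = verify_body(b"other-secret", body, signature)
```

The receiver should verify the bytes exactly as received rather than re-serializing the parsed JSON, since any difference in key order or spacing would change the digest.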
@@ -929,6 +929,85 @@ def list_wormhole_dm_contacts() -> dict[str, dict[str, Any]]:
return _read_contacts()
def get_wormhole_dm_contact(peer_id: str) -> dict[str, Any] | None:
peer_key = str(peer_id or "").strip()
if not peer_key:
return None
contacts = _read_contacts()
if peer_key not in contacts:
return None
return dict(_normalize_contact(contacts[peer_key]))
def sever_wormhole_dm_contact(peer_id: str, *, block: bool = False) -> dict[str, Any]:
"""Close the shared DM lane; a fresh contact request + accept is required to reopen."""
peer_key = str(peer_id or "").strip()
if not peer_key:
return {"ok": False, "detail": "peer_id required"}
contacts = _read_contacts()
current = _normalize_contact(contacts.get(peer_key))
now = int(time.time())
current["sharedAlias"] = ""
current["sharedAliasCounter"] = 0
current["sharedAliasPublicKey"] = ""
current["sharedAliasPublicKeyAlgo"] = "Ed25519"
current["previousSharedAliases"] = []
current["pendingSharedAlias"] = ""
current["pendingSharedAliasCounter"] = 0
current["pendingSharedAliasPublicKey"] = ""
current["pendingSharedAliasPublicKeyAlgo"] = "Ed25519"
current["pendingSharedAliasGraceMs"] = 0
current["sharedAliasGraceUntil"] = 0
current["sharedAliasRotatedAt"] = 0
current["acceptedPreviousAlias"] = ""
current["acceptedPreviousAliasCounter"] = 0
current["acceptedPreviousAliasPublicKey"] = ""
current["acceptedPreviousAliasPublicKeyAlgo"] = "Ed25519"
current["acceptedPreviousGraceUntil"] = 0
current["acceptedPreviousHardGraceUntil"] = 0
current["acceptedPreviousAwaitingReply"] = False
current["aliasBindingSeq"] = 0
current["aliasBindingPendingReason"] = ""
current["aliasBindingPreparedAt"] = 0
current["aliasGateJoinAppliedSeq"] = 0
if block:
current["blocked"] = True
current["updated_at"] = now
contacts[peer_key] = _normalize_contact(current)
_write_contacts(contacts)
relay_policy = {}
try:
from services.mesh.mesh_dm_connect_delivery import revoke_connect_relay_policy
relay_policy = revoke_connect_relay_policy(peer_key)
except Exception:
relay_policy = {"ok": False}
relay_block = {"ok": False}
if block:
try:
from services.mesh.mesh_dm_relay import dm_relay
from services.mesh.mesh_wormhole_persona import get_dm_identity
local_id = str(get_dm_identity().get("node_id", "") or "").strip()
if local_id:
dm_relay.block(local_id, peer_key)
relay_block = {"ok": True, "local_id": local_id}
except Exception as exc:
relay_block = {"ok": False, "detail": str(exc) or type(exc).__name__}
return {
"ok": True,
"peer_id": peer_key,
"severed": True,
"blocked": bool(block),
"relay_policy": relay_policy,
"relay_block": relay_block,
}
def _promote_invite_lookup_mode(contact: dict[str, Any], *, now: int | None = None) -> bool:
current = dict(contact or {})
lookup_handle = str(current.get("invitePinnedPrekeyLookupHandle", "") or "").strip()
@@ -1070,11 +1149,14 @@ def pin_wormhole_dm_invite(
identity_dh_pub_key = str(payload.get("identity_dh_pub_key", "") or "")
dh_algo = str(payload.get("dh_algo", "X25519") or "X25519")
prekey_lookup_handle = str(payload.get("prekey_lookup_handle", "") or "")
lookup_peer_url = str(payload.get("lookup_peer_url", "") or "").strip().rstrip("/")
if str(alias or "").strip():
current["alias"] = str(alias or "").strip()
current["dhPubKey"] = identity_dh_pub_key
current["dhAlgo"] = dh_algo
current["invitePinnedPrekeyLookupHandle"] = prekey_lookup_handle
if lookup_peer_url:
current["invitePinnedLookupPeerUrl"] = lookup_peer_url
current["invitePinnedRootFingerprint"] = str(payload.get("root_fingerprint", "") or "").strip().lower()
current["invitePinnedRootManifestFingerprint"] = str(
payload.get("root_manifest_fingerprint", "") or ""
@@ -1170,6 +1252,12 @@ def pin_wormhole_dm_invite(
current["updated_at"] = now
contacts[peer_key] = _normalize_contact(current)
_write_contacts(contacts)
try:
from services.mesh.mesh_dm_connect_delivery import grant_connect_relay_policy
grant_connect_relay_policy(peer_key, reason="invite_import")
except Exception:
pass
return contacts[peer_key]
@@ -549,6 +549,27 @@ def invite_identity_commitment_for_identity_material(
return hashlib.sha256(_stable_json(material).encode("utf-8")).hexdigest()
def _local_dm_lookup_peer_url() -> str:
"""Return this node's fleet-reachable URL for invite-scoped prekey lookup."""
try:
from services.config import get_settings
from services.mesh.mesh_crypto import normalize_peer_url
configured = normalize_peer_url(str(getattr(get_settings(), "MESH_PUBLIC_PEER_URL", "") or ""))
if configured:
return configured
from services.tor_hidden_service import tor_service
onion = str(getattr(tor_service, "onion_address", "") or "").strip()
if onion:
if "://" not in onion:
onion = f"http://{onion}:8000"
return normalize_peer_url(onion)
except Exception:
pass
return ""
def _dm_invite_payload(
data: dict[str, Any],
*,
@@ -930,6 +951,9 @@ def export_wormhole_dm_invite(*, label: str = "", expires_in_s: int = 0) -> dict
# fetch our prekey bundle without using our stable agent_id.
lookup_handle = secrets.token_hex(24)
payload["prekey_lookup_handle"] = lookup_handle
lookup_peer_url = _local_dm_lookup_peer_url()
if lookup_peer_url:
payload["lookup_peer_url"] = lookup_peer_url
# Persist the handle so it is included in future prekey registrations.
existing_handles, _ = _normalize_prekey_lookup_handles(
+350 -80
@@ -79,6 +79,164 @@ def _warn_legacy_prekey_lookup(agent_id: str) -> None:
)
def _fleet_peer_lookup_user_agent() -> str:
custom = str(os.environ.get("SHADOWBROKER_MESH_PEER_USER_AGENT") or "").strip()
if custom:
return custom
return "Mozilla/5.0 (compatible; ShadowbrokerMesh/1.0)"
_INVITE_LOOKUP_MAX_ELAPSED_S = 120
_INVITE_LOOKUP_MAX_BOOTSTRAP_PEERS = 3
_INVITE_LOOKUP_MAX_PUSH_PEERS = 16
_INVITE_LOOKUP_PARALLEL_WORKERS = 8
def _invite_lookup_request_timeout(peer_url: str) -> tuple[int, int]:
from services.mesh.mesh_router import peer_transport_kind
if peer_transport_kind(peer_url) == "onion":
return (10, 35)
return (5, 15)
def _bootstrap_seed_peer_urls() -> set[str]:
try:
from services.config import get_settings
from services.mesh.mesh_router import parse_configured_relay_peers
seeds: set[str] = set()
raw = str(getattr(get_settings(), "MESH_BOOTSTRAP_SEED_PEERS", "") or "")
for peer in parse_configured_relay_peers(raw):
normalized = str(peer or "").strip().rstrip("/")
if normalized:
seeds.add(normalized)
return seeds
except Exception:
return set()
def _discovered_push_peer_urls(*, limit: int = _INVITE_LOOKUP_MAX_PUSH_PEERS) -> list[str]:
try:
from services.mesh.mesh_router import authenticated_push_peer_urls
seeds = _bootstrap_seed_peer_urls()
peers: list[str] = []
for peer in authenticated_push_peer_urls():
normalized = str(peer or "").strip().rstrip("/")
if not normalized or normalized in seeds:
continue
peers.append(normalized)
if len(peers) >= max(1, int(limit or 1)):
break
return peers
except Exception:
return []
def _prioritized_invite_lookup_peer_urls(*, preferred: list[str] | None = None) -> list[str]:
preferred_urls = [
str(peer or "").strip().rstrip("/")
for peer in list(preferred or [])
if str(peer or "").strip()
]
configured = _configured_public_lookup_peer_urls()
seeds = _bootstrap_seed_peer_urls()
active: list[str] = []
bootstrap: list[str] = []
push_discovery: list[str] = []
seen = set(preferred_urls)
for peer in configured:
if peer in seen:
continue
seen.add(peer)
if peer in seeds:
bootstrap.append(peer)
else:
active.append(peer)
for peer in _discovered_push_peer_urls():
if peer in seen:
continue
seen.add(peer)
push_discovery.append(peer)
ordered = list(preferred_urls)
ordered.extend(active)
ordered.extend(push_discovery)
ordered.extend(bootstrap[:_INVITE_LOOKUP_MAX_BOOTSTRAP_PEERS])
return ordered
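The ordering built by `_prioritized_invite_lookup_peer_urls` can be read as four tiers: invite-pinned peers, active non-seed peers, push-discovered peers, and finally a capped number of bootstrap seeds. The following is a minimal pure-function sketch of that tiering; `order_lookup_peers` and its parameters are hypothetical stand-ins for the settings and router lookups the real function performs.

```python
def order_lookup_peers(
    preferred: list[str],
    configured: list[str],
    discovered: list[str],
    seeds: set[str],
    max_bootstrap: int = 3,
) -> list[str]:
    """Return peers in lookup order, dropping duplicates of earlier tiers."""
    seen = set(preferred)
    active: list[str] = []
    bootstrap: list[str] = []
    push: list[str] = []
    for peer in configured:
        if peer in seen:
            continue
        seen.add(peer)
        # Seeds are cold caches for invite handles, so they go last.
        (bootstrap if peer in seeds else active).append(peer)
    for peer in discovered:
        if peer in seen:
            continue
        seen.add(peer)
        push.append(peer)
    return [*preferred, *active, *push, *bootstrap[:max_bootstrap]]
```

For example, `order_lookup_peers(["p"], ["s1", "a", "p", "s2"], ["d", "a"], {"s1", "s2"}, max_bootstrap=1)` yields the pinned peer first, then `a`, then `d`, and only one seed.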
def _preferred_invite_lookup_peer_urls(lookup_token: str) -> list[str]:
token = str(lookup_token or "").strip()
if not token:
return []
try:
from services.mesh.mesh_wormhole_contacts import list_wormhole_dm_contacts
except Exception:
return []
peers: list[str] = []
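# list_wormhole_dm_contacts() returns a dict keyed by peer id; iterate the records, not the keys.
for contact in (list_wormhole_dm_contacts() or {}).values():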
if not isinstance(contact, dict):
continue
if str(contact.get("invitePinnedPrekeyLookupHandle", "") or "").strip() != token:
continue
peer_url = str(contact.get("invitePinnedLookupPeerUrl", "") or "").strip().rstrip("/")
if peer_url and peer_url not in peers:
peers.append(peer_url)
return peers
def _peer_http_request(
method: str,
peer_url: str,
*,
body_bytes: bytes | None = None,
headers: dict[str, str] | None = None,
timeout: int | tuple[int, int] = 45,
):
"""HTTP to a fleet peer, using Tor SOCKS when the URL is an onion address."""
import requests
from services.mesh.mesh_crypto import normalize_peer_url
from urllib.parse import urlparse
raw_peer_url = str(peer_url or "").strip()
parsed = urlparse(raw_peer_url)
if parsed.path and parsed.path not in {"", "/"}:
# Full request URLs include invite lookup query params; do not
# normalize them away when deriving the peer base URL.
normalized = raw_peer_url
else:
normalized = normalize_peer_url(raw_peer_url)
if not normalized:
raise OSError("invalid peer url")
if isinstance(timeout, tuple):
connect_timeout, read_timeout = timeout
resolved_timeout: int | tuple[int, int] = (
max(1, int(connect_timeout or 5)),
max(1, int(read_timeout or 15)),
)
else:
resolved_timeout = max(1, int(timeout or 45))
request_kwargs: dict[str, Any] = {
"headers": dict(headers or {}),
"timeout": resolved_timeout,
}
try:
from main import _infonet_peer_requests_proxies
proxy_peer_url = normalize_peer_url(f"{parsed.scheme}://{parsed.netloc}")
proxies = _infonet_peer_requests_proxies(proxy_peer_url)
if proxies:
request_kwargs["proxies"] = proxies
except Exception:
pass
if method.upper() == "GET":
return requests.get(normalized, **request_kwargs)
request_kwargs["data"] = body_bytes or b""
return requests.post(normalized, **request_kwargs)
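`_peer_http_request` accepts either a single timeout or a `(connect, read)` pair and clamps both to at least one second. A small sketch of just that normalization follows; `resolve_timeout` is a hypothetical helper name, and the defaults (5, 15, 45) are the ones visible in the function above.

```python
from typing import Tuple, Union

Timeout = Union[int, Tuple[int, int]]


def resolve_timeout(timeout: Timeout) -> Timeout:
    """Clamp a requests-style timeout; falsy parts fall back to defaults."""
    if isinstance(timeout, tuple):
        connect_timeout, read_timeout = timeout
        return (
            max(1, int(connect_timeout or 5)),
            max(1, int(read_timeout or 15)),
        )
    return max(1, int(timeout or 45))
```

A zero or `None` component is treated as "unset" and replaced by the default, while a negative value is clamped up to one second.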
def _fetch_dm_prekey_bundle_from_peer_lookup(lookup_token: str) -> dict[str, Any]:
"""Fetch an invite-scoped prekey bundle from configured authenticated peers.
@@ -95,12 +253,12 @@ def _fetch_dm_prekey_bundle_from_peer_lookup(lookup_token: str) -> dict[str, Any
normalize_peer_url,
resolve_peer_key_for_url,
)
from services.mesh.mesh_router import configured_relay_peer_urls
from services.mesh.mesh_router import authenticated_push_peer_urls
settings = get_settings()
# Issue #256: secret check moved per-peer below. We still bail out
# cleanly when there are no peers configured at all.
peers = configured_relay_peer_urls()
peers = authenticated_push_peer_urls()
if not peers:
return {"ok": False, "detail": "peer prekey lookup unavailable"}
timeout = max(1, _safe_int(getattr(settings, "MESH_RELAY_PUSH_TIMEOUT_S", 10) or 10, 10))
@@ -132,17 +290,17 @@ def _fetch_dm_prekey_bundle_from_peer_lookup(lookup_token: str) -> dict[str, Any
"X-Peer-Url": sender_peer_url,
"X-Peer-HMAC": hmac.new(peer_key, body, hashlib.sha256).hexdigest(),
}
request = urllib.request.Request(
f"{normalized_peer_url}/api/mesh/dm/prekey-peer-lookup",
data=body,
headers=headers,
method="POST",
)
try:
with urllib.request.urlopen(request, timeout=timeout) as response:
raw = response.read(256 * 1024)
response = _peer_http_request(
"POST",
f"{normalized_peer_url}/api/mesh/dm/prekey-peer-lookup",
body_bytes=body,
headers=headers,
timeout=timeout,
)
raw = response.content[: 256 * 1024]
payload = json.loads(raw.decode("utf-8"))
except (urllib.error.URLError, TimeoutError, json.JSONDecodeError, OSError) as exc:
except Exception as exc:  # also covers json.JSONDecodeError and OSError
last_detail = str(exc) or type(exc).__name__
continue
if isinstance(payload, dict) and payload.get("ok"):
@@ -161,12 +319,18 @@ def _configured_public_lookup_peer_urls() -> list[str]:
settings = get_settings()
candidates: list[str] = []
# Operator-configured peers first, then recently active fleet nodes.
# Invite handles are minted on a specific node; cold bootstrap seeds
# rarely have them cached and should not be tried before contacts.
for raw in (
getattr(settings, "MESH_BOOTSTRAP_SEED_PEERS", ""),
getattr(settings, "MESH_DEFAULT_SYNC_PEERS", ""),
):
candidates.extend(parse_configured_relay_peers(str(raw or "")))
candidates.extend(active_sync_peer_urls())
for raw in (
getattr(settings, "MESH_BOOTSTRAP_SEED_PEERS", ""),
):
candidates.extend(parse_configured_relay_peers(str(raw or "")))
except Exception:
return []
@@ -204,7 +368,50 @@ def _normalize_remote_lookup_bundle(payload: dict[str, Any]) -> dict[str, Any]:
return data
def _fetch_dm_prekey_bundle_from_public_lookup(lookup_token: str) -> dict[str, Any]:
def _try_public_prekey_lookup_peer(
peer_url: str,
encoded: str,
*,
timeout: int | tuple[int, int] | None = None,
) -> dict[str, Any]:
normalized_peer_url = str(peer_url or "").strip().rstrip("/")
if not normalized_peer_url:
return {"ok": False, "detail": "invalid peer url"}
resolved_timeout = timeout or _invite_lookup_request_timeout(normalized_peer_url)
try:
response = _peer_http_request(
"GET",
f"{normalized_peer_url}/api/mesh/dm/prekey-bundle?{encoded}",
headers={
"Accept": "application/json",
"User-Agent": _fleet_peer_lookup_user_agent(),
},
timeout=resolved_timeout,
)
raw = response.content[: 256 * 1024]
payload = json.loads(raw.decode("utf-8"))
except Exception as exc:  # also covers json.JSONDecodeError and OSError
logger.debug("public prekey lookup failed for %s: %s", normalized_peer_url, type(exc).__name__)
return {"ok": False, "detail": "peer prekey lookup unavailable"}
if not isinstance(payload, dict):
return {"ok": False, "detail": "invalid peer response"}
if payload.get("pending") or str(payload.get("status", "") or "") == "preparing_private_lane":
return {"ok": False, "detail": "peer prekey lookup still preparing"}
if not payload.get("ok"):
return {
"ok": False,
"detail": str(payload.get("detail", "") or "Prekey bundle not found"),
}
if not isinstance(payload.get("bundle"), dict):
return {"ok": False, "detail": "Prekey bundle not found"}
return _normalize_remote_lookup_bundle(payload)
def _fetch_dm_prekey_bundle_from_public_lookup(
lookup_token: str,
*,
extra_preferred_peer_urls: list[str] | None = None,
) -> dict[str, Any]:
"""Fetch an invite-scoped prekey bundle from bootstrap/sync peers.
The token is high-entropy and invite-scoped. This path does not expose a
@@ -212,61 +419,69 @@ def _fetch_dm_prekey_bundle_from_public_lookup(lookup_token: str) -> dict[str, A
derive it from the signed identity public key and validate the bundle before
accepting it.
"""
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
token = str(lookup_token or "").strip()
if not token:
return {"ok": False, "detail": "lookup token required"}
peers = _configured_public_lookup_peer_urls()
preferred = list(_preferred_invite_lookup_peer_urls(token))
for peer in list(extra_preferred_peer_urls or []):
normalized = str(peer or "").strip().rstrip("/")
if normalized and normalized not in preferred:
preferred.insert(0, normalized)
peers = _prioritized_invite_lookup_peer_urls(preferred=preferred)
if not peers:
return {"ok": False, "detail": "peer prekey lookup unavailable"}
try:
from services.config import get_settings
timeout = max(1, _safe_int(getattr(get_settings(), "MESH_SYNC_TIMEOUT_S", 5) or 5, 5))
except Exception:
timeout = 5
encoded = urllib.parse.urlencode({"lookup_token": token})
last_detail = ""
for peer_url in peers:
normalized_peer_url = str(peer_url or "").strip().rstrip("/")
if not normalized_peer_url:
continue
# Generic UA: any peer-facing crypto request should not carry a
# fork-specific identifier — that turns prekey lookups into a
# software-fingerprinting beacon.
from services.network_utils import DEFAULT_USER_AGENT
request = urllib.request.Request(
f"{normalized_peer_url}/api/mesh/dm/prekey-bundle?{encoded}",
headers={
"Accept": "application/json",
"User-Agent": DEFAULT_USER_AGENT,
},
method="GET",
hinted_only = bool(list(extra_preferred_peer_urls or []))
hint_timeout = (5, 20)
for peer_url in preferred:
hinted = _try_public_prekey_lookup_peer(
peer_url,
encoded,
timeout=hint_timeout if hinted_only else None,
)
try:
with urllib.request.urlopen(request, timeout=timeout) as response:
raw = response.read(256 * 1024)
payload = json.loads(raw.decode("utf-8"))
except (urllib.error.URLError, TimeoutError, json.JSONDecodeError, OSError) as exc:
logger.debug("public prekey lookup failed for %s: %s", normalized_peer_url, type(exc).__name__)
last_detail = "peer prekey lookup unavailable"
continue
if not isinstance(payload, dict):
last_detail = "invalid peer response"
continue
if payload.get("pending") or str(payload.get("status", "") or "") == "preparing_private_lane":
last_detail = "peer prekey lookup still preparing"
continue
if not payload.get("ok"):
last_detail = str(payload.get("detail", "") or last_detail or "Prekey bundle not found")
continue
if not isinstance(payload.get("bundle"), dict):
last_detail = "Prekey bundle not found"
continue
normalized = _normalize_remote_lookup_bundle(payload)
if normalized.get("ok"):
return normalized
last_detail = str(normalized.get("detail", "") or last_detail)
if hinted.get("ok"):
return hinted
if isinstance(hinted, dict):
last_detail = str(hinted.get("detail", "") or last_detail)
preferred_set = set(preferred)
remaining_peers = [peer for peer in peers if peer not in preferred_set]
if not remaining_peers:
return {"ok": False, "detail": last_detail or "Prekey bundle not found"}
if hinted_only:
return {"ok": False, "detail": last_detail or "Prekey bundle not found"}
deadline = time.time() + _INVITE_LOOKUP_MAX_ELAPSED_S
workers = min(_INVITE_LOOKUP_PARALLEL_WORKERS, max(1, len(remaining_peers)))
with ThreadPoolExecutor(max_workers=workers) as executor:
futures = {
executor.submit(_try_public_prekey_lookup_peer, peer_url, encoded): peer_url
for peer_url in remaining_peers
}
while futures and time.time() < deadline:
done, _ = wait(
futures,
timeout=max(0.1, deadline - time.time()),
return_when=FIRST_COMPLETED,
)
if not done:
break
for future in done:
futures.pop(future, None)
try:
result = future.result()
except Exception as exc:
last_detail = str(exc) or type(exc).__name__
continue
if isinstance(result, dict) and result.get("ok"):
for pending in futures:
pending.cancel()
return result
if isinstance(result, dict):
last_detail = str(result.get("detail", "") or last_detail)
for pending in futures:
pending.cancel()
return {"ok": False, "detail": last_detail or "Prekey bundle not found"}
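The fan-out above returns the first successful lookup under an overall deadline. One detail worth keeping in mind: leaving a `with ThreadPoolExecutor(...)` block calls `shutdown(wait=True)`, and `Future.cancel()` cannot stop a call that is already running, so the caller can still block until in-flight requests hit their own timeouts. The sketch below shows the same first-success pattern with an explicit non-blocking shutdown. `first_ok` and `probe` are hypothetical names, and `cancel_futures` requires Python 3.9 or newer.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from typing import Any, Callable


def first_ok(
    probe: Callable[[str], dict[str, Any]],
    peers: list[str],
    *,
    max_elapsed_s: float = 120.0,
    workers: int = 8,
) -> dict[str, Any]:
    """Run probe(peer) concurrently and return the first result with ok=True."""
    if not peers:
        return {"ok": False, "detail": "no peers"}
    deadline = time.monotonic() + max_elapsed_s
    last_detail = ""
    executor = ThreadPoolExecutor(max_workers=min(workers, len(peers)))
    try:
        pending = {executor.submit(probe, peer) for peer in peers}
        while pending:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break
            done, pending = wait(pending, timeout=remaining, return_when=FIRST_COMPLETED)
            if not done:
                break
            for future in done:
                try:
                    result = future.result()
                except Exception as exc:
                    last_detail = str(exc) or type(exc).__name__
                    continue
                if result.get("ok"):
                    return result
                last_detail = str(result.get("detail", "") or last_detail)
    finally:
        # Do not block on in-flight probes; drop the ones still queued.
        executor.shutdown(wait=False, cancel_futures=True)
    return {"ok": False, "detail": last_detail or "not found"}
```

Probes that are already running still finish in their worker threads, so each probe needs its own request timeout, as `_invite_lookup_request_timeout` provides in the code above.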
@@ -1019,6 +1234,7 @@ def fetch_dm_prekey_bundle(
lookup_token: str = "",
*,
allow_peer_lookup: bool = True,
lookup_peer_urls: list[str] | None = None,
) -> dict[str, Any]:
from services.mesh.mesh_dm_relay import dm_relay
@@ -1043,12 +1259,18 @@ def fetch_dm_prekey_bundle(
resolved_id = found_id
lookup_mode = "invite_lookup_handle"
elif allow_peer_lookup:
peer_found = _fetch_dm_prekey_bundle_from_peer_lookup(resolved_lookup)
if peer_found.get("ok"):
return peer_found
public_found = _fetch_dm_prekey_bundle_from_public_lookup(resolved_lookup)
preferred_peer_urls = list(lookup_peer_urls or [])
public_found = _fetch_dm_prekey_bundle_from_public_lookup(
resolved_lookup,
extra_preferred_peer_urls=preferred_peer_urls,
)
if public_found.get("ok"):
return public_found
peer_found: dict[str, Any] = {"ok": False, "detail": ""}
if not preferred_peer_urls:
peer_found = _fetch_dm_prekey_bundle_from_peer_lookup(resolved_lookup)
if peer_found.get("ok"):
return peer_found
if str(public_found.get("detail", "") or "").strip():
return {"ok": False, "detail": str(public_found.get("detail", "") or "Prekey bundle not found")}
return {"ok": False, "detail": str(peer_found.get("detail", "") or "Prekey bundle not found")}
@@ -1134,12 +1356,24 @@ def _classify_root_attestation_failure(peer_id: str) -> tuple[str, bool]:
return "", False
def bootstrap_encrypt_for_peer(peer_id: str, plaintext: str) -> dict[str, Any]:
fetched_bundle = fetch_dm_prekey_bundle(str(peer_id or "").strip())
def bootstrap_encrypt_for_peer(
peer_id: str,
plaintext: str,
*,
lookup_token: str = "",
fetched_bundle: dict[str, Any] | None = None,
) -> dict[str, Any]:
token = str(lookup_token or "").strip()
peer = str(peer_id or "").strip()
if fetched_bundle is None:
fetched_bundle = fetch_dm_prekey_bundle(
agent_id=peer if not token else "",
lookup_token=token,
)
if not fetched_bundle.get("ok"):
detail = str(fetched_bundle.get("detail", "") or "")
if "root attestation" in detail.lower():
trust_level, trust_changed = _classify_root_attestation_failure(str(peer_id or "").strip())
trust_level, trust_changed = _classify_root_attestation_failure(peer or token)
if trust_level:
return {
"ok": False,
@@ -1152,32 +1386,68 @@ def bootstrap_encrypt_for_peer(peer_id: str, plaintext: str) -> dict[str, Any]:
from services.mesh.mesh_dm_relay import dm_relay
resolved_peer_id = str(fetched_bundle.get("agent_id", peer_id) or peer_id).strip()
resolved_peer_id = str(fetched_bundle.get("agent_id", peer) or peer).strip()
stored = dm_relay.get_prekey_bundle(resolved_peer_id)
if not stored:
return {"ok": False, "detail": "Peer prekey bundle not found"}
remote_bundle = dict(fetched_bundle.get("bundle") or {})
if not remote_bundle and fetched_bundle.get("identity_dh_pub_key"):
remote_bundle = fetched_bundle
if remote_bundle:
stored = {
"bundle": remote_bundle,
"signature": str(fetched_bundle.get("signature", "") or ""),
"public_key": str(fetched_bundle.get("public_key", "") or ""),
"public_key_algo": str(fetched_bundle.get("public_key_algo", "") or ""),
"sequence": _safe_int(fetched_bundle.get("sequence", 0) or 0),
}
else:
return {"ok": False, "detail": "Peer prekey bundle not found"}
validated_record = {**dict(stored), "agent_id": resolved_peer_id}
ok, reason = _validate_bundle_record(validated_record)
if not ok:
return {"ok": False, "detail": reason}
trust_state = observe_remote_prekey_bundle(resolved_peer_id, validated_record)
trust_level = str(trust_state.get("trust_level", "") or "")
from services.mesh.mesh_wormhole_contacts import verified_first_contact_requirement
consent_handshake = False
try:
from services.mesh.mesh_wormhole_dead_drop import parse_contact_consent
verified_first_contact = verified_first_contact_requirement(
resolved_peer_id,
trust_level=trust_level,
)
if not verified_first_contact.get("ok"):
return {
"ok": False,
"peer_id": resolved_peer_id,
"detail": str(verified_first_contact.get("detail", "") or "verified first contact required"),
"trust_changed": trust_level in ("mismatch", "continuity_broken"),
"trust_level": str(verified_first_contact.get("trust_level", "") or trust_level or "unpinned"),
consent = parse_contact_consent(str(plaintext or "")) or {}
consent_handshake = str(consent.get("kind", "") or "") in {
"contact_offer",
"contact_accept",
"contact_deny",
}
except Exception:
consent_handshake = False
if not consent_handshake:
from services.mesh.mesh_wormhole_contacts import verified_first_contact_requirement
verified_first_contact = verified_first_contact_requirement(
resolved_peer_id,
trust_level=trust_level,
)
if not verified_first_contact.get("ok"):
return {
"ok": False,
"peer_id": resolved_peer_id,
"detail": str(
verified_first_contact.get("detail", "") or "verified first contact required"
),
"trust_changed": trust_level in ("mismatch", "continuity_broken"),
"trust_level": str(
verified_first_contact.get("trust_level", "") or trust_level or "unpinned"
),
}
peer_bundle_stored = dm_relay.consume_one_time_prekey(resolved_peer_id)
if not peer_bundle_stored:
remote_bundle = dict(stored.get("bundle") or {})
otks = list(remote_bundle.get("one_time_prekeys") or [])
peer_bundle_stored = {
"bundle": remote_bundle,
"claimed_one_time_prekey": dict(otks[0] or {}) if otks else {},
}
if not peer_bundle_stored.get("bundle"):
return {"ok": False, "detail": "Peer prekey bundle not found"}
peer_bundle = dict(peer_bundle_stored.get("bundle") or {})
peer_static = str(peer_bundle.get("identity_dh_pub_key", "") or "")
+24 -46
@@ -34,9 +34,9 @@ _session.mount("http://", HTTPAdapter(max_retries=_retry, pool_maxsize=10))
# upstream's only recourse was to block "Shadowbroker" as a whole — which
# would take out every other install too.
#
# Fix: give each install a stable pseudonymous handle and include it in
# the User-Agent. Now an upstream can rate-limit or block the offending
# operator without affecting anyone else.
# Fix: give each install a stable pseudonymous handle used as the entire
# User-Agent product token (no shared "Shadowbroker" label). Upstreams see
# ``operator-7f3a92`` (or ``OPERATOR_HANDLE``), not one monolithic app name.
#
# The handle:
#
@@ -51,7 +51,6 @@ _session.mount("http://", HTTPAdapter(max_retries=_retry, pool_maxsize=10))
# - Is NEVER mixed into mesh / Wormhole / Infonet identity. This layer is
# strictly for public third-party API attribution.
_SHADOWBROKER_VERSION = "0.9"
_OPERATOR_HANDLE_FILE = (
Path(__file__).parent.parent / "data" / "operator_handle.json"
)
@@ -146,7 +145,12 @@ def get_operator_handle() -> str:
# 3. On-disk handle from a previous run.
persisted = _load_persisted_operator_handle()
if persisted:
_OPERATOR_HANDLE_CACHE = _normalize_handle(persisted)
normalized = _normalize_handle(persisted)
# Migrate legacy auto-generated handles (pre-Round-7a ``shadow-`` prefix).
if normalized.startswith("shadow-"):
normalized = f"operator-{normalized[len('shadow-'):]}"
_persist_operator_handle(normalized)
_OPERATOR_HANDLE_CACHE = normalized
return _OPERATOR_HANDLE_CACHE
# 4. Generate, persist, return.
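The migration in this hunk rewrites a persisted handle that still carries the legacy `shadow-` prefix. As a minimal sketch of the string transformation alone (persistence and caching omitted), with `migrate_legacy_handle` as a hypothetical helper name:

```python
def migrate_legacy_handle(handle: str) -> str:
    """Rewrite a legacy 'shadow-<suffix>' handle to 'operator-<suffix>'."""
    legacy_prefix = "shadow-"
    if handle.startswith(legacy_prefix):
        return f"operator-{handle[len(legacy_prefix):]}"
    return handle
```

Only the exact `shadow-` prefix is rewritten, so an operator-chosen handle such as `shadowbroker` passes through unchanged.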
@@ -170,41 +174,21 @@ def _normalize_handle(raw: str) -> str:
return safe[:48] if safe else "anonymous"
_CONTACT_URL = "https://github.com/BigBodyCobain/Shadowbroker/issues"
def outbound_user_agent(purpose: str = "") -> str:
"""Build a User-Agent for an outbound third-party HTTP request.
Returns something like::
Returns the per-install handle only, e.g. ``operator-7f3a92`` or
``operator-7f3a92 (purpose: wikipedia)``. No shared project name so
upstream abuse teams cannot block every install with one ``Shadowbroker``
rule.
Shadowbroker/0.9 (operator: shadow-7f3a92; purpose: wikipedia;
+https://github.com/BigBodyCobain/Shadowbroker/issues)
The ``purpose`` is optional but recommended: it tells the upstream
what feature of ours is making the call (``wikipedia``, ``openmhz``,
``nominatim``, etc.), which makes their logs and our complaints
actionable.
Every outbound call in the backend that previously sent a custom
User-Agent should call this helper instead. Centralizing here means:
- one place to change the contact URL,
- one place to bump the version on release,
- one place a Wikimedia / OpenMHz operator can reach to ask for
the project to back off, with a per-install handle so they can
target the specific install instead of the project as a whole.
Set ``SHADOWBROKER_USER_AGENT`` to override the entire string if needed.
"""
handle = get_operator_handle()
if purpose:
purpose_clean = _normalize_handle(purpose)
return (
f"Shadowbroker/{_SHADOWBROKER_VERSION} "
f"(operator: {handle}; purpose: {purpose_clean}; +{_CONTACT_URL})"
)
return (
f"Shadowbroker/{_SHADOWBROKER_VERSION} "
f"(operator: {handle}; +{_CONTACT_URL})"
)
return f"{handle} (purpose: {purpose_clean})"
return handle
def _reset_operator_handle_cache_for_tests() -> None:
@@ -215,19 +199,13 @@ def _reset_operator_handle_cache_for_tests() -> None:
_OPERATOR_HANDLE_CACHE = ""
# Default outbound User-Agent. Retained for backwards compatibility with
# call sites that haven't been migrated to ``outbound_user_agent()`` yet.
# Operators who want full per-install attribution should set the
# ``OPERATOR_HANDLE`` setting and migrate call sites incrementally.
#
# Operators who run a public-facing relay can also override the whole UA
# string via the ``SHADOWBROKER_USER_AGENT`` env var. That override
# completely bypasses the per-operator helper; only use it if you know
# what you're doing.
DEFAULT_USER_AGENT = os.environ.get(
"SHADOWBROKER_USER_AGENT",
f"Shadowbroker/{_SHADOWBROKER_VERSION}",
)
def default_user_agent() -> str:
"""Default User-Agent for ``fetch_with_curl`` and legacy call sites."""
custom = (os.environ.get("SHADOWBROKER_USER_AGENT") or "").strip()
if custom:
return custom
return outbound_user_agent()
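After this change the User-Agent is the per-install handle alone, optionally followed by a purpose tag, unless `SHADOWBROKER_USER_AGENT` overrides the whole string. The sketch below reproduces that resolution order as pure functions. `build_user_agent` and `resolve_default_user_agent` are hypothetical names, the handle is passed in rather than loaded from disk, and the purpose is used as given (the real helper normalizes it first).

```python
import os
from typing import Mapping, Optional


def build_user_agent(handle: str, purpose: str = "") -> str:
    """Handle-only product token, with an optional purpose comment."""
    if purpose:
        return f"{handle} (purpose: {purpose})"
    return handle


def resolve_default_user_agent(
    handle: str,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Prefer the operator override, else fall back to the handle-only UA."""
    source = os.environ if env is None else env
    custom = (source.get("SHADOWBROKER_USER_AGENT") or "").strip()
    return custom or build_user_agent(handle)
```

Resolving the override at call time, instead of once at import as the removed `DEFAULT_USER_AGENT` constant did, lets a changed environment variable take effect without reloading the module.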
# Find bash for curl fallback — Git bash's curl has the TLS features
# needed to pass CDN fingerprint checks (brotli, zstd, libpsl)
@@ -283,7 +261,7 @@ def fetch_with_curl(url, method="GET", json_data=None, timeout=15, headers=None,
both Python requests and the barebones Windows system curl.
"""
default_headers = {
"User-Agent": DEFAULT_USER_AGENT,
"User-Agent": default_user_agent(),
}
if headers:
default_headers.update(headers)
+4 -2
@@ -12,6 +12,8 @@ logger = logging.getLogger(__name__)
CONFIG_PATH = Path(__file__).parent.parent / "config" / "news_feeds.json"
MAX_FEEDS = 50
_FEED_URL_REPLACEMENTS = {
"http://feeds.bbci.co.uk/news/world/rss.xml": "https://feeds.bbci.co.uk/news/world/rss.xml",
"http://www.news.cn/english/rss/worldrss.xml": "https://www.news.cn/english/rss/worldrss.xml",
"https://www.channelnewsasia.com/rssfeed/8395986": "https://www.channelnewsasia.com/api/v1/rss-outbound-feed?_format=xml",
}
_DEAD_FEED_URLS = {
@@ -27,7 +29,7 @@ _DEAD_FEED_URLS = {
DEFAULT_FEEDS = [
{"name": "NPR", "url": "https://feeds.npr.org/1004/rss.xml", "weight": 4},
{"name": "BBC", "url": "http://feeds.bbci.co.uk/news/world/rss.xml", "weight": 3},
{"name": "BBC", "url": "https://feeds.bbci.co.uk/news/world/rss.xml", "weight": 3},
{"name": "AlJazeera", "url": "https://www.aljazeera.com/xml/rss/all.xml", "weight": 2},
{"name": "NYT", "url": "https://rss.nytimes.com/services/xml/rss/nyt/World.xml", "weight": 1},
{"name": "GDACS", "url": "https://www.gdacs.org/xml/rss.xml", "weight": 5},
@@ -35,7 +37,7 @@ DEFAULT_FEEDS = [
{"name": "Bellingcat", "url": "https://www.bellingcat.com/feed/", "weight": 4},
{"name": "Guardian", "url": "https://www.theguardian.com/world/rss", "weight": 3},
{"name": "TASS", "url": "https://tass.com/rss/v2.xml", "weight": 2},
{"name": "Xinhua", "url": "http://www.news.cn/english/rss/worldrss.xml", "weight": 2},
{"name": "Xinhua", "url": "https://www.news.cn/english/rss/worldrss.xml", "weight": 2},
{"name": "CNA", "url": "https://www.channelnewsasia.com/api/v1/rss-outbound-feed?_format=xml", "weight": 3},
{"name": "Mercopress", "url": "https://en.mercopress.com/rss/", "weight": 3},
{"name": "SCMP", "url": "https://www.scmp.com/rss/91/feed", "weight": 4},
+172
@@ -83,6 +83,18 @@ READ_COMMANDS = frozenset({
"sar_pin_click",
# Analysis zones (OpenClaw map overlays)
"list_analysis_zones",
# Recon / OSINT toolkit (server-side proxies, SSRF guarded)
"osint_lookup",
"osint_tools",
"entity_expand",
# Agent routing helpers
"route_query",
"run_playbook",
# Private Infonet reads (operator-delegated)
"infonet_status",
"list_gates",
"read_gate_messages",
"poll_dms",
})
WRITE_COMMANDS = frozenset({
@@ -112,6 +124,14 @@ WRITE_COMMANDS = frozenset({
"place_analysis_zone",
"delete_analysis_zone",
"clear_analysis_zones",
# Active recon (subnet device discovery)
"osint_sweep",
# Private Infonet writes (operator wormhole identity)
"ensure_infonet_ready",
"join_infonet_swarm",
"post_gate_message",
"cast_vote",
"send_dm",
})
@@ -637,6 +657,19 @@ def _compact_query_result(result: Any) -> Any:
# Command dispatcher
# ---------------------------------------------------------------------------
def _expensive_gate(cmd: str, args: dict[str, Any]) -> dict[str, Any] | None:
from services.openclaw_routing import EXPENSIVE_GATE_MESSAGE, requires_expensive_confirm
if requires_expensive_confirm(cmd, args):
return {
"ok": False,
"detail": EXPENSIVE_GATE_MESSAGE,
"code": "expensive_command_blocked",
"hint": "route_query",
}
return None
def _dispatch_command(cmd: str, args: dict[str, Any]) -> dict[str, Any]:
"""Route a command to the appropriate AI Intel function.
@@ -644,6 +677,43 @@ def _dispatch_command(cmd: str, args: dict[str, Any]) -> dict[str, Any]:
Commands run in an isolated thread (via _execute_command) so they
do not need or touch the caller's event loop.
"""
blocked = _expensive_gate(cmd, args)
if blocked is not None:
return blocked
if cmd == "route_query":
from services.openclaw_routing import route_query
result = route_query(
text=str(args.get("text", "") or args.get("query", "") or ""),
lat=args.get("lat"),
lng=args.get("lng"),
radius_km=float(args.get("radius_km", 50) or 50),
compact=bool(args.get("compact", True)),
)
return {"ok": True, "data": result}
if cmd == "run_playbook":
from services.openclaw_routing import plan_playbook
plan = plan_playbook(str(args.get("name", "") or args.get("playbook", "")), args)
if not plan.get("ok"):
return plan
batch_results: list[dict[str, Any]] = []
for item in plan.get("batch", []):
inner_cmd = str(item.get("cmd", "")).strip().lower()
inner_args = item.get("args") or {}
inner_result = _dispatch_command(inner_cmd, inner_args)
batch_results.append({"cmd": inner_cmd, **inner_result})
return {
"ok": True,
"data": {
"playbook": plan.get("playbook"),
"description": plan.get("description", ""),
"results": batch_results,
},
}
if cmd == "get_telemetry":
from services.telemetry import get_cached_telemetry_refs
data = get_cached_telemetry_refs()
@@ -725,6 +795,7 @@ def _dispatch_command(cmd: str, args: dict[str, Any]) -> dict[str, Any]:
owner=str(args.get("owner", "") or args.get("operator", "") or ""),
layers=args.get("layers") if isinstance(args.get("layers"), (list, tuple)) else None,
limit=args.get("limit", 10),
fallback_search=bool(args.get("fallback_search") or args.get("confirm_fuzzy")),
)
if _wants_compact(args):
compact = dict(result)
@@ -780,6 +851,7 @@ def _dispatch_command(cmd: str, args: dict[str, Any]) -> dict[str, Any]:
query=str(args.get("query", "") or ""),
limit=args.get("limit", 10),
include_gdelt=bool(args.get("include_gdelt", True)),
include_telegram=bool(args.get("include_telegram", True)),
)
if _wants_compact(args):
return {"ok": True, "data": _compact_query_result(result), "format": "compressed_v1"}
@@ -846,6 +918,26 @@ def _dispatch_command(cmd: str, args: dict[str, Any]) -> dict[str, Any]:
return {"ok": True, "data": _compact_query_result(result), "format": "compressed_v1"}
return {"ok": True, "data": result}
if cmd == "osint_lookup":
from services.osint.openclaw_recon import run_osint_lookup
tool = str(args.get("tool", "") or args.get("lookup", "") or args.get("type", "") or "")
result = run_osint_lookup(tool, args)
return {"ok": True, "data": result, "tool": tool.strip().lower()}
if cmd == "osint_tools":
from services.osint.openclaw_recon import osint_tool_help
return {"ok": True, "data": osint_tool_help()}
if cmd == "osint_sweep":
from services.osint.openclaw_recon import run_osint_sweep
result = run_osint_sweep(args)
return {"ok": True, "data": result}
if cmd == "entity_expand":
from services.osint.openclaw_recon import run_entity_expand
result = run_entity_expand(args)
return {"ok": True, "data": result}
if cmd == "get_report":
from services.telemetry import get_cached_telemetry_refs, get_cached_slow_telemetry_refs
fast = get_cached_telemetry_refs()
@@ -1065,6 +1157,7 @@ def _dispatch_command(cmd: str, args: dict[str, Any]) -> dict[str, Any]:
owner=str(args.get("owner", "") or args.get("operator", "") or ""),
layers=args.get("layers") if isinstance(args.get("layers"), (list, tuple)) else None,
limit=5,
fallback_search=True,
)
best = lookup.get("best_match") if isinstance(lookup.get("best_match"), dict) else {}
group = str(best.get("group", "") or entity_type).lower()
@@ -1516,6 +1609,85 @@ def _dispatch_command(cmd: str, args: dict[str, Any]) -> dict[str, Any]:
count = clear_zones(source="openclaw")
return {"ok": True, "data": {"removed_count": count}}
# -- Infonet / gate / DM (operator-delegated, full tier for writes) ------
if cmd == "infonet_status":
from services.openclaw_infonet import get_infonet_status
return get_infonet_status()
if cmd == "ensure_infonet_ready":
from services.openclaw_infonet import ensure_infonet_ready
return ensure_infonet_ready(join_swarm=bool(args.get("join_swarm", True)))
if cmd == "join_infonet_swarm":
from services.openclaw_infonet import join_infonet_swarm
return join_infonet_swarm()
if cmd == "list_gates":
from services.openclaw_infonet import list_gates
return list_gates()
if cmd == "read_gate_messages":
from services.openclaw_infonet import read_gate_messages
gate_id = str(args.get("gate_id", "") or args.get("gate", "")).strip()
return read_gate_messages(
gate_id,
limit=int(args.get("limit", 20) or 20),
decrypt=bool(args.get("decrypt", False)),
)
if cmd == "post_gate_message":
from services.openclaw_infonet import post_gate_message
gate_id = str(args.get("gate_id", "") or args.get("gate", "")).strip()
plaintext = str(args.get("plaintext", "") or args.get("message", "")).strip()
return post_gate_message(
gate_id,
plaintext,
reply_to=str(args.get("reply_to", "") or ""),
)
if cmd == "cast_vote":
from services.openclaw_infonet import cast_vote
target_id = str(args.get("target_id", "") or args.get("target", "")).strip()
vote_raw = args.get("vote", args.get("direction"))
try:
vote_val = int(vote_raw)
except (TypeError, ValueError):
return {"ok": False, "detail": "vote must be 1 or -1"}
return cast_vote(
target_id,
vote_val,
gate=str(args.get("gate", "") or args.get("gate_id", "")).strip(),
)
if cmd == "send_dm":
from services.openclaw_infonet import send_dm
peer_id = str(
args.get("peer_id", "")
or args.get("recipient_id", "")
or args.get("recipient", "")
).strip()
plaintext = str(args.get("plaintext", "") or args.get("message", "")).strip()
return send_dm(
peer_id,
plaintext,
delivery_class=str(args.get("delivery_class", "shared") or "shared"),
recipient_token=str(args.get("recipient_token", "") or ""),
)
if cmd == "poll_dms":
from services.openclaw_infonet import poll_dms
return poll_dms(limit=int(args.get("limit", 20) or 20))
return {"ok": False, "detail": f"unhandled command: {cmd}"}
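The Infonet branches above resolve each argument through a chain of aliases (`gate_id` or `gate`, `plaintext` or `message`) and coerce it inline. Two consequences follow from the expressions as written: `bool(args.get("decrypt", False))` treats the string "false" as truthy, and `int(args.get("limit", 20) or 20)` turns an explicit 0 into 20. A minimal sketch of helpers that make those choices explicit; `first_str` and `strict_bool` are hypothetical names and are not part of this commit:

```python
from typing import Any


def first_str(args: dict[str, Any], *names: str, default: str = "") -> str:
    """Return the first non-empty value among the alias names, stripped."""
    for name in names:
        value = args.get(name)
        if value is not None and str(value).strip():
            return str(value).strip()
    return default


def strict_bool(value: Any, default: bool = False) -> bool:
    """Interpret common string spellings instead of relying on truthiness."""
    if isinstance(value, bool):
        return value
    if value is None:
        return default
    if isinstance(value, str):
        return value.strip().lower() in {"1", "true", "yes", "on"}
    return bool(value)


# Example arguments as an agent might send them (made up for illustration).
args = {"gate": " Infonet ", "decrypt": "false"}
gate_id = first_str(args, "gate_id", "gate")
decrypt = strict_bool(args.get("decrypt"))
```

Whether the stricter parsing is wanted depends on how agents actually encode flags; JSON booleans behave the same under both approaches.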
+796
@@ -0,0 +1,796 @@
"""OpenClaw agent delegation for private Infonet / gate / DM actions.
Agents authenticate with OpenClaw HMAC on the command channel. Write
commands require ``OPENCLAW_ACCESS_TIER=full``. Actions use the operator's
local wormhole persona and node runtime; the agent posts on behalf of the
user who configured the skill, not as a separate fleet identity.
"""
from __future__ import annotations
import asyncio
import json
import logging
import os
import secrets
import time
from typing import Any
from starlette.requests import Request
logger = logging.getLogger(__name__)
def _run_async(coro):
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        return asyncio.run(coro)
    return asyncio.run(coro)
def _local_agent_request(path: str, *, method: str = "POST") -> Request:
scope = {
"type": "http",
"method": method.upper(),
"path": path,
"headers": [],
"client": ("127.0.0.1", 52421),
}
request = Request(scope)
request.state._private_lane_current_tier = "private_strong"
request.state._transport_tier = "private_strong"
return request
def ensure_infonet_ready(*, join_swarm: bool = True) -> dict[str, Any]:
"""Warm Tor, enable the participant node, and optionally join the swarm."""
from routers.ai_intel import _write_env_value
from services.config import get_settings
from services.mesh.mesh_swarm_runtime import join_swarm_with_retries
from services.node_settings import read_node_settings, write_node_settings
from services.tor_hidden_service import tor_service
from services.wormhole_supervisor import _check_arti_ready
steps: dict[str, Any] = {}
tor_result = tor_service.start(target_port=8000)
steps["tor"] = tor_result
if tor_result.get("ok"):
try:
_write_env_value("MESH_ARTI_ENABLED", "true")
get_settings.cache_clear()
except Exception as exc:
logger.debug("failed to persist MESH_ARTI_ENABLED: %s", exc)
if not _check_arti_ready():
return {
"ok": False,
"detail": "Tor/Arti transport is not ready yet",
"steps": steps,
}
if not bool(read_node_settings().get("enabled")):
write_node_settings(enabled=True)
steps["node_enabled"] = True
try:
import main as main_mod
main_mod._refresh_node_peer_store()
main_mod._start_infonet_node_runtime("openclaw_agent")
except Exception as exc:
logger.warning("node runtime start after agent enable failed: %s", exc)
else:
steps["node_enabled"] = True
if join_swarm:
joined = join_swarm_with_retries()
steps["announce"] = joined.get("announce") or {}
steps["manifest_pull"] = joined.get("manifest_pull") or {}
steps["swarm_attempts"] = joined.get("attempts")
ok = bool(joined.get("ok"))
else:
ok = True
return {
"ok": ok,
"detail": "Infonet participant runtime ready" if ok else "swarm join incomplete",
"steps": steps,
"onion_address": str(tor_result.get("onion_address") or ""),
}
def join_infonet_swarm() -> dict[str, Any]:
from services.mesh.mesh_swarm_runtime import join_swarm_with_retries
joined = join_swarm_with_retries()
return {
"ok": bool(joined.get("ok")),
"announce": joined.get("announce") or {},
"manifest_pull": joined.get("manifest_pull") or {},
"attempts": joined.get("attempts"),
"detail": joined.get("detail"),
}
def get_infonet_status() -> dict[str, Any]:
from services.mesh.mesh_hashchain import infonet
from services.wormhole_supervisor import get_wormhole_state
info = infonet.get_info()
valid, reason = infonet.validate_chain(verify_signatures=False)
try:
wormhole = get_wormhole_state()
except Exception:
wormhole = {"configured": False, "ready": False, "arti_ready": False, "rns_ready": False}
try:
import main as main_mod
runtime = main_mod._node_runtime_snapshot()
private_tier = main_mod._current_private_lane_tier(wormhole)
except Exception:
runtime = {}
private_tier = "public_degraded"
return {
"ok": True,
"chain": info,
"valid": valid,
"validation": reason,
"private_lane_tier": private_tier,
"wormhole": wormhole,
"runtime": runtime,
}
def list_gates() -> dict[str, Any]:
from services.mesh.mesh_reputation import gate_manager
return {"ok": True, "gates": gate_manager.list_gates()}
def read_gate_messages(
gate_id: str,
*,
limit: int = 20,
decrypt: bool = False,
) -> dict[str, Any]:
from services.mesh.mesh_hashchain import gate_store
gate_key = str(gate_id or "").strip().lower()
if not gate_key:
return {"ok": False, "detail": "gate_id required"}
messages, cursor = gate_store.get_messages_with_cursor(gate_key, limit=max(1, min(int(limit), 100)))
out = []
if decrypt:
from services.mesh.mesh_gate_repair import decrypt_gate_message_with_repair
for msg in messages:
item = dict(msg)
try:
decrypted = decrypt_gate_message_with_repair(
gate_id=gate_key,
epoch=int(item.get("epoch") or 0),
ciphertext=str(item.get("ciphertext") or ""),
nonce=str(item.get("nonce") or item.get("iv") or ""),
sender_ref=str(item.get("sender_ref") or ""),
gate_envelope=str(item.get("gate_envelope") or ""),
envelope_hash=str(item.get("envelope_hash") or ""),
event_id=str(item.get("event_id") or ""),
)
if decrypted.get("ok"):
item["plaintext"] = decrypted.get("plaintext", "")
except Exception as exc:
item["decrypt_error"] = str(exc)
out.append(item)
else:
out = [dict(m) for m in messages]
return {
"ok": True,
"gate": gate_key,
"count": len(out),
"cursor": cursor,
"messages": out,
}
def post_gate_message(
gate_id: str,
plaintext: str,
*,
reply_to: str = "",
) -> dict[str, Any]:
"""Compose, sign, and post an MLS gate message using the operator persona."""
from services.mesh.mesh_gate_repair import (
compose_gate_message_with_repair,
sign_gate_message_with_repair,
)
from services.mesh.mesh_wormhole_persona import bootstrap_wormhole_persona_state, create_gate_persona
gate_key = str(gate_id or "").strip().lower()
if not gate_key:
return {"ok": False, "detail": "gate_id required"}
if not str(plaintext or "").strip():
return {"ok": False, "detail": "plaintext required"}
bootstrap_wormhole_persona_state(force=False)
try:
create_gate_persona(gate_key, label="openclaw-agent")
except Exception:
pass
composed = compose_gate_message_with_repair(
gate_id=gate_key,
plaintext=str(plaintext),
reply_to=str(reply_to or ""),
)
if not composed.get("ok"):
return composed
signed = sign_gate_message_with_repair(
gate_id=gate_key,
epoch=int(composed.get("epoch") or 0),
ciphertext=str(composed.get("ciphertext") or ""),
nonce=str(composed.get("nonce") or ""),
payload_format=str(composed.get("format") or "mls1"),
reply_to=str(reply_to or ""),
envelope_hash=str(composed.get("envelope_hash") or ""),
transport_lock="private_strong",
)
if not signed.get("ok"):
return signed
body = {
"sender_id": str(signed.get("sender_id") or composed.get("sender_id") or ""),
"public_key": str(signed.get("public_key") or composed.get("public_key") or ""),
"public_key_algo": str(signed.get("public_key_algo") or composed.get("public_key_algo") or ""),
"signature": str(signed.get("signature") or ""),
"sequence": int(signed.get("sequence") or composed.get("sequence") or 0),
"protocol_version": str(signed.get("protocol_version") or composed.get("protocol_version") or ""),
"epoch": int(signed.get("epoch") or composed.get("epoch") or 0),
"ciphertext": str(signed.get("ciphertext") or composed.get("ciphertext") or ""),
"nonce": str(signed.get("nonce") or composed.get("nonce") or ""),
"sender_ref": str(signed.get("sender_ref") or composed.get("sender_ref") or ""),
"format": str(signed.get("format") or composed.get("format") or "mls1"),
"gate_envelope": str(signed.get("gate_envelope") or composed.get("gate_envelope") or ""),
"envelope_hash": str(signed.get("envelope_hash") or composed.get("envelope_hash") or ""),
"transport_lock": "private_strong",
"reply_to": str(signed.get("reply_to") or reply_to or ""),
}
import main as main_mod
path = f"/api/mesh/gate/{gate_key}/message"
request = _local_agent_request(path)
return main_mod._submit_gate_message_envelope(request, gate_key, body)
def cast_vote(
target_id: str,
vote: int,
*,
gate: str = "",
) -> dict[str, Any]:
"""Cast a signed reputation vote using the operator gate/transport persona."""
from services.mesh.mesh_hashchain import infonet
from services.mesh.mesh_protocol import PROTOCOL_VERSION, normalize_payload
from services.mesh.mesh_reputation import gate_manager, reputation_ledger
from services.mesh.mesh_wormhole_persona import (
bootstrap_wormhole_persona_state,
sign_gate_wormhole_event,
sign_public_wormhole_event,
)
voter_gate = str(gate or "").strip().lower()
target = str(target_id or "").strip()
vote_val = int(vote)
if not target:
return {"ok": False, "detail": "target_id required"}
if vote_val not in (1, -1):
return {"ok": False, "detail": "vote must be 1 or -1"}
bootstrap_wormhole_persona_state(force=False)
vote_payload = {"target_id": target, "vote": vote_val, "gate": voter_gate}
normalized = normalize_payload("vote", vote_payload)
from services.mesh.mesh_schema import validate_event_payload
ok_payload, reason = validate_event_payload("vote", normalized)
if not ok_payload:
return {"ok": False, "detail": reason}
if voter_gate:
signed = sign_gate_wormhole_event(
gate_id=voter_gate,
event_type="vote",
payload=normalized,
)
else:
signed = sign_public_wormhole_event(event_type="vote", payload=normalized)
if not signed.get("ok", True):
return signed
voter_id = str(signed.get("node_id") or "")
public_key = str(signed.get("public_key") or "")
public_key_algo = str(signed.get("public_key_algo") or "")
signature = str(signed.get("signature") or "")
sequence = int(signed.get("sequence") or 0)
if voter_gate:
can_enter, enter_reason = gate_manager.can_enter(voter_id, voter_gate)
if not can_enter:
return {"ok": False, "detail": f"Gate vote denied: {enter_reason}"}
reputation_ledger.register_node(voter_id, public_key, public_key_algo)
stable_voter_id = voter_id
try:
import main as main_mod
root_nid = main_mod._cached_root_node_id()
if root_nid:
stable_voter_id = root_nid
except Exception:
pass
ok, cast_reason, weight = reputation_ledger.cast_vote(
stable_voter_id,
target,
vote_val,
voter_gate,
)
if ok:
try:
infonet.append(
event_type="vote",
node_id=voter_id,
payload=normalized,
signature=signature,
sequence=sequence,
public_key=public_key,
public_key_algo=public_key_algo,
protocol_version=str(signed.get("protocol_version") or PROTOCOL_VERSION),
)
except Exception as exc:
logger.warning("vote recorded in ledger but infonet append failed: %s", exc)
return {"ok": ok, "detail": cast_reason, "weight": round(float(weight or 0), 2)}
def _http_post_json(
url: str,
body: dict[str, Any],
*,
extra_headers: dict[str, str] | None = None,
timeout: int = 120,
) -> dict[str, Any]:
import urllib.error
import urllib.request
payload_bytes = json.dumps(body, separators=(",", ":"), sort_keys=True).encode("utf-8")
headers = {"Content-Type": "application/json"}
if extra_headers:
headers.update(extra_headers)
req = urllib.request.Request(url, data=payload_bytes, headers=headers, method="POST")
try:
with urllib.request.urlopen(req, timeout=timeout) as resp:
raw = resp.read().decode("utf-8")
except urllib.error.HTTPError as exc:
detail = exc.read().decode("utf-8", errors="replace")
try:
parsed = json.loads(detail)
if isinstance(parsed, dict):
return parsed
except Exception:
pass
return {"ok": False, "detail": detail or f"http {exc.code}"}
if not raw:
return {}
parsed = json.loads(raw)
return parsed if isinstance(parsed, dict) else {"ok": False, "detail": "invalid json response"}
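`_http_post_json` serializes the body with `separators=(",", ":")` and `sort_keys=True`, which gives a compact byte string that does not depend on dict insertion order. A short standard-library illustration; the sample bodies and the `encode` helper are made up for the example:

```python
import json

# Same content, different key order.
body_a = {"recipient_id": "peer-1", "delivery_class": "shared", "meta": {"b": 2, "a": 1}}
body_b = {"meta": {"a": 1, "b": 2}, "delivery_class": "shared", "recipient_id": "peer-1"}


def encode(body: dict) -> bytes:
    """Serialize the way _http_post_json does: compact and key-sorted."""
    return json.dumps(body, separators=(",", ":"), sort_keys=True).encode("utf-8")


wire_a = encode(body_a)
wire_b = encode(body_b)
print(wire_a)  # b'{"delivery_class":"shared","meta":{"a":1,"b":2},"recipient_id":"peer-1"}'
```

The in-process fallback in `_submit_signed_dm_send` and the request body in `poll_dms` use plain `json.dumps(body)` instead, so byte-for-byte equality between the two paths should not be assumed.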
def _issue_sender_token_for_http_send(
api_base: str,
*,
recipient: str,
delivery: str,
recipient_token: str,
) -> dict[str, Any]:
extra_headers: dict[str, str] = {}
admin_key = str(os.environ.get("ADMIN_KEY") or "").strip()
if admin_key:
extra_headers["X-Admin-Key"] = admin_key
return _http_post_json(
f"{api_base}/api/wormhole/dm/sender-token",
{
"recipient_id": recipient,
"delivery_class": delivery,
"recipient_token": recipient_token,
},
extra_headers=extra_headers or None,
)
def _submit_signed_dm_send(
*,
recipient: str,
delivery_class: str,
recipient_token: str,
ciphertext: str,
payload_format: str,
session_welcome: str = "",
connect_intent: str = "",
lookup_peer_url: str = "",
peer_dh_pub: str = "",
) -> dict[str, Any]:
import main as main_mod
from services.mesh.mesh_protocol import (
PROTOCOL_VERSION,
SIGNED_CONTEXT_FIELD,
build_signed_context,
)
from services.mesh.mesh_schema import validate_event_payload
from services.mesh.mesh_wormhole_persona import get_dm_identity, sign_dm_wormhole_event
from services.mesh.mesh_wormhole_sender_token import issue_wormhole_dm_sender_token
delivery = str(delivery_class or "shared").strip().lower()
identity = get_dm_identity()
sender_id = str(identity.get("node_id") or "")
msg_id = secrets.token_hex(16)
timestamp = int(time.time())
sequence = int(identity.get("sequence", 0) or 0) + 1
dm_payload: dict[str, Any] = {
"recipient_id": recipient,
"delivery_class": delivery,
"recipient_token": str(recipient_token or ""),
"ciphertext": str(ciphertext or ""),
"msg_id": msg_id,
"timestamp": timestamp,
"format": str(payload_format or "mls1"),
"transport_lock": "private_strong",
}
if session_welcome:
dm_payload["session_welcome"] = str(session_welcome)
try:
from services.config import get_settings
from services.mesh.mesh_wormhole_seal import build_sender_seal
if (
delivery == "shared"
and bool(get_settings().MESH_DM_REQUIRE_SENDER_SEAL_SHARED)
and not str(dm_payload.get("sender_seal", "") or "").strip()
):
seal = build_sender_seal(
recipient_id=recipient,
recipient_dh_pub=str(peer_dh_pub or ""),
msg_id=msg_id,
timestamp=timestamp,
)
if seal.get("ok"):
dm_payload["sender_seal"] = str(seal.get("sender_seal") or "")
except Exception:
pass
ok_payload, reason = validate_event_payload("dm_message", dm_payload)
if not ok_payload:
return {"ok": False, "detail": reason}
dm_payload[SIGNED_CONTEXT_FIELD] = build_signed_context(
event_type="dm_message",
kind="dm_send",
endpoint="/api/mesh/dm/send",
lane_floor="private_strong",
sequence_domain="dm_send",
node_id=sender_id,
sequence=sequence,
payload=dm_payload,
recipient_id=recipient,
)
signed = sign_dm_wormhole_event(
event_type="dm_message",
payload=dm_payload,
sequence=sequence,
)
if not signed.get("ok", True):
return signed
body = {
"sender_id": sender_id,
"sender_token": "",
"recipient_id": recipient,
"delivery_class": delivery,
"recipient_token": str(recipient_token or ""),
"ciphertext": str(ciphertext or ""),
"format": str(payload_format or "mls1"),
"transport_lock": "private_strong",
"session_welcome": str(session_welcome or ""),
"msg_id": msg_id,
"timestamp": timestamp,
"sender_seal": str(dm_payload.get("sender_seal") or ""),
"public_key": str(signed.get("public_key") or ""),
"public_key_algo": str(signed.get("public_key_algo") or ""),
"signature": str(signed.get("signature") or ""),
"sequence": int(signed.get("sequence") or 0),
"protocol_version": str(signed.get("protocol_version") or PROTOCOL_VERSION),
"signed_context": dict(dm_payload.get(SIGNED_CONTEXT_FIELD) or {}),
}
normalized_intent = str(connect_intent or "").strip().lower()
normalized_lookup_peer = str(lookup_peer_url or "").strip().rstrip("/")
if normalized_intent:
body["connect_intent"] = normalized_intent
if normalized_lookup_peer:
body["lookup_peer_url"] = normalized_lookup_peer
api_base = str(os.environ.get("SB_API_BASE", "http://127.0.0.1:8000") or "http://127.0.0.1:8000").rstrip("/")
result: dict[str, Any] = {"ok": False, "detail": "dm send failed"}
try:
import urllib.error
if delivery in ("request", "shared"):
issued = _issue_sender_token_for_http_send(
api_base,
recipient=recipient,
delivery=delivery,
recipient_token=str(recipient_token or ""),
)
if not issued.get("ok"):
return issued
body["sender_token"] = str(issued.get("sender_token") or "")
result = _http_post_json(f"{api_base}/api/mesh/dm/send", body)
except (urllib.error.URLError, TimeoutError):
if delivery in ("request", "shared"):
issued = issue_wormhole_dm_sender_token(
recipient_id=recipient,
delivery_class=delivery,
recipient_token=str(recipient_token or ""),
)
if not issued.get("ok"):
return issued
body["sender_token"] = str(issued.get("sender_token") or "")
async def _send():
import json as _json
raw = _json.dumps(body).encode("utf-8")
async def receive():
return {"type": "http.request", "body": raw, "more_body": False}
req = Request(
{
"type": "http",
"method": "POST",
"path": "/api/mesh/dm/send",
"headers": [(b"content-type", b"application/json")],
"client": ("127.0.0.1", 52421),
},
receive,
)
req.state._private_lane_current_tier = "private_strong"
req.state._transport_tier = "private_strong"
return await main_mod.dm_send(req)
result = _run_async(_send())
except Exception as exc:
result = {"ok": False, "detail": str(exc) or type(exc).__name__}
if isinstance(result, dict):
result.setdefault("msg_id", msg_id)
result.setdefault("sender_id", sender_id)
result.setdefault("recipient_id", recipient)
return result
def send_contact_request(
*,
lookup_token: str = "",
peer_id: str = "",
note: str = "",
lookup_peer_url: str = "",
cached_prekey_bundle: dict[str, Any] | None = None,
) -> dict[str, Any]:
"""Send a first-contact request using a short address or peer id."""
from services.mesh.mesh_wormhole_dead_drop import build_contact_offer
from services.mesh.mesh_wormhole_persona import get_dm_identity
from services.mesh.mesh_wormhole_prekey import bootstrap_encrypt_for_peer, fetch_dm_prekey_bundle
token = str(lookup_token or "").strip()
peer = str(peer_id or "").strip()
if not token and not peer:
return {"ok": False, "detail": "lookup_token or peer_id required"}
preferred_peer = str(lookup_peer_url or "").strip().rstrip("/")
if cached_prekey_bundle and cached_prekey_bundle.get("ok"):
bundle = dict(cached_prekey_bundle)
else:
bundle = fetch_dm_prekey_bundle(
agent_id=peer if not token else "",
lookup_token=token,
lookup_peer_urls=[preferred_peer] if preferred_peer else None,
)
if not bundle.get("ok"):
return bundle
recipient = str(bundle.get("agent_id") or peer).strip()
if not recipient:
return {"ok": False, "detail": "recipient unresolved"}
identity = get_dm_identity()
offer = build_contact_offer(
dh_pub_key=str(identity.get("dh_pub_key") or ""),
dh_algo=str(identity.get("dh_algo") or "X25519"),
geo_hint=str(note or ""),
)
encrypted = bootstrap_encrypt_for_peer(
recipient,
offer,
fetched_bundle=bundle,
)
if not encrypted.get("ok"):
return encrypted
return _submit_signed_dm_send(
recipient=recipient,
delivery_class="request",
recipient_token="",
ciphertext=str(encrypted.get("result") or ""),
payload_format="mls1",
connect_intent="contact_request",
lookup_peer_url=preferred_peer,
)
def send_contact_accept(
*,
peer_id: str,
peer_dh_pub: str = "",
lookup_token: str = "",
lookup_peer_url: str = "",
) -> dict[str, Any]:
"""Accept a pending contact request and open the shared DM lane."""
from services.mesh.mesh_wormhole_dead_drop import build_contact_accept, issue_pairwise_dm_alias
from services.mesh.mesh_wormhole_prekey import bootstrap_encrypt_for_peer, fetch_dm_prekey_bundle
peer = str(peer_id or "").strip()
if not peer:
return {"ok": False, "detail": "peer_id required"}
token = str(lookup_token or "").strip()
preferred_peer = str(lookup_peer_url or "").strip().rstrip("/")
dh_pub = str(peer_dh_pub or "").strip()
if not dh_pub:
bundle = fetch_dm_prekey_bundle(
agent_id=peer if not token else "",
lookup_token=token,
lookup_peer_urls=[preferred_peer] if preferred_peer else None,
)
if not bundle.get("ok"):
return bundle
dh_pub = str(bundle.get("dh_pub_key") or "").strip()
if not dh_pub:
return {"ok": False, "detail": "peer dh_pub_key unavailable"}
alias = issue_pairwise_dm_alias(peer_id=peer, peer_dh_pub=dh_pub)
if not alias.get("ok"):
return alias
shared_alias = str(alias.get("shared_alias") or "").strip()
if not shared_alias:
return {"ok": False, "detail": "shared_alias unavailable"}
accept_plain = build_contact_accept(shared_alias=shared_alias)
encrypted = bootstrap_encrypt_for_peer(peer, accept_plain, lookup_token=token)
if not encrypted.get("ok"):
return encrypted
sent = _submit_signed_dm_send(
recipient=peer,
delivery_class="request",
recipient_token="",
ciphertext=str(encrypted.get("result") or ""),
payload_format="mls1",
connect_intent="contact_accept",
lookup_peer_url=preferred_peer,
)
if isinstance(sent, dict):
sent.setdefault("shared_alias", shared_alias)
return sent
def send_dm(
peer_id: str,
plaintext: str,
*,
delivery_class: str = "shared",
recipient_token: str = "",
) -> dict[str, Any]:
"""Compose and send an encrypted DM on behalf of the operator."""
import main as main_mod
recipient = str(peer_id or "").strip()
if not recipient:
return {"ok": False, "detail": "peer_id required"}
if not str(plaintext or "").strip():
return {"ok": False, "detail": "plaintext required"}
delivery = str(delivery_class or "shared").strip().lower()
if delivery not in ("shared", "request"):
return {"ok": False, "detail": "delivery_class must be shared or request"}
composed = main_mod.compose_wormhole_dm(
peer_id=recipient,
peer_dh_pub="",
plaintext=str(plaintext),
)
if not composed.get("ok"):
return composed
return _submit_signed_dm_send(
recipient=recipient,
delivery_class=delivery,
recipient_token=str(recipient_token or ""),
ciphertext=str(composed.get("ciphertext") or ""),
payload_format=str(composed.get("format") or "mls1"),
session_welcome=str(composed.get("session_welcome") or ""),
)
def poll_dms(*, limit: int = 20) -> dict[str, Any]:
"""Poll encrypted DMs for the operator DM identity."""
import json
import main as main_mod
from services.mesh.mesh_protocol import PROTOCOL_VERSION
from services.mesh.mesh_wormhole_persona import get_dm_identity, sign_dm_wormhole_event
identity = get_dm_identity()
agent_id = str(identity.get("node_id") or "")
if not agent_id:
return {"ok": False, "detail": "dm identity is not configured"}
poll_payload = {"mailbox_claims": [], "agent_id": agent_id}
signed = sign_dm_wormhole_event(event_type="dm_poll", payload=poll_payload)
if not signed.get("ok", True):
return signed
body = {
"agent_id": agent_id,
"mailbox_claims": [],
"timestamp": int(time.time()),
"nonce": secrets.token_hex(8),
"public_key": str(signed.get("public_key") or ""),
"public_key_algo": str(signed.get("public_key_algo") or ""),
"signature": str(signed.get("signature") or ""),
"sequence": int(signed.get("sequence") or 0),
"protocol_version": str(signed.get("protocol_version") or PROTOCOL_VERSION),
}
raw = json.dumps(body).encode("utf-8")
async def _poll():
async def receive():
return {"type": "http.request", "body": raw, "more_body": False}
req = Request(
{
"type": "http",
"method": "POST",
"path": "/api/mesh/dm/poll",
"headers": [(b"content-type", b"application/json")],
"client": ("127.0.0.1", 52421),
},
receive,
)
return await main_mod.dm_poll_secure(req)
result = _run_async(_poll())
if isinstance(result, dict):
messages = list(result.get("messages") or [])
if limit and len(messages) > int(limit):
result = dict(result)
result["messages"] = messages[: int(limit)]
result["count"] = len(result["messages"])
return result if isinstance(result, dict) else {"ok": False, "detail": "dm poll failed"}
+531
@@ -0,0 +1,531 @@
"""Deterministic OpenClaw routing — intent → fastest command.
Keeps expensive fuzzy scans and full-layer dumps out of the default agent path.
"""
from __future__ import annotations
import re
from typing import Any
EXPENSIVE_COMMANDS = frozenset({
"search_telemetry",
"get_telemetry",
"get_slow_telemetry",
"get_report",
})
EXPENSIVE_GATE_MESSAGE = (
"expensive command blocked — use route_query, find_entity, run_playbook, or targeted reads. "
"Pass confirm_expensive=true only when fuzzy search or full dumps are intentional."
)
LATENCY_TIER_MS: dict[str, int] = {
"channel_status": 5,
"route_query": 5,
"get_summary": 10,
"what_changed": 15,
"search_news": 15,
"find_flights": 25,
"find_ships": 25,
"find_entity": 30,
"entities_near": 30,
"brief_area": 30,
"get_layer_slice": 50,
"correlate_entity": 15,
"entity_expand": 40,
"osint_lookup": 200,
"run_playbook": 120,
"infonet_status": 20,
"list_gates": 15,
"read_gate_messages": 40,
"poll_dms": 80,
"ensure_infonet_ready": 120000,
"join_infonet_swarm": 90000,
"post_gate_message": 15000,
"cast_vote": 5000,
"send_dm": 20000,
"search_telemetry": 8000,
"get_telemetry": 3500,
"get_slow_telemetry": 1500,
"get_report": 5000,
}
RE_N_NUMBER = re.compile(r"\bN\d{1,5}[A-Z]{0,2}\b", re.I)
RE_CALLSIGN = re.compile(r"\b[A-Z]{2,4}\d{1,4}[A-Z]?\b")
RE_MMSI = re.compile(r"\b\d{9}\b")
RE_CVE = re.compile(r"\bCVE-\d{4}-\d+\b", re.I)
RE_IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
RE_DOMAIN = re.compile(
r"\b(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+(?:[a-z]{2,})\b",
re.I,
)
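The patterns above are purely lexical: `RE_IPV4` accepts any four dot-separated groups of one to three digits, and `RE_MMSI` accepts any nine-digit run. A sketch of how a caller might tighten a match before routing on it, assuming the standard-library `ipaddress` module is acceptable for validation. `first_valid_ipv4` is a hypothetical helper; the two patterns are copied from the diff:

```python
import ipaddress
import re
from typing import Optional

RE_N_NUMBER = re.compile(r"\bN\d{1,5}[A-Z]{0,2}\b", re.I)
RE_IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def first_valid_ipv4(text: str) -> Optional[str]:
    """Return the first regex hit that is also a real IPv4 address."""
    for match in RE_IPV4.finditer(text):
        candidate = match.group(0)
        try:
            ipaddress.IPv4Address(candidate)
        except ValueError:
            continue  # lexical hit only, e.g. an octet above 255
        return candidate
    return None


tail = RE_N_NUMBER.search("track n628ts now")
addr = first_valid_ipv4("ping 999.1.1.1 then 10.0.0.7")
```

Because `RE_N_NUMBER` is compiled with `re.I`, the match keeps the caller's casing, so normalizing to upper case before a lookup may be needed.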
KNOWN_CALLSIGNS = frozenset({
"AF1", "AF2", "EXEC1", "EXEC2", "SAM", "STALK52", "SPAR19", "SPAR20",
})
PLAYBOOKS: dict[str, dict[str, Any]] = {
"hot_snapshot": {
"description": "Summary + hot layers + what changed (one batch)",
"batch": [
{"cmd": "get_summary", "args": {"compact": True}},
{
"cmd": "get_layer_slice",
"args": {
"layers": [
"news",
"telegram_osint",
"military_flights",
"private_jets",
"earthquakes",
],
"limit_per_layer": 10,
"compact": True,
},
},
{"cmd": "what_changed", "args": {"compact": True}},
],
},
"status_check": {
"description": "Channel health + layer counts",
"batch": [
{"cmd": "channel_status", "args": {}},
{"cmd": "get_summary", "args": {"compact": True}},
],
},
"morning_brief": {
"description": "Operator morning digest layers",
"batch": [
{"cmd": "get_summary", "args": {"compact": True}},
{"cmd": "what_changed", "args": {"compact": True}},
{
"cmd": "get_layer_slice",
"args": {
"layers": [
"news",
"telegram_osint",
"gdelt",
"earthquakes",
"crowdthreat",
"military_flights",
],
"limit_per_layer": 15,
"compact": True,
},
},
],
},
"monitor_heartbeat": {
"description": "Low-latency monitor poll (replaces full telemetry pull)",
"batch": [
{"cmd": "what_changed", "args": {"compact": True}},
{
"cmd": "get_layer_slice",
"args": {
"layers": [
"military_flights",
"ships",
"earthquakes",
"liveuamap",
"crowdthreat",
"uap_sightings",
"firms_fires",
"gps_jamming",
"wastewater",
],
"limit_per_layer": 200,
"compact": True,
},
},
],
},
}
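Each playbook above is a plain list of `{cmd, args}` steps. In the dispatcher diff, `run_playbook` feeds every step back through `_dispatch_command`, so the expensive-command gate also applies to inner steps. A toy sketch of that loop; `toy_dispatch`, `run_batch`, and the one-element `EXPENSIVE` set are stand-ins invented for illustration and are not functions from the commit:

```python
from typing import Any, Callable

EXPENSIVE = frozenset({"get_telemetry"})  # illustrative subset only

Dispatch = Callable[[str, dict[str, Any]], dict[str, Any]]


def toy_dispatch(cmd: str, args: dict[str, Any]) -> dict[str, Any]:
    """Stand-in dispatcher: gate expensive commands, echo everything else."""
    if cmd in EXPENSIVE and args.get("confirm_expensive") is not True:
        return {"ok": False, "code": "expensive_command_blocked"}
    return {"ok": True, "data": {"echo": cmd}}


def run_batch(batch: list[dict[str, Any]], dispatch: Dispatch) -> list[dict[str, Any]]:
    """Run each step through the dispatcher and tag the result with its cmd."""
    results = []
    for item in batch:
        cmd = str(item.get("cmd", "")).strip().lower()
        args = item.get("args") or {}
        results.append({"cmd": cmd, **dispatch(cmd, args)})
    return results


batch = [
    {"cmd": "get_summary", "args": {"compact": True}},
    {"cmd": "get_telemetry", "args": {}},
]
results = run_batch(batch, toy_dispatch)
```

In the diff, the outer `run_playbook` result reports `"ok": True` even when an inner step fails, so callers need to inspect each entry in `results`.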
def routing_manifest() -> dict[str, Any]:
"""Machine-readable routing hints for /api/ai/capabilities."""
return {
"default_read": "find_entity",
"preferred_entry": "route_query",
"client_wrapper": "ShadowBrokerClient.ask",
"batch_playbook": "run_playbook",
"last_resort": "search_telemetry",
"expensive_commands": sorted(EXPENSIVE_COMMANDS),
"latency_tier_ms": LATENCY_TIER_MS,
"anti_patterns": [
"search_telemetry for known tail numbers, callsigns, owners, or MMSI",
"get_telemetry for routine reads — use get_layer_slice or run_playbook hot_snapshot",
"sequential send_command loops — use send_batch or run_playbook",
"/api/health for liveness — use channel_status",
"empty layers: [] on get_layer_slice — pass explicit layer names",
],
"recipes": [
{
"intent": "natural language question",
"use": "route_query → recommended cmd, or ShadowBrokerClient.ask()",
},
{
"intent": "known person/aircraft",
"use": "find_entity(query=...) or find_flights(owner=...)",
},
{
"intent": "news / telegram topic",
"use": "search_news(query=...)",
},
{
"intent": "near a point",
"use": "entities_near or brief_area",
},
{
"intent": "hot snapshot",
"use": "run_playbook(name=hot_snapshot)",
},
{
"intent": "post to infonet gate / join swarm",
"use": "ensure_infonet_ready then post_gate_message (full tier)",
},
{
"intent": "read encrypted gate traffic",
"use": "read_gate_messages(gate_id=infonet, decrypt=true)",
},
{
"intent": "dm another node",
"use": "send_dm(peer_id=..., plaintext=...) (full tier)",
},
],
"playbooks": {
name: {"description": spec.get("description", "")}
for name, spec in PLAYBOOKS.items()
},
"agent_surface": {
"primary": ["ask", "send_batch", "channel_status"],
"writes": [
"place_pin",
"add_watch",
"inject_data",
"place_analysis_zone",
"ensure_infonet_ready",
"post_gate_message",
"cast_vote",
"send_dm",
],
"infonet_reads": [
"infonet_status",
"list_gates",
"read_gate_messages",
"poll_dms",
],
},
}
def requires_expensive_confirm(cmd: str, args: dict[str, Any] | None) -> bool:
"""Return True when cmd is in EXPENSIVE_COMMANDS and args lacks confirm_expensive=True."""
if cmd not in EXPENSIVE_COMMANDS:
return False
if isinstance(args, dict) and args.get("confirm_expensive") is True:
return False
return True
def _compact_args(args: dict[str, Any], *, compact: bool) -> dict[str, Any]:
"""Copy args, defaulting compact=True when requested and not already set by the caller."""
out = dict(args)
if compact and "compact" not in out:
out["compact"] = True
return out
def _estimate_ms(cmd: str) -> int:
"""Latency estimate for cmd in milliseconds (100 when the command has no tier entry)."""
return int(LATENCY_TIER_MS.get(cmd, 100))
def _news_query(text: str) -> str:
"""Strip known lead-in phrases (e.g. "news about") and trailing punctuation from a news query."""
cleaned = text
for prefix in (
"news about",
"news on",
"telegram",
"headlines about",
"headlines on",
"latest on",
"search news for",
):
if cleaned.lower().startswith(prefix):
cleaned = cleaned[len(prefix):].strip()
return cleaned.strip(" ?.")
def route_query(
text: str = "",
*,
lat: float | None = None,
lng: float | None = None,
radius_km: float = 50,
compact: bool = True,
) -> dict[str, Any]:
"""Map natural-language intent to the fastest command (no LLM)."""
raw = str(text or "").strip()
lowered = raw.lower()
avoid = ["search_telemetry", "get_telemetry", "get_slow_telemetry"]
alternates: list[dict[str, Any]] = []
if not raw and lat is not None and lng is not None:
recommended = {
"cmd": "brief_area",
"args": _compact_args(
{"lat": lat, "lng": lng, "radius_km": radius_km},
compact=compact,
),
}
return {
"intent": "area_brief",
"recommended": recommended,
"alternates": [{"cmd": "entities_near", "args": recommended["args"]}],
"avoid": avoid,
"estimated_ms": _estimate_ms("brief_area"),
}
if not raw:
recommended = {"cmd": "get_summary", "args": _compact_args({}, compact=compact)}
return {
"intent": "discovery",
"recommended": recommended,
"alternates": [{"cmd": "channel_status", "args": {}}],
"avoid": avoid,
"estimated_ms": _estimate_ms("get_summary"),
}
cve_match = RE_CVE.search(raw)
if cve_match:
recommended = {
"cmd": "osint_lookup",
"args": _compact_args({"tool": "cve", "cve": cve_match.group(0).upper()}, compact=compact),
}
return _route_result("cve_lookup", recommended, avoid, alternates)
ip_match = RE_IPV4.search(raw)
if ip_match and ("ip" in lowered or "address" in lowered or lowered.count(".") >= 3):
recommended = {
"cmd": "osint_lookup",
"args": _compact_args({"tool": "ip", "ip": ip_match.group(0)}, compact=compact),
}
alternates.append({"cmd": "entity_expand", "args": {"type": "ip", "id": ip_match.group(0)}})
return _route_result("ip_lookup", recommended, avoid, alternates)
if "whois" in lowered or ("dns" in lowered and RE_DOMAIN.search(raw)):
domain = (RE_DOMAIN.search(raw) or re.search(r"\b([a-z0-9-]+\.[a-z]{2,})\b", raw, re.I))
tool = "whois" if "whois" in lowered else "dns"
domain_value = domain.group(0) if domain else raw
recommended = {
"cmd": "osint_lookup",
"args": _compact_args({"tool": tool, "domain": domain_value}, compact=compact),
}
return _route_result("domain_lookup", recommended, avoid, alternates)
if "sanction" in lowered or "ofac" in lowered:
recommended = {
"cmd": "osint_lookup",
"args": _compact_args({"tool": "sanctions", "query": raw}, compact=compact),
}
return _route_result("sanctions_lookup", recommended, avoid, alternates)
mmsi_match = RE_MMSI.search(raw)
if mmsi_match and any(k in lowered for k in ("mmsi", "ship", "vessel", "yacht", "boat", "maritime")):
recommended = {
"cmd": "find_ships",
"args": _compact_args({"mmsi": mmsi_match.group(0)}, compact=compact),
}
alternates.append({"cmd": "find_entity", "args": {"mmsi": mmsi_match.group(0), "entity_type": "ship"}})
return _route_result("maritime_identifier", recommended, avoid, alternates)
n_match = RE_N_NUMBER.search(raw)
if n_match:
reg = n_match.group(0).upper()
recommended = {
"cmd": "find_flights",
"args": _compact_args({"registration": reg}, compact=compact),
}
alternates.append({"cmd": "find_entity", "args": {"registration": reg, "entity_type": "aircraft"}})
return _route_result("tail_number", recommended, avoid, alternates)
# callsign tokens
tokens = re.findall(r"\b[A-Z0-9]{2,8}\b", raw.upper())
for token in tokens:
if token in KNOWN_CALLSIGNS or RE_CALLSIGN.fullmatch(token):
recommended = {
"cmd": "find_flights",
"args": _compact_args({"callsign": token}, compact=compact),
}
alternates.append({"cmd": "find_entity", "args": {"callsign": token, "entity_type": "aircraft"}})
return _route_result("callsign", recommended, avoid, alternates)
if any(k in lowered for k in ("news", "telegram", "headline", "headlines", "gdelt")):
recommended = {
"cmd": "search_news",
"args": _compact_args({"query": _news_query(raw), "limit": 10}, compact=compact),
}
alternates.append({
"cmd": "get_layer_slice",
"args": {"layers": ["telegram_osint", "news"], "limit_per_layer": 10, "compact": compact},
})
return _route_result("news_search", recommended, avoid, alternates)
if lat is not None and lng is not None and any(
k in lowered for k in ("near", "around", "within", "radius", "brief", "aoi")
):
recommended = {
"cmd": "brief_area",
"args": _compact_args(
{"lat": lat, "lng": lng, "radius_km": radius_km, "query": raw},
compact=compact,
),
}
alternates.append({
"cmd": "entities_near",
"args": {"lat": lat, "lng": lng, "radius_km": radius_km, "compact": compact},
})
return _route_result("area_brief", recommended, avoid, alternates)
if any(k in lowered for k in ("what changed", "updates", "delta", "since last")):
recommended = {"cmd": "what_changed", "args": _compact_args({}, compact=compact)}
return _route_result("incremental_poll", recommended, avoid, alternates)
if any(k in lowered for k in ("summary", "status", "layers populated", "what data")):
recommended = {"cmd": "get_summary", "args": _compact_args({}, compact=compact)}
alternates.append({"cmd": "channel_status", "args": {}})
return _route_result("discovery", recommended, avoid, alternates)
if any(k in lowered for k in ("recon", "whois", "dns lookup", "cve", "mac address")):
recommended = {
"cmd": "osint_tools",
"args": {},
}
return _route_result("recon_discovery", recommended, avoid, alternates)
entity_type = ""
if any(k in lowered for k in ("ship", "vessel", "yacht", "boat", "maritime", "carrier")):
entity_type = "ship"
elif any(k in lowered for k in ("jet", "plane", "flight", "aircraft", "helicopter", "tail")):
entity_type = "aircraft"
owner_hint = ""
if any(k in lowered for k in ("owner", "operated by", "'s jet", "'s yacht", "belongs to")):
owner_hint = raw
for phrase in ("where is", "find", "track", "locate", "jet", "yacht", "plane", "flight", "ship"):
owner_hint = re.sub(rf"\b{phrase}\b", "", owner_hint, flags=re.I).strip()
entity_args: dict[str, Any] = {"query": raw, "compact": compact}
if entity_type:
entity_args["entity_type"] = entity_type
if owner_hint and len(owner_hint) >= 3:
entity_args["owner"] = owner_hint
recommended = {
"cmd": "find_entity",
"args": _compact_args(entity_args, compact=compact),
}
alternates = [
{"cmd": "search_news", "args": {"query": raw, "limit": 10, "compact": compact}},
]
if any(k in lowered for k in ("near", "around")):
alternates.append({
"cmd": "search_telemetry",
"args": {"query": raw, "limit": 10, "confirm_expensive": True, "compact": compact},
})
return _route_result("entity_lookup", recommended, avoid, alternates)
def _route_result(
intent: str,
recommended: dict[str, Any],
avoid: list[str],
alternates: list[dict[str, Any]],
) -> dict[str, Any]:
cmd = str(recommended.get("cmd", ""))
return {
"intent": intent,
"recommended": recommended,
"alternates": alternates,
"avoid": avoid,
"estimated_ms": _estimate_ms(cmd),
}
def plan_playbook(name: str, args: dict[str, Any] | None = None) -> dict[str, Any]:
"""Resolve a named playbook to a command batch."""
playbook = str(name or "").strip().lower()
params = dict(args or {})
if not playbook:
return {"ok": False, "detail": "playbook name required"}
if playbook == "track_snapshot":
query = str(params.get("query", "") or params.get("name", "") or "").strip()
if not query:
return {"ok": False, "detail": "track_snapshot requires query"}
return {
"ok": True,
"playbook": playbook,
"description": "Resolve entity for tracking",
"batch": [
{
"cmd": "find_entity",
"args": {
"query": query,
"entity_type": params.get("entity_type", ""),
"fallback_search": True,
"compact": True,
},
}
],
}
if playbook == "area_brief":
lat = params.get("lat")
lng = params.get("lng")
if lat is None or lng is None:
return {"ok": False, "detail": "area_brief requires lat and lng"}
return {
"ok": True,
"playbook": playbook,
"description": "Brief an area of interest",
"batch": [
{
"cmd": "brief_area",
"args": {
"lat": lat,
"lng": lng,
"radius_km": params.get("radius_km", 50),
"query": params.get("query", ""),
"compact": True,
},
}
],
}
if playbook == "entity_recon":
query = str(params.get("query", "") or params.get("ip", "") or "").strip()
ip_match = RE_IPV4.search(query)
if not ip_match:
return {"ok": False, "detail": "entity_recon requires an IP in query"}
return {
"ok": True,
"playbook": playbook,
"description": "IP recon + entity graph",
"batch": [
{"cmd": "osint_lookup", "args": {"tool": "ip", "ip": ip_match.group(0), "compact": True}},
{"cmd": "entity_expand", "args": {"type": "ip", "id": ip_match.group(0)}},
],
}
spec = PLAYBOOKS.get(playbook)
if not spec:
known = sorted(PLAYBOOKS) + ["track_snapshot", "area_brief", "entity_recon"]
return {"ok": False, "detail": f"unknown playbook: {playbook}", "known": known}
return {
"ok": True,
"playbook": playbook,
"description": spec.get("description", ""),
"batch": [dict(item) for item in spec.get("batch", [])],
}
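The routing above is first-match-wins: the most specific patterns (CVE IDs, IP addresses) are tested before keyword checks, and everything else falls through to an entity lookup. A reduced sketch of that ordering follows. The two regexes and the `route` helper are assumptions made for illustration only; the module's real `RE_CVE` and `RE_IPV4` are defined outside this diff and may differ.

```python
import re
from typing import Any

# Hypothetical patterns for illustration; not the module's actual RE_CVE / RE_IPV4.
RE_CVE = re.compile(r"\bCVE-\d{4}-\d{4,}\b", re.I)
RE_IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def route(text: str) -> dict[str, Any]:
    """Return the first matching intent, checking the most specific patterns first."""
    raw = text.strip()
    lowered = raw.lower()

    cve = RE_CVE.search(raw)
    if cve:
        return {
            "intent": "cve_lookup",
            "cmd": "osint_lookup",
            "args": {"tool": "cve", "cve": cve.group(0).upper()},
        }

    ip = RE_IPV4.search(raw)
    if ip:
        return {
            "intent": "ip_lookup",
            "cmd": "osint_lookup",
            "args": {"tool": "ip", "ip": ip.group(0)},
        }

    # Keyword checks run only after every identifier pattern has failed.
    if any(k in lowered for k in ("news", "headline", "telegram")):
        return {"intent": "news_search", "cmd": "search_news", "args": {"query": raw}}

    # Fallback: treat the text as an entity name.
    return {"intent": "entity_lookup", "cmd": "find_entity", "args": {"query": raw}}


print(route("news about CVE-2024-3094")["intent"])  # cve_lookup
```

Because the CVE check runs first, a query that also contains the word "news" still resolves to the CVE lookup. Reordering the branches would change which intent wins for such mixed queries.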
@@ -0,0 +1 @@
"""Operator-initiated OSINT lookups (server-side proxies)."""
@@ -0,0 +1,492 @@
"""Server-side OSINT lookups (Osiris port, HTTPS outbound only)."""
from __future__ import annotations
import ipaddress
import json
import logging
import re
import socket
import time
from concurrent.futures import ThreadPoolExecutor, as_completed
from datetime import datetime, timezone
from typing import Any
from urllib.parse import quote
from services.network_utils import fetch_with_curl
from services.sanctions.ofac import match_exact, search_sanctions
from services.ssrf_guard import safe_get, validate_domain, validate_host
logger = logging.getLogger(__name__)
_IPV4_RE = re.compile(r"^(\d{1,3}\.){3}\d{1,3}$")
_IPV6_RE = re.compile(r"^[0-9a-fA-F:]+$")
_CVE_RE = re.compile(r"^CVE-\d{4}-\d{4,}$", re.I)
_ASN_RE = re.compile(r"^(AS)?\d+$", re.I)
def _now_iso() -> str:
return datetime.now(timezone.utc).isoformat()
def _json_get(url: str, *, timeout: float = 8.0, headers: dict[str, str] | None = None) -> Any:
resp = fetch_with_curl(url, timeout=timeout, headers=headers or {"Accept": "application/json"})
if resp.status_code != 200:
return None
try:
return resp.json()
except Exception:
return None
def _sanctions_hits(*values: str) -> list[dict[str, Any]] | None:
hits: list[dict[str, Any]] = []
seen: set[str] = set()
for value in values:
if not value or value in seen:
continue
seen.add(value)
entries = match_exact(value)
if entries:
hits.append({"matched_value": value, "entries": entries})
return hits or None
def lookup_ip(ip: str) -> dict[str, Any]:
if not _IPV4_RE.match(ip) and not _IPV6_RE.match(ip):
raise ValueError("Invalid IP format")
check = validate_host(ip.strip("[]"))
if not check.get("ok"):
raise ValueError(check.get("reason", "blocked IP"))
results: dict[str, Any] = {"ip": ip, "timestamp": _now_iso()}
fields = (
"status,message,continent,country,countryCode,region,regionName,city,zip,"
"lat,lon,timezone,isp,org,as,asname,mobile,proxy,hosting,query"
)
geo = _json_get(f"https://ip-api.com/json/{quote(ip)}?fields={fields}", timeout=5)
if isinstance(geo, dict) and geo.get("status") == "success":
results["geo"] = {
"country": geo.get("country"),
"country_code": geo.get("countryCode"),
"region": geo.get("regionName"),
"city": geo.get("city"),
"lat": geo.get("lat"),
"lon": geo.get("lon"),
"timezone": geo.get("timezone"),
"isp": geo.get("isp"),
"org": geo.get("org"),
"as_number": geo.get("as"),
"as_name": geo.get("asname"),
"is_mobile": geo.get("mobile"),
"is_proxy": geo.get("proxy"),
"is_hosting": geo.get("hosting"),
}
results["reputation"] = {
"is_proxy": bool(geo.get("proxy")),
"is_hosting": bool(geo.get("hosting")),
"is_mobile": bool(geo.get("mobile")),
"risk_level": "HIGH" if geo.get("proxy") else "MEDIUM" if geo.get("hosting") else "LOW",
}
sm = _sanctions_hits(geo.get("org") or "", geo.get("isp") or "", geo.get("asname") or "")
if sm:
results["sanctions_match"] = {"source": "OFAC SDN", "hits": sm}
return results
def lookup_dns(domain: str) -> dict[str, Any]:
if not validate_domain(domain):
raise ValueError("Invalid domain format")
results: dict[str, Any] = {"domain": domain, "records": {}, "timestamp": _now_iso()}
for rtype in ("A", "AAAA", "MX", "NS", "TXT", "CNAME", "SOA"):
data = _json_get(
f"https://dns.google/resolve?name={quote(domain)}&type={rtype}",
timeout=5,
)
answers = []
if isinstance(data, dict):
for ans in data.get("Answer") or []:
answers.append(
{
"name": ans.get("name"),
"type": ans.get("type"),
"ttl": ans.get("TTL"),
"data": ans.get("data"),
}
)
results["records"][rtype] = answers
a_records = results["records"].get("A") or []
mx_records = results["records"].get("MX") or []
ns_records = results["records"].get("NS") or []
results["summary"] = {
"ip_addresses": [r["data"] for r in a_records if r.get("data")],
"mail_servers": [r["data"] for r in mx_records if r.get("data")],
"nameservers": [r["data"] for r in ns_records if r.get("data")],
"total_records": sum(len(v) for v in results["records"].values()),
}
return results
def lookup_whois(domain: str) -> dict[str, Any]:
if not validate_domain(domain):
raise ValueError("Invalid domain format")
results: dict[str, Any] = {"domain": domain, "timestamp": _now_iso()}
rdap = _json_get(f"https://rdap.org/domain/{quote(domain)}", timeout=8)
if isinstance(rdap, dict):
entities = []
for ent in rdap.get("entities") or []:
vcard = ent.get("vcardArray")
name = org = None
if isinstance(vcard, list) and len(vcard) > 1:
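for row in vcard[1]:
# jCard properties are [name, params, type, value]; skip malformed rows instead of raising.
if not isinstance(row, list) or len(row) < 4:
continue
if row[0] == "fn":
name = row[3]
if row[0] == "org":
org = row[3]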
if name or org:
entities.append({"handle": ent.get("handle"), "roles": ent.get("roles"), "name": name, "org": org})
events = [
{"action": e.get("eventAction"), "date": e.get("eventDate")}
for e in (rdap.get("events") or [])
]
results["rdap"] = {
"handle": rdap.get("handle"),
"name": rdap.get("ldhName"),
"status": rdap.get("status"),
"events": events,
"nameservers": [ns.get("ldhName") for ns in (rdap.get("nameservers") or [])],
"entities": entities,
}
results["registration"] = next((e["date"] for e in events if e["action"] == "registration"), None)
results["expiration"] = next((e["date"] for e in events if e["action"] == "expiration"), None)
results["last_changed"] = next((e["date"] for e in events if e["action"] == "last changed"), None)
sm = _sanctions_hits(*(e.get("name") or "" for e in entities), *(e.get("org") or "" for e in entities))
if sm:
results["sanctions_match"] = {"source": "OFAC SDN", "hits": sm}
try:
res = safe_get(f"https://{domain}", timeout=5, headers={"User-Agent": "Shadowbroker-OSINT/1.0"})
headers = {}
for h in (
"server",
"x-powered-by",
"x-frame-options",
"strict-transport-security",
"content-security-policy",
"x-content-type-options",
"x-xss-protection",
"referrer-policy",
"permissions-policy",
):
val = res.headers.get(h)
if val:
headers[h] = val
score = sum(
1
for k in (
"strict-transport-security",
"content-security-policy",
"x-frame-options",
"x-content-type-options",
"referrer-policy",
)
if k in headers
) + (2 if "strict-transport-security" in headers else 0) + (2 if "content-security-policy" in headers else 0)
results["http"] = {"status": res.status_code, "headers": headers, "final_url": res.url}
results["security_score"] = {
"score": score,
# Five headers score 1 point each; HSTS and CSP each add 2 bonus points, so the ceiling is 9.
"max": 9,
"grade": "A" if score >= 5 else "B" if score >= 3 else "C" if score >= 1 else "F",
}
except Exception as exc:
logger.debug("WHOIS header probe failed for %s: %s", domain, exc)
return results
def lookup_certs(domain: str) -> dict[str, Any]:
if not validate_domain(domain):
raise ValueError("Invalid domain format")
resp = fetch_with_curl(
f"https://crt.sh/?q=%25.{quote(domain)}&output=json",
timeout=10,
headers={"User-Agent": "Shadowbroker-OSINT/1.0"},
)
if resp.status_code != 200:
return {"domain": domain, "certificates": [], "error": "crt.sh unavailable"}
try:
certs = resp.json()
except Exception:
certs = []
seen: set[str] = set()
subdomains: set[str] = set()
unique: list[dict[str, Any]] = []
for cert in (certs or [])[:200]:
key = f"{cert.get('common_name')}-{cert.get('serial_number')}"
if key in seen:
continue
seen.add(key)
for name in (cert.get("name_value") or "").split("\n"):
clean = name.strip().replace("*.", "")
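# Require a label boundary so "notexample.com" is not counted under "example.com".
if clean == domain or clean.endswith(f".{domain}"):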
subdomains.add(clean)
unique.append(
{
"id": cert.get("id"),
"issuer": cert.get("issuer_name"),
"common_name": cert.get("common_name"),
"not_before": cert.get("not_before"),
"not_after": cert.get("not_after"),
}
)
return {
"domain": domain,
"certificates": unique[:50],
"subdomains": sorted(subdomains)[:100],
"total_found": len(certs or []),
"timestamp": _now_iso(),
}
def lookup_threats(query: str | None = None) -> dict[str, Any]:
results: dict[str, Any] = {"timestamp": _now_iso()}
pulses = _json_get("https://otx.alienvault.com/api/v1/pulses/activity?limit=10", timeout=8)
if isinstance(pulses, dict):
results["pulses"] = [
{
"name": p.get("name"),
"description": (p.get("description") or "")[:200],
"created": p.get("created"),
"tags": (p.get("tags") or [])[:5],
"adversary": p.get("adversary"),
"indicators_count": p.get("indicator_count"),
}
for p in (pulses.get("results") or [])[:10]
]
if query:
if _IPV4_RE.match(query):
try:
tor_resp = fetch_with_curl("https://check.torproject.org/torbulkexitlist", timeout=5)
results["tor_exit_node"] = query in (tor_resp.text or "").splitlines() if tor_resp.status_code == 200 else None
except Exception:
results["tor_exit_node"] = None
otx = _json_get(f"https://otx.alienvault.com/api/v1/indicators/IPv4/{quote(query)}/general", timeout=5)
if isinstance(otx, dict):
results["otx"] = {
"reputation": otx.get("reputation"),
"pulse_count": (otx.get("pulse_info") or {}).get("count", 0),
"country": otx.get("country_name"),
"asn": otx.get("asn"),
}
elif validate_domain(query):
otx = _json_get(f"https://otx.alienvault.com/api/v1/indicators/domain/{quote(query)}/general", timeout=5)
if isinstance(otx, dict):
results["otx"] = {"pulse_count": (otx.get("pulse_info") or {}).get("count", 0)}
pulse_count = (results.get("otx") or {}).get("pulse_count", 0)
results["threat_level"] = "HIGH" if pulse_count > 5 else "MEDIUM" if pulse_count > 0 else "LOW"
return results
def lookup_bgp(query: str) -> dict[str, Any]:
results: dict[str, Any] = {"query": query, "timestamp": _now_iso()}
if _IPV4_RE.match(query):
data = _json_get(f"https://api.bgpview.io/ip/{quote(query)}", timeout=8)
if isinstance(data, dict) and data.get("status") == "ok":
results["ip"] = data.get("data")
results["type"] = "ip"
return results
if _ASN_RE.match(query):
asn_num = re.sub(r"^AS", "", query, flags=re.I)
asn = _json_get(f"https://api.bgpview.io/asn/{asn_num}", timeout=8)
prefixes = _json_get(f"https://api.bgpview.io/asn/{asn_num}/prefixes", timeout=8)
peers = _json_get(f"https://api.bgpview.io/asn/{asn_num}/peers", timeout=8)
if isinstance(asn, dict) and asn.get("status") == "ok":
results["asn"] = asn.get("data")
if isinstance(prefixes, dict) and prefixes.get("status") == "ok":
pdata = prefixes.get("data") or {}
results["prefixes"] = {
"ipv4": (pdata.get("ipv4_prefixes") or [])[:20],
"ipv6": (pdata.get("ipv6_prefixes") or [])[:10],
"total_v4": len(pdata.get("ipv4_prefixes") or []),
"total_v6": len(pdata.get("ipv6_prefixes") or []),
}
if isinstance(peers, dict) and peers.get("status") == "ok":
pdata = peers.get("data") or {}
results["peers"] = {
"upstream": (pdata.get("ipv4_peers") or [])[:10],
"total": len(pdata.get("ipv4_peers") or []),
}
results["type"] = "asn"
return results
raise ValueError("Unrecognized query format. Use IP address or AS number.")
def lookup_sanctions(query: str, *, schema: str | None = None, limit: int = 25) -> dict[str, Any]:
matches = search_sanctions(query, schema=schema, limit=limit)
return {
"query": query,
"schema": schema,
"total": len(matches),
"matches": matches,
"source": "OpenSanctions / US OFAC SDN",
"timestamp": _now_iso(),
}
def lookup_cve(cve: str) -> dict[str, Any]:
if not _CVE_RE.match(cve):
raise ValueError("Invalid CVE format")
cve_id = cve.upper()
data = _json_get(f"https://cveawg.mitre.org/api/cve/{quote(cve_id)}", timeout=8)
if isinstance(data, dict) and data.get("cveMetadata"):
meta = data["cveMetadata"]
desc = ""
for block in (data.get("containers") or {}).get("cna", {}).get("descriptions") or []:
if block.get("lang") == "en":
desc = block.get("value") or desc
return {"id": meta.get("cveId", cve_id), "description": desc or "No description.", "timestamp": _now_iso()}
fallback = _json_get(f"https://cve.circl.lu/api/cve/{quote(cve_id)}", timeout=8)
if isinstance(fallback, dict):
return {
"id": fallback.get("id", cve_id),
"description": fallback.get("summary") or "No description.",
"cvss": fallback.get("cvss"),
"references": (fallback.get("references") or [])[:5],
"timestamp": _now_iso(),
}
raise ValueError("CVE not found")
def lookup_mac(mac: str) -> dict[str, Any]:
clean = mac.strip().upper()
clean = re.sub(r"[^A-F0-9:-]", "", clean)
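# api.macvendors.com normally answers with the vendor name as plain text, which
# _json_get would discard as invalid JSON, so read the body and parse it leniently.
resp = fetch_with_curl(f"https://api.macvendors.com/{quote(clean)}", timeout=8)
if resp.status_code != 200:
return {"mac": clean, "vendor": "Not Found"}
body = (resp.text or "").strip()
try:
data = json.loads(body)
except ValueError:
data = body
if isinstance(data, dict):
return {"mac": clean, "vendor": data.get("company") or data.get("organization") or "Not Found"}
if isinstance(data, str) and data:
return {"mac": clean, "vendor": data}
return {"mac": clean, "vendor": "Not Found"}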
def lookup_github(username: str) -> dict[str, Any]:
user = _json_get(f"https://api.github.com/users/{quote(username)}", timeout=8)
if not isinstance(user, dict) or user.get("message") == "Not Found":
raise ValueError("GitHub user not found")
repos = _json_get(f"https://api.github.com/users/{quote(username)}/repos?per_page=10&sort=updated", timeout=8)
return {
"username": username,
"profile": {
"name": user.get("name"),
"bio": user.get("bio"),
"company": user.get("company"),
"location": user.get("location"),
"public_repos": user.get("public_repos"),
"followers": user.get("followers"),
"created_at": user.get("created_at"),
"html_url": user.get("html_url"),
},
"repos": [
{"name": r.get("name"), "language": r.get("language"), "stars": r.get("stargazers_count")}
for r in (repos or [])[:10]
if isinstance(r, dict)
],
"timestamp": _now_iso(),
}
def lookup_leaks(email: str) -> dict[str, Any]:
if "@" not in email or len(email) < 5:
raise ValueError("Invalid email")
# HIBP v3 requires an API key, so query the rate-limited public LeakCheck endpoint instead.
data = _json_get(f"https://leakcheck.io/api/public?check={quote(email)}", timeout=8)
if isinstance(data, dict):
return {
"email": email,
"found": bool(data.get("found")),
"sources": data.get("sources") or [],
"timestamp": _now_iso(),
}
return {"email": email, "found": False, "sources": [], "timestamp": _now_iso()}
def sweep_init(ip: str, cidr: int = 24) -> dict[str, Any]:
try:
addr = ipaddress.IPv4Address(ip)
except ValueError as exc:
raise ValueError("Invalid IPv4 address format") from exc
if addr.is_private or addr.is_loopback or addr.is_link_local or addr.is_reserved:
raise ValueError("Private and reserved IP ranges are not allowed")
if cidr < 24 or cidr > 32:
raise ValueError("CIDR must be between 24 and 32")
fields = "status,message,country,countryCode,region,regionName,city,lat,lon,isp,org,as,proxy,hosting"
geo = _json_get(f"https://ip-api.com/json/{quote(ip)}?fields={fields}", timeout=5)
if not isinstance(geo, dict) or geo.get("status") != "success":
raise ValueError(f"Geolocation failed: {(geo or {}).get('message', 'unknown')}")
return {
"center": {
"lat": geo.get("lat"),
"lng": geo.get("lon"),
"city": geo.get("city"),
"region": geo.get("regionName"),
"country": geo.get("country"),
"countryCode": geo.get("countryCode"),
"isp": geo.get("isp"),
"asn": geo.get("as") or "",
"org": geo.get("org") or "",
},
"target_ip": ip,
"cidr": cidr,
}
def _internetdb_lookup(ip: str) -> dict[str, Any] | None:
try:
resp = fetch_with_curl(
f"https://internetdb.shodan.io/{quote(ip)}",
timeout=4,
headers={"Accept": "application/json"},
)
if resp.status_code == 404:
return None
if resp.status_code != 200:
return None
return resp.json()
except Exception:
return None
def sweep_scan(subnet_start: str, cidr: int, *, max_workers: int = 12) -> dict[str, Any]:
"""Scan a /24-/32 via Shodan InternetDB (server-side proxy)."""
base = int(ipaddress.IPv4Address(subnet_start))
host_count = 2 ** (32 - cidr)
if host_count > 256:
raise ValueError("Subnet too large")
ips = [str(ipaddress.IPv4Address(base + i)) for i in range(host_count)]
devices: list[dict[str, Any]] = []
t0 = time.time()
with ThreadPoolExecutor(max_workers=max_workers) as pool:
futures = {pool.submit(_internetdb_lookup, ip): ip for ip in ips}
for fut in as_completed(futures):
ip = futures[fut]
data = fut.result()
if not data:
continue
devices.append(
{
"ip": data.get("ip") or ip,
"ports": data.get("ports") or [],
"hostnames": data.get("hostnames") or [],
"cpes": data.get("cpes") or [],
"vulns": data.get("vulns") or [],
"tags": data.get("tags") or [],
}
)
return {
"devices": devices,
"summary": {"total_hosts": host_count, "total_responsive": len(devices)},
"sweep_time_ms": int((time.time() - t0) * 1000),
}
def subnet_start_for(ip: str, cidr: int) -> str:
net = ipaddress.IPv4Network(f"{ip}/{cidr}", strict=False)
return str(net.network_address)
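The subnet arithmetic used by `subnet_start_for` and `sweep_scan` can be checked in isolation with the standard-library `ipaddress` module. The sketch below combines both steps into one hypothetical helper, `subnet_hosts`, which is not part of the module; the address comes from the 203.0.113.0/24 documentation range.

```python
import ipaddress


def subnet_hosts(ip: str, cidr: int) -> list[str]:
    """List every address in the /cidr network containing ip (network address included)."""
    if not 24 <= cidr <= 32:
        raise ValueError("CIDR must be between 24 and 32")
    # strict=False lets us pass a host address and still get its enclosing network.
    net = ipaddress.IPv4Network(f"{ip}/{cidr}", strict=False)
    base = int(net.network_address)
    # A /cidr network holds 2 ** (32 - cidr) addresses, so a /24 yields 256.
    return [str(ipaddress.IPv4Address(base + i)) for i in range(2 ** (32 - cidr))]


hosts = subnet_hosts("203.0.113.77", 30)
print(hosts)
# ['203.0.113.76', '203.0.113.77', '203.0.113.78', '203.0.113.79']
```

Like `sweep_scan`, the sketch enumerates the network and broadcast addresses too, so a /24 produces 256 lookups rather than 254.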
@@ -0,0 +1,135 @@
"""OpenClaw dispatch for the operator recon / OSINT lookup toolkit."""
from __future__ import annotations
from typing import Any
from services.osint import lookups
from services.osint_intel.resolve import ALLOWED_TYPES, resolve_entity
_OSINT_TOOLS: dict[str, str] = {
"ip": "ip",
"dns": "domain",
"whois": "domain",
"certs": "domain",
"threats": "query",
"bgp": "query",
"sanctions": "query",
"cve": "cve",
"mac": "mac",
"github": "username",
"leaks": "email",
"sweep_init": "ip",
}
_ENTITY_SCHEMAS = frozenset({
"Person",
"Organization",
"Company",
"Vessel",
"Airplane",
"LegalEntity",
})
def _require_str(args: dict[str, Any], *keys: str) -> str:
for key in keys:
value = str(args.get(key, "") or "").strip()
if value:
return value
joined = "/".join(keys)
raise ValueError(f"Missing required argument: {joined}")
def run_osint_lookup(tool: str, args: dict[str, Any]) -> dict[str, Any]:
"""Run a passive OSINT lookup (same backends as /api/osint/*)."""
name = str(tool or "").strip().lower().replace("-", "_")
if name not in _OSINT_TOOLS:
allowed = ", ".join(sorted(_OSINT_TOOLS))
raise ValueError(f"Unknown OSINT tool '{tool}'. Allowed: {allowed}")
if name == "ip":
return lookups.lookup_ip(_require_str(args, "ip", "query", "value"))
if name == "dns":
return lookups.lookup_dns(_require_str(args, "domain", "query", "value"))
if name == "whois":
return lookups.lookup_whois(_require_str(args, "domain", "query", "value"))
if name == "certs":
return lookups.lookup_certs(_require_str(args, "domain", "query", "value"))
if name == "threats":
query = str(args.get("query", "") or args.get("value", "") or "").strip() or None
return lookups.lookup_threats(query)
if name == "bgp":
return lookups.lookup_bgp(_require_str(args, "query", "asn", "value"))
if name == "sanctions":
query = _require_str(args, "query", "name", "value")
schema = str(args.get("schema", "") or "").strip() or None
if schema and schema not in _ENTITY_SCHEMAS:
allowed = ", ".join(sorted(_ENTITY_SCHEMAS))
raise ValueError(f"Invalid schema. Allowed: {allowed}")
limit = args.get("limit", 25)
try:
limit = int(limit)
except (TypeError, ValueError):
limit = 25
limit = max(1, min(100, limit))
return lookups.lookup_sanctions(query, schema=schema, limit=limit)
if name == "cve":
return lookups.lookup_cve(_require_str(args, "cve", "query", "value"))
if name == "mac":
return lookups.lookup_mac(_require_str(args, "mac", "query", "value"))
if name == "github":
return lookups.lookup_github(_require_str(args, "username", "user", "query", "value"))
if name == "leaks":
return lookups.lookup_leaks(_require_str(args, "email", "query", "value"))
if name == "sweep_init":
ip = _require_str(args, "ip", "query", "value")
cidr = args.get("cidr", 24)
try:
cidr = int(cidr)
except (TypeError, ValueError):
cidr = 24
return lookups.sweep_init(ip, cidr)
raise ValueError(f"Unhandled OSINT tool: {name}")
def run_osint_sweep(args: dict[str, Any]) -> dict[str, Any]:
"""Run subnet device discovery (Shodan InternetDB proxy). Requires full access tier."""
ip = _require_str(args, "ip", "query", "value")
cidr = args.get("cidr", 24)
try:
cidr = int(cidr)
except (TypeError, ValueError):
cidr = 24
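# Validate the target first (public address, /24 to /32) so a rejected request
# never triggers up to 256 outbound InternetDB lookups.
init = lookups.sweep_init(ip, cidr)
subnet = lookups.subnet_start_for(ip, cidr)
scan = lookups.sweep_scan(subnet, cidr)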
return {**init, **scan, "subnet": f"{subnet}/{cidr}"}
def run_entity_expand(args: dict[str, Any]) -> dict[str, Any]:
"""Expand an entity graph node (aircraft, vessel, IP, company, person, country)."""
entity_type = _require_str(args, "type", "entity_type")
entity_id = _require_str(args, "id", "entity_id", "query", "value")
props = {
"label": entity_id,
"registration": str(args.get("registration", "") or "").strip() or None,
"model": str(args.get("model", "") or "").strip() or None,
"icao24": str(args.get("icao24", "") or "").strip() or None,
}
props = {key: value for key, value in props.items() if value is not None}
return resolve_entity(entity_type, entity_id, props)
def osint_tool_help() -> dict[str, Any]:
"""Discovery metadata for agents."""
return {
"tools": sorted(_OSINT_TOOLS),
"entity_types": sorted(ALLOWED_TYPES),
"sanctions_schemas": sorted(_ENTITY_SCHEMAS),
"notes": {
"osint_lookup": "Passive lookups — same data as the Recon panel /api/osint/* routes.",
"osint_sweep": "Active subnet scan via Shodan InternetDB — requires full OpenClaw access tier.",
"entity_expand": "Build a relationship graph around aircraft, vessels, IPs, companies, people, or countries.",
},
}
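The dispatcher above resolves each tool's argument through a list of aliases (for example `ip`, then `query`, then `value`) and takes the first non-empty one. A table-driven variant of the same idea is sketched below as one possible alternative to the if-chain, not as the module's implementation. `HANDLERS`, `dispatch`, and the two stub handlers are hypothetical names standing in for the real `lookups.*` functions.

```python
from typing import Any, Callable


def require_str(args: dict[str, Any], *keys: str) -> str:
    """Return the first non-empty string found under any of keys."""
    for key in keys:
        value = str(args.get(key, "") or "").strip()
        if value:
            return value
    raise ValueError(f"Missing required argument: {'/'.join(keys)}")


# Stub handlers for illustration; the real code would call lookups.lookup_ip and so on.
def fake_lookup_ip(ip: str) -> dict[str, Any]:
    return {"ip": ip}


def fake_lookup_dns(domain: str) -> dict[str, Any]:
    return {"domain": domain}


# tool name -> (handler, accepted argument keys in priority order)
HANDLERS: dict[str, tuple[Callable[[str], dict[str, Any]], tuple[str, ...]]] = {
    "ip": (fake_lookup_ip, ("ip", "query", "value")),
    "dns": (fake_lookup_dns, ("domain", "query", "value")),
}


def dispatch(tool: str, args: dict[str, Any]) -> dict[str, Any]:
    """Normalize the tool name, then call its handler with the resolved argument."""
    name = str(tool or "").strip().lower().replace("-", "_")
    if name not in HANDLERS:
        allowed = ", ".join(sorted(HANDLERS))
        raise ValueError(f"Unknown OSINT tool '{tool}'. Allowed: {allowed}")
    handler, keys = HANDLERS[name]
    return handler(require_str(args, *keys))


print(dispatch("IP", {"query": " 8.8.8.8 "}))  # {'ip': '8.8.8.8'}
```

A table keeps the alias lists next to the handlers, but tools that need extra parsing (such as the `limit` clamp for sanctions or the `cidr` default for sweeps) would still need their own wrapper functions.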
@@ -0,0 +1 @@
"""Entity graph resolution (Osiris intel layer port)."""
@@ -0,0 +1,268 @@
"""Entity graph resolver (Python port of Osiris intel/server.js)."""
from __future__ import annotations
import logging
import re
import threading
import time
from typing import Any
from urllib.parse import quote
from services.network_utils import fetch_with_curl
from services.sanctions.ofac import match_exact, search_sanctions
logger = logging.getLogger(__name__)
ALLOWED_TYPES = frozenset({"aircraft", "vessel", "company", "person", "ip", "country"})
_WD_CACHE: dict[str, tuple[float, dict[str, Any]]] = {}
_WD_LOCK = threading.Lock()
_WD_TTL = 24 * 60 * 60
_WD_UA = "Shadowbroker-Intel/1.0 (ontology engine)"
def _dedup(nodes: list[dict], links: list[dict]) -> dict[str, Any]:
node_map: dict[str, dict] = {}
for n in nodes:
node_map[n["id"]] = n
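seen_links: set[tuple[str, str, str]] = set()
out_links: list[dict] = []
for link in links:
# Tuple key: plain concatenation could merge distinct (source, target, label) triples.
key = (link["source"], link["target"], link["label"])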
if key in seen_links:
continue
seen_links.add(key)
out_links.append(link)
return {"nodes": list(node_map.values()), "links": out_links}
def _wd_cache_get(key: str) -> dict[str, Any] | None:
with _WD_LOCK:
entry = _WD_CACHE.get(key)
if not entry:
return None
ts, data = entry
if time.time() - ts > _WD_TTL:
_WD_CACHE.pop(key, None)
return None
return data
def _wd_cache_set(key: str, data: dict[str, Any]) -> None:
with _WD_LOCK:
if len(_WD_CACHE) > 5000:
oldest = next(iter(_WD_CACHE))
_WD_CACHE.pop(oldest, None)
_WD_CACHE[key] = (time.time(), data)
def _add_sanctions(id_label: str, root_id: str, nodes: list, links: list) -> None:
for hit in search_sanctions(id_label, limit=3):
sid = f"sanction:{hit['id']}"
nodes.append(
{
"id": sid,
"label": hit["name"],
"type": "sanction",
"properties": {"programs": hit.get("programs"), "source": "OFAC SDN"},
}
)
links.append({"source": root_id, "target": sid, "label": "SANCTIONS MATCH"})
def _sparql(query: str) -> list[dict[str, Any]]:
url = f"https://query.wikidata.org/sparql?query={quote(query)}&format=json"
resp = fetch_with_curl(url, timeout=10, headers={"User-Agent": _WD_UA, "Accept": "application/sparql-results+json"})
if resp.status_code != 200:
return []
try:
data = resp.json()
except Exception:
return []
return data.get("results", {}).get("bindings", [])
def _wd_search(label: str) -> str | None:
url = (
"https://www.wikidata.org/w/api.php?action=wbsearchentities"
f"&search={quote(label)}&language=en&limit=1&format=json"
)
resp = fetch_with_curl(url, timeout=5, headers={"User-Agent": _WD_UA})
if resp.status_code != 200:
return None
try:
hits = resp.json().get("search") or []
except Exception:
return None
return hits[0]["id"] if hits else None
def _resolve_ip(id_value: str) -> dict[str, Any]:
cache_key = f"ip:{id_value}"
cached = _wd_cache_get(cache_key)
if cached:
return cached
root_id = f"ip:{id_value}"
nodes: list[dict] = [{"id": root_id, "label": id_value, "type": "ip", "properties": {}}]
links: list[dict] = []
geo = fetch_with_curl(
f"https://ip-api.com/json/{quote(id_value)}"
"?fields=status,country,countryCode,city,lat,lon,isp,org,as,asname,proxy,hosting,mobile",
timeout=8,
)
if geo.status_code == 200:
try:
data = geo.json()
except Exception:
data = {}
if data.get("status") == "success":
nodes[0]["properties"] = {
"proxy": bool(data.get("proxy")),
"hosting": bool(data.get("hosting")),
"mobile": bool(data.get("mobile")),
"source": "ip-api.com",
}
if data.get("isp"):
iid = f"company:{data['isp']}"
nodes.append({"id": iid, "label": data["isp"], "type": "company", "properties": {"role": "ISP"}})
links.append({"source": root_id, "target": iid, "label": "HOSTED_BY"})
if data.get("country"):
cid = f"country:{data['country']}"
nodes.append(
{
"id": cid,
"label": data["country"],
"type": "country",
"properties": {"code": data.get("countryCode")},
}
)
links.append({"source": root_id, "target": cid, "label": "LOCATED_IN"})
for val in (data.get("isp"), data.get("org"), data.get("asname")):
if val:
for entry in match_exact(val):
sid = f"sanction:{entry['id']}"
nodes.append({"id": sid, "label": entry["name"], "type": "sanction", "properties": {}})
links.append({"source": root_id, "target": sid, "label": "SANCTIONS MATCH"})
whois = fetch_with_curl(
f"https://stat.ripe.net/data/whois/data.json?resource={quote(id_value)}",
timeout=8,
)
if whois.status_code == 200:
try:
records = whois.json().get("data", {}).get("records") or []
except Exception:
records = []
for record in records:
for field in record:
if field.get("key") in ("netname", "NetName"):
nid = f"company:{field['value']}"
nodes.append({"id": nid, "label": field["value"], "type": "company", "properties": {"role": "Network"}})
links.append({"source": root_id, "target": nid, "label": "HOSTED_BY"})
result = _dedup(nodes, links)
_wd_cache_set(cache_key, result)
return result
def _resolve_company(id_value: str) -> dict[str, Any]:
cache_key = f"company:{id_value}"
cached = _wd_cache_get(cache_key)
if cached:
return cached
root_id = f"company:{id_value}"
nodes = [{"id": root_id, "label": id_value, "type": "company", "properties": {}}]
links: list[dict] = []
safe = re.sub(r'[^a-zA-Z0-9 \-._]', '', id_value).strip()
qid = _wd_search(safe)
filt = f"VALUES ?item {{ wd:{qid} }}" if qid else f'?item rdfs:label "{safe}"@en . ?item wdt:P31/wdt:P279* wd:Q4830453 .'
rows = _sparql(
f"""
SELECT ?countryLabel ?parentLabel ?ceoLabel WHERE {{
{filt}
OPTIONAL {{ ?item wdt:P17 ?country . }}
OPTIONAL {{ ?item wdt:P749 ?parent . }}
OPTIONAL {{ ?item wdt:P169 ?ceo . }}
SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}} LIMIT 10
"""
)
for row in rows:
if row.get("countryLabel", {}).get("value"):
cid = f"country:{row['countryLabel']['value']}"
nodes.append({"id": cid, "label": row["countryLabel"]["value"], "type": "country", "properties": {}})
links.append({"source": root_id, "target": cid, "label": "HEADQUARTERED"})
if row.get("parentLabel", {}).get("value"):
pid = f"company:{row['parentLabel']['value']}"
nodes.append({"id": pid, "label": row["parentLabel"]["value"], "type": "company", "properties": {}})
links.append({"source": root_id, "target": pid, "label": "PARENT ORG"})
if row.get("ceoLabel", {}).get("value"):
pid = f"person:{row['ceoLabel']['value']}"
nodes.append({"id": pid, "label": row["ceoLabel"]["value"], "type": "person", "properties": {"role": "CEO"}})
links.append({"source": root_id, "target": pid, "label": "CEO"})
_add_sanctions(id_value, root_id, nodes, links)
result = _dedup(nodes, links)
_wd_cache_set(cache_key, result)
return result
def _resolve_from_store(entity_type: str, id_value: str, props: dict[str, Any]) -> dict[str, Any]:
from services.fetchers._store import get_latest_data_subset_refs
root_id = f"{entity_type}:{id_value}"
# Copy props so the registration update below does not mutate the caller's dict.
nodes = [{"id": root_id, "label": props.get("label") or id_value, "type": entity_type, "properties": dict(props)}]
links: list[dict] = []
data = get_latest_data_subset_refs("flights", "ships", "military_flights", "tracked_flights")
if entity_type == "aircraft":
icao = (props.get("icao24") or id_value).lower()
for bucket in ("military_flights", "tracked_flights", "flights"):
for f in data.get(bucket) or []:
if str(f.get("icao24", "")).lower() == icao:
if f.get("country"):
cid = f"country:{f['country']}"
nodes.append({"id": cid, "label": f["country"], "type": "country", "properties": {}})
links.append({"source": root_id, "target": cid, "label": "REGISTERED_IN"})
if f.get("registration"):
nodes[0]["properties"]["registration"] = f["registration"]
break
elif entity_type == "vessel":
mmsi = str(props.get("mmsi") or id_value)
for ship in data.get("ships") or []:
if str(ship.get("mmsi")) == mmsi:
if ship.get("country"):
cid = f"country:{ship['country']}"
nodes.append({"id": cid, "label": ship["country"], "type": "country", "properties": {}})
links.append({"source": root_id, "target": cid, "label": "FLAG"})
break
_add_sanctions(id_value, root_id, nodes, links)
return _dedup(nodes, links)
def resolve_entity(entity_type: str, id_value: str, properties: dict[str, Any] | None = None) -> dict[str, Any]:
etype = (entity_type or "").lower().strip()
eid = (id_value or "").strip()
if etype not in ALLOWED_TYPES:
raise ValueError(f"Invalid type. Allowed: {', '.join(sorted(ALLOWED_TYPES))}")
if len(eid) < 2 or len(eid) > 200:
raise ValueError("Invalid id (2-200 chars)")
props = properties or {}
if etype == "ip":
return _resolve_ip(eid)
if etype == "company":
return _resolve_company(eid)
if etype in ("person", "country"):
# Persons and countries have no upstream lookup here; only the sanctions check applies.
root_id = f"{etype}:{eid}"
nodes = [{"id": root_id, "label": eid, "type": etype, "properties": {}}]
links: list[dict] = []
_add_sanctions(eid, root_id, nodes, links)
return _dedup(nodes, links)
return _resolve_from_store(etype, eid, props)
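Reviewer note: every resolver in this file returns a `{"nodes": [...], "links": [...]}` graph, and `_dedup` identifies a link by its source, target, and label. The sketch below uses a tuple as that identity, so two different triples can never produce the same key. `dedup_links` and the sample data are illustrative only and are not part of the change.

```python
from typing import Any


def dedup_links(links: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Drop repeated links, keeping the first occurrence of each triple."""
    seen: set[tuple[str, str, str]] = set()
    out: list[dict[str, Any]] = []
    for link in links:
        # A tuple keeps field boundaries, unlike plain string concatenation.
        key = (link["source"], link["target"], link["label"])
        if key in seen:
            continue
        seen.add(key)
        out.append(link)
    return out


# Invented sample links in the same shape the resolvers emit.
links = [
    {"source": "ip:1.2.3.4", "target": "company:Example ISP", "label": "HOSTED_BY"},
    {"source": "ip:1.2.3.4", "target": "company:Example ISP", "label": "HOSTED_BY"},
    {"source": "ip:1.2.3.4", "target": "country:Exampleland", "label": "LOCATED_IN"},
]
print(len(dedup_links(links)))
```

With concatenated strings, a link from `a` to `bc` and a link from `ab` to `c` with the same label would share one key; the tuple form keeps them apart.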
@@ -0,0 +1,81 @@
"""Operator opt-in for Polymarket/Kalshi outbound fetches (Global Threat Intercept)."""
from __future__ import annotations
import json
import logging
import os
import threading
from pathlib import Path
from typing import Any
logger = logging.getLogger(__name__)
_OPT_IN_FILE = Path(__file__).resolve().parent.parent / "data" / "prediction_markets_opt_in.json"
_OPT_IN_LOCK = threading.Lock()
def _env_flag(name: str) -> str:
return str(os.getenv(name, "")).strip().lower()
def get_prediction_markets_ui_opt_in() -> bool:
if not _OPT_IN_FILE.exists():
return False
try:
payload = json.loads(_OPT_IN_FILE.read_text(encoding="utf-8"))
return bool(payload.get("opted_in"))
# ValueError covers JSON and UTF-8 decode errors; AttributeError covers a non-dict payload.
except (OSError, ValueError, TypeError, AttributeError) as exc:
logger.warning("Prediction markets opt-in file unreadable: %s", exc)
return False
def set_prediction_markets_ui_opt_in(opted_in: bool) -> None:
_OPT_IN_FILE.parent.mkdir(parents=True, exist_ok=True)
with _OPT_IN_LOCK:
_OPT_IN_FILE.write_text(
json.dumps({"opted_in": bool(opted_in)}, indent=2),
encoding="utf-8",
)
def prediction_markets_env_forced_on() -> bool:
return _env_flag("PREDICTION_MARKETS_ENABLED") in {"1", "true", "yes", "on"}
def prediction_markets_env_forced_off() -> bool:
return _env_flag("PREDICTION_MARKETS_ENABLED") in {"0", "false", "no", "off"}
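def prediction_markets_fetch_enabled() -> bool:
"""True when the UI opt-in or env enables Polymarket/Kalshi pulls, unless env forces them off."""
# Honor the explicit env kill switch before consulting the UI opt-in.
if prediction_markets_env_forced_off():
return False
if get_prediction_markets_ui_opt_in():
return True
return prediction_markets_env_forced_on()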
def prediction_markets_status() -> dict[str, Any]:
ui_opted_in = get_prediction_markets_ui_opt_in()
env_on = prediction_markets_env_forced_on()
env_off = prediction_markets_env_forced_off()
env_override = None
if env_on:
env_override = "on"
elif env_off:
env_override = "off"
return {
"enabled": prediction_markets_fetch_enabled(),
"ui_opted_in": ui_opted_in,
"env_override": env_override,
"jitter": {
"scheduler_interval_minutes": int(
os.environ.get("PREDICTION_MARKETS_INTERVAL_MINUTES", "7")
),
"scheduler_jitter_seconds": int(
os.environ.get("PREDICTION_MARKETS_SCHEDULER_JITTER_S", "240")
),
"pre_fetch_jitter_seconds": float(
os.environ.get("PREDICTION_MARKETS_PRE_FETCH_JITTER_S", "90")
),
},
}
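Reviewer note: `prediction_markets_status` passes the raw environment values straight to `int()` and `float()`, so a malformed value such as `PREDICTION_MARKETS_INTERVAL_MINUTES=seven` raises `ValueError` while the status dict is being built. One possible guard is sketched below; `env_int` is a hypothetical helper and not part of this change.

```python
import os


def env_int(name: str, default: int) -> int:
    """Read an integer env var, falling back to `default` if unset or malformed."""
    raw = os.environ.get(name, "").strip()
    if not raw:
        return default
    try:
        return int(raw)
    except ValueError:
        return default


# A malformed value falls back to the default instead of raising.
os.environ["PREDICTION_MARKETS_INTERVAL_MINUTES"] = "seven"
print(env_int("PREDICTION_MARKETS_INTERVAL_MINUTES", 7))
```

Falling back silently keeps the status route available; logging a warning in the `except` branch would make the misconfiguration visible to the operator.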
+1 -1
@@ -213,7 +213,7 @@ def validate_privacy_core_startup(settings: Any | None = None) -> None:
attestation = privacy_core_attestation(snapshot)
state = str(attestation.get("attestation_state", "") or "").strip()
-if state == "attested_current":
+if state in {"attested_current", "development_override"}:
return
logger.critical(
+33
@@ -301,3 +301,36 @@ def get_region_dossier(lat: float, lng: float) -> dict:
dossier_cache[cache_key] = result
return result
def fetch_wikipedia_page_summary(title: str) -> dict | None:
"""Wikipedia REST summary for a page title (backend-proxied for #360)."""
trimmed = (title or "").strip()
if not trimmed:
return None
data = _fetch_local_wiki_summary(trimmed, "")
if not data.get("extract") and not data.get("description"):
return None
return {
"title": trimmed,
"description": data.get("description", ""),
"extract": data.get("extract", ""),
"thumbnail": data.get("thumbnail", ""),
"type": "standard",
}
def fetch_wikidata_sparql_bindings(sparql: str) -> list:
"""Run a Wikidata SPARQL query; returns bindings list (empty on failure)."""
trimmed = (sparql or "").strip()
if not trimmed:
return []
url = f"https://query.wikidata.org/sparql?query={quote(trimmed)}&format=json"
try:
res = fetch_with_curl(url, timeout=8, headers=_wikimedia_request_headers())
if res.status_code == 200:
bindings = res.json().get("results", {}).get("bindings", [])
return bindings if isinstance(bindings, list) else []
except (ConnectionError, TimeoutError, ValueError, KeyError, OSError) as e:
logger.warning("Wikidata SPARQL failed: %s", e)
return []
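Reviewer note: the bindings returned by `fetch_wikidata_sparql_bindings` follow the SPARQL 1.1 Query Results JSON layout, where each variable maps to an object carrying `type` and `value`. The sketch below shows one way a caller could flatten those rows into plain strings. `flatten_bindings` is a hypothetical helper and the sample row is invented for illustration.

```python
from typing import Any


def flatten_bindings(bindings: list[dict[str, Any]]) -> list[dict[str, str]]:
    """Reduce each binding row to {variable: value}, skipping malformed cells."""
    rows: list[dict[str, str]] = []
    for binding in bindings:
        row = {
            var: str(cell["value"])
            for var, cell in binding.items()
            if isinstance(cell, dict) and "value" in cell
        }
        rows.append(row)
    return rows


# Invented sample in the shape of one SPARQL result row.
sample = [
    {
        "countryLabel": {"type": "literal", "xml:lang": "en", "value": "Germany"},
        "parentLabel": {"type": "literal", "xml:lang": "en", "value": "Example Holding"},
    }
]
print(flatten_bindings(sample))
```

Variables that were left unbound by an `OPTIONAL` clause are simply absent from a row, so callers should read the flattened dict with `.get()`.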
@@ -0,0 +1,5 @@
"""Sentinel-2 road corridor freight trend analysis (DrishX engine port)."""
from .config import optional_deps_available, road_corridor_sat_enabled
__all__ = ["optional_deps_available", "road_corridor_sat_enabled"]
@@ -0,0 +1,4 @@
from .cli import main
if __name__ == "__main__":
raise SystemExit(main())
+53
@@ -0,0 +1,53 @@
"""CLI for manual road corridor analysis runs."""
from __future__ import annotations
import argparse
import logging
import sys
from .config import optional_deps_available, road_corridor_sat_enabled
from .credentials import sentinel_credentials_configured
from .pipeline import analyze_preset
from .presets import CORRIDOR_PRESETS
def main(argv: list[str] | None = None) -> int:
parser = argparse.ArgumentParser(description="Run Sentinel-2 road corridor truck trend analysis")
parser.add_argument("--preset", required=True, help="Preset id (e.g. laredo_i35)")
parser.add_argument("-v", "--verbose", action="store_true")
args = parser.parse_args(argv)
logging.basicConfig(level=logging.DEBUG if args.verbose else logging.INFO)
if not optional_deps_available():
print(
"Install optional deps: uv sync --extra road-corridor "
"(geopandas, osmnx, rasterio, sentinelhub, scikit-learn, imageio)",
file=sys.stderr,
)
return 2
if not road_corridor_sat_enabled() and not args.verbose:
print("Note: ROAD_CORRIDOR_SAT_ENABLED is off — CLI still runs for manual analysis.")
if not sentinel_credentials_configured():
print("Set SENTINEL_CLIENT_ID and SENTINEL_CLIENT_SECRET first.", file=sys.stderr)
return 2
valid = {p["id"] for p in CORRIDOR_PRESETS}
if args.preset not in valid:
print(f"Unknown preset {args.preset!r}. Choose from: {', '.join(sorted(valid))}", file=sys.stderr)
return 2
def progress(msg: str, pct: int | None = None) -> None:
suffix = f" ({pct}%)" if pct is not None else ""
print(f"{msg}{suffix}")
result = analyze_preset(args.preset, progress_cb=progress)
print(
f"Done: {result.get('total_detections', 0)} detections across "
f"{len(result.get('daily_counts') or [])} days — status={result.get('status')}"
)
return 0 if result.get("status") == "ok" else 1
if __name__ == "__main__":
raise SystemExit(main())
@@ -0,0 +1,41 @@
"""Configuration for Sentinel-2 road corridor trend analysis."""
from __future__ import annotations
import os
from pathlib import Path
_BACKEND_ROOT = Path(__file__).resolve().parents[2]
DATA_ROOT = Path(os.environ.get("ROAD_CORRIDOR_DATA_DIR", str(_BACKEND_ROOT / "data" / "road_corridors")))
CACHE_DIR = DATA_ROOT / "cache"
DETECTION_CROP_DIR = DATA_ROOT / "detection_crops"
STATE_PATH = DATA_ROOT / "_refresh_state.json"
DEFAULT_MONTHS = int(os.environ.get("ROAD_CORRIDOR_MONTHS", "2"))
DEFAULT_MAX_FRAMES = int(os.environ.get("ROAD_CORRIDOR_MAX_FRAMES", "6"))
SCHEDULED_PRESET_IDS = [
s.strip()
for s in os.environ.get("ROAD_CORRIDOR_SCHEDULED_PRESETS", "laredo_i35").split(",")
if s.strip()
]
def road_corridor_sat_enabled() -> bool:
return os.environ.get("ROAD_CORRIDOR_SAT_ENABLED", "").strip().lower() in {
"1",
"true",
"yes",
"on",
}
def optional_deps_available() -> bool:
try:
import geopandas # noqa: F401
import osmnx # noqa: F401
import rasterio # noqa: F401
import sentinelhub # noqa: F401
import sklearn # noqa: F401
return True
except ImportError:
return False
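Reviewer note: `optional_deps_available` answers only yes or no, so the CLI cannot say which package is missing. The sketch below reports the missing names using the standard library's `importlib.util.find_spec`. `missing_modules` is a hypothetical helper, and the module list is copied from the imports in `optional_deps_available`.

```python
from importlib.util import find_spec


def missing_modules(names: list[str]) -> list[str]:
    """Return the top-level module names that cannot be found on this interpreter."""
    return [name for name in names if find_spec(name) is None]


# Module names taken from the imports in optional_deps_available().
REQUIRED = ["geopandas", "osmnx", "rasterio", "sentinelhub", "sklearn"]
print(missing_modules(REQUIRED))
```

`find_spec` locates a module without importing it, which keeps the check cheap, but a package that is installed yet fails at import time (for example, because a native library is absent) would still be reported as present. The import-based check in the diff catches that case.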

Some files were not shown because too many files have changed in this diff.