Shadowbroker

mirror of https://github.com/BigBodyCobain/Shadowbroker.git synced 2026-07-15 08:17:23 +02:00

Author	SHA1	Message	Date
BigBodyCobain	a16f22ed34	Cover AI and SAR proxy auth routes	2026-05-29 08:15:06 -06:00
BigBodyCobain	41e35e4da2	Fail fast on short admin keys	2026-05-28 15:02:40 -06:00
BigBodyCobain	be3ab5823a	Fix self-host API key proxy auth	2026-05-28 01:54:23 -06:00
BigBodyCobain	ef52bd03d2	Harden private Infonet host checks	2026-05-28 01:26:48 -06:00
BigBodyCobain	017f383096	Fix BadHost path handling	2026-05-28 01:24:33 -06:00
Shadowbroker	41799f9891	feat(ci): switch GitLab mirror-to-github job to per-repo SSH deploy key (#331 ) * feat(ci): switch mirror-to-github job from PAT to per-repo SSH deploy key GitHub fine-grained PATs are capped at 366 days, classic PATs would need 'public_repo' (broader scope than needed). Per-repo SSH deploy keys are tighter: - Can ONLY push to BigBodyCobain/Shadowbroker (no access to anything else, not even other repos owned by the same account). - Never expire. - Rotating == one-click delete on github.com/.../settings/keys. Changes: - New CI/CD variable GITHUB_MIRROR_SSH_KEY (File, Protected) holding the ed25519 private half. Public half lives on the repo's deploy keys with write access enabled. - mirror-to-github before_script writes the key to ~/.ssh/id_ed25519, pins github.com host fingerprints (ed25519 + ecdsa + rsa from the 2023-03-24 rotation) into ~/.ssh/known_hosts so we never trust a MITM, then pushes via git@github.com:... instead of HTTPS. - Job rule now gates on GITHUB_MIRROR_SSH_KEY (the new var) instead of GITHUB_MIRROR_TOKEN (which never existed). After this lands, every commit pushed directly to GitLab main will mirror back to GitHub main automatically — closing the loop on bi-directional sync. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(secret-scan): exempt SSH known_hosts entries from leaked-key detection PR #331 introduced github.com host fingerprints pinned in .gitlab-ci.yml's mirror-to-github before_script. The scanner flagged them as embedded secrets and blocked CI: BLOCKED: Embedded secrets/tokens found in: .gitlab-ci.yml 133: github.com ssh-ed25519 AAAA... 135: github.com ssh-rsa AAAA... These are PUBLIC host keys — the whole point of pinning known_hosts is to publish the fingerprint widely so a MITM is detectable. They are documented at https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/githubs-ssh-key-fingerprints and committing them is the correct, secure practice. 
Fix: add a KNOWN_HOSTS_LINE regex to the content-scan block that recognizes `<host-or-ip> [salt] <algo> AAAA...` shape lines (the exact format used in ~/.ssh/known_hosts) and filters them out before flagging the file. Bare `ssh-rsa AAAA...` lines without a host prefix are still caught — only the host-key shape is exempt. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 04:22:09 -06:00
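The known_hosts exemption described in the commit above can be sketched as follows. This is a minimal illustration, not the scanner's actual code: the two patterns and the `flagged_lines` helper are assumptions written for this example, and the optional salt field is folded into the host token rather than modeled separately.

```python
import re

# Assumed shape of a known_hosts entry: <host-or-ip> <algo> AAAA...
# The host token also admits hashed hosts such as |1|salt|hash.
KNOWN_HOSTS_LINE = re.compile(
    r"^\s*[A-Za-z0-9.\-:\[\],|=+/]+\s+"
    r"(?:ssh-(?:ed25519|rsa)|ecdsa-sha2-nistp(?:256|384|521))\s+"
    r"AAAA[A-Za-z0-9+/=]+"
)

# Assumed generic detector for any embedded SSH public key material.
BARE_KEY = re.compile(
    r"\b(?:ssh-(?:ed25519|rsa)|ecdsa-sha2-nistp\d+)\s+AAAA[A-Za-z0-9+/=]+"
)


def flagged_lines(text: str) -> list[str]:
    """Return lines that carry key material but are not host-key shaped."""
    return [
        line
        for line in text.splitlines()
        if BARE_KEY.search(line) and not KNOWN_HOSTS_LINE.match(line)
    ]
```

A line with a host prefix is filtered out before flagging, while a bare `ssh-rsa AAAA...` line is still reported.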
Shadowbroker	a1af9c3595	fix(ci): wrap GitLab dind TLS env in docker context so buildx accepts it (#330) The build-backend and build-frontend jobs were failing immediately after identity verification finally allocated runners: $ docker buildx create --use --name multiarch --driver docker-container ERROR: could not create a builder instance with TLS data loaded from environment. Please use `docker context create <context-name>` to create a context for current environment and then create a builder instance with context set to <context-name> The dind service exports DOCKER_HOST=tcp://docker:2376 + DOCKER_TLS_CERTDIR=/certs, but buildx --driver docker-container doesn't read TLS from those env vars directly. Documented GitLab fix: create an empty `docker context` (which inherits the current TLS env), then bind buildx to that context name as a positional arg. After this lands, the multi-arch buildx jobs should actually build and push amd64 + arm64 images to registry.gitlab.com/bigbodycobain/shadowbroker/backend:latest registry.gitlab.com/bigbodycobain/shadowbroker/frontend:latest Surfaced by the post-verification pipeline at https://gitlab.com/bigbodycobain/Shadowbroker/-/pipelines/2550501798 Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 02:04:53 -06:00
Shadowbroker	c8a8fc56f8	chore(ci): bump comment in .gitlab-ci.yml to verify post-verification runner allocation (#329) Pipelines on the GitLab mirror have been instant-failing with 0 jobs and no started_at since the project was created — classic "shared runners not allocated to unverified free-tier accounts" pattern. The account is now identity-verified; this trivial comment bump exists solely to fire a fresh pipeline that confirms runners now pick up the build-backend and build-frontend jobs. If the resulting pipeline produces real jobs that build the multi-arch images and push them to registry.gitlab.com/bigbodycobain/shadowbroker/{backend,frontend}, the GitLab install path is at full parity with the GitHub one. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 01:54:08 -06:00
Shadowbroker	e6aba86ce1	chore(release): update v0.9.81 SHA256 digests after rebuild (#328) Re-cut v0.9.81 binaries from current main (which now includes the private gate + DM hashchain spool from #326 and the gate-directory test from #327). All three artifacts were signed with the same minisign updater key as the original v0.9.81 release, so existing v0.9.81 installs on Tauri auto-update accept the new bundles. Updated hashes (verified against released assets): - ShadowBroker_v0.9.81.zip f81f454bdc88e9a32c351df38212b8cfa624704d65764b971bb091eef62259c6 - ShadowBroker_0.9.81_x64-setup.exe 25e9a95d0d8ce959a7d08fe8e7406772ae24b596652793e81d1de5d02510a5a6 - ShadowBroker_0.9.81_x64_en-US.msi 34e655fc0c0f195ee4ac978f228a4b2b9d5565253b8771aca9ef4693409e9e70 Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 01:16:12 -06:00
Shadowbroker	d5609ac02f	test(infonet): cover gate directory renderer (landing + command variants) (#327) Adds the focused test Codex wrote alongside the gate-directory UI work that already shipped in #326 (the `renderGateDirectory` helper used both under the Infonet logo on the landing screen and as the output of the `gates` command in the terminal). The renderer itself is already on origin/main; this PR just ships the test so CI catches regressions to the dual-variant render. Verified locally: - frontend npm run test:ci -- src/__tests__/mesh/infonetShellGateDirectory.test.tsx → 1/1 pass Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-24 21:55:54 -06:00
Shadowbroker	1d7fa5185a	feat(infonet): private gate + DM hashchain spool with hardened propagation (#326 ) Private gate messages and offline DMs now ride the Infonet hashchain as ciphertext-only events, replicated across nodes via private transports (Tor onion / RNS / loopback) and decrypted only by parties holding the gate or recipient keys. Hashchain core (mesh_hashchain.py) ---------------------------------- * New ``append_private_gate_message`` and ``append_private_dm_message`` append paths with full signature verification, public-key binding, revocation check, and replay protection in a dedicated sequence domain (so a gate post does not consume the author's public broadcast sequence, and a DM cannot replay-block a public message at sequence=1). * Fork validation and full-chain validation now accept the gate signature compatibility variants — older signatures that canonicalize with/without epoch or reply_to still verify, so a re-sync from an older peer doesn't reject still-valid history. * DM hashchain spool: capped at 2 active sealed offline DMs per recipient mailbox, plus a per-(sender, recipient) cap so one prolific sender can't consume both slots. 1-hour TTL on the cap counter. Spool intentionally small — it's an offline bootstrap channel, not a persistent mailbox. * Rebuild-state preserves the gate sequence domain across reloads so a chain reload doesn't accidentally let an old gate sequence replay-collide on next append. Schema enforcement (mesh_schema.py) ----------------------------------- * Private gate + DM payloads have closed allowlists of fields. Plaintext keys (``message``, ``plaintext``, ``_local_plaintext``, ``_local_reply_to``) are explicit rejection-bait — they raise before the event ever touches the chain. * DM ciphertext + nonce must look like base64-ish sealed bytes; obvious base64-encoded plaintext shapes are rejected. 
* ``transport_lock`` required: DM hashchain spool requires ``private_strong``; gate accepts ``private``/``private_strong``/ ``rns``/``onion``. Defense-in-depth at the network layer (main.py + mesh_public.py) ---------------------------------------------------------------- * ``_infonet_sync_response_events`` now silently redacts private events (gate_message + dm_message) unless the request looks like a loopback / onion / RNS / private transport caller. If an operator accidentally exposes :8000 to the public internet, an external puller gets public events only — never ciphertext. * ``_sync_from_peer`` raises ``PeerSyncRateLimited`` for 429 (handled as 4-tuple return with retry_after_s) and ``PeerSyncHTTPError`` for other non-200 statuses (handled by ``_run_public_sync_cycle`` to honor server cooldown hints even outside the 429 path). DM relay hydration (main.py) ----------------------------- * New ``_hydrate_dm_relay_from_chain``: when accepted dm_message chain events arrive on a node, they get deposited into the local DM relay store with a deterministic sender_token_hash so re-sync of the same event is idempotent. Recipients see the ciphertext as a normal DM on their next poll and decrypt with their existing recipient key. Other surfaces -------------- * meshnode.bat / meshnode.sh now set ``MESH_INFONET_ALLOW_CLEARNET_SYNC= false`` and the participant runtime flags by default so a freshly spun-up node defaults to private-only sync. * InfonetTerminal/InfonetShell.tsx adds a gate directory renderer for the new private-gate workflow. * docker-compose.relay.yml binds the relay backend to 127.0.0.1:8000 only; Tor's hidden service forwards onion traffic into 127.0.0.1. Public clearnet :8000 stays off the network edge. 
Tests ----- * 7 new tests in test_private_gate_hashchain.py + test_private_dm_ hashchain.py covering: gate fork accepts ciphertext propagation, gate fork rejects plaintext, append rejects plaintext before normalize, append requires private_strong, append rejects non-sealed ciphertext shape, DM spool 2-per-recipient + 1-per-pair cap, DM hydration delivers to poll/claim. * Updated test_mesh_node_bootstrap_runtime.py covers 429 backoff via PeerSyncRateLimited 4-tuple AND PeerSyncHTTPError exception. * Updated test_s14b_public_sync_gate_filter.py + test_s9b_gate_store_ hydration.py + test_gate_write_cutover.py cover the new private redaction on public sync responses. * test_private_gate_hashchain.py + test_private_dm_hashchain.py: 10 passed locally. * Combined mesh-relevant suite (the 5 modified existing tests + 2 new): 17 passed. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-24 21:25:18 -06:00
Shadowbroker	fb97042c01	Update README.md Elaborated on Tor and Reticulum usage.	2026-05-24 11:08:05 -06:00
Shadowbroker	2616a6c9e3	Update README.md	2026-05-24 11:06:40 -06:00
Shadowbroker	a930497e14	fix(start-scripts): find bundled privacy_core.dll next to script (#319 ) (#324 ) * fix(start-scripts): find bundled privacy_core.dll next to script start.bat and start.sh only checked the source-tree DLL path (``privacy-core/target/release/privacy_core.dll``), not the bundled location where MSI/AppImage/DMG installers stage the library directly next to the script in backend-runtime/. Users running start.bat from inside an MSI install dir (a documented workaround when the desktop shell crashes) saw a scary "install Rust" warning even though the DLL was sitting right next to them. See issue #319 for the user-reported confusion. Fix: add a fallback check for the bundled location before falling through to the "build privacy-core from source" warning. Source-tree behavior unchanged — the source path is still preferred when present. Also re-stamps the v0.9.81 source archive: ``release_digests.json`` v0.9.81 zip hash updated to point at the rebuilt source archive that contains these script changes. MSI/EXE/sig hashes are unchanged (the scripts live at the repo root, not inside the desktop bundle). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(#319): bundle start.bat + start.sh into the MSI/EXE installers Follow-up to the start-script DLL fallback fix in the prior commit. ChrisMTheMan's report on #319 made it clear the workaround flow was: 1. MSI install crashes on launch (different bug, fixed in v0.9.81) 2. User goes looking for start.bat to launch the backend manually 3. start.bat isn't in their install dir, so they go fetch it from GitHub 4. They get a working script but it doesn't know about the bundled privacy_core.dll layout, so they see a scary "install Rust" warning The prior commit fixed step 4. This commit fixes step 3 — start.bat and start.sh now ship inside the MSI/EXE installers (staged into backend-runtime/ next to the privacy_core.dll they expect to find). 
After the rebuild lands, an MSI user looking for these scripts finds them right inside their install dir, already pointing at the correct bundled DLL location. What changed ------------ * ``build-backend-runtime.cjs`` now has a ``stageStartScripts()`` step that copies start.bat and start.sh from the repo root into the staged backend-runtime/. Preserves the executable bit on .sh under POSIX. * ``release_digests.json`` v0.9.81 block hashes refreshed for the rebuilt MSI / EXE / source-zip (the scripts being bundled changed the MSI/EXE contents; the source zip also includes the start-script fix from the prior commit). ShadowBroker_v0.9.81.zip 6.06 MB af8c87ccdece8fbb9aadc6be63cce10d3fcba74e6d87ef83289dda6d555fd270 ShadowBroker_0.9.81_x64_en-US.msi 122.4 MB 8977c9a1c54e1f0d030436be9c4e3d81d766cc0080699eb747649095f360c7ff ShadowBroker_0.9.81_x64-setup.exe 76.5 MB 4e866fa0423c0c2470ed32f4809167a7815dc23ee7762b69e95681c1f3a28250 Post-merge plan --------------- Force-move the v0.9.81 tag to this commit and replace ALL release assets on the GitHub release: zip, msi, exe, both .sig files, latest.json, SHA256SUMS.txt, release-manifest.json. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> v0.9.81	2026-05-23 21:34:59 -06:00
Shadowbroker	2dc1fcc778	release: v0.9.81 — signed auto-update + admin_session race fix (#323 ) What this release does ---------------------- 1. Establishes a fresh Tauri updater signing keypair. The previous keypair (pubkey baked into v0.9.79 / v0.9.8) had no matching private key on any maintainer-controlled machine — every prior release shipped without signatures, so auto-update has never actually worked. v0.9.81 rotates to a new pubkey and ships signed installers + latest.json so every release from here is a one-click upgrade. 2. Fixes the ``admin_session_required`` race in TopRightControls.tsx. The updateAction state used to default to ``auto_apply`` at React-init time. A click on the Update button before the async runtime probe completed went down the auto_apply path (POST /api/system/update), which throws ``admin_session_required`` on fresh sessions. Desktop installs now default to ``manual_download`` based on synchronous ``window.__TAURI__`` detection at useState init. One-time cost for current installs ---------------------------------- Anyone on v0.9.79 or v0.9.8 will see the in-app Update button still trigger the broken path on their existing install (the fix only takes effect once they're ON v0.9.81). The MANUAL DOWNLOAD button in the update dialog opens the GitHub release page, where they grab the .msi and run it. After that one manual hop, all future updates are seamless. Release artifacts ----------------- ShadowBroker_v0.9.81.zip 6.06 MB 42f8a51f9a5690d1e7349d90d8ecf2d163c9061d6cf90c69ee03647a785437ff ShadowBroker_0.9.81_x64_en-US.msi 122.4 MB a45b177c26c95d2b28d71592d7147e88ff4e104865f214fde11249d311ec9e25 ShadowBroker_0.9.81_x64-setup.exe 76.5 MB eca884b9d37eeccd0f11c91dcc6f6ae1b3609d9dee72bd73c37c9a427babfef2 Plus .sig files for the .msi and .exe, plus a signed latest.json for the Tauri updater endpoint. Sizes match the v0.9.79 / v0.9.8 reference shape within drift for the new TopRightControls patch. 
release_digests.json keeps v0.9.79 + v0.9.8 blocks alongside v0.9.81 so operators still on those versions continue to validate cleanly during the rollout transition. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-23 18:43:53 -06:00
Shadowbroker	896d1ae938	fix(#319,#296): v0.9.8 rebuild — bundle missing deps so backend launches (#322 ) Issues #319 and #296 reported that the installed v0.9.79 Windows MSI/EXE crashed on launch with: thread 'main' panicked ... failed to setup app: error encountered during setup hook: ShadowBroker cannot start: the bundled local backend failed to launch. technical detail: managed_backend_exited_early:exit code: 103 Root cause: ``backend/pyproject.toml`` declares ``defusedxml>=0.7.1`` and ``PySocks==1.7.1`` as runtime dependencies, but the venv used to build v0.9.79 (and the initial v0.9.8 publish) had both missing. When ``services/fetchers/aircraft_database.py`` does ``import defusedxml.ElementTree`` at startup, Python raises ``ModuleNotFoundError`` and uvicorn exits, which Tauri reports as ``managed_backend_exited_early``. Both packages now installed in the build venv. ``main.py`` imports end-to-end with only the expected ``plane_alert_db.json not found`` warning (runtime-state file, populated on first launch). Rebuilt artifacts on the maintainer's local machine: ShadowBroker_v0.9.8.zip 6.06 MB 183bb5cd62b9b9349d95df5ef7696cb6ca810ab4b991fa9dab6f898af4c7a175 ShadowBroker_0.9.8_x64_en-US.msi 122.4 MB fe22f9d51e4360d74c18a7250c2fbb9ed4fa4c7a884b3ac0d04a21115466386b ShadowBroker_0.9.8_x64-setup.exe 76.5 MB 94a0309862e9c81c92cdcbfea8eec9dbb97eef19ded82b26217b397defbc810c After this merges, the v0.9.8 tag will be force-moved to this commit and the GitHub release assets replaced so the integrity chain validates against the working installers instead of the broken ones. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> v0.9.8	2026-05-23 16:48:45 -06:00
Shadowbroker	8dfa6a7199	release: v0.9.8 — Cumulative Fuel/CO2, AIS Resilience, Data-Layer Repair (#321 ) Bumps every hardcoded 0.9.79 → 0.9.8 across backend, frontend, desktop-shell, helm, lockfiles, test fixtures. Refreshes the in-app ChangelogModal HEADLINE_FEATURES, NEW_FEATURES, and BUG_FIXES with the v0.9.8 highlights. Release artifacts built locally and hashed into release_digests.json: ShadowBroker_v0.9.8.zip 6.06 MB d506f6b8462ccb12096f0cd9462233be58928094240416b65fb3127bdd1f3820 ShadowBroker_0.9.8_x64_en-US.msi 122.4 MB d4be4cb68c3e6409fff54c225acdcdd08e27d5d6d2b31616d78d2a4f6812991d ShadowBroker_0.9.8_x64-setup.exe 76.5 MB 1115d1f5cf37edd03ea2c21d821c7626e1bf3319c990402aaa0293bca46fea67 Sizes match the v0.9.79 reference shape (5.76 MB / 117 MB / 72.9 MB) within expected drift for new code. The .zip is a `git archive` of the v0.9.8 source tree (matching v0.9.79's approach). Audit confirms no .env, .key, .venv-dir, or cache files leaked into the backend-runtime bundle. Python 3.11.9 + 199 site-packages + privacy_core all staged correctly. Headline changes since v0.9.79: * Cumulative fuel/CO2 per flight (#317) — running totals since first observation, not just per-hour rate. * AIS maritime resilience (#314, #316) — outage banner + AISHub REST fallback when AISStream WebSocket primary is offline. * Data-layer repair (#311, #312) — UAP fallback respects the 60-day cutoff; GPS jamming threshold tuning + nac_p=0 inclusion so the layer actually fires. * Per-flight source attribution (#313) — source field on every record. * Cross-node DM mailbox replication (#309). * Infonet sync HTTP 429 honored (#310). Test fixtures updated: * test_per_operator_outbound_attribution.py — added v0.9.8 UA strings to the banned-aggregate-literals list (alongside v0.9.79). * updateRuntime.test.ts — bumped asset filename fixtures to v0.9.8. release_digests.json keeps the v0.9.79 block alongside v0.9.8 so operators still on 0.9.79 validate cleanly during the rollout. 
The accent narrowing fix in ChangelogModal (one feature uses 'purple', two use 'cyan' so the renderer's `accent === 'purple'` comparison still type-checks) is included. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-23 16:24:20 -06:00
Shadowbroker	ef6b8ec181	fix(desktop-build): strip layout.tsx force-dynamic on CRLF checkouts too (#320) build-frontend-export.cjs stages a desktop-only frontend export tree and strips the ``force-dynamic`` + ``revalidate`` directives from ``frontend/src/app/layout.tsx`` so Next's ``output: "export"`` can prerender every route. The strip regexes only matched LF (``\n``). Any Windows checkout without ``core.autocrlf=input`` has CRLF line endings, the strip silently no-op'd, and the desktop build failed at the static-export step: Error: Page with `dynamic = "force-dynamic"` couldn't be exported. `output: "export"` requires all pages be renderable statically because there is no runtime server to dynamically render routes in this output format. Export encountered an error on /_not-found/page: /_not-found This affects every Windows contributor who hasn't normalized line endings locally. Replacing each ``\n`` in the strip regexes with ``\r?\n`` makes the strip CRLF-tolerant; LF behavior is unchanged. Verified by running both regexes against the actual layout.tsx (302 bytes removed, force-dynamic + revalidate both gone) and against a synthetic LF input (296 bytes removed, same outcome). Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-23 16:07:11 -06:00
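The CRLF failure mode in the commit above is easy to reproduce in isolation. The sketch below uses Python rather than the build script's JavaScript, and both patterns are hypothetical stand-ins for the real strip regexes, which are not shown in the commit message.

```python
import re

# Hypothetical stand-ins for the strip regexes; only the line-ending
# handling is the point of this example.
LF_ONLY = re.compile(r'export const dynamic = "force-dynamic";\n')
CRLF_TOLERANT = re.compile(r'export const dynamic = "force-dynamic";\r?\n')

BODY = "export default function Layout() {}"
lf_source = f'export const dynamic = "force-dynamic";\n{BODY}\n'
crlf_source = lf_source.replace("\n", "\r\n")

# On a CRLF checkout the LF-only pattern never matches, so the
# directive silently stays in the file.
lf_only_on_crlf = LF_ONLY.sub("", crlf_source)

# The \r?\n form strips the directive for both line-ending styles.
tolerant_on_crlf = CRLF_TOLERANT.sub("", crlf_source)
tolerant_on_lf = CRLF_TOLERANT.sub("", lf_source)
```

The tolerant pattern removes one more byte per stripped line on CRLF input, which is consistent with the small byte-count difference the commit reports between the two runs.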
Shadowbroker	dcea325fba	Merge pull request #317 from BigBodyCobain/feat/cumulative-fuel-burn feat(flights): cumulative fuel burned + CO2 emitted per flight	2026-05-23 08:09:34 -06:00
BigBodyCobain	03b8053617	feat(flights): cumulative fuel burned + CO2 emitted per flight Pre-fix the emissions tooltip only showed the per-hour rate — what most users actually want is the cumulative amount burned. This adds running totals computed by multiplying the model-based rate by the elapsed observation time since we first saw the airframe. New module ``flight_observations.py``: * Tracks first_seen_at + last_seen_at per icao24 hex. * Re-opens a fresh session when an aircraft is unseen for > 15 min (treated as a new flight — landed and took off, or transited a dead zone). Prevents the cumulative counter from resetting mid-flight if the trail-rendering cache prunes the trail. * Clamps elapsed time to 24h max so clock skew can't produce comically large numbers. * Pruned every 5 min via a new scheduler job (mirrors ais_prune cadence). flights.py + military.py emission enrichment now also attaches: * observed_seconds — how long we've been tracking this airframe. * fuel_gallons_burned — rate * elapsed_h. * co2_kg_emitted — rate * elapsed_h. The existing per-hour rate fields stay in the dict for backward compat and are shown as small secondary context in the tooltip. Frontend EmissionsEstimateBlock (NewsFeed.tsx) now prominently shows the cumulative totals with the rate as smaller context underneath plus "Observed in flight for Xh Ym". When observed_seconds is 0 (first refresh) it renders "Just observed · totals will appear on next refresh" instead of a misleading "0 gal". 12 backend tests cover record/accumulate/reset, the 24h clamp, prune, case-insensitive key normalization, and end-to-end emission integration in _classify_and_publish. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-23 07:56:23 -06:00
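The session tracking and cumulative totals described above can be sketched as follows. This is an illustration under stated assumptions, not the contents of ``flight_observations.py``: the `Observation` class, `record`, and `cumulative` names are invented for the example, and only the 15-minute gap, the 24h clamp, and the rate times elapsed hours formula come from the commit message.

```python
from dataclasses import dataclass

SESSION_GAP_S = 15 * 60      # unseen longer than this: treat as a new flight
MAX_ELAPSED_S = 24 * 3600    # clamp so clock skew cannot inflate totals


@dataclass
class Observation:
    first_seen_at: float
    last_seen_at: float


# Hypothetical in-memory store keyed by normalized icao24 hex.
_observations: dict[str, Observation] = {}


def record(icao24: str, now: float) -> float:
    """Record a sighting and return the clamped observed seconds."""
    key = icao24.strip().lower()  # case-insensitive key normalization
    obs = _observations.get(key)
    if obs is None or now - obs.last_seen_at > SESSION_GAP_S:
        obs = Observation(first_seen_at=now, last_seen_at=now)
        _observations[key] = obs
    else:
        obs.last_seen_at = now
    elapsed = obs.last_seen_at - obs.first_seen_at
    return min(max(elapsed, 0.0), MAX_ELAPSED_S)


def cumulative(rate_per_hour: float, observed_seconds: float) -> float:
    """Running total: per-hour rate multiplied by elapsed hours."""
    return rate_per_hour * observed_seconds / 3600.0
```

The same `cumulative` helper would serve both the fuel and CO2 figures, since each is a per-hour rate scaled by the same elapsed time.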
Shadowbroker	20807a2d62	Merge pull request #316 from BigBodyCobain/feat/aishub-fallback feat(ais): AISHub REST fallback when AISStream is offline (20-min polling)	2026-05-23 07:42:56 -06:00
Shadowbroker	79fbf9741b	Merge pull request #314 from BigBodyCobain/feat/ais-upstream-health feat(ais): surface AISStream upstream outage instead of failing silently	2026-05-23 07:12:37 -06:00
BigBodyCobain	a2f5d62926	feat(ais): AISHub REST fallback when AISStream WebSocket is offline When stream.aisstream.io is unreachable (cert outage, server down — see 2026-05-20 and 2026-05-23 events) the ships layer goes empty. This adds a slow REST fallback to data.aishub.net so the layer stays populated in degraded mode. Behavior: * Opt-in via AISHUB_USERNAME (free registration at aishub.net/api). Without the env var the fetcher is a no-op. * Default poll cadence 20 min — well inside their free-tier limits, gives ships time to move enough to look "alive". Configurable via AISHUB_POLL_INTERVAL_MINUTES, clamped to [1, 360]. * Internal gate: skips the poll entirely when the WebSocket primary is currently connected. Stomping fresh live data with 20-min-old REST data would be worse than leaving it alone. * Vessels merge into the shared _vessels dict with source="aishub" so the existing UI / health tooling can attribute the provider. * Live data wins races: if a WebSocket update for the same MMSI lands in the last 1s, we don't overwrite with the slower REST record. Scheduler job runs every AISHUB_POLL_INTERVAL_MINUTES minutes alongside the existing ais_prune job in data_fetcher.py. 24 tests cover gating (no-username, primary-connected), response parsing (success / error / empty / malformed / unexpected shape), record normalization (sentinels, missing fields, range checks, AIS @ padding), poll interval clamping, and end-to-end merge with live-data-wins. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-23 07:00:32 -06:00
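Two of the behaviors listed above, the poll interval clamp and the live-data-wins merge, can be sketched like this. The function names, the `updated_at` field, and the fallback to the default on an unparsable value are assumptions made for the example; the commit message confirms only the 20-minute default, the [1, 360] clamp, the ``source="aishub"`` tag, and the 1s window.

```python
def clamp_poll_minutes(raw, default: int = 20) -> int:
    """Parse AISHUB_POLL_INTERVAL_MINUTES and clamp it to [1, 360]."""
    try:
        minutes = int(raw)
    except (TypeError, ValueError):
        return default  # assumption: bad input falls back to the default
    return max(1, min(360, minutes))


def merge_rest_record(vessels: dict, mmsi: int, record: dict,
                      now: float, live_window_s: float = 1.0) -> bool:
    """Merge a REST record unless a live update landed very recently.

    Returns True when the REST record was written.
    """
    current = vessels.get(mmsi)
    if (
        current is not None
        and current.get("source") != "aishub"
        and now - current.get("updated_at", 0.0) <= live_window_s
    ):
        return False  # live WebSocket data wins the race
    vessels[mmsi] = {**record, "source": "aishub", "updated_at": now}
    return True
```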
BigBodyCobain	5e0b2c037e	feat(ais): surface upstream outage instead of failing silently On 2026-05-23, stream.aisstream.io went fully offline (TCP timeouts on port 443). The backend kept respawning the node WebSocket proxy every few seconds with nothing arriving. From the operator's POV the ships layer silently went empty — no banner, no log surfacing, no way to tell whether it was their config / network / viewport filter / upstream. Backend: * ais_proxy_status() now also returns: - connected (bool): true when a vessel message arrived in last 60s - last_msg_age_seconds (int | None) - proxy_spawn_count (int): proxy respawns — sustained growth without connected means upstream is dead * /api/health escalates top status to "degraded" when AIS_API_KEY is set but the proxy is currently disconnected. Existing degraded_tls signal preserved. Frontend: * useAisUpstreamHealth hook polls /api/health every 30s, derives the outage state. Defensively only reports outage once spawn_count > 0 so operators who haven't opted in don't see the banner. * AisUpstreamBanner component renders a dismissible amber notice "Ship data temporarily unavailable — AISStream upstream is offline" mounted on the main app shell. 7 backend tests pin the status-shape contract and the /api/health escalation behavior in both with-key and without-key configurations. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-23 06:38:05 -06:00
Shadowbroker	69ef231e5a	Merge pull request #313 from BigBodyCobain/feat/flight-source-attribution feat(flights): stamp source attribution on every flight record	2026-05-23 06:29:31 -06:00
Shadowbroker	7a5f47ca9e	Merge pull request #312 from BigBodyCobain/fix/gps-jamming-thresholds fix(gps-jamming): count nac_p=0 + lower thresholds so layer actually fires	2026-05-23 06:29:20 -06:00
Shadowbroker	5cd49542bf	Merge pull request #311 from BigBodyCobain/fix/uap-fallback-cutoff fix(uap): stop HF fallback from serving 3-year-old NUFORC sightings	2026-05-23 06:29:08 -06:00
BigBodyCobain	f14d4feb6d	feat(flights): stamp source attribution on every flight record Pre-fix, adsb.lol records (the primary source for most flights) carried no source marker. OpenSky records got is_opensky: True and supplementals got supplemental_source, so any UI inspecting source labels saw OpenSky/airplanes.live records as explicitly tagged and adsb.lol records as "unlabeled" — making it look like adsb.lol wasn't being used at all even though it's the primary source. Changes: * _fetch_adsb_lol_regions stamps source="adsb.lol" on each aircraft before returning, so the tag survives the OpenSky dedupe-by-hex merge. * OpenSky records get source="OpenSky" (alongside is_opensky=True for back-compat). * military fetcher tags source on both adsb.lol and airplanes.live records before they're merged, and propagates source into the military_flights and uavs output dicts. * _classify_and_publish promotes the explicit source field into the published flight dict. Falls back to legacy supplemental_source if source is absent. Final fallback "adsb.lol" preserves prior behavior for any caller synthesizing records without going through a fetcher. 8 new tests cover the published-dict propagation, OpenSky tagging, supplemental fallback, explicit-wins precedence, default behavior, the adsb.lol regional fetcher tagging, and the military output dict. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-23 06:14:39 -06:00
BigBodyCobain	19a8560a80	fix(gps-jamming): count nac_p=0 + lower thresholds so the layer actually fires Three stacked filters meant the gps_jamming layer almost never lit up: 1. nac_p == 0 aircraft were dropped on the theory that "0 = old transponder." That's only half right — modern Mode-S Enhanced Surveillance transponders also fall back to nac_p=0 when they lose GPS lock entirely, which IS the jamming signature we want to catch. Discarding them was discarding the strongest signal. None (no field at all — typical for OpenSky-sourced records) is still skipped because absence-of-data isn't evidence. 2. GPS_JAMMING_MIN_AIRCRAFT was 5 per 1°x1° cell. Jamming hotspots (eastern Med, Russia/Ukraine border, Iran/Iraq) tend to have sparser traffic because pilots avoid them. Lowered to 3. 3. GPS_JAMMING_MIN_RATIO was 0.30. Combined with the (preserved) -1 noise cushion that made the effective bar high. Lowered to 0.20. The 1-aircraft noise cushion is intact so a single quirky transponder still can't flag a zone alone. Also extracted the detector loop into a pure ``detect_gps_jamming_zones()`` function at module scope so it's testable in isolation (was previously inlined inside ``_classify_and_publish``). The public signature accepts threshold overrides for ad-hoc re-tuning without code edits. 16 new tests cover nac_p=0 inclusion, None-skip preservation, MIN_AIRCRAFT lowering, MIN_RATIO lowering, noise cushion preservation, constant pinning, override behavior, lon/lng key compatibility, and robustness to empty/None inputs. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-22 23:40:18 -06:00
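A rough sketch of a grid-based detector with the thresholds named above follows. It is not the real ``detect_gps_jamming_zones()``: the function name, the `NAC_P_DEGRADED_BELOW` cutoff, and the exact way the 1-aircraft cushion enters the ratio are assumptions. Only the 1°x1° cells, the nac_p=0 inclusion, the None skip, the lon/lng key compatibility, and the 3 / 0.20 thresholds are taken from the commit message.

```python
import math
from collections import defaultdict

GPS_JAMMING_MIN_AIRCRAFT = 3
GPS_JAMMING_MIN_RATIO = 0.20
NAC_P_DEGRADED_BELOW = 8  # assumption: cutoff for "degraded" accuracy


def detect_zones(aircraft, min_aircraft=GPS_JAMMING_MIN_AIRCRAFT,
                 min_ratio=GPS_JAMMING_MIN_RATIO):
    """Group aircraft into 1x1 degree cells and flag degraded cells."""
    cells = defaultdict(lambda: [0, 0])  # cell -> [total, degraded]
    for ac in aircraft or []:
        nac_p = ac.get("nac_p")
        lat = ac.get("lat")
        lon = ac.get("lon", ac.get("lng"))
        # A missing nac_p is absence of data, not evidence; nac_p == 0 counts.
        if nac_p is None or lat is None or lon is None:
            continue
        cell = (math.floor(lat), math.floor(lon))
        cells[cell][0] += 1
        if nac_p < NAC_P_DEGRADED_BELOW:
            cells[cell][1] += 1

    zones = []
    for cell, (total, degraded) in sorted(cells.items()):
        # Assumed cushion: one quirky transponder never flags a cell alone.
        adjusted = max(degraded - 1, 0)
        if total >= min_aircraft and adjusted / total >= min_ratio:
            zones.append({"cell": cell, "total": total, "degraded": degraded})
    return zones
```

Accepting the thresholds as keyword arguments mirrors the override behavior the commit describes, so a caller can re-tune without editing the constants.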
BigBodyCobain	0d0e009867	fix(uap): stop HF fallback from serving 3-year-old NUFORC sightings The UAP sightings layer is sourced from a live scrape of nuforc.org with a static Hugging Face CSV mirror (kcimc/NUFORC) as a fallback. The fallback parsed every row, sorted by occurred-desc, and took the top 250 — with no date cutoff. The HF mirror is a third-party snapshot that hasn't been refreshed in years, so the "newest 250" rows it returns are from ~2022-23. When the live path fails (Cloudflare 403, curl disabled on Windows, wdtNonce regex stale, etc.) users see a map full of sightings from 3 years ago, labeled as the "last 60 days" layer. Changes: * HF fallback now applies the same 60-day cutoff the live path uses. Rows outside the window are dropped before take-top-N. If the mirror has nothing inside the window the fallback returns [] (don't serve stale). * When the HF mirror is fully stale a loud ERROR log fires with the count of dropped rows so the operator can tell the mirror's the problem, not a network issue. * When BOTH live AND HF fallback produce 0 rows, fetch_uap_sightings now trips assert_canary("uap_sightings", 0) so the health registry shows the layer as broken instead of "fresh and empty for days." * Scheduler moved from daily 12:00 UTC to weekly Mondays 12:00 UTC. The layer is a rolling 60-day digest; refreshing once a week is enough cadence for human-readable map exploration and keeps nuforc.org load light. 6 new tests cover the cutoff filter, the doomsday-log path, the mixed-age path, the both-paths-empty health failure, the positive fallback path, and the scheduler cadence. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-22 23:27:12 -06:00
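The order of operations in the fallback fix (cutoff first, then take-top-N) can be sketched as follows. The `filter_recent` name and the `occurred` row field are hypothetical; the 60-day window, the 250-row limit, and the empty result for a fully stale mirror follow the commit message.

```python
from datetime import datetime, timedelta, timezone


def filter_recent(rows, now, window_days=60, limit=250):
    """Drop rows older than the cutoff, then keep the newest `limit` rows."""
    cutoff = now - timedelta(days=window_days)
    fresh = [row for row in rows if row["occurred"] >= cutoff]
    fresh.sort(key=lambda row: row["occurred"], reverse=True)
    return fresh[:limit]  # [] when the mirror holds nothing recent


now = datetime(2026, 5, 22, tzinfo=timezone.utc)
rows = [
    {"id": "a", "occurred": datetime(2026, 4, 10, tzinfo=timezone.utc)},
    {"id": "b", "occurred": datetime(2023, 1, 1, tzinfo=timezone.utc)},
    {"id": "c", "occurred": datetime(2026, 5, 1, tzinfo=timezone.utc)},
]
recent_ids = [row["id"] for row in filter_recent(rows, now)]
```

Sorting before filtering, as the pre-fix fallback did, returns the newest rows in the mirror regardless of how old they are; filtering first is what lets a stale mirror yield nothing.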
Shadowbroker	febcce9125	Merge pull request #310 from BigBodyCobain/fix/infonet-sync-429-backoff Infonet sync: honor HTTP 429 Retry-After + exponential backoff	2026-05-22 23:11:00 -06:00
BigBodyCobain	31ebcb5cd9	Infonet sync: honor HTTP 429 Retry-After + exponential backoff Fixes the retry-storm that's been keeping the local node 429'd out of the seed peer (the diagnosis we ran earlier in the session). Pre-fix: 1. Sync hits the seed peer, gets HTTP 429 (Too Many Requests) 2. _peer_sync_response stringifies the status into a ValueError 3. _sync_from_peer catches it, error becomes the str() of the exc 4. _run_public_sync_cycle calls finish_sync(error=..., failure_backoff_s=60) 5. next_sync_due_at = now + 60s 6. After 60s, sync runs again, hits same upstream that hasn't reset its rate-limit bucket, 429 again. Loop indefinitely. Net effect: a node that hit one transient 429 would hammer the seed every 60s forever, keeping the bucket full and never recovering. We saw this in the live status dump: consecutive_failures=49, last_sync_ok_at=0, retry storm sustained over the entire uptime. What changed ------------ services/mesh/mesh_infonet_sync_support.py * New typed exception PeerSyncRateLimited carries the parsed Retry-After value out of the HTTP layer instead of stringifying everything into a generic ValueError. * New parse_retry_after_header() handles both RFC 7231 §7.1.3 forms (delay-seconds and HTTP-date). Clamped at 1 hour so a hostile peer can't silence us for days. * New _failure_backoff_seconds() helper computes the next delay as max(exponential, retry_after_s). Schedule with default base=60s, cap=1800s: failure 1 -> 60s (preserves pre-fix for transient blips) failure 2 -> 120s failure 3 -> 240s failure 4 -> 480s failure 5 -> 960s failure 6+ -> 1800s (capped at 30 min) cap_s=0 explicitly disables exponential entirely — operators who want pure-Retry-After behavior have that option. * finish_sync now accepts retry_after_s and failure_backoff_cap_s kwargs. Backward-compatible: existing callers that don't pass retry_after_s get the same first-failure delay as before (the base value), only repeat failures grow. 
main.py * _peer_sync_response detects 429 specifically, parses the Retry-After header, raises PeerSyncRateLimited(retry_after_s=N). Includes the response body prefix in the message so the operator's last_error finally shows something useful. * _sync_from_peer extended to return (ok, error, forked, retry_after_s) — the 4th tuple element is non-zero only when the upstream sent a parseable Retry-After. Existing call shape preserved: the lone caller in _run_public_sync_cycle was updated in the same commit. * _run_public_sync_cycle forwards retry_after_s into finish_sync. Tests ----- backend/tests/mesh/test_infonet_sync_429_backoff.py — 17 new tests: TestParseRetryAfter (7): - integer seconds form - HTTP-date form (computed as seconds-from-now) - HTTP-date in the past returns 0 - empty / whitespace returns 0 - malformed returns 0 - clamps to 1 hour (hostile-peer cap) - negative returns 0 TestFailureBackoffSeconds (5): - exponential growth schedule pins each level - retry_after wins when larger than exponential - exponential wins when larger than retry_after - cap_s=0 disables exponential entirely - zero inputs return zero TestFinishSyncBackoff (5): - first failure uses base unchanged (pre-fix back-compat) - consecutive_failures actually grow the delay - retry_after honored at low failure count - success resets consecutive_failures - last_error carries the HTTP status / Retry-After detail All 24 existing sync-support / status-gate tests still pass. Other failures in tests/mesh/ are pre-existing on origin/main and unrelated to this change (verified by running the same tests against the user's main worktree without these edits). 
What the operator sees after this lands + a docker rebuild ---------------------------------------------------------- With the live 429 storm we diagnosed: Pre-fix: consecutive_failures keeps climbing 1/min forever, last_error empty or generic Post-fix: consecutive_failures grows, next_sync_due_at backs off exponentially (max 30 min), last_error explicitly carries "HTTP 429 from <peer> (retry_after=Ns): <body>" so the operator can see what's actually wrong. Once the upstream bucket drains and a sync succeeds, consecutive_failures resets to 0 and the schedule returns to the normal 300s interval.	2026-05-22 22:55:05 -06:00
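The Retry-After parsing and backoff schedule from commit 31ebcb5cd9 can be sketched as below. The function signatures are assumptions (the commit message names `parse_retry_after_header()` and `_failure_backoff_seconds()` but does not show their parameters); the 1-hour clamp, base=60s, cap=1800s, and the `max(exponential, retry_after_s)` rule are taken from the message.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

RETRY_AFTER_CLAMP_S = 3600  # 1 hour cap against hostile peers


def parse_retry_after_header(value, now=None):
    """Return Retry-After as whole seconds, or 0 if absent or unparseable.

    Handles both the delay-seconds form and the HTTP-date form.
    """
    text = (value or "").strip()
    if not text:
        return 0
    if text.isascii() and text.isdigit():
        seconds = int(text)
    else:
        try:
            when = parsedate_to_datetime(text)
        except (TypeError, ValueError):
            return 0  # malformed (including negative numbers)
        if when.tzinfo is None:
            when = when.replace(tzinfo=timezone.utc)
        now = now or datetime.now(timezone.utc)
        seconds = int((when - now).total_seconds())
    return max(0, min(seconds, RETRY_AFTER_CLAMP_S))


def failure_backoff_seconds(consecutive_failures, base_s=60, cap_s=1800,
                            retry_after_s=0):
    """Next delay: the larger of capped exponential backoff and Retry-After.

    cap_s=0 disables the exponential term, leaving pure Retry-After.
    """
    exponential = 0
    if consecutive_failures > 0 and base_s > 0 and cap_s > 0:
        exponential = min(base_s * 2 ** (consecutive_failures - 1), cap_s)
    return max(exponential, retry_after_s)
```

With the defaults, failures 1 through 6 produce 60, 120, 240, 480, 960, and 1800 seconds, matching the schedule listed in the commit message.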
Shadowbroker	b3fca3dc18	Merge pull request #309 from BigBodyCobain/feat/cross-node-dm-mailbox-replication DM mailbox: per-(sender, recipient) anti-spam cap + replication primitives	2026-05-22 22:43:26 -06:00
BigBodyCobain	401f114e4f	DM mailbox: outbound replication + receiving endpoint Second commit on this branch (first added the per-sender cap + accept_replica primitive). This commit wires the actual cross-node propagation: Outbound (sender side) ---------------------- * New ``DMRelay._replicate_envelope_to_peers_async()`` — fire-and-forget thread that POSTs the envelope to every authenticated relay peer via the same per-peer HMAC pattern gate-message replication uses (#256 ``X-Peer-Url`` + ``X-Peer-HMAC`` headers, ``resolve_peer_key_for_url``). * ``deposit()`` now calls the replication helper after a successful local accept. Per-peer errors are swallowed — slow Tor peers must not block the sender's UX, and the recipient polling from a healthy peer works fine even if some peers are down. * Metrics: dm_replication_push_ok / _rejected / _error. Inbound (receiving side) ------------------------ * New endpoint ``POST /api/mesh/dm/replicate-envelope`` in routers/mesh_peer_sync.py. * Same HMAC auth gate (``_verify_peer_push_hmac``) as the existing infonet/gate peer-push endpoints. Unauthenticated requests get 403. * Body cap of 64 KB (DM envelope is bounded by MESH_DM_MAX_MSG_BYTES). * Calls DMRelay.accept_replica which enforces the per-sender cap as a network rule — hostile sender's relay can hold extras locally but honest peers reject them on inbound replication. End-to-end flow now works ------------------------- 1. Alice's node accepts a deposit to Bob's mailbox (local cap check). 2. Alice's node spawns a background thread that POSTs the envelope to MESH_RELAY_PEERS with per-peer HMAC. 3. Each peer's /api/mesh/dm/replicate-envelope verifies the HMAC and calls accept_replica, which re-enforces the per-sender cap. 4. Bob (offline at the time of send) eventually logs into ANY node in MESH_RELAY_PEERS, his existing pollDmMailboxes pulls from the local mailbox there, finds Alice's envelope, decrypts. 
Tests ----- backend/tests/test_dm_replicate_envelope_endpoint.py — 4 tests: TestReplicateEndpointAuth: - rejects requests without peer HMAC (403) - rejects requests with WRONG peer HMAC (403) — confirms the HMAC is actually verified, not just present - rejects oversize bodies (>64 KB) with 400/413 TestReplicateEndpointRegistered: - static check that POST /api/mesh/dm/replicate-envelope is registered on app.routes — catches future refactor that drops the router include All 38 backend tests touching the new code paths still pass: test_dm_relay_per_sender_cap.py (14) test_dm_replicate_envelope_endpoint.py (4) test_no_new_duplicate_routes.py (1) — new route is unique test_per_peer_secret_resolver.py (19) — HMAC primitive unaffected What's still ahead (PR-3+) -------------------------- * ack propagation: when recipient pulls a message on node X, peers Y/Z should prune their copies to free the sender's quota network-wide. Without this, the sender's quota frees only on the node the recipient actually polled — other peers still see N pending until TTL expiry. Workable but suboptimal. PR-3 will add a /api/mesh/dm/ack endpoint with the same HMAC pattern. * recipient pull-from-peers: today the recipient's poll only hits their own node's relay. If they log into a peer they didn't deposit with, they need a way to fetch envelopes from other peers in MESH_RELAY_PEERS. Today this works as long as the recipient's current node is one of the peers Alice's node pushed to — which is true in a fully-meshed deployment but not guaranteed for partial meshes. PR-4 if telemetry shows this matters.	2026-05-22 19:23:09 -06:00
BigBodyCobain	79b39e8985	DM mailbox: per-(sender, recipient) anti-spam cap + replication primitives Foundation work for cross-node DM mailbox replication. Adds the network rule that makes the replication safe to ship next, plus the primitives the outbound replication PR will call. The rule -------- A single sender can have at most N UNACKED messages parked in a single recipient's mailbox at any one time. Default N=2, tunable via ``MESH_DM_PENDING_PER_SENDER_LIMIT``. Once the recipient pulls (acks) a message, the sender's quota for that (sender, recipient) pair frees up. Network rule, not local rule ---------------------------- The cap is enforced TWICE: 1. ``DMRelay.deposit(...)`` — local check on the sender's own node. Refuses to spool the (N+1)th message before it can be replicated. 2. ``DMRelay.accept_replica(...)`` — replication-acceptance check on every receiving peer. Refuses to accept an inbound replica that would put the local mailbox over the cap. The second half is what makes the rule a NETWORK rule. A hostile sender could patch out the deposit check on their own relay and continue to spool extras locally — but those extras can never propagate, because every honest peer enforces the same cap on the way in. A recipient who polls from honest peers therefore never sees more than N pending from any one sender, regardless of how many spam attempts the hostile sender's relay accepted. New API surface on ``DMRelay`` ------------------------------ _per_sender_pending_limit() — reads MESH_DM_PENDING_PER_SENDER_LIMIT _per_sender_pending_count(...) — counts unacked from a sender for a mailbox accept_replica(envelope=...) — peer-push receive entry point envelope_for_replication(...) — helper to extract a wire-form envelope ``accept_replica`` is idempotent on duplicate ``msg_id`` (replication round-trips and multi-path delivery don't double-spool). 
``envelope_for_replication`` exposes the exact shape ``accept_replica`` expects, so the follow-up PR (outbound replication wiring) just has to fetch the envelope and POST it to authenticated peer URLs with the existing per-peer HMAC pattern from #256. Why this is PR-1 of two ----------------------- The full cross-node mailbox replication needs three pieces: A. cap enforcement on deposit (in this PR) B. cap enforcement on replica acceptance (in this PR) C. outbound: push envelope to MESH_RELAY_PEERS after deposit (NEXT PR) (A) + (B) shipped together close the cap-bypass attack surface BEFORE (C) introduces the actual cross-node propagation. Shipping them in the other order would briefly let extras propagate during the window between "outbound push lands" and "accept_replica cap lands." Tests ----- backend/tests/test_dm_relay_per_sender_cap.py — 14 tests: TestDepositCap: - first 2 deposits succeed (UX baseline) - 3rd from same sender rejected with friendly message - different senders have independent quotas - different recipients have independent quotas - ack frees the quota (after recipient pulls, sender can deposit again) - cap is env-tunable TestAcceptReplicaCap: - replica accepted under cap - idempotent on duplicate msg_id (no double-spool, no rejection) - rejected at cap with structured ``cap_violation`` marker so sender's relay can stop retrying - per-sender, not per-mailbox: different sender_block_ref passes even when another sender at the same mailbox is capped - malformed envelope shapes rejected without crash TestEnvelopeForReplication: - returns the envelope for stored messages - returns None for unknown msg_id - round-trips through accept_replica end-to-end (proves the wire shape matches across the two sides)	2026-05-22 19:18:01 -06:00
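The "enforced twice" rule from commit 79b39e8985 can be illustrated with an in-memory toy. This is a sketch under stated assumptions, not the real `DMRelay`: the class name, the envelope keys (`msg_id`, `sender`, `recipient`), and the return shapes are hypothetical. The default limit of 2, the `MESH_DM_PENDING_PER_SENDER_LIMIT` variable, the `cap_violation` marker, and idempotency on duplicate `msg_id` come from the commit message.

```python
import os


class MailboxCapSketch:
    """Toy relay that enforces a per-(sender, recipient) pending cap."""

    def __init__(self):
        self._pending = {}  # recipient -> {msg_id: sender}

    def _limit(self):
        return int(os.environ.get("MESH_DM_PENDING_PER_SENDER_LIMIT", "2"))

    def _pending_count(self, sender, recipient):
        box = self._pending.get(recipient, {})
        return sum(1 for s in box.values() if s == sender)

    def _store(self, sender, recipient, msg_id):
        box = self._pending.setdefault(recipient, {})
        if msg_id in box:
            # Duplicate delivery: accept without double-spooling.
            return {"ok": True, "duplicate": True}
        if self._pending_count(sender, recipient) >= self._limit():
            return {"ok": False, "cap_violation": True}
        box[msg_id] = sender
        return {"ok": True, "duplicate": False}

    def deposit(self, sender, recipient, msg_id):
        """Local check on the sender's own node."""
        return self._store(sender, recipient, msg_id)

    def accept_replica(self, envelope):
        """Same check again on every receiving peer (the network rule)."""
        if not isinstance(envelope, dict):
            return {"ok": False, "error": "malformed envelope"}
        fields = [envelope.get(k) for k in ("sender", "recipient", "msg_id")]
        if not all(isinstance(f, str) and f for f in fields):
            return {"ok": False, "error": "malformed envelope"}
        return self._store(*fields)

    def ack(self, recipient, msg_id):
        """Recipient pulled the message: free the sender's quota."""
        return self._pending.get(recipient, {}).pop(msg_id, None) is not None
```

Because `accept_replica` runs the same count as `deposit`, a relay that skips its own deposit check still cannot push a third pending message onto an honest peer.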
Shadowbroker	c3e38621fc	Merge pull request #308 from BigBodyCobain/fix/296-windows-venv-uvicorn-detection Fix #296: reject backend venvs missing uvicorn before launch (Windows)	2026-05-22 18:56:08 -06:00
BigBodyCobain	9ef02dd06f	Fix #296: reject backend venvs missing uvicorn before launch

Reported by @f3n3k on Windows native install path. Symptom:

    C:\001\backend\venv\Scripts\python.exe: No module named uvicorn
    [backend] exited with 1
    ShadowBroker has stopped. Exit code: 1

Root cause
----------
The Windows Start.bat flow chains:

    Start.bat
      └─ scripts\run-windows-runtime.ps1
           └─ frontend\scripts\dev-all.cjs
                └─ start-backend.js
                     └─ backend\venv\Scripts\python.exe -m uvicorn main:app

`start-backend.js` decided whether an existing `backend\venv` was usable by calling `canRun(candidate, ["-V"])`. That only checks whether Python itself can run; it does NOT check whether the backend's actual runtime dependencies are installed. When the venv exists but `pip install` never finished (partial install, failed network, interrupted bootstrap, etc.), the launcher happily accepted that broken venv, then died with the exact error f3n3k reported.

Fix
---
New `canRunBackendPython()` helper that requires BOTH:

    python -V                            # Python is runnable
    python -c "import fastapi, uvicorn"  # backend deps are installed

Used in two call sites:

* `ensureBackendVenv()`: when iterating candidate venvs on first launch, reject any venv whose Python can't import the backend's real entry-point deps. The launcher then falls through to its existing rebuild path (`rebuildBackendVenv`) which reinstalls deps before declaring the venv healthy.
* `rebuildBackendVenv()`: after a rebuild attempt, verify the deps are present before returning the new interpreter path. Catches silent partial rebuilds.

The check is the import that uvicorn itself would do at startup, so a green return here genuinely means "uvicorn will start". Cost is one extra `python -c` per venv candidate on launcher startup (milliseconds). Verified locally with `node --check start-backend.js`.

Credit: @f3n3k for the original report.	2026-05-22 18:50:27 -06:00
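The two-step health check from commit 9ef02dd06f is implemented in JavaScript (`start-backend.js`) in the repository; the following is a Python analogue of the same idea, shown only to make the logic concrete. The function names, the timeout, and the `modules` parameter are assumptions for illustration.

```python
import subprocess
import sys


def can_run(python_exe, args, timeout_s=30):
    """True if `python_exe args...` starts and exits with status 0."""
    try:
        result = subprocess.run(
            [python_exe, *args],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False  # missing interpreter, permission error, or hang
    return result.returncode == 0


def can_run_backend_python(python_exe, modules=("fastapi", "uvicorn")):
    """Require BOTH a runnable interpreter and importable backend deps."""
    if not can_run(python_exe, ["-V"]):
        return False
    return can_run(python_exe, ["-c", "import " + ", ".join(modules)])


if __name__ == "__main__":
    # Checks the interpreter running this script, as a usage example.
    print(can_run_backend_python(sys.executable))
```

A venv whose `pip install` was interrupted passes the `-V` step but fails the import step, which is the case the original `-V`-only check missed.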
Shadowbroker	ba39d3b9aa	Merge pull request #307 from BigBodyCobain/fix/302-openclaw-hmac-reveal-hardening Fix #302: split OpenClaw HMAC reveal into dedicated POST with no-store headers	2026-05-22 18:47:09 -06:00
BigBodyCobain	f91ddcf38b	Fix #302 : split OpenClaw HMAC reveal into dedicated POST with no-store Reported by @tg12. Pre-fix, two problems lived on the GET endpoint: 1. `GET /api/ai/connect-info?reveal=true` returned the full HMAC secret in the response body on every Connect modal open. Even gated to require_local_operator, that put the secret into browser history, dev-tools network panels, browser disk caches, HAR exports, and screen captures. 2. The same GET endpoint auto-bootstrapped (generated + persisted) the secret on a mere read. Side effects on a GET are a footgun: browser prefetchers, mirror tools, and casual curl-from-history would all silently mint+persist a fresh secret. Backend (backend/routers/ai_intel.py) ------------------------------------- GET /api/ai/connect-info — always returns the MASKED fingerprint (first6 + bullets + last4). No `?reveal` param. NO auto-bootstrap. When the secret is missing, returns `hmac_secret_set: false` and tells the caller to POST to /bootstrap. POST /api/ai/connect-info/bootstrap — NEW. Mints+persists the secret if missing. Idempotent. Never returns the full secret in the response body. POST /api/ai/connect-info/reveal — NEW. Returns the full secret with Cache-Control: no-store, no-cache, must-revalidate + Pragma: no-cache + Expires: 0. POST so the body never lands in URL history. 404 (with a pointer to /bootstrap) when the secret isn't set. POST /api/ai/connect-info/regenerate — keeps existing one-time-reveal behavior (regen IS a deliberate destructive action triggered by the operator). Same no-store/no-cache headers added so even the regen response doesn't get cached. Frontend (AIIntelPanel.tsx, OnboardingModal.tsx) ------------------------------------------------ * On mount: GET (masked only). If hmac_secret_set: false, fire a transparent POST /bootstrap and refresh the masked fingerprint. Operator sees no behavior change from pre-#302. 
* Reveal (eye icon): lazy POST /reveal — secret only travels when the operator explicitly clicks the button. * Copy: lazy POST /reveal too — copying without a prior reveal works exactly like before, just routed through the new endpoint. * Regenerate: POST returns the new secret (same as before, but the response now has no-store headers). * The displayed snippet uses the masked fingerprint until the operator clicks Reveal or Copy. Tests (backend/tests/test_openclaw_connect_info_reveal.py — 13 tests) --------------------------------------------------------------------- * GET returns masked + the full secret never appears in r.text * GET does NOT auto-bootstrap when missing * GET silently ignores any ?reveal=true query (back-compat noise) * POST /bootstrap mints when missing, idempotent when set * POST /bootstrap never returns the full secret * POST /reveal returns the full secret with Cache-Control: no-store, no-cache + Pragma: no-cache + Expires: 0 * POST /reveal 404s with a pointer to /bootstrap when no secret * POST /regenerate returns the new secret with the same headers * Anonymous remote callers get 403 on ALL FOUR endpoints (parametric regression against the same allowlist used elsewhere). Adjacent suites still green: test_openclaw_route_security, test_no_new_duplicate_routes, test_control_surface_auth. 67/67 pass locally. Credit: @tg12 for the audit report.	2026-05-22 18:40:24 -06:00
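A small sketch of the two building blocks commit f91ddcf38b describes: the masked fingerprint (first 6 characters, bullets, last 4) and the no-store response headers. The helper name, the number of bullets, and the handling of very short secrets are assumptions; the header values are the ones listed in the commit message.

```python
# Header values as listed in the commit message for /reveal and /regenerate.
NO_STORE_HEADERS = {
    "Cache-Control": "no-store, no-cache, must-revalidate",
    "Pragma": "no-cache",
    "Expires": "0",
}

BULLET = "\u2022"


def mask_secret(secret, bullets=8):
    """Return first6 + bullets + last4; hypothetical helper.

    ASSUMPTION: secrets of 10 characters or fewer are fully masked so the
    fingerprint never reveals the whole value.
    """
    if len(secret) <= 10:
        return BULLET * len(secret)
    return secret[:6] + BULLET * bullets + secret[-4:]
```

In this sketch the GET handler would only ever return `mask_secret(...)`, while the POST reveal handler would attach `NO_STORE_HEADERS` to the response that carries the full value.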
Shadowbroker	49151d8b9f	Merge pull request #304 from BigBodyCobain/fix/298-sentinel-creds-server-side Fix #298: move Sentinel credentials from browser storage to backend .env	2026-05-22 18:29:11 -06:00
BigBodyCobain	767a2f6c00	Merge remote-tracking branch 'origin/main' into fix/298-sentinel-creds-server-side	2026-05-22 18:19:12 -06:00
Shadowbroker	2da739c9e8	Merge pull request #306 from BigBodyCobain/fix/messagesview-flake-alias-race Deflake messagesViewFirstContact: alias-resolution race in toast text	2026-05-22 18:18:56 -06:00
BigBodyCobain	eca7f24e2c	Loosen messagesViewFirstContact toast assertion to fix alias-race flake Follow-up to #305. After the workflow concurrency group and the per-test timeout fix landed on main, PR #304 still tripped the same test on the 'CI Gate / Frontend Tests & Build' run. Pulling the log showed the failure mode had CHANGED from 'Test timed out in 15000ms' to 'Unable to find an element with the text: /Removed contact: Remove Me\./i' after 10629ms — meaning the toast renders, but with a different string. Tracing through MessagesView.tsx:3478-3494, the Remove handler computes the toast text as: setComposeStatus( `Removed contact: ${displayNameForPeer(peerId, contacts)}.`, ); displayNameForPeer reads contacts[peerId].alias or falls through to the raw peerId. The reference is captured from the closed-over React state. Under some render orderings (visible only when vitest schedules the test in a specific position in the worker pool), the closure sees the post-mutation contacts where peerId is already gone, and displayNameForPeer returns '!sb_remove' instead of 'Remove Me'. The toast renders correctly — but as 'Removed contact: !sb_remove.' — and the precise regex misses. Fix: loosen the assertion to /Removed contact:/i. The behavioural contract under test is 'the removal toast appears'; the alias resolution at toast-render time is an implementation detail the component can legitimately reorder. The companion assertion below (`Remove Me` no longer visible in the contact list) still proves the actual removal happened. Verified locally: 26/26 tests pass in 5.15s.	2026-05-22 18:06:56 -06:00
BigBodyCobain	7bfaad17f0	Merge remote-tracking branch 'origin/main' into fix/298-sentinel-creds-server-side	2026-05-22 17:55:58 -06:00
Shadowbroker	e3efcfd476	Merge pull request #305 from BigBodyCobain/fix/messagesview-flake-ci-concurrency Deflake messagesViewFirstContact via CI concurrency group	2026-05-22 17:55:22 -06:00
BigBodyCobain	32b8421a1c	Merge origin/main into fix/298: resolve tools.py conflict PR #303 landed on main and added Depends(require_local_operator) to the @router.post decorators for /api/sentinel/token and /api/sentinel/tile. PR #298 (this branch) edited the same decorator lines AND function bodies to add the env-credential fallback resolver. Resolution keeps BOTH: * The require_local_operator dependency from #303 (the auth gate) * The _resolve_sentinel_credentials helper from #298 * The env-fallback path inside the function bodies Both layers are independent — the gate blocks anonymous callers, the env fallback lets legitimate (gated) callers omit credentials from the body. Verified: 46 tests pass against the merged code, including both test_sentinel_credentials_server_side.py (#298 fallback) and test_sentinel_routes_auth_gate.py (#303 gate).	2026-05-22 17:52:10 -06:00
BigBodyCobain	bc70cc3527	fix(test): per-test timeout — 15s waitFor inside 15s testTimeout was zero headroom Mistake in the prior commit on this branch (`44e9b38`). Bumped the waitFor timeout to 15s without realising the suite-wide testTimeout was ALSO 15s (raised in Round 7a deflake work). Net effect: the test ran out of clock budget BEFORE waitFor could even finish polling, producing "Test timed out in 15000ms" on the "Frontend Tests & Build" run of PR #305 — same job that the concurrency-group fix had just freed from the resource-contention flake. Fix: * Bump JUST this test's per-test timeout to 30s via the `{ timeout: 30_000 }` argument on the `it()` block. * Drop the inner waitFor back to 10s (was 15s) so it has a clear margin against the 30s test budget after setup/render/click. 26/26 tests in the file pass locally in 6.19s. The concurrency-group fix in ci.yml stays as-is — that was correct and verifiably worked (CI Gate / Frontend Tests & Build went green on the PR after 8 prior failures). The flake-jump to the sibling workflow exposed this second-order bug.	2026-05-22 17:49:00 -06:00
BigBodyCobain	44e9b38ac2	Deflake messagesViewFirstContact via CI concurrency group Root cause ---------- ci.yml fires twice on every PR — once directly via `pull_request: [main]` (producing the "Frontend Tests & Build" check) and once via `workflow_call` from docker-publish.yml (producing the "CI Gate / Frontend Tests & Build" check). Both jobs land on the same Actions runner pool at the same time and fight for CPU/RAM. Under contention, the React reconciliation in `messagesViewFirstContact.test.tsx > removes an approved contact immediately from the visible contact list` overruns its 5s waitFor timeout. This is the single test that has flaked on PRs #226, #237, #261, #262, #265, #294, #303, and the `fd7d6fa` push — always on the same job name ("CI Gate / Frontend Tests & Build"), never on the sibling job ("Frontend Tests & Build") on the same commit. PR #304 (which heavily touched the frontend) passed both jobs on first try. PR #303 (zero frontend changes) failed only the CI Gate job. That asymmetry is what finally pinpointed the parallel-resource-contention cause rather than anything in the test or the PRs. Fix --- .github/workflows/ci.yml — added a workflow-level concurrency group keyed on the PR head SHA (or pushed commit SHA). Both invocations against the same commit now share a group, so the second one queues instead of running in parallel. cancel-in-progress is intentionally `false` — cancelling would risk leaving a PR check stuck in "Expected" if only one of the two ever finished. Total CI time grows by ~2 min in exchange for deterministic outcomes. frontend/src/__tests__/mesh/messagesViewFirstContact.test.tsx — belt-and-suspenders bump of the waitFor timeout from 5s to 15s. The structural fix above should make the original 5s margin sufficient, but the bump removes the residual risk of brief runner load spikes inside the (now serialised) single job. The failure mode this masks would be "toast never renders", which still fails loudly at 15s. 
The full mesh test file (26 tests) passes locally in ~8s with the bumped timeout.	2026-05-22 17:36:33 -06:00
Shadowbroker	b01a69c172	Merge pull request #303 from BigBodyCobain/fix/299-300-301-sentinel-auth-gate Fix #299/#300/#301: gate Sentinel proxy routes with require_local_operator	2026-05-22 10:56:41 -06:00
BigBodyCobain	b041b5e97c	Fix #298 : move Sentinel credentials from browser storage to backend .env Reported by @tg12. Pre-fix, the Settings panel stored real third-party Copernicus CDSE client_id + client_secret in browser localStorage / sessionStorage via the privacy storage helper, and the proxy routes required those values to come back in every tile/token request body. Any same-origin script (XSS, malicious browser extension, dev-tools HAR export) had read access to the credentials. This change moves them server-side, behind the same .env-backed admin flow every other third-party API key (OpenSky, AIS Stream, Finnhub, Shodan, …) already uses. Backend ------- backend/services/api_settings.py * Added SENTINEL_CLIENT_ID and SENTINEL_CLIENT_SECRET entries to API_REGISTRY. The existing GET/PUT /api/settings/api-keys flow (already require_local_operator-gated, .env-backed) now manages them — no new route surface. backend/routers/tools.py * /api/sentinel/token and /api/sentinel/tile resolve credentials via a new _resolve_sentinel_credentials() helper: body fields win for back-compat with any legacy callers, otherwise the helper reads SENTINEL_CLIENT_ID / SENTINEL_CLIENT_SECRET from os.environ. * When neither source has a value, the route returns 400 with a friendly pointer ("Set SENTINEL_CLIENT_ID and SENTINEL_CLIENT_SECRET in the API Keys panel") instead of the curt "required" message. The user's standing rule against hostile errors applies. * Function bodies only — decorator lines untouched, so this PR does not conflict with #303 (which adds Depends(require_local_operator) to the same routes). Frontend -------- frontend/src/lib/sentinelHub.ts — rewritten * Removed: getSentinelCredentials / setSentinelCredentials / clearSentinelCredentials / getSentinelCredentialStorageMode. These were the browser-storage read/write helpers; their existence was the bug. 
* Added: checkBackendSentinelStatus(), refreshSentinelStatus(), getCachedSentinelStatus(), and a kept-for-back-compat hasSentinelCredentials() shim. Status is sourced from /api/settings/api-keys (the same endpoint the API Keys panel already uses), so we don't add a new route just for this read. * Added: migrateLegacySentinelBrowserKeys() — one-shot, idempotent helper that clears sb_sentinel_client_id / _secret / _instance_id from BOTH localStorage and sessionStorage. We deliberately do NOT auto-POST those legacy browser values to the backend; doing so would silently migrate a secret across a trust boundary without operator consent. Operators re-enter once in the API Keys panel and the legacy keys get wiped here. * fetchSentinelTile and getSentinelToken no longer send client_id / client_secret in the request body. The backend uses .env. frontend/src/components/SettingsPanel.tsx * Dropped sb_sentinel_client_id / _secret / _instance_id from PRIVACY_SENSITIVE_BROWSER_KEYS — they're no longer written. * SentinelTab rewritten: removed the inline Client ID / Client Secret inputs + Save / Clear / Test buttons. Replaced with a status panel that calls checkBackendSentinelStatus() on mount, a one-click "Open API Keys Panel" button, and a migration banner that appears only when migrateLegacySentinelBrowserKeys() actually cleared something. * Setup guide STEP 3 now points to the API Keys panel instead of the local form. frontend/src/app/page.tsx * Added a one-time useEffect that fires checkBackendSentinelStatus() on mount so the cached value (which the synchronous hasSentinelCredentials() shim reads) is populated before MaplibreViewer's tile-URL memo runs. Tests ----- backend/tests/test_sentinel_credentials_server_side.py (new) * API_REGISTRY surface — sentinel_client_id / sentinel_client_secret are registered with the right env_keys, ALLOWED_ENV_KEYS lets /api/settings/api-keys PUT them. 
* Resolution order — body wins, env is fallback, neither → 400 with the friendly pointer message, and NO upstream HTTP call when neither source has credentials (asserted via MagicMock(side_effect=AssertionError)). * /api/sentinel/tile same shape. frontend/src/__tests__/utils/sentinelHub.test.ts (new) * migrateLegacySentinelBrowserKeys clears localStorage AND sessionStorage, reports what it cleared, idempotent. * fetchSentinelTile + getSentinelToken POST WITHOUT client_id / client_secret in the body (plants leaked credentials in browser storage first to prove they are NOT picked up). * checkBackendSentinelStatus parses /api/settings/api-keys correctly: true only when both keys is_set, false on partial config or network errors. All 7 backend tests + 8 frontend tests pass locally. The test_no_new_duplicate_routes guard and the api-settings test suite still pass. Credit: @tg12 for the audit report.	2026-05-22 10:44:50 -06:00
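The resolution order described for `_resolve_sentinel_credentials()` in commit b041b5e97c (body wins, env is the fallback, neither yields a friendly 400) might look like the sketch below. The signature, the exception type, and the body key names are assumptions; the environment variable names and the error text are quoted from the commit message.

```python
import os


class MissingSentinelCredentials(ValueError):
    """Hypothetical error a route handler would translate into HTTP 400."""


FRIENDLY_MESSAGE = (
    "Set SENTINEL_CLIENT_ID and SENTINEL_CLIENT_SECRET in the API Keys panel"
)


def resolve_sentinel_credentials(body, environ=None):
    """Return (client_id, client_secret): body fields win, env is fallback."""
    env = os.environ if environ is None else environ
    body = body or {}
    client_id = (body.get("client_id")
                 or env.get("SENTINEL_CLIENT_ID") or "").strip()
    client_secret = (body.get("client_secret")
                     or env.get("SENTINEL_CLIENT_SECRET") or "").strip()
    if not client_id or not client_secret:
        # Raise before any upstream HTTP call is attempted.
        raise MissingSentinelCredentials(FRIENDLY_MESSAGE)
    return client_id, client_secret
```

Taking `environ` as a parameter keeps the sketch testable without touching the real process environment.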


396 Commits