Second commit on this branch (first added the per-sender cap + accept_replica
primitive). This commit wires the actual cross-node propagation:

Outbound (sender side)
----------------------
* New ``DMRelay._replicate_envelope_to_peers_async()`` — fire-and-forget
thread that POSTs the envelope to every authenticated relay peer via
the same per-peer HMAC pattern gate-message replication uses (#256
``X-Peer-Url`` + ``X-Peer-HMAC`` headers, ``resolve_peer_key_for_url``).
* ``deposit()`` now calls the replication helper after a successful
local accept. Per-peer errors are swallowed — slow Tor peers must not
block the sender's UX, and the recipient polling from a healthy peer
works fine even if some peers are down.
* Metrics: dm_replication_push_ok / _rejected / _error.
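
A minimal sketch of the outbound pattern above, not the actual DMRelay
code. It assumes the HMAC is hex HMAC-SHA256 over the raw body and that
X-Peer-Url carries the sender's own URL. PEER_KEYS, METRICS, sign,
replicate_async and the injected post callable are hypothetical names
used only for illustration.

```python
import hashlib
import hmac
import threading
from typing import Callable, Dict

# Hypothetical stand-ins for the peer-key resolver and the metrics registry.
PEER_KEYS: Dict[str, bytes] = {"http://peer-a": b"key-a", "http://peer-b": b"key-b"}
METRICS = {"ok": 0, "rejected": 0, "error": 0}
_METRICS_LOCK = threading.Lock()


def sign(key: bytes, body: bytes) -> str:
    """Assumed scheme: hex HMAC-SHA256 over the raw request body."""
    return hmac.new(key, body, hashlib.sha256).hexdigest()


def replicate_async(
    self_url: str,
    body: bytes,
    post: Callable[[str, bytes, Dict[str, str]], int],
) -> threading.Thread:
    """Push body to every peer from a daemon thread; never raise to the caller."""

    def worker() -> None:
        for peer_url, key in PEER_KEYS.items():
            headers = {"X-Peer-Url": self_url, "X-Peer-HMAC": sign(key, body)}
            try:
                status = post(peer_url, body, headers)  # returns an HTTP status
                outcome = "ok" if 200 <= status < 300 else "rejected"
            except Exception:
                outcome = "error"  # swallowed: a dead peer must not block others
            with _METRICS_LOCK:
                METRICS[outcome] += 1

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread
```

In a test, post can be a fake that raises for one peer. After
thread.join() the counters show one ok and one error, and the caller
never sees the exception.
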
Inbound (receiving side)
------------------------
* New endpoint ``POST /api/mesh/dm/replicate-envelope`` in
routers/mesh_peer_sync.py.
* Same HMAC auth gate (``_verify_peer_push_hmac``) as the existing
infonet/gate peer-push endpoints. Unauthenticated requests get 403.
* Body cap of 64 KB (DM envelope is bounded by MESH_DM_MAX_MSG_BYTES).
* Calls ``DMRelay.accept_replica``, which enforces the per-sender cap as
a network rule: a hostile sender's relay can hold extras locally, but
honest peers reject them on inbound replication.
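
The inbound checks above can be sketched as a plain function that
returns the HTTP status. This is an illustration under assumptions, not
the router code: the check order, the JSON field names (sender_id,
recipient_id), the cap value, the 429 status for a cap rejection, and
the names PEER_KEYS and handle_replicate are all hypothetical.

```python
import hashlib
import hmac
import json
from collections import defaultdict
from typing import Dict, List, Tuple

MAX_BODY_BYTES = 64 * 1024  # body cap stated in this commit
PER_SENDER_CAP = 3          # hypothetical value, for illustration only
PEER_KEYS: Dict[str, bytes] = {"http://peer-a": b"key-a"}  # hypothetical
_pending: Dict[Tuple[str, str], List[dict]] = defaultdict(list)


def accept_replica(envelope: dict) -> bool:
    """Store the envelope unless this sender already has the cap pending."""
    key = (str(envelope["recipient_id"]), str(envelope["sender_id"]))
    if len(_pending[key]) >= PER_SENDER_CAP:
        return False
    _pending[key].append(envelope)
    return True


def handle_replicate(headers: Dict[str, str], body: bytes) -> int:
    """Return the status code the endpoint would answer with."""
    key = PEER_KEYS.get(headers.get("X-Peer-Url", ""))
    if key is None:
        return 403  # unknown or missing peer URL
    expected = hmac.new(key, body, hashlib.sha256).hexdigest()
    given = headers.get("X-Peer-HMAC", "")
    # Constant-time comparison; a missing header compares as "".
    if not hmac.compare_digest(expected.encode(), given.encode()):
        return 403
    if len(body) > MAX_BODY_BYTES:
        return 413
    try:
        envelope = json.loads(body)
    except ValueError:
        return 400
    if not isinstance(envelope, dict) or not {"sender_id", "recipient_id"} <= envelope.keys():
        return 400
    return 200 if accept_replica(envelope) else 429  # 429 is an assumption
```

With the cap at 3, four correctly signed pushes from the same sender to
the same recipient yield 200, 200, 200 and then a rejection, which is
the behaviour that stops a hostile relay from exporting its extras.
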
End-to-end flow now works
-------------------------
1. Alice's node accepts a deposit to Bob's mailbox (local cap check).
2. Alice's node spawns a background thread that POSTs the envelope
to MESH_RELAY_PEERS with per-peer HMAC.
3. Each peer's /api/mesh/dm/replicate-envelope verifies the HMAC and
calls accept_replica, which re-enforces the per-sender cap.
4. Bob (offline at the time of send) eventually logs into ANY node
in MESH_RELAY_PEERS. His existing pollDmMailboxes pulls from
the local mailbox there, finds Alice's envelope, and decrypts it.

Tests
-----
backend/tests/test_dm_replicate_envelope_endpoint.py — 4 tests:
TestReplicateEndpointAuth:
- rejects requests without peer HMAC (403)
- rejects requests with WRONG peer HMAC (403) — confirms the
HMAC is actually verified, not just present
- rejects oversize bodies (>64 KB) with 400/413
TestReplicateEndpointRegistered:
- static check that POST /api/mesh/dm/replicate-envelope is
registered on app.routes — catches future refactor that
drops the router include
All 38 backend tests touching the new code paths still pass:
test_dm_relay_per_sender_cap.py (14)
test_dm_replicate_envelope_endpoint.py (4)
test_no_new_duplicate_routes.py (1) — new route is unique
test_per_peer_secret_resolver.py (19) — HMAC primitive unaffected

What's still ahead (PR-3+)
--------------------------
* ack propagation: when recipient pulls a message on node X, peers Y/Z
should prune their copies to free the sender's quota network-wide.
Without this, the sender's quota frees only on the node the recipient
actually polled — other peers still see N pending until TTL expiry.
Workable but suboptimal. PR-3 will add a /api/mesh/dm/ack endpoint
with the same HMAC pattern.
* recipient pull-from-peers: today the recipient's poll only hits the
relay of the node they are logged into. If that node is not one that
Alice's node pushed to, the recipient needs a way to fetch envelopes
from other peers in MESH_RELAY_PEERS. Delivery works today as long as
the recipient's current node is one of the peers Alice's node pushed
to, which is true in a fully-meshed deployment but not guaranteed for
partial meshes. Deferred to PR-4 if telemetry shows this matters.
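
Both gaps can be seen in a toy in-memory model. This is only a sketch:
the Node class, its method names and the synchronous push are
hypothetical simplifications, with no HMAC, cap or TTL.

```python
from collections import defaultdict


class Node:
    """Toy relay node: a mailbox per recipient plus a list of push targets."""

    def __init__(self, name):
        self.name = name
        self.push_targets = []              # peers this node replicates to
        self.mailboxes = defaultdict(list)  # recipient -> [(sender, text)]

    def accept_replica(self, sender, recipient, text):
        self.mailboxes[recipient].append((sender, text))

    def deposit(self, sender, recipient, text):
        self.accept_replica(sender, recipient, text)  # local accept
        for peer in self.push_targets:                # replication push
            peer.accept_replica(sender, recipient, text)

    def poll(self, recipient):
        # Pulling empties only THIS node's mailbox; no ack reaches the peers.
        return self.mailboxes.pop(recipient, [])

    def pending(self, sender, recipient):
        return sum(1 for s, _ in self.mailboxes[recipient] if s == sender)


x, y, z, w = Node("x"), Node("y"), Node("z"), Node("w")
x.push_targets = [y, z]  # partial mesh: w is never pushed to
x.deposit("alice", "bob", "hi")

got = y.poll("bob")  # Bob logs into y and receives the message
left = {n.name: n.pending("alice", "bob") for n in (x, y, z)}
missed = w.poll("bob")  # Bob on w finds nothing
```

After Bob polls y, left still counts one pending envelope on x and z,
which is the quota that ack propagation would free. missed is empty
because w was outside the push set, which is the partial-mesh case that
pull-from-peers would address.
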