diff --git a/CLAUDE.md b/CLAUDE.md index dfe9df23..8699d49a 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -225,13 +225,25 @@ When you need to interact with a browser (QA, dogfooding, cookie setup), use the project uses. **Sidebar architecture:** Before modifying `sidepanel.js`, `background.js`, -`content.js`, `sidebar-agent.ts`, or sidebar-related server endpoints, read -`docs/designs/SIDEBAR_MESSAGE_FLOW.md`. It documents the full initialization -timeline, message flow, auth token chain, tab concurrency model, and known -failure modes. The sidebar spans 5 files across 2 codebases (extension + server) -with non-obvious ordering dependencies. The doc exists to prevent the kind of +`content.js`, `sidebar-agent.ts`, `terminal-agent.ts`, or sidebar-related +server endpoints, read `docs/designs/SIDEBAR_MESSAGE_FLOW.md`. It documents +the full initialization timeline, message flow, auth token chain, tab +concurrency model, the Terminal-tab PTY flow, and known failure modes. +The sidebar spans 6 files across 2 codebases (extension + server) with +non-obvious ordering dependencies. The doc exists to prevent the kind of silent failures that come from not understanding the cross-component flow. +**Terminal tab is its own process.** `terminal-agent.ts` is a separate +non-compiled bun process from `sidebar-agent.ts`. Do not bolt PTY logic +onto sidebar-agent — codex confirmed it would couple chat reliability to +PTY framing bugs. Cookie minting (`pty-session-cookie.ts`) lives in the +server; the cookie travels via `Set-Cookie` and back via `Cookie:` on the +WebSocket upgrade. The WS upgrade gates on Origin AND cookie; both are +load-bearing for the Terminal tab to be safe. `/health` MUST NOT surface +the cookie value or any shell-grant token (codex finding: existing +`AUTH_TOKEN` is already exposed there in headed mode; that's a separate +v1.1+ TODO, not something to widen). + **Transport-layer security** (v1.6.0.0+). When `pair-agent` starts an ngrok tunnel, the daemon binds two HTTP listeners: a local listener (127.0.0.1, full command surface, never forwarded) and a tunnel listener (locked allowlist: `/connect`, diff --git a/TODOS.md b/TODOS.md index a250cd50..5e76c056 100644 --- a/TODOS.md +++ b/TODOS.md @@ -1,5 +1,79 @@ # TODOS +## Sidebar Terminal (cc-pty-import follow-ups) + +### v1.1: PTY session survives sidebar reload + +**What:** Today the Terminal tab's PTY dies with the WebSocket — sidebar +reload, side-panel close, even a quick navigate-away in another tab close +the session. v1.1 should key the PTY on a tab/session id so a reload +reattaches to the existing claude process and you keep `/resume` history. + +**Why:** Mid-task resilience. When you've been pair-programming with claude +for 20 minutes and an accidental Cmd-R blows it away, the cost is real. + +**Pros:** Better UX, fewer interrupted sessions. **Cons:** Session-tracking +state, ghost-process risk, lifecycle bugs (when DOES the PTY actually go +away?). v1 chose the simple "PTY dies with WS" model deliberately. + +**Context:** /plan-eng-review Issue 1C decision (cc-pty-import branch, +2026-04-25). v1 ships with phoenix's lifecycle. **Depends on:** +cc-pty-import landed. + +**Priority:** P2 (nice-to-have). +**Effort:** M. Likely needs a per-tab session map keyed by chrome.tabs.id +plus a TTL so abandoned PTYs eventually exit. + +--- + +### v1.1+: Audit `/health` token distribution + +**What:** Codex's outside-voice review on cc-pty-import flagged that +`/health` already surfaces `AUTH_TOKEN` to any localhost caller in headed +mode (`server.ts:1657`). That's a pre-existing soft leak — anything +running on localhost gets the root token by hitting `/health`. + +**Why:** cc-pty-import sidesteps it by NOT putting the PTY token there +(uses an HttpOnly cookie path instead). But the underlying leak is still +shippable surface. A second extension or a localhost web app could +currently scrape `AUTH_TOKEN` and hit any browse-server endpoint. + +**Pros:** Closes a real privilege-escalation path on multi-extension +machines. **Cons:** Either we tighten the gate (Origin must be OUR +extension id, not just any chrome-extension://) or we move bootstrap +discovery off `/health` entirely. Either has migration cost for tests +and the existing extension. + +**Context:** codex finding #2 on cc-pty-import plan-eng review. Not in +scope of that PR; deliberately deferred to keep PTY-import small. + +**Priority:** P2. +**Effort:** M. + +--- + +### v1.1+: Apply terminal-agent's exception handlers to sidebar-agent + +**What:** While reviewing cc-pty-import, codex noted that `sidebar-agent.ts` +has no `process.on('uncaughtException'|'unhandledRejection')` handlers. +A bug in claude stream parsing or queue I/O can take down the chat path +silently. terminal-agent.ts ships with these handlers; sidebar-agent +should get them too. + +**Why:** Today a single uncaught exception in chat = entire sidebar chat +dies and nothing tells the user. The CLI doesn't supervise the agent. + +**Pros:** Chat survives transient bugs. **Cons:** Catching uncaught +exceptions can hide real failures — pair the handlers with structured +logging so we still see the bug. + +**Context:** codex finding #4 on cc-pty-import plan-eng review. + +**Priority:** P2. +**Effort:** S. + +--- + ## Testing ### Pre-existing test failures surfaced during v1.12.0.0 ship diff --git a/docs/designs/SIDEBAR_MESSAGE_FLOW.md b/docs/designs/SIDEBAR_MESSAGE_FLOW.md index 050d428b..7d12faa2 100644 --- a/docs/designs/SIDEBAR_MESSAGE_FLOW.md +++ b/docs/designs/SIDEBAR_MESSAGE_FLOW.md @@ -188,3 +188,134 @@ Each browser tab can run its own agent simultaneously: | Queue file | `~/.gstack/sidebar-agent-queue.jsonl` | Filesystem | | State file | `.gstack/browse.json` | Filesystem | | Chat log | `~/.gstack/sessions//chat.jsonl` | Filesystem | + +## Terminal flow + +The sidebar has a second primary tab next to Chat: **Terminal**. Where Chat +spawns one-shot `claude -p` per message, Terminal runs **interactive +`claude` in a real PTY** with xterm.js as the renderer. + +### Components + +``` +┌─────────────────┐ ┌──────────────┐ ┌──────────────────┐ +│ sidepanel.js + │────▶│ server.ts │────▶│terminal-agent.ts │ +│ -terminal.js │ │ (compiled) │ │ (non-compiled) │ +│ (xterm.js) │ │ │ │ PTY listener │ +└─────────────────┘ └──────────────┘ └──────────────────┘ + ▲ │ │ + │ ws://127.0.0.1:/ws (cookie auth) │ Bun.spawn(claude) + └───────────────────────┼──────────────────────▶│ terminal: {data} + │ ▼ + │ ┌──────────────────┐ + │ │ claude PTY │ + │ └──────────────────┘ + POST /pty-session │ + (Bearer AUTH_TOKEN) │ + ▼ + ┌──────────────────┐ + │ pty-session- │ + │ cookie.ts │ + │ (HttpOnly cookie)│ + └──────────────────┘ + │ + │ POST /internal/grant (loopback) + ▼ + ┌──────────────────┐ + │ validTokens Set │ + │ in agent memory │ + └──────────────────┘ +``` + +### Startup + first-key timeline + +``` +T+0ms CLI runs `$B connect` + ├── Server starts (compiled) + └── Spawns terminal-agent.ts via `bun run` + +T+500ms terminal-agent.ts boots + ├── Bun.serve on 127.0.0.1:0 (random port) + ├── Writes /terminal-port (server reads it for /health) + ├── Writes /terminal-internal-token (loopback handshake) + └── Probes claude → writes claude-available.json + +T+1-3s Extension loads, sidebar opens + ├── Terminal tab is default-active + ├── sidepanel-terminal.js: setState(IDLE), shows "Press any key" + └── No PTY spawned yet (lazy) + +T+user-keys First keystroke fires onAnyKey + ├── POST /pty-session (Authorization: Bearer AUTH_TOKEN) + │ └── server mints cookie, posts /internal/grant to agent + │ └── responds with Set-Cookie: gstack_pty= + │ └── responds with terminalPort + ├── GET /claude-available (preflight) + ├── new WebSocket(ws://127.0.0.1:/ws) + │ └── Browser carries gstack_pty cookie + Origin automatically + │ └── Agent validates Origin AND cookie BEFORE upgrading + ├── On upgrade success, send {type:"resize"} then a single byte + └── Agent message handler sees first byte → spawnClaude() +``` + +### Dual-token model + +| Token | Lives in | Used for | Lifetime | +|-------|----------|----------|----------| +| `AUTH_TOKEN` | `/browse.json`; in-memory in server.ts | `/pty-session` POST (mint cookie) | server lifetime | +| `gstack_pty` cookie | Browser HttpOnly jar; agent `validTokens` Set | `/ws` upgrade auth | 30 min, dies on WS close | +| `INTERNAL_TOKEN` | `/terminal-internal-token`; in agent memory | server → agent loopback `/internal/grant` | agent lifetime | + +`AUTH_TOKEN` is **never** valid for `/ws` directly. The cookie is **never** +valid for `/pty-session` or `/command`. Strict separation prevents an SSE +or sidebar-chat token leak from escalating into shell access. + +### Threat model + +The Terminal tab **bypasses the entire prompt-injection security stack** +(`content-security.ts` datamarking, `security-classifier.ts` ML scoring, +canary detection, ensemble verdicts). On the Terminal tab the user is +typing directly to claude — there is no untrusted page content in the +loop, so the threat model is "user trusts themselves," same as opening +a terminal locally. + +That trust assumption is load-bearing on three transport-layer guarantees: + +1. **Local-only listener.** `terminal-agent.ts` binds `127.0.0.1` only. + The dual-listener tunnel surface (server.ts:95 `TUNNEL_PATHS`) does + **not** include `/pty-session` or `/terminal/*`, so the tunnel returns + 404 by default-deny. +2. **Origin gate.** `/ws` upgrades require + `Origin: chrome-extension://`. A localhost web page cannot mount a + cross-site WebSocket hijack against the shell because its Origin is + a regular `http(s)://...`. +3. **Cookie auth.** `gstack_pty` is HttpOnly + SameSite=Strict, scoped to + the local listener, minted only by an authenticated `/pty-session` + POST. JS injected into a page can't read it; cross-site requests + can't send it. + +Drop any of those three and the whole tab becomes unsafe. + +### Lifecycle + +- **Lazy spawn**: claude is not started until the user types a key. Idle + sidebar opens cost nothing. +- **One PTY per WS**: closing the WebSocket SIGINTs claude, then SIGKILLs + after 3s. The `gstack_pty` cookie is also revoked so a stolen cookie + can't be replayed against a new PTY. +- **No auto-reconnect**: when the WS closes the user sees "Session ended, + click to start a new session." Auto-reconnect would burn a fresh + claude session every reload. v1.1 may add session resumption keyed on + tab/session id (see TODOS). + +### Files + +| Component | File | Runs in | +|-----------|------|---------| +| Terminal UI | `extension/sidepanel-terminal.js` + xterm.js in `extension/lib/` | Chrome side panel | +| PTY agent | `browse/src/terminal-agent.ts` | Bun (non-compiled, can spawn) | +| Cookie store | `browse/src/pty-session-cookie.ts` | Bun (compiled, in server.ts) | +| Port file | `/terminal-port` | Filesystem | +| Internal token | `/terminal-internal-token` | Filesystem | +| Claude probe | `/claude-available.json` | Filesystem | +| Active tab | `/active-tab.json` | Filesystem (claude reads) |