From 6b4df9384e7c9f892ebc4274fa2ab6aa0d857b4b Mon Sep 17 00:00:00 2001 From: Garry Tan Date: Sun, 22 Mar 2026 21:40:14 -0700 Subject: [PATCH] docs: TODOS cleanup + Chrome vs Chromium exploration doc - Update TODOS.md: mark CDP mode, $B watch, sidebar scout as SHIPPED - Delete dead "cross-platform CDP browser discovery" TODO - Rename dependencies from "CDP connect" to "headed mode" - Add docs/designs/CHROME_VS_CHROMIUM_EXPLORATION.md memorializing the architecture exploration and decision to use Playwright Chromium Co-Authored-By: Claude Opus 4.6 (1M context) --- TODOS.md | 32 ++----- .../designs/CHROME_VS_CHROMIUM_EXPLORATION.md | 84 +++++++++++++++++++ 2 files changed, 93 insertions(+), 23 deletions(-) create mode 100644 docs/designs/CHROME_VS_CHROMIUM_EXPLORATION.md diff --git a/TODOS.md b/TODOS.md index ec657601..1f805a7d 100644 --- a/TODOS.md +++ b/TODOS.md @@ -131,43 +131,29 @@ **Effort:** L **Priority:** P4 -### CDP mode — SHIPPED (Phase 1) +### Headed mode with Chrome extension — SHIPPED -`$B connect` connects to real Chrome/Comet via CDP. All existing browse commands work unchanged. Chrome extension with Side Panel activity feed. See `browse/src/chrome-launcher.ts`. +`$B connect` launches Playwright's bundled Chromium in headed mode with the gstack Chrome extension auto-loaded. `$B handoff` now produces the same result (extension + side panel). Sidebar chat gated behind `--chat` flag. -### `$B watch` — passive observation mode +### `$B watch` — SHIPPED -**What:** Claude observes your browsing without interacting. Captures snapshots, console logs, network requests as you navigate. "Watch me do this, then you do the same." +Claude observes user browsing in passive read-only mode with periodic snapshots. `$B watch stop` exits with summary. Mutation commands blocked during watch. -**Why:** Bridges the gap between "Claude controls my browser" and "Claude learns from me." Enables flow recording for QA regression tests. +### Sidebar scout / file drop relay — SHIPPED -**Context:** Requires CDP connect (shipped). Would add a new browse command that enters read-only mode with periodic snapshot capture. User demonstrates a flow, Claude records it, then can reproduce. - -**Effort:** M (human: ~1 week / CC: ~30 min) -**Priority:** P2 -**Depends on:** CDP connect (shipped) +Sidebar agent writes structured messages to `.context/sidebar-inbox/`. Workspace agent reads via `$B inbox`. Message format: `{type, timestamp, page, userMessage, sidebarSessionId}`. ### Multi-agent tab isolation -**What:** Two Claude sessions connect to the same Chrome, each operating on different tabs. No cross-contamination. +**What:** Two Claude sessions connect to the same browser, each operating on different tabs. No cross-contamination. **Why:** Enables parallel /qa + /design-review on different tabs in the same browser. -**Context:** Requires tab ownership model for concurrent CDP connections. Playwright may not cleanly support two `connectOverCDP` sessions to the same browser. Needs investigation. +**Context:** Requires tab ownership model for concurrent headed connections. Playwright may not cleanly support two persistent contexts. Needs investigation. **Effort:** L (human: ~2 weeks / CC: ~2 hours) **Priority:** P3 -**Depends on:** CDP connect (shipped) - -### Cross-platform CDP browser discovery - -**What:** Extend browser discovery algorithm to Windows (`where chrome`, registry lookup) and Linux (`which google-chrome`, XDG paths). Focus command via wmctrl (Linux) and PowerShell (Windows). - -**Why:** gstack already has Windows support (Node.js fallback). CDP connect should follow. - -**Effort:** M (human: ~1 week / CC: ~30 min) -**Priority:** P3 -**Depends on:** CDP connect (shipped) +**Depends on:** Headed mode (shipped) ### Chrome Web Store publishing diff --git a/docs/designs/CHROME_VS_CHROMIUM_EXPLORATION.md b/docs/designs/CHROME_VS_CHROMIUM_EXPLORATION.md new file mode 100644 index 00000000..55c078d1 --- /dev/null +++ b/docs/designs/CHROME_VS_CHROMIUM_EXPLORATION.md @@ -0,0 +1,84 @@ +# Chrome vs Chromium: Why We Use Playwright's Bundled Chromium + +## The Original Vision + +When we built `$B connect`, the plan was to connect to the user's **real Chrome browser** — the one with their cookies, sessions, extensions, and open tabs. No more cookie import. The design called for: + +1. `chromium.connectOverCDP(wsUrl)` connecting to a running Chrome via CDP +2. Quit Chrome gracefully, relaunch with `--remote-debugging-port=9222` +3. Access the user's real browsing context + +This is why `chrome-launcher.ts` existed (361 LOC of browser binary discovery, CDP port probing, and runtime detection) and why the method was called `connectCDP()`. + +## What Actually Happened + +Real Chrome silently blocks `--load-extension` when launched via Playwright's `channel: 'chrome'`. The extension wouldn't load. We needed the extension for the side panel (activity feed, refs, chat). + +The implementation fell back to `chromium.launchPersistentContext()` with Playwright's bundled Chromium — which reliably loads extensions via `--load-extension` and `--disable-extensions-except`. But the naming stayed: `connectCDP()`, `connectionMode: 'cdp'`, `BROWSE_CDP_URL`, `chrome-launcher.ts`. + +The original vision (access user's real browser state) was never implemented. We launched a fresh browser every time — functionally identical to Playwright's Chromium, but with 361 lines of dead code and misleading names. + +## The Discovery (2026-03-22) + +During a `/office-hours` design session, we traced the architecture and discovered: + +1. `connectCDP()` doesn't use CDP — it calls `launchPersistentContext()` +2. `connectionMode: 'cdp'` is misleading — it's just "headed mode" +3. `chrome-launcher.ts` is dead code — its only import was in an unreachable `attemptReconnect()` method +4. `preExistingTabIds` was designed for protecting real Chrome tabs we never connect to +5. `$B handoff` (headless → headed) used a different API (`launch()` + `newContext()`) that couldn't load extensions, creating two different "headed" experiences + +## The Fix + +### Renamed +- `connectCDP()` → `launchHeaded()` +- `connectionMode: 'cdp'` → `connectionMode: 'headed'` +- `BROWSE_CDP_URL` → `BROWSE_HEADED` + +### Deleted +- `chrome-launcher.ts` (361 LOC) +- `attemptReconnect()` (dead method) +- `preExistingTabIds` (dead concept) +- `reconnecting` field (dead state) +- `cdp-connect.test.ts` (tests for deleted code) + +### Converged +- `$B handoff` now uses `launchPersistentContext()` + extension loading (same as `$B connect`) +- One headed mode, not two +- Handoff gives you the extension + side panel for free + +### Gated +- Sidebar chat behind `--chat` flag +- `$B connect` (default): activity feed + refs only +- `$B connect --chat`: + experimental standalone chat agent + +## Architecture (after) + +``` +Browser States: + HEADLESS (default) ←→ HEADED ($B connect or $B handoff) + Playwright Playwright (same engine) + launch() launchPersistentContext() + invisible visible + extension + side panel + +Sidebar (orthogonal add-on, headed only): + Activity tab — always on, shows live browse commands + Refs tab — always on, shows @ref overlays + Chat tab — opt-in via --chat, experimental standalone agent + +Data Bridge (sidebar → workspace): + Sidebar writes to .context/sidebar-inbox/*.json + Workspace reads via $B inbox +``` + +## Why Not Real Chrome? + +Real Chrome blocks `--load-extension` when launched by Playwright. This is a Chrome security feature — extensions loaded via command-line args are restricted in Chromium-based browsers to prevent malicious extension injection. + +Playwright's bundled Chromium doesn't have this restriction because it's designed for testing and automation. The `ignoreDefaultArgs` option lets us bypass Playwright's own extension-blocking flags. + +If we ever want to access the user's real cookies/sessions, the path is: +1. Cookie import (already works via `$B cookie-import`) +2. Conductor session injection (future — sidebar sends messages to workspace agent) + +Not reconnecting to real Chrome.