docs: TODOS cleanup + Chrome vs Chromium exploration doc

- Update TODOS.md: mark CDP mode, $B watch, sidebar scout as SHIPPED
- Delete dead "cross-platform CDP browser discovery" TODO
- Rename dependencies from "CDP connect" to "headed mode"
- Add docs/designs/CHROME_VS_CHROMIUM_EXPLORATION.md memorializing
  the architecture exploration and decision to use Playwright Chromium

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Garry Tan
2026-03-22 21:40:14 -07:00
parent 82708f1405
commit 6b4df9384e
2 changed files with 93 additions and 23 deletions
+9 -23
@@ -131,43 +131,29 @@
**Effort:** L
**Priority:** P4
### CDP mode — SHIPPED (Phase 1)
### Headed mode with Chrome extension — SHIPPED
`$B connect` connects to real Chrome/Comet via CDP. All existing browse commands work unchanged. Chrome extension with Side Panel activity feed. See `browse/src/chrome-launcher.ts`.
`$B connect` launches Playwright's bundled Chromium in headed mode with the gstack Chrome extension auto-loaded. `$B handoff` now produces the same result (extension + side panel). Sidebar chat gated behind `--chat` flag.
### `$B watch` — passive observation mode
### `$B watch` — SHIPPED
**What:** Claude observes your browsing without interacting. Captures snapshots, console logs, network requests as you navigate. "Watch me do this, then you do the same."
Claude observes user browsing in passive read-only mode with periodic snapshots. `$B watch stop` exits with summary. Mutation commands blocked during watch.
**Why:** Bridges the gap between "Claude controls my browser" and "Claude learns from me." Enables flow recording for QA regression tests.
### Sidebar scout / file drop relay — SHIPPED
**Context:** Requires CDP connect (shipped). Would add a new browse command that enters read-only mode with periodic snapshot capture. User demonstrates a flow, Claude records it, then can reproduce.
**Effort:** M (human: ~1 week / CC: ~30 min)
**Priority:** P2
**Depends on:** CDP connect (shipped)
Sidebar agent writes structured messages to `.context/sidebar-inbox/`. Workspace agent reads via `$B inbox`. Message format: `{type, timestamp, page, userMessage, sidebarSessionId}`.
### Multi-agent tab isolation
**What:** Two Claude sessions connect to the same Chrome, each operating on different tabs. No cross-contamination.
**What:** Two Claude sessions connect to the same browser, each operating on different tabs. No cross-contamination.
**Why:** Enables parallel /qa + /design-review on different tabs in the same browser.
**Context:** Requires tab ownership model for concurrent CDP connections. Playwright may not cleanly support two `connectOverCDP` sessions to the same browser. Needs investigation.
**Context:** Requires tab ownership model for concurrent headed connections. Playwright may not cleanly support two persistent contexts. Needs investigation.
**Effort:** L (human: ~2 weeks / CC: ~2 hours)
**Priority:** P3
**Depends on:** CDP connect (shipped)
### Cross-platform CDP browser discovery
**What:** Extend browser discovery algorithm to Windows (`where chrome`, registry lookup) and Linux (`which google-chrome`, XDG paths). Focus command via wmctrl (Linux) and PowerShell (Windows).
**Why:** gstack already has Windows support (Node.js fallback). CDP connect should follow.
**Effort:** M (human: ~1 week / CC: ~30 min)
**Priority:** P3
**Depends on:** CDP connect (shipped)
**Depends on:** Headed mode (shipped)
### Chrome Web Store publishing
@@ -0,0 +1,84 @@
# Chrome vs Chromium: Why We Use Playwright's Bundled Chromium
## The Original Vision
When we built `$B connect`, the plan was to connect to the user's **real Chrome browser** — the one with their cookies, sessions, extensions, and open tabs. No more cookie import. The design called for:
1. `chromium.connectOverCDP(wsUrl)` connecting to a running Chrome via CDP
2. Quit Chrome gracefully, relaunch with `--remote-debugging-port=9222`
3. Access the user's real browsing context
This is why `chrome-launcher.ts` existed (361 LOC of browser binary discovery, CDP port probing, and runtime detection) and why the method was called `connectCDP()`.
## What Actually Happened
Real Chrome silently blocks `--load-extension` when launched via Playwright's `channel: 'chrome'`. The extension wouldn't load. We needed the extension for the side panel (activity feed, refs, chat).
The implementation fell back to `chromium.launchPersistentContext()` with Playwright's bundled Chromium — which reliably loads extensions via `--load-extension` and `--disable-extensions-except`. But the naming stayed: `connectCDP()`, `connectionMode: 'cdp'`, `BROWSE_CDP_URL`, `chrome-launcher.ts`.
The original vision (access to the user's real browser state) was never implemented. We launched a fresh Playwright Chromium instance every time, so the behavior was that of plain Playwright Chromium, but with 361 lines of dead code and misleading names.
## The Discovery (2026-03-22)
During a `/office-hours` design session, we traced the architecture and discovered:
1. `connectCDP()` doesn't use CDP — it calls `launchPersistentContext()`
2. `connectionMode: 'cdp'` is misleading — it's just "headed mode"
3. `chrome-launcher.ts` is dead code — its only import was in an unreachable `attemptReconnect()` method
4. `preExistingTabIds` was designed for protecting real Chrome tabs we never connect to
5. `$B handoff` (headless → headed) used a different API (`launch()` + `newContext()`) that couldn't load extensions, creating two different "headed" experiences
## The Fix
### Renamed
- `connectCDP()` → `launchHeaded()`
- `connectionMode: 'cdp'` → `connectionMode: 'headed'`
- `BROWSE_CDP_URL` → `BROWSE_HEADED`
### Deleted
- `chrome-launcher.ts` (361 LOC)
- `attemptReconnect()` (dead method)
- `preExistingTabIds` (dead concept)
- `reconnecting` field (dead state)
- `cdp-connect.test.ts` (tests for deleted code)
### Converged
- `$B handoff` now uses `launchPersistentContext()` + extension loading (same as `$B connect`)
- One headed mode, not two
- Handoff gives you the extension + side panel for free
### Gated
- Sidebar chat behind `--chat` flag
- `$B connect` (default): activity feed + refs only
- `$B connect --chat`: + experimental standalone chat agent
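
A minimal sketch of the gating rule above, assuming the `$B connect` arguments arrive as a plain string array. The `SidebarTab` type and the `sidebarTabs` function are hypothetical names for illustration, not the actual gstack implementation.

```typescript
// Hypothetical tab names, taken from the sidebar description in this doc.
type SidebarTab = "activity" | "refs" | "chat";

// Activity and refs are always on; chat is opt-in via --chat.
function sidebarTabs(connectArgs: string[]): SidebarTab[] {
  const tabs: SidebarTab[] = ["activity", "refs"];
  if (connectArgs.includes("--chat")) {
    tabs.push("chat");
  }
  return tabs;
}
```

Keeping the default set fixed and appending the experimental tab means the chat agent never starts unless the flag is passed explicitly.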
## Architecture (after)
```
Browser States:
HEADLESS (default)  ←→  HEADED ($B connect or $B handoff)
Playwright              Playwright (same engine)
launch()                launchPersistentContext()
invisible               visible + extension + side panel
Sidebar (orthogonal add-on, headed only):
Activity tab — always on, shows live browse commands
Refs tab — always on, shows @ref overlays
Chat tab — opt-in via --chat, experimental standalone agent
Data Bridge (sidebar → workspace):
Sidebar writes to .context/sidebar-inbox/*.json
Workspace reads via $B inbox
```
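
The data bridge in the diagram above could be validated on the workspace side roughly as follows. The field names come from the message format in TODOS.md (`{type, timestamp, page, userMessage, sidebarSessionId}`); the field types, the `SidebarMessage` interface, and the `parseInboxMessage` helper are assumptions made for this sketch.

```typescript
// Field names are from TODOS.md; treating every field as a string is an assumption.
interface SidebarMessage {
  type: string;
  timestamp: string; // assumed to be an ISO 8601 string
  page: string; // assumed to be the URL of the page the sidebar was attached to
  userMessage: string;
  sidebarSessionId: string;
}

const REQUIRED_FIELDS = [
  "type",
  "timestamp",
  "page",
  "userMessage",
  "sidebarSessionId",
] as const;

// Type guard: true only if every required field is present and is a string.
function isSidebarMessage(value: unknown): value is SidebarMessage {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return REQUIRED_FIELDS.every((field) => typeof record[field] === "string");
}

// Hypothetical helper: parse the text of one inbox file.
// Returns null for invalid JSON or a message missing required fields.
function parseInboxMessage(raw: string): SidebarMessage | null {
  try {
    const parsed: unknown = JSON.parse(raw);
    return isSidebarMessage(parsed) ? parsed : null;
  } catch {
    return null;
  }
}
```

Returning `null` instead of throwing lets a reader skip a malformed file in `.context/sidebar-inbox/` without aborting the rest of the inbox.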
## Why Not Real Chrome?
Real Chrome blocks `--load-extension` when launched by Playwright. This is a Chrome security feature: branded Chrome builds restrict extensions loaded via command-line args to prevent malicious extension injection.
Playwright's bundled Chromium doesn't have this restriction because it's designed for testing and automation. The `ignoreDefaultArgs` option lets us bypass Playwright's own extension-blocking flags.
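
As a rough sketch of the options involved, the helper below builds the arguments for a headed launch with one unpacked extension. `buildHeadedLaunchOptions` and `HeadedLaunchOptions` are hypothetical names, and the assumption that `--disable-extensions` is the default flag to drop is ours; the real `launchHeaded()` may differ.

```typescript
// Subset of Playwright's launch options used in this sketch.
interface HeadedLaunchOptions {
  headless: boolean;
  args: string[];
  ignoreDefaultArgs: string[];
}

// Hypothetical helper: options for a headed Chromium with one unpacked extension.
function buildHeadedLaunchOptions(extensionPath: string): HeadedLaunchOptions {
  return {
    headless: false,
    args: [
      // Allow only this extension, then load it from disk.
      `--disable-extensions-except=${extensionPath}`,
      `--load-extension=${extensionPath}`,
    ],
    // Assumption: --disable-extensions is the default flag that must be dropped
    // so the extension can load.
    ignoreDefaultArgs: ["--disable-extensions"],
  };
}

// Usage with Playwright (not executed here; requires the `playwright` package):
//   import { chromium } from "playwright";
//   const context = await chromium.launchPersistentContext(
//     userDataDir,
//     buildHeadedLaunchOptions(extensionDir),
//   );
```

Building the options in a pure function keeps `$B connect` and `$B handoff` on the same launch configuration, which is the convergence described under "The Fix".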
If we ever want to access the user's real cookies/sessions, the path is:
1. Cookie import (already works via `$B cookie-import`)
2. Conductor session injection (future — sidebar sends messages to workspace agent)
Not reconnecting to real Chrome.