docs: document real browser mode + Chrome extension in BROWSER.md and README.md

BROWSER.md: new sections for connect/disconnect/focus commands,
Chrome extension Side Panel install, CDP-aware skills, activity streaming.
Updated command reference table, key components, env vars, source map.

README.md: updated /browse description, added "Real browser mode" to
What's New section.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
Garry Tan
2026-03-21 12:52:33 -07:00
parent 6e9ad62805
commit c0a80ee77c
2 changed files with 68 additions and 1 deletions
+65
View File
@@ -18,6 +18,7 @@ This document covers the command reference and internals of gstack's headless br
| Cookies | `cookie-import`, `cookie-import-browser` | Import cookies from file or real browser |
| Multi-step | `chain` (JSON from stdin) | Batch commands in one call |
| Handoff | `handoff [reason]`, `resume` | Switch to visible Chrome for user takeover |
| Real browser | `connect`, `disconnect`, `focus` | Control real Chrome, visible window |
All selector arguments accept CSS selectors, `@e` refs after `snapshot`, or `@c` refs after `snapshot -C`. 50+ commands total plus cookie import.
@@ -70,6 +71,8 @@ browse/
│ ├── cookie-import-browser.ts # Decrypt + import cookies from real Chromium browsers
│ ├── cookie-picker-routes.ts # HTTP routes for interactive cookie picker UI
│ ├── cookie-picker-ui.ts # Self-contained HTML/CSS/JS for cookie picker
│ ├── chrome-launcher.ts # Browser discovery, CDP probe, runtime detection
│ ├── activity.ts # Activity streaming (SSE) for Chrome extension
│ └── buffers.ts # CircularBuffer<T> + console/network/dialog capture
├── test/ # Integration tests + HTML fixtures
└── dist/
@@ -124,6 +127,64 @@ The server hooks into Playwright's `page.on('console')`, `page.on('response')`,
The `console`, `network`, and `dialog` commands read from the in-memory buffers, not disk.
### Real browser mode (`connect`)
Instead of headless Chromium, `connect` launches your real Chrome as a headed window controlled by Playwright. You see everything Claude does in real time.
```bash
$B connect # launch real Chrome, headed
$B goto https://app.com # navigates in the visible window
$B snapshot -i # refs from the real page
$B click @e3 # clicks in the real window
$B focus # bring Chrome window to foreground (macOS)
$B status # shows Mode: cdp
$B disconnect # back to headless mode
```
The window has a subtle green shimmer line at the top edge and a floating "gstack" pill in the bottom-right corner so you always know which Chrome window is being controlled.
**How it works:** Playwright's `channel: 'chrome'` launches your system Chrome binary via a native pipe protocol — not CDP WebSocket. All existing browse commands work unchanged because they go through Playwright's abstraction layer.
**When to use it:**
- QA testing where you want to watch Claude click through your app
- Design review where you need to see exactly what Claude sees
- Debugging where headless behavior differs from real Chrome
- Demos where you're sharing your screen
**Commands:**
| Command | What it does |
|---------|-------------|
| `connect` | Launch real Chrome, restart server in headed mode |
| `disconnect` | Close real Chrome, restart in headless mode |
| `focus` | Bring Chrome to foreground (macOS). `focus @e3` also scrolls element into view |
| `status` | Shows `Mode: cdp` when connected, `Mode: launched` when headless |
**CDP-aware skills:** When in real-browser mode, `/qa` and `/design-review` automatically skip cookie import prompts and headless workarounds.
### Chrome extension (Side Panel)
A Manifest V3 Chrome extension shows live browse activity in a Side Panel:
```
extension/
manifest.json — Manifest V3, sidePanel permission
background.js — polls /health, relays refs, badge status
sidepanel.html/js — live activity feed, dark theme
popup.html/js — port config, connection status
content.js/css — @ref overlays + connection pill
```
**Install:** `chrome://extensions` → Developer mode → Load unpacked → select `extension/`
**Setup:** Click the gstack icon → enter the browse server port (shown by `$B status` or in `.gstack/browse.json`).
**Features:**
- Toolbar badge: green when connected, gray when not
- Side Panel: live feed of every browse command and result
- @ref overlays: floating panel showing current refs on the page
- Connection pill: subtle indicator on every page when connected
### User handoff
When the headless browser can't proceed (CAPTCHA, MFA, complex auth), `handoff` opens a visible Chrome window at the exact same page with all cookies, localStorage, and tabs preserved. The user solves the problem manually, then `resume` returns control to the agent with a fresh snapshot.
@@ -171,6 +232,8 @@ No port collisions. No shared state. Each project is fully isolated.
| `BROWSE_IDLE_TIMEOUT` | 1800000 (30 min) | Idle shutdown timeout in ms |
| `BROWSE_STATE_FILE` | `.gstack/browse.json` | Path to state file (CLI passes to server) |
| `BROWSE_SERVER_SCRIPT` | auto-detected | Path to server.ts |
| `BROWSE_CDP_URL` | (none) | Set to `channel:chrome` for real browser mode |
| `BROWSE_CDP_PORT` | 0 | CDP port (used internally) |
### Performance
@@ -250,6 +313,8 @@ Tests spin up a local HTTP server (`browse/test/test-server.ts`) serving HTML fi
| `browse/src/cookie-import-browser.ts` | Decrypt Chromium cookies via macOS Keychain + PBKDF2/AES-128-CBC. Auto-detects installed browsers. |
| `browse/src/cookie-picker-routes.ts` | HTTP routes for `/cookie-picker/*` — browser list, domain search, import, remove. |
| `browse/src/cookie-picker-ui.ts` | Self-contained HTML generator for the interactive cookie picker (dark theme, no frameworks). |
| `browse/src/chrome-launcher.ts` | Browser binary discovery, CDP port probe, runtime detection (Conductor/Claude Code/Codex/terminal). |
| `browse/src/activity.ts` | Activity streaming — `ActivityEntry` type, `CircularBuffer`, privacy filtering, SSE subscriber management. |
| `browse/src/buffers.ts` | `CircularBuffer<T>` (O(1) ring buffer) + console/network/dialog capture with async disk flush. |
### Deploying to the active skill
+3 -1
View File
@@ -142,7 +142,7 @@ One sprint, one person, one feature — that takes about 30 minutes with gstack.
| `/ship` | **Release Engineer** | Sync main, run tests, audit coverage, push, open PR. Bootstraps test frameworks if you don't have one. One command. |
| `/document-release` | **Technical Writer** | Update all project docs to match what you just shipped. Catches stale READMEs automatically. |
| `/retro` | **Eng Manager** | Team-aware weekly retro. Per-person breakdowns, shipping streaks, test health trends, growth opportunities. |
| `/browse` | **QA Engineer** | Give the agent eyes. Real Chromium browser, real clicks, real screenshots. ~100ms per command. |
| `/browse` | **QA Engineer** | Give the agent eyes. Real Chromium browser, real clicks, real screenshots. ~100ms per command. `$B connect` launches your real Chrome as a headed window — watch every action live. |
| `/setup-browser-cookies` | **Session Manager** | Import cookies from your real browser (Chrome, Arc, Brave, Edge) into the headless session. Test authenticated pages. |
### Power tools
@@ -172,6 +172,8 @@ One sprint, one person, one feature — that takes about 30 minutes with gstack.
**`/document-release` is the engineer you never had.** It reads every doc file in your project, cross-references the diff, and updates everything that drifted. README, ARCHITECTURE, CONTRIBUTING, CLAUDE.md, TODOS — all kept current automatically. And now `/ship` auto-invokes it — docs stay current without an extra command.
**Real browser mode.** `$B connect` launches your actual Chrome as a headed window controlled by Playwright. You watch Claude click, fill, and navigate in real time — same window, same screen. A subtle green shimmer at the top edge tells you which Chrome window gstack controls. All existing browse commands work unchanged. `$B disconnect` returns to headless. A Chrome extension Side Panel shows a live activity feed of every command. This is co-presence — Claude isn't remote-controlling a hidden browser, it's sitting next to you in the same cockpit.
**Browser handoff when the AI gets stuck.** Hit a CAPTCHA, auth wall, or MFA prompt? `$B handoff` opens a visible Chrome at the exact same page with all your cookies and tabs intact. Solve the problem, tell Claude you're done, `$B resume` picks up right where it left off. The agent even suggests it automatically after 3 consecutive failures.
**Multi-AI second opinion.** `/codex` gets an independent review from OpenAI's Codex CLI — a completely different AI looking at the same diff. Three modes: code review with a pass/fail gate, adversarial challenge that actively tries to break your code, and open consultation with session continuity. When both `/review` (Claude) and `/codex` (OpenAI) have reviewed the same branch, you get a cross-model analysis showing which findings overlap and which are unique to each.