Browser — technical details
This document covers the command reference and internals of gstack's headless browser.
Command reference
| Category | Commands | What for |
|---|---|---|
| Navigate | `goto`, `back`, `forward`, `reload`, `url` | Get to a page |
| Read | `text`, `html`, `links`, `forms`, `accessibility` | Extract content |
| Snapshot | `snapshot [-i] [-c] [-d N] [-s sel] [-D] [-a] [-o] [-C]` | Get refs, diff, annotate |
| Interact | `click`, `fill`, `select`, `hover`, `type`, `press`, `scroll`, `wait`, `viewport`, `upload` | Use the page |
| Inspect | `js`, `eval`, `css`, `attrs`, `is`, `console`, `network`, `dialog`, `cookies`, `storage`, `perf` | Debug and verify |
| Visual | `screenshot`, `pdf`, `responsive` | See what Claude sees |
| Compare | `diff <url1> <url2>` | Spot differences between environments |
| Dialogs | `dialog-accept [text]`, `dialog-dismiss` | Control alert/confirm/prompt handling |
| Tabs | `tabs`, `tab`, `newtab`, `closetab` | Multi-page workflows |
| Cookies | `cookie-import`, `cookie-import-browser` | Import cookies from file or real browser |
| Multi-step | `chain` (JSON from stdin) | Batch commands in one call |
All selector arguments accept CSS selectors, @e refs after snapshot, or @c refs after snapshot -C. 50+ commands total plus cookie import.
How it works
gstack's browser is a compiled CLI binary that talks to a persistent local Chromium daemon over HTTP. The CLI is a thin client — it reads a state file, sends a command, and prints the response to stdout. The server does the real work via Playwright.
┌─────────────────────────────────────────────────────────────────┐
│  Claude Code                                                    │
│                                                                 │
│  "browse goto https://staging.myapp.com"                        │
│       │                                                         │
│       ▼                                                         │
│  ┌──────────┐    HTTP POST     ┌──────────────┐                 │
│  │ browse   │ ──────────────── │ Bun HTTP     │                 │
│  │ CLI      │  localhost:9400  │ server       │                 │
│  │          │  Bearer token    │              │                 │
│  │ compiled │ ◄──────────────  │ Playwright   │──── Chromium    │
│  │ binary   │  plain text      │ API calls    │     (headless)  │
│  └──────────┘                  └──────────────┘                 │
│   ~1ms startup                 persistent daemon                │
│                                auto-starts on first call        │
│                                auto-stops after 30 min idle     │
└─────────────────────────────────────────────────────────────────┘
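A minimal sketch of the request the thin client builds for one command. Only the HTTP POST, the localhost port, and the bearer token come from the diagram; the state-file field names (`port`, `token`), the `/command` endpoint, and the JSON body shape are assumptions made for illustration.

```typescript
// Hypothetical shape of the state file; the real field names may differ.
interface BrowseState {
  port: number;
  token: string;
}

interface BrowseRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build the HTTP request for one CLI invocation.
// The "/command" path and the JSON body are assumptions, not the documented wire format.
function buildBrowseRequest(state: BrowseState, args: string[]): BrowseRequest {
  const [command, ...rest] = args;
  if (command === undefined) {
    throw new Error("usage: browse <command> [args...]");
  }
  return {
    url: `http://localhost:${state.port}/command`,
    method: "POST",
    headers: {
      Authorization: `Bearer ${state.token}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ command, args: rest }),
  };
}
```

The returned object maps directly onto a `fetch(url, { method, headers, body })` call, and the plain-text response body would be printed to stdout.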
Lifecycle
- First call: The CLI checks `/tmp/browse-server.json` for a running server. If none is found, it spawns `bun run browse/src/server.ts` in the background. The server launches headless Chromium via Playwright, picks a port (9400-9410), generates a bearer token, writes the state file, and starts accepting HTTP requests. This takes ~3 seconds.
- Subsequent calls: The CLI reads the state file, sends an HTTP POST with the bearer token, and prints the response. ~100-200ms round trip.
- Idle shutdown: After 30 minutes with no commands, the server shuts down and cleans up the state file. The next call restarts it automatically.
- Crash recovery: If Chromium crashes, the server exits immediately (no self-healing; failures are not hidden). The CLI detects the dead server on the next call and starts a fresh one.
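The lifecycle above reduces to two small decisions, sketched here as pure functions. The function names and the injected `isFree` probe are hypothetical; only the 9400-9410 range and the spawn-on-missing-or-dead behavior come from the description.

```typescript
type CliAction = "send-command" | "spawn-server";

// Decide what the CLI does before sending a command.
// A missing state file (first call, or after idle shutdown) and a dead
// server (after a crash) both lead to spawning a fresh server.
function nextCliAction(stateFileExists: boolean, serverAlive: boolean): CliAction {
  return stateFileExists && serverAlive ? "send-command" : "spawn-server";
}

// Return the first free port in [start, end], or null if every port is taken.
// `isFree` is injected so the scan can be exercised without opening sockets.
function pickBrowsePort(
  isFree: (port: number) => boolean,
  start = 9400,
  end = 9410,
): number | null {
  for (let port = start; port <= end; port++) {
    if (isFree(port)) return port;
  }
  return null;
}
```

In a real server, `isFree` would wrap an attempt to bind the port; keeping it as a parameter makes the scan logic easy to test in isolation.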
Key components
browse/
├── src/
│   ├── cli.ts                    # Thin client: reads state file, sends HTTP, prints response
│   ├── server.ts                 # Bun.serve HTTP server: routes commands to Playwright
│   ├── browser-manager.ts        # Chromium lifecycle: launch, tabs, ref map, crash handling
│   ├── snapshot.ts               # Accessibility tree → @ref assignment → Locator map + diff/annotate/-C
│   ├── read-commands.ts          # Non-mutating commands (text, html, links, js, css, is, dialog, etc.)
│   ├── write-commands.ts         # Mutating commands (click, fill, select, upload, dialog-accept, etc.)
│   ├── meta-commands.ts          # Server management, chain, diff, snapshot routing
│   ├── cookie-import-browser.ts  # Decrypt + import cookies from real Chromium browsers
│   ├── cookie-picker-routes.ts   # HTTP routes for interactive cookie picker UI
│   ├── cookie-picker-ui.ts       # Self-contained HTML/CSS/JS for cookie picker
│   └── buffers.ts                # CircularBuffer<T> + console/network/dialog capture
├── test/                         # Integration tests + HTML fixtures
└── dist/
    └── browse                    # Compiled binary (~58MB, Bun --compile)
The snapshot system
The browser's key innovation is ref-based element selection, built on Playwright's accessibility tree API:
- `page.locator(scope).ariaSnapshot()` returns a YAML-like accessibility tree
- The snapshot parser assigns refs (`@e1`, `@e2`, ...) to each element
- For each ref, it builds a Playwright `Locator` (using `getByRole` + nth-child)
- The ref-to-Locator map is stored on `BrowserManager`
- Later commands like `click @e3` look up the Locator and call `locator.click()`
No DOM mutation. No injected scripts. Just Playwright's native accessibility API.
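The ref assignment step can be sketched as follows. This is a simplified illustration, not the parser in `snapshot.ts`: it assumes snapshot lines of the form `- role "name"` or `- role:`, and the `RefEntry` shape and `assignRefs` name are hypothetical.

```typescript
interface RefEntry {
  role: string;
  name: string | null;
  nth: number; // index among earlier nodes with the same role and name
}

// Assign @e1, @e2, ... to each node line of an ARIA-snapshot-style tree.
// Simplified: plain text nodes and property lines (e.g. "- /url: ...") are skipped.
function assignRefs(snapshot: string): Map<string, RefEntry> {
  const refs = new Map<string, RefEntry>();
  const seen = new Map<string, number>();
  const nodeLine = /^\s*- ([a-z]+)(?: "((?:[^"\\]|\\.)*)")?/;
  for (const line of snapshot.split("\n")) {
    const match = nodeLine.exec(line);
    if (match === null) continue;
    const role = match[1] ?? "";
    if (role === "" || role === "text") continue;
    const name = match[2] ?? null;
    // Count duplicates so two identical buttons resolve to different elements.
    const key = `${role}\u0000${name ?? ""}`;
    const nth = seen.get(key) ?? 0;
    seen.set(key, nth + 1);
    refs.set(`@e${refs.size + 1}`, { role, name, nth });
  }
  return refs;
}
```

With Playwright available, each entry could then be resolved along the lines of `page.getByRole(role, { name }).nth(nth)`; the actual lookup strategy in the codebase may differ.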
Extended snapshot features:
- `--diff` (`-D`): Stores each snapshot as a baseline. On the next `-D` call, returns a unified diff showing what changed. Use this to verify that an action (click, fill, etc.) actually worked.
- `--annotate` (`-a`): Injects temporary overlay divs at each ref's bounding box, takes a screenshot with ref labels visible, then removes the overlays. Use `-o <path>` to control the output path.
- `--cursor-interactive` (`-C`): Scans for non-ARIA interactive elements (divs with `cursor:pointer`, `onclick`, `tabindex>=0`) using `page.evaluate`. Assigns `@c1`, `@c2`, ... refs with deterministic `nth-child` CSS selectors. These are elements the ARIA tree misses but users can still click.
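To illustrate the idea behind `-D`, here is a minimal line-based diff of two snapshots using a longest-common-subsequence table. It is a sketch only: it emits `+`/`-` lines without the hunk headers of a true unified diff, and the function name is hypothetical rather than taken from `snapshot.ts`.

```typescript
// Line-based diff built on a longest-common-subsequence (LCS) table.
// Output prefixes: " " unchanged, "-" only in baseline, "+" only in current.
function diffSnapshotLines(baseline: string, current: string): string[] {
  const a = baseline.split("\n");
  const b = current.split("\n");
  const width = b.length + 1;
  // table[i * width + j] holds the LCS length of a[i..] and b[j..].
  const table = new Array<number>((a.length + 1) * width).fill(0);
  const at = (i: number, j: number): number => table[i * width + j] ?? 0;

  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      table[i * width + j] =
        a[i] === b[j] ? at(i + 1, j + 1) + 1 : Math.max(at(i + 1, j), at(i, j + 1));
    }
  }

  const out: string[] = [];
  let i = 0;
  let j = 0;
  while (i < a.length && j < b.length) {
    if (a[i] === b[j]) {
      out.push(` ${a[i]}`);
      i++;
      j++;
    } else if (at(i + 1, j) >= at(i, j + 1)) {
      out.push(`-${a[i]}`);
      i++;
    } else {
      out.push(`+${b[j]}`);
      j++;
    }
  }
  while (i < a.length) out.push(`-${a[i++]}`);
  while (j < b.length) out.push(`+${b[j++]}`);
  return out;
}
```

Run against a snapshot taken before and after a click, the `+` and `-` lines show which nodes the action changed.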
Authentication
Each server session generates a random UUID as a bearer token. The token is written to the state file (/tmp/browse-server.json) with chmod 600. Every HTTP request must include Authorization: Bearer <token>. This prevents other processes on the machine from controlling the browser.
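The header check can be sketched as a small pure function. The function name is hypothetical, and the real server may perform the comparison differently.

```typescript
// Return true only when the header is exactly "Bearer <token>".
function isAuthorized(authorizationHeader: string | null, token: string): boolean {
  if (authorizationHeader === null) return false;
  const prefix = "Bearer ";
  if (!authorizationHeader.startsWith(prefix)) return false;
  return authorizationHeader.slice(prefix.length) === token;
}
```

A request handler would call this with the incoming `Authorization` header and reject the request when it returns `false`.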
Console, network, and dialog capture
The server hooks into Playwright's page.on('console'), page.on('response'), and page.on('dialog') events. All entries are kept in O(1) circular buffers (50,000 capacity each) and flushed to disk asynchronously via Bun.write():
- Console: `/tmp/browse-console.log`
- Network: `/tmp/browse-network.log`
- Dialog: `/tmp/browse-dialog.log`
The console, network, and dialog commands read from the in-memory buffers, not disk.
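A minimal ring buffer with O(1) `push`, to show why this beats an array with `shift()`. This is an illustrative sketch, not the `CircularBuffer<T>` in `buffers.ts`; the class name and methods are assumptions.

```typescript
// Fixed-capacity ring buffer: push is O(1) and overwrites the oldest entry when full.
class RingBuffer<T> {
  private readonly items: (T | undefined)[];
  private readonly capacity: number;
  private head = 0; // index of the oldest entry
  private count = 0;

  constructor(capacity: number) {
    if (!Number.isInteger(capacity) || capacity <= 0) {
      throw new RangeError("capacity must be a positive integer");
    }
    this.capacity = capacity;
    this.items = new Array<T | undefined>(capacity);
  }

  push(item: T): void {
    // When full, the write position wraps around onto the oldest entry.
    const tail = (this.head + this.count) % this.capacity;
    this.items[tail] = item;
    if (this.count < this.capacity) {
      this.count++;
    } else {
      this.head = (this.head + 1) % this.capacity;
    }
  }

  // Entries in insertion order, oldest first.
  toArray(): T[] {
    const out: T[] = [];
    for (let i = 0; i < this.count; i++) {
      out.push(this.items[(this.head + i) % this.capacity] as T);
    }
    return out;
  }

  get length(): number {
    return this.count;
  }
}
```

With a capacity of 50,000, the buffer keeps the most recent 50,000 entries and never moves existing elements, whereas `array.shift()` is O(n) on every overflow.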
Dialog handling
Dialogs (alert, confirm, prompt) are auto-accepted by default to prevent browser lockup. The dialog-accept and dialog-dismiss commands control this behavior. For prompts, dialog-accept <text> provides the response text. All dialogs are logged to the dialog buffer with type, message, and action taken.
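The handling described above can be sketched against a structural subset of Playwright's `Dialog` class (`type()`, `message()`, `accept()`, `dismiss()`). The `DialogPolicy` and `DialogLogEntry` shapes and the `handleDialog` name are hypothetical.

```typescript
// Structural subset of Playwright's Dialog, so the sketch runs without Playwright installed.
interface DialogLike {
  type(): string; // "alert", "confirm", "prompt", or "beforeunload"
  message(): string;
  accept(promptText?: string): Promise<void>;
  dismiss(): Promise<void>;
}

// Hypothetical policy object; the real server may store this differently.
interface DialogPolicy {
  mode: "accept" | "dismiss";
  promptText?: string;
}

interface DialogLogEntry {
  type: string;
  message: string;
  action: "accepted" | "dismissed";
}

async function handleDialog(
  dialog: DialogLike,
  policy: DialogPolicy,
  log: DialogLogEntry[],
): Promise<void> {
  const entry = { type: dialog.type(), message: dialog.message() };
  if (policy.mode === "accept") {
    // Response text is only passed for prompts.
    if (entry.type === "prompt" && policy.promptText !== undefined) {
      await dialog.accept(policy.promptText);
    } else {
      await dialog.accept();
    }
    log.push({ ...entry, action: "accepted" });
  } else {
    await dialog.dismiss();
    log.push({ ...entry, action: "dismissed" });
  }
}
```

Wiring it up would look roughly like `page.on('dialog', (d) => handleDialog(d, policy, log))`, with `{ mode: "accept" }` as the default policy to match the auto-accept behavior.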
Multi-workspace support
Each workspace gets its own isolated browser instance with its own Chromium process, tabs, cookies, and logs.
If CONDUCTOR_PORT is set (e.g., by Conductor), the browse port is derived deterministically:
browse_port = CONDUCTOR_PORT - 45600
| Workspace | CONDUCTOR_PORT | Browse port | State file |
|---|---|---|---|
| Workspace A | 55040 | 9440 | /tmp/browse-server-9440.json |
| Workspace B | 55041 | 9441 | /tmp/browse-server-9441.json |
| No Conductor | — | 9400 (scan) | /tmp/browse-server.json |
You can also set BROWSE_PORT directly.
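The port derivation can be sketched as below. The precedence of `BROWSE_PORT` over `CONDUCTOR_PORT` when both are set is an assumption, as is the function name; the offset and the `0` = auto-scan convention come from this document.

```typescript
const CONDUCTOR_PORT_OFFSET = 45600;

// Returns a fixed port, or 0 to signal "auto-scan 9400-9410".
// Assumption: an explicit BROWSE_PORT wins over a CONDUCTOR_PORT-derived port.
function resolveBrowsePort(env: Record<string, string | undefined>): number {
  const explicit = Number.parseInt(env["BROWSE_PORT"] ?? "", 10);
  if (Number.isInteger(explicit) && explicit > 0) return explicit;

  const conductor = Number.parseInt(env["CONDUCTOR_PORT"] ?? "", 10);
  if (Number.isInteger(conductor) && conductor > CONDUCTOR_PORT_OFFSET) {
    return conductor - CONDUCTOR_PORT_OFFSET;
  }
  return 0;
}
```

For example, `CONDUCTOR_PORT=55040` yields 55040 - 45600 = 9440, matching Workspace A in the table.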
Environment variables
| Variable | Default | Description |
|---|---|---|
| `BROWSE_PORT` | 0 (auto-scan 9400-9410) | Fixed port for the HTTP server |
| `CONDUCTOR_PORT` | (unset) | If set, browse port = this - 45600 |
| `BROWSE_IDLE_TIMEOUT` | 1800000 (30 min) | Idle shutdown timeout in ms |
| `BROWSE_STATE_FILE` | `/tmp/browse-server.json` | Path to state file |
| `BROWSE_SERVER_SCRIPT` | auto-detected | Path to `server.ts` |
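As an illustration of how `BROWSE_IDLE_TIMEOUT` could drive the idle shutdown, here is a sketch with an injected clock. The `parseIdleTimeout` and `IdleTracker` names are hypothetical, and the fallback-on-invalid-input behavior is an assumption.

```typescript
const DEFAULT_IDLE_TIMEOUT_MS = 30 * 60 * 1000; // 1800000 ms = 30 min

// Parse the env value; fall back to the default for missing or invalid input (assumed behavior).
function parseIdleTimeout(raw: string | undefined): number {
  const ms = Number(raw);
  return raw !== undefined && Number.isFinite(ms) && ms > 0 ? ms : DEFAULT_IDLE_TIMEOUT_MS;
}

// Tracks the time of the last command; timestamps are passed in so tests need no real timers.
class IdleTracker {
  private lastActivity: number;
  private readonly timeoutMs: number;

  constructor(timeoutMs: number, now: number) {
    this.timeoutMs = timeoutMs;
    this.lastActivity = now;
  }

  // Call on every incoming command.
  touch(now: number): void {
    this.lastActivity = now;
  }

  shouldShutDown(now: number): boolean {
    return now - this.lastActivity >= this.timeoutMs;
  }
}
```

A server could call `touch(Date.now())` per request and poll `shouldShutDown(Date.now())` on an interval, cleaning up the state file before exiting.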
Performance
| Tool | First call | Subsequent calls | Context overhead per call |
|---|---|---|---|
| Chrome MCP | ~5s | ~2-5s | ~2000 tokens (schema + protocol) |
| Playwright MCP | ~3s | ~1-3s | ~1500 tokens (schema + protocol) |
| gstack browse | ~3s | ~100-200ms | 0 tokens (plain text stdout) |
The context overhead difference compounds fast. In a 20-command browser session, MCP tools burn 30,000-40,000 tokens on protocol framing alone. gstack burns zero.
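The 30,000-40,000 figure follows directly from the per-call numbers in the table. A short sketch of the arithmetic, using those approximate values:

```typescript
// Approximate per-call context overhead in tokens, taken from the table above.
const overheadPerCall: Record<string, number> = {
  "Chrome MCP": 2000,
  "Playwright MCP": 1500,
  "gstack browse": 0,
};

// Total protocol overhead for a session of `calls` commands, per tool.
function overheadForSession(calls: number): Record<string, number> {
  const totals: Record<string, number> = {};
  for (const [tool, perCall] of Object.entries(overheadPerCall)) {
    totals[tool] = perCall * calls;
  }
  return totals;
}
```

For 20 commands this gives 20 × 1500 = 30,000 tokens for Playwright MCP and 20 × 2000 = 40,000 for Chrome MCP, against 0 for gstack browse.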
Why CLI over MCP?
MCP (Model Context Protocol) works well for remote services, but for local browser automation it adds pure overhead:
- Context bloat: every MCP call includes full JSON schemas and protocol framing. A simple "get the page text" costs 10x more context tokens than it should.
- Connection fragility: persistent WebSocket/stdio connections drop and fail to reconnect.
- Unnecessary abstraction: Claude Code already has a Bash tool. A CLI that prints to stdout is the simplest possible interface.
gstack skips all of this. Compiled binary. Plain text in, plain text out. No protocol. No schema. No connection management.
Acknowledgments
The browser automation layer is built on Playwright by Microsoft. Playwright's accessibility tree API, locator system, and headless Chromium management are what make ref-based interaction possible. The snapshot system — assigning @ref labels to accessibility tree nodes and mapping them back to Playwright Locators — is built entirely on top of Playwright's primitives. Thank you to the Playwright team for building such a solid foundation.
Development
Prerequisites
- Bun v1.0+
- Playwright's Chromium (installed automatically by `bun install`)
Quick start
bun install # install dependencies + Playwright Chromium
bun test # run integration tests (~3s)
bun run dev <cmd> # run CLI from source (no compile)
bun run build # compile to browse/dist/browse
Dev mode vs compiled binary
During development, use bun run dev instead of the compiled binary. It runs browse/src/cli.ts directly with Bun, so you get instant feedback without a compile step:
bun run dev goto https://example.com
bun run dev text
bun run dev snapshot -i
bun run dev click @e3
The compiled binary (bun run build) is only needed for distribution. It produces a single ~58MB executable at browse/dist/browse using Bun's --compile flag.
Running tests
bun test # run all tests
bun test browse/test/commands # run command integration tests only
bun test browse/test/snapshot # run snapshot tests only
bun test browse/test/cookie-import-browser # run cookie import unit tests only
Tests spin up a local HTTP server (browse/test/test-server.ts) serving HTML fixtures from browse/test/fixtures/, then exercise the CLI commands against those pages. 203 tests across 3 files, ~15 seconds total.
Source map
| File | Role |
|---|---|
| `browse/src/cli.ts` | Entry point. Reads `/tmp/browse-server.json`, sends HTTP to the server, prints response. |
| `browse/src/server.ts` | Bun HTTP server. Routes commands to the right handler. Manages idle timeout. |
| `browse/src/browser-manager.ts` | Chromium lifecycle: launch, tab management, ref map, crash detection. |
| `browse/src/snapshot.ts` | Parses accessibility tree, assigns `@e`/`@c` refs, builds Locator map. Handles `--diff`, `--annotate`, `-C`. |
| `browse/src/read-commands.ts` | Non-mutating commands: `text`, `html`, `links`, `js`, `css`, `is`, `dialog`, `forms`, etc. Exports `getCleanText()`. |
| `browse/src/write-commands.ts` | Mutating commands: `goto`, `click`, `fill`, `upload`, `dialog-accept`, `useragent` (with context recreation), etc. |
| `browse/src/meta-commands.ts` | Server management, chain routing, diff (DRY via `getCleanText`), snapshot delegation. |
| `browse/src/cookie-import-browser.ts` | Decrypt Chromium cookies via macOS Keychain + PBKDF2/AES-128-CBC. Auto-detects installed browsers. |
| `browse/src/cookie-picker-routes.ts` | HTTP routes for `/cookie-picker/*`: browser list, domain search, import, remove. |
| `browse/src/cookie-picker-ui.ts` | Self-contained HTML generator for the interactive cookie picker (dark theme, no frameworks). |
| `browse/src/buffers.ts` | `CircularBuffer<T>` (O(1) ring buffer) + console/network/dialog capture with async disk flush. |
Deploying to the active skill
The active skill lives at ~/.claude/skills/gstack/. After making changes:
- Push your branch
- Pull in the skill directory: `cd ~/.claude/skills/gstack && git pull`
- Rebuild: `cd ~/.claude/skills/gstack && bun run build`
Or copy the binary directly: cp browse/dist/browse ~/.claude/skills/gstack/browse/dist/browse
Adding a new command
- Add the handler in `read-commands.ts` (non-mutating) or `write-commands.ts` (mutating)
- Register the route in `server.ts`
- Add a test case in `browse/test/commands.test.ts` with an HTML fixture if needed
- Run `bun test` to verify
- Run `bun run build` to compile