mirror of
https://github.com/garrytan/gstack.git
synced 2026-05-01 19:25:10 +02:00
7665adf4fe
* feat: CDP connect — control real Chrome/Comet via Playwright Add `connectCDP()` to BrowserManager: connects to a running browser via Chrome DevTools Protocol. All existing browse commands work unchanged through Playwright's abstraction layer. - chrome-launcher.ts: browser discovery, CDP probe, auto-relaunch with rollback - browser-manager.ts: connectCDP(), mode guards (close/closeTab/recreateContext/handoff), auto-reconnect on browser restart, getRefMap() for extension API - server.ts: CDP branch in start(), /health gains mode field, /refs endpoint, idle timer only resets on /command (not passive endpoints) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: browse connect/disconnect/focus CLI commands - connect: pre-server command that discovers browser, starts server in CDP mode - disconnect: drops CDP connection, restarts in headless mode - focus: brings browser window to foreground via osascript (macOS) - status: now shows Mode: cdp | launched | headed - startServer() accepts extra env vars for CDP URL/port passthrough Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: CDP-aware skill templates — skip cookie import in real browser mode Skills now check `$B status` for CDP mode and skip: - /qa: cookie import prompt, user-agent override, headless workarounds - /design-review: cookie import for authenticated pages - /setup-browser-cookies: returns "not needed" in CDP mode Regenerated SKILL.md files from updated templates. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: activity streaming — SSE endpoint for Chrome extension Side Panel Real-time browse command feed via Server-Sent Events: - activity.ts: ActivityEntry type, CircularBuffer (capacity 1000), privacy filtering (redacts passwords, auth tokens, sensitive URL params), cursor-based gap detection, async subscriber notification - server.ts: /activity/stream SSE, /activity/history REST, handleCommand instrumented with command_start/command_end events - 18 unit tests for filterArgs privacy, emitActivity, subscribe lifecycle Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: Chrome extension Side Panel + Conductor API proposal Chrome extension (Manifest V3, sideload): - Side Panel with live activity feed, @ref overlays, dark terminal aesthetic - Background worker: health polling, SSE relay, ref fetching - Popup: port config, connection status, side panel launcher - Content script: floating ref panel with @ref badges Conductor API proposal (docs/designs/CONDUCTOR_SESSION_API.md): - SSE endpoint for full Claude Code session mirroring in Side Panel - Discovery via HTTP endpoint (not filesystem — extensions can't read files) TODOS.md: add $B watch, multi-agent tabs, cross-platform CDP, Web Store publishing. Mark CDP mode as shipped. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: detect Conductor runtime, skip osascript quit for sandboxed apps macOS App Management blocks Electron apps (Conductor) from quitting other apps via osascript. Now detects the runtime environment: - terminal/claude-code/codex: can manage apps freely - conductor: prints manual restart instructions + polls for 60s detectRuntime() checks env vars and parent process. When Chrome needs restart but we can't quit it, prints step-by-step instructions and waits for the user to restart Chrome with --remote-debugging-port. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: detect Conductor via actual env vars (CONDUCTOR_WORKSPACE_NAME) Previous detection checked CONDUCTOR_WORKSPACE_ID which doesn't exist. Conductor sets CONDUCTOR_WORKSPACE_NAME, CONDUCTOR_BIN_DIR, CONDUCTOR_PORT, and __CFBundleIdentifier=com.conductor.app. Check these FIRST because Conductor sessions also have ANTHROPIC_API_KEY (which was matching claude-code). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: connection status pill — floating indicator when gstack controls Chrome Small pill in bottom-right corner of every page: "● gstack · 3 refs" Shows when connected via CDP, fades to 30% opacity after 3s, full on hover. Disappears entirely when disconnected. Background worker now notifies content scripts on connect/disconnect state changes so the pill appears/disappears without polling. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: Chrome requires --user-data-dir for remote debugging Chrome refuses --remote-debugging-port without an explicit --user-data-dir. Add userDataDir to BrowserBinary registry (macOS Application Support paths) and pass it in both auto-launch and manual restart instructions. Fix double-quoting in CLI manual restart instructions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: Chrome must be fully quit before launching with --remote-debugging-port Chrome refuses to enable CDP on its default profile when another instance is running (even with explicit --user-data-dir). The only reliable path: fully quit Chrome first, then relaunch with the flag. Updated instructions to emphasize this clearly with verification step. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: bin/chrome-cdp — quit Chrome and relaunch with CDP in one command Quits Chrome gracefully, waits for full exit, relaunches with --remote-debugging-port, polls until CDP is ready. Usage: chrome-cdp [port] Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: use Playwright channel:chrome instead of broken connectOverCDP Playwright's connectOverCDP hangs with Chrome 146 due to CDP protocol version mismatch. Switch to channel:'chrome' which uses Playwright's native pipe protocol to launch the system Chrome binary directly. This is simpler and more reliable: - No CDP port discovery needed - No --remote-debugging-port or --user-data-dir hassles - $B connect just works — launches real Chrome headed window - All Playwright APIs (snapshot, click, fill) work unchanged bin/chrome-cdp updated with symlinked profile approach (kept for manual CDP use cases, but $B connect no longer needs it). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: green border + gstack label on controlled Chrome window Injects a 2px green border and small "gstack" label on every page loaded in the controlled Chrome window via context.addInitScript(). Users can instantly tell which Chrome window Claude controls. Also fixes close() for channel:chrome mode (uses browser.close() not browser.disconnect() which doesn't exist). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: cleanup chrome-launcher runtime detection, remove puppeteer-core dep Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * style(design): redesign controlled Chrome indicator Replace crude green border + label with polished indicator: - 2px shimmer gradient at top edge (green→cyan→green, 3s loop) - Floating pill bottom-right with frosted glass bg, fades to 25% opacity after 4s so it doesn't compete with page content - prefers-reduced-motion disables shimmer animation - Much more subtle — looks like a developer tool, not broken CSS Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: document real browser mode + Chrome extension in BROWSER.md and README.md BROWSER.md: new sections for connect/disconnect/focus commands, Chrome extension Side Panel install, CDP-aware skills, activity streaming. Updated command reference table, key components, env vars, source map. README.md: updated /browse description, added "Real browser mode" to What's New section. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: step-by-step Chrome extension install guide in BROWSER.md Replace terse bullet points with numbered walkthrough covering: developer mode toggle, load unpacked, macOS file picker tip (Cmd+Shift+G), pin extension, configure port, open side panel. Added troubleshooting section. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: add Cmd+Shift+. tip for hidden folders in macOS file picker macOS hides folders starting with . by default. Added both shortcuts: Cmd+Shift+G (paste path directly) and Cmd+Shift+. (show hidden files). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: integrate hidden folder tips into the install flow naturally Move Cmd+Shift+G and Cmd+Shift+. tips inline with the file picker step instead of as a separate tip block after it. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: auto-load Chrome extension when $B connect launches Chrome Extension auto-loads via --load-extension flag — no manual chrome://extensions install needed. findExtensionPath() checks repo root, global install, and dev paths. Also adds bin/gstack-extension helper for manual install in regular Chrome, and rewrites BROWSER.md install docs with auto-load as primary path. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: /connect-chrome skill — one command to launch Chrome with Side Panel New skill that runs $B connect, verifies the connection, guides the user to open the Side Panel, and demos the live activity feed. Extension auto-loads via --load-extension so no manual chrome://extensions install needed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: use launchPersistentContext for Chrome extension loading Playwright's chromium.launch() silently ignores --load-extension. Switch to launchPersistentContext with ignoreDefaultArgs to remove --disable-extensions flag. Use bundled Chromium (real Chrome blocks unpacked extensions). Fixed port 34567 for CDP mode so the extension auto-connects. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: sync extension to DESIGN.md — amber accent, zinc neutrals, grain texture Import design system from gstack-website. Update all extension colors: green (#4ade80) → amber (#F59E0B/#FBBF24), zinc gray neutrals, grain texture overlay. Regenerate icons as amber "G" monogram on dark background. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: sidebar chat with Claude Code — icon opens side panel directly Replace popup flyout with direct side panel open on icon click. Primary UI is now a chat interface that sends messages to Claude Code via file queue. Activity/Refs tabs moved behind a debug toggle in the footer. Command bar with history, auto-poll for responses, amber design system. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: sidebar agent — Claude-powered chat backend via file queue Add /sidebar-command, /sidebar-response, and /sidebar-chat endpoints to the browse server. sidebar-agent.ts watches the command queue file, spawns claude -p with browse context for each message, and streams responses back to the sidebar chat. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: remove duplicate gstack pill overlay, hide crash restore bubble The addInitScript indicator and the extension's content script were both injecting bottom-right pills, causing duplicates. Remove the pill from addInitScript (extension handles it). Replace --restore-last-session with --hide-crash-restore-bubble to suppress the "Chromium didn't shut down correctly" dialog. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: state file authority — CDP server cannot be silently replaced Hardens the connect/disconnect lifecycle: - ensureServer() refuses to auto-start headless when CDP server is alive - $B connect does full cleanup: SIGTERM → 2s → SIGKILL, profile locks, state - shutdown() cleans Chromium SingletonLock/Socket/Cookie files - uncaughtException/unhandledRejection handlers do emergency cleanup This prevents the bug where a headless server overwrites the CDP server's state file, causing $B commands to hit the wrong browser. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: sidebar agent streaming events + session state management Enhance sidebar-agent.ts with: - Live streaming of claude -p events (tool_use, text, result) to sidebar - Session state file for BROWSE_STATE_FILE propagation to claude subprocess - Improved logging (stderr, exit codes, event types) - stdin.end() to prevent claude waiting for input - summarizeToolInput() with path shortening for compact sidebar display Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: sidebar chat UI — streaming events, agent status, reconnect retry Sidebar panel improvements: - Chat tab renders streaming agent events (tool_use, text, result) - Thinking dots animation while agent processes - Agent error display with styled error blocks - tryConnect() with 2s retry loop for initial connection - Debug tabs (Activity/Refs) hidden behind gear toggle - Clear chat button - Compact tool call display with path shortening Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: server-integrated sidebar agent with sessions and message queue Move the sidebar agent from a separate bun process into server.ts: - Agent spawns claude -p directly when messages arrive via /sidebar-command - In-memory chat buffer backed by per-session chat.jsonl on disk - Session manager: create, load, persist, list sessions - Message queue (cap 5) with agent status tracking (idle/processing/hung) - Stop/kill endpoints with queue dismiss support - /health now returns agent status + session info - All sidebar endpoints require Bearer auth - Agent killed on server shutdown - 120s timeout detects hung claude processes Eliminates: file-queue polling, separate sidebar-agent.ts process, stale auth tokens, state file conflicts between processes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: extension auth + token flow for server-integrated agent Update Chrome extension to use Bearer auth on all sidebar endpoints: - background.js captures auth token from /health, exposes via getToken msg - background.js sets openPanelOnActionClick for direct side panel access - sidepanel.js gets token from background, sends in all fetch headers - Health broadcasts include token so sidebar auto-authenticates - Removes popup from manifest — icon click opens side panel directly Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: self-healing sidebar — reconnect banner, state machine, copy button Sidebar UI now handles disconnection gracefully: - Connection state machine: connected → reconnecting → dead - Amber pulsing banner during reconnect (2s retry, 30 attempts) - Red "Server offline" banner with Reconnect + Copy /connect-chrome buttons - Green "Reconnected" toast that fades after 3s on successful reconnect - Copy button lets user paste /connect-chrome into any Claude Code session Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: crash handling — save session, kill agent, distinct exit codes Hardened shutdown/crash behavior: - Browser disconnect exits with code 2 (distinct from crash code 1) - emergencyCleanup kills agent subprocess and saves session state - Clean shutdown saves session before exit (chat history persists) - Clear user message on browser disconnect: "Run $B connect to reconnect" Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: worktree-per-session isolation for sidebar agent Each sidebar session gets an isolated git worktree so the agent's file operations don't conflict with the user's working directory: - createWorktree() creates detached HEAD worktree in ~/.gstack/worktrees/ - Falls back to main cwd for non-git repos or on creation failure - Handles collision cleanup from prior crashes - removeWorktree() cleans up on session switch and shutdown - worktreePath persisted in session.json Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(qa): ISSUE-001 — disconnect blocked by CDP guard in ensureServer $B disconnect was routed through ensureServer() which refused to start a headless server when a CDP state file existed. Disconnect is now handled before ensureServer() (like connect), with force-kill + cleanup fallback when the CDP server is unresponsive. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: resolve claude binary path for daemon-spawned agent The browse server runs as a daemon and may not inherit the user's shell PATH. Add findClaudeBin() that checks ~/.local/bin/claude (standard install location), which claude, and common system paths. Shows a clear error in the sidebar chat if claude CLI is not found. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: resolve claude symlinks + check Conductor bundled binary posix_spawn fails on symlinks in compiled bun binaries. Now: - Checks Conductor app's bundled binary first (not a symlink) - Scans ~/.local/share/claude/versions/ for direct versioned binaries - Uses fs.realpathSync() to resolve symlinks before spawning Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: compiled bun binary cannot posix_spawn — use external agent process Compiled bun binaries fail posix_spawn on ALL executables (even /bin/bash). The server now writes to an agent queue file, and a separate non-compiled bun process (sidebar-agent.ts) reads the queue, spawns claude, and POSTs events back via /sidebar-agent/event. Changes: - server.ts: spawnClaude writes to queue file instead of spawning directly - server.ts: new /sidebar-agent/event endpoint for agent → server relay - server.ts: fix result event field name (event.text vs event.result) - sidebar-agent.ts: rewritten to poll queue file, relay events via HTTP - cli.ts: $B connect auto-starts sidebar-agent as non-compiled bun process Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: loading spinner on sidebar open while connecting to server Shows an amber spinner with "Connecting..." when the sidebar first opens, replacing the empty state. After the first successful /sidebar-chat poll: - If chat history exists: renders it immediately - If no history: shows the welcome message Prevents the jarring empty-then-populated flash on sidebar open. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: zero-friction side panel — auto-open on install, pill is clickable Three changes to eliminate manual side panel setup: - Auto-open side panel on extension install/update (onInstalled listener) - gstack pill (bottom-right) is now clickable — opens the side panel - Pill has pointer-events: auto so clicks always register (was: none) User no longer needs to find the puzzle piece icon, pin the extension, or know the side panel exists. It opens automatically on first launch and can be re-opened by clicking the floating gstack pill. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor: kill CDP naming, delete chrome-launcher.ts dead code The connectCDP() method and connectionMode: 'cdp' naming was a legacy artifact — real Chrome was tried but failed (silently blocks --load-extension), so the implementation already used Playwright's bundled Chromium via launchPersistentContext(). The naming was misleading. Changes: - Delete chrome-launcher.ts (361 LOC) — only import was in unreachable attemptReconnect() method - Delete dead attemptReconnect() and reconnecting field - Delete preExistingTabIds (was for protecting real Chrome tabs we never connect to) - Rename connectCDP() → launchHeaded() - Rename connectionMode: 'cdp' → 'headed' across all files - Replace BROWSE_CDP_URL/BROWSE_CDP_PORT env vars with BROWSE_HEADED=1 - Regenerate SKILL.md files for updated command descriptions - Move BrowserManager unit tests to browser-manager-unit.test.ts Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: converge handoff into connect — extension loads on handoff Handoff now uses launchPersistentContext() with extension auto-loading, same as the connect/launchHeaded() path. This means when the agent gets stuck (2FA, CAPTCHA) and hands off to the user, the Chrome extension + side panel are available automatically. Before: handoff used chromium.launch() + newContext() — no extension After: handoff uses chromium.launchPersistentContext() — extension loads Also sets connectionMode to 'headed' and disables dialog auto-accept on handoff, matching connect behavior. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: gate sidebar chat behind --chat flag $B connect (default): headed Chromium + extension with Activity + Refs tabs only. No separate agent spawned. Clean, no confusion. $B connect --chat: same + Chat tab with standalone claude -p agent. Shows experimental banner: "Standalone mode — this is a separate agent from your workspace." Implementation: - cli.ts: parse --chat, set BROWSE_SIDEBAR_CHAT env, conditionally spawn sidebar-agent - server.ts: gate /sidebar-* routes behind chatEnabled, return 403 when disabled, include chatEnabled in /health response - sidepanel.js: applyChatEnabled() hides/shows Chat tab + banner - background.js: forward chatEnabled from health response - sidepanel.html/css: experimental banner with amber styling Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: file drop relay + $B inbox command Sidebar agent now writes structured messages to .context/sidebar-inbox/ when processing user input. The workspace agent can read these via $B inbox to see what the user reported from the browser. File drop format: .context/sidebar-inbox/{timestamp}-observation.json { type, timestamp, page: {url}, userMessage, sidebarSessionId } Atomic writes (tmp + rename) prevent partial reads. $B inbox --clear removes messages after display. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: $B watch — passive observation mode Claude enters read-only mode and captures periodic snapshots (every 5s) while the user browses. Mutation commands (click, fill, etc.) are blocked during watch. $B watch stop exits and returns a summary with the last snapshot. Requires headed mode ($B connect). This is the inverse of the scout pattern — the workspace agent watches through the browser instead of the sidebar relaying to it. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: add coverage for sidebar-agent, file-drop, and watch mode 33 new tests covering: - Sidebar agent queue parsing (valid/malformed/empty JSONL) - writeToInbox file drop (directory creation, atomic writes, JSON format) - Inbox command (display, sorting, --clear, malformed file handling) - Watch mode state machine (start/stop cycles, snapshots, duration) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: TODOS cleanup + Chrome vs Chromium exploration doc - Update TODOS.md: mark CDP mode, $B watch, sidebar scout as SHIPPED - Delete dead "cross-platform CDP browser discovery" TODO - Rename dependencies from "CDP connect" to "headed mode" - Add docs/designs/CHROME_VS_CHROMIUM_EXPLORATION.md memorializing the architecture exploration and decision to use Playwright Chromium Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: add Conductor Chrome sidebar integration design doc Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: sidebar-agent validates cwd before spawning claude The queue entry may reference a worktree that was cleaned up between sessions. Now falls back to process.cwd() if the path doesn't exist, preventing silent spawn failures. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: gen-skill-docs resolver merge + preamble tier gate + plan file discovery The local RESOLVERS record in gen-skill-docs.ts was shadowing the imported canonical resolvers, causing stale test coverage and preamble generators to be used instead of the authoritative versions in resolvers/. Changes: - Merge imported RESOLVERS with local overrides (spread + override pattern) - Fix preamble tier gate: tier 1 skills no longer get AskUserQuestion format - Make plan file discovery host-agnostic (search multiple plan dirs) - Add missing E2E tier entries for ship/review plan completion tests Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: ungate sidebar agent + raise timeout to 5 minutes (v0.12.0) Sidebar chat is now always available in headed mode — no --chat flag needed. Agent tasks get 5 minutes instead of 2, enabling multi-page workflows like navigating directories and filling forms across pages. Changes: - cli.ts: remove --chat flag, always set BROWSE_SIDEBAR_CHAT=1, always spawn agent - server.ts: remove chatEnabled gate (403 response), raise AGENT_TIMEOUT_MS to 300s - sidebar-agent.ts: raise child process timeout from 120s to 300s Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: headed mode + sidebar agent documentation (v0.12.0) - README: sidebar agent section, personal automation example (school parent portal), two auth paths (manual login + cookie import), DevTools MCP mention - BROWSER.md: sidebar agent section with usage, timeout, session isolation, authentication, and random delay documentation - connect-chrome template: add sidebar chat onboarding step - CHANGELOG: v0.12.0 entry covering headed mode, sidebar agent, extension - VERSION: bump to 0.12.0.0 - TODOS: Chrome DevTools MCP integration as P0 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: regenerate SKILL.md files Generated from updated templates + resolver merge. Key changes: - Tier 1 skills no longer include AskUserQuestion format section - Ship/review skills now include coverage gate with thresholds - Connect-chrome skill includes sidebar chat onboarding step - Plan file discovery uses host-agnostic paths Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: regenerate Codex connect-chrome skill Updated preamble with proactive prompt and sidebar chat onboarding step. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: network idle, state persistence, iframe support, chain pipe format (v0.12.1.0) (#516) * feat: network idle detection + chain pipe format - Upgrade click/fill/select from domcontentloaded to networkidle wait (2s timeout, best-effort). Catches XHR/fetch triggered by interactions. - Add pipe-delimited format to chain as JSON fallback: $B chain 'goto url | click @e5 | snapshot -ic' - Add post-loop networkidle wait in chain when last command was a write. - Frame-aware: commands use target (getActiveFrameOrPage) for locator ops, page-only ops (goto/back/forward/reload) guard against frame context. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: $B state save/load + $B frame — new browse commands - state save/load: persist cookies + URLs to .gstack/browse-states/{name}.json File perms 0o600, name sanitized to [a-zA-Z0-9_-]. V1 skips localStorage (breaks on load-before-navigate). Load replaces session via closeAllPages(). - frame: switch command context to iframe via CSS selector, @ref, --name, or --url. 'frame main' returns to main frame. Execution target abstraction (getActiveFrameOrPage) across read-commands, snapshot, and write-commands. - Frame context cleared on tab switch, navigation, resume, and handoff. - Snapshot shows [Context: iframe src="..."] header when in frame. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: add tests for network idle, chain pipe format, state, and frame - Network idle: click on fetch button waits for XHR, static click is fast - Chain pipe: pipe-delimited commands, quoted args, JSON still works - State: save/load round-trip, name sanitization, missing state error - Frame: switch to iframe + back, snapshot context header, fill in frame, goto-in-frame guard, usage error New fixtures: network-idle.html (fetch + static buttons), iframe.html (srcdoc) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: review fixes — iframe ref scoping, detached frame recovery, state validation - snapshot.ts: ref locators, cursor-interactive scan, and cursor locator now use target (frame-aware) instead of page — fixes @ref clicking in iframes - browser-manager.ts: getActiveFrameOrPage auto-recovers from detached frames via isDetached() check - meta-commands.ts: state load resets activeFrame, elementHandle disposed after contentFrame(), state file schema validation (cookies + pages arrays), filter empty pipe segments in chain tokenizer - write-commands.ts: upload command uses target.locator() for frame support Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: regenerate SKILL.md files + rebuild binary Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: bump version and changelog (v0.12.1.0) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
396 lines
23 KiB
Markdown
396 lines
23 KiB
Markdown
# Browser — technical details
|
|
|
|
This document covers the command reference and internals of gstack's headless browser.
|
|
|
|
## Command reference
|
|
|
|
| Category | Commands | What for |
|
|
|----------|----------|----------|
|
|
| Navigate | `goto`, `back`, `forward`, `reload`, `url` | Get to a page |
|
|
| Read | `text`, `html`, `links`, `forms`, `accessibility` | Extract content |
|
|
| Snapshot | `snapshot [-i] [-c] [-d N] [-s sel] [-D] [-a] [-o] [-C]` | Get refs, diff, annotate |
|
|
| Interact | `click`, `fill`, `select`, `hover`, `type`, `press`, `scroll`, `wait`, `viewport`, `upload` | Use the page |
|
|
| Inspect | `js`, `eval`, `css`, `attrs`, `is`, `console`, `network`, `dialog`, `cookies`, `storage`, `perf` | Debug and verify |
|
|
| Visual | `screenshot [--viewport] [--clip x,y,w,h] [sel\|@ref] [path]`, `pdf`, `responsive` | See what Claude sees |
|
|
| Compare | `diff <url1> <url2>` | Spot differences between environments |
|
|
| Dialogs | `dialog-accept [text]`, `dialog-dismiss` | Control alert/confirm/prompt handling |
|
|
| Tabs | `tabs`, `tab`, `newtab`, `closetab` | Multi-page workflows |
|
|
| Cookies | `cookie-import`, `cookie-import-browser` | Import cookies from file or real browser |
|
|
| Multi-step | `chain` (JSON from stdin) | Batch commands in one call |
|
|
| Handoff | `handoff [reason]`, `resume` | Switch to visible Chrome for user takeover |
|
|
| Real browser | `connect`, `disconnect`, `focus` | Control real Chrome, visible window |
|
|
|
|
All selector arguments accept CSS selectors, `@e` refs after `snapshot`, or `@c` refs after `snapshot -C`. 50+ commands total plus cookie import.
|
|
|
|
## How it works
|
|
|
|
gstack's browser is a compiled CLI binary that talks to a persistent local Chromium daemon over HTTP. The CLI is a thin client — it reads a state file, sends a command, and prints the response to stdout. The server does the real work via [Playwright](https://playwright.dev/).
|
|
|
|
```
|
|
┌─────────────────────────────────────────────────────────────────┐
|
|
│ Claude Code │
|
|
│ │
|
|
│ "browse goto https://staging.myapp.com" │
|
|
│ │ │
|
|
│ ▼ │
|
|
│ ┌──────────┐ HTTP POST ┌──────────────┐ │
|
|
│ │ browse │ ──────────────── │ Bun HTTP │ │
|
|
│ │ CLI │ localhost:rand │ server │ │
|
|
│ │ │ Bearer token │ │ │
|
|
│ │ compiled │ ◄────────────── │ Playwright │──── Chromium │
|
|
│ │ binary │ plain text │ API calls │ (headless) │
|
|
│ └──────────┘ └──────────────┘ │
|
|
│ ~1ms startup persistent daemon │
|
|
│ auto-starts on first call │
|
|
│ auto-stops after 30 min idle │
|
|
└─────────────────────────────────────────────────────────────────┘
|
|
```
|
|
|
|
### Lifecycle
|
|
|
|
1. **First call**: CLI checks `.gstack/browse.json` (in the project root) for a running server. None found — it spawns `bun run browse/src/server.ts` in the background. The server launches headless Chromium via Playwright, picks a random port (10000-60000), generates a bearer token, writes the state file, and starts accepting HTTP requests. This takes ~3 seconds.
|
|
|
|
2. **Subsequent calls**: CLI reads the state file, sends an HTTP POST with the bearer token, prints the response. ~100-200ms round trip.
|
|
|
|
3. **Idle shutdown**: After 30 minutes with no commands, the server shuts down and cleans up the state file. Next call restarts it automatically.
|
|
|
|
4. **Crash recovery**: If Chromium crashes, the server exits immediately (no self-healing — don't hide failure). The CLI detects the dead server on the next call and starts a fresh one.
|
|
|
|
### Key components
|
|
|
|
```
|
|
browse/
|
|
├── src/
|
|
│ ├── cli.ts # Thin client — reads state file, sends HTTP, prints response
|
|
│ ├── server.ts # Bun.serve HTTP server — routes commands to Playwright
|
|
│ ├── browser-manager.ts # Chromium lifecycle — launch, tabs, ref map, crash handling
|
|
│ ├── snapshot.ts # Accessibility tree → @ref assignment → Locator map + diff/annotate/-C
|
|
│ ├── read-commands.ts # Non-mutating commands (text, html, links, js, css, is, dialog, etc.)
|
|
│ ├── write-commands.ts # Mutating commands (click, fill, select, upload, dialog-accept, etc.)
|
|
│ ├── meta-commands.ts # Server management, chain, diff, snapshot routing
|
|
│ ├── cookie-import-browser.ts # Decrypt + import cookies from real Chromium browsers
|
|
│ ├── cookie-picker-routes.ts # HTTP routes for interactive cookie picker UI
|
|
│ ├── cookie-picker-ui.ts # Self-contained HTML/CSS/JS for cookie picker
|
|
│ ├── activity.ts # Activity streaming (SSE) for Chrome extension
|
|
│ └── buffers.ts # CircularBuffer<T> + console/network/dialog capture
|
|
├── test/ # Integration tests + HTML fixtures
|
|
└── dist/
|
|
└── browse # Compiled binary (~58MB, Bun --compile)
|
|
```
|
|
|
|
### The snapshot system
|
|
|
|
The browser's key innovation is ref-based element selection, built on Playwright's accessibility tree API:
|
|
|
|
1. `page.locator(scope).ariaSnapshot()` returns a YAML-like accessibility tree
|
|
2. The snapshot parser assigns refs (`@e1`, `@e2`, ...) to each element
|
|
3. For each ref, it builds a Playwright `Locator` (using `getByRole` + nth-child)
|
|
4. The ref-to-Locator map is stored on `BrowserManager`
|
|
5. Later commands like `click @e3` look up the Locator and call `locator.click()`
|
|
|
|
No DOM mutation. No injected scripts. Just Playwright's native accessibility API.
|
|
|
|
**Ref staleness detection:** SPAs can mutate the DOM without navigation (React router, tab switches, modals). When this happens, refs collected from a previous `snapshot` may point to elements that no longer exist. To handle this, `resolveRef()` runs an async `count()` check before using any ref — if the element count is 0, it throws immediately with a message telling the agent to re-run `snapshot`. This fails fast (~5ms) instead of waiting for Playwright's 30-second action timeout.
|
|
|
|
**Extended snapshot features:**
|
|
- `--diff` (`-D`): Stores each snapshot as a baseline. On the next `-D` call, returns a unified diff showing what changed. Use this to verify that an action (click, fill, etc.) actually worked.
|
|
- `--annotate` (`-a`): Injects temporary overlay divs at each ref's bounding box, takes a screenshot with ref labels visible, then removes the overlays. Use `-o <path>` to control the output path.
|
|
- `--cursor-interactive` (`-C`): Scans for non-ARIA interactive elements (divs with `cursor:pointer`, `onclick`, `tabindex>=0`) using `page.evaluate`. Assigns `@c1`, `@c2`... refs with deterministic `nth-child` CSS selectors. These are elements the ARIA tree misses but users can still click.
|
|
|
|
### Screenshot modes
|
|
|
|
The `screenshot` command supports four modes:
|
|
|
|
| Mode | Syntax | Playwright API |
|
|
|------|--------|----------------|
|
|
| Full page (default) | `screenshot [path]` | `page.screenshot({ fullPage: true })` |
|
|
| Viewport only | `screenshot --viewport [path]` | `page.screenshot({ fullPage: false })` |
|
|
| Element crop | `screenshot "#sel" [path]` or `screenshot @e3 [path]` | `locator.screenshot()` |
|
|
| Region clip | `screenshot --clip x,y,w,h [path]` | `page.screenshot({ clip })` |
|
|
|
|
Element crop accepts CSS selectors (`.class`, `#id`, `[attr]`) or `@e`/`@c` refs from `snapshot`. Auto-detection: `@e`/`@c` prefix = ref, `.`/`#`/`[` prefix = CSS selector, `--` prefix = flag, everything else = output path.
|
|
|
|
Mutual exclusion: `--clip` + selector and `--viewport` + `--clip` both throw errors. Unknown flags (e.g. `--bogus`) also throw.
|
|
|
|
### Authentication
|
|
|
|
Each server session generates a random UUID as a bearer token. The token is written to the state file (`.gstack/browse.json`) with chmod 600. Every HTTP request must include `Authorization: Bearer <token>`. This prevents other processes on the machine from controlling the browser.
|
|
|
|
### Console, network, and dialog capture
|
|
|
|
The server hooks into Playwright's `page.on('console')`, `page.on('response')`, and `page.on('dialog')` events. All entries are kept in O(1) circular buffers (50,000 capacity each) and flushed to disk asynchronously via `Bun.write()`:
|
|
|
|
- Console: `.gstack/browse-console.log`
|
|
- Network: `.gstack/browse-network.log`
|
|
- Dialog: `.gstack/browse-dialog.log`
|
|
|
|
The `console`, `network`, and `dialog` commands read from the in-memory buffers, not disk.
|
|
|
|
### Real browser mode (`connect`)
|
|
|
|
Instead of headless Chromium, `connect` launches your real Chrome as a headed window controlled by Playwright. You see everything Claude does in real time.
|
|
|
|
```bash
|
|
$B connect # launch real Chrome, headed
|
|
$B goto https://app.com # navigates in the visible window
|
|
$B snapshot -i # refs from the real page
|
|
$B click @e3 # clicks in the real window
|
|
$B focus # bring Chrome window to foreground (macOS)
|
|
$B status # shows Mode: cdp
|
|
$B disconnect # back to headless mode
|
|
```
|
|
|
|
The window has a subtle green shimmer line at the top edge and a floating "gstack" pill in the bottom-right corner so you always know which Chrome window is being controlled.
|
|
|
|
**How it works:** Playwright's `channel: 'chrome'` launches your system Chrome binary via a native pipe protocol — not CDP WebSocket. All existing browse commands work unchanged because they go through Playwright's abstraction layer.
|
|
|
|
**When to use it:**
|
|
- QA testing where you want to watch Claude click through your app
|
|
- Design review where you need to see exactly what Claude sees
|
|
- Debugging where headless behavior differs from real Chrome
|
|
- Demos where you're sharing your screen
|
|
|
|
**Commands:**
|
|
|
|
| Command | What it does |
|
|
|---------|-------------|
|
|
| `connect` | Launch real Chrome, restart server in headed mode |
|
|
| `disconnect` | Close real Chrome, restart in headless mode |
|
|
| `focus` | Bring Chrome to foreground (macOS). `focus @e3` also scrolls element into view |
|
|
| `status` | Shows `Mode: cdp` when connected, `Mode: launched` when headless |
|
|
|
|
**CDP-aware skills:** When in real-browser mode, `/qa` and `/design-review` automatically skip cookie import prompts and headless workarounds.
|
|
|
|
### Chrome extension (Side Panel)
|
|
|
|
A Chrome extension that shows a live activity feed of browse commands in a Side Panel, plus @ref overlays on the page.
|
|
|
|
#### Automatic install (recommended)
|
|
|
|
When you run `$B connect`, the extension **auto-loads** into the Playwright-controlled Chrome window. No manual steps needed — the Side Panel is immediately available.
|
|
|
|
```bash
|
|
$B connect # launches Chrome with extension pre-loaded
|
|
# Click the gstack icon in toolbar → Open Side Panel
|
|
```
|
|
|
|
The port is auto-configured. You're done.
|
|
|
|
#### Manual install (for your regular Chrome)
|
|
|
|
If you want the extension in your everyday Chrome (not the Playwright-controlled one), run:
|
|
|
|
```bash
|
|
bin/gstack-extension # opens chrome://extensions, copies path to clipboard
|
|
```
|
|
|
|
Or do it manually:
|
|
|
|
1. **Go to `chrome://extensions`** in Chrome's address bar
|
|
2. **Toggle "Developer mode" ON** (top-right corner)
|
|
3. **Click "Load unpacked"** — a file picker opens
|
|
4. **Navigate to the extension folder:** Press **Cmd+Shift+G** in the file picker to open "Go to folder", then paste one of these paths:
|
|
- Global install: `~/.claude/skills/gstack/extension`
|
|
- Dev/source: `<gstack-repo>/extension`
|
|
|
|
Press Enter, then click **Select**.
|
|
|
|
(Tip: macOS hides folders starting with `.` — press **Cmd+Shift+.** in the file picker to reveal them if you prefer to navigate manually.)
|
|
|
|
5. **Pin it:** Click the puzzle piece icon (Extensions) in the toolbar → pin "gstack browse"
|
|
6. **Set the port:** Click the gstack icon → enter the port from `$B status` or `.gstack/browse.json`
|
|
7. **Open Side Panel:** Click the gstack icon → "Open Side Panel"
|
|
|
|
#### What you get
|
|
|
|
| Feature | What it does |
|
|
|---------|-------------|
|
|
| **Toolbar badge** | Green dot when the browse server is reachable, gray when not |
|
|
| **Side Panel** | Live scrolling feed of every browse command — shows command name, args, duration, status (success/error) |
|
|
| **Refs tab** | After `$B snapshot`, shows the current @ref list (role + name) |
|
|
| **@ref overlays** | Floating panel on the page showing current refs |
|
|
| **Connection pill** | Small "gstack" pill in the bottom-right corner of every page when connected |
|
|
|
|
#### Troubleshooting
|
|
|
|
- **Badge stays gray:** Check that the port is correct. The browse server may have restarted on a different port — re-run `$B status` and update the port in the popup.
|
|
- **Side Panel is empty:** The feed only shows activity after the extension connects. Run a browse command (`$B snapshot`) to see it appear.
|
|
- **Extension disappeared after Chrome update:** Sideloaded extensions persist across updates. If it's gone, reload it from Step 3.
|
|
|
|
### Sidebar agent
|
|
|
|
The Chrome side panel includes a chat interface. Type a message and a child Claude instance executes it in the browser. The sidebar agent has access to `Bash`, `Read`, `Glob`, and `Grep` tools (same as Claude Code, minus `Edit` and `Write` ... read-only by design).
|
|
|
|
**How it works:**
|
|
|
|
1. You type a message in the side panel chat
|
|
2. The extension POSTs to the local browse server (`/sidebar-command`)
|
|
3. The server queues the message and the sidebar-agent process spawns `claude -p` with your message + the current page context
|
|
4. Claude executes browse commands via Bash (`$B snapshot`, `$B click @e3`, etc.)
|
|
5. Progress streams back to the side panel in real time
|
|
|
|
**What you can do:**
|
|
- "Take a snapshot and describe what you see"
|
|
- "Click the Login button, fill in test@example.com / password123, and submit"
|
|
- "Go through every row in this table and extract the names and emails"
|
|
- "Navigate to Settings > Account and screenshot it"
|
|
|
|
**Timeout:** Each task gets up to 5 minutes. Multi-page workflows (navigating a directory, filling forms across pages) work within this window. If a task times out, the side panel shows an error and you can retry or break it into smaller steps.
|
|
|
|
**Session isolation:** Each sidebar session runs in its own git worktree. The sidebar agent won't interfere with your main Claude Code session.
|
|
|
|
**Authentication:** The sidebar agent uses the same browser session as headed mode. Two options:
|
|
1. Log in manually in the headed browser ... your session persists for the sidebar agent
|
|
2. Import cookies from your real Chrome via `/setup-browser-cookies`
|
|
|
|
**Random delays:** If you need the agent to pause between actions (e.g., to avoid rate limits), use `sleep` in bash or `$B wait <milliseconds>`.
|
|
|
|
### User handoff
|
|
|
|
When the headless browser can't proceed (CAPTCHA, MFA, complex auth), `handoff` opens a visible Chrome window at the exact same page with all cookies, localStorage, and tabs preserved. The user solves the problem manually, then `resume` returns control to the agent with a fresh snapshot.
|
|
|
|
```bash
|
|
$B handoff "Stuck on CAPTCHA at login page" # opens visible Chrome
|
|
# User solves CAPTCHA...
|
|
$B resume # returns to headless with fresh snapshot
|
|
```
|
|
|
|
The browser auto-suggests `handoff` after 3 consecutive failures. State is fully preserved across the switch — no re-login needed.
|
|
|
|
### Dialog handling
|
|
|
|
Dialogs (alert, confirm, prompt) are auto-accepted by default to prevent browser lockup. The `dialog-accept` and `dialog-dismiss` commands control this behavior. For prompts, `dialog-accept <text>` provides the response text. All dialogs are logged to the dialog buffer with type, message, and action taken.
|
|
|
|
### JavaScript execution (`js` and `eval`)
|
|
|
|
`js` runs a single expression, `eval` runs a JS file. Both support `await` — expressions containing `await` are automatically wrapped in an async context:
|
|
|
|
```bash
|
|
$B js "await fetch('/api/data').then(r => r.json())" # works
|
|
$B js "document.title" # also works (no wrapping needed)
|
|
$B eval my-script.js # file with await works too
|
|
```
|
|
|
|
For `eval` files, single-line files return the expression value directly. Multi-line files need explicit `return` when using `await`. Comments containing "await" don't trigger wrapping.
|
|
|
|
### Multi-workspace support
|
|
|
|
Each workspace gets its own isolated browser instance with its own Chromium process, tabs, cookies, and logs. State is stored in `.gstack/` inside the project root (detected via `git rev-parse --show-toplevel`).
|
|
|
|
| Workspace | State file | Port |
|
|
|-----------|------------|------|
|
|
| `/code/project-a` | `/code/project-a/.gstack/browse.json` | random (10000-60000) |
|
|
| `/code/project-b` | `/code/project-b/.gstack/browse.json` | random (10000-60000) |
|
|
|
|
No port collisions. No shared state. Each project is fully isolated.
|
|
|
|
### Environment variables
|
|
|
|
| Variable | Default | Description |
|
|
|----------|---------|-------------|
|
|
| `BROWSE_PORT` | 0 (random 10000-60000) | Fixed port for the HTTP server (debug override) |
|
|
| `BROWSE_IDLE_TIMEOUT` | 1800000 (30 min) | Idle shutdown timeout in ms |
|
|
| `BROWSE_STATE_FILE` | `.gstack/browse.json` | Path to state file (CLI passes to server) |
|
|
| `BROWSE_SERVER_SCRIPT` | auto-detected | Path to server.ts |
|
|
| `BROWSE_CDP_URL` | (none) | Set to `channel:chrome` for real browser mode |
|
|
| `BROWSE_CDP_PORT` | 0 | CDP port (used internally) |
|
|
|
|
### Performance
|
|
|
|
| Tool | First call | Subsequent calls | Context overhead per call |
|
|
|------|-----------|-----------------|--------------------------|
|
|
| Chrome MCP | ~5s | ~2-5s | ~2000 tokens (schema + protocol) |
|
|
| Playwright MCP | ~3s | ~1-3s | ~1500 tokens (schema + protocol) |
|
|
| **gstack browse** | **~3s** | **~100-200ms** | **0 tokens** (plain text stdout) |
|
|
|
|
The context overhead difference compounds fast. In a 20-command browser session, MCP tools burn 30,000-40,000 tokens on protocol framing alone. gstack burns zero.
|
|
|
|
### Why CLI over MCP?
|
|
|
|
MCP (Model Context Protocol) works well for remote services, but for local browser automation it adds pure overhead:
|
|
|
|
- **Context bloat**: every MCP call includes full JSON schemas and protocol framing. A simple "get the page text" costs 10x more context tokens than it should.
|
|
- **Connection fragility**: persistent WebSocket/stdio connections drop and fail to reconnect.
|
|
- **Unnecessary abstraction**: Claude Code already has a Bash tool. A CLI that prints to stdout is the simplest possible interface.
|
|
|
|
gstack skips all of this. Compiled binary. Plain text in, plain text out. No protocol. No schema. No connection management.
|
|
|
|
## Acknowledgments
|
|
|
|
The browser automation layer is built on [Playwright](https://playwright.dev/) by Microsoft. Playwright's accessibility tree API, locator system, and headless Chromium management are what make ref-based interaction possible. The snapshot system — assigning `@ref` labels to accessibility tree nodes and mapping them back to Playwright Locators — is built entirely on top of Playwright's primitives. Thank you to the Playwright team for building such a solid foundation.
|
|
|
|
## Development
|
|
|
|
### Prerequisites
|
|
|
|
- [Bun](https://bun.sh/) v1.0+
|
|
- Playwright's Chromium (installed automatically by `bun install`)
|
|
|
|
### Quick start
|
|
|
|
```bash
|
|
bun install # install dependencies + Playwright Chromium
|
|
bun test # run integration tests (~3s)
|
|
bun run dev <cmd> # run CLI from source (no compile)
|
|
bun run build # compile to browse/dist/browse
|
|
```
|
|
|
|
### Dev mode vs compiled binary
|
|
|
|
During development, use `bun run dev` instead of the compiled binary. It runs `browse/src/cli.ts` directly with Bun, so you get instant feedback without a compile step:
|
|
|
|
```bash
|
|
bun run dev goto https://example.com
|
|
bun run dev text
|
|
bun run dev snapshot -i
|
|
bun run dev click @e3
|
|
```
|
|
|
|
The compiled binary (`bun run build`) is only needed for distribution. It produces a single ~58MB executable at `browse/dist/browse` using Bun's `--compile` flag.
|
|
|
|
### Running tests
|
|
|
|
```bash
|
|
bun test # run all tests
|
|
bun test browse/test/commands # run command integration tests only
|
|
bun test browse/test/snapshot # run snapshot tests only
|
|
bun test browse/test/cookie-import-browser # run cookie import unit tests only
|
|
```
|
|
|
|
Tests spin up a local HTTP server (`browse/test/test-server.ts`) serving HTML fixtures from `browse/test/fixtures/`, then exercise the CLI commands against those pages. 203 tests across 3 files, ~15 seconds total.
|
|
|
|
### Source map
|
|
|
|
| File | Role |
|
|
|------|------|
|
|
| `browse/src/cli.ts` | Entry point. Reads `.gstack/browse.json`, sends HTTP to the server, prints response. |
|
|
| `browse/src/server.ts` | Bun HTTP server. Routes commands to the right handler. Manages idle timeout. |
|
|
| `browse/src/browser-manager.ts` | Chromium lifecycle — launch, tab management, ref map, crash detection. |
|
|
| `browse/src/snapshot.ts` | Parses accessibility tree, assigns `@e`/`@c` refs, builds Locator map. Handles `--diff`, `--annotate`, `-C`. |
|
|
| `browse/src/read-commands.ts` | Non-mutating commands: `text`, `html`, `links`, `js`, `css`, `is`, `dialog`, `forms`, etc. Exports `getCleanText()`. |
|
|
| `browse/src/write-commands.ts` | Mutating commands: `goto`, `click`, `fill`, `upload`, `dialog-accept`, `useragent` (with context recreation), etc. |
|
|
| `browse/src/meta-commands.ts` | Server management, chain routing, diff (DRY via `getCleanText`), snapshot delegation. |
|
|
| `browse/src/cookie-import-browser.ts` | Decrypt Chromium cookies from macOS and Linux browser profiles using platform-specific safe-storage key lookup. Auto-detects installed browsers. |
|
|
| `browse/src/cookie-picker-routes.ts` | HTTP routes for `/cookie-picker/*` — browser list, domain search, import, remove. |
|
|
| `browse/src/cookie-picker-ui.ts` | Self-contained HTML generator for the interactive cookie picker (dark theme, no frameworks). |
|
|
| `browse/src/activity.ts` | Activity streaming — `ActivityEntry` type, `CircularBuffer`, privacy filtering, SSE subscriber management. |
|
|
| `browse/src/buffers.ts` | `CircularBuffer<T>` (O(1) ring buffer) + console/network/dialog capture with async disk flush. |
|
|
|
|
### Deploying to the active skill
|
|
|
|
The active skill lives at `~/.claude/skills/gstack/`. After making changes:
|
|
|
|
1. Push your branch
|
|
2. Pull in the skill directory: `cd ~/.claude/skills/gstack && git pull`
|
|
3. Rebuild: `cd ~/.claude/skills/gstack && bun run build`
|
|
|
|
Or copy the binary directly: `cp browse/dist/browse ~/.claude/skills/gstack/browse/dist/browse`
|
|
|
|
### Adding a new command
|
|
|
|
1. Add the handler in `read-commands.ts` (non-mutating) or `write-commands.ts` (mutating)
|
|
2. Register the route in `server.ts`
|
|
3. Add a test case in `browse/test/commands.test.ts` with an HTML fixture if needed
|
|
4. Run `bun test` to verify
|
|
5. Run `bun run build` to compile
|