Files
gstack/browse/src/browser-manager.ts
T
Garry Tan 3f080de1b7 feat: GStack Browser — double-click AI browser with anti-bot stealth (#695)
* feat: CDP inspector module — persistent sessions, CSS cascade, style modification

New browse/src/cdp-inspector.ts with full CDP inspection engine:
- inspectElement() via CSS.getMatchedStylesForNode + DOM.getBoxModel
- modifyStyle() via CSS.setStyleTexts with headless page.evaluate fallback
- Persistent CDP session lifecycle (create, reuse, detach on nav, re-create)
- Specificity sorting, overridden property detection, UA rule filtering
- Modification history with undo support
- formatInspectorResult() for CLI output

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: browse server inspector endpoints + inspect/style/cleanup/prettyscreenshot CLI

Server endpoints: POST /inspector/pick, GET /inspector, POST /inspector/apply,
POST /inspector/reset, GET /inspector/history, GET /inspector/events (SSE).
CLI commands: inspect (CDP cascade), style (live CSS mod), cleanup (page clutter
removal), prettyscreenshot (clean screenshot pipeline).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: sidebar CSS inspector — element picker, box model, rule cascade, quick edit

Extension changes for the visual CSS inspector:
- inspector.js: element picker with hover highlight, CSS selector generation,
  basic mode fallback (getComputedStyle + CSSOM), page alteration handlers
- inspector.css: picker overlay styles (blue highlight + tooltip)
- background.js: inspector message routing (picker <-> server <-> sidepanel)
- sidepanel: Inspector tab with box model viz (gstack palette), matched rules
  with specificity badges, computed styles, click-to-edit quick edit,
  Send to Agent/Code button, empty/loading/error states

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: document inspect, style, cleanup, prettyscreenshot browse commands

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: auto-track user-created tabs and handle tab close

browser-manager.ts changes:
- context.on('page') listener: automatically tracks tabs opened by the user
  (Cmd+T, right-click open in new tab, window.open). Previously only
  programmatic newTab() was tracked, so user tabs were invisible.
- page.on('close') handler in wirePageEvents: removes closed tabs from the
  pages map and switches activeTabId to the last remaining tab.
- syncActiveTabByUrl: match Chrome extension's active tab URL to the correct
  Playwright page for accurate tab identity.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: per-tab agent isolation via BROWSE_TAB environment variable

Prevents parallel sidebar agents from interfering with each other's tab context.

Three-layer fix:
- sidebar-agent.ts: passes BROWSE_TAB=<tabId> env var to each claude process,
  per-tab processing set allows concurrent agents across tabs
- cli.ts: reads process.env.BROWSE_TAB and includes tabId in command request body
- server.ts: handleCommand() temporarily switches activeTabId when tabId is present,
  restores after command completes (safe: Bun event loop is single-threaded)

Also: per-tab agent state (TabAgentState map), per-tab message queuing,
per-tab chat buffers, verbose streaming narration, stop button endpoint.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: sidebar per-tab chat context, tab bar sync, stop button, UX polish

Extension changes:
- sidepanel.js: per-tab chat history (tabChatHistories map), switchChatTab()
  swaps entire chat view, browserTabActivated handler for instant tab sync,
  stop button wired to /sidebar-agent/stop, pollTabs renders tab bar
- sidepanel.html: updated banner text ("Browser co-pilot"), stop button markup,
  input placeholder "Ask about this page..."
- sidepanel.css: tab bar styles, stop button styles, loading state fixes
- background.js: chrome.tabs.onActivated sends browserTabActivated to sidepanel
  with tab URL for instant tab switch detection

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: per-tab isolation, BROWSE_TAB pinning, tab tracking, sidebar UX

sidebar-agent.test.ts (new tests):
- BROWSE_TAB env var passed to claude process
- CLI reads BROWSE_TAB and sends tabId in body
- handleCommand accepts tabId, saves/restores activeTabId
- Tab pinning only activates when tabId provided
- Per-tab agent state, queue, concurrency
- processingTabs set for parallel agents

sidebar-ux.test.ts (new tests):
- context.on('page') tracks user-created tabs
- page.on('close') removes tabs from pages map
- Tab isolation uses BROWSE_TAB not system prompt hack
- Per-tab chat context in sidepanel
- Tab bar rendering, stop button, banner text

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: resolve merge conflicts — keep security defenses + per-tab isolation

Merged main's security improvements (XML escaping, prompt injection defense,
allowed commands whitelist, --model opus, Write tool, stderr capture) with
our branch's per-tab isolation (BROWSE_TAB env var, processingTabs set,
no --resume). Updated test expectations for expanded system prompt.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.13.9.0)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: add inspector message types to background.js allowlist

Pre-existing bug found by Codex: ALLOWED_TYPES in background.js was missing
all inspector message types (startInspector, stopInspector, elementPicked,
pickerCancelled, applyStyle, toggleClass, injectCSS, resetAll, inspectResult).
Messages were silently rejected, making the inspector broken on ALL pages.

Also: separate executeScript and insertCSS into individual try blocks in
injectInspector(), store inspectorMode for routing, and add content.js
fallback when script injection fails (CSP, chrome:// pages).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: basic element picker in content.js for CSP-restricted pages

When inspector.js can't be injected (CSP, chrome:// pages), content.js
provides a basic picker using getComputedStyle + CSSOM:
- startBasicPicker/stopBasicPicker message handlers
- captureBasicData() with ~30 key CSS properties, box model, matched rules
- Hover highlight with outline save/restore (never leaves artifacts)
- Click uses e.target directly (no re-querying by selector)
- Sends inspectResult with mode:'basic' for sidebar rendering
- Escape key cancels picker and restores outlines

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: cleanup + screenshot buttons in sidebar inspector toolbar

Two action buttons in the inspector toolbar:
- Cleanup (🧹): POSTs cleanup --all to server, shows spinner, chat
  notification on success, resets inspector state (element may be removed)
- Screenshot (📸): POSTs screenshot to server, shows spinner, chat
  notification with saved file path

Shared infrastructure:
- .inspector-action-btn CSS with loading spinner via ::after pseudo-element
- chat-notification type in addChatEntry() for system messages
- package.json version bump to 0.13.9.0

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: inspector allowlist, CSP fallback, cleanup/screenshot buttons

16 new tests in sidebar-ux.test.ts:
- Inspector message allowlist includes all inspector types
- content.js basic picker (startBasicPicker, captureBasicData, CSSOM,
  outline save/restore, inspectResult with mode basic, Escape cleanup)
- background.js CSP fallback (separate try blocks, inspectorMode, fallback)
- Cleanup button (POST /command, inspector reset after success)
- Screenshot button (POST /command, notification rendering)
- Chat notification type and CSS styles

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: update project documentation for v0.13.9.0

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: cleanup + screenshot buttons in chat toolbar (not just inspector)

Quick actions toolbar (🧹 Cleanup, 📸 Screenshot) now appears above the chat
input, always visible. Both inspector and chat buttons share runCleanup() and
runScreenshot() helper functions. Clicking either set shows loading state on
both simultaneously.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: chat toolbar buttons, shared helpers, quick-action-btn styles

Tests that chat toolbar exists (chat-cleanup-btn, chat-screenshot-btn,
quick-actions container), CSS styles (.quick-action-btn, .quick-action-btn.loading),
shared runCleanup/runScreenshot helper functions, and cleanup inspector reset.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: aggressive cleanup heuristics — overlays, scroll unlock, blur removal

Massively expanded CLEANUP_SELECTORS with patterns from uBlock Origin and
Readability.js research:
- ads: 30+ selectors (Google, Amazon, Outbrain, Taboola, Criteo, etc.)
- cookies: OneTrust, Cookiebot, TrustArc, Quantcast + generic patterns
- overlays (NEW): paywalls, newsletter popups, interstitials, push prompts,
  app download banners, survey modals
- social: follow prompts, share tools
- Cleanup now defaults to --all when no args (sidebar button fix)
- Uses !important on all display:none (overrides inline styles)
- Unlocks body/html scroll (overflow:hidden from modal lockout)
- Removes blur/filter effects (paywall content blur)
- Removes max-height truncation (article teaser truncation)
- Collapses empty ad placeholder whitespace (empty divs after ad removal)
- Skips gstack-ctrl indicator in sticky removal

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: disable action buttons when disconnected, no error spam

- setActionButtonsEnabled() toggles .disabled class on all cleanup/screenshot
  buttons (both chat toolbar and inspector toolbar)
- Called with false in updateConnection when server URL is null
- Called with true when connection established
- runCleanup/runScreenshot silently return when disconnected instead of
  showing 'Not connected' error notifications
- CSS .disabled style: pointer-events:none, opacity:0.3, cursor:not-allowed

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: cleanup heuristics, button disabled state, overlay selectors

17 new tests:
- cleanup defaults to --all on empty args
- CLEANUP_SELECTORS overlays category (paywall, newsletter, interstitial)
- Major ad networks in selectors (doubleclick, taboola, criteo, etc.)
- Major consent frameworks (OneTrust, Cookiebot, TrustArc, Quantcast)
- !important override for inline styles
- Scroll unlock (body overflow:hidden)
- Blur removal (paywall content blur)
- Article truncation removal (max-height)
- Empty placeholder collapse
- gstack-ctrl indicator skip in sticky cleanup
- setActionButtonsEnabled function
- Buttons disabled when disconnected
- No error spam from cleanup/screenshot when disconnected
- CSS disabled styles for action buttons

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: LLM-based page cleanup — agent analyzes page semantically

Instead of brittle CSS selectors, the cleanup button now sends a prompt to
the sidebar agent (which IS an LLM). The agent:
1. Runs deterministic $B cleanup --all as a quick first pass
2. Takes a snapshot to see what's left
3. Analyzes the page semantically to identify remaining clutter
4. Removes elements intelligently, preserving site branding

This means cleanup works correctly on any site without site-specific selectors.
The LLM understands that "Your Daily Puzzles" is clutter, "ADVERTISEMENT" is
junk, but the SF Chronicle masthead should stay.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: aggressive cleanup heuristics + preserve top nav bar

Deterministic cleanup improvements (used as first pass before LLM analysis):
- New 'clutter' category: audio players, podcast widgets, sidebar puzzles/games,
  recirculation widgets (taboola, outbrain, nativo), cross-promotion banners
- Text-content detection: removes "ADVERTISEMENT", "Article continues below",
  "Sponsored", "Paid content" labels and their parent wrappers
- Sticky fix: preserves the topmost full-width element near viewport top (site
  nav bar) instead of hiding all sticky/fixed elements. Sorts by vertical
  position, preserves the first one that spans >80% viewport width.

Tests: clutter category, ad label removal, nav bar preservation logic.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: LLM-based cleanup architecture, deterministic heuristics, sticky nav

22 new tests covering:
- Cleanup button uses /sidebar-command (agent) not /command (deterministic)
- Cleanup prompt includes deterministic first pass + agent snapshot analysis
- Cleanup prompt lists specific clutter categories for agent guidance
- Cleanup prompt preserves site identity (masthead, headline, body, byline)
- Cleanup prompt instructs scroll unlock and $B eval removal
- Loading state management (async agent, setTimeout)
- Deterministic clutter: audio/podcast, games/puzzles, recirculation
- Ad label text patterns (ADVERTISEMENT, Sponsored, Article continues)
- Ad label parent wrapper hiding for small containers
- Sticky nav preservation (sort by position, first full-width near top)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: GStack Browser stealth + branding — anti-bot patches, custom UA, rebrand

- Add GSTACK_CHROMIUM_PATH env var for custom Chromium binary
- Add BROWSE_EXTENSIONS_DIR env var for extension path override
- Move auth token to /health endpoint (fixes read-only .app bundles)
- Anti-bot stealth: disable navigator.webdriver, fake plugins, languages
- Custom user agent: Chrome/<version> GStackBrowser (auto-detects version)
- Rebrand Chromium plist to "GStack Browser" at launch time
- Update security test to match new token-via-health approach

* feat: GStack Browser .app bundle — launcher script + build system

- scripts/app/gstack-browser: dual-mode launcher (dev + .app bundle)
- scripts/build-app.sh: compiles binary, bundles Chromium + extension, creates DMG
- Rebrands Chromium plist during build for "GStack Browser" in menu bar
- 389MB .app, 189MB compressed DMG, launches in ~5s

* docs: GStack Browser V0 master plan — AI-native development browser vision

5-phase roadmap from .app wrapper through Chromium fork, 9 capability
visions, competitive landscape, architecture diagrams, design system.

* fix: restore package.json and sync version to 0.14.3.0

* chore: bump version and changelog (v0.14.4.0)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: gitignore top-level dist/ (GStack Browser build output)

* feat: GStack Browser icon — custom .icns replaces Chromium's Dock icon

- Generated 1024px icon: dark terminal window with amber prompt cursor
- Converted to .icns with all macOS sizes (16-1024px, 1x and 2x)
- build-app.sh copies icon into both the outer .app and bundled Chromium's
  Resources (Chromium's process owns the Dock icon, not the launcher)
- browser-manager.ts patches Chromium's icon at runtime for dev mode too
- Both the Dock and Cmd+Tab now show the GStack icon

* feat: rename /connect-chrome → /open-gstack-browser

- Rename skill directory + update frontmatter name and description
- Update SKILL.md.tmpl to reference GStack Browser branding/stealth
- Create connect-chrome symlink for backwards compatibility
- Setup script creates /connect-chrome alias in .claude/skills/
- Fix package.json version sync (0.14.5.0 → 0.14.6.0)

* feat: rename /connect-chrome → /open-gstack-browser across all references

Update README skill lists, docs/skills.md deep dive, extension sidepanel
banner copy button, and reconnect clipboard text.

* feat: left-align sidebar UI + extension-ready event for welcome page

- Left-align all sidebar text (chat welcome, loading, empty states,
  notifications, inspector empty, session placeholder)
- Dispatch 'gstack-extension-ready' CustomEvent from content.js so
  the welcome page can detect when the sidebar is active

* chore: add GStack Browser TODOs — CDP stealth patches + Chromium fork

P1: rebrowser-style postinstall patcher for Playwright 1.58.2 (suppress
Runtime.enable, addBinding context discovery, 6 files, ~200 lines).
P2: long-term Chromium fork for permanent stealth + native sidebar.

* chore: regenerate open-gstack-browser/SKILL.md from template

Fix timeline skill name (connect-chrome → open-gstack-browser) and
preamble formatting from merge with main's updated template.

* feat: welcome page served from browse server on headed launch

- Add /welcome endpoint to server.ts, serves welcome.html
- Navigate to /welcome after server starts (not during launchHeaded,
  which runs before the server is listening)
- welcome.html bundled in browse/src/ for portability

* feat: auto-open sidebar on every browser launch, not just first install

- Add top-level setTimeout in background.js that fires on every service
  worker startup (onInstalled only fires on install/update)
- Remove misaligned arrow from welcome page, replace with text fallback
  that hides when extension content script fires gstack-extension-ready

* fix: sidebar auto-open retry with backoff + welcome page tests

- Replace single-attempt sidePanel.open() with autoOpenSidePanel() that
  retries up to 5 times with 500ms-5000ms backoff
- Fire on both onInstalled AND every service worker startup
- Remove misaligned arrow from welcome page, replace with text fallback
- Add 12 tests: welcome page structure, /welcome endpoint, headed launch
  navigation timing, sidebar auto-open retry logic, extension-ready event

* feat: reload button in sidebar footer

Adds a "reload" button next to "debug" and "clear" in the sidebar
footer. Calls location.reload() to fully refresh the side panel,
re-run connection logic, and clear stale state.

* feat: right-pointing arrow hint for sidebar on welcome page

Replace invisible text fallback with visible amber bubble + animated
right arrow (→) pointing toward where the sidebar opens. Always correct
regardless of window size (unlike the old up arrow at toolbar chrome).

* fix: sidebar auth race — pass token in getPort response

The sidebar called tryConnect() → getPort → got {port, connected} but
NO token. All subsequent requests (SSE, chat poll) failed with 401.
The token only arrived later via the health broadcast, but by then
the SSE connection was already broken.

Fix: include authToken in the getPort response so the sidebar has
the token from its very first connection attempt.

* feat: sidebar debug visibility + auth race tests

- Show attempt count in loading screen ("Connecting... attempt 3")
- After 5 failed attempts, show debug details (port, connected, token)
  so stuck users can see exactly what's failing
- Add 4 tests: getPort includes token, tryConnect uses token,
  dead state exists with MAX_RECONNECT_ATTEMPTS, reconnectAttempts visible

* fix: startup health check retries every 1s instead of 10s

Root cause: extension service worker starts before Bun.serve() is
listening. First checkHealth() fails, next attempt is 10 seconds
later. User stares at "Connecting..." for 10 seconds.

Fix: retry every 1s for up to 15 attempts on startup, then switch
to 10s polling once connected (or after 15s gives up). Sidebar
should connect within 1-2 seconds of server becoming available.

3 new tests verify the fast-retry → slow-poll transition.

* feat: detailed step-by-step status in sidebar loading screen

Replace useless "Connecting..." with real-time debug info:
- "Looking for browse server... (attempt N)"
- Shows port, server responding status, token status
- Shows chrome.runtime errors if extension messaging fails
- Tells user to run /open-gstack-browser if server not found

* fix: sidebar connects directly to /health instead of waiting for background

Root cause: sidepanel asked background "are you connected?" but background's
health check hadn't succeeded yet (1-10s gap). Sidepanel waited forever.

Fix: when background says not connected, sidepanel hits /health directly
with fetch(). Gets the token from the response. Bypasses background
entirely for initial connection. Shows step-by-step debug info:
"Checking server directly... port: 34567 / Trying GET /health..."

* fix: suppress fake "session ended" and timeout errors in sidebar

Two issues making the sidebar look broken when it's actually working:
1. "Timed out after 300s" error displayed after agent_done — this is a
   cleanup timer, not a real error. Now suppressed when no active session.
2. "(session ended)" text appended on every idle poll — removed entirely.
   The thinking spinner is cleaned up silently instead.

* fix: sidebar agent passes BROWSE_PORT to child claude

Ensures the child claude process connects to the existing headed
browse server (port 34567) instead of spawning a new headless one.
Without this, sidebar chat commands run in an invisible browser.

* feat: BROWSE_NO_AUTOSTART prevents sidebar from spawning headless browser

When set, the browse CLI refuses to start a new server and exits with
a clear error: "Server not available, run /open-gstack-browser to restart."
The sidebar agent sets this so users never get an invisible headless
browser when the headed one is closed.

* test: BROWSE_NO_AUTOSTART guard in CLI + sidebar-agent env vars

5 tests: CLI checks env var before starting server, shows actionable
error, sidebar-agent sets the flag + BROWSE_PORT, guard runs before
lock acquisition to prevent stale lock files.

* fix: stale auth token causes Unauthorized + invisible error text

background.js checkHealth() never refreshed authToken from /health responses,
so when the browse server restarted with a new token, all sidebar-command
requests got 401 Unauthorized forever.

Also: error placeholder text was #3f3f46 on #0C0C0C (nearly invisible).
Now shows in red to match the error border.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: replace 40+ silent catch blocks with debug logging

Every empty catch {} in sidepanel.js, sidebar-agent.ts now logs with
[gstack sidebar] or [sidebar-agent] prefix. Chat poll 401s, stop agent,
tab poll, clear chat, SSE parse, refs fetch, stream JSON parse, queue
read/parse, process kill — all now visible in console.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: noisy debug logging + auto model routing in browse server

Server-side silent catch blocks (22 instances) now log with [browse] prefix:
chat persistence, session save/load, agent kill, tab pin/restore, welcome
page, buffer flush, worktree cleanup, lock files, SSE streams.

Also adds pickSidebarModel() — routes sidebar messages to sonnet for
navigation/interaction (click, goto, fill, screenshot) and opus for
analysis/comprehension (summarize, describe, find bugs). Sonnet is
~4x faster for action commands with zero quality difference.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: update sidebar tests for model router + longer stopAgent slice

- stopAgent slice 800→1000 to accommodate added error logging lines
- Replace hardcoded opus assertion with model router assertions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: sidebar arrow hint stays visible until sidebar actually opens

Previously the welcome page arrow hid immediately when the extension's
content script loaded — but extension loaded ≠ sidebar open. Now the
signal flow is: sidepanel connects → tells background.js → relays to
content script → dispatches gstack-extension-ready → arrow hides.

Adds welcome-page.test.ts: 14 tests verifying arrow, branding, feature
cards, dark theme, and auto-hide behavior via real HTTP server.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: arrow hide signal chain (4-step) + stale session-ended assertion

8 new tests verify the sidebarOpened → background → content → welcome
signal chain. Updates stale "(session ended)" test that checked for
text removed in a prior commit.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: preserve optimistic UI during tab switch on first message

When the user sends a message and the server assigns it to a new tab
(because Chrome's active tab changed), switchChatTab() was blowing away
the optimistic user bubble and thinking dots with a welcome screen.
Now preserves the current DOM if we're mid-send with a thinking indicator.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: sidebar message flow architecture doc + CLAUDE.md pointer

SIDEBAR_MESSAGE_FLOW.md documents the full init timeline, message flow
(user types → claude responds), auth token chain, arrow hint signal
chain, model routing, tab concurrency, and known failure modes.

CLAUDE.md now tells you to read it before touching sidebar files.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: sidebar chat resets idle timer + shutdown kills sidebar-agent

Two fixes for the "browser died while chatting" problem:

1. /sidebar-command now calls resetIdleTimer(). Previously only CLI
   commands reset it, so the server would shut down after 30 min even
   while the user was actively chatting in the sidebar.

2. shutdown() now pkills the sidebar-agent daemon. Previously the agent
   survived server shutdown, kept polling a dead server, and spawned
   confused claude processes that auto-started headless browsers.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: disable idle timeout in headed mode — browser lives until closed

The 30-minute idle timeout only applies to headless mode now. In headed
mode the user is looking at the Chrome window, so auto-shutdown is wrong.
The browser stays alive until explicit disconnect or window close.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: cookies button in sidebar footer opens cookie picker

One-click cookie import from the sidebar. Navigates the headed browser
to /cookie-picker where you can select which domains to import from
your real Chrome profile.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: update project documentation for GStack Browser improvements

README.md: updated Real browser mode and sidebar agent sections with
model routing, cookie import button, no idle timeout in headed mode.
Updated skill table entries for /browse and /open-gstack-browser.

docs/skills.md: updated /open-gstack-browser deep dive with model
routing and cookie import details.

GSTACK_BROWSER_V0.md: added 6 new SHIPPED items to implementation
status table (model routing, debug logging, idle timeout, cookie
button, arrow hint, architecture doc).

TODOS.md: marked "Sidebar agent Write tool + error visibility" as
SHIPPED. Added new P2 TODO for direct API calls to eliminate
claude -p startup tax.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add Claude Code terminal example to welcome page TRY IT NOW

Fifth example shows the parent agent workflow: navigate, extract CSS,
write to file. The other four are all sidebar-only. This one shows
co-presence — the Claude Code session that launched the browser can
also control it directly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: hide internal tool-result file reads from sidebar activity

Claude reads its own ~/.claude/projects/.../tool-results/ files as
internal plumbing. These showed up as long unreadable paths in the
sidebar. Now: describeToolCall returns empty for tool-result reads,
and the sidebar skips rendering tool_use entries with no description.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: collapse tool calls into "See reasoning" disclosure on completion

While the agent is working, tool calls stream live so you can watch
progress. When the agent finishes, all tool calls collapse into a
"See reasoning (N steps)" disclosure. Click to expand and see what
the agent did. The final text answer stays visible.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: 17 new tests for recent sidebar fixes

Covers: tool-result file filtering, empty tool_use skip, reasoning
disclosure collapse, idle timeout headed mode bypass, sidebar-command
idle reset, shutdown sidebar-agent kill, cookie button, and model
routing analysis-before-action priority.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: move cookies button to quick actions toolbar

Cookies now sits next to Cleanup and Screenshot as a primary action
button (🍪 Cookies) instead of buried in the footer. Same behavior,
more discoverable.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add instructional text to cookie picker page

"Select the domains of cookies you want to import to GStack Browser.
You'll be able to browse those sites with the same login as your
other browser."

Also fixes stale test that expected hardcoded '--model', 'opus'.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: 6-card welcome page with cookie import + dual-agent cards

3x2 grid layout (was 2x2). New cards: "Import your cookies" (click
🍪 Cookies to import login sessions from Chrome/Arc/Brave) and
"Or use your main agent" (your Claude Code terminal also controls
this browser). Responsive: 3 cols > 2 cols > 1 col.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: move sidebar arrow hint to top-right instead of vertically centered

The arrow was centered vertically which put it behind the feature cards.
Now positioned at top: 80px where there's open space and it's more visible.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 10:17:05 -07:00

1155 lines
42 KiB
TypeScript

/**
* Browser lifecycle manager
*
* Chromium crash handling:
* browser.on('disconnected') → log error → process.exit(1)
* CLI detects dead server → auto-restarts on next command
* We do NOT try to self-heal — don't hide failure.
*
* Dialog handling:
* page.on('dialog') → auto-accept by default → store in dialog buffer
* Prevents browser lockup from alert/confirm/prompt
*
* Context recreation (useragent):
* recreateContext() saves cookies/storage/URLs, creates new context,
* restores state. Falls back to clean slate on any failure.
*/
import { chromium, type Browser, type BrowserContext, type BrowserContextOptions, type Page, type Locator, type Cookie } from 'playwright';
import { addConsoleEntry, addNetworkEntry, addDialogEntry, networkBuffer, type DialogEntry } from './buffers';
import { validateNavigationUrl } from './url-validation';
export interface RefEntry {
locator: Locator;
role: string;
name: string;
}
export interface BrowserState {
cookies: Cookie[];
pages: Array<{
url: string;
isActive: boolean;
storage: { localStorage: Record<string, string>; sessionStorage: Record<string, string> } | null;
}>;
}
export class BrowserManager {
private browser: Browser | null = null;
private context: BrowserContext | null = null;
private pages: Map<number, Page> = new Map();
private activeTabId: number = 0;
private nextTabId: number = 1;
private extraHeaders: Record<string, string> = {};
private customUserAgent: string | null = null;
/** Server port — set after server starts, used by cookie-import-browser command */
public serverPort: number = 0;
// ─── Ref Map (snapshot → @e1, @e2, @c1, @c2, ...) ────────
private refMap: Map<string, RefEntry> = new Map();
// ─── Snapshot Diffing ─────────────────────────────────────
// NOT cleared on navigation — it's a text baseline for diffing
private lastSnapshot: string | null = null;
// ─── Dialog Handling ──────────────────────────────────────
private dialogAutoAccept: boolean = true;
private dialogPromptText: string | null = null;
// ─── Handoff State ─────────────────────────────────────────
private isHeaded: boolean = false;
private consecutiveFailures: number = 0;
// ─── Watch Mode ─────────────────────────────────────────
private watching = false;
public watchInterval: ReturnType<typeof setInterval> | null = null;
private watchSnapshots: string[] = [];
private watchStartTime: number = 0;
// ─── Headed State ────────────────────────────────────────
private connectionMode: 'launched' | 'headed' = 'launched';
private intentionalDisconnect = false;
getConnectionMode(): 'launched' | 'headed' { return this.connectionMode; }
// ─── Watch Mode Methods ─────────────────────────────────
isWatching(): boolean { return this.watching; }
startWatch(): void {
this.watching = true;
this.watchSnapshots = [];
this.watchStartTime = Date.now();
}
stopWatch(): { snapshots: string[]; duration: number } {
this.watching = false;
if (this.watchInterval) {
clearInterval(this.watchInterval);
this.watchInterval = null;
}
const snapshots = this.watchSnapshots;
const duration = Date.now() - this.watchStartTime;
this.watchSnapshots = [];
this.watchStartTime = 0;
return { snapshots, duration };
}
addWatchSnapshot(snapshot: string): void {
this.watchSnapshots.push(snapshot);
}
/**
* Find the gstack Chrome extension directory.
* Checks: repo root /extension, global install, dev install.
*/
private findExtensionPath(): string | null {
const fs = require('fs');
const path = require('path');
const candidates = [
// Explicit override via env var (used by GStack Browser.app bundle)
process.env.BROWSE_EXTENSIONS_DIR || '',
// Relative to this source file (dev mode: browse/src/ -> ../../extension)
path.resolve(__dirname, '..', '..', 'extension'),
// Global gstack install
path.join(process.env.HOME || '', '.claude', 'skills', 'gstack', 'extension'),
// Git repo root (detected via BROWSE_STATE_FILE location)
(() => {
const stateFile = process.env.BROWSE_STATE_FILE || '';
if (stateFile) {
const repoRoot = path.resolve(path.dirname(stateFile), '..');
return path.join(repoRoot, '.claude', 'skills', 'gstack', 'extension');
}
return '';
})(),
].filter(Boolean);
for (const candidate of candidates) {
try {
if (fs.existsSync(path.join(candidate, 'manifest.json'))) {
return candidate;
}
} catch {}
}
return null;
}
/**
* Get the ref map for external consumers (e.g., /refs endpoint).
*/
getRefMap(): Array<{ ref: string; role: string; name: string }> {
const refs: Array<{ ref: string; role: string; name: string }> = [];
for (const [ref, entry] of this.refMap) {
refs.push({ ref, role: entry.role, name: entry.name });
}
return refs;
}
async launch() {
// ─── Extension Support ────────────────────────────────────
// BROWSE_EXTENSIONS_DIR points to an unpacked Chrome extension directory.
// Extensions only work in headed mode, so we use an off-screen window.
const extensionsDir = process.env.BROWSE_EXTENSIONS_DIR;
const launchArgs: string[] = [];
let useHeadless = true;
// Docker/CI: Chromium sandbox requires unprivileged user namespaces which
// are typically disabled in containers. Detect container environment and
// add --no-sandbox automatically.
if (process.env.CI || process.env.CONTAINER) {
launchArgs.push('--no-sandbox');
}
if (extensionsDir) {
launchArgs.push(
`--disable-extensions-except=${extensionsDir}`,
`--load-extension=${extensionsDir}`,
'--window-position=-9999,-9999',
'--window-size=1,1',
);
useHeadless = false; // extensions require headed mode; off-screen window simulates headless
console.log(`[browse] Extensions loaded from: ${extensionsDir}`);
}
this.browser = await chromium.launch({
headless: useHeadless,
// On Windows, Chromium's sandbox fails when the server is spawned through
// the Bun→Node process chain (GitHub #276). Disable it — local daemon
// browsing user-specified URLs has marginal sandbox benefit.
chromiumSandbox: process.platform !== 'win32',
...(launchArgs.length > 0 ? { args: launchArgs } : {}),
});
// Chromium crash → exit with clear message
this.browser.on('disconnected', () => {
console.error('[browse] FATAL: Chromium process crashed or was killed. Server exiting.');
console.error('[browse] Console/network logs flushed to .gstack/browse-*.log');
process.exit(1);
});
const contextOptions: BrowserContextOptions = {
viewport: { width: 1280, height: 720 },
};
if (this.customUserAgent) {
contextOptions.userAgent = this.customUserAgent;
}
this.context = await this.browser.newContext(contextOptions);
if (Object.keys(this.extraHeaders).length > 0) {
await this.context.setExtraHTTPHeaders(this.extraHeaders);
}
// Create first tab
await this.newTab();
}
// ─── Headed Mode ─────────────────────────────────────────────
/**
* Launch Playwright's bundled Chromium in headed mode with the gstack
* Chrome extension auto-loaded. Uses launchPersistentContext() which
* is required for extension loading (launch() + newContext() can't
* load extensions).
*
* The browser launches headed with a visible window — the user sees
* every action Claude takes in real time.
*/
async launchHeaded(authToken?: string): Promise<void> {
// Clear old state before repopulating
this.pages.clear();
this.refMap.clear();
this.nextTabId = 1;
// Find the gstack extension directory for auto-loading
const extensionPath = this.findExtensionPath();
const launchArgs = [
'--hide-crash-restore-bubble',
// Anti-bot-detection: remove the navigator.webdriver flag that Playwright sets.
// Sites like Google and NYTimes check this to block automation browsers.
'--disable-blink-features=AutomationControlled',
];
if (extensionPath) {
launchArgs.push(`--disable-extensions-except=${extensionPath}`);
launchArgs.push(`--load-extension=${extensionPath}`);
// Write auth token for extension bootstrap.
// Write to ~/.gstack/.auth.json (not the extension dir, which may be read-only
// in .app bundles and breaks codesigning).
if (authToken) {
const fs = require('fs');
const path = require('path');
const gstackDir = path.join(process.env.HOME || '/tmp', '.gstack');
fs.mkdirSync(gstackDir, { recursive: true });
const authFile = path.join(gstackDir, '.auth.json');
try {
fs.writeFileSync(authFile, JSON.stringify({ token: authToken, port: this.serverPort || 34567 }), { mode: 0o600 });
} catch (err: any) {
console.warn(`[browse] Could not write .auth.json: ${err.message}`);
}
}
}
// Launch headed Chromium via Playwright's persistent context.
// Extensions REQUIRE launchPersistentContext (not launch + newContext).
// Real Chrome (executablePath/channel) silently blocks --load-extension,
// so we use Playwright's bundled Chromium which reliably loads extensions.
const fs = require('fs');
const path = require('path');
const userDataDir = path.join(process.env.HOME || '/tmp', '.gstack', 'chromium-profile');
fs.mkdirSync(userDataDir, { recursive: true });
// Support custom Chromium binary via GSTACK_CHROMIUM_PATH env var.
// Used by GStack Browser.app to point at the bundled Chromium.
const executablePath = process.env.GSTACK_CHROMIUM_PATH || undefined;
// Rebrand Chromium → GStack Browser in macOS menu bar / Dock / Cmd+Tab.
// Patch the Chromium .app's Info.plist so macOS shows our name.
// This works for both dev mode (system Playwright cache) and .app bundle.
const chromePath = executablePath || chromium.executablePath();
try {
// Walk up from binary to the .app's Info.plist
// e.g. .../Google Chrome for Testing.app/Contents/MacOS/Google Chrome for Testing
// → .../Google Chrome for Testing.app/Contents/Info.plist
const chromeContentsDir = path.resolve(path.dirname(chromePath), '..');
const chromePlist = path.join(chromeContentsDir, 'Info.plist');
if (fs.existsSync(chromePlist)) {
const plistContent = fs.readFileSync(chromePlist, 'utf-8');
if (plistContent.includes('Google Chrome for Testing')) {
const patched = plistContent
.replace(/Google Chrome for Testing/g, 'GStack Browser');
fs.writeFileSync(chromePlist, patched);
}
// Replace Chromium's Dock icon with ours (Chromium's process owns the Dock icon)
const iconCandidates = [
path.join(__dirname, '..', '..', 'scripts', 'app', 'icon.icns'), // repo dev mode
path.join(process.env.HOME || '', '.claude', 'skills', 'gstack', 'scripts', 'app', 'icon.icns'), // global install
];
const iconSrc = iconCandidates.find(p => fs.existsSync(p));
if (iconSrc) {
const chromeResources = path.join(chromeContentsDir, 'Resources');
// Read original icon name from plist
const iconMatch = plistContent.match(/<key>CFBundleIconFile<\/key>\s*<string>([^<]+)<\/string>/);
let origIcon = iconMatch ? iconMatch[1] : 'app';
if (!origIcon.endsWith('.icns')) origIcon += '.icns';
const destIcon = path.join(chromeResources, origIcon);
try { fs.copyFileSync(iconSrc, destIcon); } catch { /* non-fatal */ }
}
}
} catch {
// Non-fatal: app name just stays as Chrome for Testing
}
// Build custom user agent: keep Chrome version for site compatibility,
// but replace "Chrome for Testing" branding with "GStackBrowser"
let customUA: string | undefined;
if (!this.customUserAgent) {
// Detect Chrome version from the Chromium binary
const chromePath = executablePath || chromium.executablePath();
try {
const versionProc = Bun.spawnSync([chromePath, '--version'], {
stdout: 'pipe', stderr: 'pipe', timeout: 5000,
});
const versionOutput = versionProc.stdout.toString().trim();
// Output like: "Google Chrome for Testing 145.0.6422.0" or "Chromium 145.0.6422.0"
const versionMatch = versionOutput.match(/(\d+\.\d+\.\d+\.\d+)/);
const chromeVersion = versionMatch ? versionMatch[1] : '131.0.0.0';
customUA = `Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/${chromeVersion} Safari/537.36 GStackBrowser`;
} catch {
// Fallback: generic modern Chrome UA
customUA = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36 GStackBrowser';
}
}
this.context = await chromium.launchPersistentContext(userDataDir, {
headless: false,
args: launchArgs,
viewport: null, // Use browser's default viewport (real window size)
userAgent: this.customUserAgent || customUA,
...(executablePath ? { executablePath } : {}),
// Playwright adds flags that block extension loading
ignoreDefaultArgs: [
'--disable-extensions',
'--disable-component-extensions-with-background-pages',
],
});
this.browser = this.context.browser();
this.connectionMode = 'headed';
this.intentionalDisconnect = false;
// ─── Anti-bot-detection stealth patches ───────────────────────
// Playwright's Chromium is detected by sites like Google/NYTimes via:
// 1. navigator.webdriver = true (handled by --disable-blink-features above)
// 2. Missing plugins array (real Chrome has PDF viewer, etc.)
// 3. Missing languages
// 4. CDP runtime detection (window.cdc_* variables)
// 5. Permissions API returning 'denied' for notifications
await this.context.addInitScript(() => {
// Fake plugins array (real Chrome has at least PDF Viewer)
Object.defineProperty(navigator, 'plugins', {
get: () => {
const plugins = [
{ name: 'PDF Viewer', filename: 'internal-pdf-viewer', description: 'Portable Document Format' },
{ name: 'Chrome PDF Viewer', filename: 'internal-pdf-viewer', description: '' },
{ name: 'Chromium PDF Viewer', filename: 'internal-pdf-viewer', description: '' },
];
(plugins as any).namedItem = (name: string) => plugins.find(p => p.name === name) || null;
(plugins as any).refresh = () => {};
return plugins;
},
});
// Fake languages (Playwright sometimes sends empty)
Object.defineProperty(navigator, 'languages', {
get: () => ['en-US', 'en'],
});
// Remove CDP runtime artifacts that automation detectors look for
// cdc_ prefixed vars are injected by ChromeDriver/CDP
const cleanup = () => {
for (const key of Object.keys(window)) {
if (key.startsWith('cdc_') || key.startsWith('__webdriver')) {
try { delete (window as any)[key]; } catch {}
}
}
};
cleanup();
// Re-clean after a tick in case they're injected late
setTimeout(cleanup, 0);
// Override Permissions API to return 'prompt' for notifications
// (automation browsers return 'denied' which is a fingerprint)
const originalQuery = window.navigator.permissions?.query;
if (originalQuery) {
(window.navigator.permissions as any).query = (params: any) => {
if (params.name === 'notifications') {
return Promise.resolve({ state: 'prompt', onchange: null } as PermissionStatus);
}
return originalQuery.call(window.navigator.permissions, params);
};
}
});
// Inject visual indicator — subtle top-edge amber gradient
// Extension's content script handles the floating pill
const indicatorScript = () => {
const injectIndicator = () => {
if (document.getElementById('gstack-ctrl')) return;
const topLine = document.createElement('div');
topLine.id = 'gstack-ctrl';
topLine.style.cssText = `
position: fixed; top: 0; left: 0; right: 0; height: 2px;
background: linear-gradient(90deg, #F59E0B, #FBBF24, #F59E0B);
background-size: 200% 100%;
animation: gstack-shimmer 3s linear infinite;
pointer-events: none; z-index: 2147483647;
opacity: 0.8;
`;
const style = document.createElement('style');
style.textContent = `
@keyframes gstack-shimmer {
0% { background-position: 200% 0; }
100% { background-position: -200% 0; }
}
@media (prefers-reduced-motion: reduce) {
#gstack-ctrl { animation: none !important; }
}
`;
document.documentElement.appendChild(style);
document.documentElement.appendChild(topLine);
};
if (document.readyState === 'loading') {
document.addEventListener('DOMContentLoaded', injectIndicator);
} else {
injectIndicator();
}
};
await this.context.addInitScript(indicatorScript);
// Track user-created tabs automatically (Cmd+T, link opens in new tab, etc.)
this.context.on('page', (page) => {
const id = this.nextTabId++;
this.pages.set(id, page);
this.activeTabId = id;
this.wirePageEvents(page);
// Inject indicator on the new tab
page.evaluate(indicatorScript).catch(() => {});
console.log(`[browse] New tab detected (id=${id}, total=${this.pages.size})`);
});
// Persistent context opens a default page — adopt it instead of creating a new one
const existingPages = this.context.pages();
if (existingPages.length > 0) {
const page = existingPages[0];
const id = this.nextTabId++;
this.pages.set(id, page);
this.activeTabId = id;
this.wirePageEvents(page);
// Inject indicator on restored page (addInitScript only fires on new navigations)
try { await page.evaluate(indicatorScript); } catch {}
} else {
await this.newTab();
}
// Browser disconnect handler — exit code 2 distinguishes from crashes (1)
if (this.browser) {
this.browser.on('disconnected', () => {
if (this.intentionalDisconnect) return;
console.error('[browse] Real browser disconnected (user closed or crashed).');
console.error('[browse] Run `$B connect` to reconnect.');
process.exit(2);
});
}
// Headed mode defaults
this.dialogAutoAccept = false; // Don't dismiss user's real dialogs
this.isHeaded = true;
this.consecutiveFailures = 0;
}
async close() {
if (this.browser || (this.connectionMode === 'headed' && this.context)) {
if (this.connectionMode === 'headed') {
// Headed/persistent context mode: close the context (which closes the browser)
this.intentionalDisconnect = true;
if (this.browser) this.browser.removeAllListeners('disconnected');
await Promise.race([
this.context ? this.context.close() : Promise.resolve(),
new Promise(resolve => setTimeout(resolve, 5000)),
]).catch(() => {});
} else {
// Launched mode: close the browser we spawned
this.browser.removeAllListeners('disconnected');
await Promise.race([
this.browser.close(),
new Promise(resolve => setTimeout(resolve, 5000)),
]).catch(() => {});
}
this.browser = null;
}
}
/** Health check — verifies Chromium is connected AND responsive */
async isHealthy(): Promise<boolean> {
if (!this.browser || !this.browser.isConnected()) return false;
try {
const page = this.pages.get(this.activeTabId);
if (!page) return true; // connected but no pages — still healthy
await Promise.race([
page.evaluate('1'),
new Promise((_, reject) => setTimeout(() => reject(new Error('timeout')), 2000)),
]);
return true;
} catch {
return false;
}
}
// ─── Tab Management ────────────────────────────────────────
async newTab(url?: string): Promise<number> {
if (!this.context) throw new Error('Browser not launched');
// Validate URL before allocating page to avoid zombie tabs on rejection
if (url) {
await validateNavigationUrl(url);
}
const page = await this.context.newPage();
const id = this.nextTabId++;
this.pages.set(id, page);
this.activeTabId = id;
// Wire up console/network/dialog capture
this.wirePageEvents(page);
if (url) {
await page.goto(url, { waitUntil: 'domcontentloaded', timeout: 15000 });
}
return id;
}
async closeTab(id?: number): Promise<void> {
const tabId = id ?? this.activeTabId;
const page = this.pages.get(tabId);
if (!page) throw new Error(`Tab ${tabId} not found`);
await page.close();
this.pages.delete(tabId);
// Switch to another tab if we closed the active one
if (tabId === this.activeTabId) {
const remaining = [...this.pages.keys()];
if (remaining.length > 0) {
this.activeTabId = remaining[remaining.length - 1];
} else {
// No tabs left — create a new blank one
await this.newTab();
}
}
}
switchTab(id: number, opts?: { bringToFront?: boolean }): void {
if (!this.pages.has(id)) throw new Error(`Tab ${id} not found`);
this.activeTabId = id;
this.activeFrame = null; // Frame context is per-tab
// Only bring to front when explicitly requested (user-initiated tab switch).
// Internal tab pinning (BROWSE_TAB) should NOT steal focus.
if (opts?.bringToFront !== false) {
const page = this.pages.get(id);
if (page) page.bringToFront().catch(() => {});
}
}
/**
* Sync activeTabId to match the tab whose URL matches the Chrome extension's
* active tab. Called on every /sidebar-tabs poll so manual tab switches in
* the browser are detected within ~2s.
*/
syncActiveTabByUrl(activeUrl: string): void {
if (!activeUrl || this.pages.size <= 1) return;
// Try exact match first, then fuzzy match (origin+pathname, ignoring query/fragment)
let fuzzyId: number | null = null;
let activeOriginPath = '';
try {
const u = new URL(activeUrl);
activeOriginPath = u.origin + u.pathname;
} catch {}
for (const [id, page] of this.pages) {
try {
const pageUrl = page.url();
// Exact match — best case
if (pageUrl === activeUrl && id !== this.activeTabId) {
this.activeTabId = id;
this.activeFrame = null;
return;
}
// Fuzzy match — origin+pathname (handles query param / fragment differences)
if (activeOriginPath && fuzzyId === null && id !== this.activeTabId) {
try {
const pu = new URL(pageUrl);
if (pu.origin + pu.pathname === activeOriginPath) {
fuzzyId = id;
}
} catch {}
}
} catch {}
}
// Fall back to fuzzy match
if (fuzzyId !== null) {
this.activeTabId = fuzzyId;
this.activeFrame = null;
}
}
getActiveTabId(): number {
return this.activeTabId;
}
getTabCount(): number {
return this.pages.size;
}
async getTabListWithTitles(): Promise<Array<{ id: number; url: string; title: string; active: boolean }>> {
const tabs: Array<{ id: number; url: string; title: string; active: boolean }> = [];
for (const [id, page] of this.pages) {
tabs.push({
id,
url: page.url(),
title: await page.title().catch(() => ''),
active: id === this.activeTabId,
});
}
return tabs;
}
// ─── Page Access ───────────────────────────────────────────
getPage(): Page {
const page = this.pages.get(this.activeTabId);
if (!page) throw new Error('No active page. Use "browse goto <url>" first.');
return page;
}
getCurrentUrl(): string {
try {
return this.getPage().url();
} catch {
return 'about:blank';
}
}
// ─── Ref Map ──────────────────────────────────────────────
setRefMap(refs: Map<string, RefEntry>) {
this.refMap = refs;
}
clearRefs() {
this.refMap.clear();
}
/**
* Resolve a selector that may be a @ref (e.g., "@e3", "@c1") or a CSS selector.
* Returns { locator } for refs or { selector } for CSS selectors.
*/
async resolveRef(selector: string): Promise<{ locator: Locator } | { selector: string }> {
if (selector.startsWith('@e') || selector.startsWith('@c')) {
const ref = selector.slice(1); // "e3" or "c1"
const entry = this.refMap.get(ref);
if (!entry) {
throw new Error(
`Ref ${selector} not found. Run 'snapshot' to get fresh refs.`
);
}
const count = await entry.locator.count();
if (count === 0) {
throw new Error(
`Ref ${selector} (${entry.role} "${entry.name}") is stale — element no longer exists. ` +
`Run 'snapshot' for fresh refs.`
);
}
return { locator: entry.locator };
}
return { selector };
}
/** Get the ARIA role for a ref selector, or null for CSS selectors / unknown refs. */
getRefRole(selector: string): string | null {
if (selector.startsWith('@e') || selector.startsWith('@c')) {
const entry = this.refMap.get(selector.slice(1));
return entry?.role ?? null;
}
return null;
}
getRefCount(): number {
return this.refMap.size;
}
// ─── Snapshot Diffing ─────────────────────────────────────
setLastSnapshot(text: string | null) {
this.lastSnapshot = text;
}
getLastSnapshot(): string | null {
return this.lastSnapshot;
}
// ─── Dialog Control ───────────────────────────────────────
setDialogAutoAccept(accept: boolean) {
this.dialogAutoAccept = accept;
}
getDialogAutoAccept(): boolean {
return this.dialogAutoAccept;
}
setDialogPromptText(text: string | null) {
this.dialogPromptText = text;
}
getDialogPromptText(): string | null {
return this.dialogPromptText;
}
// ─── Viewport ──────────────────────────────────────────────
async setViewport(width: number, height: number) {
await this.getPage().setViewportSize({ width, height });
}
// ─── Extra Headers ─────────────────────────────────────────
async setExtraHeader(name: string, value: string) {
this.extraHeaders[name] = value;
if (this.context) {
await this.context.setExtraHTTPHeaders(this.extraHeaders);
}
}
// ─── User Agent ────────────────────────────────────────────
setUserAgent(ua: string) {
this.customUserAgent = ua;
}
getUserAgent(): string | null {
return this.customUserAgent;
}
// ─── Lifecycle helpers ───────────────────────────────
/**
* Close all open pages and clear the pages map.
* Used by state load to replace the current session.
*/
async closeAllPages(): Promise<void> {
for (const page of this.pages.values()) {
await page.close().catch(() => {});
}
this.pages.clear();
this.clearRefs();
}
// ─── Frame context ─────────────────────────────────
private activeFrame: import('playwright').Frame | null = null;
setFrame(frame: import('playwright').Frame | null): void {
this.activeFrame = frame;
}
getFrame(): import('playwright').Frame | null {
return this.activeFrame;
}
/**
* Returns the active frame if set, otherwise the current page.
* Use this for operations that work on both Page and Frame (locator, evaluate, etc.).
*/
getActiveFrameOrPage(): import('playwright').Page | import('playwright').Frame {
// Auto-recover from detached frames (iframe removed/navigated)
if (this.activeFrame?.isDetached()) {
this.activeFrame = null;
}
return this.activeFrame ?? this.getPage();
}
// ─── State Save/Restore (shared by recreateContext + handoff) ─
/**
* Capture browser state: cookies, localStorage, sessionStorage, URLs, active tab.
* Skips pages that fail storage reads (e.g., already closed).
*/
async saveState(): Promise<BrowserState> {
if (!this.context) throw new Error('Browser not launched');
const cookies = await this.context.cookies();
const pages: BrowserState['pages'] = [];
for (const [id, page] of this.pages) {
const url = page.url();
let storage = null;
try {
storage = await page.evaluate(() => ({
localStorage: { ...localStorage },
sessionStorage: { ...sessionStorage },
}));
} catch {}
pages.push({
url: url === 'about:blank' ? '' : url,
isActive: id === this.activeTabId,
storage,
});
}
return { cookies, pages };
}
/**
* Restore browser state into the current context: cookies, pages, storage.
* Navigates to saved URLs, restores storage, wires page events.
* Failures on individual pages are swallowed — partial restore is better than none.
*/
async restoreState(state: BrowserState): Promise<void> {
if (!this.context) throw new Error('Browser not launched');
// Restore cookies
if (state.cookies.length > 0) {
await this.context.addCookies(state.cookies);
}
// Re-create pages
let activeId: number | null = null;
for (const saved of state.pages) {
const page = await this.context.newPage();
const id = this.nextTabId++;
this.pages.set(id, page);
this.wirePageEvents(page);
if (saved.url) {
await page.goto(saved.url, { waitUntil: 'domcontentloaded', timeout: 15000 }).catch(() => {});
}
if (saved.storage) {
try {
await page.evaluate((s: { localStorage: Record<string, string>; sessionStorage: Record<string, string> }) => {
if (s.localStorage) {
for (const [k, v] of Object.entries(s.localStorage)) {
localStorage.setItem(k, v);
}
}
if (s.sessionStorage) {
for (const [k, v] of Object.entries(s.sessionStorage)) {
sessionStorage.setItem(k, v);
}
}
}, saved.storage);
} catch {}
}
if (saved.isActive) activeId = id;
}
// If no pages were saved, create a blank one
if (this.pages.size === 0) {
await this.newTab();
} else {
this.activeTabId = activeId ?? [...this.pages.keys()][0];
}
// Clear refs — pages are new, locators are stale
this.clearRefs();
}
/**
* Recreate the browser context to apply user agent changes.
* Saves and restores cookies, localStorage, sessionStorage, and open pages.
* Falls back to a clean slate on any failure.
*/
async recreateContext(): Promise<string | null> {
if (this.connectionMode === 'headed') {
throw new Error('Cannot recreate context in headed mode. Use disconnect first.');
}
if (!this.browser || !this.context) {
throw new Error('Browser not launched');
}
try {
// 1. Save state
const state = await this.saveState();
// 2. Close old pages and context
for (const page of this.pages.values()) {
await page.close().catch(() => {});
}
this.pages.clear();
await this.context.close().catch(() => {});
// 3. Create new context with updated settings
const contextOptions: BrowserContextOptions = {
viewport: { width: 1280, height: 720 },
};
if (this.customUserAgent) {
contextOptions.userAgent = this.customUserAgent;
}
this.context = await this.browser.newContext(contextOptions);
if (Object.keys(this.extraHeaders).length > 0) {
await this.context.setExtraHTTPHeaders(this.extraHeaders);
}
// 4. Restore state
await this.restoreState(state);
return null; // success
} catch (err: unknown) {
// Fallback: create a clean context + blank tab
try {
this.pages.clear();
if (this.context) await this.context.close().catch(() => {});
const contextOptions: BrowserContextOptions = {
viewport: { width: 1280, height: 720 },
};
if (this.customUserAgent) {
contextOptions.userAgent = this.customUserAgent;
}
this.context = await this.browser!.newContext(contextOptions);
await this.newTab();
this.clearRefs();
} catch {
// If even the fallback fails, we're in trouble — but browser is still alive
}
return `Context recreation failed: ${err instanceof Error ? err.message : String(err)}. Browser reset to blank tab.`;
}
}
// ─── Handoff: Headless → Headed ─────────────────────────────
/**
* Hand off browser control to the user by relaunching in headed mode.
*
* Flow (launch-first-close-second for safe rollback):
* 1. Save state from current headless browser
* 2. Launch NEW headed browser
* 3. Restore state into new browser
* 4. Close OLD headless browser
* If step 2 fails → return error, headless browser untouched
*/
async handoff(message: string): Promise<string> {
if (this.connectionMode === 'headed' || this.isHeaded) {
return `HANDOFF: Already in headed mode at ${this.getCurrentUrl()}`;
}
if (!this.browser || !this.context) {
throw new Error('Browser not launched');
}
// 1. Save state from current browser
const state = await this.saveState();
const currentUrl = this.getCurrentUrl();
// 2. Launch new headed browser with extension (same as launchHeaded)
// Uses launchPersistentContext so the extension auto-loads.
let newContext: BrowserContext;
try {
const fs = require('fs');
const path = require('path');
const extensionPath = this.findExtensionPath();
const launchArgs = ['--hide-crash-restore-bubble'];
if (extensionPath) {
launchArgs.push(`--disable-extensions-except=${extensionPath}`);
launchArgs.push(`--load-extension=${extensionPath}`);
// Auth token is served via /health endpoint now (no file write needed).
// Extension reads token from /health on connect.
console.log(`[browse] Handoff: loading extension from ${extensionPath}`);
} else {
console.log('[browse] Handoff: extension not found — headed mode without side panel');
}
const userDataDir = path.join(process.env.HOME || '/tmp', '.gstack', 'chromium-profile');
fs.mkdirSync(userDataDir, { recursive: true });
newContext = await chromium.launchPersistentContext(userDataDir, {
headless: false,
args: launchArgs,
viewport: null,
ignoreDefaultArgs: [
'--disable-extensions',
'--disable-component-extensions-with-background-pages',
],
timeout: 15000,
});
} catch (err: unknown) {
const msg = err instanceof Error ? err.message : String(err);
return `ERROR: Cannot open headed browser — ${msg}. Headless browser still running.`;
}
// 3. Restore state into new headed browser
try {
// Swap to new browser/context before restoreState (it uses this.context)
const oldBrowser = this.browser;
this.context = newContext;
this.browser = newContext.browser();
this.pages.clear();
this.connectionMode = 'headed';
if (Object.keys(this.extraHeaders).length > 0) {
await newContext.setExtraHTTPHeaders(this.extraHeaders);
}
// Register crash handler on new browser
if (this.browser) {
this.browser.on('disconnected', () => {
if (this.intentionalDisconnect) return;
console.error('[browse] FATAL: Chromium process crashed or was killed. Server exiting.');
process.exit(1);
});
}
await this.restoreState(state);
this.isHeaded = true;
this.dialogAutoAccept = false; // User controls dialogs in headed mode
// 4. Close old headless browser (fire-and-forget)
oldBrowser.removeAllListeners('disconnected');
oldBrowser.close().catch(() => {});
return [
`HANDOFF: Browser opened at ${currentUrl}`,
`MESSAGE: ${message}`,
`STATUS: Waiting for user. Run 'resume' when done.`,
].join('\n');
} catch (err: unknown) {
// Restore failed — close the new context, keep old state
await newContext.close().catch(() => {});
const msg = err instanceof Error ? err.message : String(err);
return `ERROR: Handoff failed during state restore — ${msg}. Headless browser still running.`;
}
}
/**
* Resume AI control after user handoff.
* Clears stale refs and resets failure counter.
* The meta-command handler calls handleSnapshot() after this.
*/
resume(): void {
this.clearRefs();
this.resetFailures();
this.activeFrame = null;
}
getIsHeaded(): boolean {
return this.isHeaded;
}
// ─── Auto-handoff Hint (consecutive failure tracking) ───────
incrementFailures(): void {
this.consecutiveFailures++;
}
resetFailures(): void {
this.consecutiveFailures = 0;
}
getFailureHint(): string | null {
if (this.consecutiveFailures >= 3 && !this.isHeaded) {
return `HINT: ${this.consecutiveFailures} consecutive failures. Consider using 'handoff' to let the user help.`;
}
return null;
}
// ─── Console/Network/Dialog/Ref Wiring ────────────────────
private wirePageEvents(page: Page) {
// Track tab close — remove from pages map, switch to another tab
page.on('close', () => {
for (const [id, p] of this.pages) {
if (p === page) {
this.pages.delete(id);
console.log(`[browse] Tab closed (id=${id}, remaining=${this.pages.size})`);
// If the closed tab was active, switch to another
if (this.activeTabId === id) {
const remaining = [...this.pages.keys()];
this.activeTabId = remaining.length > 0 ? remaining[remaining.length - 1] : 0;
}
break;
}
}
});
// Clear ref map on navigation — refs point to stale elements after page change
// (lastSnapshot is NOT cleared — it's a text baseline for diffing)
page.on('framenavigated', (frame) => {
if (frame === page.mainFrame()) {
this.clearRefs();
this.activeFrame = null; // Navigation invalidates frame context
}
});
// ─── Dialog auto-handling (prevents browser lockup) ─────
page.on('dialog', async (dialog) => {
const entry: DialogEntry = {
timestamp: Date.now(),
type: dialog.type(),
message: dialog.message(),
defaultValue: dialog.defaultValue() || undefined,
action: this.dialogAutoAccept ? 'accepted' : 'dismissed',
response: this.dialogAutoAccept ? (this.dialogPromptText ?? undefined) : undefined,
};
addDialogEntry(entry);
try {
if (this.dialogAutoAccept) {
await dialog.accept(this.dialogPromptText ?? undefined);
} else {
await dialog.dismiss();
}
} catch {
// Dialog may have been dismissed by navigation — ignore
}
});
page.on('console', (msg) => {
addConsoleEntry({
timestamp: Date.now(),
level: msg.type(),
text: msg.text(),
});
});
page.on('request', (req) => {
addNetworkEntry({
timestamp: Date.now(),
method: req.method(),
url: req.url(),
});
});
page.on('response', (res) => {
// Find matching request entry and update it (backward scan)
const url = res.url();
const status = res.status();
for (let i = networkBuffer.length - 1; i >= 0; i--) {
const entry = networkBuffer.get(i);
if (entry && entry.url === url && !entry.status) {
networkBuffer.set(i, { ...entry, status, duration: Date.now() - entry.timestamp });
break;
}
}
});
// Capture response sizes via response finished
page.on('requestfinished', async (req) => {
try {
const res = await req.response();
if (res) {
const url = req.url();
const body = await res.body().catch(() => null);
const size = body ? body.length : 0;
for (let i = networkBuffer.length - 1; i >= 0; i--) {
const entry = networkBuffer.get(i);
if (entry && entry.url === url && !entry.size) {
networkBuffer.set(i, { ...entry, size });
break;
}
}
}
} catch {}
});
}
}