Commit Graph

7 Commits

Author SHA1 Message Date
Garry Tan 03973c2fab fix: community security wave — 8 PRs, 4 contributors (v0.15.13.0) (#847)
* fix(bin): pass search params via env vars (RCE fix) (#819)

Replace shell string interpolation with process.env in gstack-learnings-search
to prevent arbitrary code execution via crafted learnings entries. Also fixes
the CROSS_PROJECT interpolation that the original PR missed.

Adds 3 regression tests verifying no shell interpolation remains in the bun -e block.

Co-authored-by: garagon <garagon@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(browse): add path validation to upload command (#821)

Add isPathWithin() and path traversal checks to the upload command,
blocking file exfiltration via crafted upload paths. Uses existing
SAFE_DIRECTORIES constant instead of a local copy. Adds 3 regression tests.

Co-authored-by: garagon <garagon@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(browse): symlink resolution in meta-commands validateOutputPath (#820)

Add realpathSync to validateOutputPath in meta-commands.ts to catch
symlink-based directory escapes in screenshot, pdf, and responsive
commands. Resolves SAFE_DIRECTORIES through realpathSync to handle
macOS /tmp -> /private/tmp symlinks. Existing path validation tests
pass with the hardened implementation.

Co-authored-by: garagon <garagon@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add uninstall instructions to README (#812)

Community PR #812 by @0531Kim. Adds two uninstall paths: the gstack-uninstall
script (handles everything) and manual removal steps for when the repo isn't
cloned. Includes CLAUDE.md cleanup note and Playwright cache guidance.

Co-Authored-By: 0531Kim <0531Kim@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(browse): Windows launcher extraEnv + headed-mode token (#822)

Community PR #822 by @pieterklue. Three fixes:
1. Windows launcher now merges extraEnv into spawned server env (was
   only passing BROWSE_STATE_FILE, dropping all other env vars)
2. Welcome page fallback serves inline HTML instead of about:blank
   redirect (avoids ERR_UNSAFE_REDIRECT on Windows)
3. /health returns auth token in headed mode even without Origin header
   (fixes Playwright Chromium extensions that don't send it)

Also adds HOME/USERPROFILE fallback for cross-platform compatibility.

Co-Authored-By: pieterklue <pieterklue@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(browse): terminate orphan server when parent process exits (#808)

Community PR #808 by @mmporong. Passes BROWSE_PARENT_PID to the spawned
server process. The server polls every 15s with signal 0 and calls
shutdown() if the parent is gone. Prevents orphaned chrome-headless-shell
processes when Claude Code sessions exit abnormally.

Co-Authored-By: mmporong <mmporong@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(security): IPv6 ULA blocking, cookie redaction, per-tab cancel, targeted token (#664)

Community PR #664 by @mr-k-man (security audit round 1, new parts only).

- IPv6 ULA prefix blocking (fc00::/7) in url-validation.ts with false-positive
  guard for hostnames like fd.example.com
- Cookie value redaction for tokens, API keys, JWTs in browse cookies command
- Per-tab cancel files in killAgent() replacing broken global kill-signal
- design/serve.ts: realpathSync upgrade prevents symlink bypass in /api/reload
- extension: targeted getToken handler replaces token-in-health-broadcast
- Supabase migration 003: column-level GRANT restricts anon UPDATE scope
- Telemetry sync: upsert error logging
- 10 new tests for IPv6, cookie redaction, DNS rebinding, path traversal

Co-Authored-By: mr-k-man <mr-k-man@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(security): CSS injection guard, timeout clamping, session validation, tests (#806)

Community PR #806 by @mr-k-man (security audit round 2, new parts only).

- CSS value validation (DANGEROUS_CSS) in cdp-inspector, write-commands, extension inspector
- Queue file permissions (0o700/0o600) in cli, server, sidebar-agent
- escapeRegExp for frame --url ReDoS fix
- Responsive screenshot path validation with validateOutputPath
- State load cookie filtering (reject localhost/.internal/metadata cookies)
- Session ID format validation in loadSession
- /health endpoint: remove currentUrl and currentMessage fields
- QueueEntry interface + isValidQueueEntry validator for sidebar-agent
- SIGTERM->SIGKILL escalation in timeout handler
- Viewport dimension clamping (1-16384), wait timeout clamping (1s-300s)
- Cookie domain validation in cookie-import and cookie-import-browser
- DocumentFragment-based tab switching (XSS fix in sidepanel)
- pollInProgress reentrancy guard for pollChat
- toggleClass/injectCSS input validation in extension inspector
- Snapshot annotated path validation with realpathSync
- 714-line security-audit-r2.test.ts + 33-line learnings-injection.test.ts

Co-Authored-By: mr-k-man <mr-k-man@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.15.13.0)

Community security wave: 8 PRs from 4 contributors (@garagon, @mr-k-man,
@mmporong, @0531Kim, @pieterklue). IPv6 ULA blocking, cookie redaction,
per-tab cancel signaling, CSS injection guards, timeout clamping, session
validation, DocumentFragment XSS fix, parent process watchdog, uninstall
docs, Windows fixes, and 750+ lines of security regression tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: garagon <garagon@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: 0531Kim <0531Kim@users.noreply.github.com>
Co-authored-by: pieterklue <pieterklue@users.noreply.github.com>
Co-authored-by: mmporong <mmporong@users.noreply.github.com>
Co-authored-by: mr-k-man <mr-k-man@users.noreply.github.com>
2026-04-06 00:47:04 -07:00
Garry Tan 3f080de1b7 feat: GStack Browser — double-click AI browser with anti-bot stealth (#695)
* feat: CDP inspector module — persistent sessions, CSS cascade, style modification

New browse/src/cdp-inspector.ts with full CDP inspection engine:
- inspectElement() via CSS.getMatchedStylesForNode + DOM.getBoxModel
- modifyStyle() via CSS.setStyleTexts with headless page.evaluate fallback
- Persistent CDP session lifecycle (create, reuse, detach on nav, re-create)
- Specificity sorting, overridden property detection, UA rule filtering
- Modification history with undo support
- formatInspectorResult() for CLI output

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: browse server inspector endpoints + inspect/style/cleanup/prettyscreenshot CLI

Server endpoints: POST /inspector/pick, GET /inspector, POST /inspector/apply,
POST /inspector/reset, GET /inspector/history, GET /inspector/events (SSE).
CLI commands: inspect (CDP cascade), style (live CSS mod), cleanup (page clutter
removal), prettyscreenshot (clean screenshot pipeline).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: sidebar CSS inspector — element picker, box model, rule cascade, quick edit

Extension changes for the visual CSS inspector:
- inspector.js: element picker with hover highlight, CSS selector generation,
  basic mode fallback (getComputedStyle + CSSOM), page alteration handlers
- inspector.css: picker overlay styles (blue highlight + tooltip)
- background.js: inspector message routing (picker <-> server <-> sidepanel)
- sidepanel: Inspector tab with box model viz (gstack palette), matched rules
  with specificity badges, computed styles, click-to-edit quick edit,
  Send to Agent/Code button, empty/loading/error states

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: document inspect, style, cleanup, prettyscreenshot browse commands

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: auto-track user-created tabs and handle tab close

browser-manager.ts changes:
- context.on('page') listener: automatically tracks tabs opened by the user
  (Cmd+T, right-click open in new tab, window.open). Previously only
  programmatic newTab() was tracked, so user tabs were invisible.
- page.on('close') handler in wirePageEvents: removes closed tabs from the
  pages map and switches activeTabId to the last remaining tab.
- syncActiveTabByUrl: match Chrome extension's active tab URL to the correct
  Playwright page for accurate tab identity.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: per-tab agent isolation via BROWSE_TAB environment variable

Prevents parallel sidebar agents from interfering with each other's tab context.

Three-layer fix:
- sidebar-agent.ts: passes BROWSE_TAB=<tabId> env var to each claude process,
  per-tab processing set allows concurrent agents across tabs
- cli.ts: reads process.env.BROWSE_TAB and includes tabId in command request body
- server.ts: handleCommand() temporarily switches activeTabId when tabId is present,
  restores after command completes (safe: Bun event loop is single-threaded)

Also: per-tab agent state (TabAgentState map), per-tab message queuing,
per-tab chat buffers, verbose streaming narration, stop button endpoint.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: sidebar per-tab chat context, tab bar sync, stop button, UX polish

Extension changes:
- sidepanel.js: per-tab chat history (tabChatHistories map), switchChatTab()
  swaps entire chat view, browserTabActivated handler for instant tab sync,
  stop button wired to /sidebar-agent/stop, pollTabs renders tab bar
- sidepanel.html: updated banner text ("Browser co-pilot"), stop button markup,
  input placeholder "Ask about this page..."
- sidepanel.css: tab bar styles, stop button styles, loading state fixes
- background.js: chrome.tabs.onActivated sends browserTabActivated to sidepanel
  with tab URL for instant tab switch detection

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: per-tab isolation, BROWSE_TAB pinning, tab tracking, sidebar UX

sidebar-agent.test.ts (new tests):
- BROWSE_TAB env var passed to claude process
- CLI reads BROWSE_TAB and sends tabId in body
- handleCommand accepts tabId, saves/restores activeTabId
- Tab pinning only activates when tabId provided
- Per-tab agent state, queue, concurrency
- processingTabs set for parallel agents

sidebar-ux.test.ts (new tests):
- context.on('page') tracks user-created tabs
- page.on('close') removes tabs from pages map
- Tab isolation uses BROWSE_TAB not system prompt hack
- Per-tab chat context in sidepanel
- Tab bar rendering, stop button, banner text

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: resolve merge conflicts — keep security defenses + per-tab isolation

Merged main's security improvements (XML escaping, prompt injection defense,
allowed commands whitelist, --model opus, Write tool, stderr capture) with
our branch's per-tab isolation (BROWSE_TAB env var, processingTabs set,
no --resume). Updated test expectations for expanded system prompt.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.13.9.0)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: add inspector message types to background.js allowlist

Pre-existing bug found by Codex: ALLOWED_TYPES in background.js was missing
all inspector message types (startInspector, stopInspector, elementPicked,
pickerCancelled, applyStyle, toggleClass, injectCSS, resetAll, inspectResult).
Messages were silently rejected, making the inspector broken on ALL pages.

Also: separate executeScript and insertCSS into individual try blocks in
injectInspector(), store inspectorMode for routing, and add content.js
fallback when script injection fails (CSP, chrome:// pages).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: basic element picker in content.js for CSP-restricted pages

When inspector.js can't be injected (CSP, chrome:// pages), content.js
provides a basic picker using getComputedStyle + CSSOM:
- startBasicPicker/stopBasicPicker message handlers
- captureBasicData() with ~30 key CSS properties, box model, matched rules
- Hover highlight with outline save/restore (never leaves artifacts)
- Click uses e.target directly (no re-querying by selector)
- Sends inspectResult with mode:'basic' for sidebar rendering
- Escape key cancels picker and restores outlines

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: cleanup + screenshot buttons in sidebar inspector toolbar

Two action buttons in the inspector toolbar:
- Cleanup (🧹): POSTs cleanup --all to server, shows spinner, chat
  notification on success, resets inspector state (element may be removed)
- Screenshot (📸): POSTs screenshot to server, shows spinner, chat
  notification with saved file path

Shared infrastructure:
- .inspector-action-btn CSS with loading spinner via ::after pseudo-element
- chat-notification type in addChatEntry() for system messages
- package.json version bump to 0.13.9.0

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: inspector allowlist, CSP fallback, cleanup/screenshot buttons

16 new tests in sidebar-ux.test.ts:
- Inspector message allowlist includes all inspector types
- content.js basic picker (startBasicPicker, captureBasicData, CSSOM,
  outline save/restore, inspectResult with mode basic, Escape cleanup)
- background.js CSP fallback (separate try blocks, inspectorMode, fallback)
- Cleanup button (POST /command, inspector reset after success)
- Screenshot button (POST /command, notification rendering)
- Chat notification type and CSS styles

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: update project documentation for v0.13.9.0

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: cleanup + screenshot buttons in chat toolbar (not just inspector)

Quick actions toolbar (🧹 Cleanup, 📸 Screenshot) now appears above the chat
input, always visible. Both inspector and chat buttons share runCleanup() and
runScreenshot() helper functions. Clicking either set shows loading state on
both simultaneously.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: chat toolbar buttons, shared helpers, quick-action-btn styles

Tests that chat toolbar exists (chat-cleanup-btn, chat-screenshot-btn,
quick-actions container), CSS styles (.quick-action-btn, .quick-action-btn.loading),
shared runCleanup/runScreenshot helper functions, and cleanup inspector reset.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: aggressive cleanup heuristics — overlays, scroll unlock, blur removal

Massively expanded CLEANUP_SELECTORS with patterns from uBlock Origin and
Readability.js research:
- ads: 30+ selectors (Google, Amazon, Outbrain, Taboola, Criteo, etc.)
- cookies: OneTrust, Cookiebot, TrustArc, Quantcast + generic patterns
- overlays (NEW): paywalls, newsletter popups, interstitials, push prompts,
  app download banners, survey modals
- social: follow prompts, share tools
- Cleanup now defaults to --all when no args (sidebar button fix)
- Uses !important on all display:none (overrides inline styles)
- Unlocks body/html scroll (overflow:hidden from modal lockout)
- Removes blur/filter effects (paywall content blur)
- Removes max-height truncation (article teaser truncation)
- Collapses empty ad placeholder whitespace (empty divs after ad removal)
- Skips gstack-ctrl indicator in sticky removal

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: disable action buttons when disconnected, no error spam

- setActionButtonsEnabled() toggles .disabled class on all cleanup/screenshot
  buttons (both chat toolbar and inspector toolbar)
- Called with false in updateConnection when server URL is null
- Called with true when connection established
- runCleanup/runScreenshot silently return when disconnected instead of
  showing 'Not connected' error notifications
- CSS .disabled style: pointer-events:none, opacity:0.3, cursor:not-allowed

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: cleanup heuristics, button disabled state, overlay selectors

17 new tests:
- cleanup defaults to --all on empty args
- CLEANUP_SELECTORS overlays category (paywall, newsletter, interstitial)
- Major ad networks in selectors (doubleclick, taboola, criteo, etc.)
- Major consent frameworks (OneTrust, Cookiebot, TrustArc, Quantcast)
- !important override for inline styles
- Scroll unlock (body overflow:hidden)
- Blur removal (paywall content blur)
- Article truncation removal (max-height)
- Empty placeholder collapse
- gstack-ctrl indicator skip in sticky cleanup
- setActionButtonsEnabled function
- Buttons disabled when disconnected
- No error spam from cleanup/screenshot when disconnected
- CSS disabled styles for action buttons

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: LLM-based page cleanup — agent analyzes page semantically

Instead of brittle CSS selectors, the cleanup button now sends a prompt to
the sidebar agent (which IS an LLM). The agent:
1. Runs deterministic $B cleanup --all as a quick first pass
2. Takes a snapshot to see what's left
3. Analyzes the page semantically to identify remaining clutter
4. Removes elements intelligently, preserving site branding

This means cleanup works correctly on any site without site-specific selectors.
The LLM understands that "Your Daily Puzzles" is clutter, "ADVERTISEMENT" is
junk, but the SF Chronicle masthead should stay.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: aggressive cleanup heuristics + preserve top nav bar

Deterministic cleanup improvements (used as first pass before LLM analysis):
- New 'clutter' category: audio players, podcast widgets, sidebar puzzles/games,
  recirculation widgets (taboola, outbrain, nativo), cross-promotion banners
- Text-content detection: removes "ADVERTISEMENT", "Article continues below",
  "Sponsored", "Paid content" labels and their parent wrappers
- Sticky fix: preserves the topmost full-width element near viewport top (site
  nav bar) instead of hiding all sticky/fixed elements. Sorts by vertical
  position, preserves the first one that spans >80% viewport width.

Tests: clutter category, ad label removal, nav bar preservation logic.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: LLM-based cleanup architecture, deterministic heuristics, sticky nav

22 new tests covering:
- Cleanup button uses /sidebar-command (agent) not /command (deterministic)
- Cleanup prompt includes deterministic first pass + agent snapshot analysis
- Cleanup prompt lists specific clutter categories for agent guidance
- Cleanup prompt preserves site identity (masthead, headline, body, byline)
- Cleanup prompt instructs scroll unlock and $B eval removal
- Loading state management (async agent, setTimeout)
- Deterministic clutter: audio/podcast, games/puzzles, recirculation
- Ad label text patterns (ADVERTISEMENT, Sponsored, Article continues)
- Ad label parent wrapper hiding for small containers
- Sticky nav preservation (sort by position, first full-width near top)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: GStack Browser stealth + branding — anti-bot patches, custom UA, rebrand

- Add GSTACK_CHROMIUM_PATH env var for custom Chromium binary
- Add BROWSE_EXTENSIONS_DIR env var for extension path override
- Move auth token to /health endpoint (fixes read-only .app bundles)
- Anti-bot stealth: disable navigator.webdriver, fake plugins, languages
- Custom user agent: Chrome/<version> GStackBrowser (auto-detects version)
- Rebrand Chromium plist to "GStack Browser" at launch time
- Update security test to match new token-via-health approach

* feat: GStack Browser .app bundle — launcher script + build system

- scripts/app/gstack-browser: dual-mode launcher (dev + .app bundle)
- scripts/build-app.sh: compiles binary, bundles Chromium + extension, creates DMG
- Rebrands Chromium plist during build for "GStack Browser" in menu bar
- 389MB .app, 189MB compressed DMG, launches in ~5s

* docs: GStack Browser V0 master plan — AI-native development browser vision

5-phase roadmap from .app wrapper through Chromium fork, 9 capability
visions, competitive landscape, architecture diagrams, design system.

* fix: restore package.json and sync version to 0.14.3.0

* chore: bump version and changelog (v0.14.4.0)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: gitignore top-level dist/ (GStack Browser build output)

* feat: GStack Browser icon — custom .icns replaces Chromium's Dock icon

- Generated 1024px icon: dark terminal window with amber prompt cursor
- Converted to .icns with all macOS sizes (16-1024px, 1x and 2x)
- build-app.sh copies icon into both the outer .app and bundled Chromium's
  Resources (Chromium's process owns the Dock icon, not the launcher)
- browser-manager.ts patches Chromium's icon at runtime for dev mode too
- Both the Dock and Cmd+Tab now show the GStack icon

* feat: rename /connect-chrome → /open-gstack-browser

- Rename skill directory + update frontmatter name and description
- Update SKILL.md.tmpl to reference GStack Browser branding/stealth
- Create connect-chrome symlink for backwards compatibility
- Setup script creates /connect-chrome alias in .claude/skills/
- Fix package.json version sync (0.14.5.0 → 0.14.6.0)

* feat: rename /connect-chrome → /open-gstack-browser across all references

Update README skill lists, docs/skills.md deep dive, extension sidepanel
banner copy button, and reconnect clipboard text.

* feat: left-align sidebar UI + extension-ready event for welcome page

- Left-align all sidebar text (chat welcome, loading, empty states,
  notifications, inspector empty, session placeholder)
- Dispatch 'gstack-extension-ready' CustomEvent from content.js so
  the welcome page can detect when the sidebar is active

* chore: add GStack Browser TODOs — CDP stealth patches + Chromium fork

P1: rebrowser-style postinstall patcher for Playwright 1.58.2 (suppress
Runtime.enable, addBinding context discovery, 6 files, ~200 lines).
P2: long-term Chromium fork for permanent stealth + native sidebar.

* chore: regenerate open-gstack-browser/SKILL.md from template

Fix timeline skill name (connect-chrome → open-gstack-browser) and
preamble formatting from merge with main's updated template.

* feat: welcome page served from browse server on headed launch

- Add /welcome endpoint to server.ts, serves welcome.html
- Navigate to /welcome after server starts (not during launchHeaded,
  which runs before the server is listening)
- welcome.html bundled in browse/src/ for portability

* feat: auto-open sidebar on every browser launch, not just first install

- Add top-level setTimeout in background.js that fires on every service
  worker startup (onInstalled only fires on install/update)
- Remove misaligned arrow from welcome page, replace with text fallback
  that hides when extension content script fires gstack-extension-ready

* fix: sidebar auto-open retry with backoff + welcome page tests

- Replace single-attempt sidePanel.open() with autoOpenSidePanel() that
  retries up to 5 times with 500ms-5000ms backoff
- Fire on both onInstalled AND every service worker startup
- Remove misaligned arrow from welcome page, replace with text fallback
- Add 12 tests: welcome page structure, /welcome endpoint, headed launch
  navigation timing, sidebar auto-open retry logic, extension-ready event

* feat: reload button in sidebar footer

Adds a "reload" button next to "debug" and "clear" in the sidebar
footer. Calls location.reload() to fully refresh the side panel,
re-run connection logic, and clear stale state.

* feat: right-pointing arrow hint for sidebar on welcome page

Replace invisible text fallback with visible amber bubble + animated
right arrow (→) pointing toward where the sidebar opens. Always correct
regardless of window size (unlike the old up arrow at toolbar chrome).

* fix: sidebar auth race — pass token in getPort response

The sidebar called tryConnect() → getPort → got {port, connected} but
NO token. All subsequent requests (SSE, chat poll) failed with 401.
The token only arrived later via the health broadcast, but by then
the SSE connection was already broken.

Fix: include authToken in the getPort response so the sidebar has
the token from its very first connection attempt.

* feat: sidebar debug visibility + auth race tests

- Show attempt count in loading screen ("Connecting... attempt 3")
- After 5 failed attempts, show debug details (port, connected, token)
  so stuck users can see exactly what's failing
- Add 4 tests: getPort includes token, tryConnect uses token,
  dead state exists with MAX_RECONNECT_ATTEMPTS, reconnectAttempts visible

* fix: startup health check retries every 1s instead of 10s

Root cause: extension service worker starts before Bun.serve() is
listening. First checkHealth() fails, next attempt is 10 seconds
later. User stares at "Connecting..." for 10 seconds.

Fix: retry every 1s for up to 15 attempts on startup, then switch
to 10s polling once connected (or after 15s gives up). Sidebar
should connect within 1-2 seconds of server becoming available.

3 new tests verify the fast-retry → slow-poll transition.

* feat: detailed step-by-step status in sidebar loading screen

Replace useless "Connecting..." with real-time debug info:
- "Looking for browse server... (attempt N)"
- Shows port, server responding status, token status
- Shows chrome.runtime errors if extension messaging fails
- Tells user to run /open-gstack-browser if server not found

* fix: sidebar connects directly to /health instead of waiting for background

Root cause: sidepanel asked background "are you connected?" but background's
health check hadn't succeeded yet (1-10s gap). Sidepanel waited forever.

Fix: when background says not connected, sidepanel hits /health directly
with fetch(). Gets the token from the response. Bypasses background
entirely for initial connection. Shows step-by-step debug info:
"Checking server directly... port: 34567 / Trying GET /health..."

* fix: suppress fake "session ended" and timeout errors in sidebar

Two issues making the sidebar look broken when it's actually working:
1. "Timed out after 300s" error displayed after agent_done — this is a
   cleanup timer, not a real error. Now suppressed when no active session.
2. "(session ended)" text appended on every idle poll — removed entirely.
   The thinking spinner is cleaned up silently instead.

* fix: sidebar agent passes BROWSE_PORT to child claude

Ensures the child claude process connects to the existing headed
browse server (port 34567) instead of spawning a new headless one.
Without this, sidebar chat commands run in an invisible browser.

* feat: BROWSE_NO_AUTOSTART prevents sidebar from spawning headless browser

When set, the browse CLI refuses to start a new server and exits with
a clear error: "Server not available, run /open-gstack-browser to restart."
The sidebar agent sets this so users never get an invisible headless
browser when the headed one is closed.

* test: BROWSE_NO_AUTOSTART guard in CLI + sidebar-agent env vars

5 tests: CLI checks env var before starting server, shows actionable
error, sidebar-agent sets the flag + BROWSE_PORT, guard runs before
lock acquisition to prevent stale lock files.

* fix: stale auth token causes Unauthorized + invisible error text

background.js checkHealth() never refreshed authToken from /health responses,
so when the browse server restarted with a new token, all sidebar-command
requests got 401 Unauthorized forever.

Also: error placeholder text was #3f3f46 on #0C0C0C (nearly invisible).
Now shows in red to match the error border.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: replace 40+ silent catch blocks with debug logging

Every empty catch {} in sidepanel.js, sidebar-agent.ts now logs with
[gstack sidebar] or [sidebar-agent] prefix. Chat poll 401s, stop agent,
tab poll, clear chat, SSE parse, refs fetch, stream JSON parse, queue
read/parse, process kill — all now visible in console.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: noisy debug logging + auto model routing in browse server

Server-side silent catch blocks (22 instances) now log with [browse] prefix:
chat persistence, session save/load, agent kill, tab pin/restore, welcome
page, buffer flush, worktree cleanup, lock files, SSE streams.

Also adds pickSidebarModel() — routes sidebar messages to sonnet for
navigation/interaction (click, goto, fill, screenshot) and opus for
analysis/comprehension (summarize, describe, find bugs). Sonnet is
~4x faster for action commands with zero quality difference.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: update sidebar tests for model router + longer stopAgent slice

- stopAgent slice 800→1000 to accommodate added error logging lines
- Replace hardcoded opus assertion with model router assertions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: sidebar arrow hint stays visible until sidebar actually opens

Previously the welcome page arrow hid immediately when the extension's
content script loaded — but extension loaded ≠ sidebar open. Now the
signal flow is: sidepanel connects → tells background.js → relays to
content script → dispatches gstack-extension-ready → arrow hides.

Adds welcome-page.test.ts: 14 tests verifying arrow, branding, feature
cards, dark theme, and auto-hide behavior via real HTTP server.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: arrow hide signal chain (4-step) + stale session-ended assertion

8 new tests verify the sidebarOpened → background → content → welcome
signal chain. Updates stale "(session ended)" test that checked for
text removed in a prior commit.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: preserve optimistic UI during tab switch on first message

When the user sends a message and the server assigns it to a new tab
(because Chrome's active tab changed), switchChatTab() was blowing away
the optimistic user bubble and thinking dots with a welcome screen.
Now preserves the current DOM if we're mid-send with a thinking indicator.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: sidebar message flow architecture doc + CLAUDE.md pointer

SIDEBAR_MESSAGE_FLOW.md documents the full init timeline, message flow
(user types → claude responds), auth token chain, arrow hint signal
chain, model routing, tab concurrency, and known failure modes.

CLAUDE.md now tells you to read it before touching sidebar files.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: sidebar chat resets idle timer + shutdown kills sidebar-agent

Two fixes for the "browser died while chatting" problem:

1. /sidebar-command now calls resetIdleTimer(). Previously only CLI
   commands reset it, so the server would shut down after 30 min even
   while the user was actively chatting in the sidebar.

2. shutdown() now pkills the sidebar-agent daemon. Previously the agent
   survived server shutdown, kept polling a dead server, and spawned
   confused claude processes that auto-started headless browsers.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: disable idle timeout in headed mode — browser lives until closed

The 30-minute idle timeout only applies to headless mode now. In headed
mode the user is looking at the Chrome window, so auto-shutdown is wrong.
The browser stays alive until explicit disconnect or window close.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: cookies button in sidebar footer opens cookie picker

One-click cookie import from the sidebar. Navigates the headed browser
to /cookie-picker where you can select which domains to import from
your real Chrome profile.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: update project documentation for GStack Browser improvements

README.md: updated Real browser mode and sidebar agent sections with
model routing, cookie import button, no idle timeout in headed mode.
Updated skill table entries for /browse and /open-gstack-browser.

docs/skills.md: updated /open-gstack-browser deep dive with model
routing and cookie import details.

GSTACK_BROWSER_V0.md: added 6 new SHIPPED items to implementation
status table (model routing, debug logging, idle timeout, cookie
button, arrow hint, architecture doc).

TODOS.md: marked "Sidebar agent Write tool + error visibility" as
SHIPPED. Added new P2 TODO for direct API calls to eliminate
claude -p startup tax.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add Claude Code terminal example to welcome page TRY IT NOW

Fifth example shows the parent agent workflow: navigate, extract CSS,
write to file. The other four are all sidebar-only. This one shows
co-presence — the Claude Code session that launched the browser can
also control it directly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: hide internal tool-result file reads from sidebar activity

Claude reads its own ~/.claude/projects/.../tool-results/ files as
internal plumbing. These showed up as long unreadable paths in the
sidebar. Now: describeToolCall returns empty for tool-result reads,
and the sidebar skips rendering tool_use entries with no description.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: collapse tool calls into "See reasoning" disclosure on completion

While the agent is working, tool calls stream live so you can watch
progress. When the agent finishes, all tool calls collapse into a
"See reasoning (N steps)" disclosure. Click to expand and see what
the agent did. The final text answer stays visible.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: 17 new tests for recent sidebar fixes

Covers: tool-result file filtering, empty tool_use skip, reasoning
disclosure collapse, idle timeout headed mode bypass, sidebar-command
idle reset, shutdown sidebar-agent kill, cookie button, and model
routing analysis-before-action priority.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: move cookies button to quick actions toolbar

Cookies now sits next to Cleanup and Screenshot as a primary action
button (🍪 Cookies) instead of buried in the footer. Same behavior,
more discoverable.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add instructional text to cookie picker page

"Select the domains of cookies you want to import to GStack Browser.
You'll be able to browse those sites with the same login as your
other browser."

Also fixes stale test that expected hardcoded '--model', 'opus'.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: 6-card welcome page with cookie import + dual-agent cards

3x2 grid layout (was 2x2). New cards: "Import your cookies" (click
🍪 Cookies to import login sessions from Chrome/Arc/Brave) and
"Or use your main agent" (your Claude Code terminal also controls
this browser). Responsive: 3 cols > 2 cols > 1 col.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: move sidebar arrow hint to top-right instead of vertically centered

The arrow was centered vertically which put it behind the feature cards.
Now positioned at top: 80px where there's open space and it's more visible.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 10:17:05 -07:00
Garry Tan a1a933614c feat: sidebar CSS inspector + per-tab agents (v0.13.9.0) (#650)
* feat: CDP inspector module — persistent sessions, CSS cascade, style modification

New browse/src/cdp-inspector.ts with full CDP inspection engine:
- inspectElement() via CSS.getMatchedStylesForNode + DOM.getBoxModel
- modifyStyle() via CSS.setStyleTexts with headless page.evaluate fallback
- Persistent CDP session lifecycle (create, reuse, detach on nav, re-create)
- Specificity sorting, overridden property detection, UA rule filtering
- Modification history with undo support
- formatInspectorResult() for CLI output

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: browse server inspector endpoints + inspect/style/cleanup/prettyscreenshot CLI

Server endpoints: POST /inspector/pick, GET /inspector, POST /inspector/apply,
POST /inspector/reset, GET /inspector/history, GET /inspector/events (SSE).
CLI commands: inspect (CDP cascade), style (live CSS mod), cleanup (page clutter
removal), prettyscreenshot (clean screenshot pipeline).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: sidebar CSS inspector — element picker, box model, rule cascade, quick edit

Extension changes for the visual CSS inspector:
- inspector.js: element picker with hover highlight, CSS selector generation,
  basic mode fallback (getComputedStyle + CSSOM), page alteration handlers
- inspector.css: picker overlay styles (blue highlight + tooltip)
- background.js: inspector message routing (picker <-> server <-> sidepanel)
- sidepanel: Inspector tab with box model viz (gstack palette), matched rules
  with specificity badges, computed styles, click-to-edit quick edit,
  Send to Agent/Code button, empty/loading/error states

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: document inspect, style, cleanup, prettyscreenshot browse commands

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: auto-track user-created tabs and handle tab close

browser-manager.ts changes:
- context.on('page') listener: automatically tracks tabs opened by the user
  (Cmd+T, right-click open in new tab, window.open). Previously only
  programmatic newTab() was tracked, so user tabs were invisible.
- page.on('close') handler in wirePageEvents: removes closed tabs from the
  pages map and switches activeTabId to the last remaining tab.
- syncActiveTabByUrl: match Chrome extension's active tab URL to the correct
  Playwright page for accurate tab identity.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: per-tab agent isolation via BROWSE_TAB environment variable

Prevents parallel sidebar agents from interfering with each other's tab context.

Three-layer fix:
- sidebar-agent.ts: passes BROWSE_TAB=<tabId> env var to each claude process,
  per-tab processing set allows concurrent agents across tabs
- cli.ts: reads process.env.BROWSE_TAB and includes tabId in command request body
- server.ts: handleCommand() temporarily switches activeTabId when tabId is present,
  restores after command completes (safe: Bun event loop is single-threaded)

Also: per-tab agent state (TabAgentState map), per-tab message queuing,
per-tab chat buffers, verbose streaming narration, stop button endpoint.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: sidebar per-tab chat context, tab bar sync, stop button, UX polish

Extension changes:
- sidepanel.js: per-tab chat history (tabChatHistories map), switchChatTab()
  swaps entire chat view, browserTabActivated handler for instant tab sync,
  stop button wired to /sidebar-agent/stop, pollTabs renders tab bar
- sidepanel.html: updated banner text ("Browser co-pilot"), stop button markup,
  input placeholder "Ask about this page..."
- sidepanel.css: tab bar styles, stop button styles, loading state fixes
- background.js: chrome.tabs.onActivated sends browserTabActivated to sidepanel
  with tab URL for instant tab switch detection

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: per-tab isolation, BROWSE_TAB pinning, tab tracking, sidebar UX

sidebar-agent.test.ts (new tests):
- BROWSE_TAB env var passed to claude process
- CLI reads BROWSE_TAB and sends tabId in body
- handleCommand accepts tabId, saves/restores activeTabId
- Tab pinning only activates when tabId provided
- Per-tab agent state, queue, concurrency
- processingTabs set for parallel agents

sidebar-ux.test.ts (new tests):
- context.on('page') tracks user-created tabs
- page.on('close') removes tabs from pages map
- Tab isolation uses BROWSE_TAB not system prompt hack
- Per-tab chat context in sidepanel
- Tab bar rendering, stop button, banner text

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: resolve merge conflicts — keep security defenses + per-tab isolation

Merged main's security improvements (XML escaping, prompt injection defense,
allowed commands whitelist, --model opus, Write tool, stderr capture) with
our branch's per-tab isolation (BROWSE_TAB env var, processingTabs set,
no --resume). Updated test expectations for expanded system prompt.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.13.9.0)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: add inspector message types to background.js allowlist

Pre-existing bug found by Codex: ALLOWED_TYPES in background.js was missing
all inspector message types (startInspector, stopInspector, elementPicked,
pickerCancelled, applyStyle, toggleClass, injectCSS, resetAll, inspectResult).
Messages were silently rejected, making the inspector broken on ALL pages.

Also: separate executeScript and insertCSS into individual try blocks in
injectInspector(), store inspectorMode for routing, and add content.js
fallback when script injection fails (CSP, chrome:// pages).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: basic element picker in content.js for CSP-restricted pages

When inspector.js can't be injected (CSP, chrome:// pages), content.js
provides a basic picker using getComputedStyle + CSSOM:
- startBasicPicker/stopBasicPicker message handlers
- captureBasicData() with ~30 key CSS properties, box model, matched rules
- Hover highlight with outline save/restore (never leaves artifacts)
- Click uses e.target directly (no re-querying by selector)
- Sends inspectResult with mode:'basic' for sidebar rendering
- Escape key cancels picker and restores outlines

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: cleanup + screenshot buttons in sidebar inspector toolbar

Two action buttons in the inspector toolbar:
- Cleanup (🧹): POSTs cleanup --all to server, shows spinner, chat
  notification on success, resets inspector state (element may be removed)
- Screenshot (📸): POSTs screenshot to server, shows spinner, chat
  notification with saved file path

Shared infrastructure:
- .inspector-action-btn CSS with loading spinner via ::after pseudo-element
- chat-notification type in addChatEntry() for system messages
- package.json version bump to 0.13.9.0

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: inspector allowlist, CSP fallback, cleanup/screenshot buttons

16 new tests in sidebar-ux.test.ts:
- Inspector message allowlist includes all inspector types
- content.js basic picker (startBasicPicker, captureBasicData, CSSOM,
  outline save/restore, inspectResult with mode basic, Escape cleanup)
- background.js CSP fallback (separate try blocks, inspectorMode, fallback)
- Cleanup button (POST /command, inspector reset after success)
- Screenshot button (POST /command, notification rendering)
- Chat notification type and CSS styles

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: update project documentation for v0.13.9.0

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: cleanup + screenshot buttons in chat toolbar (not just inspector)

Quick actions toolbar (🧹 Cleanup, 📸 Screenshot) now appears above the chat
input, always visible. Both inspector and chat buttons share runCleanup() and
runScreenshot() helper functions. Clicking either set shows loading state on
both simultaneously.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: chat toolbar buttons, shared helpers, quick-action-btn styles

Tests that chat toolbar exists (chat-cleanup-btn, chat-screenshot-btn,
quick-actions container), CSS styles (.quick-action-btn, .quick-action-btn.loading),
shared runCleanup/runScreenshot helper functions, and cleanup inspector reset.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: aggressive cleanup heuristics — overlays, scroll unlock, blur removal

Massively expanded CLEANUP_SELECTORS with patterns from uBlock Origin and
Readability.js research:
- ads: 30+ selectors (Google, Amazon, Outbrain, Taboola, Criteo, etc.)
- cookies: OneTrust, Cookiebot, TrustArc, Quantcast + generic patterns
- overlays (NEW): paywalls, newsletter popups, interstitials, push prompts,
  app download banners, survey modals
- social: follow prompts, share tools
- Cleanup now defaults to --all when no args (sidebar button fix)
- Uses !important on all display:none (overrides inline styles)
- Unlocks body/html scroll (overflow:hidden from modal lockout)
- Removes blur/filter effects (paywall content blur)
- Removes max-height truncation (article teaser truncation)
- Collapses empty ad placeholder whitespace (empty divs after ad removal)
- Skips gstack-ctrl indicator in sticky removal

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: disable action buttons when disconnected, no error spam

- setActionButtonsEnabled() toggles .disabled class on all cleanup/screenshot
  buttons (both chat toolbar and inspector toolbar)
- Called with false in updateConnection when server URL is null
- Called with true when connection established
- runCleanup/runScreenshot silently return when disconnected instead of
  showing 'Not connected' error notifications
- CSS .disabled style: pointer-events:none, opacity:0.3, cursor:not-allowed

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: cleanup heuristics, button disabled state, overlay selectors

17 new tests:
- cleanup defaults to --all on empty args
- CLEANUP_SELECTORS overlays category (paywall, newsletter, interstitial)
- Major ad networks in selectors (doubleclick, taboola, criteo, etc.)
- Major consent frameworks (OneTrust, Cookiebot, TrustArc, Quantcast)
- !important override for inline styles
- Scroll unlock (body overflow:hidden)
- Blur removal (paywall content blur)
- Article truncation removal (max-height)
- Empty placeholder collapse
- gstack-ctrl indicator skip in sticky cleanup
- setActionButtonsEnabled function
- Buttons disabled when disconnected
- No error spam from cleanup/screenshot when disconnected
- CSS disabled styles for action buttons

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: LLM-based page cleanup — agent analyzes page semantically

Instead of brittle CSS selectors, the cleanup button now sends a prompt to
the sidebar agent (which IS an LLM). The agent:
1. Runs deterministic $B cleanup --all as a quick first pass
2. Takes a snapshot to see what's left
3. Analyzes the page semantically to identify remaining clutter
4. Removes elements intelligently, preserving site branding

This means cleanup works correctly on any site without site-specific selectors.
The LLM understands that "Your Daily Puzzles" is clutter, "ADVERTISEMENT" is
junk, but the SF Chronicle masthead should stay.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: aggressive cleanup heuristics + preserve top nav bar

Deterministic cleanup improvements (used as first pass before LLM analysis):
- New 'clutter' category: audio players, podcast widgets, sidebar puzzles/games,
  recirculation widgets (taboola, outbrain, nativo), cross-promotion banners
- Text-content detection: removes "ADVERTISEMENT", "Article continues below",
  "Sponsored", "Paid content" labels and their parent wrappers
- Sticky fix: preserves the topmost full-width element near viewport top (site
  nav bar) instead of hiding all sticky/fixed elements. Sorts by vertical
  position, preserves the first one that spans >80% viewport width.

Tests: clutter category, ad label removal, nav bar preservation logic.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: LLM-based cleanup architecture, deterministic heuristics, sticky nav

22 new tests covering:
- Cleanup button uses /sidebar-command (agent) not /command (deterministic)
- Cleanup prompt includes deterministic first pass + agent snapshot analysis
- Cleanup prompt lists specific clutter categories for agent guidance
- Cleanup prompt preserves site identity (masthead, headline, body, byline)
- Cleanup prompt instructs scroll unlock and $B eval removal
- Loading state management (async agent, setTimeout)
- Deterministic clutter: audio/podcast, games/puzzles, recirculation
- Ad label text patterns (ADVERTISEMENT, Sponsored, Article continues)
- Ad label parent wrapper hiding for small containers
- Sticky nav preservation (sort by position, first full-width near top)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: prevent repeat chat message rendering on reconnect/replay

Root cause: server persists chat to disk (chat.jsonl) and replays on restart.
Client had no dedup, so every reconnect re-rendered the entire history.
Messages from an old HN session would repeat endlessly on the SF Chronicle tab.

Fix: renderedEntryIds Set tracks which entry IDs have been rendered. addChatEntry
skips entries already in the set. Entries without an id (local notifications)
bypass the check. Clear chat resets the set.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: agent stops when done, no focus stealing, opus for prompt injection safety

Three fixes for sidebar agent UX:
- System prompt: "Be CONCISE. STOP as soon as the task is done. Do NOT keep
  exploring or doing bonus work." Prevents agent from endlessly taking
  screenshots and highlighting elements after answering the question.
- switchTab(id, opts): new bringToFront option. Internal tab pinning
  (BROWSE_TAB) uses bringToFront: false so agent commands never steal
  window focus from the user's active app.
- Keep opus model (not sonnet) for prompt injection resistance on untrusted
  web pages. Remove Write from allowedTools (agent only needs Bash for $B).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: agent conciseness, focus stealing, opus model, switchTab opts

Tests for the three UX fixes:
- System prompt contains STOP/CONCISE/Do NOT keep exploring
- sidebar agent uses opus (not sonnet) for prompt injection resistance
- switchTab has bringToFront option, defaults to true (opt-out)
- handleCommand tab pinning uses bringToFront: false (no focus steal)
- Updated stale tests: switchTab signature, allowedTools excludes Write,
  narration -> conciseness, tab pinning restore calls

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: sidebar CSS interaction E2E — HN comment highlight round-trip

New E2E test (periodic tier, ~$2/run) that exercises the full sidebar
agent pipeline with CSS interaction:
1. Agent navigates to Hacker News
2. Clicks into the top story's comments
3. Reads comments and identifies the most insightful one
4. Highlights it with a 4px solid orange outline via style injection

Tests: navigation, snapshot, text reading, LLM judgment, CSS modification.
Requires real browser + real Claude (ANTHROPIC_API_KEY).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: sidebar CSS E2E test — correct idle timeout (ms not s), pipe stdio

Root cause of test failure: BROWSE_IDLE_TIMEOUT is in milliseconds, not
seconds. '600' = 0.6 seconds, server died immediately after health check.
Fixed to '600000' (10 minutes).

Also: use 'pipe' stdio instead of file descriptors (closing fds kills child
on macOS/bun), catch ConnectionRefused on poll retry, 4 min poll timeout
for the multi-step opus task.

Test passes: agent navigates to HN, reads comments, identifies most
insightful one, highlights it with orange CSS, stops. 114s, $0.00.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 12:51:05 -06:00
Garry Tan 3cda8deec9 fix: security audit round 2 (v0.13.4.0) (#640)
* fix: chrome-cdp localhost-only binding

Restrict Chrome CDP to localhost by adding --remote-debugging-address=127.0.0.1
and --remote-allow-origins to prevent network-accessible debugging sessions.

Clears 1 Socket anomaly (Chrome CDP session exposure).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: extension sender validation + message type allowlist

Add sender.id check and ALLOWED_TYPES allowlist to the Chrome extension's
message handler. Defense-in-depth against message spoofing from external
extensions or future externally_connectable changes.

Clears 2 Socket anomalies (extension permissions).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: checksum-verified bun install

Replace unverified curl|bash bun installation with checksum-verified
download-then-execute pattern. The install script is downloaded, sha256
verified against a known hash, then executed. Preserves the Bun-native
install path without adding a Node/npm dependency.

Clears Snyk W012 + 3 Socket anomalies.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: content trust boundary markers in browse output

Wrap page-content commands (text, html, links, forms, accessibility,
console, dialog, snapshot) with --- BEGIN/END UNTRUSTED EXTERNAL CONTENT ---
markers. Covers direct commands (server.ts), chain sub-commands, and
snapshot output (meta-commands.ts).

Adds PAGE_CONTENT_COMMANDS set and wrapUntrustedContent() helper in
commands.ts (single source of truth, DRY). Expands the SKILL.md trust
warning with explicit processing rules for agents.

Clears Snyk W011 (third-party content exposure).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: harden trust boundary markers against escape attacks

- Sanitize URLs in markers (remove newlines, cap at 200 chars) to prevent
  marker injection via history.pushState
- Escape marker strings in content (zero-width space) so malicious pages
  can't forge the END marker to break out of the untrusted block
- Wrap resume command snapshot with trust boundary markers
- Wrap diff command output with trust boundary markers
- Wrap watch stop last snapshot with trust boundary markers

Found by cross-model adversarial review (Claude + Codex).

* chore: bump version and changelog (v0.13.4.0)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: gitignore .factory/ and remove from tracking

Factory Droid support was removed in this branch. The .factory/ directory
was re-added by merging main (which had v0.13.5.0 Factory support).
Gitignore it so it stays out.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 22:46:33 -06:00
Garry Tan 7450b5160b fix: security audit remediation — 12 fixes, 20 tests (v0.13.1.0) (#595)
* fix: remove auth token from /health, secure extension bootstrap (CRITICAL-02 + HIGH-03)

- Remove token from /health response (was leaked to any localhost process)
- Write .auth.json to extension dir for Manifest V3 bootstrap
- sidebar-agent reads token from state file via BROWSE_STATE_FILE env var
- Remove getToken handler from extension (token via health broadcast)
- Extension loads token before first health poll to prevent race condition

* fix: require auth on cookie-picker data routes (CRITICAL-01)

- Add Bearer token auth gate on all /cookie-picker/* data/action routes
- GET /cookie-picker HTML page stays unauthenticated (UI shell)
- Token embedded in served HTML for picker's fetch calls
- CORS preflight now allows Authorization header

* fix: add state file TTL and plaintext cookie warning (HIGH-02)

- Add savedAt timestamp to state save output
- Warn on load if state file older than 7 days
- Auto-delete stale state files (>7 days) on server startup
- Warning about plaintext cookie storage in save message

* fix: innerHTML XSS in extension content script and sidepanel (MEDIUM-01)

- content.js: replace innerHTML with createElement/textContent for ref panel
- sidepanel.js: escape entry.command with escapeHtml() in activity feed
- Both found by security audit + Codex adversarial red team

* fix: symlink bypass in validateReadPath (MEDIUM-02)

- Always resolve to absolute path first (fixes relative path bypass)
- Use realpathSync to follow symlinks before boundary check
- Throw on non-ENOENT realpathSync failures (explicit over silent)
- Resolve SAFE_DIRECTORIES through realpathSync (macOS /tmp → /private/tmp)
- Resolve directory part for non-existent files (ENOENT with symlinked parent)

* fix: freeze hook symlink bypass and prefix collision (MEDIUM-03)

- Add POSIX-portable path resolution (cd + pwd -P, works on macOS)
- Fix prefix collision: /project-evil no longer matches /project freeze dir
- Use trailing slash in boundary check to require directory boundary

* fix: shell script injection in gstack-config and telemetry (MEDIUM-04)

- gstack-config: validate keys (alphanumeric+underscore only)
- gstack-config: use grep -F (fixed string) instead of -E (regex)
- gstack-config: escape sed special chars in values, drop newlines
- gstack-telemetry-log: sanitize REPO_SLUG and BRANCH via json_safe()

* test: 20 security tests for audit remediation

- server-auth: verify token removed from /health, auth on /refs, /activity/*
- cookie-picker: auth required on data routes, HTML page unauthenticated
- path-validation: symlink bypass blocked, realpathSync failure throws
- gstack-config: regex key rejected, sed special chars preserved
- state-ttl: savedAt timestamp, 7-day TTL warning
- telemetry: branch/repo with quotes don't corrupt JSON
- adversarial: sidepanel escapes entry.command, freeze prefix collision

* chore: bump version and changelog (v0.13.1.0)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: tone down changelog — defense in depth, not catastrophic bugs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-28 08:35:24 -06:00
Garry Tan dc0bae82d3 fix: sidebar agent uses real tab URL instead of stale Playwright URL (v0.12.6.0) (#544)
* fix: sidebar agent uses extension's activeTabUrl instead of stale Playwright URL

When the user navigates manually in headed Chrome, Playwright's page.url()
stays on the old page. The sidebar agent was using this stale URL in its
system prompt, causing it to navigate to the wrong page (e.g., Hacker News
instead of the user's current page).

The Chrome extension now captures the active tab URL via chrome.tabs.query()
and sends it as activeTabUrl in the /sidebar-command POST body. The server
prefers this over Playwright's URL. The URL is sanitized (http/https only,
control chars stripped, 2048 char limit) to prevent prompt injection.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: connect-chrome pre-flight cleanup + improved onboarding docs

Adds Step 0 pre-flight cleanup that kills stale browse servers and cleans
Chromium profile locks before connecting. Improves the onboarding flow with
clearer instructions for finding the extension, opening the Side Panel, and
troubleshooting connection issues. Fixes Mode check from cdp to headed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: sidebar agent test suite (layers 1-2)

Layer 1 (unit): 18 tests for URL sanitization in sidebar-utils.ts — http/https
pass, chrome:// rejected, javascript: rejected, control chars stripped, truncation.

Layer 2 (integration): 13 tests for server HTTP endpoints — auth, sidebar-command
queue writes, activeTabUrl override/fallback, event relay to chat buffer, message
queuing, queue overflow (429), chat clear, agent kill.

Source changes for testability:
- Extract sanitizeExtensionUrl() to browse/src/sidebar-utils.ts
- Add BROWSE_HEADLESS_SKIP env var to skip browser launch in HTTP-only tests
- Add SIDEBAR_QUEUE_PATH env var to both server.ts and sidebar-agent.ts
- Add SIDEBAR_AGENT_TIMEOUT env var to sidebar-agent.ts
- Sync package.json version to match VERSION (0.12.2.0)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: sidebar agent round-trip tests with mock claude (layer 3)

Starts server + sidebar-agent together with a mock claude binary (shell script
outputting canned stream-json). Verifies the full queue-based message flow:

- Full round-trip: POST /sidebar-command → queue → agent → mock claude → events → chat
- Claude crash recovery: mock exits 1, agent_error appears, status returns to idle
- Sequential queue drain: two rapid messages both process in order

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: sidebar agent E2E tests with real Claude (layer 4)

Two E2E tests that exercise the full sidebar agent flow with real Claude:

- sidebar-navigate: POST /sidebar-command asking Claude to describe a fixture
  page, verify it responds with page content through the chat buffer
- sidebar-url-accuracy: POST with activeTabUrl differing from Playwright URL,
  verify the queue prompt uses the extension URL (the core bug fix)

Both registered as periodic tier (~$0.80 total, non-deterministic).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: sidebar E2E tests — sequential execution + eval collector fix

Both tests now pass:
- sidebar-url-accuracy: deterministic queue file check (no Claude needed)
- sidebar-navigate: real Claude responds through sidebar agent queue

Fixed: testIfSelected (sequential, not concurrent) to avoid queue file
conflicts. Added cost_usd field for eval collector compatibility.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: kill stale sidebar-agent processes before starting new one

Each /connect-chrome starts a new sidebar-agent subprocess with unref()
but never kills the previous one. Old agents accumulate as zombies with
stale auth tokens. When they pick up queue entries, their event relay
fails (401), so the server never receives agent_done and marks the agent
as "hung". The user sees the sidebar freeze.

Fix: pkill any existing sidebar-agent.ts processes before spawning.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.12.6.0)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: add P1 TODO for sidebar Write tool + error visibility

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 22:07:03 -06:00
Garry Tan 7665adf4fe feat: headed mode + sidebar agent + Chrome extension (v0.12.0) (#517)
* feat: CDP connect — control real Chrome/Comet via Playwright

Add `connectCDP()` to BrowserManager: connects to a running browser via
Chrome DevTools Protocol. All existing browse commands work unchanged
through Playwright's abstraction layer.

- chrome-launcher.ts: browser discovery, CDP probe, auto-relaunch with rollback
- browser-manager.ts: connectCDP(), mode guards (close/closeTab/recreateContext/handoff),
  auto-reconnect on browser restart, getRefMap() for extension API
- server.ts: CDP branch in start(), /health gains mode field, /refs endpoint,
  idle timer only resets on /command (not passive endpoints)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: browse connect/disconnect/focus CLI commands

- connect: pre-server command that discovers browser, starts server in CDP mode
- disconnect: drops CDP connection, restarts in headless mode
- focus: brings browser window to foreground via osascript (macOS)
- status: now shows Mode: cdp | launched | headed
- startServer() accepts extra env vars for CDP URL/port passthrough

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: CDP-aware skill templates — skip cookie import in real browser mode

Skills now check `$B status` for CDP mode and skip:
- /qa: cookie import prompt, user-agent override, headless workarounds
- /design-review: cookie import for authenticated pages
- /setup-browser-cookies: returns "not needed" in CDP mode

Regenerated SKILL.md files from updated templates.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: activity streaming — SSE endpoint for Chrome extension Side Panel

Real-time browse command feed via Server-Sent Events:
- activity.ts: ActivityEntry type, CircularBuffer (capacity 1000), privacy
  filtering (redacts passwords, auth tokens, sensitive URL params),
  cursor-based gap detection, async subscriber notification
- server.ts: /activity/stream SSE, /activity/history REST, handleCommand
  instrumented with command_start/command_end events
- 18 unit tests for filterArgs privacy, emitActivity, subscribe lifecycle

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: Chrome extension Side Panel + Conductor API proposal

Chrome extension (Manifest V3, sideload):
- Side Panel with live activity feed, @ref overlays, dark terminal aesthetic
- Background worker: health polling, SSE relay, ref fetching
- Popup: port config, connection status, side panel launcher
- Content script: floating ref panel with @ref badges

Conductor API proposal (docs/designs/CONDUCTOR_SESSION_API.md):
- SSE endpoint for full Claude Code session mirroring in Side Panel
- Discovery via HTTP endpoint (not filesystem — extensions can't read files)

TODOS.md: add $B watch, multi-agent tabs, cross-platform CDP, Web Store publishing.
Mark CDP mode as shipped.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: detect Conductor runtime, skip osascript quit for sandboxed apps

macOS App Management blocks Electron apps (Conductor) from quitting
other apps via osascript. Now detects the runtime environment:
- terminal/claude-code/codex: can manage apps freely
- conductor: prints manual restart instructions + polls for 60s

detectRuntime() checks env vars and parent process. When Chrome needs
restart but we can't quit it, prints step-by-step instructions and
waits for the user to restart Chrome with --remote-debugging-port.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: detect Conductor via actual env vars (CONDUCTOR_WORKSPACE_NAME)

Previous detection checked CONDUCTOR_WORKSPACE_ID which doesn't exist.
Conductor sets CONDUCTOR_WORKSPACE_NAME, CONDUCTOR_BIN_DIR, CONDUCTOR_PORT,
and __CFBundleIdentifier=com.conductor.app. Check these FIRST because
Conductor sessions also have ANTHROPIC_API_KEY (which was matching claude-code).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: connection status pill — floating indicator when gstack controls Chrome

Small pill in bottom-right corner of every page: "● gstack · 3 refs"
Shows when connected via CDP, fades to 30% opacity after 3s, full on hover.
Disappears entirely when disconnected.

Background worker now notifies content scripts on connect/disconnect state
changes so the pill appears/disappears without polling.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: Chrome requires --user-data-dir for remote debugging

Chrome refuses --remote-debugging-port without an explicit --user-data-dir.
Add userDataDir to BrowserBinary registry (macOS Application Support paths)
and pass it in both auto-launch and manual restart instructions.

Fix double-quoting in CLI manual restart instructions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: Chrome must be fully quit before launching with --remote-debugging-port

Chrome refuses to enable CDP on its default profile when another instance
is running (even with explicit --user-data-dir). The only reliable path:
fully quit Chrome first, then relaunch with the flag.

Updated instructions to emphasize this clearly with verification step.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: bin/chrome-cdp — quit Chrome and relaunch with CDP in one command

Quits Chrome gracefully, waits for full exit, relaunches with
--remote-debugging-port, polls until CDP is ready. Usage: chrome-cdp [port]

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: use Playwright channel:chrome instead of broken connectOverCDP

Playwright's connectOverCDP hangs with Chrome 146 due to CDP protocol
version mismatch. Switch to channel:'chrome' which uses Playwright's
native pipe protocol to launch the system Chrome binary directly.

This is simpler and more reliable:
- No CDP port discovery needed
- No --remote-debugging-port or --user-data-dir hassles
- $B connect just works — launches real Chrome headed window
- All Playwright APIs (snapshot, click, fill) work unchanged

bin/chrome-cdp updated with symlinked profile approach (kept for
manual CDP use cases, but $B connect no longer needs it).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: green border + gstack label on controlled Chrome window

Injects a 2px green border and small "gstack" label on every page
loaded in the controlled Chrome window via context.addInitScript().
Users can instantly tell which Chrome window Claude controls.

Also fixes close() for channel:chrome mode (uses browser.close()
not browser.disconnect() which doesn't exist).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: cleanup chrome-launcher runtime detection, remove puppeteer-core dep

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* style(design): redesign controlled Chrome indicator

Replace crude green border + label with polished indicator:
- 2px shimmer gradient at top edge (green→cyan→green, 3s loop)
- Floating pill bottom-right with frosted glass bg, fades to 25%
  opacity after 4s so it doesn't compete with page content
- prefers-reduced-motion disables shimmer animation
- Much more subtle — looks like a developer tool, not broken CSS

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: document real browser mode + Chrome extension in BROWSER.md and README.md

BROWSER.md: new sections for connect/disconnect/focus commands,
Chrome extension Side Panel install, CDP-aware skills, activity streaming.
Updated command reference table, key components, env vars, source map.

README.md: updated /browse description, added "Real browser mode" to
What's New section.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: step-by-step Chrome extension install guide in BROWSER.md

Replace terse bullet points with numbered walkthrough covering:
developer mode toggle, load unpacked, macOS file picker tip (Cmd+Shift+G),
pin extension, configure port, open side panel. Added troubleshooting section.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add Cmd+Shift+. tip for hidden folders in macOS file picker

macOS hides folders starting with . by default. Added both shortcuts:
Cmd+Shift+G (paste path directly) and Cmd+Shift+. (show hidden files).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: integrate hidden folder tips into the install flow naturally

Move Cmd+Shift+G and Cmd+Shift+. tips inline with the file picker
step instead of as a separate tip block after it.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: auto-load Chrome extension when $B connect launches Chrome

Extension auto-loads via --load-extension flag — no manual chrome://extensions
install needed. findExtensionPath() checks repo root, global install, and dev
paths. Also adds bin/gstack-extension helper for manual install in regular
Chrome, and rewrites BROWSER.md install docs with auto-load as primary path.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: /connect-chrome skill — one command to launch Chrome with Side Panel

New skill that runs $B connect, verifies the connection, guides the user
to open the Side Panel, and demos the live activity feed. Extension auto-loads
via --load-extension so no manual chrome://extensions install needed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: use launchPersistentContext for Chrome extension loading

Playwright's chromium.launch() silently ignores --load-extension.
Switch to launchPersistentContext with ignoreDefaultArgs to remove
--disable-extensions flag. Use bundled Chromium (real Chrome blocks
unpacked extensions). Fixed port 34567 for CDP mode so the extension
auto-connects.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: sync extension to DESIGN.md — amber accent, zinc neutrals, grain texture

Import design system from gstack-website. Update all extension colors:
green (#4ade80) → amber (#F59E0B/#FBBF24), zinc gray neutrals, grain
texture overlay. Regenerate icons as amber "G" monogram on dark background.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: sidebar chat with Claude Code — icon opens side panel directly

Replace popup flyout with direct side panel open on icon click. Primary
UI is now a chat interface that sends messages to Claude Code via file
queue. Activity/Refs tabs moved behind a debug toggle in the footer.
Command bar with history, auto-poll for responses, amber design system.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: sidebar agent — Claude-powered chat backend via file queue

Add /sidebar-command, /sidebar-response, and /sidebar-chat endpoints
to the browse server. sidebar-agent.ts watches the command queue file,
spawns claude -p with browse context for each message, and streams
responses back to the sidebar chat.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: remove duplicate gstack pill overlay, hide crash restore bubble

The addInitScript indicator and the extension's content script were both
injecting bottom-right pills, causing duplicates. Remove the pill from
addInitScript (extension handles it). Replace --restore-last-session with
--hide-crash-restore-bubble to suppress the "Chromium didn't shut down
correctly" dialog.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: state file authority — CDP server cannot be silently replaced

Hardens the connect/disconnect lifecycle:
- ensureServer() refuses to auto-start headless when CDP server is alive
- $B connect does full cleanup: SIGTERM → 2s → SIGKILL, profile locks, state
- shutdown() cleans Chromium SingletonLock/Socket/Cookie files
- uncaughtException/unhandledRejection handlers do emergency cleanup

This prevents the bug where a headless server overwrites the CDP server's
state file, causing $B commands to hit the wrong browser.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: sidebar agent streaming events + session state management

Enhance sidebar-agent.ts with:
- Live streaming of claude -p events (tool_use, text, result) to sidebar
- Session state file for BROWSE_STATE_FILE propagation to claude subprocess
- Improved logging (stderr, exit codes, event types)
- stdin.end() to prevent claude waiting for input
- summarizeToolInput() with path shortening for compact sidebar display

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: sidebar chat UI — streaming events, agent status, reconnect retry

Sidebar panel improvements:
- Chat tab renders streaming agent events (tool_use, text, result)
- Thinking dots animation while agent processes
- Agent error display with styled error blocks
- tryConnect() with 2s retry loop for initial connection
- Debug tabs (Activity/Refs) hidden behind gear toggle
- Clear chat button
- Compact tool call display with path shortening

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: server-integrated sidebar agent with sessions and message queue

Move the sidebar agent from a separate bun process into server.ts:
- Agent spawns claude -p directly when messages arrive via /sidebar-command
- In-memory chat buffer backed by per-session chat.jsonl on disk
- Session manager: create, load, persist, list sessions
- Message queue (cap 5) with agent status tracking (idle/processing/hung)
- Stop/kill endpoints with queue dismiss support
- /health now returns agent status + session info
- All sidebar endpoints require Bearer auth
- Agent killed on server shutdown
- 120s timeout detects hung claude processes

Eliminates: file-queue polling, separate sidebar-agent.ts process,
stale auth tokens, state file conflicts between processes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: extension auth + token flow for server-integrated agent

Update Chrome extension to use Bearer auth on all sidebar endpoints:
- background.js captures auth token from /health, exposes via getToken msg
- background.js sets openPanelOnActionClick for direct side panel access
- sidepanel.js gets token from background, sends in all fetch headers
- Health broadcasts include token so sidebar auto-authenticates
- Removes popup from manifest — icon click opens side panel directly

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: self-healing sidebar — reconnect banner, state machine, copy button

Sidebar UI now handles disconnection gracefully:
- Connection state machine: connected → reconnecting → dead
- Amber pulsing banner during reconnect (2s retry, 30 attempts)
- Red "Server offline" banner with Reconnect + Copy /connect-chrome buttons
- Green "Reconnected" toast that fades after 3s on successful reconnect
- Copy button lets user paste /connect-chrome into any Claude Code session

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: crash handling — save session, kill agent, distinct exit codes

Hardened shutdown/crash behavior:
- Browser disconnect exits with code 2 (distinct from crash code 1)
- emergencyCleanup kills agent subprocess and saves session state
- Clean shutdown saves session before exit (chat history persists)
- Clear user message on browser disconnect: "Run $B connect to reconnect"

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: worktree-per-session isolation for sidebar agent

Each sidebar session gets an isolated git worktree so the agent's file
operations don't conflict with the user's working directory:
- createWorktree() creates detached HEAD worktree in ~/.gstack/worktrees/
- Falls back to main cwd for non-git repos or on creation failure
- Handles collision cleanup from prior crashes
- removeWorktree() cleans up on session switch and shutdown
- worktreePath persisted in session.json

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(qa): ISSUE-001 — disconnect blocked by CDP guard in ensureServer

$B disconnect was routed through ensureServer() which refused to start a
headless server when a CDP state file existed. Disconnect is now handled
before ensureServer() (like connect), with force-kill + cleanup fallback
when the CDP server is unresponsive.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: resolve claude binary path for daemon-spawned agent

The browse server runs as a daemon and may not inherit the user's shell
PATH. Add findClaudeBin() that checks ~/.local/bin/claude (standard
install location), which claude, and common system paths. Shows a clear
error in the sidebar chat if claude CLI is not found.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: resolve claude symlinks + check Conductor bundled binary

posix_spawn fails on symlinks in compiled bun binaries. Now:
- Checks Conductor app's bundled binary first (not a symlink)
- Scans ~/.local/share/claude/versions/ for direct versioned binaries
- Uses fs.realpathSync() to resolve symlinks before spawning

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: compiled bun binary cannot posix_spawn — use external agent process

Compiled bun binaries fail posix_spawn on ALL executables (even /bin/bash).
The server now writes to an agent queue file, and a separate non-compiled
bun process (sidebar-agent.ts) reads the queue, spawns claude, and POSTs
events back via /sidebar-agent/event.

Changes:
- server.ts: spawnClaude writes to queue file instead of spawning directly
- server.ts: new /sidebar-agent/event endpoint for agent → server relay
- server.ts: fix result event field name (event.text vs event.result)
- sidebar-agent.ts: rewritten to poll queue file, relay events via HTTP
- cli.ts: $B connect auto-starts sidebar-agent as non-compiled bun process

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: loading spinner on sidebar open while connecting to server

Shows an amber spinner with "Connecting..." when the sidebar first opens,
replacing the empty state. After the first successful /sidebar-chat poll:
- If chat history exists: renders it immediately
- If no history: shows the welcome message

Prevents the jarring empty-then-populated flash on sidebar open.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: zero-friction side panel — auto-open on install, pill is clickable

Three changes to eliminate manual side panel setup:
- Auto-open side panel on extension install/update (onInstalled listener)
- gstack pill (bottom-right) is now clickable — opens the side panel
- Pill has pointer-events: auto so clicks always register (was: none)

User no longer needs to find the puzzle piece icon, pin the extension,
or know the side panel exists. It opens automatically on first launch
and can be re-opened by clicking the floating gstack pill.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: kill CDP naming, delete chrome-launcher.ts dead code

The connectCDP() method and connectionMode: 'cdp' naming was a legacy
artifact — real Chrome was tried but failed (silently blocks
--load-extension), so the implementation already used Playwright's
bundled Chromium via launchPersistentContext(). The naming was
misleading.

Changes:
- Delete chrome-launcher.ts (361 LOC) — only import was in unreachable
  attemptReconnect() method
- Delete dead attemptReconnect() and reconnecting field
- Delete preExistingTabIds (was for protecting real Chrome tabs we
  never connect to)
- Rename connectCDP() → launchHeaded()
- Rename connectionMode: 'cdp' → 'headed' across all files
- Replace BROWSE_CDP_URL/BROWSE_CDP_PORT env vars with BROWSE_HEADED=1
- Regenerate SKILL.md files for updated command descriptions
- Move BrowserManager unit tests to browser-manager-unit.test.ts

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: converge handoff into connect — extension loads on handoff

Handoff now uses launchPersistentContext() with extension auto-loading,
same as the connect/launchHeaded() path. This means when the agent
gets stuck (2FA, CAPTCHA) and hands off to the user, the Chrome
extension + side panel are available automatically.

Before: handoff used chromium.launch() + newContext() — no extension
After: handoff uses chromium.launchPersistentContext() — extension loads

Also sets connectionMode to 'headed' and disables dialog auto-accept
on handoff, matching connect behavior.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: gate sidebar chat behind --chat flag

$B connect (default): headed Chromium + extension with Activity + Refs
tabs only. No separate agent spawned. Clean, no confusion.

$B connect --chat: same + Chat tab with standalone claude -p agent.
Shows experimental banner: "Standalone mode — this is a separate
agent from your workspace."

Implementation:
- cli.ts: parse --chat, set BROWSE_SIDEBAR_CHAT env, conditionally
  spawn sidebar-agent
- server.ts: gate /sidebar-* routes behind chatEnabled, return 403
  when disabled, include chatEnabled in /health response
- sidepanel.js: applyChatEnabled() hides/shows Chat tab + banner
- background.js: forward chatEnabled from health response
- sidepanel.html/css: experimental banner with amber styling

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: file drop relay + $B inbox command

Sidebar agent now writes structured messages to .context/sidebar-inbox/
when processing user input. The workspace agent can read these via
$B inbox to see what the user reported from the browser.

File drop format:
  .context/sidebar-inbox/{timestamp}-observation.json
  { type, timestamp, page: {url}, userMessage, sidebarSessionId }

Atomic writes (tmp + rename) prevent partial reads. $B inbox --clear
removes messages after display.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: $B watch — passive observation mode

Claude enters read-only mode and captures periodic snapshots (every 5s)
while the user browses. Mutation commands (click, fill, etc.) are
blocked during watch. $B watch stop exits and returns a summary with
the last snapshot.

Requires headed mode ($B connect). This is the inverse of the scout
pattern — the workspace agent watches through the browser instead of
the sidebar relaying to it.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: add coverage for sidebar-agent, file-drop, and watch mode

33 new tests covering:
- Sidebar agent queue parsing (valid/malformed/empty JSONL)
- writeToInbox file drop (directory creation, atomic writes, JSON format)
- Inbox command (display, sorting, --clear, malformed file handling)
- Watch mode state machine (start/stop cycles, snapshots, duration)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: TODOS cleanup + Chrome vs Chromium exploration doc

- Update TODOS.md: mark CDP mode, $B watch, sidebar scout as SHIPPED
- Delete dead "cross-platform CDP browser discovery" TODO
- Rename dependencies from "CDP connect" to "headed mode"
- Add docs/designs/CHROME_VS_CHROMIUM_EXPLORATION.md memorializing
  the architecture exploration and decision to use Playwright Chromium

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add Conductor Chrome sidebar integration design doc

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: sidebar-agent validates cwd before spawning claude

The queue entry may reference a worktree that was cleaned up between
sessions. Now falls back to process.cwd() if the path doesn't exist,
preventing silent spawn failures.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: gen-skill-docs resolver merge + preamble tier gate + plan file discovery

The local RESOLVERS record in gen-skill-docs.ts was shadowing the imported
canonical resolvers, causing stale test coverage and preamble generators
to be used instead of the authoritative versions in resolvers/.

Changes:
- Merge imported RESOLVERS with local overrides (spread + override pattern)
- Fix preamble tier gate: tier 1 skills no longer get AskUserQuestion format
- Make plan file discovery host-agnostic (search multiple plan dirs)
- Add missing E2E tier entries for ship/review plan completion tests

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: ungate sidebar agent + raise timeout to 5 minutes (v0.12.0)

Sidebar chat is now always available in headed mode — no --chat flag needed.
Agent tasks get 5 minutes instead of 2, enabling multi-page workflows like
navigating directories and filling forms across pages.

Changes:
- cli.ts: remove --chat flag, always set BROWSE_SIDEBAR_CHAT=1, always spawn agent
- server.ts: remove chatEnabled gate (403 response), raise AGENT_TIMEOUT_MS to 300s
- sidebar-agent.ts: raise child process timeout from 120s to 300s

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: headed mode + sidebar agent documentation (v0.12.0)

- README: sidebar agent section, personal automation example (school parent
  portal), two auth paths (manual login + cookie import), DevTools MCP mention
- BROWSER.md: sidebar agent section with usage, timeout, session isolation,
  authentication, and random delay documentation
- connect-chrome template: add sidebar chat onboarding step
- CHANGELOG: v0.12.0 entry covering headed mode, sidebar agent, extension
- VERSION: bump to 0.12.0.0
- TODOS: Chrome DevTools MCP integration as P0

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: regenerate SKILL.md files

Generated from updated templates + resolver merge. Key changes:
- Tier 1 skills no longer include AskUserQuestion format section
- Ship/review skills now include coverage gate with thresholds
- Connect-chrome skill includes sidebar chat onboarding step
- Plan file discovery uses host-agnostic paths

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: regenerate Codex connect-chrome skill

Updated preamble with proactive prompt and sidebar chat onboarding step.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: network idle, state persistence, iframe support, chain pipe format (v0.12.1.0) (#516)

* feat: network idle detection + chain pipe format

- Upgrade click/fill/select from domcontentloaded to networkidle wait
  (2s timeout, best-effort). Catches XHR/fetch triggered by interactions.
- Add pipe-delimited format to chain as JSON fallback:
  $B chain 'goto url | click @e5 | snapshot -ic'
- Add post-loop networkidle wait in chain when last command was a write.
- Frame-aware: commands use target (getActiveFrameOrPage) for locator ops,
  page-only ops (goto/back/forward/reload) guard against frame context.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: $B state save/load + $B frame — new browse commands

- state save/load: persist cookies + URLs to .gstack/browse-states/{name}.json
  File perms 0o600, name sanitized to [a-zA-Z0-9_-]. V1 skips localStorage
  (breaks on load-before-navigate). Load replaces session via closeAllPages().
- frame: switch command context to iframe via CSS selector, @ref, --name, or
  --url. 'frame main' returns to main frame. Execution target abstraction
  (getActiveFrameOrPage) across read-commands, snapshot, and write-commands.
- Frame context cleared on tab switch, navigation, resume, and handoff.
- Snapshot shows [Context: iframe src="..."] header when in frame.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: add tests for network idle, chain pipe format, state, and frame

- Network idle: click on fetch button waits for XHR, static click is fast
- Chain pipe: pipe-delimited commands, quoted args, JSON still works
- State: save/load round-trip, name sanitization, missing state error
- Frame: switch to iframe + back, snapshot context header, fill in frame,
  goto-in-frame guard, usage error

New fixtures: network-idle.html (fetch + static buttons), iframe.html (srcdoc)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: review fixes — iframe ref scoping, detached frame recovery, state validation

- snapshot.ts: ref locators, cursor-interactive scan, and cursor locator
  now use target (frame-aware) instead of page — fixes @ref clicking in iframes
- browser-manager.ts: getActiveFrameOrPage auto-recovers from detached frames
  via isDetached() check
- meta-commands.ts: state load resets activeFrame, elementHandle disposed after
  contentFrame(), state file schema validation (cookies + pages arrays),
  filter empty pipe segments in chain tokenizer
- write-commands.ts: upload command uses target.locator() for frame support

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: regenerate SKILL.md files + rebuild binary

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.12.1.0)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 11:15:24 -06:00