Merge remote-tracking branch 'origin/main' into garrytan/openclaw-plan

2026-05-07 14:06:42 +02:00 · 2026-04-04 16:08:14 -07:00
parent 21840c4c67 04b709d91a
commit 04f1fb64c7
98 changed files with 16663 additions and 520 deletions
@@ -0,0 +1,376 @@
+# GStack Browser V0 — The AI-Native Development Browser
+
+**Date:** 2026-03-30
+**Author:** Garry Tan + Claude Code
+**Status:** Phase 1a shipped, Phase 1b in progress
+**Branch:** garrytan/gstack-as-browser
+
+## The Thesis
+
+Every other AI browser (Atlas, Dia, Comet, Chrome Auto Browse) starts with a
+consumer browser and bolts AI onto it. GStack Browser inverts this. It starts
+with Claude Code as the runtime and gives it a browser viewport.
+
+The agent is the primary citizen. The browser is the canvas. Skills are
+first-class capabilities. You don't "use a browser with AI help." You use
+an AI that can see and interact with the web.
+
+This is the IDE for the post-IDE era. Code lives in the terminal. The product
+lives in the browser. The AI works across both simultaneously. What Cursor did
+for text editors, GStack Browser does for the browser.
+
+## What It Is Today (Phase 1a, shipped)
+
+A double-clickable macOS .app that wraps Playwright's Chromium with the gstack
+sidebar extension baked in. You open it and Claude Code can see your screen,
+navigate pages, fill forms, take screenshots, inspect CSS, clean up overlays,
+and run any gstack skill. All without touching a terminal.
+
+```
+GStack Browser.app (389MB, 189MB DMG)
+├── Compiled browse binary (58MB) — CLI + HTTP server
+├── Chrome extension (172KB) — sidebar, activity feed, inspector
+├── Playwright's Chromium (330MB) — the actual browser
+└── Launcher script — binds project dir, sets env vars
+```
+
+Launch → Chromium opens with sidebar → extension auto-connects to browse server
+→ agent ready in ~5 seconds.
+
+## What It Will Be
+
+### Phase 1b: Developer UX (next)
+
+**Command Palette (Cmd+K):** The signature interaction. Opens a fuzzy-filtered
+skill picker. Type "/qa" to start QA testing, "/investigate" to debug, "/ship"
+to create a PR. Skills are fetched from the browse server, not hardcoded. The
+palette is the entry point to everything.
+
+**Quick Screenshot (Cmd+Shift+S):** Capture the current viewport and pipe it into
+the sidebar chat with "What do you see?" context. The AI analyzes the screenshot
+and gives you actionable feedback. Visual bug reports in one keystroke.
+
+**Status Bar:** A persistent 30px bar at the bottom of every page. Shows agent
+status (idle/thinking), workspace name, current branch, and auto-detected dev
+servers. Click a dev server pill to navigate. Always-visible context about what
+the AI is doing.
+
+**Auto-Detect Dev Servers:** On launch, scans common ports (3000, 3001, 4200,
+5173, 5174, 8000, 8080). If exactly one server is found, auto-navigates to it.
+Every detected server gets a pill in the status bar for one-click switching.
+
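A minimal sketch of the detection rule above, assuming a plain HTTP probe against `localhost`. The port list comes from the plan; `httpProbe`, `detectDevServers`, and `chooseAutoNavigateTarget` are hypothetical names, not functions from the gstack codebase.

```typescript
// Ports listed in the plan. Everything else in this block is an illustrative assumption.
const DEV_PORTS = [3000, 3001, 4200, 5173, 5174, 8000, 8080];

type Probe = (port: number) => Promise<boolean>;

// Treats any HTTP response (even a 404) as "a server is listening".
const httpProbe: Probe = async (port) => {
  try {
    await fetch(`http://localhost:${port}/`, { signal: AbortSignal.timeout(500) });
    return true;
  } catch {
    return false; // connection refused or timed out
  }
};

// Probes all ports in parallel and returns the ones that answered.
async function detectDevServers(
  probe: Probe = httpProbe,
  ports: number[] = DEV_PORTS,
): Promise<number[]> {
  const results = await Promise.all(
    ports.map(async (port) => ((await probe(port)) ? port : null)),
  );
  return results.filter((port): port is number => port !== null);
}

// Auto-navigate only when the choice is unambiguous (exactly one live server).
function chooseAutoNavigateTarget(livePorts: number[]): string | null {
  return livePorts.length === 1 ? `http://localhost:${livePorts[0]}/` : null;
}
```

Keeping the probe injectable means the "exactly one server" rule can be tested without opening sockets; with zero or several live servers the function returns `null` and the choice is left to the status bar pills.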
+### Phase 2: BoomLooper Integration
+
+The sidebar connects to BoomLooper's Phoenix/Elixir APIs instead of a local
+`claude -p` subprocess. BoomLooper provides:
+
+- **Multi-agent orchestration.** Spawn 5 agents in parallel, each with its own
+  browser tab. One runs QA, one does design review, one watches for regressions.
+- **Docker infrastructure.** Each agent gets an isolated container. The browser
+  inside the container tests the dev server. No port conflicts, no state leakage.
+- **Session persistence.** Agent conversations survive browser restarts. Pick up
+  where you left off.
+- **Team visibility.** Your teammates can watch what your agents are doing in
+  real-time. Like pair programming, but the pair is 5 AI agents and you're the
+  conductor.
+
+### Phase 3: Browse as BoomLooper Tool
+
+The browse binary becomes an MCP tool in BoomLooper. Agents in Docker containers
+use browse commands to test dev servers, take screenshots, fill forms, and verify
+deployments. Cross-platform compilation (linux-arm64/x64) required.
+
+### Phase 4: Chromium Fork (trigger-gated)
+
+Fork Chromium only once all four triggers hold: the extension side panel hits
+hard API limits, GStack Browser ships to external users, build infra exists,
+and the business justifies the maintenance. The fork would follow Brave's
+`chromium_src` override pattern, with Claude Code (CC) handling rebases on a
+6-week cadence (2-4 hours with CC vs 1-2 weeks by hand). ~20-30 files modified.
+
+### Phase 5: Native Shell
+
+SwiftUI/AppKit app shell with native sidebar, isolated Chromium service. Full
+platform integration. May be superseded by Phase 4 if the Chromium fork includes
+a native sidebar.
+
+## Vision: What an AI Browser Can Do
+
+### 1. See What You See
+
+The browser is the AI's eyes. Not through screenshots (though it can do that),
+but through DOM access, CSS inspection, network monitoring, and accessibility
+tree parsing. The AI understands the page structure, not just the pixels.
+
+**Today:** `snapshot` command returns an accessibility-tree representation of any
+page. The AI can "see" every button, link, form field, and text element. Element
+references (`@e1`, `@e2`) let the AI click, fill, and interact.
+
+**Next:** Real-time page observation. The AI notices when a page changes, when an
+error appears in the console, when a network request fails. Proactive debugging
+without being asked.
+
+**Future:** Visual understanding. The AI compares before/after screenshots to catch
+visual regressions. Pixel-level design review. "This button moved 3px left and the
+font changed from 14px to 13px."
+
+### 2. Act on What It Sees
+
+Not just reading pages, but interacting with them like a human user would.
+
+**Today:** Click, fill, select, hover, type, scroll, upload files, handle dialogs,
+navigate, manage tabs. All via simple commands through the browse server.
+
+**Next:** Multi-step user flows. "Log in, go to settings, change the timezone,
+verify the confirmation message." The AI chains commands with verification at each
+step.
+
+**Future:** Autonomous QA agent. "Test every link on this page. Fill every form.
+Try to break it." The AI runs exhaustive interaction testing without a script.
+Finds bugs a human tester would miss because it tries combinations humans don't
+think of.
+
+### 3. Write Code While Browsing
+
+This is the key differentiator. The AI can see the bug in the browser AND fix it
+in the code simultaneously.
+
+**Today:** The sidebar chat connects to Claude Code. You say "this button is
+misaligned" and the AI reads the CSS, identifies the issue, and proposes a fix.
+The `/design-review` skill takes screenshots, identifies visual issues, and
+commits fixes with before/after evidence.
+
+**Next:** Live reload loop. The AI edits CSS/HTML, the browser auto-reloads, the
+AI verifies the fix visually. No human in the loop for simple visual fixes.
+"Fix every spacing issue on this page" becomes a 30-second task.
+
+**Future:** Full-stack debugging. The AI sees a 500 error in the browser, reads
+the server logs, traces to the failing line, writes the fix, and verifies in the
+browser. One command: "This page is broken. Fix it."
+
+### 4. Understand the Whole Stack
+
+The browser isn't just a viewport. It's a window into the application's health.
+
+**Today:**
+- Console log capture — every `console.log`, `console.error`, and warning
+- Network request monitoring — every XHR, fetch, websocket, and static asset
+- Performance metrics — Core Web Vitals, resource timing, paint events
+- Cookie and storage inspection — read and write localStorage, sessionStorage
+- CSS inspection — computed styles, box model, rule cascade
+
+**Next:**
+- Network request replay — "replay this failing request with different params"
+- Performance regression detection — "this page is 200ms slower than yesterday"
+- Dependency auditing — "this page loads 47 third-party scripts"
+- Accessibility auditing — "this form has no labels, these colors fail contrast"
+
+**Future:**
+- Full application telemetry — CPU, memory, GPU usage in real-time
+- Cross-browser testing — same test suite across Chrome, Firefox, Safari
+- Real user monitoring correlation — "this bug affects 12% of production users"
+
+### 5. The Workspace Model
+
+The browser IS the workspace. Not a tab in a workspace. The workspace itself.
+
+**Today:** Each browser session is bound to a project directory. The sidebar shows
+the current branch. Detected dev servers will appear in the status bar once Phase 1b ships.
+
+**Next:** Multi-project support. Switch between projects without closing the
+browser. Each project gets its own set of tabs, its own agent, its own context.
+Like VSCode workspaces, but for the browser.
+
+**Future:** Team workspaces. Multiple developers share a browser workspace. See
+each other's agents working. Collaborative debugging where one person navigates
+and the other watches the AI fix things in real-time.
+
+### 6. Skills as Browser Capabilities
+
+Every gstack skill becomes a browser capability.
+
+| Skill | Browser Capability |
+|-------|-------------------|
+| `/qa` | Test every page, find bugs, fix them, verify fixes |
+| `/design-review` | Screenshot → analyze → fix CSS → screenshot again |
+| `/investigate` | See the error in browser → trace to code → fix → verify |
+| `/benchmark` | Measure page performance → detect regressions → alert |
+| `/canary` | Monitor deployed site → screenshot periodically → alert on changes |
+| `/ship` | Run tests → review diff → create PR → verify deployment in browser |
+| `/cso` | Audit page for XSS, open redirects, clickjacking in real browser |
+| `/office-hours` | Browse competitor sites → synthesize observations → design doc |
+
+The command palette (Cmd+K) is the hub. You don't need to know the skills exist.
+You type what you want, the fuzzy filter finds the right skill, and the AI runs it
+with the browser as context.
+
+### 7. The Design Loop
+
+AI-powered design is a loop, not a handoff.
+
+```
+Generate mockup (GPT Image API)
+  → Review in browser (side-by-side with live site)
+  → Iterate with feedback ("make the header taller")
+  → Approve direction
+  → Generate production HTML/CSS
+  → Preview in browser
+  → Fine-tune with /design-review
+  → Ship
+```
+
+The browser closes the gap between "what it looks like in Figma" and "what it
+looks like in production." Because the AI can see both simultaneously.
+
+### 8. The Security Loop
+
+CSO review in a real browser, not just static analysis.
+
+- Inject XSS payloads into every input field, check if they execute
+- Test CSRF by replaying requests from a different origin
+- Check for open redirects by navigating to crafted URLs
+- Verify CSP headers are actually enforced (not just present)
+- Test auth flows by manipulating cookies and tokens in real-time
+- Check for clickjacking by loading the site in an iframe
+
+Static analysis catches patterns. Browser testing catches reality.
+
+### 9. The Monitoring Loop
+
+Post-deploy canary monitoring, in a real browser.
+
+```
+Deploy → Browser loads production URL
+  → Screenshot baseline
+  → Every 5 minutes: screenshot, compare, check console
+  → Alert on: visual regression, new console errors, performance drop
+  → Auto-rollback if critical error detected
+```
+
+Synthetic monitoring with AI judgment. Not just "did the page return 200" but
+"does the page look right and work correctly."
+
+## Architecture
+
+```
++-------------------------------------------------------+
+|                  GStack Browser                        |
+|                                                        |
+|  +------------------+  +---------------------------+  |
+|  |   Chromium        |  |   Extension Side Panel    |  |
+|  |   (Playwright)    |  |   ├── Chat (Claude Code)  |  |
+|  |                   |  |   ├── Activity Feed        |  |
+|  |   ┌────────────┐  |  |   ├── Element Refs         |  |
+|  |   │ Status Bar  │  |  |   ├── CSS Inspector        |  |
+|  |   └────────────┘  |  |   ├── Command Palette      |  |
+|  +--------┬──────────+  |   └── Settings             |  |
+|           │              +-------------┬--------------+  |
++-----------┼────────────────────────────┼─────────────────+
+            │                            │
+            v                            v
+  +---------┴-----------+    +-----------┴-----------+
+  |  Browse Server      |    |  Sidebar Agent        |
+  |  (HTTP + SSE)       |    |  (claude -p wrapper)  |
+  |  :34567             |    |  Runs gstack skills   |
+  |                     |    |  Per-tab isolation     |
+  |  Commands:          |    |                       |
+  |  goto, click, fill  |    |  Future: BoomLooper   |
+  |  snapshot, screenshot|   |  GenServer agents     |
+  |  css, inspect, eval |    |                       |
+  +---------┬-----------+    +-----------┬-----------+
+            │                            │
+            v                            v
+  +---------┴-----------+    +-----------┴-----------+
+  |  User's App         |    |  Claude Code          |
+  |  localhost:3000     |    |  (reads/writes code)  |
+  |  (or any URL)       |    |                       |
+  +---------------------+    +-----------------------+
+```
+
+## Competitive Landscape
+
+| Browser | Approach | Differentiator | Weakness |
+|---------|----------|---------------|----------|
+| **Atlas** | Chromium fork + AI layer | Agentic browser, "OWL" isolated Chromium | Consumer-focused, no code integration |
+| **Dia** | AI-native browser | Clean UI, built for AI interaction | No dev tools, no code editing |
+| **Comet** | AI browser | Multi-agent browsing | Early, unclear dev workflow |
+| **Chrome Auto Browse** | Extension | Google's own, deep Chrome integration | Extension-only, no code editing |
+| **Cursor** | VSCode fork + AI | Best-in-class code editing | No browser viewport |
+| **GStack Browser** | CC runtime + browser viewport | See bug in browser, fix in code, verify | Currently macOS-only, no consumer features |
+
+GStack Browser doesn't compete with consumer browsers. It competes with the
+workflow of switching between browser and editor. The goal is to make that switch
+invisible.
+
+## Design System
+
+From DESIGN.md:
+- **Primary accent:** Amber-500 (#F59E0B) — agent active, focus states, pulse
+- **Background:** Zinc-950 (#09090B) through Zinc-800 (#27272A) — dark, dense
+- **Typography:** JetBrains Mono (code/status), DM Sans (UI/labels)
+- **Border radius:** 8px (md), 12px (lg), full (pills)
+- **Motion:** Pulse animation on agent active, 200ms transitions
+- **Layout:** Sidebar (right), status bar (bottom), palette (centered overlay)
+
+## Implementation Status
+
+| Component | Status | Notes |
+|-----------|--------|-------|
+| .app bundle | **SHIPPED** | 389MB, launches in ~5s |
+| DMG packaging | **SHIPPED** | 189MB compressed |
+| `GSTACK_CHROMIUM_PATH` | **SHIPPED** | Custom Chromium binary support |
+| `BROWSE_EXTENSIONS_DIR` | **SHIPPED** | Extension path override |
+| Auth via `/health` | **SHIPPED** | Replaces .auth.json file approach, auto-refreshes on server restart |
+| Build script | **SHIPPED** | `scripts/build-app.sh` |
+| Model routing | **SHIPPED** | Sonnet for actions, Opus for analysis (`pickSidebarModel`) |
+| Debug logging | **SHIPPED** | 40+ silent catches → prefixed console logging across 4 files |
+| No idle timeout (headed) | **SHIPPED** | Browser stays alive as long as window is open |
+| Cookie import button | **SHIPPED** | One-click in sidebar footer, opens `/cookie-picker` |
+| Sidebar arrow hint | **SHIPPED** | Points to sidebar, hides only when sidebar actually opens |
+| Architecture doc | **SHIPPED** | `docs/designs/SIDEBAR_MESSAGE_FLOW.md` |
+| Command palette | Planned | Phase 1b |
+| Quick screenshot | Planned | Phase 1b |
+| Status bar | Planned | Phase 1b |
+| Dev server detection | Planned | Phase 1b |
+| BoomLooper integration | Future | Phase 2 |
+| Cross-platform | Future | Phase 3 |
+| Chromium fork | Trigger-gated | Phase 4 |
+| Native shell | Deferred | Phase 5 |
+
+## The 12-Month Vision
+
+```
+TODAY (Phase 1)               6 MONTHS (Phase 2-3)          12 MONTHS (Phase 4-5)
+─────────────                 ──────────────────            ────────────────────
+macOS .app wrapper            BoomLooper multi-agent         Chromium fork OR
+Extension sidebar             Docker containers              Native SwiftUI shell
+Local claude -p agent         Team workspaces                Cross-platform
+Single project                Linux/x64 browse               Auto-update
+Manual skill invocation       Autonomous QA loops            Skill marketplace
+                              Performance monitoring          Plugin API
+                              Real-time collaboration         Enterprise features
+```
+
+The 12-month ideal: you open GStack Browser, it detects your project, starts
+your dev server, runs your test suite, and reports what's broken. You say "fix
+it" and the AI fixes every bug, verifies each fix visually, and creates a PR.
+You review the PR in the same browser, approve it, and the AI deploys it and
+monitors the canary. All in one window.
+
+That's the browser as AI workspace. Not a browser with AI bolted on. An AI
+with a browser bolted on.
+
+## Review History
+
+This plan went through 4 reviews:
+
+1. **CEO Review** (`/plan-ceo-review`, SELECTIVE EXPANSION) — 9 scope proposals,
+   3 accepted (Cmd+K, Cmd+Shift+S, status bar), 5 deferred, 1 skipped
+2. **Design Review** (`/plan-design-review`) — scored 5/10 → 8/10, 9 design
+   decisions added, 2 approved mockups generated
+3. **Eng Review** (`/plan-eng-review`) — 4 issues found, 0 critical gaps,
+   test plan produced
+4. **Codex Review** (outside voice) — 9 findings, 3 critical gaps caught
+   (server bundling, auth file location, project binding). All resolved.
+
+The Codex review caught 3 real architecture gaps that survived 3 prior reviews.
+Cross-model review works.
@@ -0,0 +1,190 @@
+# Sidebar Message Flow
+
+How the GStack Browser sidebar actually works. Read this before touching
+sidepanel.js, background.js, content.js, server.ts sidebar endpoints,
+or sidebar-agent.ts.
+
+## Components
+
+```
+┌─────────────────┐     ┌──────────────┐     ┌─────────────┐     ┌────────────────┐
+│  sidepanel.js   │────▶│ background.js│────▶│  server.ts   │────▶│sidebar-agent.ts│
+│  (Chrome panel) │     │ (svc worker) │     │  (Bun HTTP)  │     │  (Bun process) │
+└─────────────────┘     └──────────────┘     └─────────────┘     └────────────────┘
+        ▲                                           │                      │
+        │           polls /sidebar-chat             │    polls queue file   │
+        └───────────────────────────────────────────┘                      │
+                                                    ◀──────────────────────┘
+                                                    POST /sidebar-agent/event
+```
+
+## Startup Timeline
+
+```
+T+0ms     CLI runs `$B connect`
+            ├── Server starts on port 34567
+            ├── Writes state to .gstack/browse.json (pid, port, token)
+            ├── Launches headed Chromium with extension
+            └── Clears sidebar-agent-queue.jsonl
+
+T+500ms   sidebar-agent.ts spawned by CLI
+            ├── Reads auth token from .gstack/browse.json
+            ├── Creates queue file if missing
+            ├── Sets lastLine = current line count
+            └── Starts polling every 200ms
+
+T+1-3s    Extension loads in Chromium
+            ├── background.js: health poll every 1s (fast startup)
+            │     └── GET /health → gets auth token
+            ├── content.js: injects on welcome page
+            │     └── Does NOT fire gstack-extension-ready (waits for sidebar)
+            └── Side panel: may auto-open via chrome.sidePanel.open()
+
+T+2-10s   Side panel connects
+            ├── tryConnect() → asks background for port/token
+            ├── Fallback: direct GET /health for token
+            ├── updateConnection(url, token)
+            │     ├── Starts chat polling (1s interval)
+            │     ├── Starts tab polling (2s interval)
+            │     ├── Connects SSE activity stream
+            │     └── Sends { type: 'sidebarOpened' } to background
+            └── background relays to content script → hides welcome arrow
+
+T+10s+    Ready for messages
+```
+
+## Message Flow: User Types → Claude Responds
+
+```
+1. User types "go to hn" in sidebar, hits Enter
+
+2. sidepanel.js sendMessage()
+   ├── Renders user bubble immediately (optimistic)
+   ├── Renders thinking dots immediately
+   ├── Switches to fast poll (300ms)
+   └── chrome.runtime.sendMessage({ type: 'sidebar-command', message, tabId })
+
+3. background.js
+   ├── Gets active Chrome tab URL
+   └── POST /sidebar-command { message, activeTabUrl }
+       with Authorization: Bearer ${authToken}
+
+4. server.ts /sidebar-command handler
+   ├── validateAuth(req)
+   ├── syncActiveTabByUrl(extensionUrl) — syncs Playwright tab to Chrome tab
+   ├── pickSidebarModel(message) — 'sonnet' for actions, 'opus' for analysis
+   ├── Adds user message to chat buffer
+   ├── Builds system prompt + args
+   └── Appends JSON to ~/.gstack/sidebar-agent-queue.jsonl
+
+5. sidebar-agent.ts poll() (within 200ms)
+   ├── Reads new line from queue file
+   ├── Parses JSON entry
+   ├── Checks processingTabs — skips if tab already has agent running
+   └── askClaude(entry) — fire and forget
+
+6. sidebar-agent.ts askClaude()
+   ├── spawn('claude', ['-p', prompt, '--model', model, ...])
+   ├── Streams stdout line-by-line (stream-json format)
+   ├── For each event: POST /sidebar-agent/event { type, tool, text, tabId }
+   └── On close: POST /sidebar-agent/event { type: 'agent_done' }
+
+7. server.ts processAgentEvent()
+   ├── Adds entry to chat buffer (in-memory + disk)
+   ├── On agent_done: sets tab status to 'idle'
+   └── On agent_done: processes next queued message for that tab
+
+8. sidepanel.js pollChat() (every 300ms during fast poll)
+   ├── GET /sidebar-chat?after=${chatLineCount}&tabId=${tabId}
+   ├── Renders new entries (text, tool_use, agent_done)
+   └── On agent idle: removes thinking dots, stops fast poll
+```
+
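Step 5 of the flow above (the agent polling the queue file and skipping busy tabs) can be sketched as two pure functions. This is an illustration under assumptions, not the code in `sidebar-agent.ts`: the `QueueEntry` fields and both function names are hypothetical, and only `lastLine` and `processingTabs` are taken from this doc.

```typescript
// Hypothetical shape of one JSONL queue line; the real entry has more fields.
interface QueueEntry {
  message: string;
  tabId: number;
  model?: string;
}

// Returns entries appended since `lastLine`, plus the updated line count.
function readNewEntries(
  fileContent: string,
  lastLine: number,
): { entries: QueueEntry[]; lastLine: number } {
  const lines = fileContent.split("\n").filter((line) => line.trim() !== "");
  const entries: QueueEntry[] = [];
  for (const line of lines.slice(lastLine)) {
    try {
      entries.push(JSON.parse(line) as QueueEntry);
    } catch {
      // Skip a malformed line rather than crash the poller.
    }
  }
  return { entries, lastLine: lines.length };
}

// Keeps only entries whose tab has no agent running, and marks those tabs busy.
function selectRunnable(entries: QueueEntry[], processingTabs: Set<number>): QueueEntry[] {
  const runnable: QueueEntry[] = [];
  for (const entry of entries) {
    if (processingTabs.has(entry.tabId)) continue;
    processingTabs.add(entry.tabId);
    runnable.push(entry);
  }
  return runnable;
}
```

A real poller would call `readNewEntries` every 200ms with the file's current contents and remove a tab from `processingTabs` when its `claude` process exits.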
+## Arrow Hint Hide Flow (4-step signal chain)
+
+The welcome page shows a right-pointing arrow until the sidebar opens.
+
+```
+1. sidepanel.js updateConnection()
+   └── chrome.runtime.sendMessage({ type: 'sidebarOpened' })
+
+2. background.js
+   └── chrome.tabs.sendMessage(activeTabId, { type: 'sidebarOpened' })
+
+3. content.js onMessage handler
+   └── document.dispatchEvent(new CustomEvent('gstack-extension-ready'))
+
+4. welcome.html script
+   └── addEventListener('gstack-extension-ready', () => arrow.classList.add('hidden'))
+```
+
+The arrow does NOT hide when the extension loads. Only when the sidebar connects.
+
+## Auth Token Flow
+
+```
+Server starts → AUTH_TOKEN = crypto.randomUUID()
+    │
+    ├── GET /health (no auth) → returns { token: AUTH_TOKEN }
+    │
+    ├── background.js checkHealth() → authToken = data.token
+    │     └── Refreshes on EVERY health poll (fixes stale token on restart)
+    │
+    ├── sidepanel.js tryConnect() → serverToken from background or /health
+    │     └── Used for chat polling: Authorization: Bearer ${serverToken}
+    │
+    └── sidebar-agent.ts refreshToken() → reads from .gstack/browse.json
+          └── Used for event relay: Authorization: Bearer ${authToken}
+```
+
+If the server restarts, all three components get fresh tokens within 10s
+(background health poll interval).
+
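The "refresh on every health poll" rule can be sketched as a small holder that overwrites its cached token each time `/health` answers. The class and method names are hypothetical; only the `{ token }` response shape and the `Authorization: Bearer` header come from this doc.

```typescript
// Illustrative token cache; not the actual background.js implementation.
class TokenHolder {
  private token: string | null = null;

  // Call with the parsed /health body on every poll, or null if the poll failed.
  // Overwriting unconditionally is what fixes the stale token after a restart.
  onHealthPoll(body: { token: string } | null): void {
    this.token = body ? body.token : null;
  }

  // Headers for authenticated requests; empty while no token is known.
  authHeaders(): Record<string, string> {
    return this.token ? { Authorization: `Bearer ${this.token}` } : {};
  }
}
```

Dropping the token when a poll fails is an assumption made here so that a stale value is never sent; the real extension may keep the last known token instead.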
+## Model Routing
+
+`pickSidebarModel(message)` in server.ts classifies messages:
+
+| Pattern | Model | Why |
+|---------|-------|-----|
+| "click @e24", "go to hn", "screenshot" | sonnet | Deterministic tool calls, no thinking needed |
+| "what does this page say?", "summarize" | opus | Needs comprehension |
+| "find bugs", "check for broken links" | opus | Analysis task |
+| "navigate to X and fill the form" | sonnet | Action-oriented, no analysis words |
+
+Analysis words (`what`, `why`, `how`, `summarize`, `describe`, `analyze`, `read X and Y`)
+always override action verbs and force opus.
+
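A rough sketch of the routing rule, assuming simple regex matching. The analysis words are the ones listed above; the action-verb list, the default to opus for unrecognized messages, and the name `pickSidebarModelSketch` are assumptions for illustration, since the real `pickSidebarModel` is not reproduced in this doc.

```typescript
type SidebarModel = "sonnet" | "opus";

// Analysis words from the doc, plus a loose "read X and Y" pattern.
const ANALYSIS = /\b(what|why|how|summarize|describe|analyze)\b|\bread\b.+\band\b/i;

// Hypothetical action-verb list; the real classifier's list may differ.
const ACTION = /^\s*(click|go to|goto|navigate|screenshot|fill|type|scroll|open)\b/i;

function pickSidebarModelSketch(message: string): SidebarModel {
  if (ANALYSIS.test(message)) return "opus"; // analysis words always override action verbs
  if (ACTION.test(message)) return "sonnet"; // deterministic tool calls
  return "opus"; // assumption: unknown intent goes to the stronger model
}
```

Checking the analysis pattern first is what makes a message like "click the button and describe what happened" route to opus even though it starts with an action verb.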
+## Known Failure Modes
+
+| Failure | Symptom | Root Cause | Fix |
+|---------|---------|------------|-----|
+| Stale auth token | "Unauthorized" in input | Server restarted, background had old token | background.js refreshes token on every health poll |
+| Tab ID mismatch | Message sent, no response visible | Server assigned tabId 1, sidebar polling tabId 0 | switchChatTab preserves optimistic UI during switch |
+| Sidebar agent not running | Messages queue forever | Agent process failed to spawn or crashed | Check `ps aux \| grep sidebar-agent` |
+| Agent stale token | Agent runs but no events appear in sidebar | sidebar-agent has old token from .gstack/browse.json | Agent re-reads token before each event POST |
+| Queue file missing | spawnClaude fails | Race between server start and agent start | Both sides create file if missing |
+| Optimistic UI blown away | User bubble + dots vanish | switchChatTab replaced DOM with welcome screen | Preserved DOM when lastOptimisticMsg is set |
+
+## Per-Tab Concurrency
+
+Each browser tab can run its own agent simultaneously:
+
+- Server: `tabAgents: Map<number, TabAgentState>` with per-tab queue (max 5)
+- sidebar-agent: `processingTabs: Set<number>` prevents duplicate spawns
+- Two messages on same tab: queued sequentially, processed in order
+- Two messages on different tabs: run concurrently
+
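The per-tab rules above can be sketched as a small state machine. `TabAgentState` and the limit of 5 come from this doc; the state fields, the `TabAgents` class, and rejecting messages once a tab's queue is full are assumptions for illustration.

```typescript
const MAX_QUEUE_PER_TAB = 5; // "max 5" from the doc

// Hypothetical fields; the real TabAgentState is not shown in this doc.
interface TabAgentState {
  status: "idle" | "processing";
  queue: string[];
}

class TabAgents {
  private tabs = new Map<number, TabAgentState>();

  // Starts immediately on an idle tab, otherwise queues behind the running agent.
  submit(tabId: number, message: string): "started" | "queued" | "rejected" {
    const state: TabAgentState = this.tabs.get(tabId) ?? { status: "idle", queue: [] };
    this.tabs.set(tabId, state);
    if (state.status === "idle") {
      state.status = "processing";
      return "started";
    }
    if (state.queue.length >= MAX_QUEUE_PER_TAB) return "rejected"; // assumed overflow policy
    state.queue.push(message);
    return "queued";
  }

  // Called on agent_done: returns the next queued message for that tab, if any.
  onAgentDone(tabId: number): string | null {
    const state = this.tabs.get(tabId);
    if (!state) return null;
    const next = state.queue.shift() ?? null;
    state.status = next === null ? "idle" : "processing";
    return next;
  }
}
```

Because state is keyed by tab ID, two tabs never block each other, while messages on the same tab are handed out strictly in arrival order.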
+## File Locations
+
+| Component | File | Runs in |
+|-----------|------|---------|
+| Sidebar UI | `extension/sidepanel.js` | Chrome side panel |
+| Service worker | `extension/background.js` | Chrome background |
+| Content script | `extension/content.js` | Page context |
+| Welcome page | `browse/src/welcome.html` | Page context |
+| HTTP server | `browse/src/server.ts` | Bun (compiled binary) |
+| Agent process | `browse/src/sidebar-agent.ts` | Bun (non-compiled, can spawn) |
+| CLI entry | `browse/src/cli.ts` | Bun (compiled binary) |
+| Queue file | `~/.gstack/sidebar-agent-queue.jsonl` | Filesystem |
+| State file | `.gstack/browse.json` | Filesystem |
+| Chat log | `~/.gstack/sessions/<id>/chat.jsonl` | Filesystem |
@@ -0,0 +1,290 @@
+# Slate Host Integration — Research & Design Doc
+
+**Date:** 2026-04-02
+**Branch:** garrytan/slate-agent-support
+**Status:** Research complete, blocked on host config refactor
+**Supersedes:** None
+
+## What is Slate
+
+Slate is a proprietary coding agent CLI from Random Labs.
+Install: `npm i -g @randomlabs/slate` or `brew install anthropic/tap/slate`.
+License: Proprietary. 85MB compiled Bun binary (arm64/x64, darwin/linux/windows).
+npm package: `@randomlabs/slate@1.0.25` (thin 8.8KB launcher + platform-specific optional deps).
+
+Multi-model: dynamically selects Claude Sonnet/Opus/Haiku, plus other models.
+Built for "swarm orchestration" with extended multi-hour sessions.
+
+## Slate is an OpenCode fork
+
+**Confirmed via binary strings analysis** of the 85MB Mach-O arm64 binary:
+
+- Internal name: `name: "opencode"` (literal string in binary)
+- All `OPENCODE_*` env vars present alongside `SLATE_*` equivalents
+- Shares OpenCode's tool/skill architecture, LSP integration, terminal management
+- Own branding, API endpoints (`api.randomlabs.ai`, `agent-worker-prod.randomlabs.workers.dev`), and config paths
+
+This matters for integration: OpenCode conventions mostly apply, but Slate adds
+its own paths and env vars on top.
+
+## Skill Discovery (confirmed from binary)
+
+Slate scans ALL four directory families for skills. Error messages in binary confirm:
+
+```
+"failed .slate directory scan for skills"
+"failed .claude directory scan for skills"
+"failed .agents directory scan for skills"
+"failed .opencode directory scan for skills"
+```
+
+**Discovery paths (priority order from Slate docs):**
+
+1. `.slate/skills/<name>/SKILL.md` — project-level, highest priority
+2. `~/.slate/skills/<name>/SKILL.md` — global
+3. `.opencode/skills/`, `.agents/skills/` — compatibility fallback
+4. `.claude/skills/` — Claude Code compatibility fallback (lowest)
+5. Custom paths via `slate.json`
+
+**Glob patterns:** `**/SKILL.md` and `{skill,skills}/**/SKILL.md`
+
+**Commands:** Same directory structure but under `commands/` subdirs:
+`/.slate/commands/`, `/.claude/commands/`, `/.agents/commands/`, `/.opencode/commands/`
+
+**Skill frontmatter:** YAML with `name` and `description` fields (per Slate docs).
+No documented length limits on either field.
+
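The priority order above amounts to a first-match lookup, sketched below. The root list mirrors the documented order; the relative order of `.opencode` and `.agents` (listed together in the doc), the omission of custom `slate.json` paths, and the name `resolveSkill` are assumptions, and this is not Slate's actual resolver.

```typescript
// Priority order from the doc, highest first. "~" stands for the user's home directory.
const SKILL_ROOTS = [
  ".slate/skills",
  "~/.slate/skills",
  ".opencode/skills",
  ".agents/skills",
  ".claude/skills",
];

// Returns the highest-priority SKILL.md path for `name`, or null if none exists.
// `exists` is injected so the lookup can be checked without touching the filesystem.
function resolveSkill(name: string, exists: (path: string) => boolean): string | null {
  for (const root of SKILL_ROOTS) {
    const candidate = `${root}/${name}/SKILL.md`;
    if (exists(candidate)) return candidate;
  }
  return null;
}
```

Under this model a skill published only to `.claude/skills/` is still found, but a copy in `.slate/skills/` shadows it.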
+## Project Instructions
+
+Slate reads both `CLAUDE.md` and `AGENTS.md` for project instructions.
+Both literal strings are confirmed in the binary. No changes are needed in
+existing gstack projects: CLAUDE.md works as-is.
+
+## Configuration
+
+**Config file:** `slate.json` / `slate.jsonc` (NOT opencode.json)
+
+**Config options (from Slate docs):**
+- `privacy` (boolean) — disables telemetry/logging
+- Permissions: `allow`, `ask`, `deny` per tool (`read`, `edit`, `bash`, `grep`, `webfetch`, `websearch`, `*`)
+- Model slots: `models.main`, `models.subagent`, `models.search`, `models.reasoning`
+- MCP servers: local or remote with custom commands and headers
+- Custom commands: `/commands` with templates
+
+The setup script should NOT create `slate.json`. Users configure their own permissions.
+
+## CLI Flags (Headless Mode)
+
+```
+--stream-json / --output-format stream-json  — JSONL output, "compatible with Anthropic Claude Code SDK"
+--dangerously-skip-permissions               — bypass all permission checks (CI/automation)
+--input-format stream-json                   — programmatic input
+-q                                           — non-interactive mode
+-w <dir>                                     — workspace directory
+--output-format text                         — plain text output (default)
+```
+
+**Stream-JSON format:** Slate docs claim "compatible with Anthropic Claude Code SDK."
+Not yet empirically verified. Given OpenCode heritage, likely matches Claude Code's
+NDJSON event schema (type: "assistant", type: "tool_result", type: "result").
+
+**Need to verify:** Run `slate -q "hello" --stream-json` with valid credits and
+capture actual JSONL events before building the session runner parser.
+
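Until the real output is captured, a session runner parser could stay tolerant of unknown events. The sketch below assumes only that each line is a JSON object with a string `type` field; it does not assume any particular event schema, and `parseStreamJson` is a hypothetical name.

```typescript
interface StreamEvent {
  type: string;
  [key: string]: unknown;
}

// Splits JSONL output into typed events and lines that could not be interpreted.
function parseStreamJson(output: string): { events: StreamEvent[]; unparsed: string[] } {
  const events: StreamEvent[] = [];
  const unparsed: string[] = [];
  for (const line of output.split("\n")) {
    if (line.trim() === "") continue;
    try {
      const value: unknown = JSON.parse(line);
      if (
        typeof value === "object" &&
        value !== null &&
        typeof (value as { type?: unknown }).type === "string"
      ) {
        events.push(value as StreamEvent);
        continue;
      }
    } catch {
      // Not JSON: fall through and record the raw line.
    }
    unparsed.push(line);
  }
  return { events, unparsed };
}
```

Collecting `unparsed` lines instead of throwing makes it easy to see where Slate's actual format diverges from Claude Code's once real output is available.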
+## Environment Variables (from binary strings)
+
+### Slate-specific
+```
+SLATE_API_KEY                              — API key
+SLATE_AGENT                                — agent selection
+SLATE_AUTO_SHARE                           — auto-share setting
+SLATE_CLIENT                               — client identifier
+SLATE_CONFIG                               — config override
+SLATE_CONFIG_CONTENT                       — inline config
+SLATE_CONFIG_DIR                           — config directory
+SLATE_DANGEROUSLY_SKIP_PERMISSIONS         — bypass permissions
+SLATE_DIR                                  — data directory override
+SLATE_DISABLE_AUTOUPDATE                   — disable auto-update
+SLATE_DISABLE_CLAUDE_CODE                  — disable Claude Code integration entirely
+SLATE_DISABLE_CLAUDE_CODE_PROMPT           — disable Claude Code prompt loading
+SLATE_DISABLE_CLAUDE_CODE_SKILLS           — disable .claude/skills/ loading
+SLATE_DISABLE_DEFAULT_PLUGINS              — disable default plugins
+SLATE_DISABLE_FILETIME_CHECK               — disable file time checks
+SLATE_DISABLE_LSP_DOWNLOAD                 — disable LSP auto-download
+SLATE_DISABLE_MODELS_FETCH                 — disable models config fetch
+SLATE_DISABLE_PROJECT_CONFIG               — disable project-level config
+SLATE_DISABLE_PRUNE                        — disable session pruning
+SLATE_DISABLE_TERMINAL_TITLE               — disable terminal title updates
+SLATE_ENABLE_EXA                           — enable Exa search
+SLATE_ENABLE_EXPERIMENTAL_MODELS           — enable experimental models
+SLATE_EXPERIMENTAL                         — enable experimental features
+SLATE_EXPERIMENTAL_BASH_DEFAULT_TIMEOUT_MS — bash timeout override
+SLATE_EXPERIMENTAL_DISABLE_COPY_ON_SELECT  — disable copy on select
+SLATE_EXPERIMENTAL_DISABLE_FILEWATCHER     — disable file watcher
+SLATE_EXPERIMENTAL_EXA                     — Exa search (alt flag)
+SLATE_EXPERIMENTAL_FILEWATCHER             — enable file watcher
+SLATE_EXPERIMENTAL_ICON_DISCOVERY          — icon discovery
+SLATE_EXPERIMENTAL_LSP_TOOL                — LSP tool
+SLATE_EXPERIMENTAL_LSP_TY                  — LSP type checking
+SLATE_EXPERIMENTAL_MARKDOWN                — markdown mode
+SLATE_EXPERIMENTAL_OUTPUT_TOKEN_MAX        — output token limit
+SLATE_EXPERIMENTAL_OXFMT                   — oxfmt integration
+SLATE_EXPERIMENTAL_PLAN_MODE               — plan mode
+SLATE_FAKE_VCS                             — fake VCS for testing
+SLATE_GIT_BASH_PATH                        — git bash path (Windows)
+SLATE_MODELS_URL                           — models config URL
+SLATE_PERMISSION                           — permission override
+SLATE_SERVER_PASSWORD                      — server auth
+SLATE_SERVER_USERNAME                      — server auth
+SLATE_TELEMETRY_DISABLED                   — disable telemetry
+SLATE_TEST_HOME                            — test home directory
+SLATE_TOKEN_DIR                            — token storage directory
+```
+
+### OpenCode legacy (still functional)
+```
+OPENCODE_DISABLE_LSP_DOWNLOAD
+OPENCODE_EXPERIMENTAL_DISABLE_FILEWATCHER
+OPENCODE_EXPERIMENTAL_FILEWATCHER
+OPENCODE_EXPERIMENTAL_ICON_DISCOVERY
+OPENCODE_EXPERIMENTAL_LSP_TY
+OPENCODE_EXPERIMENTAL_OXFMT
+OPENCODE_FAKE_VCS
+OPENCODE_GIT_BASH_PATH
+OPENCODE_LIBC
+OPENCODE_TERMINAL
+```
+
+### Critical env vars for gstack integration
+
+**`SLATE_DISABLE_CLAUDE_CODE_SKILLS`** — When set, `.claude/skills/` loading is disabled.
+This makes publishing to `.slate/skills/` load-bearing, not just an optimization.
+Without native `.slate/` publishing, gstack skills vanish when this flag is set.
+
+**`SLATE_TEST_HOME`** — Useful for E2E tests. Can redirect Slate's home directory
+to an isolated temp directory, similar to how Codex tests use a temp HOME.
+
+**`SLATE_DANGEROUSLY_SKIP_PERMISSIONS`** — Required for headless E2E tests.
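+
+A minimal sketch of how an E2E harness might combine these three variables, assuming each
+flag is enabled by setting it to `1`. Only `SLATE_DISABLE_CLAUDE_CODE_SKILLS=1` appears in
+this plan; the binary strings confirm the names, not the accepted values. `slateTestEnv`
+and its parameters are hypothetical names, not existing gstack code.
+
+```typescript
+// Hypothetical helper: builds the environment for one isolated Slate E2E run.
+// The "1" values are an assumption; verify them against a real Slate binary.
+export function slateTestEnv(
+  baseEnv: Record<string, string | undefined>,
+  testHome: string,
+): Record<string, string | undefined> {
+  return {
+    ...baseEnv,
+    // Redirect Slate's home directory to a throwaway location.
+    SLATE_TEST_HOME: testHome,
+    // Headless runs cannot answer permission prompts.
+    SLATE_DANGEROUSLY_SKIP_PERMISSIONS: "1",
+    // Force the `.slate/skills/` path so a `.claude/skills/` fallback cannot mask a failure.
+    SLATE_DISABLE_CLAUDE_CODE_SKILLS: "1",
+  };
+}
+
+// Example: derive a test environment from a small base environment.
+const exampleEnv = slateTestEnv({ PATH: "/usr/bin" }, "/tmp/slate-e2e-home");
+```
+
+Taking the base environment as a parameter, rather than reading `process.env` inside the
+helper, keeps the function pure and easy to test.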
+
+## Model References (from binary)
+
+```
+anthropic/claude-sonnet-4.6
+anthropic/claude-opus-4
+anthropic/claude-haiku-4
+anthropic/slate              — Slate's own model routing
+openai/gpt-5.3-codex
+google/nano-banana
+randomlabs/fast-default-alpha
+```
+
+## API Endpoints (from binary)
+
+```
+https://api.randomlabs.ai                          — main API
+https://api.randomlabs.ai/exaproxy                 — Exa search proxy
+https://agent-worker-prod.randomlabs.workers.dev   — production worker
+https://agent-worker-dev.randomlabs.workers.dev    — dev worker
+https://dashboard.randomlabs.ai                    — dashboard
+https://docs.randomlabs.ai                         — documentation
+https://randomlabs.ai/config.json                  — remote config
+```
+
+Brew tap: `anthropic/tap/slate` (notable: under Anthropic's tap, not Random Labs')
+
+## npm Package Structure
+
+```
+@randomlabs/slate (8.8 kB, thin launcher)
+├── bin/slate           — Node.js launcher (finds platform binary in node_modules)
+├── bin/slate1          — Bun launcher (same logic, import.meta.filename)
+├── postinstall.mjs     — Verifies platform binary exists, symlinks if needed
+└── package.json        — Declares optionalDependencies for all platforms
+
+Platform packages (85MB each):
+├── @randomlabs/slate-darwin-arm64
+├── @randomlabs/slate-darwin-x64
+├── @randomlabs/slate-linux-arm64
+├── @randomlabs/slate-linux-x64
+├── @randomlabs/slate-linux-x64-musl
+├── @randomlabs/slate-linux-arm64-musl
+├── @randomlabs/slate-linux-x64-baseline
+├── @randomlabs/slate-linux-x64-baseline-musl
+├── @randomlabs/slate-darwin-x64-baseline
+├── @randomlabs/slate-windows-x64
+└── @randomlabs/slate-windows-x64-baseline
+```
+
+Binary override: the `SLATE_BIN_PATH` env var skips all discovery and runs the specified binary directly.
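+
+The discovery order can be sketched as follows. This is an illustration built from the
+package list above, not the launcher's actual code: `resolveSlatePackage`, `locateSlate`,
+and the `musl` / `baseline` options are hypothetical, and how the real launcher detects
+musl or baseline CPUs is not known from this analysis.
+
+```typescript
+// Package names copied from the platform list above.
+const PLATFORM_PACKAGES = new Set<string>([
+  "@randomlabs/slate-darwin-arm64",
+  "@randomlabs/slate-darwin-x64",
+  "@randomlabs/slate-linux-arm64",
+  "@randomlabs/slate-linux-x64",
+  "@randomlabs/slate-linux-x64-musl",
+  "@randomlabs/slate-linux-arm64-musl",
+  "@randomlabs/slate-linux-x64-baseline",
+  "@randomlabs/slate-linux-x64-baseline-musl",
+  "@randomlabs/slate-darwin-x64-baseline",
+  "@randomlabs/slate-windows-x64",
+  "@randomlabs/slate-windows-x64-baseline",
+]);
+
+export interface ResolveOptions {
+  musl?: boolean;     // hypothetical: caller has already detected musl libc
+  baseline?: boolean; // hypothetical: caller wants the baseline build
+}
+
+// Maps Node's platform/arch strings to a package name, or null if none is listed.
+export function resolveSlatePackage(
+  platform: string,
+  arch: string,
+  opts: ResolveOptions = {},
+): string | null {
+  const os = platform === "win32" ? "windows" : platform;
+  let candidate = `@randomlabs/slate-${os}-${arch}`;
+  if (opts.baseline) candidate += "-baseline";
+  if (opts.musl) candidate += "-musl";
+  return PLATFORM_PACKAGES.has(candidate) ? candidate : null;
+}
+
+export type SlateBinary =
+  | { kind: "override"; path: string }
+  | { kind: "package"; name: string };
+
+// SLATE_BIN_PATH wins outright; otherwise fall back to the platform package.
+export function locateSlate(
+  env: Record<string, string | undefined>,
+  platform: string,
+  arch: string,
+  opts: ResolveOptions = {},
+): SlateBinary | null {
+  const override = env["SLATE_BIN_PATH"];
+  if (override) return { kind: "override", path: override };
+  const pkg = resolveSlatePackage(platform, arch, opts);
+  return pkg ? { kind: "package", name: pkg } : null;
+}
+```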
+
+## What Already Works Today
+
+gstack skills already work in Slate via the `.claude/skills/` fallback path.
+No changes needed for basic functionality. Users who install gstack for Claude Code
+and also use Slate will find their skills available in both agents.
+
+## What First-Class Support Adds
+
+1. **Reliability** — `.slate/skills/` is Slate's highest-priority path. Immune to
+   `SLATE_DISABLE_CLAUDE_CODE_SKILLS`.
+2. **Optimized frontmatter** — Strip Claude-specific fields (`allowed-tools`, `hooks`, `version`)
+   that Slate doesn't use. Keep only `name` and `description`.
+3. **Setup script** — Auto-detect `slate` binary, install skills to `~/.slate/skills/`.
+4. **E2E tests** — Verify skills work when invoked by Slate directly.
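+
+The frontmatter step (item 2) could look roughly like the sketch below. It assumes skill
+files open with a `---`-delimited YAML block whose top-level keys start at column 0, and it
+filters lines rather than parsing YAML. `slimFrontmatter` is a hypothetical name; gstack's
+real `transformFrontmatter()` may work differently.
+
+```typescript
+// Fields Slate is described as using; everything else is dropped.
+const KEEP_FIELDS = new Set<string>(["name", "description"]);
+
+// Returns the skill file with only the kept top-level frontmatter fields.
+// Files without a well-formed frontmatter block are returned unchanged.
+export function slimFrontmatter(skillMd: string): string {
+  const lines = skillMd.split("\n");
+  if (lines[0] !== "---") return skillMd;
+  const end = lines.indexOf("---", 1);
+  if (end === -1) return skillMd;
+
+  const kept: string[] = [];
+  let keeping = false;
+  for (const line of lines.slice(1, end)) {
+    const key = /^([A-Za-z0-9_-]+):/.exec(line);
+    // A new top-level key decides whether its block is kept.
+    if (key) keeping = KEEP_FIELDS.has(key[1] ?? "");
+    // Indented continuation lines follow the decision made for their key.
+    if (keeping) kept.push(line);
+  }
+  return ["---", ...kept, "---", ...lines.slice(end + 1)].join("\n");
+}
+```
+
+Line filtering keeps multi-line values such as `description: |` blocks intact without
+pulling in a YAML dependency, at the cost of not handling unusual YAML layouts.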
+
+## Blocked On: Host Config Refactor
+
+Codex's outside voice review flagged adding Slate as a 4th host (after Claude,
+Codex, Factory) as "host explosion for a path alias." The current architecture has:
+
+- Hard-coded host names in `type Host = 'claude' | 'codex' | 'factory'`
+- Per-host branches in `transformFrontmatter()` with near-duplicate logic
+- Per-host config in `EXTERNAL_HOST_CONFIG` with similar patterns
+- Per-host functions in the setup script (`create_codex_runtime_root`, `link_codex_skill_dirs`)
+- Host names duplicated in `bin/gstack-platform-detect`, `bin/gstack-uninstall`, `bin/dev-setup`
+
+Adding Slate means copying all of these patterns again. A refactor to make hosts
+data-driven (config objects instead of if/else branches) would make Slate integration
+trivial AND make future hosts (any new OpenCode fork, any new agent) zero-effort.
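+
+As a rough illustration of what "data-driven" could mean here, the sketch below replaces
+per-host branches with one config table and one generic function. Everything in it is
+hypothetical: the `HostConfig` shape, the field names, and `transformForHost` are not
+gstack's actual types, and only the Claude and Slate paths come from this document.
+
+```typescript
+// Hypothetical per-host config; one entry replaces one set of if/else branches.
+interface HostConfig {
+  skillsDir: string;                          // where skills are published
+  keepFrontmatter: readonly string[] | "all"; // frontmatter fields the host uses
+}
+
+const HOSTS: Record<string, HostConfig> = {
+  claude: { skillsDir: ".claude/skills", keepFrontmatter: "all" },
+  slate: { skillsDir: ".slate/skills", keepFrontmatter: ["name", "description"] },
+  // A new host would be one more entry here, with no new code paths.
+};
+
+// Generic replacement for per-host frontmatter branches.
+export function transformForHost(
+  fields: Record<string, unknown>,
+  host: string,
+): Record<string, unknown> {
+  const config = HOSTS[host];
+  if (!config) throw new Error(`unknown host: ${host}`);
+  if (config.keepFrontmatter === "all") return { ...fields };
+
+  const result: Record<string, unknown> = {};
+  for (const key of Object.keys(fields)) {
+    if (config.keepFrontmatter.indexOf(key) !== -1) result[key] = fields[key];
+  }
+  return result;
+}
+```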
+
+### Missing from the plan (identified by Codex)
+
+- `lib/worktree.ts` only copies `.agents/`, not `.slate/` — E2E tests in worktrees won't
+  have Slate skills
+- `bin/gstack-uninstall` doesn't know about `.slate/`
+- `bin/dev-setup` doesn't wire `.slate/` for contributor dev mode
+- `bin/gstack-platform-detect` doesn't detect Slate
+- E2E tests should set `SLATE_DISABLE_CLAUDE_CODE_SKILLS=1` to prove `.slate/` path
+  actually works (not just falling back to `.claude/`)
+
+## Session Runner Design (for later)
+
+When the JSONL format is verified, the session runner should:
+
+- Spawn: `slate -q "<prompt>" --stream-json --dangerously-skip-permissions -w <dir>`
+- Parse: Claude Code SDK-compatible JSONL, one JSON event per line (assumed, needs verification)
+- Skills: Install to `.slate/skills/` in test fixture (not `.claude/skills/`)
+- Auth: Use `SLATE_API_KEY` or existing `~/.slate/` credentials
+- Isolation: Use `SLATE_TEST_HOME` for home directory isolation
+- Timeout: 300s default (same as Codex)
+
+```typescript
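+/** Result of one headless Slate run, as collected by the session runner. */
+export interface SlateResult {
+  output: string;           // agent's text output
+  toolCalls: string[];      // tool calls observed in the event stream
+  tokens: number;           // token usage reported for the run
+  exitCode: number;         // exit code of the `slate` process
+  durationMs: number;       // wall-clock duration in milliseconds
+  sessionId: string | null; // null when no session ID was reported
+  rawLines: string[];       // unparsed stdout lines, kept for debugging
+  stderr: string;           // captured stderr
+}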
+```
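+
+Until the event format is verified, the runner can still be built around two
+format-agnostic pieces, sketched below. The argument list mirrors the spawn line above.
+`buildSlateArgs`, `parseJsonLines`, and `ParsedStream` are hypothetical names, and the
+parser deliberately makes no assumption about event field names.
+
+```typescript
+// Argument vector for the spawn line above (flags taken from this plan).
+export function buildSlateArgs(prompt: string, workDir: string): string[] {
+  return ["-q", prompt, "--stream-json", "--dangerously-skip-permissions", "-w", workDir];
+}
+
+export interface ParsedStream {
+  rawLines: string[]; // every non-empty stdout line, in order
+  events: unknown[];  // lines that parsed as JSON
+  unparsed: string[]; // lines that did not parse, kept for inspection
+}
+
+// Splits stdout into lines and parses each as JSON, tolerating non-JSON noise.
+export function parseJsonLines(stdout: string): ParsedStream {
+  const rawLines = stdout.split(/\r?\n/).filter((line) => line.trim() !== "");
+  const events: unknown[] = [];
+  const unparsed: string[] = [];
+  for (const line of rawLines) {
+    try {
+      events.push(JSON.parse(line));
+    } catch {
+      unparsed.push(line);
+    }
+  }
+  return { rawLines, events, unparsed };
+}
+```
+
+Keeping `unparsed` separate makes it easy to spot, on the first real run, whether Slate
+interleaves plain-text output with its JSON events.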
+
+## Docs References
+
+- Slate docs: https://docs.randomlabs.ai
+- Quickstart: https://docs.randomlabs.ai/en/getting-started/quickstart
+- Skills: https://docs.randomlabs.ai/en/using-slate/skills
+- Configuration: https://docs.randomlabs.ai/en/using-slate/configuration
+- Hotkeys: https://docs.randomlabs.ai/en/using-slate/hotkey_reference