diff --git a/README.md b/README.md index 6b7c5f07..24bbe8b9 100644 --- a/README.md +++ b/README.md @@ -183,7 +183,7 @@ Each skill feeds into the next. `/office-hours` writes a design doc that `/plan- | `/benchmark` | **Performance Engineer** | Baseline page load times, Core Web Vitals, and resource sizes. Compare before/after on every PR. | | `/document-release` | **Technical Writer** | Update all project docs to match what you just shipped. Catches stale READMEs automatically. | | `/retro` | **Eng Manager** | Team-aware weekly retro. Per-person breakdowns, shipping streaks, test health trends, growth opportunities. `/retro global` runs across all your projects and AI tools (Claude Code, Codex, Gemini). | -| `/browse` | **QA Engineer** | Give the agent eyes. Real Chromium browser, real clicks, real screenshots. ~100ms per command. `$B connect` launches your real Chrome as a headed window — watch every action live. | +| `/browse` | **QA Engineer** | Give the agent eyes. Real Chromium browser, real clicks, real screenshots. ~100ms per command. `/open-gstack-browser` launches GStack Browser with sidebar, anti-bot stealth, and auto model routing. | | `/setup-browser-cookies` | **Session Manager** | Import cookies from your real browser (Chrome, Arc, Brave, Edge) into the headless session. Test authenticated pages. | | `/autoplan` | **Review Pipeline** | One command, fully reviewed plan. Runs CEO → design → eng review automatically with encoded decision principles. Surfaces only taste decisions for your approval. | | `/learn` | **Memory** | Manage what gstack learned across sessions. Review, search, prune, and export project-specific patterns, pitfalls, and preferences. Learnings compound across sessions so gstack gets smarter on your codebase over time. | @@ -206,7 +206,7 @@ Each skill feeds into the next. `/office-hours` writes a design doc that `/plan- | `/freeze` | **Edit Lock** — restrict file edits to one directory. Prevents accidental changes outside scope while debugging. | | `/guard` | **Full Safety** — `/careful` + `/freeze` in one command. Maximum safety for prod work. | | `/unfreeze` | **Unlock** — remove the `/freeze` boundary. | -| `/open-gstack-browser` | **GStack Browser** — launch GStack Browser with sidebar, anti-bot stealth, and Claude Code integration. Clean up pages, take smart screenshots, edit CSS, and pass info back to your terminal. | +| `/open-gstack-browser` | **GStack Browser** — launch GStack Browser with sidebar, anti-bot stealth, auto model routing (Sonnet for actions, Opus for analysis), one-click cookie import, and Claude Code integration. Clean up pages, take smart screenshots, edit CSS, and pass info back to your terminal. | | `/setup-deploy` | **Deploy Configurator** — one-time setup for `/land-and-deploy`. Detects your platform, production URL, and deploy commands. | | `/gstack-upgrade` | **Self-Updater** — upgrade gstack to latest. Detects global vs vendored install, syncs both, shows what changed. | @@ -226,11 +226,11 @@ gstack works well with one sprint. It gets interesting with ten running at once. **`/document-release` is the engineer you never had.** It reads every doc file in your project, cross-references the diff, and updates everything that drifted. README, ARCHITECTURE, CONTRIBUTING, CLAUDE.md, TODOS — all kept current automatically. And now `/ship` auto-invokes it — docs stay current without an extra command. -**Real browser mode.** `$B connect` launches your actual Chrome as a headed window controlled by Playwright. You watch Claude click, fill, and navigate in real time — same window, same screen. A subtle green shimmer at the top edge tells you which Chrome window gstack controls. All existing browse commands work unchanged. `$B disconnect` returns to headless. A Chrome extension Side Panel shows a live activity feed of every command and a chat sidebar where you can direct Claude. This is co-presence — Claude isn't remote-controlling a hidden browser, it's sitting next to you in the same cockpit. +**Real browser mode.** `/open-gstack-browser` launches GStack Browser, an AI-controlled Chromium with anti-bot stealth, custom branding, and the sidebar extension baked in. Sites like Google and NYTimes work without captchas. The menu bar says "GStack Browser" instead of "Chrome for Testing." Your regular Chrome stays untouched. All existing browse commands work unchanged. `$B disconnect` returns to headless. The browser stays alive as long as the window is open... no idle timeout killing it while you're working. -**Sidebar agent — your AI browser assistant.** Type natural language instructions in the Chrome side panel and a child Claude instance executes them. "Navigate to the settings page and screenshot it." "Fill out this form with test data." "Go through every item in this list and extract the prices." Each task gets up to 5 minutes. The sidebar agent runs in an isolated session, so it won't interfere with your main Claude Code window. It's like having a second pair of hands in the browser. +**Sidebar agent — your AI browser assistant.** Type natural language in the Chrome side panel and a child Claude instance executes it. "Navigate to the settings page and screenshot it." "Fill out this form with test data." "Go through every item in this list and extract the prices." The sidebar auto-routes to the right model: Sonnet for fast actions (click, navigate, screenshot) and Opus for reading and analysis. Each task gets up to 5 minutes. The sidebar agent runs in an isolated session, so it won't interfere with your main Claude Code window. One-click cookie import right from the sidebar footer. -**Personal automation.** The sidebar agent isn't just for dev workflows. Example: "Browse my kid's school parent portal and add all the other parents' names, phone numbers, and photos to my Google Contacts." Two ways to get authenticated: (1) log in once in the headed browser — your session persists, or (2) run `/setup-browser-cookies` to import cookies from your real Chrome. Once authenticated, Claude navigates the directory, extracts the data, and creates the contacts. +**Personal automation.** The sidebar agent isn't just for dev workflows. Example: "Browse my kid's school parent portal and add all the other parents' names, phone numbers, and photos to my Google Contacts." Two ways to get authenticated: (1) log in once in the headed browser, your session persists, or (2) click the "cookies" button in the sidebar footer to import cookies from your real Chrome. Once authenticated, Claude navigates the directory, extracts the data, and creates the contacts. **Browser handoff when the AI gets stuck.** Hit a CAPTCHA, auth wall, or MFA prompt? `$B handoff` opens a visible Chrome at the exact same page with all your cookies and tabs intact. Solve the problem, tell Claude you're done, `$B resume` picks up right where it left off. The agent even suggests it automatically after 3 consecutive failures. diff --git a/TODOS.md b/TODOS.md index a58e3d53..0e3ac932 100644 --- a/TODOS.md +++ b/TODOS.md @@ -199,16 +199,22 @@ Sidebar agent writes structured messages to `.context/sidebar-inbox/`. Workspace **Priority:** P3 **Depends on:** Headed mode (shipped) -### Sidebar agent needs Write tool + better error visibility +### Sidebar agent needs Write tool + better error visibility — SHIPPED **What:** Two issues with the sidebar agent (`sidebar-agent.ts`): (1) `--allowedTools` is hardcoded to `Bash,Read,Glob,Grep`, missing `Write`. Claude can't create files (like CSVs) when asked. (2) When Claude errors or returns empty, the sidebar UI shows nothing, just a green dot. No error message, no "I tried but failed", nothing. -**Why:** Users ask "write this to a CSV" and the sidebar silently can't. Then they think it's broken. The UI needs to surface errors visibly, and Claude needs the tools to actually do what's asked. +**Completed:** v0.15.4.0 (2026-04-04). Write tool added to allowedTools. 40+ empty catch blocks replaced with `[gstack sidebar]`, `[gstack bg]`, `[browse]`, `[sidebar-agent]` prefixed console logging across all 4 files (sidepanel.js, background.js, server.ts, sidebar-agent.ts). Error placeholder text now shows in red. Auth token stale-refresh bug fixed. -**Context:** `sidebar-agent.ts:163` hardcodes `--allowedTools`. The event relay (`handleStreamEvent`) handles `agent_done` and `agent_error` but the extension's sidepanel.js may not be rendering error states. The sidebar should show "Error: ..." or "Claude finished but produced no output" instead of staying on the green dot forever. +### Sidebar direct API calls (eliminate claude -p startup tax) -**Effort:** S (human: ~2h / CC: ~10min) -**Priority:** P1 +**What:** Each sidebar message spawns a fresh `claude -p` process (~2-3s cold start overhead). For "click @e24" that's absurd. Direct Anthropic API calls would be sub-second. + +**Why:** The `claude -p` startup cost is: process spawn (~100ms) + CLI init (~500ms-1s) + API connection (~200ms) + first token. Model routing (Sonnet for actions) helps but doesn't fix the CLI overhead. + +**Context:** `server.ts:spawnClaude()` builds args and writes to queue file. `sidebar-agent.ts:askClaude()` spawns `claude -p`. Replace with direct `fetch('https://api.anthropic.com/...')` with tool use. Requires `ANTHROPIC_API_KEY` accessible to the browse server. + +**Effort:** M (human: ~1 week / CC: ~30min) +**Priority:** P2 **Depends on:** None ### Chrome Web Store publishing diff --git a/docs/designs/GSTACK_BROWSER_V0.md b/docs/designs/GSTACK_BROWSER_V0.md index be67270c..7539336a 100644 --- a/docs/designs/GSTACK_BROWSER_V0.md +++ b/docs/designs/GSTACK_BROWSER_V0.md @@ -319,8 +319,14 @@ From DESIGN.md: | DMG packaging | **SHIPPED** | 189MB compressed | | `GSTACK_CHROMIUM_PATH` | **SHIPPED** | Custom Chromium binary support | | `BROWSE_EXTENSIONS_DIR` | **SHIPPED** | Extension path override | -| Auth via `/health` | **SHIPPED** | Replaces .auth.json file approach | +| Auth via `/health` | **SHIPPED** | Replaces .auth.json file approach, auto-refreshes on server restart | | Build script | **SHIPPED** | `scripts/build-app.sh` | +| Model routing | **SHIPPED** | Sonnet for actions, Opus for analysis (`pickSidebarModel`) | +| Debug logging | **SHIPPED** | 40+ silent catches → prefixed console logging across 4 files | +| No idle timeout (headed) | **SHIPPED** | Browser stays alive as long as window is open | +| Cookie import button | **SHIPPED** | One-click in sidebar footer, opens `/cookie-picker` | +| Sidebar arrow hint | **SHIPPED** | Points to sidebar, hides only when sidebar actually opens | +| Architecture doc | **SHIPPED** | `docs/designs/SIDEBAR_MESSAGE_FLOW.md` | | Command palette | Planned | Phase 1b | | Quick screenshot | Planned | Phase 1b | | Status bar | Planned | Phase 1b | diff --git a/docs/skills.md b/docs/skills.md index 6c7679da..d93800a3 100644 --- a/docs/skills.md +++ b/docs/skills.md @@ -36,7 +36,7 @@ Detailed guides for every gstack skill — philosophy, workflow, and examples. | [`/freeze`](#safety--guardrails) | **Edit Lock** | Restrict all file edits to a single directory. Blocks Edit and Write outside the boundary. Accident prevention for debugging. | | [`/guard`](#safety--guardrails) | **Full Safety** | Combines /careful + /freeze in one command. Maximum safety for prod work. | | [`/unfreeze`](#safety--guardrails) | **Unlock** | Remove the /freeze boundary, allowing edits everywhere again. | -| [`/open-gstack-browser`](#open-gstack-browser) | **GStack Browser** | Launch GStack Browser with sidebar, anti-bot stealth, and Claude Code integration. Watch every action live. | +| [`/open-gstack-browser`](#open-gstack-browser) | **GStack Browser** | Launch GStack Browser with sidebar, anti-bot stealth, auto model routing, cookie import, and Claude Code integration. Watch every action live. | | [`/setup-deploy`](#setup-deploy) | **Deploy Configurator** | One-time setup for `/land-and-deploy`. Detects your platform, production URL, and deploy commands. | | [`/gstack-upgrade`](#gstack-upgrade) | **Self-Updater** | Upgrade gstack to the latest version. Detects global vs vendored install, syncs both, shows what changed. | @@ -961,7 +961,7 @@ This is my **co-presence mode**. `/browse` runs headless by default. You don't see what the agent sees. `/open-gstack-browser` changes that. It launches GStack Browser (rebranded Chromium with anti-bot stealth) controlled by Playwright, with the sidebar extension auto-loaded. You watch every action in real time. -The sidebar chat is a Claude instance that controls the browser. It can navigate, click, fill forms, clean up pages, take smart screenshots, edit CSS and DOM, and pass everything back to your Claude Code terminal. The menu bar says "GStack Browser" instead of "Chrome for Testing." +The sidebar chat is a Claude instance that controls the browser. It auto-routes to the right model: Sonnet for navigation and actions (click, goto, fill, screenshot), Opus for reading and analysis (summarize, find bugs, describe). One-click cookie import from the sidebar footer. The browser stays alive as long as the window is open... no idle timeout in headed mode. The menu bar says "GStack Browser" instead of "Chrome for Testing." ``` You: /open-gstack-browser @@ -969,6 +969,7 @@ You: /open-gstack-browser Claude: Launched GStack Browser with sidebar extension. Anti-bot stealth active. All $B commands run in headed mode. Type in the sidebar to direct the browser agent. + Sidebar model routing: sonnet for actions, opus for analysis. ``` ---