* fix: replace find-browse with direct path in SKILL.md setup blocks
Agents were skipping the find-browse binary and guessing bin/browse
(wrong path). Now the setup block explicitly checks browse/dist/browse
with workspace-local priority, global fallback.
Also adds || true to update check to prevent misleading exit code 1.
Adds {{UPDATE_CHECK}} and {{BROWSE_SETUP}} template placeholders to
gen-skill-docs.ts so all skills share a single source of truth.
* refactor: convert qa/ and setup-browser-cookies/ to .tmpl templates
Replaces hardcoded update check and find-browse blocks with
{{UPDATE_CHECK}} and {{BROWSE_SETUP}} placeholders. Both skills
are now generated from templates via gen-skill-docs.
* test: add e2e and LLM eval tests for SKILL.md setup block
- 3 Agent SDK e2e tests: happy path, NEEDS_SETUP, non-git-repo
- LLM eval: setup block clarity + actionability >= 4
- New error pattern: 'no such file or directory.*browse'
These tests catch the exact failure mode where agents can't discover
the browse binary via SKILL.md instructions.
* chore: bump version and changelog (v0.3.5)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
12 KiB
name, version, description, allowed-tools
| name | version | description | allowed-tools | |||
|---|---|---|---|---|---|---|
| gstack | 1.1.0 | Fast headless browser for QA testing and site dogfooding. Navigate any URL, interact with elements, verify page state, diff before/after actions, take annotated screenshots, check responsive layouts, test forms and uploads, handle dialogs, and assert element states. ~100ms per command. Use when you need to test a feature, verify a deployment, dogfood a user flow, or file a bug with evidence. |
|
Update Check (run first)
_UPD=$(~/.claude/skills/gstack/bin/gstack-update-check 2>/dev/null || .claude/skills/gstack/bin/gstack-update-check 2>/dev/null || true)
[ -n "$_UPD" ] && echo "$_UPD" || true
If output shows UPGRADE_AVAILABLE <old> <new>: read ~/.claude/skills/gstack/gstack-upgrade/SKILL.md and follow the "Inline upgrade flow" (AskUserQuestion → upgrade if yes, touch ~/.gstack/last-update-check if no). If JUST_UPGRADED <from> <to>: tell user "Running gstack v{to} (just updated!)" and continue.
gstack browse: QA Testing & Dogfooding
Persistent headless Chromium. First call auto-starts (~3s), then ~100-200ms per command. Auto-shuts down after 30 min idle. State persists between calls (cookies, tabs, sessions).
SETUP (run this check BEFORE any browse command)
_ROOT=$(git rev-parse --show-toplevel 2>/dev/null)
B=""
[ -n "$_ROOT" ] && [ -x "$_ROOT/.claude/skills/gstack/browse/dist/browse" ] && B="$_ROOT/.claude/skills/gstack/browse/dist/browse"
[ -z "$B" ] && B=~/.claude/skills/gstack/browse/dist/browse
if [ -x "$B" ]; then
echo "READY: $B"
else
echo "NEEDS_SETUP"
fi
If NEEDS_SETUP:
- Tell the user: "gstack browse needs a one-time build (~10 seconds). OK to proceed?" Then STOP and wait.
- Run:
cd <SKILL_DIR> && ./setup - If
bunis not installed:curl -fsSL https://bun.sh/install | bash
IMPORTANT
- Use the compiled binary via Bash:
$B <command> - NEVER use
mcp__claude-in-chrome__*tools. They are slow and unreliable. - Browser persists between calls — cookies, login sessions, and tabs carry over.
- Dialogs (alert/confirm/prompt) are auto-accepted by default — no browser lockup.
QA Workflows
Test a user flow (login, signup, checkout, etc.)
# 1. Go to the page
$B goto https://app.example.com/login
# 2. See what's interactive
$B snapshot -i
# 3. Fill the form using refs
$B fill @e3 "test@example.com"
$B fill @e4 "password123"
$B click @e5
# 4. Verify it worked
$B snapshot -D # diff shows what changed after clicking
$B is visible ".dashboard" # assert the dashboard appeared
$B screenshot /tmp/after-login.png
Verify a deployment / check prod
$B goto https://yourapp.com
$B text # read the page — does it load?
$B console # any JS errors?
$B network # any failed requests?
$B js "document.title" # correct title?
$B is visible ".hero-section" # key elements present?
$B screenshot /tmp/prod-check.png
Dogfood a feature end-to-end
# Navigate to the feature
$B goto https://app.example.com/new-feature
# Take annotated screenshot — shows every interactive element with labels
$B snapshot -i -a -o /tmp/feature-annotated.png
# Find ALL clickable things (including divs with cursor:pointer)
$B snapshot -C
# Walk through the flow
$B snapshot -i # baseline
$B click @e3 # interact
$B snapshot -D # what changed? (unified diff)
# Check element states
$B is visible ".success-toast"
$B is enabled "#next-step-btn"
$B is checked "#agree-checkbox"
# Check console for errors after interactions
$B console
Test responsive layouts
# Quick: 3 screenshots at mobile/tablet/desktop
$B goto https://yourapp.com
$B responsive /tmp/layout
# Manual: specific viewport
$B viewport 375x812 # iPhone
$B screenshot /tmp/mobile.png
$B viewport 1440x900 # Desktop
$B screenshot /tmp/desktop.png
Test file upload
$B goto https://app.example.com/upload
$B snapshot -i
$B upload @e3 /path/to/test-file.pdf
$B is visible ".upload-success"
$B screenshot /tmp/upload-result.png
Test forms with validation
$B goto https://app.example.com/form
$B snapshot -i
# Submit empty — check validation errors appear
$B click @e10 # submit button
$B snapshot -D # diff shows error messages appeared
$B is visible ".error-message"
# Fill and resubmit
$B fill @e3 "valid input"
$B click @e10
$B snapshot -D # diff shows errors gone, success state
Test dialogs (delete confirmations, prompts)
# Set up dialog handling BEFORE triggering
$B dialog-accept # will auto-accept next alert/confirm
$B click "#delete-button" # triggers confirmation dialog
$B dialog # see what dialog appeared
$B snapshot -D # verify the item was deleted
# For prompts that need input
$B dialog-accept "my answer" # accept with text
$B click "#rename-button" # triggers prompt
Test authenticated pages (import real browser cookies)
# Import cookies from your real browser (opens interactive picker)
$B cookie-import-browser
# Or import a specific domain directly
$B cookie-import-browser comet --domain .github.com
# Now test authenticated pages
$B goto https://github.com/settings/profile
$B snapshot -i
$B screenshot /tmp/github-profile.png
Compare two pages / environments
$B diff https://staging.app.com https://prod.app.com
Multi-step chain (efficient for long flows)
echo '[
["goto","https://app.example.com"],
["snapshot","-i"],
["fill","@e3","test@test.com"],
["fill","@e4","password"],
["click","@e5"],
["snapshot","-D"],
["screenshot","/tmp/result.png"]
]' | $B chain
Quick Assertion Patterns
# Element exists and is visible
$B is visible ".modal"
# Button is enabled/disabled
$B is enabled "#submit-btn"
$B is disabled "#submit-btn"
# Checkbox state
$B is checked "#agree"
# Input is editable
$B is editable "#name-field"
# Element has focus
$B is focused "#search-input"
# Page contains text
$B js "document.body.textContent.includes('Success')"
# Element count
$B js "document.querySelectorAll('.list-item').length"
# Specific attribute value
$B attrs "#logo" # returns all attributes as JSON
# CSS property
$B css ".button" "background-color"
Snapshot System
The snapshot is your primary tool for understanding and interacting with pages.
-i --interactive Interactive elements only (buttons, links, inputs) with @e refs
-c --compact Compact (no empty structural nodes)
-d <N> --depth Limit tree depth (0 = root only, default: unlimited)
-s <sel> --selector Scope to CSS selector
-D --diff Unified diff against previous snapshot (first call stores baseline)
-a --annotate Annotated screenshot with red overlay boxes and ref labels
-o <path> --output Output path for annotated screenshot (default: /tmp/browse-annotated.png)
-C --cursor-interactive Cursor-interactive elements (@c refs — divs with pointer, onclick)
All flags can be combined freely. -o only applies when -a is also used.
Example: $B snapshot -i -a -C -o /tmp/annotated.png
Ref numbering: @e refs are assigned sequentially (@e1, @e2, ...) in tree order.
@c refs from -C are numbered separately (@c1, @c2, ...).
After snapshot, use @refs as selectors in any command:
$B click @e3 $B fill @e4 "value" $B hover @e1
$B html @e2 $B css @e5 "color" $B attrs @e6
$B click @c1 # cursor-interactive ref (from -C)
Output format: indented accessibility tree with @ref IDs, one element per line.
@e1 [heading] "Welcome" [level=1]
@e2 [textbox] "Email"
@e3 [button] "Submit"
Refs are invalidated on navigation — run snapshot again after goto.
Command Reference
Navigation
| Command | Description |
|---|---|
back |
History back |
forward |
History forward |
goto <url> |
Navigate to URL |
reload |
Reload page |
url |
Print current URL |
Reading
| Command | Description |
|---|---|
accessibility |
Full ARIA tree |
forms |
Form fields as JSON |
html [selector] |
innerHTML of selector (throws if not found), or full page HTML if no selector given |
links |
All links as "text → href" |
text |
Cleaned page text |
Interaction
| Command | Description |
|---|---|
click <sel> |
Click element |
cookie <name>=<value> |
Set cookie on current page domain |
cookie-import <json> |
Import cookies from JSON file |
cookie-import-browser [browser] [--domain d] |
Import cookies from Comet, Chrome, Arc, Brave, or Edge (opens picker, or use --domain for direct import) |
dialog-accept [text] |
Auto-accept next alert/confirm/prompt. Optional text is sent as the prompt response |
dialog-dismiss |
Auto-dismiss next dialog |
fill <sel> <val> |
Fill input |
header <name>:<value> |
Set custom request header (colon-separated, sensitive values auto-redacted) |
hover <sel> |
Hover element |
press <key> |
Press key — Enter, Tab, Escape, ArrowUp/Down/Left/Right, Backspace, Delete, Home, End, PageUp, PageDown, or modifiers like Shift+Enter |
scroll [sel] |
Scroll element into view, or scroll to page bottom if no selector |
select <sel> <val> |
Select dropdown option by value, label, or visible text |
type <text> |
Type into focused element |
upload <sel> <file> [file2...] |
Upload file(s) |
useragent <string> |
Set user agent |
viewport <WxH> |
Set viewport size |
| `wait <sel | --networkidle |
Inspection
| Command | Description |
|---|---|
| `attrs <sel | @ref>` |
| `console [--clear | --errors]` |
cookies |
All cookies as JSON |
css <sel> <prop> |
Computed CSS value |
dialog [--clear] |
Dialog messages |
eval <file> |
Run JavaScript from file and return result as string (path must be under /tmp or cwd) |
is <prop> <sel> |
State check (visible/hidden/enabled/disabled/checked/editable/focused) |
js <expr> |
Run JavaScript expression and return result as string |
network [--clear] |
Network requests |
perf |
Page load timings |
storage [set k v] |
Read all localStorage + sessionStorage as JSON, or set to write localStorage |
Visual
| Command | Description |
|---|---|
diff <url1> <url2> |
Text diff between pages |
pdf [path] |
Save as PDF |
responsive [prefix] |
Screenshots at mobile (375x812), tablet (768x1024), desktop (1280x720). Saves as {prefix}-mobile.png etc. |
screenshot [path] |
Save screenshot |
Snapshot
| Command | Description |
|---|---|
snapshot [flags] |
Accessibility tree with @e refs for element selection. Flags: -i interactive only, -c compact, -d N depth limit, -s sel scope, -D diff vs previous, -a annotated screenshot, -o path output, -C cursor-interactive @c refs |
Meta
| Command | Description |
|---|---|
chain |
Run commands from JSON stdin. Format: [["cmd","arg1",...],...] |
Tabs
| Command | Description |
|---|---|
closetab [id] |
Close tab |
newtab [url] |
Open new tab |
tab <id> |
Switch to tab |
tabs |
List open tabs |
Server
| Command | Description |
|---|---|
restart |
Restart server |
status |
Health check |
stop |
Shutdown server |
Tips
- Navigate once, query many times.
gotoloads the page; thentext,js,screenshotall hit the loaded page instantly. - Use
snapshot -ifirst. See all interactive elements, then click/fill by ref. No CSS selector guessing. - Use
snapshot -Dto verify. Baseline → action → diff. See exactly what changed. - Use
isfor assertions.is visible .modalis faster and more reliable than parsing page text. - Use
snapshot -afor evidence. Annotated screenshots are great for bug reports. - Use
snapshot -Cfor tricky UIs. Finds clickable divs that the accessibility tree misses. - Check
consoleafter actions. Catch JS errors that don't surface visually. - Use
chainfor long flows. Single command, no per-step CLI overhead.