Phase 2: Rewrite SKILL.md as QA playbook + command reference

Reorient SKILL.md files from raw command reference to QA-first playbook
with 10 workflow patterns (test user flows, verify deployments, dogfood
features, responsive layouts, file upload, forms, dialogs, compare pages).
Compact command reference tables at the bottom.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
Garry Tan
2026-03-12 13:33:54 -07:00
parent f3ebd0adbf
commit d827276a8d
3 changed files with 383 additions and 428 deletions
+267 -197
View File
@@ -1,254 +1,324 @@
---
name: gstack
version: 1.0.0
version: 1.1.0
description: |
Fast web browsing for Claude Code via persistent headless Chromium daemon. Navigate to any URL,
read page content, click elements, fill forms, run JavaScript, take screenshots,
inspect CSS/DOM, capture console/network logs, and more. ~100ms per command after
first call. Use when you need to check a website, verify a deployment, read docs,
or interact with any web page. No MCP, no Chrome extension — just fast CLI.
Fast headless browser for QA testing and site dogfooding. Navigate any URL, interact with
elements, verify page state, diff before/after actions, take annotated screenshots, check
responsive layouts, test forms and uploads, handle dialogs, and assert element states.
~100ms per command. Use when you need to test a feature, verify a deployment, dogfood a
user flow, or file a bug with evidence.
allowed-tools:
- Bash
- Read
---
# gstack: Persistent Browser for Claude Code
# gstack browse: QA Testing & Dogfooding
Persistent headless Chromium daemon. First call auto-starts the server (~3s).
Every subsequent call: ~100-200ms. Auto-shuts down after 30 min idle.
Persistent headless Chromium. First call auto-starts (~3s), then ~100-200ms per command.
Auto-shuts down after 30 min idle. State persists between calls (cookies, tabs, sessions).
## SETUP (run this check BEFORE any browse command)
Before using any browse command, find the skill and check if the binary exists:
```bash
# Check project-level first, then user-level
if test -x .claude/skills/gstack/browse/dist/browse; then
echo "READY_PROJECT"
elif test -x ~/.claude/skills/gstack/browse/dist/browse; then
echo "READY_USER"
B=$(browse/bin/find-browse 2>/dev/null || ~/.claude/skills/gstack/browse/bin/find-browse 2>/dev/null)
if [ -n "$B" ]; then
echo "READY: $B"
else
echo "NEEDS_SETUP"
fi
```
Set `B` to whichever path is READY and use it for all commands. Prefer project-level if both exist.
If `NEEDS_SETUP`:
1. Tell the user: "gstack browse needs a one-time build (~10 seconds). OK to proceed?" Then STOP and wait for their response.
2. If they approve, determine the skill directory (project-level `.claude/skills/gstack` or user-level `~/.claude/skills/gstack`) and run:
```bash
cd <SKILL_DIR> && ./setup
```
3. If `bun` is not installed, tell the user to install it: `curl -fsSL https://bun.sh/install | bash`
4. Verify the `.gitignore` in the skill directory contains `browse/dist/` and `node_modules/`. If either line is missing, add it.
Once setup is done, it never needs to run again (the compiled binary persists).
1. Tell the user: "gstack browse needs a one-time build (~10 seconds). OK to proceed?" Then STOP and wait.
2. Run: `cd <SKILL_DIR> && ./setup`
3. If `bun` is not installed: `curl -fsSL https://bun.sh/install | bash`
## IMPORTANT
- Use the compiled binary via Bash: `.claude/skills/gstack/browse/dist/browse` (project) or `~/.claude/skills/gstack/browse/dist/browse` (user).
- Use the compiled binary via Bash: `$B <command>`
- NEVER use `mcp__claude-in-chrome__*` tools. They are slow and unreliable.
- The browser persists between calls — cookies, tabs, and state carry over.
- The server auto-starts on first command. No setup needed.
- Browser persists between calls — cookies, login sessions, and tabs carry over.
- Dialogs (alert/confirm/prompt) are auto-accepted by default — no browser lockup.
## Quick Reference
## QA Workflows
### Test a user flow (login, signup, checkout, etc.)
```bash
B=~/.claude/skills/gstack/browse/dist/browse
# Navigate to a page
$B goto https://example.com
# 1. Go to the page
$B goto https://app.example.com/login
# Read cleaned page text
$B text
# Take a screenshot (then Read the image)
$B screenshot /tmp/page.png
# Snapshot: accessibility tree with refs
# 2. See what's interactive
$B snapshot -i
# Click by ref (after snapshot)
$B click @e3
# 3. Fill the form using refs
$B fill @e3 "test@example.com"
$B fill @e4 "password123"
$B click @e5
# Fill by ref
$B fill @e4 "test@test.com"
# Run JavaScript
$B js "document.title"
# Get all links
$B links
# Click by CSS selector
$B click "button.submit"
# Fill a form by CSS selector
$B fill "#email" "test@test.com"
$B fill "#password" "abc123"
$B click "button[type=submit]"
# Get HTML of an element
$B html "main"
# Get computed CSS
$B css "body" "font-family"
# Get element attributes
$B attrs "nav"
# Wait for element to appear
$B wait ".loaded"
# Accessibility tree
$B accessibility
# Set viewport
$B viewport 375x812
# Set cookies / headers
$B cookie "session=abc123"
$B header "Authorization:Bearer token123"
# 4. Verify it worked
$B snapshot -D # diff shows what changed after clicking
$B is visible ".dashboard" # assert the dashboard appeared
$B screenshot /tmp/after-login.png
```
## Command Reference
### Verify a deployment / check prod
### Navigation
```
browse goto <url> Navigate current tab
browse back Go back
browse forward Go forward
browse reload Reload page
browse url Print current URL
```bash
$B goto https://yourapp.com
$B text # read the page — does it load?
$B console # any JS errors?
$B network # any failed requests?
$B js "document.title" # correct title?
$B is visible ".hero-section" # key elements present?
$B screenshot /tmp/prod-check.png
```
### Content extraction
```
browse text Cleaned page text (no scripts/styles)
browse html [selector] innerHTML of element, or full page HTML
browse links All links as "text → href"
browse forms All forms + fields as JSON
browse accessibility Accessibility tree snapshot (ARIA)
### Dogfood a feature end-to-end
```bash
# Navigate to the feature
$B goto https://app.example.com/new-feature
# Take annotated screenshot — shows every interactive element with labels
$B snapshot -i -a -o /tmp/feature-annotated.png
# Find ALL clickable things (including divs with cursor:pointer)
$B snapshot -C
# Walk through the flow
$B snapshot -i # baseline
$B click @e3 # interact
$B snapshot -D # what changed? (unified diff)
# Check element states
$B is visible ".success-toast"
$B is enabled "#next-step-btn"
$B is checked "#agree-checkbox"
# Check console for errors after interactions
$B console
```
### Snapshot (ref-based element selection)
```
browse snapshot Full accessibility tree with @refs
browse snapshot -i Interactive elements only (buttons, links, inputs)
browse snapshot -c Compact (no empty structural elements)
browse snapshot -d <N> Limit depth to N levels
browse snapshot -s <sel> Scope to CSS selector
### Test responsive layouts
```bash
# Quick: 3 screenshots at mobile/tablet/desktop
$B goto https://yourapp.com
$B responsive /tmp/layout
# Manual: specific viewport
$B viewport 375x812 # iPhone
$B screenshot /tmp/mobile.png
$B viewport 1440x900 # Desktop
$B screenshot /tmp/desktop.png
```
After snapshot, use @refs as selectors in any command:
### Test file upload
```bash
$B goto https://app.example.com/upload
$B snapshot -i
$B upload @e3 /path/to/test-file.pdf
$B is visible ".upload-success"
$B screenshot /tmp/upload-result.png
```
browse click @e3 Click the element assigned ref @e3
browse fill @e4 "value" Fill the input assigned ref @e4
browse hover @e1 Hover the element assigned ref @e1
browse html @e2 Get innerHTML of ref @e2
browse css @e5 "color" Get computed CSS of ref @e5
browse attrs @e6 Get attributes of ref @e6
### Test forms with validation
```bash
$B goto https://app.example.com/form
$B snapshot -i
# Submit empty — check validation errors appear
$B click @e10 # submit button
$B snapshot -D # diff shows error messages appeared
$B is visible ".error-message"
# Fill and resubmit
$B fill @e3 "valid input"
$B click @e10
$B snapshot -D # diff shows errors gone, success state
```
### Test dialogs (delete confirmations, prompts)
```bash
# Set up dialog handling BEFORE triggering
$B dialog-accept # will auto-accept next alert/confirm
$B click "#delete-button" # triggers confirmation dialog
$B dialog # see what dialog appeared
$B snapshot -D # verify the item was deleted
# For prompts that need input
$B dialog-accept "my answer" # accept with text
$B click "#rename-button" # triggers prompt
```
### Compare two pages / environments
```bash
$B diff https://staging.app.com https://prod.app.com
```
### Multi-step chain (efficient for long flows)
```bash
echo '[
["goto","https://app.example.com"],
["snapshot","-i"],
["fill","@e3","test@test.com"],
["fill","@e4","password"],
["click","@e5"],
["snapshot","-D"],
["screenshot","/tmp/result.png"]
]' | $B chain
```
## Quick Assertion Patterns
```bash
# Element exists and is visible
$B is visible ".modal"
# Button is enabled/disabled
$B is enabled "#submit-btn"
$B is disabled "#submit-btn"
# Checkbox state
$B is checked "#agree"
# Input is editable
$B is editable "#name-field"
# Element has focus
$B is focused "#search-input"
# Page contains text
$B js "document.body.textContent.includes('Success')"
# Element count
$B js "document.querySelectorAll('.list-item').length"
# Specific attribute value
$B attrs "#logo" # returns all attributes as JSON
# CSS property
$B css ".button" "background-color"
```
## Snapshot System
The snapshot is your primary tool for understanding and interacting with pages.
```bash
$B snapshot -i # Interactive elements only (buttons, links, inputs) with @e refs
$B snapshot -c # Compact (no empty structural elements)
$B snapshot -d 3 # Limit depth to 3 levels
$B snapshot -s "main" # Scope to CSS selector
$B snapshot -D # Diff against previous snapshot (what changed?)
$B snapshot -a # Annotated screenshot with ref labels
$B snapshot -o /tmp/x.png # Output path for annotated screenshot
$B snapshot -C # Cursor-interactive elements (@c refs — divs with pointer, onclick)
```
Combine flags: `$B snapshot -i -a -C -o /tmp/annotated.png`
After snapshot, use @refs everywhere:
```bash
$B click @e3 $B fill @e4 "value" $B hover @e1
$B html @e2 $B css @e5 "color" $B attrs @e6
$B click @c1 # cursor-interactive ref (from -C)
```
Refs are invalidated on navigation — run `snapshot` again after `goto`.
## Command Reference
### Navigation
| Command | Description |
|---------|-------------|
| `goto <url>` | Navigate to URL |
| `back` / `forward` | History navigation |
| `reload` | Reload page |
| `url` | Print current URL |
### Reading
| Command | Description |
|---------|-------------|
| `text` | Cleaned page text |
| `html [selector]` | innerHTML |
| `links` | All links as "text -> href" |
| `forms` | Forms + fields as JSON |
| `accessibility` | Full ARIA tree |
### Interaction
```
browse click <selector> Click element (CSS selector or @ref)
browse fill <selector> <value> Fill input field
browse select <selector> <val> Select dropdown value
browse hover <selector> Hover over element
browse type <text> Type into focused element
browse press <key> Press key (Enter, Tab, Escape, etc.)
browse scroll [selector] Scroll element into view, or page bottom
browse wait <selector> Wait for element to appear (max 10s)
browse viewport <WxH> Set viewport size (e.g. 375x812)
```
| Command | Description |
|---------|-------------|
| `click <sel>` | Click element |
| `fill <sel> <val>` | Fill input |
| `select <sel> <val>` | Select dropdown |
| `hover <sel>` | Hover element |
| `type <text>` | Type into focused element |
| `press <key>` | Press key (Enter, Tab, Escape) |
| `scroll [sel]` | Scroll element into view |
| `wait <sel>` | Wait for element (max 10s) |
| `wait --networkidle` | Wait for network to be idle |
| `wait --load` | Wait for page load event |
| `upload <sel> <file...>` | Upload file(s) |
| `cookie-import <json>` | Import cookies from JSON file |
| `dialog-accept [text]` | Auto-accept dialogs |
| `dialog-dismiss` | Auto-dismiss dialogs |
| `viewport <WxH>` | Set viewport size |
### Inspection
```
browse js <expression> Run JS, print result
browse eval <js-file> Run JS file against page
browse css <selector> <prop> Get computed CSS property
browse attrs <selector> Get element attributes as JSON
browse console Dump captured console messages
browse console --clear Clear console buffer
browse network Dump captured network requests
browse network --clear Clear network buffer
browse cookies Dump all cookies as JSON
browse storage localStorage + sessionStorage as JSON
browse storage set <key> <val> Set localStorage value
browse perf Page load performance timings
```
| Command | Description |
|---------|-------------|
| `js <expr>` | Run JavaScript |
| `eval <file>` | Run JS file |
| `css <sel> <prop>` | Computed CSS |
| `attrs <sel>` | Element attributes |
| `is <prop> <sel>` | State check (visible/hidden/enabled/disabled/checked/editable/focused) |
| `console [--clear\|--errors]` | Console messages (--errors filters to error/warning) |
| `network [--clear]` | Network requests |
| `dialog [--clear]` | Dialog messages |
| `cookies` | All cookies |
| `storage` | localStorage + sessionStorage |
| `perf` | Page load timings |
### Visual
```
browse screenshot [path] Screenshot (default: /tmp/browse-screenshot.png)
browse pdf [path] Save as PDF
browse responsive [prefix] Screenshots at mobile/tablet/desktop
```
### Compare
```
browse diff <url1> <url2> Text diff between two pages
```
### Multi-step (chain)
```
echo '[["goto","https://example.com"],["snapshot","-i"],["click","@e1"],["screenshot","/tmp/result.png"]]' | browse chain
```
| Command | Description |
|---------|-------------|
| `screenshot [path]` | Screenshot |
| `pdf [path]` | Save as PDF |
| `responsive [prefix]` | Mobile/tablet/desktop screenshots |
| `diff <url1> <url2>` | Text diff between pages |
### Tabs
```
browse tabs List tabs (id, url, title)
browse tab <id> Switch to tab
browse newtab [url] Open new tab
browse closetab [id] Close tab
```
| Command | Description |
|---------|-------------|
| `tabs` | List tabs |
| `tab <id>` | Switch tab |
| `newtab [url]` | Open tab |
| `closetab [id]` | Close tab |
### Server management
```
browse status Server health, uptime, tab count
browse stop Shutdown server
browse restart Kill + restart server
```
### Server
| Command | Description |
|---------|-------------|
| `status` | Health check |
| `stop` | Shutdown |
| `restart` | Restart |
## Speed Rules
## Tips
1. **Navigate once, query many times.** `goto` loads the page; then `text`, `js`, `css`, `screenshot` all run against the loaded page instantly.
2. **Use `snapshot -i` for interaction.** Get refs for all interactive elements, then click/fill by ref. No need to guess CSS selectors.
3. **Use `js` for precision.** `js "document.querySelector('.price').textContent"` is faster than parsing full page text.
4. **Use `links` to survey.** Faster than `text` when you just need navigation structure.
5. **Use `chain` for multi-step flows.** Avoids CLI overhead per step.
6. **Use `responsive` for layout checks.** One command = 3 viewport screenshots.
## When to Use What
| Task | Commands |
|------|----------|
| Read a page | `goto <url>` then `text` |
| Interact with elements | `snapshot -i` then `click @e3` |
| Check if element exists | `js "!!document.querySelector('.thing')"` |
| Extract specific data | `js "document.querySelector('.price').textContent"` |
| Visual check | `screenshot /tmp/x.png` then Read the image |
| Fill and submit form | `snapshot -i``fill @e4 "val"``click @e5``screenshot` |
| Check CSS | `css "selector" "property"` or `css @e3 "property"` |
| Inspect DOM | `html "selector"` or `attrs @e3` |
| Debug console errors | `console` |
| Check network requests | `network` |
| Check local dev | `goto http://127.0.0.1:3000` |
| Compare two pages | `diff <url1> <url2>` |
| Mobile layout check | `responsive /tmp/prefix` |
| Multi-step flow | `echo '[...]' \| browse chain` |
## Architecture
- Persistent Chromium daemon on localhost (port 9400-9410)
- Bearer token auth per session
- State file: `/tmp/browse-server.json`
- Console log: `/tmp/browse-console.log`
- Network log: `/tmp/browse-network.log`
- Auto-shutdown after 30 min idle
- Chromium crash → server exits → auto-restarts on next command
1. **Navigate once, query many times.** `goto` loads the page; then `text`, `js`, `screenshot` all hit the loaded page instantly.
2. **Use `snapshot -i` first.** See all interactive elements, then click/fill by ref. No CSS selector guessing.
3. **Use `snapshot -D` to verify.** Baseline → action → diff. See exactly what changed.
4. **Use `is` for assertions.** `is visible .modal` is faster and more reliable than parsing page text.
5. **Use `snapshot -a` for evidence.** Annotated screenshots are great for bug reports.
6. **Use `snapshot -C` for tricky UIs.** Finds clickable divs that the accessibility tree misses.
7. **Check `console` after actions.** Catch JS errors that don't surface visually.
8. **Use `chain` for long flows.** Single command, no per-step CLI overhead.