mirror of https://github.com/garrytan/gstack.git synced 2026-05-01 19:25:10 +02:00

Garry Tan 2aa745cb0e feat: screenshot element/region clipping (v0.3.7) (#56)

* feat: screenshot element/region clipping (--clip, --viewport, CSS/@ref)

Add element crop (CSS selector or @ref), region clip (--clip x,y,w,h),
and viewport-only (--viewport) modes to the screenshot command. Uses
Playwright's native locator.screenshot() and page.screenshot({ clip }).
Full page remains the default. Includes 10 new tests covering all modes
and error paths.

* chore: bump version and changelog (v0.3.7)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: add screenshot modes to BROWSER.md command reference

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

2026-03-14 12:47:42 -07:00

name: browse
version: 1.1.0
description: Fast headless browser for QA testing and site dogfooding. Navigate any URL, interact with elements, verify page state, diff before/after actions, take annotated screenshots, check responsive layouts, test forms and uploads, handle dialogs, and assert element states. ~100ms per command. Use when you need to test a feature, verify a deployment, dogfood a user flow, or file a bug with evidence.
allowed-tools: Bash, Read, AskUserQuestion

Update Check (run first)

_UPD=$(~/.claude/skills/gstack/bin/gstack-update-check 2>/dev/null || .claude/skills/gstack/bin/gstack-update-check 2>/dev/null || true)
[ -n "$_UPD" ] && echo "$_UPD" || true

If output shows UPGRADE_AVAILABLE <old> <new>: read ~/.claude/skills/gstack/gstack-upgrade/SKILL.md and follow the "Inline upgrade flow" (AskUserQuestion → upgrade if yes, touch ~/.gstack/last-update-check if no). If JUST_UPGRADED <from> <to>: tell user "Running gstack v{to} (just updated!)" and continue.
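
The branching described above can be sketched as a small shell helper. This is only an illustration: `classify_update` and the action words it prints (`ask-user`, `announce`, `continue`) are hypothetical names, not part of gstack. The only thing taken from the text is the shape of the two status lines.

```shell
# Hypothetical helper: map the update-check output to the next step.
# Assumes the output is a single line "STATUS <from> <to>", or empty.
classify_update() {
  # Split the line into up to three whitespace-separated fields.
  read -r _status _from _to <<EOF
$1
EOF
  case "$_status" in
    UPGRADE_AVAILABLE) echo "ask-user $_from $_to" ;;  # run the inline upgrade flow
    JUST_UPGRADED)     echo "announce $_to" ;;         # tell the user the new version
    *)                 echo "continue" ;;              # nothing to do
  esac
}
```

Usage would be `classify_update "$_UPD"` right after the check; an empty `$_UPD` falls through to `continue`.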

browse: QA Testing & Dogfooding

Persistent headless Chromium. First call auto-starts (~3s), then ~100ms per command. State persists between calls (cookies, tabs, login sessions).

SETUP (run this check BEFORE any browse command)

_ROOT=$(git rev-parse --show-toplevel 2>/dev/null)
B=""
[ -n "$_ROOT" ] && [ -x "$_ROOT/.claude/skills/gstack/browse/dist/browse" ] && B="$_ROOT/.claude/skills/gstack/browse/dist/browse"
[ -z "$B" ] && B=~/.claude/skills/gstack/browse/dist/browse
if [ -x "$B" ]; then
  echo "READY: $B"
else
  echo "NEEDS_SETUP"
fi

If NEEDS_SETUP:

Tell the user: "gstack browse needs a one-time build (~10 seconds). OK to proceed?" Then STOP and wait.
Run: cd <SKILL_DIR> && ./setup
If bun is not installed: curl -fsSL https://bun.sh/install | bash

Core QA Patterns

1. Verify a page loads correctly

$B goto https://yourapp.com
$B text                          # content loads?
$B console                       # JS errors?
$B network                       # failed requests?
$B is visible ".main-content"    # key elements present?

2. Test a user flow

$B goto https://app.com/login
$B snapshot -i                   # see all interactive elements
$B fill @e3 "user@test.com"
$B fill @e4 "password"
$B click @e5                     # submit
$B snapshot -D                   # diff: what changed after submit?
$B is visible ".dashboard"       # success state present?

3. Verify an action worked

$B snapshot                      # baseline
$B click @e3                     # do something
$B snapshot -D                   # unified diff shows exactly what changed
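
The baseline, action, diff sequence above can be wrapped in a function so it is reusable for any single command. This is a minimal sketch; `act_and_diff` is a hypothetical name, and it assumes `B` has already been set by the SETUP check.

```shell
# Hypothetical wrapper around the three steps shown above.
# Assumes $B points at the browse binary (see SETUP).
act_and_diff() {
  "$B" snapshot > /dev/null   # store the baseline; its output is discarded
  "$B" "$@"                   # the action under test, e.g. click @e3
  "$B" snapshot -D            # diff against the baseline
}
```

For example, `act_and_diff click @e3` runs the same three commands as the block above.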

4. Visual evidence for bug reports

$B snapshot -i -a -o /tmp/annotated.png   # labeled screenshot
$B screenshot /tmp/bug.png                # plain screenshot
$B console                                # error log

5. Find all clickable elements (including non-ARIA)

$B snapshot -C                   # finds divs with cursor:pointer, onclick, tabindex
$B click @c1                     # interact with them

6. Assert element states

$B is visible ".modal"
$B is enabled "#submit-btn"
$B is disabled "#submit-btn"
$B is checked "#agree-checkbox"
$B is editable "#name-field"
$B is focused "#search-input"
$B js "document.body.textContent.includes('Success')"

7. Test responsive layouts

$B responsive /tmp/layout        # mobile + tablet + desktop screenshots
$B viewport 375x812              # or set specific viewport
$B screenshot /tmp/mobile.png

8. Test file uploads

$B upload "#file-input" /path/to/file.pdf
$B is visible ".upload-success"

9. Test dialogs

$B dialog-accept "yes"           # set up handler
$B click "#delete-button"        # trigger dialog
$B dialog                        # see what appeared
$B snapshot -D                   # verify deletion happened

10. Compare environments

$B diff https://staging.app.com https://prod.app.com

Snapshot Flags

The snapshot is your primary tool for understanding and interacting with pages.

-i        --interactive           Interactive elements only (buttons, links, inputs) with @e refs
-c        --compact               Compact (no empty structural nodes)
-d <N>    --depth                 Limit tree depth (0 = root only, default: unlimited)
-s <sel>  --selector              Scope to CSS selector
-D        --diff                  Unified diff against previous snapshot (first call stores baseline)
-a        --annotate              Annotated screenshot with red overlay boxes and ref labels
-o <path> --output                Output path for annotated screenshot (default: /tmp/browse-annotated.png)
-C        --cursor-interactive    Cursor-interactive elements (@c refs — divs with pointer, onclick)

All flags can be combined freely. -o only applies when -a is also used. Example: $B snapshot -i -a -C -o /tmp/annotated.png

Ref numbering: @e refs are assigned sequentially (@e1, @e2, ...) in tree order. @c refs from -C are numbered separately (@c1, @c2, ...).

After snapshot, use @refs as selectors in any command:

$B click @e3       $B fill @e4 "value"     $B hover @e1
$B html @e2        $B css @e5 "color"      $B attrs @e6
$B click @c1       # cursor-interactive ref (from -C)

Output format: indented accessibility tree with @ref IDs, one element per line.

  @e1 [heading] "Welcome" [level=1]
  @e2 [textbox] "Email"
  @e3 [button] "Submit"

Refs are invalidated on navigation — run snapshot again after goto.
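
Because refs change after navigation, it can help to look a ref up by role and name instead of hard-coding it. The sketch below parses the output format shown above; `find_ref` is a hypothetical helper, and it assumes real snapshot lines look exactly like the sample (`@ref [role] "name"`).

```shell
# Hypothetical helper: print the first @ref whose role and quoted name match.
# Reads snapshot text on stdin; prints nothing if there is no match.
find_ref() {
  awk -v role="$1" -v name="$2" '
    $1 ~ /^@[ec][0-9]+$/ && $2 == "[" role "]" && index($0, "\"" name "\"") {
      print $1
      exit
    }'
}
```

A possible use is `ref=$($B snapshot -i | find_ref button "Submit")` followed by `$B click "$ref"`, re-running the lookup after every `goto`.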

Full Command List

Navigation

Command	Description
`back`	History back
`forward`	History forward
`goto <url>`	Navigate to URL
`reload`	Reload page
`url`	Print current URL

Reading

Command	Description
`accessibility`	Full ARIA tree
`forms`	Form fields as JSON
`html [selector]`	innerHTML of selector (throws if not found), or full page HTML if no selector given
`links`	All links as "text → href"
`text`	Cleaned page text

Interaction

Command	Description
`click <sel>`	Click element
`cookie <name>=<value>`	Set cookie on current page domain
`cookie-import <json>`	Import cookies from JSON file
`cookie-import-browser [browser] [--domain d]`	Import cookies from Comet, Chrome, Arc, Brave, or Edge (opens picker, or use --domain for direct import)
`dialog-accept [text]`	Auto-accept next alert/confirm/prompt. Optional text is sent as the prompt response
`dialog-dismiss`	Auto-dismiss next dialog
`fill <sel> <val>`	Fill input
`header <name>:<value>`	Set custom request header (colon-separated, sensitive values auto-redacted)
`hover <sel>`	Hover element
`press <key>`	Press key — Enter, Tab, Escape, ArrowUp/Down/Left/Right, Backspace, Delete, Home, End, PageUp, PageDown, or modifiers like Shift+Enter
`scroll [sel]`	Scroll element into view, or scroll to page bottom if no selector
`select <sel> <val>`	Select dropdown option by value, label, or visible text
`type <text>`	Type into focused element
`upload <sel> <file> [file2...]`	Upload file(s)
`useragent <string>`	Set user agent
`viewport <WxH>`	Set viewport size
`wait <sel|--networkidle>`	Wait for an element or for network idle

Inspection

Command	Description
`attrs <sel|@ref>`	Element attributes
`console [--clear|--errors]`	Console messages
`cookies`	All cookies as JSON
`css <sel> <prop>`	Computed CSS value
`dialog [--clear]`	Dialog messages
`eval <file>`	Run JavaScript from file and return result as string (path must be under /tmp or cwd)
`is <prop> <sel>`	State check (visible/hidden/enabled/disabled/checked/editable/focused)
`js <expr>`	Run JavaScript expression and return result as string
`network [--clear]`	Network requests
`perf`	Page load timings
`storage [set k v]`	Read all localStorage + sessionStorage as JSON, or set to write localStorage

Visual

Command	Description
`diff <url1> <url2>`	Text diff between pages
`pdf [path]`	Save as PDF
`responsive [prefix]`	Screenshots at mobile (375x812), tablet (768x1024), desktop (1280x720). Saves as {prefix}-mobile.png etc.
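`screenshot [--viewport] [--clip x,y,w,h] [selector|@ref] [path]`	Save a screenshot. Full page by default; --viewport captures the viewport only, --clip captures a region, and a CSS selector or @ref crops to one element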
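
The screenshot modes can be exercised one after another, as sketched below. This follows the argument order in the screenshot row and the modes named in the commit message (full page by default, viewport only, region clip, element crop). `capture_evidence`, the output file names, the clip rectangle, and the `.main-content` selector are placeholders chosen for illustration.

```shell
# Hypothetical wrapper: one screenshot per mode, written under a directory.
# Assumes $B points at the browse binary (see SETUP).
capture_evidence() {
  _dir=${1:-/tmp}
  "$B" screenshot "$_dir/full.png"                        # full page (default)
  "$B" screenshot --viewport "$_dir/viewport.png"         # visible viewport only
  "$B" screenshot --clip 0,0,800,600 "$_dir/region.png"   # region given as x,y,w,h
  "$B" screenshot ".main-content" "$_dir/element.png"     # crop to one element
}
```

An @ref from a recent snapshot can stand in for the CSS selector in the last call.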

Snapshot

Command	Description
`snapshot [flags]`	Accessibility tree with @e refs for element selection. Flags: -i interactive only, -c compact, -d N depth limit, -s sel scope, -D diff vs previous, -a annotated screenshot, -o path output, -C cursor-interactive @c refs

Meta

Command	Description
`chain`	Run commands from JSON stdin. Format: [["cmd","arg1",...],...]
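
As a sketch of the chain input format, the helper below emits the "verify a page loads" pattern as one JSON array of command arrays. `page_check_chain` is a hypothetical name, and the sketch assumes the URL and selector contain no double quotes or backslashes, since it does no JSON escaping.

```shell
# Hypothetical helper: build chain input for a basic page check.
# $1 = URL, $2 = CSS selector expected to be visible.
# No JSON escaping is done, so avoid " and \ in the arguments.
page_check_chain() {
  printf '[["goto","%s"],["text"],["console"],["network"],["is","visible","%s"]]\n' "$1" "$2"
}
```

The output is meant to be piped to the chain command, for example `page_check_chain https://yourapp.com .main-content | $B chain`.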

Tabs

Command	Description
`closetab [id]`	Close tab
`newtab [url]`	Open new tab
`tab <id>`	Switch to tab
`tabs`	List open tabs

Server

Command	Description
`restart`	Restart server
`status`	Health check
`stop`	Shutdown server
