From 0baca1a3253ae4d59e790b18936329158eb0e067 Mon Sep 17 00:00:00 2001
From: Garry Tan
Date: Sat, 14 Mar 2026 14:43:01 -0500
Subject: [PATCH] docs: add screenshot modes to BROWSER.md command reference

Co-Authored-By: Claude Opus 4.6
---
 BROWSER.md | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/BROWSER.md b/BROWSER.md
index 640bb659..8d0c5775 100644
--- a/BROWSER.md
+++ b/BROWSER.md
@@ -11,7 +11,7 @@ This document covers the command reference and internals of gstack's headless br
 | Snapshot | `snapshot [-i] [-c] [-d N] [-s sel] [-D] [-a] [-o] [-C]` | Get refs, diff, annotate |
 | Interact | `click`, `fill`, `select`, `hover`, `type`, `press`, `scroll`, `wait`, `viewport`, `upload` | Use the page |
 | Inspect | `js`, `eval`, `css`, `attrs`, `is`, `console`, `network`, `dialog`, `cookies`, `storage`, `perf` | Debug and verify |
-| Visual | `screenshot`, `pdf`, `responsive` | See what Claude sees |
+| Visual | `screenshot [--viewport] [--clip x,y,w,h] [sel\|@ref] [path]`, `pdf`, `responsive` | See what Claude sees |
 | Compare | `diff ` | Spot differences between environments |
 | Dialogs | `dialog-accept [text]`, `dialog-dismiss` | Control alert/confirm/prompt handling |
 | Tabs | `tabs`, `tab`, `newtab`, `closetab` | Multi-page workflows |
@@ -92,6 +92,21 @@ No DOM mutation. No injected scripts. Just Playwright's native accessibility API
 - `--annotate` (`-a`): Injects temporary overlay divs at each ref's bounding box, takes a screenshot with ref labels visible, then removes the overlays. Use `-o ` to control the output path.
 - `--cursor-interactive` (`-C`): Scans for non-ARIA interactive elements (divs with `cursor:pointer`, `onclick`, `tabindex>=0`) using `page.evaluate`. Assigns `@c1`, `@c2`... refs with deterministic `nth-child` CSS selectors. These are elements the ARIA tree misses but users can still click.
 
+### Screenshot modes
+
+The `screenshot` command supports four modes:
+
+| Mode | Syntax | Playwright API |
+|------|--------|----------------|
+| Full page (default) | `screenshot [path]` | `page.screenshot({ fullPage: true })` |
+| Viewport only | `screenshot --viewport [path]` | `page.screenshot({ fullPage: false })` |
+| Element crop | `screenshot "#sel" [path]` or `screenshot @e3 [path]` | `locator.screenshot()` |
+| Region clip | `screenshot --clip x,y,w,h [path]` | `page.screenshot({ clip })` |
+
+Element crop accepts CSS selectors (`.class`, `#id`, `[attr]`) or `@e`/`@c` refs from `snapshot`. Auto-detection: `@e`/`@c` prefix = ref, `.`/`#`/`[` prefix = CSS selector, `--` prefix = flag, everything else = output path.
+
+Mutual exclusion: `--clip` + selector and `--viewport` + `--clip` both throw errors. Unknown flags (e.g. `--bogus`) also throw.
+
 ### Authentication
 
 Each server session generates a random UUID as a bearer token. The token is written to the state file (`.gstack/browse.json`) with chmod 600. Every HTTP request must include `Authorization: Bearer `. This prevents other processes on the machine from controlling the browser.
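
Note on the "Screenshot modes" section added by this patch: the auto-detection and mutual-exclusion rules it documents can be illustrated with a small argument classifier. This is a minimal sketch under stated assumptions, not gstack's actual parser. The names `parseScreenshotArgs`, `parseClip`, `ScreenshotArgs`, and `Clip` are hypothetical, and so is the choice to let an element target win when `--viewport` is combined with a selector, since the patch does not say how that combination behaves.

```typescript
// Hypothetical sketch of the documented rules; names are illustrative only.
type Mode = "full" | "viewport" | "element" | "clip";

interface Clip { x: number; y: number; width: number; height: number }

interface ScreenshotArgs {
  mode: Mode;
  target?: string; // CSS selector or @e/@c ref
  clip?: Clip;
  path?: string;   // output path
}

// Parse "x,y,w,h" into a clip rectangle; reject anything but four numbers.
function parseClip(raw: string | undefined): Clip {
  const nums = (raw ?? "")
    .split(",")
    .map((p) => (p.trim() === "" ? NaN : Number(p)));
  if (nums.length !== 4 || nums.some((n) => !Number.isFinite(n))) {
    throw new Error("--clip expects x,y,w,h");
  }
  const [x = 0, y = 0, width = 0, height = 0] = nums;
  return { x, y, width, height };
}

function parseScreenshotArgs(args: string[]): ScreenshotArgs {
  const queue = [...args];
  let viewport = false;
  let clip: Clip | undefined;
  let target: string | undefined;
  let path: string | undefined;

  while (queue.length > 0) {
    const arg = queue.shift() as string;
    if (arg.startsWith("--")) {
      // "--" prefix = flag; unknown flags throw.
      if (arg === "--viewport") viewport = true;
      else if (arg === "--clip") clip = parseClip(queue.shift());
      else throw new Error(`Unknown flag: ${arg}`);
    } else if (/^@[ec]/.test(arg) || /^[.#\[]/.test(arg)) {
      // "@e"/"@c" prefix = ref; "."/"#"/"[" prefix = CSS selector.
      target = arg;
    } else {
      // Everything else = output path.
      path = arg;
    }
  }

  // Mutual exclusion as documented in the patch.
  if (clip && target) throw new Error("--clip cannot be combined with a selector");
  if (viewport && clip) throw new Error("--viewport cannot be combined with --clip");

  // Assumption: an element target takes precedence over --viewport.
  const mode: Mode = target ? "element" : clip ? "clip" : viewport ? "viewport" : "full";
  return { mode, target, clip, path };
}
```

For example, `parseScreenshotArgs(["@e3", "out.png"])` resolves to element mode with `out.png` as the path, while `parseScreenshotArgs(["--bogus"])` throws. One consequence of the rule as written: a relative path such as `./shot.png` starts with `.` and would be classified as a CSS selector. Whether the real implementation special-cases this is not stated in the patch.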