feat: screenshot element/region clipping (v0.3.7) (#56)

* feat: screenshot element/region clipping (--clip, --viewport, CSS/@ref)

Add element crop (CSS selector or @ref), region clip (--clip x,y,w,h),
and viewport-only (--viewport) modes to the screenshot command. Uses
Playwright's native locator.screenshot() and page.screenshot({ clip }).
Full page remains the default. Includes 10 new tests covering all modes
and error paths.

* chore: bump version and changelog (v0.3.7)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: add screenshot modes to BROWSER.md command reference

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Garry Tan
2026-03-14 12:47:42 -07:00
committed by GitHub
parent 0ac7ef4e81
commit 2aa745cb0e
10 changed files with 207 additions and 10 deletions
+16 -1
@@ -11,7 +11,7 @@ This document covers the command reference and internals of gstack's headless br
| Snapshot | `snapshot [-i] [-c] [-d N] [-s sel] [-D] [-a] [-o] [-C]` | Get refs, diff, annotate | | Snapshot | `snapshot [-i] [-c] [-d N] [-s sel] [-D] [-a] [-o] [-C]` | Get refs, diff, annotate |
| Interact | `click`, `fill`, `select`, `hover`, `type`, `press`, `scroll`, `wait`, `viewport`, `upload` | Use the page | | Interact | `click`, `fill`, `select`, `hover`, `type`, `press`, `scroll`, `wait`, `viewport`, `upload` | Use the page |
| Inspect | `js`, `eval`, `css`, `attrs`, `is`, `console`, `network`, `dialog`, `cookies`, `storage`, `perf` | Debug and verify | | Inspect | `js`, `eval`, `css`, `attrs`, `is`, `console`, `network`, `dialog`, `cookies`, `storage`, `perf` | Debug and verify |
| Visual | `screenshot`, `pdf`, `responsive` | See what Claude sees | | Visual | `screenshot [--viewport] [--clip x,y,w,h] [sel\|@ref] [path]`, `pdf`, `responsive` | See what Claude sees |
| Compare | `diff <url1> <url2>` | Spot differences between environments | | Compare | `diff <url1> <url2>` | Spot differences between environments |
| Dialogs | `dialog-accept [text]`, `dialog-dismiss` | Control alert/confirm/prompt handling | | Dialogs | `dialog-accept [text]`, `dialog-dismiss` | Control alert/confirm/prompt handling |
| Tabs | `tabs`, `tab`, `newtab`, `closetab` | Multi-page workflows | | Tabs | `tabs`, `tab`, `newtab`, `closetab` | Multi-page workflows |
@@ -92,6 +92,21 @@ No DOM mutation. No injected scripts. Just Playwright's native accessibility API
- `--annotate` (`-a`): Injects temporary overlay divs at each ref's bounding box, takes a screenshot with ref labels visible, then removes the overlays. Use `-o <path>` to control the output path. - `--annotate` (`-a`): Injects temporary overlay divs at each ref's bounding box, takes a screenshot with ref labels visible, then removes the overlays. Use `-o <path>` to control the output path.
- `--cursor-interactive` (`-C`): Scans for non-ARIA interactive elements (divs with `cursor:pointer`, `onclick`, `tabindex>=0`) using `page.evaluate`. Assigns `@c1`, `@c2`... refs with deterministic `nth-child` CSS selectors. These are elements the ARIA tree misses but users can still click. - `--cursor-interactive` (`-C`): Scans for non-ARIA interactive elements (divs with `cursor:pointer`, `onclick`, `tabindex>=0`) using `page.evaluate`. Assigns `@c1`, `@c2`... refs with deterministic `nth-child` CSS selectors. These are elements the ARIA tree misses but users can still click.
### Screenshot modes
The `screenshot` command supports four modes:
| Mode | Syntax | Playwright API |
|------|--------|----------------|
| Full page (default) | `screenshot [path]` | `page.screenshot({ fullPage: true })` |
| Viewport only | `screenshot --viewport [path]` | `page.screenshot({ fullPage: false })` |
| Element crop | `screenshot "#sel" [path]` or `screenshot @e3 [path]` | `locator.screenshot()` |
| Region clip | `screenshot --clip x,y,w,h [path]` | `page.screenshot({ clip })` |
Element crop accepts CSS selectors (`.class`, `#id`, `[attr]`) or `@e`/`@c` refs from `snapshot`. Auto-detection: `--` prefix = flag, `@e`/`@c` prefix = ref, `.`/`#` prefix or a `[` anywhere in the argument = CSS selector, everything else = output path.
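
The auto-detection rule can be sketched as a small standalone classifier. This is a minimal illustration that mirrors the checks in the `screenshot` handler from this commit; the names `ArgKind` and `classifyScreenshotArg` are hypothetical and do not exist in the codebase. Unlike the handler, it does not consume the value that follows `--clip`.

```typescript
// Hypothetical helper for illustration; not part of gstack.
type ArgKind = 'flag' | 'ref' | 'selector' | 'path';

// Check order matters: flags first, then refs, then CSS selectors.
// Anything left over is treated as the output path.
function classifyScreenshotArg(arg: string): ArgKind {
  if (arg.startsWith('--')) return 'flag';
  if (arg.startsWith('@e') || arg.startsWith('@c')) return 'ref';
  if (arg.startsWith('.') || arg.startsWith('#') || arg.includes('[')) return 'selector';
  return 'path';
}

console.log(classifyScreenshotArg('--viewport'));   // flag
console.log(classifyScreenshotArg('@e3'));          // ref
console.log(classifyScreenshotArg('#hero-banner')); // selector
console.log(classifyScreenshotArg('/tmp/hero.png')); // path
console.log(classifyScreenshotArg('button'));       // path
console.log(classifyScreenshotArg('./out.png'));    // selector
```

The last two calls show the edges of a prefix-based rule: a bare tag selector such as `button` is read as an output path, and a relative path beginning with `.` is read as a selector. Absolute paths, as used in the examples in this commit, avoid the second case.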
Mutual exclusion: `--clip` + selector and `--viewport` + `--clip` both throw errors. Unknown flags (e.g. `--bogus`) also throw.
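
The mutual-exclusion checks and the mode selection that follows them can be sketched as a pure function. This is an illustration under assumptions: `ScreenshotRequest`, `ScreenshotMode`, and `resolveScreenshotMode` are hypothetical names, and the error messages are shortened versions of the handler's. The check order follows the handler in this commit.

```typescript
// Hypothetical types and helper for illustration; not part of gstack.
interface ClipRect { x: number; y: number; width: number; height: number }

interface ScreenshotRequest {
  viewportOnly?: boolean; // set by --viewport
  clip?: ClipRect;        // set by --clip x,y,w,h
  target?: string;        // CSS selector or @ref
}

type ScreenshotMode = 'full' | 'viewport' | 'element' | 'clip';

function resolveScreenshotMode(req: ScreenshotRequest): ScreenshotMode {
  // Reject conflicting combinations before choosing a mode.
  if (req.clip && req.target) {
    throw new Error('Cannot use --clip with a selector/ref');
  }
  if (req.viewportOnly && req.clip) {
    throw new Error('Cannot use --viewport with --clip');
  }
  if (req.target) return 'element';   // locator.screenshot()
  if (req.clip) return 'clip';        // page.screenshot({ clip })
  return req.viewportOnly ? 'viewport' : 'full'; // fullPage: !viewportOnly
}

console.log(resolveScreenshotMode({}));                     // full
console.log(resolveScreenshotMode({ viewportOnly: true })); // viewport
console.log(resolveScreenshotMode({ target: '#hero' }));    // element
```

Because the element branch is checked before the viewport flag is consulted, `--viewport` combined with a selector is not rejected in this ordering; the element crop is taken and the flag has no effect.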
### Authentication ### Authentication
Each server session generates a random UUID as a bearer token. The token is written to the state file (`.gstack/browse.json`) with chmod 600. Every HTTP request must include `Authorization: Bearer <token>`. This prevents other processes on the machine from controlling the browser. Each server session generates a random UUID as a bearer token. The token is written to the state file (`.gstack/browse.json`) with chmod 600. Every HTTP request must include `Authorization: Bearer <token>`. This prevents other processes on the machine from controlling the browser.
+6
@@ -1,5 +1,11 @@
# Changelog # Changelog
## 0.3.7 — 2026-03-14
### Added
- **Screenshot element/region clipping** — `screenshot` command now supports element crop via CSS selector or @ref (`screenshot "#hero" out.png`, `screenshot @e3 out.png`), region clip (`screenshot --clip x,y,w,h out.png`), and viewport-only mode (`screenshot --viewport out.png`). Uses Playwright's native `locator.screenshot()` and `page.screenshot({ clip })`. Full page remains the default.
- 10 new tests covering all screenshot modes (viewport, CSS, @ref, clip) and error paths (unknown flag, mutual exclusion, invalid coords, path validation, nonexistent selector).
## 0.3.6 — 2026-03-14 ## 0.3.6 — 2026-03-14
### Added ### Added
+12 -1
@@ -128,6 +128,17 @@ $B viewport 375x812 # iPhone
$B screenshot /tmp/mobile.png $B screenshot /tmp/mobile.png
$B viewport 1440x900 # Desktop $B viewport 1440x900 # Desktop
$B screenshot /tmp/desktop.png $B screenshot /tmp/desktop.png
# Element screenshot (crop to specific element)
$B screenshot "#hero-banner" /tmp/hero.png
$B snapshot -i
$B screenshot @e3 /tmp/button.png
# Region crop
$B screenshot --clip 0,0,800,600 /tmp/above-fold.png
# Viewport only (no scroll)
$B screenshot --viewport /tmp/viewport.png
``` ```
### Test file upload ### Test file upload
@@ -337,7 +348,7 @@ Refs are invalidated on navigation — run `snapshot` again after `goto`.
| `diff <url1> <url2>` | Text diff between pages | | `diff <url1> <url2>` | Text diff between pages |
| `pdf [path]` | Save as PDF | | `pdf [path]` | Save as PDF |
| `responsive [prefix]` | Screenshots at mobile (375x812), tablet (768x1024), desktop (1280x720). Saves as {prefix}-mobile.png etc. | | `responsive [prefix]` | Screenshots at mobile (375x812), tablet (768x1024), desktop (1280x720). Saves as {prefix}-mobile.png etc. |
| `screenshot [path]` | Save screenshot | | `screenshot [--viewport] [--clip x,y,w,h] [selector\|@ref] [path]` | Save screenshot (supports element crop via CSS/@ref, --clip region, --viewport) |
### Snapshot ### Snapshot
| Command | Description | | Command | Description |
+11
@@ -102,6 +102,17 @@ $B viewport 375x812 # iPhone
$B screenshot /tmp/mobile.png $B screenshot /tmp/mobile.png
$B viewport 1440x900 # Desktop $B viewport 1440x900 # Desktop
$B screenshot /tmp/desktop.png $B screenshot /tmp/desktop.png
# Element screenshot (crop to specific element)
$B screenshot "#hero-banner" /tmp/hero.png
$B snapshot -i
$B screenshot @e3 /tmp/button.png
# Region crop
$B screenshot --clip 0,0,800,600 /tmp/above-fold.png
# Viewport only (no scroll)
$B screenshot --viewport /tmp/viewport.png
``` ```
### Test file upload ### Test file upload
+1 -1
@@ -1 +1 @@
0.3.6 0.3.7
+1 -1
@@ -227,7 +227,7 @@ Refs are invalidated on navigation — run `snapshot` again after `goto`.
| `diff <url1> <url2>` | Text diff between pages | | `diff <url1> <url2>` | Text diff between pages |
| `pdf [path]` | Save as PDF | | `pdf [path]` | Save as PDF |
| `responsive [prefix]` | Screenshots at mobile (375x812), tablet (768x1024), desktop (1280x720). Saves as {prefix}-mobile.png etc. | | `responsive [prefix]` | Screenshots at mobile (375x812), tablet (768x1024), desktop (1280x720). Saves as {prefix}-mobile.png etc. |
| `screenshot [path]` | Save screenshot | | `screenshot [--viewport] [--clip x,y,w,h] [selector\|@ref] [path]` | Save screenshot (supports element crop via CSS/@ref, --clip region, --viewport) |
### Snapshot ### Snapshot
| Command | Description | | Command | Description |
+2 -1
@@ -283,7 +283,8 @@ Inspection: js <expr> | eval <file> | css <sel> <prop> | attrs <sel>
console [--clear|--errors] | network [--clear] | dialog [--clear] console [--clear|--errors] | network [--clear] | dialog [--clear]
cookies | storage [set <k> <v>] | perf cookies | storage [set <k> <v>] | perf
is <prop> <sel> (visible|hidden|enabled|disabled|checked|editable|focused) is <prop> <sel> (visible|hidden|enabled|disabled|checked|editable|focused)
Visual: screenshot [path] | pdf [path] | responsive [prefix] Visual: screenshot [--viewport] [--clip x,y,w,h] [@ref|sel] [path]
pdf [path] | responsive [prefix]
Snapshot: snapshot [-i] [-c] [-d N] [-s sel] [-D] [-a] [-o path] [-C] Snapshot: snapshot [-i] [-c] [-d N] [-s sel] [-D] [-a] [-o path] [-C]
-D/--diff: diff against previous snapshot -D/--diff: diff against previous snapshot
-a/--annotate: annotated screenshot with ref labels -a/--annotate: annotated screenshot with ref labels
+1 -1
@@ -78,7 +78,7 @@ export const COMMAND_DESCRIPTIONS: Record<string, { category: string; descriptio
'dialog-accept': { category: 'Interaction', description: 'Auto-accept next alert/confirm/prompt. Optional text is sent as the prompt response', usage: 'dialog-accept [text]' }, 'dialog-accept': { category: 'Interaction', description: 'Auto-accept next alert/confirm/prompt. Optional text is sent as the prompt response', usage: 'dialog-accept [text]' },
'dialog-dismiss': { category: 'Interaction', description: 'Auto-dismiss next dialog' }, 'dialog-dismiss': { category: 'Interaction', description: 'Auto-dismiss next dialog' },
// Visual // Visual
'screenshot': { category: 'Visual', description: 'Save screenshot', usage: 'screenshot [path]' }, 'screenshot': { category: 'Visual', description: 'Save screenshot (supports element crop via CSS/@ref, --clip region, --viewport)', usage: 'screenshot [--viewport] [--clip x,y,w,h] [selector|@ref] [path]' },
'pdf': { category: 'Visual', description: 'Save as PDF', usage: 'pdf [path]' }, 'pdf': { category: 'Visual', description: 'Save as PDF', usage: 'pdf [path]' },
'responsive': { category: 'Visual', description: 'Screenshots at mobile (375x812), tablet (768x1024), desktop (1280x720). Saves as {prefix}-mobile.png etc.', usage: 'responsive [prefix]' }, 'responsive': { category: 'Visual', description: 'Screenshots at mobile (375x812), tablet (768x1024), desktop (1280x720). Saves as {prefix}-mobile.png etc.', usage: 'responsive [prefix]' },
'diff': { category: 'Visual', description: 'Text diff between pages', usage: 'diff <url1> <url2>' }, 'diff': { category: 'Visual', description: 'Text diff between pages', usage: 'diff <url1> <url2>' },
+56 -4
@@ -106,11 +106,63 @@ export async function handleMetaCommand(
// ─── Visual ──────────────────────────────────────── // ─── Visual ────────────────────────────────────────
case 'screenshot': { case 'screenshot': {
// Parse priority: flags (--viewport, --clip) → selector (@ref, CSS) → output path
const page = bm.getPage(); const page = bm.getPage();
const screenshotPath = args[0] || '/tmp/browse-screenshot.png'; let outputPath = '/tmp/browse-screenshot.png';
validateOutputPath(screenshotPath); let clipRect: { x: number; y: number; width: number; height: number } | undefined;
await page.screenshot({ path: screenshotPath, fullPage: true }); let targetSelector: string | undefined;
return `Screenshot saved: ${screenshotPath}`; let viewportOnly = false;
const remaining: string[] = [];
for (let i = 0; i < args.length; i++) {
if (args[i] === '--viewport') {
viewportOnly = true;
} else if (args[i] === '--clip') {
const coords = args[++i];
if (!coords) throw new Error('Usage: screenshot --clip x,y,w,h [path]');
const parts = coords.split(',').map(Number);
if (parts.length !== 4 || parts.some(isNaN))
throw new Error('Usage: screenshot --clip x,y,width,height — all must be numbers');
clipRect = { x: parts[0], y: parts[1], width: parts[2], height: parts[3] };
} else if (args[i].startsWith('--')) {
throw new Error(`Unknown screenshot flag: ${args[i]}`);
} else {
remaining.push(args[i]);
}
}
// Separate target (selector/@ref) from output path
for (const arg of remaining) {
if (arg.startsWith('@e') || arg.startsWith('@c') || arg.startsWith('.') || arg.startsWith('#') || arg.includes('[')) {
targetSelector = arg;
} else {
outputPath = arg;
}
}
validateOutputPath(outputPath);
if (clipRect && targetSelector) {
throw new Error('Cannot use --clip with a selector/ref — choose one');
}
if (viewportOnly && clipRect) {
throw new Error('Cannot use --viewport with --clip — choose one');
}
if (targetSelector) {
const resolved = bm.resolveRef(targetSelector);
const locator = 'locator' in resolved ? resolved.locator : page.locator(resolved.selector);
await locator.screenshot({ path: outputPath, timeout: 5000 });
return `Screenshot saved (element): ${outputPath}`;
}
if (clipRect) {
await page.screenshot({ path: outputPath, clip: clipRect });
return `Screenshot saved (clip ${clipRect.x},${clipRect.y},${clipRect.width},${clipRect.height}): ${outputPath}`;
}
await page.screenshot({ path: outputPath, fullPage: !viewportOnly });
return `Screenshot saved${viewportOnly ? ' (viewport)' : ''}: ${outputPath}`;
} }
case 'pdf': { case 'pdf': {
+101
@@ -315,6 +315,107 @@ describe('Visual', () => {
fs.unlinkSync(screenshotPath); fs.unlinkSync(screenshotPath);
}); });
test('screenshot --viewport saves viewport-only', async () => {
await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm);
const p = '/tmp/browse-test-viewport.png';
const result = await handleMetaCommand('screenshot', ['--viewport', p], bm, async () => {});
expect(result).toContain('Screenshot saved (viewport)');
expect(fs.existsSync(p)).toBe(true);
expect(fs.statSync(p).size).toBeGreaterThan(1000);
fs.unlinkSync(p);
});
test('screenshot with CSS selector crops to element', async () => {
await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm);
const p = '/tmp/browse-test-element-css.png';
const result = await handleMetaCommand('screenshot', ['#title', p], bm, async () => {});
expect(result).toContain('Screenshot saved (element)');
expect(fs.existsSync(p)).toBe(true);
expect(fs.statSync(p).size).toBeGreaterThan(100);
fs.unlinkSync(p);
});
test('screenshot with @ref crops to element', async () => {
await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm);
await handleMetaCommand('snapshot', [], bm, async () => {});
const p = '/tmp/browse-test-element-ref.png';
const result = await handleMetaCommand('screenshot', ['@e1', p], bm, async () => {});
expect(result).toContain('Screenshot saved (element)');
expect(fs.existsSync(p)).toBe(true);
expect(fs.statSync(p).size).toBeGreaterThan(100);
fs.unlinkSync(p);
});
test('screenshot --clip crops to region', async () => {
await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm);
const p = '/tmp/browse-test-clip.png';
const result = await handleMetaCommand('screenshot', ['--clip', '0,0,100,100', p], bm, async () => {});
expect(result).toContain('Screenshot saved (clip 0,0,100,100)');
expect(fs.existsSync(p)).toBe(true);
expect(fs.statSync(p).size).toBeGreaterThan(100);
fs.unlinkSync(p);
});
test('screenshot --clip + selector throws', async () => {
await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm);
try {
await handleMetaCommand('screenshot', ['--clip', '0,0,100,100', '#title'], bm, async () => {});
expect(true).toBe(false);
} catch (err: any) {
expect(err.message).toContain('Cannot use --clip with a selector/ref');
}
});
test('screenshot --viewport + --clip throws', async () => {
await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm);
try {
await handleMetaCommand('screenshot', ['--viewport', '--clip', '0,0,100,100'], bm, async () => {});
expect(true).toBe(false);
} catch (err: any) {
expect(err.message).toContain('Cannot use --viewport with --clip');
}
});
test('screenshot --clip with invalid coords throws', async () => {
await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm);
try {
await handleMetaCommand('screenshot', ['--clip', 'abc'], bm, async () => {});
expect(true).toBe(false);
} catch (err: any) {
expect(err.message).toContain('all must be numbers');
}
});
test('screenshot unknown flag throws', async () => {
await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm);
try {
await handleMetaCommand('screenshot', ['--bogus', '/tmp/foo.png'], bm, async () => {});
expect(true).toBe(false);
} catch (err: any) {
expect(err.message).toContain('Unknown screenshot flag');
}
});
test('screenshot --viewport still validates path', async () => {
await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm);
try {
await handleMetaCommand('screenshot', ['--viewport', '/etc/evil.png'], bm, async () => {});
expect(true).toBe(false);
} catch (err: any) {
expect(err.message).toContain('Path must be within');
}
});
test('screenshot with nonexistent selector throws timeout', async () => {
await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm);
// Nothing matches the selector, so locator.screenshot() should reject once its 5000 ms timeout elapses.
await expect(handleMetaCommand('screenshot', ['.nonexistent-element-xyz'], bm, async () => {})).rejects.toThrow();
}, 10000);
test('responsive saves 3 screenshots', async () => { test('responsive saves 3 screenshots', async () => {
await handleWriteCommand('goto', [baseUrl + '/responsive.html'], bm); await handleWriteCommand('goto', [baseUrl + '/responsive.html'], bm);
const prefix = '/tmp/browse-test-resp'; const prefix = '/tmp/browse-test-resp';