diff --git a/CHANGELOG.md b/CHANGELOG.md
index d89b840f..d687fdb9 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,10 +1,28 @@
 # Changelog
 
-## [0.15.17.0] - 2026-04-07
+## [0.16.1.0] - 2026-04-08
 
 ### Fixed
 - Cookie picker no longer leaks the browse server auth token. Previously, opening the cookie picker page exposed the master bearer token in the HTML source, letting any local process extract it and execute arbitrary JavaScript in your browser session. Now uses a one-time code exchange with an HttpOnly session cookie. The token never appears in HTML, URLs, or browser history. (Reported by Horoshi at Vagabond Research, CVSS 7.8)
 
+## [0.16.0.0] - 2026-04-07
+
+### Added
+- **Browser data platform.** Six new browse commands that turn gstack browser from "a thing that clicks buttons" into a full scraping and data extraction tool for AI agents.
+- `media` command: discover every image, video, and audio element on a page. Returns URLs, dimensions, srcset, lazy-load state, and detects HLS/DASH streams. Filter with `--images`, `--videos`, `--audio`, or scope with a CSS selector.
+- `data` command: extract structured data embedded in pages. JSON-LD (product prices, recipes, events), Open Graph, Twitter Cards, and meta tags. One command gives you what used to take 50 lines of DOM scraping.
+- `download` command: fetch any URL or `@ref` element to disk using the browser's session cookies. Handles blob URLs via in-page base64 conversion. `--base64` flag returns inline data URI for remote agents. Detects HLS/DASH and tells you to use yt-dlp instead of silently failing.
+- `scrape` command: bulk download all media from a page. Combines `media` discovery + `download` in a loop with URL deduplication, configurable limits, and a `manifest.json` for machine consumption.
+- `archive` command: save complete pages as MHTML via CDP. One command, full page with all resources.
+- `scroll --times N`: automated repeated scrolling for infinite feed content loading. Configurable delay between scrolls with `--wait`.
+- `screenshot --base64`: return screenshots as inline data URIs instead of file paths. Eliminates the two-step screenshot-then-file-serve dance for remote agents.
+- **Network response body capture.** `network --capture` intercepts API response bodies so agents get structured JSON instead of fragile DOM scraping. Filter by URL pattern (`--filter graphql`), export as JSONL (`--export`), view summary (`--bodies`). 50MB size-capped buffer with automatic eviction.
+- `GET /file` endpoint: remote paired agents can now retrieve downloaded files (images, scraped media, screenshots) over HTTP. TEMP_DIR only to prevent project file exfiltration. Bearer token auth, MIME detection, zero-copy streaming via `Bun.file()`.
+
+### Changed
+- Paired agents now get full access by default (read+write+admin+meta). The trust boundary is the pairing ceremony, not the scope. An agent that can click any button doesn't gain meaningful attack surface from also being able to run `js`. Browser-wide destructive commands (stop, restart, disconnect) moved to new `control` scope, still opt-in via `--control`.
+- Path validation extracted to shared `path-security.ts` module. Was duplicated across three files with slightly different implementations. Now one source of truth with `validateOutputPath`, `validateReadPath`, and `validateTempPath`.
+
 ## [0.15.16.0] - 2026-04-06
 
 ### Added
diff --git a/SKILL.md b/SKILL.md
index 3d951a67..94ba826b 100644
--- a/SKILL.md
+++ b/SKILL.md
@@ -773,11 +773,20 @@ Refs are invalidated on navigation — run `snapshot` again after `goto`.
 | Command | Description |
 |---------|-------------|
 | `accessibility` | Full ARIA tree |
+| `data [--jsonld\|--og\|--meta\|--twitter]` | Structured data: JSON-LD, Open Graph, Twitter Cards, meta tags |
 | `forms` | Form fields as JSON |
 | `html [selector]` | innerHTML of selector (throws if not found), or full page HTML if no selector given |
 | `links` | All links as "text → href" |
+| `media [--images\|--videos\|--audio] [selector]` | All media elements (images, videos, audio) with URLs, dimensions, types |
 | `text` | Cleaned page text |
 
+### Extraction
+| Command | Description |
+|---------|-------------|
+| `archive [path]` | Save complete page as MHTML via CDP |
+| `download <url\|@ref> [path] [--base64]` | Download URL or media element to disk using browser cookies |
+| `scrape [--selector sel] [--dir path] [--limit N]` | Bulk download all media from page. Writes manifest.json |
+
 ### Interaction
 | Command | Description |
 |---------|-------------|
diff --git a/VERSION b/VERSION
index 4a2a39e3..84c82737 100644
--- a/VERSION
+++ b/VERSION
@@ -1 +1 @@
-0.15.17.0
+0.16.1.0
diff --git a/browse/SKILL.md b/browse/SKILL.md
index 5bc9b02b..420e2b0b 100644
--- a/browse/SKILL.md
+++ b/browse/SKILL.md
@@ -665,11 +665,20 @@ $B prettyscreenshot --cleanup --scroll-to ".pricing" --width 1440 ~/Desktop/hero
 | Command | Description |
 |---------|-------------|
 | `accessibility` | Full ARIA tree |
+| `data [--jsonld\|--og\|--meta\|--twitter]` | Structured data: JSON-LD, Open Graph, Twitter Cards, meta tags |
 | `forms` | Form fields as JSON |
 | `html [selector]` | innerHTML of selector (throws if not found), or full page HTML if no selector given |
 | `links` | All links as "text → href" |
+| `media [--images\|--videos\|--audio] [selector]` | All media elements (images, videos, audio) with URLs, dimensions, types |
 | `text` | Cleaned page text |
 
+### Extraction
+| Command | Description |
+|---------|-------------|
+| `archive [path]` | Save complete page as MHTML via CDP |
+| `download <url\|@ref> [path] [--base64]` | Download URL or media element to disk using browser cookies |
+| `scrape [--selector sel] [--dir path] [--limit N]` | Bulk download all media from page. Writes manifest.json |
+
 ### Interaction
 | Command | Description |
 |---------|-------------|
diff --git a/browse/src/cli.ts b/browse/src/cli.ts
index bbd5c733..0f6210a2 100644
--- a/browse/src/cli.ts
+++ b/browse/src/cli.ts
@@ -566,7 +566,7 @@ COMMAND REFERENCE:
 New tab: {"command": "newtab", "args": ["URL"]}
 
 SCOPES: ${scopeDesc}.
-${scopes.includes('admin') ? '' : `To get admin access (JS, cookies, storage), ask the user to re-pair with --admin.\n`}
+${scopes.includes('control') ? '' : `To get browser control access (stop, restart, disconnect), ask the user to re-pair with --control.\n`}
 
 TOKEN: Expires ${expiresAt}. Revoke: ask the user to run $B tunnel revoke
 
@@ -591,10 +591,13 @@ function hasFlag(args: string[], flag: string): boolean {
 async function handlePairAgent(state: ServerState, args: string[]): Promise<void> {
   const clientName = parseFlag(args, '--client') || `remote-${Date.now()}`;
   const domains = parseFlag(args, '--domain')?.split(',').map(d => d.trim());
-  const admin = hasFlag(args, '--admin');
+  const control = hasFlag(args, '--control') || hasFlag(args, '--admin');
+  const restrict = parseFlag(args, '--restrict');
   const localHost = parseFlag(args, '--local');
 
   // Call POST /pair to create a setup key
+  // Default: full access (read+write+admin+meta). --control adds browser-wide ops.
+  // --restrict limits: --restrict read (read-only), --restrict "read,write" (no admin)
   const pairResp = await fetch(`http://127.0.0.1:${state.port}/pair`, {
     method: 'POST',
     headers: {
@@ -603,9 +606,9 @@ async function handlePairAgent(state: ServerState, args: string[]): Promise<void> {
-      admin,
+      control,
+      ...(restrict ? { scopes: restrict.split(',').map(s => s.trim()) } : {}),
     }),
     signal: AbortSignal.timeout(5000),
   });
diff --git a/browse/src/commands.ts b/browse/src/commands.ts
index ceb089f3..eacdf0cd 100644
--- a/browse/src/commands.ts
+++ b/browse/src/commands.ts
@@ -16,6 +16,7 @@ export const READ_COMMANDS = new Set([
   'console', 'network', 'cookies', 'storage', 'perf', 'dialog',
   'is', 'inspect',
+  'media', 'data',
 ]);
 
 export const WRITE_COMMANDS = new Set([
@@ -24,6 +25,7 @@ export const WRITE_COMMANDS = new Set([
   'viewport', 'cookie', 'cookie-import', 'cookie-import-browser',
   'header', 'useragent', 'upload', 'dialog-accept', 'dialog-dismiss',
   'style', 'cleanup', 'prettyscreenshot',
+  'download', 'scrape', 'archive',
 ]);
 
 export const META_COMMANDS = new Set([
@@ -46,6 +48,7 @@ export const ALL_COMMANDS = new Set([...READ_COMMANDS, ...WRITE_COMMANDS, ...META_COMMANDS]);
 export const PAGE_CONTENT_COMMANDS = new Set([
   'text', 'html', 'links', 'forms', 'accessibility', 'attrs',
   'console', 'dialog',
+  'media', 'data',
 ]);
 
 /** Wrap output from untrusted-content commands with trust boundary markers */
@@ -70,6 +73,8 @@ export const COMMAND_DESCRIPTIONS: Record<string, { category: string; description: string; usage?: string }> = {
+  'media': { category: 'Inspection', description: 'All media elements (images, videos, audio) with URLs, dimensions, types', usage: 'media [--images|--videos|--audio] [selector]' },
+  'data': { category: 'Inspection', description: 'Structured data: JSON-LD, Open Graph, Twitter Cards, meta tags', usage: 'data [--jsonld|--og|--meta|--twitter]' },
   'eval': { category: 'Inspection', description: 'Run JavaScript from file and return result as string (path must be under /tmp or cwd)', usage: 'eval <file>' },
@@ -100,6 +105,10 @@ export const COMMAND_DESCRIPTIONS: Record<string, { category: string; description: string; usage?: string }> = {
   'dialog-accept': { category: 'Interaction', description: 'Auto-accept next alert/confirm/prompt. Optional text is sent as the prompt response', usage: 'dialog-accept [text]' },
   'dialog-dismiss': { category: 'Interaction', description: 'Auto-dismiss next dialog' },
+  // Data extraction
+  'download': { category: 'Extraction', description: 'Download URL or media element to disk using browser cookies', usage: 'download <url|@ref> [path] [--base64]' },
+  'scrape': { category: 'Extraction', description: 'Bulk download all media from page. Writes manifest.json', usage: 'scrape [--selector sel] [--dir path] [--limit N]' },
+  'archive': { category: 'Extraction', description: 'Save complete page as MHTML via CDP', usage: 'archive [path]' },
 
   // Visual
   'screenshot': { category: 'Visual', description: 'Save screenshot (supports element crop via CSS/@ref, --clip region, --viewport)', usage: 'screenshot [--viewport] [--clip x,y,w,h] [selector|@ref] [path]' },
   'pdf': { category: 'Visual', description: 'Save as PDF', usage: 'pdf [path]' },
diff --git a/browse/src/media-extract.ts b/browse/src/media-extract.ts
new file mode 100644
index 00000000..4ff9b252
--- /dev/null
+++ b/browse/src/media-extract.ts
@@ -0,0 +1,177 @@
+/**
+ * Media extraction helper — shared between `media` (read) and `scrape` (write) commands.
+ *
+ * Runs page.evaluate() to discover all media elements on the page:
+ * - <img> with src, srcset, currentSrc, alt, dimensions, loading, data-src
+ * -