feat: browser data platform for AI agents (v0.16.0.0) (#907)

* refactor: extract path-security.ts shared module

validateOutputPath, validateReadPath, and SAFE_DIRECTORIES were duplicated
across write-commands.ts, meta-commands.ts, and read-commands.ts. Extract
to a single shared module with re-exports for backward compatibility.

Also adds validateTempPath() for the upcoming GET /file endpoint (TEMP_DIR
only, not cwd, to prevent remote agents from reading project files).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: default paired agents to full access, split SCOPE_CONTROL

The trust boundary for paired agents is the pairing ceremony itself, not
the scope. An agent with write scope can already click anything and navigate
anywhere. Gating js/cookies behind --admin was security theater.

Changes:
- Default pair scopes: read+write+admin+meta (was read+write)
- New SCOPE_CONTROL for browser-wide destructive ops (stop, restart,
  disconnect, state, handoff, resume, connect)
- --admin flag now grants control scope (backward compat)
- New --restrict flag for limited access (e.g., --restrict read)
- Updated hint text: "re-pair with --control" instead of "--admin"

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add media and data commands for page content extraction

media command: discovers all img/video/audio/background-image elements
on the page. Returns JSON with URLs, dimensions, srcset, loading state,
HLS/DASH detection. Supports --images/--videos/--audio filters and
optional CSS selector scoping.

data command: extracts structured data embedded in pages (JSON-LD,
Open Graph, Twitter Cards, meta tags). One command returns product
prices, article metadata, social share info without DOM scraping.

Both are READ scope with untrusted content wrapping.
Shared media-extract.ts helper for reuse by the upcoming scrape command.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add download, scrape, and archive commands

download: fetch any URL or @ref element to disk using browser session
cookies via page.request.fetch(). Supports blob: URLs via in-page
base64 conversion. --base64 flag returns inline data URI (cap 10MB).
Detects HLS/DASH and rejects with yt-dlp hint.

scrape: bulk media download composing media discovery + download loop.
Sequential with 100ms delay, URL deduplication, configurable --limit.
Writes manifest.json with per-file metadata for machine consumption.

archive: saves complete page as MHTML via CDP Page.captureSnapshot.
No silent fallback -- errors clearly if CDP unavailable.

All three are WRITE scope (write to disk, blocked in watch mode).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add GET /file endpoint for remote agent file retrieval

Remote paired agents can now retrieve downloaded files over HTTP.
TEMP_DIR only (not cwd) to prevent project file exfiltration.

- Bearer token auth (root or scoped with read scope)
- Path validation via validateTempPath() (symlink-aware)
- 200MB size cap
- Extension-based MIME detection
- Zero-copy streaming via Bun.file()

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add scroll --times N for automated repeated scrolling

Extends the scroll command with --times N flag for infinite feed
scraping. Scrolls N times with configurable --wait delay (default
1000ms) between each scroll for content loading.

Usage: scroll --times 10
       scroll --times 5 --wait 2000
       scroll --times 3 .feed-container

Composable with scrape: scroll to load content, then scrape images.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add network response body capture (--capture/--export/--bodies)

The killer feature for social media scraping. Extends the existing
network command to intercept API response bodies:

  network --capture [--filter graphql]  # start capturing
  network --capture stop                # stop
  network --export /tmp/api.jsonl       # export as JSONL
  network --bodies                      # show summary

Uses page.on('response') listener with URL pattern filtering.
SizeCappedBuffer (50MB total, 5MB per-entry cap) evicts oldest
entries when full. Binary responses stored as base64, text as-is.

This lets agents tap Instagram's GraphQL API, TikTok's hydration
data, and any SPA's internal API responses instead of fragile DOM
scraping.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add screenshot --base64 for inline image return

Returns data:image/png;base64,... instead of writing to disk.
Cap at 10MB. Works with all screenshot modes (element, clip, viewport).

Eliminates the two-step screenshot+file-serve dance for remote agents.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: add data platform tests and media fixture

Tests for SizeCappedBuffer (eviction, export, summary), validateTempPath
(TEMP_DIR only, rejects cwd), command registration (all new commands in
correct scope sets), and MIME mapping source checks.

Rich HTML fixture with: standard images, lazy-loaded images, srcset,
video with sources + HLS, audio, CSS background-images, JSON-LD,
Open Graph, Twitter Cards, and meta tags.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: regenerate SKILL.md with Extraction category

Add Extraction category to browse command table ordering. Regenerate
SKILL.md files to include media, data, download, scrape, archive
commands in the generated documentation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.16.0.0)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
Garry Tan
2026-04-08 00:41:55 -07:00
committed by GitHub
parent 9d34baa973
commit b73f364411
20 changed files with 1264 additions and 166 deletions
+18
View File
@@ -1,5 +1,23 @@
# Changelog
## [0.16.0.0] - 2026-04-07
### Added
- **Browser data platform.** Six new browse commands that turn gstack browser from "a thing that clicks buttons" into a full scraping and data extraction tool for AI agents.
- `media` command: discover every image, video, and audio element on a page. Returns URLs, dimensions, srcset, lazy-load state, and detects HLS/DASH streams. Filter with `--images`, `--videos`, `--audio`, or scope with a CSS selector.
- `data` command: extract structured data embedded in pages. JSON-LD (product prices, recipes, events), Open Graph, Twitter Cards, and meta tags. One command gives you what used to take 50 lines of DOM scraping.
- `download` command: fetch any URL or `@ref` element to disk using the browser's session cookies. Handles blob URLs via in-page base64 conversion. `--base64` flag returns inline data URI for remote agents. Detects HLS/DASH and tells you to use yt-dlp instead of silently failing.
- `scrape` command: bulk download all media from a page. Combines `media` discovery + `download` in a loop with URL deduplication, configurable limits, and a `manifest.json` for machine consumption.
- `archive` command: save complete pages as MHTML via CDP. One command, full page with all resources.
- `scroll --times N`: automated repeated scrolling for infinite feed content loading. Configurable delay between scrolls with `--wait`.
- `screenshot --base64`: return screenshots as inline data URIs instead of file paths. Eliminates the two-step screenshot-then-file-serve dance for remote agents.
- **Network response body capture.** `network --capture` intercepts API response bodies so agents get structured JSON instead of fragile DOM scraping. Filter by URL pattern (`--filter graphql`), export as JSONL (`--export`), view summary (`--bodies`). 50MB size-capped buffer with automatic eviction.
- `GET /file` endpoint: remote paired agents can now retrieve downloaded files (images, scraped media, screenshots) over HTTP. TEMP_DIR only to prevent project file exfiltration. Bearer token auth, MIME detection, zero-copy streaming via `Bun.file()`.
### Changed
- Paired agents now get full access by default (read+write+admin+meta). The trust boundary is the pairing ceremony, not the scope. An agent that can click any button doesn't gain meaningful attack surface from also being able to run `js`. Browser-wide destructive commands (stop, restart, disconnect) moved to new `control` scope, still opt-in via `--control`.
- Path validation extracted to shared `path-security.ts` module. Was duplicated across three files with slightly different implementations. Now one source of truth with `validateOutputPath`, `validateReadPath`, and `validateTempPath`.
## [0.15.16.0] - 2026-04-06
### Added
+9
View File
@@ -773,11 +773,20 @@ Refs are invalidated on navigation — run `snapshot` again after `goto`.
| Command | Description |
|---------|-------------|
| `accessibility` | Full ARIA tree |
| `data [--jsonld|--og|--meta|--twitter]` | Structured data: JSON-LD, Open Graph, Twitter Cards, meta tags |
| `forms` | Form fields as JSON |
| `html [selector]` | innerHTML of selector (throws if not found), or full page HTML if no selector given |
| `links` | All links as "text → href" |
| `media [--images|--videos|--audio] [selector]` | All media elements (images, videos, audio) with URLs, dimensions, types |
| `text` | Cleaned page text |
### Extraction
| Command | Description |
|---------|-------------|
| `archive [path]` | Save complete page as MHTML via CDP |
| `download <url|@ref> [path] [--base64]` | Download URL or media element to disk using browser cookies |
| `scrape <images|videos|media> [--selector sel] [--dir path] [--limit N]` | Bulk download all media from page. Writes manifest.json |
### Interaction
| Command | Description |
|---------|-------------|
+1 -1
View File
@@ -1 +1 @@
0.15.16.0
0.16.0.0
+9
View File
@@ -665,11 +665,20 @@ $B prettyscreenshot --cleanup --scroll-to ".pricing" --width 1440 ~/Desktop/hero
| Command | Description |
|---------|-------------|
| `accessibility` | Full ARIA tree |
| `data [--jsonld|--og|--meta|--twitter]` | Structured data: JSON-LD, Open Graph, Twitter Cards, meta tags |
| `forms` | Form fields as JSON |
| `html [selector]` | innerHTML of selector (throws if not found), or full page HTML if no selector given |
| `links` | All links as "text → href" |
| `media [--images|--videos|--audio] [selector]` | All media elements (images, videos, audio) with URLs, dimensions, types |
| `text` | Cleaned page text |
### Extraction
| Command | Description |
|---------|-------------|
| `archive [path]` | Save complete page as MHTML via CDP |
| `download <url|@ref> [path] [--base64]` | Download URL or media element to disk using browser cookies |
| `scrape <images|videos|media> [--selector sel] [--dir path] [--limit N]` | Bulk download all media from page. Writes manifest.json |
### Interaction
| Command | Description |
|---------|-------------|
+7 -4
View File
@@ -566,7 +566,7 @@ COMMAND REFERENCE:
New tab: {"command": "newtab", "args": ["URL"]}
SCOPES: ${scopeDesc}.
${scopes.includes('admin') ? '' : `To get admin access (JS, cookies, storage), ask the user to re-pair with --admin.\n`}
${scopes.includes('control') ? '' : `To get browser control access (stop, restart, disconnect), ask the user to re-pair with --control.\n`}
TOKEN: Expires ${expiresAt}. Revoke: ask the user to run
$B tunnel revoke <your-name>
@@ -591,10 +591,13 @@ function hasFlag(args: string[], flag: string): boolean {
async function handlePairAgent(state: ServerState, args: string[]): Promise<void> {
const clientName = parseFlag(args, '--client') || `remote-${Date.now()}`;
const domains = parseFlag(args, '--domain')?.split(',').map(d => d.trim());
const admin = hasFlag(args, '--admin');
const control = hasFlag(args, '--control') || hasFlag(args, '--admin');
const restrict = parseFlag(args, '--restrict');
const localHost = parseFlag(args, '--local');
// Call POST /pair to create a setup key
// Default: full access (read+write+admin+meta). --control adds browser-wide ops.
// --restrict limits: --restrict read (read-only), --restrict "read,write" (no admin)
const pairResp = await fetch(`http://127.0.0.1:${state.port}/pair`, {
method: 'POST',
headers: {
@@ -603,9 +606,9 @@ async function handlePairAgent(state: ServerState, args: string[]): Promise<void
},
body: JSON.stringify({
domains,
clientId: clientName,
admin,
control,
...(restrict ? { scopes: restrict.split(',').map(s => s.trim()) } : {}),
}),
signal: AbortSignal.timeout(5000),
});
+9
View File
@@ -16,6 +16,7 @@ export const READ_COMMANDS = new Set([
'console', 'network', 'cookies', 'storage', 'perf',
'dialog', 'is',
'inspect',
'media', 'data',
]);
export const WRITE_COMMANDS = new Set([
@@ -24,6 +25,7 @@ export const WRITE_COMMANDS = new Set([
'viewport', 'cookie', 'cookie-import', 'cookie-import-browser', 'header', 'useragent',
'upload', 'dialog-accept', 'dialog-dismiss',
'style', 'cleanup', 'prettyscreenshot',
'download', 'scrape', 'archive',
]);
export const META_COMMANDS = new Set([
@@ -46,6 +48,7 @@ export const ALL_COMMANDS = new Set([...READ_COMMANDS, ...WRITE_COMMANDS, ...MET
export const PAGE_CONTENT_COMMANDS = new Set([
'text', 'html', 'links', 'forms', 'accessibility', 'attrs',
'console', 'dialog',
'media', 'data',
]);
/** Wrap output from untrusted-content commands with trust boundary markers */
@@ -70,6 +73,8 @@ export const COMMAND_DESCRIPTIONS: Record<string, { category: string; descriptio
'links': { category: 'Reading', description: 'All links as "text → href"' },
'forms': { category: 'Reading', description: 'Form fields as JSON' },
'accessibility': { category: 'Reading', description: 'Full ARIA tree' },
'media': { category: 'Reading', description: 'All media elements (images, videos, audio) with URLs, dimensions, types', usage: 'media [--images|--videos|--audio] [selector]' },
'data': { category: 'Reading', description: 'Structured data: JSON-LD, Open Graph, Twitter Cards, meta tags', usage: 'data [--jsonld|--og|--meta|--twitter]' },
// Inspection
'js': { category: 'Inspection', description: 'Run JavaScript expression and return result as string', usage: 'js <expr>' },
'eval': { category: 'Inspection', description: 'Run JavaScript from file and return result as string (path must be under /tmp or cwd)', usage: 'eval <file>' },
@@ -100,6 +105,10 @@ export const COMMAND_DESCRIPTIONS: Record<string, { category: string; descriptio
'useragent': { category: 'Interaction', description: 'Set user agent', usage: 'useragent <string>' },
'dialog-accept': { category: 'Interaction', description: 'Auto-accept next alert/confirm/prompt. Optional text is sent as the prompt response', usage: 'dialog-accept [text]' },
'dialog-dismiss': { category: 'Interaction', description: 'Auto-dismiss next dialog' },
// Data extraction
'download': { category: 'Extraction', description: 'Download URL or media element to disk using browser cookies', usage: 'download <url|@ref> [path] [--base64]' },
'scrape': { category: 'Extraction', description: 'Bulk download all media from page. Writes manifest.json', usage: 'scrape <images|videos|media> [--selector sel] [--dir path] [--limit N]' },
'archive': { category: 'Extraction', description: 'Save complete page as MHTML via CDP', usage: 'archive [path]' },
// Visual
'screenshot': { category: 'Visual', description: 'Save screenshot (supports element crop via CSS/@ref, --clip region, --viewport)', usage: 'screenshot [--viewport] [--clip x,y,w,h] [selector|@ref] [path]' },
'pdf': { category: 'Visual', description: 'Save as PDF', usage: 'pdf [path]' },
+177
View File
@@ -0,0 +1,177 @@
/**
* Media extraction helper shared between `media` (read) and `scrape` (write) commands.
*
* Runs page.evaluate() to discover all media elements on the page:
* - <img> with src, srcset, currentSrc, alt, dimensions, loading, data-src
* - <video> with currentSrc, poster, duration, <source> children, HLS/DASH detection
* - <audio> with src, duration, type
* - CSS background-image (capped at 500 elements)
*/
import type { Page, Frame } from 'playwright';
export interface ImageInfo {
index: number;
src: string;
srcset: string;
currentSrc: string;
alt: string;
width: number;
height: number;
naturalWidth: number;
naturalHeight: number;
loading: string;
dataSrc: string;
visible: boolean;
}
export interface VideoSource {
src: string;
type: string;
}
export interface VideoInfo {
index: number;
src: string;
currentSrc: string;
poster: string;
width: number;
height: number;
duration: number;
type: string;
sources: VideoSource[];
isHLS: boolean;
isDASH: boolean;
}
export interface AudioInfo {
index: number;
src: string;
currentSrc: string;
duration: number;
type: string;
}
export interface BackgroundImageInfo {
index: number;
url: string;
selector: string;
element: string;
}
export interface MediaResult {
images: ImageInfo[];
videos: VideoInfo[];
audio: AudioInfo[];
backgroundImages: BackgroundImageInfo[];
total: number;
}
/** Extract all media elements from the page or a scoped subtree. */
export async function extractMedia(
target: Page | Frame,
options?: { selector?: string; filter?: 'images' | 'videos' | 'audio' },
): Promise<MediaResult> {
const result = await target.evaluate(({ scopeSelector, filter }) => {
const root = scopeSelector
? document.querySelector(scopeSelector) || document
: document;
const images: any[] = [];
const videos: any[] = [];
const audio: any[] = [];
const backgroundImages: any[] = [];
// Images
if (!filter || filter === 'images') {
const imgs = root.querySelectorAll('img');
imgs.forEach((img, i) => {
const rect = img.getBoundingClientRect();
images.push({
index: i,
src: img.src || '',
srcset: img.srcset || '',
currentSrc: img.currentSrc || '',
alt: img.alt || '',
width: img.width,
height: img.height,
naturalWidth: img.naturalWidth,
naturalHeight: img.naturalHeight,
loading: img.loading || '',
dataSrc: img.getAttribute('data-src') || img.getAttribute('data-lazy-src') || img.getAttribute('data-original') || '',
visible: rect.width > 0 && rect.height > 0 && rect.bottom > 0 && rect.right > 0,
});
});
}
// Videos
if (!filter || filter === 'videos') {
const vids = root.querySelectorAll('video');
vids.forEach((vid, i) => {
const sources = Array.from(vid.querySelectorAll('source')).map(s => ({
src: s.src || '',
type: s.type || '',
}));
const isHLS = sources.some(s => s.type.includes('mpegURL') || s.src.includes('.m3u8'));
const isDASH = sources.some(s => s.type.includes('dash') || s.src.includes('.mpd'));
videos.push({
index: i,
src: vid.src || '',
currentSrc: vid.currentSrc || '',
poster: vid.poster || '',
width: vid.videoWidth || vid.width,
height: vid.videoHeight || vid.height,
duration: isFinite(vid.duration) ? vid.duration : 0,
type: sources[0]?.type || '',
sources,
isHLS,
isDASH,
});
});
}
// Audio
if (!filter || filter === 'audio') {
const auds = root.querySelectorAll('audio');
auds.forEach((aud, i) => {
const source = aud.querySelector('source');
audio.push({
index: i,
src: aud.src || source?.src || '',
currentSrc: aud.currentSrc || '',
duration: isFinite(aud.duration) ? aud.duration : 0,
type: source?.type || '',
});
});
}
// Background images (capped at 500 elements for performance)
if (!filter || filter === 'images') {
const allElements = root.querySelectorAll('*');
let bgCount = 0;
for (let i = 0; i < allElements.length && bgCount < 500; i++) {
const el = allElements[i];
const bg = getComputedStyle(el).backgroundImage;
if (bg && bg !== 'none') {
const urlMatch = bg.match(/url\(["']?([^"')]+)["']?\)/);
if (urlMatch && urlMatch[1] && !urlMatch[1].startsWith('data:')) {
backgroundImages.push({
index: bgCount,
url: urlMatch[1],
selector: el.tagName.toLowerCase() + (el.id ? `#${el.id}` : '') + (el.className && typeof el.className === 'string' ? '.' + el.className.trim().split(/\s+/).join('.') : ''),
element: el.tagName.toLowerCase(),
});
bgCount++;
}
}
}
}
return { images, videos, audio, backgroundImages };
}, { scopeSelector: options?.selector || null, filter: options?.filter || null });
return {
...result,
total: result.images.length + result.videos.length + result.audio.length + result.backgroundImages.length,
};
}
+41 -52
View File
@@ -8,48 +8,16 @@ import { getCleanText } from './read-commands';
import { READ_COMMANDS, WRITE_COMMANDS, META_COMMANDS, PAGE_CONTENT_COMMANDS, wrapUntrustedContent } from './commands';
import { validateNavigationUrl } from './url-validation';
import { checkScope, type TokenInfo } from './token-registry';
import { validateOutputPath, escapeRegExp } from './path-security';
// Re-export for backward compatibility (tests import from meta-commands)
export { validateOutputPath, escapeRegExp } from './path-security';
import * as Diff from 'diff';
import * as fs from 'fs';
import * as path from 'path';
import { TEMP_DIR, isPathWithin } from './platform';
import { TEMP_DIR } from './platform';
import { resolveConfig } from './config';
import type { Frame } from 'playwright';
// Security: Path validation to prevent path traversal attacks
// Resolve safe directories through realpathSync to handle symlinks (e.g., macOS /tmp → /private/tmp)
const SAFE_DIRECTORIES = [TEMP_DIR, process.cwd()].map(d => {
try { return fs.realpathSync(d); } catch { return d; }
});
export function validateOutputPath(filePath: string): void {
const resolved = path.resolve(filePath);
// Resolve real path of the parent directory to catch symlinks.
// The file itself may not exist yet (e.g., screenshot output).
let dir = path.dirname(resolved);
let realDir: string;
try {
realDir = fs.realpathSync(dir);
} catch {
try {
realDir = fs.realpathSync(path.dirname(dir));
} catch {
throw new Error(`Path must be within: ${SAFE_DIRECTORIES.join(', ')}`);
}
}
const realResolved = path.join(realDir, path.basename(resolved));
const isSafe = SAFE_DIRECTORIES.some(dir => isPathWithin(realResolved, dir));
if (!isSafe) {
throw new Error(`Path must be within: ${SAFE_DIRECTORIES.join(', ')}`);
}
}
/** Escape special regex metacharacters in a user-supplied string to prevent ReDoS. */
export function escapeRegExp(s: string): string {
return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}
/** Tokenize a pipe segment respecting double-quoted strings. */
function tokenizePipeSegment(segment: string): string[] {
const tokens: string[] = [];
@@ -117,7 +85,7 @@ export async function handleMetaCommand(
// ─── Server Control ────────────────────────────────
case 'status': {
const page = session.getPage();
const page = bm.getPage();
const tabs = bm.getTabCount();
const mode = bm.getConnectionMode();
return [
@@ -147,17 +115,20 @@ export async function handleMetaCommand(
// ─── Visual ────────────────────────────────────────
case 'screenshot': {
// Parse priority: flags (--viewport, --clip) → selector (@ref, CSS) → output path
const page = session.getPage();
// Parse priority: flags (--viewport, --clip, --base64) → selector (@ref, CSS) → output path
const page = bm.getPage();
let outputPath = `${TEMP_DIR}/browse-screenshot.png`;
let clipRect: { x: number; y: number; width: number; height: number } | undefined;
let targetSelector: string | undefined;
let viewportOnly = false;
let base64Mode = false;
const remaining: string[] = [];
for (let i = 0; i < args.length; i++) {
if (args[i] === '--viewport') {
viewportOnly = true;
} else if (args[i] === '--base64') {
base64Mode = true;
} else if (args[i] === '--clip') {
const coords = args[++i];
if (!coords) throw new Error('Usage: screenshot --clip x,y,w,h [path]');
@@ -194,8 +165,26 @@ export async function handleMetaCommand(
throw new Error('Cannot use --viewport with --clip — choose one');
}
// --base64 mode: capture to buffer instead of disk
if (base64Mode) {
let buffer: Buffer;
if (targetSelector) {
const resolved = await bm.resolveRef(targetSelector);
const locator = 'locator' in resolved ? resolved.locator : page.locator(resolved.selector);
buffer = await locator.screenshot({ timeout: 5000 });
} else if (clipRect) {
buffer = await page.screenshot({ clip: clipRect });
} else {
buffer = await page.screenshot({ fullPage: !viewportOnly });
}
if (buffer.length > 10 * 1024 * 1024) {
throw new Error('Screenshot too large for --base64 (>10MB). Use disk path instead.');
}
return `data:image/png;base64,${buffer.toString('base64')}`;
}
if (targetSelector) {
const resolved = await session.resolveRef(targetSelector);
const resolved = await bm.resolveRef(targetSelector);
const locator = 'locator' in resolved ? resolved.locator : page.locator(resolved.selector);
await locator.screenshot({ path: outputPath, timeout: 5000 });
return `Screenshot saved (element): ${outputPath}`;
@@ -211,7 +200,7 @@ export async function handleMetaCommand(
}
case 'pdf': {
const page = session.getPage();
const page = bm.getPage();
const pdfPath = args[0] || `${TEMP_DIR}/browse-page.pdf`;
validateOutputPath(pdfPath);
await page.pdf({ path: pdfPath, format: 'A4' });
@@ -219,7 +208,7 @@ export async function handleMetaCommand(
}
case 'responsive': {
const page = session.getPage();
const page = bm.getPage();
const prefix = args[0] || `${TEMP_DIR}/browse-responsive`;
validateOutputPath(prefix);
const viewports = [
@@ -344,7 +333,7 @@ export async function handleMetaCommand(
// Wait for network to settle after write commands before returning
if (lastWasWrite) {
await session.getPage().waitForLoadState('networkidle', { timeout: 2000 }).catch(() => {});
await bm.getPage().waitForLoadState('networkidle', { timeout: 2000 }).catch(() => {});
}
return results.join('\n\n');
@@ -355,7 +344,7 @@ export async function handleMetaCommand(
const [url1, url2] = args;
if (!url1 || !url2) throw new Error('Usage: browse diff <url1> <url2>');
const page = session.getPage();
const page = bm.getPage();
await validateNavigationUrl(url1);
await page.goto(url1, { waitUntil: 'domcontentloaded', timeout: 15000 });
const text1 = await getCleanText(page);
@@ -454,7 +443,7 @@ export async function handleMetaCommand(
// If a ref was passed, scroll it into view
if (args.length > 0 && args[0].startsWith('@')) {
try {
const resolved = await session.resolveRef(args[0]);
const resolved = await bm.resolveRef(args[0]);
if ('locator' in resolved) {
await resolved.locator.scrollIntoViewIfNeeded({ timeout: 5000 });
return `Browser activated. Scrolled ${args[0]} into view.`;
@@ -611,7 +600,7 @@ export async function handleMetaCommand(
}
}
// Close existing pages, then restore (replace, not merge)
session.setFrame(null);
bm.setFrame(null);
await bm.closeAllPages();
await bm.restoreState({
cookies: validatedCookies,
@@ -629,12 +618,12 @@ export async function handleMetaCommand(
if (!target) throw new Error('Usage: frame <selector|@ref|--name name|--url pattern|main>');
if (target === 'main') {
session.setFrame(null);
session.clearRefs();
bm.setFrame(null);
bm.clearRefs();
return 'Switched to main frame';
}
const page = session.getPage();
const page = bm.getPage();
let frame: Frame | null = null;
if (target === '--name') {
@@ -645,7 +634,7 @@ export async function handleMetaCommand(
frame = page.frame({ url: new RegExp(escapeRegExp(args[1])) });
} else {
// CSS selector or @ref for the iframe element
const resolved = await session.resolveRef(target);
const resolved = await bm.resolveRef(target);
const locator = 'locator' in resolved ? resolved.locator : page.locator(resolved.selector);
const elementHandle = await locator.elementHandle({ timeout: 5000 });
frame = await elementHandle?.contentFrame() ?? null;
@@ -653,8 +642,8 @@ export async function handleMetaCommand(
}
if (!frame) throw new Error(`Frame not found: ${target}`);
session.setFrame(frame);
session.clearRefs();
bm.setFrame(frame);
bm.clearRefs();
return `Switched to frame: ${frame.url()}`;
}
+179
View File
@@ -0,0 +1,179 @@
/**
* Network response body capture SizeCappedBuffer + capture lifecycle.
*
* Architecture:
* page.on('response') listener filter by URL pattern store body
* SizeCappedBuffer: evicts oldest entries when total size exceeds cap
* Export: writes JSONL file (one response per line)
*
* Memory management:
* - 50MB total buffer cap (configurable)
* - 5MB per-entry body cap (larger responses stored as metadata only)
* - Binary responses stored as base64
* - Text responses stored as-is
*/
import * as fs from 'fs';
import type { Response as PlaywrightResponse } from 'playwright';
export interface CapturedResponse {
url: string;
status: number;
headers: Record<string, string>;
body: string;
contentType: string;
timestamp: number;
size: number;
bodyTruncated: boolean;
}
const MAX_BUFFER_SIZE = 50 * 1024 * 1024; // 50MB total
const MAX_ENTRY_SIZE = 5 * 1024 * 1024; // 5MB per response body
export class SizeCappedBuffer {
private entries: CapturedResponse[] = [];
private totalSize = 0;
private readonly maxSize: number;
constructor(maxSize = MAX_BUFFER_SIZE) {
this.maxSize = maxSize;
}
push(entry: CapturedResponse): void {
// Evict oldest entries until we have room
while (this.entries.length > 0 && this.totalSize + entry.size > this.maxSize) {
const evicted = this.entries.shift()!;
this.totalSize -= evicted.size;
}
this.entries.push(entry);
this.totalSize += entry.size;
}
toArray(): CapturedResponse[] {
return [...this.entries];
}
get length(): number {
return this.entries.length;
}
get byteSize(): number {
return this.totalSize;
}
clear(): void {
this.entries = [];
this.totalSize = 0;
}
/** Export to JSONL file. */
exportToFile(filePath: string): number {
const lines = this.entries.map(e => JSON.stringify(e));
fs.writeFileSync(filePath, lines.join('\n') + '\n');
return this.entries.length;
}
/** Summary of captured responses (URL, status, size). */
summary(): string {
if (this.entries.length === 0) return 'No captured responses.';
const lines = this.entries.map((e, i) =>
` [${i + 1}] ${e.status} ${e.url.slice(0, 100)} (${Math.round(e.size / 1024)}KB${e.bodyTruncated ? ', truncated' : ''})`
);
return `${this.entries.length} responses (${Math.round(this.totalSize / 1024)}KB total):\n${lines.join('\n')}`;
}
}
/** Global capture state. */
let captureBuffer = new SizeCappedBuffer();
let captureActive = false;
let captureFilter: RegExp | null = null;
let captureListener: ((response: PlaywrightResponse) => Promise<void>) | null = null;
export function isCaptureActive(): boolean {
return captureActive;
}
export function getCaptureBuffer(): SizeCappedBuffer {
return captureBuffer;
}
/** Create the response listener function. */
function createResponseListener(filter: RegExp | null): (response: PlaywrightResponse) => Promise<void> {
return async (response: PlaywrightResponse) => {
const url = response.url();
if (filter && !filter.test(url)) return;
// Skip non-content responses (redirects, 204, etc.)
const status = response.status();
if (status === 204 || status === 301 || status === 302 || status === 304) return;
const contentType = response.headers()['content-type'] || '';
let body = '';
let bodySize = 0;
let truncated = false;
try {
const rawBody = await response.body();
bodySize = rawBody.length;
if (bodySize > MAX_ENTRY_SIZE) {
truncated = true;
body = '';
} else if (contentType.includes('json') || contentType.includes('text') || contentType.includes('xml') || contentType.includes('html')) {
body = rawBody.toString('utf-8');
} else {
body = rawBody.toString('base64');
}
} catch {
// Response body may be unavailable (e.g., streaming, aborted)
body = '';
truncated = true;
}
const entry: CapturedResponse = {
url,
status,
headers: response.headers(),
body,
contentType,
timestamp: Date.now(),
size: bodySize,
bodyTruncated: truncated,
};
captureBuffer.push(entry);
};
}
/** Start capturing response bodies. */
export function startCapture(filterPattern?: string): { filter: string | null } {
captureFilter = filterPattern ? new RegExp(filterPattern) : null;
captureActive = true;
captureListener = createResponseListener(captureFilter);
return { filter: filterPattern || null };
}
/** Get the active listener (to attach to page). */
export function getCaptureListener(): ((response: PlaywrightResponse) => Promise<void>) | null {
return captureListener;
}
/** Stop capturing. */
export function stopCapture(): { count: number; sizeKB: number } {
captureActive = false;
captureListener = null;
return {
count: captureBuffer.length,
sizeKB: Math.round(captureBuffer.byteSize / 1024),
};
}
/** Clear the capture buffer. */
export function clearCapture(): void {
captureBuffer.clear();
}
/** Export captured responses to JSONL file. */
export function exportCapture(filePath: string): number {
return captureBuffer.exportToFile(filePath);
}
+103
View File
@@ -0,0 +1,103 @@
/**
* Shared path validation single source of truth for file path security.
*
* Previously duplicated across write-commands.ts, meta-commands.ts, and read-commands.ts.
* All file I/O commands (screenshot, pdf, download, scrape, archive, eval) must
* validate paths through these functions.
*
* validateOutputPath(path) for writing files (screenshot, pdf, download, scrape, archive)
* validateReadPath(path) for reading files (eval)
* validateTempPath(path) for serving files to remote agents (GET /file, TEMP_DIR only)
*
* Security invariants:
* 1. All paths resolved to absolute before checking
* 2. Symlinks resolved to catch traversal via symlink inside safe dir
* 3. SAFE_DIRECTORIES = [TEMP_DIR, cwd] for local commands
* 4. TEMP_ONLY = [TEMP_DIR] for remote file serving (prevents project file exfil)
*/
import * as fs from 'fs';
import * as path from 'path';
import { TEMP_DIR, isPathWithin } from './platform';
// Resolve safe directories through realpathSync to handle symlinks (e.g., macOS /tmp → /private/tmp)
export const SAFE_DIRECTORIES = [TEMP_DIR, process.cwd()].map(d => {
try { return fs.realpathSync(d); } catch { return d; }
});
const TEMP_ONLY = [TEMP_DIR].map(d => {
try { return fs.realpathSync(d); } catch { return d; }
});
/** Validate a file path for writing (screenshot, pdf, download, scrape, archive). */
export function validateOutputPath(filePath: string): void {
const resolved = path.resolve(filePath);
// Resolve real path of the parent directory to catch symlinks.
// The file itself may not exist yet (e.g., screenshot output).
// This also handles macOS /tmp → /private/tmp transparently.
let dir = path.dirname(resolved);
let realDir: string;
try {
realDir = fs.realpathSync(dir);
} catch {
try {
realDir = fs.realpathSync(path.dirname(dir));
} catch {
throw new Error(`Path must be within: ${SAFE_DIRECTORIES.join(', ')}`);
}
}
const realResolved = path.join(realDir, path.basename(resolved));
const isSafe = SAFE_DIRECTORIES.some(dir => isPathWithin(realResolved, dir));
if (!isSafe) {
throw new Error(`Path must be within: ${SAFE_DIRECTORIES.join(', ')}`);
}
}
/** Validate a file path for reading (eval command). */
export function validateReadPath(filePath: string): void {
const resolved = path.resolve(filePath);
let realPath: string;
try {
realPath = fs.realpathSync(resolved);
} catch (err: any) {
if (err.code === 'ENOENT') {
try {
const dir = fs.realpathSync(path.dirname(resolved));
realPath = path.join(dir, path.basename(resolved));
} catch {
realPath = resolved;
}
} else {
throw new Error(`Cannot resolve real path: ${filePath} (${err.code})`);
}
}
const isSafe = SAFE_DIRECTORIES.some(dir => isPathWithin(realPath, dir));
if (!isSafe) {
throw new Error(`Path must be within: ${SAFE_DIRECTORIES.join(', ')}`);
}
}
/** Validate a file path for remote serving (GET /file). TEMP_DIR only, not cwd. */
export function validateTempPath(filePath: string): void {
const resolved = path.resolve(filePath);
let realPath: string;
try {
realPath = fs.realpathSync(resolved);
} catch (err: any) {
if (err.code === 'ENOENT') {
throw new Error('File not found');
}
throw new Error(`Cannot resolve path: ${filePath}`);
}
const isSafe = TEMP_ONLY.some(dir => isPathWithin(realPath, dir));
if (!isSafe) {
throw new Error(`Path must be within: ${TEMP_ONLY.join(', ')} (remote file serving is restricted to temp directory)`);
}
}
/** Escape special regex metacharacters in a user-supplied string to prevent ReDoS. */
export function escapeRegExp(s: string): string {
return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}
+118 -33
View File
@@ -10,8 +10,11 @@ import { consoleBuffer, networkBuffer, dialogBuffer } from './buffers';
import type { Page, Frame } from 'playwright';
import * as fs from 'fs';
import * as path from 'path';
import { TEMP_DIR, isPathWithin } from './platform';
import { TEMP_DIR } from './platform';
import { inspectElement, formatInspectorResult, getModificationHistory } from './cdp-inspector';
import { validateReadPath } from './path-security';
// Re-export for backward compatibility (tests import from read-commands)
export { validateReadPath } from './path-security';
// Redaction patterns for sensitive cookie/storage values — exported for test coverage
export const SENSITIVE_COOKIE_NAME = /(^|[_.-])(token|secret|key|password|credential|auth|jwt|session|csrf|sid)($|[_.-])|api.?key/i;
@@ -41,38 +44,6 @@ function wrapForEvaluate(code: string): string {
: `(async()=>(${trimmed}))()`;
}
// Security: Path validation to prevent path traversal attacks
// Resolve safe directories through realpathSync to handle symlinks (e.g., macOS /tmp → /private/tmp)
const SAFE_DIRECTORIES = [TEMP_DIR, process.cwd()].map(d => {
try { return fs.realpathSync(d); } catch { return d; }
});
export function validateReadPath(filePath: string): void {
// Always resolve to absolute first (fixes relative path symlink bypass)
const resolved = path.resolve(filePath);
// Resolve symlinks — throw on non-ENOENT errors
let realPath: string;
try {
realPath = fs.realpathSync(resolved);
} catch (err: any) {
if (err.code === 'ENOENT') {
// File doesn't exist — resolve directory part for symlinks (e.g., /tmp → /private/tmp)
try {
const dir = fs.realpathSync(path.dirname(resolved));
realPath = path.join(dir, path.basename(resolved));
} catch {
realPath = resolved;
}
} else {
throw new Error(`Cannot resolve real path: ${filePath} (${err.code})`);
}
}
const isSafe = SAFE_DIRECTORIES.some(dir => isPathWithin(realPath, dir));
if (!isSafe) {
throw new Error(`Path must be within: ${SAFE_DIRECTORIES.join(', ')}`);
}
}
/**
* Extract clean text from a page (strips script/style/noscript/svg).
* Exported for DRY reuse in meta-commands (diff).
@@ -254,6 +225,50 @@ export async function handleReadCommand(
networkBuffer.clear();
return 'Network buffer cleared.';
}
// Network capture extensions
if (args[0] === '--capture') {
const {
startCapture, stopCapture, getCaptureListener, isCaptureActive,
} = await import('./network-capture');
if (args[1] === 'stop') {
// Detach listener from current page
const page = bm.getPage();
const listener = getCaptureListener();
if (listener) page.removeListener('response', listener);
const result = stopCapture();
return `Network capture stopped. ${result.count} responses captured (${result.sizeKB}KB).`;
}
// Start capture
if (isCaptureActive()) return 'Capture already active. Use --capture stop first.';
const filterIdx = args.indexOf('--filter');
const filterPattern = filterIdx >= 0 ? args[filterIdx + 1] : undefined;
const info = startCapture(filterPattern);
// Attach listener to current page
const page = bm.getPage();
const listener = getCaptureListener();
if (listener) page.on('response', listener);
return `Network capture started${info.filter ? ` (filter: ${info.filter})` : ''}. Use --capture stop to stop.`;
}
if (args[0] === '--export') {
const { exportCapture } = await import('./network-capture');
const { validateOutputPath: vop } = await import('./path-security');
const exportPath = args[1];
if (!exportPath) throw new Error('Usage: network --export <path>');
vop(exportPath);
const count = exportCapture(exportPath);
return `Exported ${count} captured responses to ${exportPath}`;
}
if (args[0] === '--bodies') {
const { getCaptureBuffer } = await import('./network-capture');
return getCaptureBuffer().summary();
}
// Default: show request metadata
if (networkBuffer.length === 0) return '(no network requests)';
return networkBuffer.toArray().map(e =>
`${e.method} ${e.url}${e.status || 'pending'} (${e.duration || '?'}ms, ${e.size || '?'}B)`
@@ -412,6 +427,76 @@ export async function handleReadCommand(
return formatInspectorResult(result, { includeUA });
}
case 'media': {
const { extractMedia } = await import('./media-extract');
const target = bm.getActiveFrameOrPage();
const filter = args.includes('--images') ? 'images' as const
: args.includes('--videos') ? 'videos' as const
: args.includes('--audio') ? 'audio' as const
: undefined;
const selectorArg = args.find(a => !a.startsWith('--'));
const result = await extractMedia(target, { selector: selectorArg, filter });
return JSON.stringify(result, null, 2);
}
case 'data': {
const target = bm.getActiveFrameOrPage();
const wantJsonLd = args.includes('--jsonld') || args.length === 0;
const wantOg = args.includes('--og') || args.length === 0;
const wantTwitter = args.includes('--twitter') || args.length === 0;
const wantMeta = args.includes('--meta') || args.length === 0;
const result = await target.evaluate(({ wantJsonLd, wantOg, wantTwitter, wantMeta }) => {
const data: Record<string, any> = {};
if (wantJsonLd) {
const scripts = document.querySelectorAll('script[type="application/ld+json"]');
const jsonLd: any[] = [];
scripts.forEach(s => {
try { jsonLd.push(JSON.parse(s.textContent || '')); } catch {}
});
data.jsonLd = jsonLd;
}
if (wantOg) {
const og: Record<string, string> = {};
document.querySelectorAll('meta[property^="og:"]').forEach(m => {
const prop = m.getAttribute('property')?.replace('og:', '') || '';
og[prop] = m.getAttribute('content') || '';
});
data.openGraph = og;
}
if (wantTwitter) {
const tw: Record<string, string> = {};
document.querySelectorAll('meta[name^="twitter:"]').forEach(m => {
const name = m.getAttribute('name')?.replace('twitter:', '') || '';
tw[name] = m.getAttribute('content') || '';
});
data.twitterCards = tw;
}
if (wantMeta) {
const meta: Record<string, string> = {};
const canonical = document.querySelector('link[rel="canonical"]');
if (canonical) meta.canonical = canonical.getAttribute('href') || '';
const desc = document.querySelector('meta[name="description"]');
if (desc) meta.description = desc.getAttribute('content') || '';
const keywords = document.querySelector('meta[name="keywords"]');
if (keywords) meta.keywords = keywords.getAttribute('content') || '';
const author = document.querySelector('meta[name="author"]');
if (author) meta.author = author.getAttribute('content') || '';
const title = document.querySelector('title');
if (title) meta.title = title.textContent || '';
data.meta = meta;
}
return data;
}, { wantJsonLd, wantOg, wantTwitter, wantMeta });
return JSON.stringify(result, null, 2);
}
default:
throw new Error(`Unknown read command: ${command}`);
}
+61 -3
View File
@@ -32,6 +32,7 @@ import {
rotateRoot, listTokens, serializeRegistry, restoreRegistry, recordCommand,
isRootToken, checkConnectRateLimit, type TokenInfo,
} from './token-registry';
import { validateTempPath } from './path-security';
import { resolveConfig, ensureStateDir, readVersionHash } from './config';
import { emitActivity, subscribe, getActivityAfter, getActivityHistory, getSubscriberCount } from './activity';
import { inspectElement, modifyStyle, resetModifications, getModificationHistory, detachSession, type InspectorResult } from './cdp-inspector';
@@ -1457,9 +1458,12 @@ async function start() {
}
try {
const pairBody = await req.json() as any;
const scopes = pairBody.admin
? ['read', 'write', 'admin', 'meta'] as const
: (pairBody.scopes || ['read', 'write']) as const;
// Default: full access (read+write+admin+meta). The trust boundary is
// the pairing ceremony itself, not the scope. --control adds browser-wide
// destructive commands (stop, restart, disconnect). --restrict limits scope.
const scopes = pairBody.control || pairBody.admin
? ['read', 'write', 'admin', 'meta', 'control'] as const
: (pairBody.scopes || ['read', 'write', 'admin', 'meta']) as const;
const setupKey = createSetupKey({
clientId: pairBody.clientId,
scopes: [...scopes],
@@ -2031,6 +2035,60 @@ async function start() {
});
}
// ─── File serving endpoint (for remote agents to retrieve downloaded files) ────
if (url.pathname === '/file' && req.method === 'GET') {
const tokenInfo = getTokenInfo(req);
if (!tokenInfo) {
return new Response(JSON.stringify({ error: 'Unauthorized' }), {
status: 401, headers: { 'Content-Type': 'application/json' },
});
}
const filePath = url.searchParams.get('path');
if (!filePath) {
return new Response(JSON.stringify({ error: 'Missing "path" query parameter' }), {
status: 400, headers: { 'Content-Type': 'application/json' },
});
}
try {
validateTempPath(filePath);
} catch (err: any) {
return new Response(JSON.stringify({ error: err.message }), {
status: 403, headers: { 'Content-Type': 'application/json' },
});
}
if (!fs.existsSync(filePath)) {
return new Response(JSON.stringify({ error: 'File not found' }), {
status: 404, headers: { 'Content-Type': 'application/json' },
});
}
const stat = fs.statSync(filePath);
if (stat.size > 200 * 1024 * 1024) {
return new Response(JSON.stringify({ error: 'File too large (max 200MB)' }), {
status: 413, headers: { 'Content-Type': 'application/json' },
});
}
const ext = path.extname(filePath).toLowerCase();
const MIME_MAP: Record<string, string> = {
'.png': 'image/png', '.jpg': 'image/jpeg', '.jpeg': 'image/jpeg',
'.gif': 'image/gif', '.webp': 'image/webp', '.svg': 'image/svg+xml',
'.avif': 'image/avif',
'.mp4': 'video/mp4', '.webm': 'video/webm', '.mov': 'video/quicktime',
'.mp3': 'audio/mpeg', '.wav': 'audio/wav', '.ogg': 'audio/ogg',
'.pdf': 'application/pdf', '.json': 'application/json',
'.html': 'text/html', '.txt': 'text/plain', '.mhtml': 'message/rfc822',
};
const contentType = MIME_MAP[ext] || 'application/octet-stream';
resetIdleTimer();
return new Response(Bun.file(filePath), {
headers: {
'Content-Type': contentType,
'Content-Length': String(stat.size),
'Content-Disposition': `inline; filename="${path.basename(filePath)}"`,
'Cache-Control': 'no-cache',
},
});
}
// ─── Command endpoint (accepts both root AND scoped tokens) ────
// Must be checked BEFORE the blanket root-only auth gate below,
// because scoped tokens from /connect are valid for /command.
+11 -5
View File
@@ -40,6 +40,7 @@ export const SCOPE_READ = new Set([
'snapshot', 'text', 'html', 'links', 'forms', 'accessibility',
'console', 'network', 'perf', 'dialog', 'is', 'inspect',
'url', 'tabs', 'status', 'screenshot', 'pdf', 'css', 'attrs',
'media', 'data',
]);
/** Commands that modify page state or navigate */
@@ -48,15 +49,19 @@ export const SCOPE_WRITE = new Set([
'click', 'fill', 'select', 'hover', 'type', 'press', 'scroll', 'wait',
'upload', 'viewport', 'newtab', 'closetab',
'dialog-accept', 'dialog-dismiss',
'download', 'scrape', 'archive',
]);
/** Dangerous commands — JS execution, credential access, browser-wide mutations */
/** Page-level power tools — JS execution, credential access, page mutations */
export const SCOPE_ADMIN = new Set([
'eval', 'js', 'cookies', 'storage',
'cookie', 'cookie-import', 'cookie-import-browser',
'header', 'useragent',
'style', 'cleanup', 'prettyscreenshot',
// Browser-wide destructive commands (from Codex adversarial finding):
]);
/** Browser-wide destructive commands — can kill the server, disconnect headed mode */
export const SCOPE_CONTROL = new Set([
'state', 'handoff', 'resume', 'stop', 'restart', 'connect', 'disconnect',
]);
@@ -66,12 +71,13 @@ export const SCOPE_META = new Set([
'watch', 'inbox', 'focus',
]);
export type ScopeCategory = 'read' | 'write' | 'admin' | 'meta';
export type ScopeCategory = 'read' | 'write' | 'admin' | 'meta' | 'control';
const SCOPE_MAP: Record<ScopeCategory, Set<string>> = {
read: SCOPE_READ,
write: SCOPE_WRITE,
admin: SCOPE_ADMIN,
control: SCOPE_CONTROL,
meta: SCOPE_META,
};
@@ -170,7 +176,7 @@ export function createToken(opts: CreateTokenOptions): TokenInfo {
} = opts;
// Validate inputs
const validScopes: ScopeCategory[] = ['read', 'write', 'admin', 'meta'];
const validScopes: ScopeCategory[] = ['read', 'write', 'admin', 'meta', 'control'];
for (const s of scopes) {
if (!validScopes.includes(s as ScopeCategory)) {
throw new Error(`Invalid scope: ${s}. Valid: ${validScopes.join(', ')}`);
@@ -297,7 +303,7 @@ export function validateToken(token: string): TokenInfo | null {
token: rootToken,
clientId: 'root',
type: 'session',
scopes: ['read', 'write', 'admin', 'meta'],
scopes: ['read', 'write', 'admin', 'meta', 'control'],
tabPolicy: 'shared',
rateLimit: 0, // unlimited
expiresAt: null,
+251 -45
View File
@@ -9,54 +9,12 @@ import type { TabSession } from './tab-session';
import type { BrowserManager } from './browser-manager';
import { findInstalledBrowsers, importCookies, listSupportedBrowserNames } from './cookie-import-browser';
import { validateNavigationUrl } from './url-validation';
import { validateOutputPath } from './path-security';
import * as fs from 'fs';
import * as path from 'path';
import { TEMP_DIR, isPathWithin } from './platform';
import { TEMP_DIR } from './platform';
import { modifyStyle, undoModification, resetModifications, getModificationHistory } from './cdp-inspector';
// Security: Path validation for screenshot output
// Resolve safe directories through realpathSync to handle symlinks (e.g., macOS /tmp -> /private/tmp)
const SAFE_DIRECTORIES = [TEMP_DIR, process.cwd()].map(d => {
try { return fs.realpathSync(d); } catch { return d; }
});
function validateOutputPath(filePath: string): void {
const resolved = path.resolve(filePath);
// Basic containment check using lexical resolution only.
// This catches obvious traversal (../../../etc/passwd) but NOT symlinks.
const isSafe = SAFE_DIRECTORIES.some(dir => isPathWithin(resolved, dir));
if (!isSafe) {
throw new Error(`Path must be within: ${SAFE_DIRECTORIES.join(', ')}`);
}
// Symlink check: resolve the real path of the nearest existing ancestor
// directory and re-validate. This closes the symlink bypass where a
// symlink inside /tmp or cwd points outside the safe zone.
//
// We resolve the parent dir (not the file itself — it may not exist yet).
// If the parent doesn't exist either we fall back up the tree.
let dir = path.dirname(resolved);
let realDir: string;
try {
realDir = fs.realpathSync(dir);
} catch {
// Parent doesn't exist — check the grandparent, or skip if inaccessible
try {
realDir = fs.realpathSync(path.dirname(dir));
} catch {
// Can't resolve — fail safe
throw new Error(`Path must be within: ${SAFE_DIRECTORIES.join(', ')}`);
}
}
const realResolved = path.join(realDir, path.basename(resolved));
const isRealSafe = SAFE_DIRECTORIES.some(dir => isPathWithin(realResolved, dir));
if (!isRealSafe) {
throw new Error(`Path must be within: ${SAFE_DIRECTORIES.join(', ')} (symlink target blocked)`);
}
}
/**
* Aggressive page cleanup selectors and heuristics.
* Goal: make the page readable and clean while keeping it recognizable.
@@ -313,7 +271,32 @@ export async function handleWriteCommand(
}
case 'scroll': {
const selector = args[0];
// Parse --times N and --wait ms flags
const timesIdx = args.indexOf('--times');
const times = timesIdx >= 0 ? parseInt(args[timesIdx + 1], 10) || 1 : 0;
const waitIdx = args.indexOf('--wait');
const waitMs = waitIdx >= 0 ? parseInt(args[waitIdx + 1], 10) || 1000 : 1000;
const selector = args.find(a => !a.startsWith('--') && args.indexOf(a) !== timesIdx + 1 && args.indexOf(a) !== waitIdx + 1);
if (times > 0) {
// Repeated scroll mode
for (let i = 0; i < times; i++) {
if (selector) {
const resolved = await bm.resolveRef(selector);
if ('locator' in resolved) {
await resolved.locator.scrollIntoViewIfNeeded({ timeout: 5000 });
} else {
await target.locator(resolved.selector).scrollIntoViewIfNeeded({ timeout: 5000 });
}
} else {
await target.evaluate(() => window.scrollTo(0, document.body.scrollHeight));
}
if (i < times - 1) await new Promise(r => setTimeout(r, waitMs));
}
return `Scrolled ${times} times${selector ? ` (${selector})` : ''} with ${waitMs}ms delay`;
}
// Single scroll (original behavior)
if (selector) {
const resolved = await session.resolveRef(selector);
if ('locator' in resolved) {
@@ -913,7 +896,230 @@ export async function handleWriteCommand(
return parts.join(' ');
}
case 'download': {
if (args.length === 0) throw new Error('Usage: download <url|@ref> [path] [--base64]');
const isBase64 = args.includes('--base64');
const filteredArgs = args.filter(a => a !== '--base64');
let url = filteredArgs[0];
const outputPath = filteredArgs[1];
// Resolve @ref to element src
if (url.startsWith('@')) {
const resolved = await bm.resolveRef(url);
if (!('locator' in resolved)) throw new Error(`Expected @ref, got CSS selector: ${url}`);
const locator = resolved.locator;
const tagName = await locator.evaluate(el => el.tagName.toLowerCase());
if (tagName === 'img') {
url = await locator.evaluate(el => {
const img = el as HTMLImageElement;
return img.currentSrc || img.src || img.getAttribute('data-src') || '';
});
} else if (tagName === 'video') {
url = await locator.evaluate(el => (el as HTMLVideoElement).currentSrc || (el as HTMLVideoElement).src || '');
} else if (tagName === 'audio') {
url = await locator.evaluate(el => (el as HTMLAudioElement).currentSrc || (el as HTMLAudioElement).src || '');
} else {
// Try src attribute on any element
url = await locator.evaluate(el => el.getAttribute('src') || '');
}
if (!url) throw new Error(`Could not extract URL from ${filteredArgs[0]} (${tagName})`);
}
// Check for HLS/DASH
if (url.includes('.m3u8') || url.includes('.mpd')) {
throw new Error('This is an HLS/DASH stream. Use yt-dlp or ffmpeg for adaptive stream downloads.');
}
// Determine output path and extension
const page = bm.getPage();
let contentType = 'application/octet-stream';
let buffer: Buffer;
if (url.startsWith('blob:')) {
// Strategy 3: Blob URL -- in-page fetch + base64
const dataUrl = await page.evaluate(async (blobUrl) => {
try {
const resp = await fetch(blobUrl);
const blob = await resp.blob();
if (blob.size > 100 * 1024 * 1024) return 'ERROR:TOO_LARGE';
return new Promise<string>((resolve, reject) => {
const reader = new FileReader();
reader.onloadend = () => resolve(reader.result as string);
reader.onerror = () => reject('Failed to read blob');
reader.readAsDataURL(blob);
});
} catch {
return 'ERROR:EXPIRED';
}
}, url);
if (dataUrl === 'ERROR:TOO_LARGE') throw new Error('Blob too large (>100MB). Use a different approach.');
if (dataUrl === 'ERROR:EXPIRED') throw new Error('Blob URL expired or inaccessible.');
const match = dataUrl.match(/^data:([^;]+);base64,(.+)$/);
if (!match) throw new Error('Failed to decode blob data');
contentType = match[1];
buffer = Buffer.from(match[2], 'base64');
} else {
// Strategy 1: Direct URL via page.request.fetch()
const response = await page.request.fetch(url, { timeout: 30000 });
const status = response.status();
if (status >= 400) {
throw new Error(`Download failed: HTTP ${status} ${response.statusText()}`);
}
contentType = response.headers()['content-type'] || 'application/octet-stream';
buffer = Buffer.from(await response.body());
if (buffer.length > 200 * 1024 * 1024) {
throw new Error('File too large (>200MB).');
}
}
// --base64 mode: return inline
if (isBase64) {
if (buffer.length > 10 * 1024 * 1024) {
throw new Error('File too large for --base64 (>10MB). Use disk download + GET /file instead.');
}
const mimeType = contentType.split(';')[0].trim();
return `data:${mimeType};base64,${buffer.toString('base64')}`;
}
// Write to disk
const ext = contentType.split(';')[0].includes('/')
? mimeToExt(contentType.split(';')[0].trim())
: '.bin';
const destPath = outputPath || path.join(TEMP_DIR, `browse-download-${Date.now()}${ext}`);
validateOutputPath(destPath);
fs.writeFileSync(destPath, buffer);
const sizeKB = Math.round(buffer.length / 1024);
return `Downloaded: ${destPath} (${sizeKB}KB, ${contentType.split(';')[0].trim()})`;
}
case 'scrape': {
if (args.length === 0) throw new Error('Usage: scrape <images|videos|media> [--selector sel] [--dir path] [--limit N]');
const mediaType = args[0];
if (!['images', 'videos', 'media'].includes(mediaType)) {
throw new Error(`Invalid type: ${mediaType}. Use: images, videos, or media`);
}
// Parse flags
const selectorIdx = args.indexOf('--selector');
const selector = selectorIdx >= 0 ? args[selectorIdx + 1] : undefined;
const dirIdx = args.indexOf('--dir');
const dir = dirIdx >= 0 ? args[dirIdx + 1] : path.join(TEMP_DIR, `browse-scrape-${Date.now()}`);
const limitIdx = args.indexOf('--limit');
const limit = Math.min(limitIdx >= 0 ? parseInt(args[limitIdx + 1], 10) || 50 : 50, 200);
validateOutputPath(dir);
fs.mkdirSync(dir, { recursive: true });
const { extractMedia } = await import('./media-extract');
const target = bm.getActiveFrameOrPage();
const filter = mediaType === 'images' ? 'images' as const
: mediaType === 'videos' ? 'videos' as const
: undefined;
const mediaResult = await extractMedia(target, { selector, filter });
// Collect URLs to download
const urls: Array<{ url: string; type: string }> = [];
const seen = new Set<string>();
for (const img of mediaResult.images) {
const url = img.currentSrc || img.src || img.dataSrc;
if (url && !seen.has(url) && !url.startsWith('data:')) {
seen.add(url);
urls.push({ url, type: 'image' });
}
}
for (const vid of mediaResult.videos) {
const url = vid.currentSrc || vid.src;
if (url && !seen.has(url) && !url.startsWith('blob:') && !vid.isHLS && !vid.isDASH) {
seen.add(url);
urls.push({ url, type: 'video' });
}
}
for (const bg of mediaResult.backgroundImages) {
if (bg.url && !seen.has(bg.url)) {
seen.add(bg.url);
urls.push({ url: bg.url, type: 'image' });
}
}
const toDownload = urls.slice(0, limit);
const page = bm.getPage();
const manifest: any = {
url: page.url(),
scraped_at: new Date().toISOString(),
files: [] as any[],
total_size: 0,
succeeded: 0,
failed: 0,
};
const lines: string[] = [];
for (let i = 0; i < toDownload.length; i++) {
const { url, type } = toDownload[i];
try {
const response = await page.request.fetch(url, { timeout: 30000 });
if (response.status() >= 400) throw new Error(`HTTP ${response.status()}`);
const ct = response.headers()['content-type'] || 'application/octet-stream';
const ext = mimeToExt(ct.split(';')[0].trim());
const filename = `${type}-${String(i + 1).padStart(3, '0')}${ext}`;
const filePath = path.join(dir, filename);
const body = Buffer.from(await response.body());
try {
fs.writeFileSync(filePath, body);
} catch (writeErr: any) {
throw new Error(`Disk write failed: ${writeErr.message}`);
}
manifest.files.push({ path: filename, src: url, size: body.length, type: ct.split(';')[0].trim() });
manifest.total_size += body.length;
manifest.succeeded++;
lines.push(` [${i + 1}/${toDownload.length}] ${filename} (${Math.round(body.length / 1024)}KB)`);
} catch (err: any) {
manifest.files.push({ path: null, src: url, size: 0, type: '', error: err.message });
manifest.failed++;
lines.push(` [${i + 1}/${toDownload.length}] FAILED: ${err.message}`);
}
// 100ms delay between downloads
if (i < toDownload.length - 1) await new Promise(r => setTimeout(r, 100));
}
// Write manifest
fs.writeFileSync(path.join(dir, 'manifest.json'), JSON.stringify(manifest, null, 2));
return `Scraped ${toDownload.length} items to ${dir}/\n${lines.join('\n')}\n\nSummary: ${manifest.succeeded} succeeded, ${manifest.failed} failed, ${Math.round(manifest.total_size / 1024)}KB total`;
}
case 'archive': {
const page = bm.getPage();
const outputPath = args[0] || path.join(TEMP_DIR, `browse-archive-${Date.now()}.mhtml`);
validateOutputPath(outputPath);
try {
const cdp = await page.context().newCDPSession(page);
const { data } = await cdp.send('Page.captureSnapshot', { format: 'mhtml' });
await cdp.detach();
fs.writeFileSync(outputPath, data);
return `Archive saved: ${outputPath} (${Math.round(data.length / 1024)}KB, MHTML)`;
} catch (err: any) {
throw new Error(`MHTML archive requires Chromium CDP. Use 'text' or 'html' for raw page content. (${err.message})`);
}
}
default:
throw new Error(`Unknown write command: ${command}`);
}
}
/** Map MIME type to file extension. */
function mimeToExt(mime: string): string {
const map: Record<string, string> = {
'image/png': '.png', 'image/jpeg': '.jpg', 'image/gif': '.gif',
'image/webp': '.webp', 'image/svg+xml': '.svg', 'image/avif': '.avif',
'video/mp4': '.mp4', 'video/webm': '.webm', 'video/quicktime': '.mov',
'audio/mpeg': '.mp3', 'audio/wav': '.wav', 'audio/ogg': '.ogg',
'application/pdf': '.pdf', 'application/json': '.json',
'text/html': '.html', 'text/plain': '.txt',
};
return map[mime] || '.bin';
}
+176
View File
@@ -0,0 +1,176 @@
/**
* Tests for the browser data platform: media extraction, network capture,
* path security, and structured data extraction.
*/
import { describe, it, expect } from 'bun:test';
import { SizeCappedBuffer, type CapturedResponse } from '../src/network-capture';
import { validateTempPath, validateOutputPath, validateReadPath } from '../src/path-security';
import { TEMP_DIR } from '../src/platform';
import * as fs from 'fs';
import * as path from 'path';
import * as os from 'os';
// ─── SizeCappedBuffer ─────────────────────────────────────────
describe('SizeCappedBuffer', () => {
function makeEntry(size: number, url = 'https://example.com'): CapturedResponse {
return {
url,
status: 200,
headers: {},
body: 'x'.repeat(size),
contentType: 'text/plain',
timestamp: Date.now(),
size,
bodyTruncated: false,
};
}
it('stores entries within capacity', () => {
const buf = new SizeCappedBuffer(1000);
buf.push(makeEntry(100));
buf.push(makeEntry(200));
expect(buf.length).toBe(2);
expect(buf.byteSize).toBe(300);
});
it('evicts oldest entries when over capacity', () => {
const buf = new SizeCappedBuffer(500);
buf.push(makeEntry(200, 'https://a.com'));
buf.push(makeEntry(200, 'https://b.com'));
buf.push(makeEntry(200, 'https://c.com')); // should evict first entry
expect(buf.length).toBe(2);
const urls = buf.toArray().map(e => e.url);
expect(urls).toContain('https://b.com');
expect(urls).toContain('https://c.com');
expect(urls).not.toContain('https://a.com');
});
it('evicts multiple entries for one large push', () => {
const buf = new SizeCappedBuffer(500);
buf.push(makeEntry(100));
buf.push(makeEntry(100));
buf.push(makeEntry(100));
buf.push(makeEntry(400)); // evicts first two (need totalSize + 400 <= 500, so totalSize <= 100)
expect(buf.length).toBe(2); // one 100-byte entry + one 400-byte entry
expect(buf.byteSize).toBe(500);
});
it('clear resets buffer', () => {
const buf = new SizeCappedBuffer(1000);
buf.push(makeEntry(100));
buf.push(makeEntry(200));
buf.clear();
expect(buf.length).toBe(0);
expect(buf.byteSize).toBe(0);
});
it('exports to JSONL file', () => {
const buf = new SizeCappedBuffer(1000);
buf.push(makeEntry(10, 'https://a.com'));
buf.push(makeEntry(20, 'https://b.com'));
const tmpFile = path.join(os.tmpdir(), `test-export-${Date.now()}.jsonl`);
try {
const count = buf.exportToFile(tmpFile);
expect(count).toBe(2);
const lines = fs.readFileSync(tmpFile, 'utf-8').trim().split('\n');
expect(lines.length).toBe(2);
const parsed = JSON.parse(lines[0]);
expect(parsed.url).toBe('https://a.com');
} finally {
fs.unlinkSync(tmpFile);
}
});
it('summary shows entries', () => {
const buf = new SizeCappedBuffer(1000);
buf.push(makeEntry(1024, 'https://api.example.com/graphql'));
const summary = buf.summary();
expect(summary).toContain('1 responses');
expect(summary).toContain('graphql');
expect(summary).toContain('1KB');
});
it('summary shows empty message when no entries', () => {
const buf = new SizeCappedBuffer(1000);
expect(buf.summary()).toBe('No captured responses.');
});
});
// ─── validateTempPath ─────────────────────────────────────────
describe('validateTempPath', () => {
let tmpFile: string;
it('allows paths within /tmp that exist', () => {
tmpFile = path.join(TEMP_DIR, `test-temp-path-${Date.now()}.jpg`);
fs.writeFileSync(tmpFile, 'test');
try {
expect(() => validateTempPath(tmpFile)).not.toThrow();
} finally {
fs.unlinkSync(tmpFile);
}
});
it('rejects non-existent files', () => {
expect(() => validateTempPath('/tmp/nonexistent-file-12345.jpg')).toThrow(/not found/i);
});
it('rejects paths in cwd', () => {
// Create a real file in cwd to test the path check (not the existence check)
const cwdFile = path.join(process.cwd(), 'package.json');
expect(() => validateTempPath(cwdFile)).toThrow(/temp directory/i);
});
it('rejects absolute paths outside safe dirs', () => {
expect(() => validateTempPath('/etc/passwd')).toThrow();
});
});
// ─── Command registration ─────────────────────────────────────
describe('command registration', () => {
it('all new commands have descriptions', () => {
// The load-time validation in commands.ts throws if any command
// is missing from COMMAND_DESCRIPTIONS. If this import succeeds,
// all commands are properly registered.
const { COMMAND_DESCRIPTIONS, ALL_COMMANDS } = require('../src/commands');
const newCommands = ['media', 'data', 'download', 'scrape', 'archive'];
for (const cmd of newCommands) {
expect(ALL_COMMANDS.has(cmd)).toBe(true);
expect(COMMAND_DESCRIPTIONS[cmd]).toBeTruthy();
}
});
it('new commands are in correct scope sets', () => {
const { SCOPE_READ, SCOPE_WRITE } = require('../src/token-registry');
expect(SCOPE_READ.has('media')).toBe(true);
expect(SCOPE_READ.has('data')).toBe(true);
expect(SCOPE_WRITE.has('download')).toBe(true);
expect(SCOPE_WRITE.has('scrape')).toBe(true);
expect(SCOPE_WRITE.has('archive')).toBe(true);
});
it('media and data are in PAGE_CONTENT_COMMANDS', () => {
const { PAGE_CONTENT_COMMANDS } = require('../src/commands');
expect(PAGE_CONTENT_COMMANDS.has('media')).toBe(true);
expect(PAGE_CONTENT_COMMANDS.has('data')).toBe(true);
});
});
// ─── MIME type mapping ─────────────────────────────────────────
describe('mimeToExt', () => {
// mimeToExt is a private function in write-commands.ts,
// so we test it indirectly through command behavior.
// This test verifies the source contains the expected mappings.
it('write-commands.ts contains MIME mappings', () => {
const src = fs.readFileSync(path.join(import.meta.dir, '../src/write-commands.ts'), 'utf-8');
expect(src).toContain("'image/png': '.png'");
expect(src).toContain("'image/jpeg': '.jpg'");
expect(src).toContain("'video/mp4': '.mp4'");
expect(src).toContain("'audio/mpeg': '.mp3'");
});
});
+67
View File
@@ -0,0 +1,67 @@
<!DOCTYPE html>
<html>
<head>
<title>Media Test Page</title>
<meta property="og:title" content="Test Product">
<meta property="og:description" content="A test product description">
<meta property="og:image" content="https://example.com/og-image.jpg">
<meta property="og:type" content="product">
<meta name="twitter:card" content="summary_large_image">
<meta name="twitter:title" content="Test Product Tweet">
<meta name="description" content="Page description for SEO">
<meta name="keywords" content="test, product, media">
<meta name="author" content="Test Author">
<link rel="canonical" href="https://example.com/test-product">
<script type="application/ld+json">
{
"@context": "https://schema.org",
"@type": "Product",
"name": "Test Widget",
"description": "A widget for testing",
"image": "https://example.com/widget.jpg",
"offers": {
"@type": "Offer",
"price": "29.99",
"priceCurrency": "USD"
}
}
</script>
<style>
.hero { background-image: url('https://example.com/hero-bg.jpg'); width: 100%; height: 300px; }
.banner { background-image: url('https://example.com/banner.png'); width: 100%; height: 100px; }
</style>
</head>
<body>
<div class="hero"></div>
<div class="banner"></div>
<!-- Standard images -->
<img src="https://example.com/photo1.jpg" alt="Photo 1" width="800" height="600">
<img src="https://example.com/photo2.png" alt="Photo 2" width="400" height="300">
<!-- Lazy loaded image -->
<img data-src="https://example.com/lazy.jpg" alt="Lazy Image" loading="lazy" width="600" height="400">
<!-- Image with srcset -->
<img src="https://example.com/responsive-sm.jpg"
srcset="https://example.com/responsive-sm.jpg 480w, https://example.com/responsive-lg.jpg 1200w"
alt="Responsive Image"
width="480" height="320">
<!-- Video with sources -->
<video width="640" height="480" poster="https://example.com/poster.jpg">
<source src="https://example.com/video.mp4" type="video/mp4">
<source src="https://example.com/video.webm" type="video/webm">
</video>
<!-- HLS video -->
<video width="1920" height="1080">
<source src="https://example.com/stream.m3u8" type="application/x-mpegURL">
</video>
<!-- Audio -->
<audio>
<source src="https://example.com/podcast.mp3" type="audio/mpeg">
</audio>
</body>
</html>
+13 -13
View File
@@ -17,6 +17,7 @@ const WRITE_SRC = fs.readFileSync(path.join(import.meta.dir, '../src/write-comma
const SERVER_SRC = fs.readFileSync(path.join(import.meta.dir, '../src/server.ts'), 'utf-8');
const AGENT_SRC = fs.readFileSync(path.join(import.meta.dir, '../src/sidebar-agent.ts'), 'utf-8');
const SNAPSHOT_SRC = fs.readFileSync(path.join(import.meta.dir, '../src/snapshot.ts'), 'utf-8');
const PATH_SECURITY_SRC = fs.readFileSync(path.join(import.meta.dir, '../src/path-security.ts'), 'utf-8');
// ─── Helper ─────────────────────────────────────────────────────────────────
@@ -159,26 +160,25 @@ describe('Task 2: CSS value validator blocks dangerous patterns', () => {
describe('Task 1: validateOutputPath uses realpathSync', () => {
describe('source-level checks', () => {
it('meta-commands.ts validateOutputPath contains realpathSync', () => {
const fn = extractFunction(META_SRC, 'validateOutputPath');
it('path-security.ts validateOutputPath contains realpathSync', () => {
const fn = extractFunction(PATH_SECURITY_SRC, 'validateOutputPath');
expect(fn).toBeTruthy();
expect(fn).toContain('realpathSync');
});
it('write-commands.ts validateOutputPath contains realpathSync', () => {
const fn = extractFunction(WRITE_SRC, 'validateOutputPath');
expect(fn).toBeTruthy();
expect(fn).toContain('realpathSync');
});
it('meta-commands.ts SAFE_DIRECTORIES resolves with realpathSync', () => {
const safeBlock = sliceBetween(META_SRC, 'const SAFE_DIRECTORIES', ';');
it('path-security.ts SAFE_DIRECTORIES resolves with realpathSync', () => {
const safeBlock = sliceBetween(PATH_SECURITY_SRC, 'const SAFE_DIRECTORIES', ';');
expect(safeBlock).toContain('realpathSync');
});
it('write-commands.ts SAFE_DIRECTORIES resolves with realpathSync', () => {
const safeBlock = sliceBetween(WRITE_SRC, 'const SAFE_DIRECTORIES', ';');
expect(safeBlock).toContain('realpathSync');
it('meta-commands.ts re-exports validateOutputPath from path-security', () => {
expect(META_SRC).toContain("from './path-security'");
expect(META_SRC).toContain('validateOutputPath');
});
it('write-commands.ts imports validateOutputPath from path-security', () => {
expect(WRITE_SRC).toContain("from './path-security'");
expect(WRITE_SRC).toContain('validateOutputPath');
});
});
+4 -4
View File
@@ -113,15 +113,15 @@ describe('generateInstructionBlock', () => {
expect(block).not.toContain('re-pair with --admin');
});
it('shows re-pair hint when admin not included', () => {
it('shows re-pair hint when control not included', () => {
const block = generateInstructionBlock({
setupKey: 'gsk_setup_nonadmin',
setupKey: 'gsk_setup_nocontrol',
serverUrl: 'https://test.ngrok.dev',
scopes: ['read', 'write'],
scopes: ['read', 'write', 'admin', 'meta'],
expiresAt: '2026-04-06T00:00:00Z',
});
expect(block).toContain('re-pair with --admin');
expect(block).toContain('re-pair with --control');
});
it('includes newtab as step 2 (agents must own their tab)', () => {
+9 -5
View File
@@ -5,7 +5,7 @@ import {
validateToken, checkScope, checkDomain, checkRate,
revokeToken, rotateRoot, listTokens, recordCommand,
serializeRegistry, restoreRegistry, checkConnectRateLimit,
SCOPE_READ, SCOPE_WRITE, SCOPE_ADMIN, SCOPE_META,
SCOPE_READ, SCOPE_WRITE, SCOPE_ADMIN, SCOPE_CONTROL, SCOPE_META,
} from '../src/token-registry';
describe('token-registry', () => {
@@ -25,7 +25,7 @@ describe('token-registry', () => {
const info = validateToken('root-token-for-tests');
expect(info).not.toBeNull();
expect(info!.clientId).toBe('root');
expect(info!.scopes).toEqual(['read', 'write', 'admin', 'meta']);
expect(info!.scopes).toEqual(['read', 'write', 'admin', 'meta', 'control']);
expect(info!.rateLimit).toBe(0);
});
});
@@ -324,7 +324,7 @@ describe('token-registry', () => {
it('every command in commands.ts is covered by a scope', () => {
// Import the command sets to verify coverage
const allInScopes = new Set([
...SCOPE_READ, ...SCOPE_WRITE, ...SCOPE_ADMIN, ...SCOPE_META,
...SCOPE_READ, ...SCOPE_WRITE, ...SCOPE_ADMIN, ...SCOPE_CONTROL, ...SCOPE_META,
]);
// chain is a special case (checked via meta scope but dispatches subcommands)
allInScopes.add('chain');
@@ -339,8 +339,12 @@ describe('token-registry', () => {
expect(SCOPE_ADMIN.has('cookies')).toBe(true);
expect(SCOPE_ADMIN.has('storage')).toBe(true);
expect(SCOPE_ADMIN.has('useragent')).toBe(true);
expect(SCOPE_ADMIN.has('state')).toBe(true);
expect(SCOPE_ADMIN.has('handoff')).toBe(true);
// Browser-wide destructive commands moved to SCOPE_CONTROL
expect(SCOPE_CONTROL.has('state')).toBe(true);
expect(SCOPE_CONTROL.has('handoff')).toBe(true);
expect(SCOPE_CONTROL.has('stop')).toBe(true);
expect(SCOPE_CONTROL.has('restart')).toBe(true);
expect(SCOPE_CONTROL.has('disconnect')).toBe(true);
// Verify safe read commands are NOT in admin
expect(SCOPE_ADMIN.has('text')).toBe(false);
+1 -1
View File
@@ -13,7 +13,7 @@ export function generateCommandReference(_ctx: TemplateContext): string {
// Category display order
const categoryOrder = [
'Navigation', 'Reading', 'Interaction', 'Inspection',
'Navigation', 'Reading', 'Extraction', 'Interaction', 'Inspection',
'Visual', 'Snapshot', 'Meta', 'Tabs', 'Server',
];