Files
gstack/browse/src/meta-commands.ts
T
Garry Tan c15b805cd8 feat(browse): Puppeteer parity — load-html, screenshot --selector, viewport --scale, file:// (v1.1.0.0) (#1062)
* feat(browse): TabSession loadedHtml + command aliases + DX polish primitives

Adds the foundation layer for Puppeteer-parity features:

- TabSession.loadedHtml + setTabContent/getLoadedHtml/clearLoadedHtml —
  enables load-html content to survive context recreation (viewport --scale)
  via in-memory replay. ASCII lifecycle diagram in the source explains the
  clear-before-navigation contract.

- COMMAND_ALIASES + canonicalizeCommand() helper — single source of truth
  for name aliases (setcontent / set-content / setContent → load-html),
  consumed by server dispatch and chain prevalidation.

- buildUnknownCommandError() pure function — rich error messages with
  Levenshtein-based "Did you mean" suggestions (distance ≤ 2, input
  length ≥ 4 to skip 2-letter noise) and NEW_IN_VERSION upgrade hints.

- load-html registered in WRITE_COMMANDS + SCOPE_WRITE so scoped write
  tokens can use it.

- screenshot and viewport descriptions updated for upcoming flags.

- New browse/test/dx-polish.test.ts (15 tests): alias canonicalization,
  Levenshtein threshold + alphabetical tiebreak, short-input guard,
  NEW_IN_VERSION upgrade hint, alias + scope integration invariants.

No consumers yet — pure additive foundation. Safe to bisect on its own.

* feat(browse): accept file:// in goto with smart cwd/home-relative parsing

Extends validateNavigationUrl to accept file:// URLs scoped to safe dirs
(cwd + TEMP_DIR) via the existing validateReadPath policy. The workhorse is a
new normalizeFileUrl() helper that handles non-standard relative forms BEFORE
the WHATWG URL parser sees them:

    file:///abs/path.html       → unchanged
    file://./docs/page.html     → file://<cwd>/docs/page.html
    file://~/Documents/page.html → file://<HOME>/Documents/page.html
    file://docs/page.html       → file://<cwd>/docs/page.html
    file://localhost/abs/path   → unchanged
    file://host.example.com/... → rejected (UNC/network)
    file:// and file:///        → rejected (would list a directory)

Host heuristic rejects segments with '.', ':', '\\', '%', IPv6 brackets, or
Windows drive-letter patterns — so file://docs.v1/page.html, file://127.0.0.1/x,
file://[::1]/x, and file://C:/Users/x are explicit errors.

Uses fileURLToPath() + pathToFileURL() from node:url (never string-concat) so
URL escapes like %20 decode correctly and Node rejects encoded-slash traversal
(%2F..%2F) outright.

Signature change: validateNavigationUrl now returns Promise<string> (the
normalized URL) instead of Promise<void>. Existing callers that ignore the
return value still compile — they just don't benefit from smart-parsing until
updated in follow-up commits. Callers will be migrated in the next few commits
(goto, diff, newTab, restoreState).

Rewrites the url-validation test file: updates existing tests for the new
return type, adds 20+ new tests covering every normalizeFileUrl shape variant,
URL-encoding edge cases, and path-traversal rejection.

References: codex consult v3 P1 findings on URL parser semantics and fileURLToPath.

* feat(browse): BrowserManager deviceScaleFactor + setContent replay + file:// plumbing

Three tightly-coupled changes to BrowserManager, all in service of the
Puppeteer-parity workflow:

1. deviceScaleFactor + currentViewport tracking. New private fields (default
   scale=1, viewport=1280x720) + setDeviceScaleFactor(scale, w, h) method.
   deviceScaleFactor is a context-level Playwright option — changing it
   requires recreateContext(). The method validates (finite number, 1-3 cap,
   headed-mode rejected), stores new values, calls recreateContext(), and
   rolls back the fields on failure so a bad call doesn't leave inconsistent
   state. Context options at all three sites (launch, recreate happy path,
   recreate fallback) now honor the stored values instead of hardcoding
   1280x720.

2. BrowserState.loadedHtml + loadedHtmlWaitUntil. saveState captures per-tab
   loadedHtml from the session; restoreState replays it via newSession.
   setTabContent() — NOT bare page.setContent() — so TabSession.loadedHtml
   is rehydrated and survives *subsequent* scale changes. In-memory only,
   never persisted to disk (HTML may contain secrets or customer data).

3. newTab + restoreState now consume validateNavigationUrl's normalized
   return value. file://./x, file://~/x, and bare-segment forms now take
   effect at every navigation site, not just the top-level goto command.

Together these enable: load-html → viewport --scale 2 → viewport --scale 1.5
→ screenshot, with content surviving both context recreations. Codex v2 P0
flagged that bare page.setContent in restoreState would lose content on the
second scale change — this commit implements the rehydration path.

References: codex v2 P0 (TabSession rehydration), codex v3 P1 (4-caller
return value), plan Feature 3 + Feature 4.

* feat(browse): load-html, screenshot --selector, viewport --scale, alias dispatch

Wires the new handlers and dispatch logic that the previous commits made
possible:

write-commands.ts
- New 'load-html' case: validateReadPath for safe-dir scoping, stat-based
  actionable errors (not found, directory, oversize), extension allowlist
  (.html/.htm/.xhtml/.svg), magic-byte sniff with UTF-8 BOM strip accepting
  any <[a-zA-Z!?] markup opener (not just <!doctype — bare fragments like
  <div>...</div> work for setContent), 50MB cap via GSTACK_BROWSE_MAX_HTML_BYTES
  override, frame-context rejection. Calls session.setTabContent() so replay
  metadata is rehydrated.
- viewport command extended: optional [<WxH>], optional [--scale <n>],
  scale-only variant reads current size via page.viewportSize(). Invalid
  scale (NaN, Infinity, empty, out of 1-3) throws with named value. Headed
  mode rejected explicitly.
- clearLoadedHtml() called BEFORE goto/back/forward/reload navigation
  (not after) so a timed-out goto post-commit doesn't leave stale metadata
  that could resurrect on a later context recreation. Codex v2 P1 catch.
- goto uses validateNavigationUrl's normalized return value.

meta-commands.ts
- screenshot --selector <css> flag: explicit element-screenshot form.
  Rejects alongside positional selector (both = error), preserves --clip
  conflict at line 161, composes with --base64 at lines 168-174.
- chain canonicalizes each step with canonicalizeCommand — step shape is
  now { rawName, name, args } so prevalidation, dispatch, WRITE_COMMANDS.has,
  watch blocking, and result labels all use canonical names while audit
  labels show 'rawName→name' when aliased. Codex v3 P2 catch — prior shape
  only canonicalized at prevalidation and diverged everywhere else.
- diff command consumes validateNavigationUrl return value for both URLs.

server.ts
- Command canonicalization inserted immediately after parse, before scope /
  watch / tab-ownership / content-wrapping checks. rawCommand preserved for
  future audit (not wired into audit log in this commit — follow-up).
- Unknown-command handler replaced with buildUnknownCommandError() from
  commands.ts — produces 'Unknown command: X. Did you mean Y?' with optional
  upgrade hint for NEW_IN_VERSION entries.

security-audit-r2.test.ts
- Updated chain-loop marker from 'for (const cmd of commands)' to
  'for (const c of commands)' to match the new chain step shape. Same
  isWatching + BLOCKED invariants still asserted.

* chore: bump version and changelog (v1.1.0.0)

- VERSION: 1.0.0.0 → 1.1.0.0 (MINOR bump — new user-facing commands)
- package.json: matching version bump
- CHANGELOG.md: new 1.1.0.0 entry describing load-html, screenshot --selector,
  viewport --scale, file:// support, setContent replay, and DX polish in user
  voice with a dedicated Security section for file:// safe-dirs policy
- browse/SKILL.md.tmpl: adds pattern #12 "Render local HTML", pattern #13
  "Retina screenshots", and a full Puppeteer → browse cheatsheet with side-by-
  side API mapping and a worked tweet-renderer migration example
- browse/SKILL.md + SKILL.md: regenerated from templates via `bun run gen:skill-docs`
  to reflect the new command descriptions

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: pre-landing review fixes (9 findings from specialist + adversarial review)

Adversarial review (Claude subagent + Codex) surfaced 9 bugs across
CRITICAL/HIGH severity. All fixed:

1. tab-session.ts:setTabContent — state mutation moved AFTER the setContent
   await. Prior order left phantom HTML in replay metadata if setContent
   threw (timeout, browser crash), which a later viewport --scale would
   silently replay. Now loadedHtml is only recorded on successful load.

2. browser-manager.ts:setDeviceScaleFactor — rollback now forces a second
   recreateContext after restoring the old fields. The fallback path in
   the original recreateContext builds a blank context using whatever
   this.deviceScaleFactor/currentViewport hold at that moment (which were
   the NEW values we were trying to apply). Rolling back the fields without
   a second recreate left the live context at new-scale while state tracked
   old-scale. Now: restore fields, force re-recreate with old values, only
   if that ALSO fails do we return a combined error.

3. commands.ts:buildUnknownCommandError — Levenshtein tiebreak simplified
   to 'd <= 2 && d < bestDist' (strict less). Candidates are pre-sorted
   alphabetically, so first equal-distance wins by default. The prior
   '(d === bestDist && best !== undefined && cand < best)' clause was dead
   code.

4. tab-session.ts:onMainFrameNavigated — now clears loadedHtml, not just
   refs + frame. Without this, a user who load-html'd then clicked a link
   (or had a form submit / JS redirect / OAuth flow) would retain the stale
   replay metadata. The next viewport --scale would silently revert the
   tab to the ORIGINAL loaded HTML, losing whatever the post-navigation
   content was. Silent data corruption. Browser-emitted navigations trigger
   this path via wirePageEvents.

5. browser-manager.ts:saveState + restoreState — tab ownership now flows
   through BrowserState.owner. Without this, a scoped agent's viewport
   --scale would strand them: tab IDs change during recreate, ownership
   map held stale IDs, owner lookup failed. New IDs had no owner, so
   writes without tabId were denied (DoS). Worse, if the agent sent a
   stale tabId the server's swallowed-tab-switch-error path would let the
   command hit whatever tab was currently active (cross-tab authz bypass).
   Now: clear ownership before restore, re-add per-tab with new IDs.

6. meta-commands.ts:state load — disk-loaded state.pages is now explicit
   allowlist (url, isActive, storage:null) instead of object spread.
   Spreading accepted loadedHtml, loadedHtmlWaitUntil, and owner from a
   user-writable state file, letting a tampered state.json smuggle HTML
   past load-html's safe-dirs / extension / magic-byte / 50MB-cap
   validators, or forge tab ownership. Now stripped at the boundary.

7. url-validation.ts:normalizeFileUrl — preserves query string + fragment
   across normalization. file://./app.html?route=home#login previously
   resolved to a filesystem path that URL-encoded '?' as %3F and '#' as
   %23, or (for absolute forms) pathToFileURL dropped them entirely. SPAs
   and fixture URLs with query params 404'd or loaded the wrong route.
   Now: split on ?/# before path resolution, reattach after.

8. url-validation.ts:validateNavigationUrl — reattaches parsed.search +
   parsed.hash to the normalized file:// URL. Same fix at the main
   validator for absolute paths that go through fileURLToPath round-trip.

9. server.ts:writeAuditEntry — audit entries now include aliasOf when the
   user typed an alias ('setcontent' → cmd: 'load-html', aliasOf:
   'setcontent'). Previously the isAliased variable was computed but
   dropped, losing the raw input from the forensic trail. Completes the
   plan's codex v3 P2 requirement.

Also added bm.getCurrentViewport() and switched 'viewport --scale'-
without-size to read from it (more reliable than page.viewportSize() on
headed/transition contexts).

Tests pass: exit 0, no failures. Build clean.

* test: integration coverage for load-html, screenshot --selector, viewport --scale, replay, aliases

Adds 28 Playwright-integration tests that close the coverage gap flagged
by the ship-workflow coverage audit (50% → expected ~80%+).

**load-html (12 tests):**
- happy path loads HTML file, page text matches
- bare HTML fragments (<div>...</div>) accepted, not just full documents
- missing file arg throws usage
- non-.html extension rejected by allowlist
- /etc/passwd.html rejected by safe-dirs policy
- ENOENT path rejected with actionable "not found" error
- directory target rejected
- binary file (PNG magic bytes) disguised as .html rejected by magic-byte check
- UTF-8 BOM stripped before magic-byte check — BOM-prefixed HTML accepted
- --wait-until networkidle exercises non-default branch
- invalid --wait-until value rejected
- unknown flag rejected

**screenshot --selector (5 tests):**
- --selector flag captures element, validates Screenshot saved (element)
- conflicts with positional selector (both = error)
- conflicts with --clip (mutually exclusive)
- composes with --base64 (returns data:image/png;base64,...)
- missing value throws usage

**viewport --scale (5 tests):**
- WxH --scale 2 produces PNG with 2x element dimensions (parses IHDR bytes 16-23)
- --scale without WxH keeps current size + applies scale
- non-finite value (abc) throws "not a finite number"
- out-of-range (4, 0.5) throws "between 1 and 3"
- missing value throws

**setContent replay across context recreation (3 tests):**
- load-html → viewport --scale 2: content survives (hits setTabContent replay path)
- double cycle 2x → 1.5x: content still survives (proves TabSession rehydration)
- goto after load-html clears replay: subsequent viewport --scale does NOT
  resurrect the stale HTML (validates the onMainFrameNavigated fix)

**Command aliases (2 tests):**
- setcontent routes to load-html via chain canonicalization
- set-content (hyphenated) also routes — both end-to-end through chain dispatch

Fixture paths use /tmp (SAFE_DIRECTORIES entry) instead of $TMPDIR which is
/var/folders/... on macOS and outside the safe-dirs boundary. Chain result
labels use rawName→name format when an alias is resolved (matches the
meta-commands.ts chain refactor).

Full suite: exit 0, 223/223 pass.

* docs: update BROWSER.md + CHANGELOG for v1.1.0.0

BROWSER.md:
- Command reference table updated: goto now lists file:// support,
  load-html added to Navigate row, viewport flagged with --scale
  option, screenshot row shows --selector + --base64 flags
- Screenshot modes table adds the fifth mode (element crop via
  --selector flag) and notes the tag-selector-not-caught-positionally
  gotcha
- New "Retina screenshots — viewport --scale" subsection explains
  deviceScaleFactor mechanics, context recreation side effects, and
  headed-mode rejection
- New "Loading local HTML — goto file:// vs load-html" subsection
  explains the two paths, their tradeoffs (URL state, relative asset
  resolution), the safe-dirs policy, extension allowlist + magic-byte
  sniff, 50MB cap, setContent replay across recreateContext, and the
  alias routing (setcontent → load-html before scope check)

CHANGELOG.md (v1.1.0.0 security section expanded, no existing content
removed):
- State files cannot smuggle HTML or forge tab ownership (allowlist
  on disk-loaded page fields)
- Audit log records aliasOf when a canonical command was reached via
  an alias (setcontent → load-html)
- load-html content clears on real navigations (clicks, form submits,
  JS redirects) — not just explicit goto. Also notes SPA query/fragment
  preservation for goto file://

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 23:25:33 +08:00

802 lines
33 KiB
TypeScript

/**
* Meta commands — tabs, server control, screenshots, chain, diff, snapshot
*/
import type { BrowserManager } from './browser-manager';
import { handleSnapshot } from './snapshot';
import { getCleanText } from './read-commands';
import { READ_COMMANDS, WRITE_COMMANDS, META_COMMANDS, PAGE_CONTENT_COMMANDS, wrapUntrustedContent, canonicalizeCommand } from './commands';
import { validateNavigationUrl } from './url-validation';
import { checkScope, type TokenInfo } from './token-registry';
import { validateOutputPath, escapeRegExp } from './path-security';
// Re-export for backward compatibility (tests import from meta-commands)
export { validateOutputPath, escapeRegExp } from './path-security';
import * as Diff from 'diff';
import * as fs from 'fs';
import * as path from 'path';
import { TEMP_DIR } from './platform';
import { resolveConfig } from './config';
import type { Frame } from 'playwright';
/** Tokenize a pipe segment respecting double-quoted strings. */
function tokenizePipeSegment(segment: string): string[] {
const tokens: string[] = [];
let current = '';
let inQuote = false;
for (let i = 0; i < segment.length; i++) {
const ch = segment[i];
if (ch === '"') {
inQuote = !inQuote;
} else if (ch === ' ' && !inQuote) {
if (current) { tokens.push(current); current = ''; }
} else {
current += ch;
}
}
if (current) tokens.push(current);
return tokens;
}
/** Options passed from handleCommandInternal for chain routing */
export interface MetaCommandOpts {
chainDepth?: number;
/** Callback to route subcommands through the full security pipeline (handleCommandInternal) */
executeCommand?: (body: { command: string; args?: string[]; tabId?: number }, tokenInfo?: TokenInfo | null) => Promise<{ status: number; result: string; json?: boolean }>;
}
export async function handleMetaCommand(
command: string,
args: string[],
bm: BrowserManager,
shutdown: () => Promise<void> | void,
tokenInfo?: TokenInfo | null,
opts?: MetaCommandOpts,
): Promise<string> {
// Per-tab operations use the active session; global operations use bm directly
const session = bm.getActiveSession();
switch (command) {
// ─── Tabs ──────────────────────────────────────────
case 'tabs': {
const tabs = await bm.getTabListWithTitles();
return tabs.map(t =>
`${t.active ? '→ ' : ' '}[${t.id}] ${t.title || '(untitled)'}${t.url}`
).join('\n');
}
case 'tab': {
const id = parseInt(args[0], 10);
if (isNaN(id)) throw new Error('Usage: browse tab <id>');
bm.switchTab(id);
return `Switched to tab ${id}`;
}
case 'newtab': {
const url = args[0];
const id = await bm.newTab(url);
return `Opened tab ${id}${url ? `${url}` : ''}`;
}
case 'closetab': {
const id = args[0] ? parseInt(args[0], 10) : undefined;
await bm.closeTab(id);
return `Closed tab${id ? ` ${id}` : ''}`;
}
// ─── Server Control ────────────────────────────────
case 'status': {
const page = bm.getPage();
const tabs = bm.getTabCount();
const mode = bm.getConnectionMode();
return [
`Status: healthy`,
`Mode: ${mode}`,
`URL: ${page.url()}`,
`Tabs: ${tabs}`,
`PID: ${process.pid}`,
].join('\n');
}
case 'url': {
return bm.getCurrentUrl();
}
case 'stop': {
await shutdown();
return 'Server stopped';
}
case 'restart': {
// Signal that we want a restart — the CLI will detect exit and restart
console.log('[browse] Restart requested. Exiting for CLI to restart.');
await shutdown();
return 'Restarting...';
}
// ─── Visual ────────────────────────────────────────
case 'screenshot': {
// Parse priority: flags (--viewport, --clip, --base64) → selector (@ref, CSS) → output path
const page = bm.getPage();
let outputPath = `${TEMP_DIR}/browse-screenshot.png`;
let clipRect: { x: number; y: number; width: number; height: number } | undefined;
let targetSelector: string | undefined;
let viewportOnly = false;
let base64Mode = false;
const remaining: string[] = [];
let flagSelector: string | undefined;
for (let i = 0; i < args.length; i++) {
if (args[i] === '--viewport') {
viewportOnly = true;
} else if (args[i] === '--base64') {
base64Mode = true;
} else if (args[i] === '--selector') {
flagSelector = args[++i];
if (!flagSelector) throw new Error('Usage: screenshot --selector <css> [path]');
} else if (args[i] === '--clip') {
const coords = args[++i];
if (!coords) throw new Error('Usage: screenshot --clip x,y,w,h [path]');
const parts = coords.split(',').map(Number);
if (parts.length !== 4 || parts.some(isNaN))
throw new Error('Usage: screenshot --clip x,y,width,height — all must be numbers');
clipRect = { x: parts[0], y: parts[1], width: parts[2], height: parts[3] };
} else if (args[i].startsWith('--')) {
throw new Error(`Unknown screenshot flag: ${args[i]}`);
} else {
remaining.push(args[i]);
}
}
// Separate target (selector/@ref) from output path
for (const arg of remaining) {
// File paths containing / and ending with an image/pdf extension are never CSS selectors
const isFilePath = arg.includes('/') && /\.(png|jpe?g|webp|pdf)$/i.test(arg);
if (isFilePath) {
outputPath = arg;
} else if (arg.startsWith('@e') || arg.startsWith('@c') || arg.startsWith('.') || arg.startsWith('#') || arg.includes('[')) {
targetSelector = arg;
} else {
outputPath = arg;
}
}
// --selector flag takes precedence; conflict with positional selector.
if (flagSelector !== undefined) {
if (targetSelector !== undefined) {
throw new Error('--selector conflicts with positional selector — choose one');
}
targetSelector = flagSelector;
}
validateOutputPath(outputPath);
if (clipRect && targetSelector) {
throw new Error('Cannot use --clip with a selector/ref — choose one');
}
if (viewportOnly && clipRect) {
throw new Error('Cannot use --viewport with --clip — choose one');
}
// --base64 mode: capture to buffer instead of disk
if (base64Mode) {
let buffer: Buffer;
if (targetSelector) {
const resolved = await bm.resolveRef(targetSelector);
const locator = 'locator' in resolved ? resolved.locator : page.locator(resolved.selector);
buffer = await locator.screenshot({ timeout: 5000 });
} else if (clipRect) {
buffer = await page.screenshot({ clip: clipRect });
} else {
buffer = await page.screenshot({ fullPage: !viewportOnly });
}
if (buffer.length > 10 * 1024 * 1024) {
throw new Error('Screenshot too large for --base64 (>10MB). Use disk path instead.');
}
return `data:image/png;base64,${buffer.toString('base64')}`;
}
if (targetSelector) {
const resolved = await bm.resolveRef(targetSelector);
const locator = 'locator' in resolved ? resolved.locator : page.locator(resolved.selector);
await locator.screenshot({ path: outputPath, timeout: 5000 });
return `Screenshot saved (element): ${outputPath}`;
}
if (clipRect) {
await page.screenshot({ path: outputPath, clip: clipRect });
return `Screenshot saved (clip ${clipRect.x},${clipRect.y},${clipRect.width},${clipRect.height}): ${outputPath}`;
}
await page.screenshot({ path: outputPath, fullPage: !viewportOnly });
return `Screenshot saved${viewportOnly ? ' (viewport)' : ''}: ${outputPath}`;
}
case 'pdf': {
const page = bm.getPage();
const pdfPath = args[0] || `${TEMP_DIR}/browse-page.pdf`;
validateOutputPath(pdfPath);
await page.pdf({ path: pdfPath, format: 'A4' });
return `PDF saved: ${pdfPath}`;
}
case 'responsive': {
const page = bm.getPage();
const prefix = args[0] || `${TEMP_DIR}/browse-responsive`;
validateOutputPath(prefix);
const viewports = [
{ name: 'mobile', width: 375, height: 812 },
{ name: 'tablet', width: 768, height: 1024 },
{ name: 'desktop', width: 1280, height: 720 },
];
const originalViewport = page.viewportSize();
const results: string[] = [];
for (const vp of viewports) {
await page.setViewportSize({ width: vp.width, height: vp.height });
const screenshotPath = `${prefix}-${vp.name}.png`;
validateOutputPath(screenshotPath);
await page.screenshot({ path: screenshotPath, fullPage: true });
results.push(`${vp.name} (${vp.width}x${vp.height}): ${screenshotPath}`);
}
// Restore original viewport
if (originalViewport) {
await page.setViewportSize(originalViewport);
}
return results.join('\n');
}
// ─── Chain ─────────────────────────────────────────
case 'chain': {
// Read JSON array from args[0] (if provided) or expect it was passed as body
const jsonStr = args[0];
if (!jsonStr) throw new Error(
'Usage: echo \'[["goto","url"],["text"]]\' | browse chain\n' +
' or: browse chain \'goto url | click @e5 | snapshot -ic\''
);
let rawCommands: string[][];
try {
rawCommands = JSON.parse(jsonStr);
if (!Array.isArray(rawCommands)) throw new Error('not array');
} catch (err: any) {
// Fallback: pipe-delimited format "goto url | click @e5 | snapshot -ic"
if (!(err instanceof SyntaxError) && err?.message !== 'not array') throw err;
rawCommands = jsonStr.split(' | ')
.filter(seg => seg.trim().length > 0)
.map(seg => tokenizePipeSegment(seg.trim()));
}
// Canonicalize aliases across the whole chain. Pair canonical name with the raw
// input so result labels + error messages reflect what the user typed, but every
// dispatch path (scope check, WRITE_COMMANDS.has, watch blocking, handler lookup)
// uses the canonical name. Otherwise `chain '[["setcontent","/tmp/x.html"]]'`
// bypasses prevalidation or runs under the wrong command set.
const commands = rawCommands.map(cmd => {
const [rawName, ...cmdArgs] = cmd;
const name = canonicalizeCommand(rawName);
return { rawName, name, args: cmdArgs };
});
// Pre-validate ALL subcommands against the token's scope before executing any.
// Uses canonical name so aliases don't bypass scope checks.
if (tokenInfo && tokenInfo.clientId !== 'root') {
for (const c of commands) {
if (!checkScope(tokenInfo, c.name)) {
throw new Error(
`Chain rejected: subcommand "${c.rawName}" not allowed by your token scope (${tokenInfo.scopes.join(', ')}). ` +
`All subcommands must be within scope.`
);
}
}
}
// Route each subcommand through handleCommandInternal for full security:
// scope, domain, tab ownership, content wrapping — all enforced per subcommand.
// Chain-specific options: skip rate check (chain = 1 request), skip activity
// events (chain emits 1 event), increment chain depth (recursion guard).
const executeCmd = opts?.executeCommand;
const results: string[] = [];
let lastWasWrite = false;
if (executeCmd) {
// Full security pipeline via handleCommandInternal.
// Pass rawName so the server's own canonicalization is a no-op (already canonical).
for (const c of commands) {
const cr = await executeCmd(
{ command: c.name, args: c.args },
tokenInfo,
);
const label = c.rawName === c.name ? c.name : `${c.rawName}${c.name}`;
if (cr.status === 200) {
results.push(`[${label}] ${cr.result}`);
} else {
// Parse error from JSON result
let errMsg = cr.result;
try { errMsg = JSON.parse(cr.result).error || cr.result; } catch (err: any) { if (!(err instanceof SyntaxError)) throw err; }
results.push(`[${label}] ERROR: ${errMsg}`);
}
lastWasWrite = WRITE_COMMANDS.has(c.name);
}
} else {
// Fallback: direct dispatch (CLI mode, no server context)
const { handleReadCommand } = await import('./read-commands');
const { handleWriteCommand } = await import('./write-commands');
for (const c of commands) {
const name = c.name;
const cmdArgs = c.args;
const label = c.rawName === name ? name : `${c.rawName}${name}`;
try {
let result: string;
if (WRITE_COMMANDS.has(name)) {
if (bm.isWatching()) {
result = 'BLOCKED: write commands disabled in watch mode';
} else {
result = await handleWriteCommand(name, cmdArgs, session, bm);
}
lastWasWrite = true;
} else if (READ_COMMANDS.has(name)) {
result = await handleReadCommand(name, cmdArgs, session);
if (PAGE_CONTENT_COMMANDS.has(name)) {
result = wrapUntrustedContent(result, bm.getCurrentUrl());
}
lastWasWrite = false;
} else if (META_COMMANDS.has(name)) {
result = await handleMetaCommand(name, cmdArgs, bm, shutdown, tokenInfo, opts);
lastWasWrite = false;
} else {
throw new Error(`Unknown command: ${c.rawName}`);
}
results.push(`[${label}] ${result}`);
} catch (err: any) {
results.push(`[${label}] ERROR: ${err.message}`);
}
}
}
// Wait for network to settle after write commands before returning
if (lastWasWrite) {
await bm.getPage().waitForLoadState('networkidle', { timeout: 2000 }).catch(() => {});
}
return results.join('\n\n');
}
// ─── Diff ──────────────────────────────────────────
case 'diff': {
const [url1, url2] = args;
if (!url1 || !url2) throw new Error('Usage: browse diff <url1> <url2>');
const page = bm.getPage();
const normalizedUrl1 = await validateNavigationUrl(url1);
await page.goto(normalizedUrl1, { waitUntil: 'domcontentloaded', timeout: 15000 });
const text1 = await getCleanText(page);
const normalizedUrl2 = await validateNavigationUrl(url2);
await page.goto(normalizedUrl2, { waitUntil: 'domcontentloaded', timeout: 15000 });
const text2 = await getCleanText(page);
const changes = Diff.diffLines(text1, text2);
const output: string[] = [`--- ${url1}`, `+++ ${url2}`, ''];
for (const part of changes) {
const prefix = part.added ? '+' : part.removed ? '-' : ' ';
const lines = part.value.split('\n').filter(l => l.length > 0);
for (const line of lines) {
output.push(`${prefix} ${line}`);
}
}
return wrapUntrustedContent(output.join('\n'), `diff: ${url1} vs ${url2}`);
}
// ─── Snapshot ─────────────────────────────────────
case 'snapshot': {
const isScoped = tokenInfo && tokenInfo.clientId !== 'root';
const snapshotResult = await handleSnapshot(args, session, {
splitForScoped: !!isScoped,
});
// Scoped tokens get split format (refs outside envelope); root gets basic wrapping
if (isScoped) {
return snapshotResult; // already has envelope from split format
}
return wrapUntrustedContent(snapshotResult, bm.getCurrentUrl());
}
// ─── Handoff ────────────────────────────────────
case 'handoff': {
const message = args.join(' ') || 'User takeover requested';
return await bm.handoff(message);
}
case 'resume': {
bm.resume();
// Re-snapshot to capture current page state after human interaction
const isScoped2 = tokenInfo && tokenInfo.clientId !== 'root';
const snapshot = await handleSnapshot(['-i'], session, { splitForScoped: !!isScoped2 });
if (isScoped2) {
return `RESUMED\n${snapshot}`;
}
return `RESUMED\n${wrapUntrustedContent(snapshot, bm.getCurrentUrl())}`;
}
// ─── Headed Mode ──────────────────────────────────────
case 'connect': {
// connect is handled as a pre-server command in cli.ts
// If we get here, server is already running — tell the user
if (bm.getConnectionMode() === 'headed') {
return 'Already in headed mode with extension.';
}
return 'The connect command must be run from the CLI (not sent to a running server). Run: $B connect';
}
case 'disconnect': {
if (bm.getConnectionMode() !== 'headed') {
return 'Not in headed mode — nothing to disconnect.';
}
// Signal that we want a restart in headless mode
console.log('[browse] Disconnecting headed browser. Restarting in headless mode.');
await shutdown();
return 'Disconnected. Server will restart in headless mode on next command.';
}
case 'focus': {
if (bm.getConnectionMode() !== 'headed') {
return 'focus requires headed mode. Run `$B connect` first.';
}
try {
const { execSync } = await import('child_process');
// Try common Chromium-based browser app names to bring to foreground
const appNames = ['Comet', 'Google Chrome', 'Arc', 'Brave Browser', 'Microsoft Edge'];
let activated = false;
for (const appName of appNames) {
try {
execSync(`osascript -e 'tell application "${appName}" to activate'`, { stdio: 'pipe', timeout: 3000 });
activated = true;
break;
} catch (err: any) {
// Try next browser — osascript fails if app not found or AppleScript errors
if (err?.status === undefined && !err?.message?.includes('Command failed')) throw err;
}
}
if (!activated) {
return 'Could not bring browser to foreground. macOS only.';
}
// If a ref was passed, scroll it into view
if (args.length > 0 && args[0].startsWith('@')) {
try {
const resolved = await bm.resolveRef(args[0]);
if ('locator' in resolved) {
await resolved.locator.scrollIntoViewIfNeeded({ timeout: 5000 });
return `Browser activated. Scrolled ${args[0]} into view.`;
}
} catch (err: any) {
// Ref not found or element gone — still activated the browser
if (!err?.message?.includes('not found') && !err?.message?.includes('closed') && !err?.message?.includes('Target') && !err?.message?.includes('timeout')) throw err;
}
}
return 'Browser window activated.';
} catch (err: any) {
return `focus failed: ${err.message}. macOS only.`;
}
}
// ─── Watch ──────────────────────────────────────────
case 'watch': {
if (args[0] === 'stop') {
if (!bm.isWatching()) return 'Not currently watching.';
const result = bm.stopWatch();
const durationSec = Math.round(result.duration / 1000);
const lastSnapshot = result.snapshots.length > 0
? wrapUntrustedContent(result.snapshots[result.snapshots.length - 1], bm.getCurrentUrl())
: '(none)';
return [
`WATCH STOPPED (${durationSec}s, ${result.snapshots.length} snapshots)`,
'',
'Last snapshot:',
lastSnapshot,
].join('\n');
}
if (bm.isWatching()) return 'Already watching. Run `$B watch stop` to stop.';
if (bm.getConnectionMode() !== 'headed') {
return 'watch requires headed mode. Run `$B connect` first.';
}
bm.startWatch();
return 'WATCHING — observing user browsing. Periodic snapshots every 5s.\nRun `$B watch stop` to stop and get summary.';
}
// ─── Inbox ──────────────────────────────────────────
case 'inbox': {
const { execSync } = await import('child_process');
let gitRoot: string;
try {
gitRoot = execSync('git rev-parse --show-toplevel', { encoding: 'utf-8', stdio: ['pipe', 'pipe', 'pipe'] }).trim();
} catch (err: any) {
// execSync throws with exit status on non-git directories
if (err?.status === undefined && !err?.message?.includes('Command failed')) throw err;
return 'Not in a git repository — cannot locate inbox.';
}
const inboxDir = path.join(gitRoot, '.context', 'sidebar-inbox');
if (!fs.existsSync(inboxDir)) return 'Inbox empty.';
const files = fs.readdirSync(inboxDir)
.filter(f => f.endsWith('.json') && !f.startsWith('.'))
.sort()
.reverse(); // newest first
if (files.length === 0) return 'Inbox empty.';
const messages: { timestamp: string; url: string; userMessage: string }[] = [];
for (const file of files) {
try {
const data = JSON.parse(fs.readFileSync(path.join(inboxDir, file), 'utf-8'));
messages.push({
timestamp: data.timestamp || '',
url: data.page?.url || 'unknown',
userMessage: data.userMessage || '',
});
} catch (err: any) {
// Skip malformed JSON or unreadable files
if (!(err instanceof SyntaxError) && err?.code !== 'ENOENT' && err?.code !== 'EACCES') throw err;
}
}
if (messages.length === 0) return 'Inbox empty.';
const lines: string[] = [];
lines.push(`SIDEBAR INBOX (${messages.length} message${messages.length === 1 ? '' : 's'})`);
lines.push('────────────────────────────────');
for (const msg of messages) {
const ts = msg.timestamp ? `[${msg.timestamp}]` : '[unknown]';
lines.push(`${ts} ${wrapUntrustedContent(msg.url, 'inbox-url')}`);
lines.push(` "${wrapUntrustedContent(msg.userMessage, 'inbox-message')}"`);
lines.push('');
}
lines.push('────────────────────────────────');
// Handle --clear flag
if (args.includes('--clear')) {
for (const file of files) {
try { fs.unlinkSync(path.join(inboxDir, file)); } catch (err: any) { if (err?.code !== 'ENOENT') throw err; }
}
lines.push(`Cleared ${files.length} message${files.length === 1 ? '' : 's'}.`);
}
return lines.join('\n');
}
// ─── State ────────────────────────────────────────
case 'state': {
const [action, name] = args;
if (!action || !name) throw new Error('Usage: state save|load <name>');
// Sanitize name: alphanumeric + hyphens + underscores only
if (!/^[a-zA-Z0-9_-]+$/.test(name)) {
throw new Error('State name must be alphanumeric (a-z, 0-9, _, -)');
}
const config = resolveConfig();
const stateDir = path.join(config.stateDir, 'browse-states');
fs.mkdirSync(stateDir, { recursive: true });
const statePath = path.join(stateDir, `${name}.json`);
if (action === 'save') {
const state = await bm.saveState();
// V1: cookies + URLs only (not localStorage — breaks on load-before-navigate)
const saveData = {
version: 1,
savedAt: new Date().toISOString(),
cookies: state.cookies,
pages: state.pages.map(p => ({ url: p.url, isActive: p.isActive })),
};
fs.writeFileSync(statePath, JSON.stringify(saveData, null, 2), { mode: 0o600 });
return `State saved: ${statePath} (${state.cookies.length} cookies, ${state.pages.length} pages)\n⚠️ Cookies stored in plaintext. Delete when no longer needed.`;
}
if (action === 'load') {
if (!fs.existsSync(statePath)) throw new Error(`State not found: ${statePath}`);
const data = JSON.parse(fs.readFileSync(statePath, 'utf-8'));
if (!Array.isArray(data.cookies) || !Array.isArray(data.pages)) {
throw new Error('Invalid state file: expected cookies and pages arrays');
}
// Validate and filter cookies — reject malformed or internal-network cookies
const validatedCookies = data.cookies.filter((c: any) => {
if (typeof c !== 'object' || !c) return false;
if (typeof c.name !== 'string' || typeof c.value !== 'string') return false;
if (typeof c.domain !== 'string' || !c.domain) return false;
const d = c.domain.startsWith('.') ? c.domain.slice(1) : c.domain;
if (d === 'localhost' || d.endsWith('.internal') || d === '169.254.169.254') return false;
return true;
});
if (validatedCookies.length < data.cookies.length) {
console.warn(`[browse] Filtered ${data.cookies.length - validatedCookies.length} invalid cookies from state file`);
}
// Warn on state files older than 7 days
if (data.savedAt) {
const ageMs = Date.now() - new Date(data.savedAt).getTime();
const SEVEN_DAYS = 7 * 24 * 60 * 60 * 1000;
if (ageMs > SEVEN_DAYS) {
console.warn(`[browse] Warning: State file is ${Math.round(ageMs / 86400000)} days old. Consider re-saving.`);
}
}
// Close existing pages, then restore (replace, not merge)
bm.setFrame(null);
await bm.closeAllPages();
// Allowlist disk-loaded page fields — NEVER accept loadedHtml, loadedHtmlWaitUntil,
// or owner from disk. Those are in-memory-only invariants; allowing them would let
// a tampered state file smuggle HTML past load-html's safe-dirs + magic-byte + size
// checks, or forge tab ownership for cross-agent authorization bypass.
await bm.restoreState({
cookies: validatedCookies,
pages: data.pages.map((p: any) => ({
url: typeof p.url === 'string' ? p.url : '',
isActive: Boolean(p.isActive),
storage: null,
})),
});
return `State loaded: ${data.cookies.length} cookies, ${data.pages.length} pages`;
}
throw new Error('Usage: state save|load <name>');
}
// ─── Frame ───────────────────────────────────────
case 'frame': {
const target = args[0];
if (!target) throw new Error('Usage: frame <selector|@ref|--name name|--url pattern|main>');
if (target === 'main') {
bm.setFrame(null);
bm.clearRefs();
return 'Switched to main frame';
}
const page = bm.getPage();
let frame: Frame | null = null;
if (target === '--name') {
if (!args[1]) throw new Error('Usage: frame --name <name>');
frame = page.frame({ name: args[1] });
} else if (target === '--url') {
if (!args[1]) throw new Error('Usage: frame --url <pattern>');
frame = page.frame({ url: new RegExp(escapeRegExp(args[1])) });
} else {
// CSS selector or @ref for the iframe element
const resolved = await bm.resolveRef(target);
const locator = 'locator' in resolved ? resolved.locator : page.locator(resolved.selector);
const elementHandle = await locator.elementHandle({ timeout: 5000 });
frame = await elementHandle?.contentFrame() ?? null;
await elementHandle?.dispose();
}
if (!frame) throw new Error(`Frame not found: ${target}`);
bm.setFrame(frame);
bm.clearRefs();
return `Switched to frame: ${frame.url()}`;
}
// ─── UX Audit ─────────────────────────────────────
case 'ux-audit': {
const page = bm.getPage();
// Extract page structure for UX behavioral analysis
// Agent interprets the data and applies Krug's 6 usability tests
// Uses textContent (not innerText) to avoid layout computation on large DOMs
const data = await page.evaluate(() => {
const HEADING_CAP = 50;
const INTERACTIVE_CAP = 200;
const TEXT_BLOCK_CAP = 50;
// Site ID: logo or brand element
const logoEl = document.querySelector('[class*="logo"], [id*="logo"], header img, [aria-label*="home"], a[href="/"]');
const siteId = logoEl ? {
found: true,
text: (logoEl.textContent || '').trim().slice(0, 100),
tag: logoEl.tagName,
alt: (logoEl as HTMLImageElement).alt || null,
} : { found: false, text: null, tag: null, alt: null };
// Page name: main heading
const h1 = document.querySelector('h1');
const pageName = h1 ? {
found: true,
text: h1.textContent?.trim().slice(0, 200) || '',
} : { found: false, text: null };
// Navigation: primary nav elements
const navEls = document.querySelectorAll('nav, [role="navigation"]');
const navItems: Array<{ text: string; links: number }> = [];
navEls.forEach((nav, i) => {
if (i >= 5) return;
const links = nav.querySelectorAll('a');
navItems.push({
text: (nav.getAttribute('aria-label') || `nav-${i}`).slice(0, 50),
links: links.length,
});
});
// "You are here" indicator: current/active nav items
// Scoped to nav containers to avoid false positives from animation classes
const activeNavItems = document.querySelectorAll('nav [aria-current], nav .active, nav .current, [role="navigation"] [aria-current], [role="navigation"] .active, [role="navigation"] .current');
const youAreHere = Array.from(activeNavItems).slice(0, 5).map(el => ({
text: (el.textContent || '').trim().slice(0, 50),
tag: el.tagName,
}));
// Search: search box presence
const searchEl = document.querySelector('input[type="search"], [role="search"], input[name*="search"], input[placeholder*="search" i], input[aria-label*="search" i]');
const search = { found: !!searchEl };
// Breadcrumbs
const breadcrumbEl = document.querySelector('[aria-label*="breadcrumb" i], .breadcrumb, .breadcrumbs, [class*="breadcrumb"]');
const breadcrumbs = breadcrumbEl ? {
found: true,
items: Array.from(breadcrumbEl.querySelectorAll('a, span, li')).slice(0, 10).map(el => (el.textContent || '').trim().slice(0, 30)),
} : { found: false, items: [] };
// Headings: heading hierarchy
const headings = Array.from(document.querySelectorAll('h1,h2,h3,h4,h5,h6')).slice(0, HEADING_CAP).map(h => ({
tag: h.tagName,
text: (h.textContent || '').trim().slice(0, 80),
size: getComputedStyle(h).fontSize,
}));
// Interactive elements: buttons, links, inputs
const interactiveEls = Array.from(document.querySelectorAll('a, button, input, select, textarea, [role="button"], [tabindex]')).slice(0, INTERACTIVE_CAP);
const interactive = interactiveEls.map(el => {
const rect = el.getBoundingClientRect();
return {
tag: el.tagName,
text: (el.textContent || (el as HTMLInputElement).placeholder || '').trim().slice(0, 50),
type: (el as HTMLInputElement).type || null,
role: el.getAttribute('role'),
w: Math.round(rect.width),
h: Math.round(rect.height),
visible: rect.width > 0 && rect.height > 0,
};
}).filter(el => el.visible);
// Text blocks: paragraphs and large text areas
const textBlocks = Array.from(document.querySelectorAll('p, [class*="description"], [class*="intro"], [class*="welcome"], [class*="hero"] p, main p')).slice(0, TEXT_BLOCK_CAP).map(el => ({
text: (el.textContent || '').trim().slice(0, 200),
wordCount: (el.textContent || '').trim().split(/\s+/).filter(Boolean).length,
}));
// Total visible text word count (textContent avoids layout computation)
const bodyText = (document.body?.textContent || '').trim();
const totalWords = bodyText.split(/\s+/).filter(Boolean).length;
return {
url: window.location.href,
title: document.title,
siteId,
pageName,
navigation: navItems,
youAreHere,
search,
breadcrumbs,
headings,
interactive,
textBlocks,
totalWords,
};
});
return JSON.stringify(data, null, 2);
}
default:
throw new Error(`Unknown meta command: ${command}`);
}
}