feat(browse): Puppeteer parity — load-html, screenshot --selector, viewport --scale, file:// (v1.1.0.0) (#1062)

* feat(browse): TabSession loadedHtml + command aliases + DX polish primitives

Adds the foundation layer for Puppeteer-parity features:

- TabSession.loadedHtml + setTabContent/getLoadedHtml/clearLoadedHtml —
  enables load-html content to survive context recreation (viewport --scale)
  via in-memory replay. ASCII lifecycle diagram in the source explains the
  clear-before-navigation contract.

- COMMAND_ALIASES + canonicalizeCommand() helper — single source of truth
  for name aliases (setcontent / set-content / setContent → load-html),
  consumed by server dispatch and chain prevalidation.

- buildUnknownCommandError() pure function — rich error messages with
  Levenshtein-based "Did you mean" suggestions (distance ≤ 2, input
  length ≥ 4 to skip 2-letter noise) and NEW_IN_VERSION upgrade hints.

- load-html registered in WRITE_COMMANDS + SCOPE_WRITE so scoped write
  tokens can use it.

- screenshot and viewport descriptions updated for upcoming flags.

- New browse/test/dx-polish.test.ts (15 tests): alias canonicalization,
  Levenshtein threshold + alphabetical tiebreak, short-input guard,
  NEW_IN_VERSION upgrade hint, alias + scope integration invariants.

No consumers yet — pure additive foundation. Safe to bisect on its own.

* feat(browse): accept file:// in goto with smart cwd/home-relative parsing

Extends validateNavigationUrl to accept file:// URLs scoped to safe dirs
(cwd + TEMP_DIR) via the existing validateReadPath policy. The workhorse is a
new normalizeFileUrl() helper that handles non-standard relative forms BEFORE
the WHATWG URL parser sees them:

    file:///abs/path.html       → unchanged
    file://./docs/page.html     → file://<cwd>/docs/page.html
    file://~/Documents/page.html → file://<HOME>/Documents/page.html
    file://docs/page.html       → file://<cwd>/docs/page.html
    file://localhost/abs/path   → unchanged
    file://host.example.com/... → rejected (UNC/network)
    file:// and file:///        → rejected (would list a directory)

Host heuristic rejects segments with '.', ':', '\\', '%', IPv6 brackets, or
Windows drive-letter patterns — so file://docs.v1/page.html, file://127.0.0.1/x,
file://[::1]/x, and file://C:/Users/x are explicit errors.

Uses fileURLToPath() + pathToFileURL() from node:url (never string-concat) so
URL escapes like %20 decode correctly and Node rejects encoded-slash traversal
(%2F..%2F) outright.
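
The normalization table above can be sketched roughly as follows. This is a minimal illustration, not the project's actual normalizeFileUrl(): the function name, the injectable `cwd`/`home` parameters, and the error wording are all assumptions, and the expected results assume POSIX paths.

```typescript
import * as os from "node:os";
import * as path from "node:path";
import { pathToFileURL } from "node:url";

/** Hypothetical re-creation of the table above; not the real helper. */
export function normalizeFileUrlSketch(
  input: string,
  cwd: string = process.cwd(), // injectable for tests (assumption)
  home: string = os.homedir(), // injectable for tests (assumption)
): string {
  if (!input.toLowerCase().startsWith("file://")) return input;
  const rest = input.slice("file://".length);

  // file:// and file:/// would resolve to a directory listing: reject.
  if (rest === "" || rest === "/") {
    throw new Error(`file URL has no path: ${input}`);
  }
  // Absolute and localhost forms are passed through unchanged.
  if (rest.startsWith("/") || rest.toLowerCase().startsWith("localhost/")) {
    return input;
  }
  // Home-relative and explicit cwd-relative forms.
  if (rest.startsWith("~/")) {
    return pathToFileURL(path.join(home, rest.slice(2))).href;
  }
  if (rest.startsWith("./")) {
    return pathToFileURL(path.resolve(cwd, rest.slice(2))).href;
  }
  // Bare first segment: cwd-relative unless it looks like a host
  // ('.', ':', '\', '%', or IPv6 brackets; ':' also catches drive letters).
  const firstSegment = rest.split("/")[0];
  if (/[.:\\%\[\]]/.test(firstSegment)) {
    throw new Error(`file URL looks like a network host: ${firstSegment}`);
  }
  return pathToFileURL(path.resolve(cwd, rest)).href;
}
```

Building the result with pathToFileURL() rather than string concatenation means characters such as spaces are percent-encoded by Node instead of by hand.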

Signature change: validateNavigationUrl now returns Promise<string> (the
normalized URL) instead of Promise<void>. Existing callers that ignore the
return value still compile; they just don't get smart parsing until they are
migrated in the next few commits (goto, diff, newTab, restoreState).

Rewrites the url-validation test file: updates existing tests for the new
return type, adds 20+ new tests covering every normalizeFileUrl shape variant,
URL-encoding edge cases, and path-traversal rejection.

References: codex consult v3 P1 findings on URL parser semantics and fileURLToPath.

* feat(browse): BrowserManager deviceScaleFactor + setContent replay + file:// plumbing

Three tightly-coupled changes to BrowserManager, all in service of the
Puppeteer-parity workflow:

1. deviceScaleFactor + currentViewport tracking. New private fields (default
   scale=1, viewport=1280x720) + setDeviceScaleFactor(scale, w, h) method.
   deviceScaleFactor is a context-level Playwright option — changing it
   requires recreateContext(). The method validates (finite number, 1-3 cap,
   headed-mode rejected), stores new values, calls recreateContext(), and
   rolls back the fields on failure so a bad call doesn't leave inconsistent
   state. Context options at all three sites (launch, recreate happy path,
   recreate fallback) now honor the stored values instead of hardcoding
   1280x720.

2. BrowserState.loadedHtml + loadedHtmlWaitUntil. saveState captures per-tab
   loadedHtml from the session; restoreState replays it via
   newSession.setTabContent() (NOT bare page.setContent()) so
   TabSession.loadedHtml is rehydrated and survives *subsequent* scale
   changes. In-memory only, never persisted to disk (HTML may contain
   secrets or customer data).

3. newTab + restoreState now consume validateNavigationUrl's normalized
   return value. file://./x, file://~/x, and bare-segment forms now take
   effect at every navigation site, not just the top-level goto command.

Together these enable: load-html → viewport --scale 2 → viewport --scale 1.5
→ screenshot, with content surviving both context recreations. Codex v2 P0
flagged that bare page.setContent in restoreState would lose content on the
second scale change — this commit implements the rehydration path.

References: codex v2 P0 (TabSession rehydration), codex v3 P1 (4-caller
return value), plan Feature 3 + Feature 4.

* feat(browse): load-html, screenshot --selector, viewport --scale, alias dispatch

Wires the new handlers and dispatch logic that the previous commits made
possible:

write-commands.ts
- New 'load-html' case: validateReadPath for safe-dir scoping; stat-based
  actionable errors (not found, directory, oversize); extension allowlist
  (.html/.htm/.xhtml/.svg); magic-byte sniff that strips a UTF-8 BOM and
  accepts any <[a-zA-Z!?] markup opener, not just <!doctype, so bare
  fragments like <div>...</div> work for setContent; 50MB cap, overridable
  via GSTACK_BROWSE_MAX_HTML_BYTES; frame-context rejection. Calls
  session.setTabContent() so replay metadata is rehydrated.
- viewport command extended: optional [<WxH>], optional [--scale <n>],
  scale-only variant reads current size via page.viewportSize(). Invalid
  scale (NaN, Infinity, empty, out of 1-3) throws with named value. Headed
  mode rejected explicitly.
- clearLoadedHtml() called BEFORE goto/back/forward/reload navigation
  (not after) so a timed-out goto post-commit doesn't leave stale metadata
  that could resurrect on a later context recreation. Codex v2 P1 catch.
- goto uses validateNavigationUrl's normalized return value.
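
The magic-byte sniff described in the load-html bullet above could look something like this. It is a sketch under stated assumptions: the function name is hypothetical, and skipping leading ASCII whitespace before the opener is a guess, not something the commit message confirms.

```typescript
/** Hypothetical BOM-tolerant markup sniff; not the project's actual check. */
export function looksLikeMarkup(bytes: Uint8Array): boolean {
  let i = 0;
  // Strip a UTF-8 byte-order mark (EF BB BF) if present.
  if (bytes.length >= 3 && bytes[0] === 0xef && bytes[1] === 0xbb && bytes[2] === 0xbf) {
    i = 3;
  }
  // Assumption: tolerate leading space, tab, LF, and CR before the opener.
  while (
    i < bytes.length &&
    (bytes[i] === 0x20 || bytes[i] === 0x09 || bytes[i] === 0x0a || bytes[i] === 0x0d)
  ) {
    i++;
  }
  // Require "<" followed by a letter, "!" or "?", i.e. <[a-zA-Z!?].
  if (i + 1 >= bytes.length || bytes[i] !== 0x3c) return false;
  const next = bytes[i + 1];
  const isLetter = (next >= 0x41 && next <= 0x5a) || (next >= 0x61 && next <= 0x7a);
  return isLetter || next === 0x21 || next === 0x3f;
}
```

Working on raw bytes keeps the check independent of text decoding, so a PNG renamed to .html fails on its first byte (0x89) without being decoded at all.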

meta-commands.ts
- screenshot --selector <css> flag: explicit element-screenshot form.
  Rejected when combined with a positional selector (both = error); preserves
  the --clip conflict at line 161 and composes with --base64 at lines 168-174.
- chain canonicalizes each step with canonicalizeCommand — step shape is
  now { rawName, name, args } so prevalidation, dispatch, WRITE_COMMANDS.has,
  watch blocking, and result labels all use canonical names while audit
  labels show 'rawName→name' when aliased. Codex v3 P2 catch — prior shape
  only canonicalized at prevalidation and diverged everywhere else.
- diff command consumes validateNavigationUrl return value for both URLs.
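
The chain step shape described above can be illustrated with a small sketch. The alias table mirrors the three aliases named in this commit; the `ChainStep` interface, `toChainStep`, and `auditLabel` are hypothetical names chosen for illustration, and a Map is used here purely as a design choice for the sketch.

```typescript
// Aliases named in this commit; the Map container is this sketch's choice.
const ALIASES = new Map<string, string>([
  ["setcontent", "load-html"],
  ["set-content", "load-html"],
  ["setContent", "load-html"],
]);

/** Hypothetical step shape: raw input kept next to the canonical name. */
interface ChainStep {
  rawName: string;
  name: string;
  args: string[];
}

/** Canonicalize once, up front, so every later check sees the same name. */
export function toChainStep(command: string[]): ChainStep {
  if (command.length === 0) throw new Error("empty chain step");
  const [rawName, ...args] = command;
  return { rawName, name: ALIASES.get(rawName) ?? rawName, args };
}

/** Audit label: 'rawName→name' when aliased, otherwise just the name. */
export function auditLabel(step: ChainStep): string {
  return step.rawName === step.name ? step.name : `${step.rawName}→${step.name}`;
}
```

Carrying both names on the step lets prevalidation, dispatch, and write-command checks read `name`, while only the audit trail reads `rawName`.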

server.ts
- Command canonicalization inserted immediately after parse, before scope /
  watch / tab-ownership / content-wrapping checks. rawCommand preserved for
  future audit (not wired into audit log in this commit — follow-up).
- Unknown-command handler replaced with buildUnknownCommandError() from
  commands.ts — produces 'Unknown command: X. Did you mean Y?' with optional
  upgrade hint for NEW_IN_VERSION entries.

security-audit-r2.test.ts
- Updated chain-loop marker from 'for (const cmd of commands)' to
  'for (const c of commands)' to match the new chain step shape. Same
  isWatching + BLOCKED invariants still asserted.

* chore: bump version and changelog (v1.1.0.0)

- VERSION: 1.0.0.0 → 1.1.0.0 (MINOR bump — new user-facing commands)
- package.json: matching version bump
- CHANGELOG.md: new 1.1.0.0 entry describing load-html, screenshot --selector,
  viewport --scale, file:// support, setContent replay, and DX polish in user
  voice with a dedicated Security section for file:// safe-dirs policy
- browse/SKILL.md.tmpl: adds pattern #12 "Render local HTML", pattern #13
  "Retina screenshots", and a full Puppeteer → browse cheatsheet with
  side-by-side API mapping and a worked tweet-renderer migration example
- browse/SKILL.md + SKILL.md: regenerated from templates via `bun run gen:skill-docs`
  to reflect the new command descriptions

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: pre-landing review fixes (9 findings from specialist + adversarial review)

Adversarial review (Claude subagent + Codex) surfaced 9 bugs across
CRITICAL/HIGH severity. All fixed:

1. tab-session.ts:setTabContent — state mutation moved AFTER the setContent
   await. Prior order left phantom HTML in replay metadata if setContent
   threw (timeout, browser crash), which a later viewport --scale would
   silently replay. Now loadedHtml is only recorded on successful load.

2. browser-manager.ts:setDeviceScaleFactor — rollback now forces a second
   recreateContext after restoring the old fields. The fallback path in
   the original recreateContext builds a blank context using whatever
   this.deviceScaleFactor/currentViewport hold at that moment (which were
   the NEW values we were trying to apply). Rolling back the fields without
   a second recreate left the live context at new-scale while state tracked
   old-scale. Now: restore the fields, force a second recreate with the old
   values, and return a combined error only if that ALSO fails.

3. commands.ts:buildUnknownCommandError — Levenshtein tiebreak simplified
   to 'd <= 2 && d < bestDist' (strict less). Candidates are pre-sorted
   alphabetically, so first equal-distance wins by default. The prior
   '(d === bestDist && best !== undefined && cand < best)' clause was dead
   code.

4. tab-session.ts:onMainFrameNavigated — now clears loadedHtml, not just
   refs + frame. Without this, a user who load-html'd then clicked a link
   (or had a form submit / JS redirect / OAuth flow) would retain the stale
   replay metadata. The next viewport --scale would silently revert the
   tab to the ORIGINAL loaded HTML, losing whatever the post-navigation
   content was. Silent data corruption. Browser-emitted navigations trigger
   this path via wirePageEvents.

5. browser-manager.ts:saveState + restoreState — tab ownership now flows
   through BrowserState.owner. Without this, a scoped agent's viewport
   --scale would strand them: tab IDs change during recreate, ownership
   map held stale IDs, owner lookup failed. New IDs had no owner, so
   writes without tabId were denied (DoS). Worse, if the agent sent a
   stale tabId the server's swallowed-tab-switch-error path would let the
   command hit whatever tab was currently active (cross-tab authz bypass).
   Now: clear ownership before restore, re-add per-tab with new IDs.

6. meta-commands.ts:state load — disk-loaded state.pages is now explicit
   allowlist (url, isActive, storage:null) instead of object spread.
   Spreading accepted loadedHtml, loadedHtmlWaitUntil, and owner from a
   user-writable state file, letting a tampered state.json smuggle HTML
   past load-html's safe-dirs / extension / magic-byte / 50MB-cap
   validators, or forge tab ownership. Now stripped at the boundary.

7. url-validation.ts:normalizeFileUrl — preserves query string + fragment
   across normalization. file://./app.html?route=home#login previously
   resolved to a filesystem path that URL-encoded '?' as %3F and '#' as
   %23, or (for absolute forms) pathToFileURL dropped them entirely. SPAs
   and fixture URLs with query params 404'd or loaded the wrong route.
   Now: split on ?/# before path resolution, reattach after.

8. url-validation.ts:validateNavigationUrl — reattaches parsed.search +
   parsed.hash to the normalized file:// URL. Same fix at the main
   validator for absolute paths that go through fileURLToPath round-trip.

9. server.ts:writeAuditEntry — audit entries now include aliasOf when the
   user typed an alias ('setcontent' → cmd: 'load-html', aliasOf:
   'setcontent'). Previously the isAliased variable was computed but
   dropped, losing the raw input from the forensic trail. Completes the
   plan's codex v3 P2 requirement.
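
The split-and-reattach approach from finding 7 can be sketched as below. Both function names and the `resolvePath` callback are hypothetical; the real code presumably resolves the path through fileURLToPath/pathToFileURL, which this sketch deliberately leaves to the caller.

```typescript
/** Split a raw URL tail at the first '?' or '#', whichever comes first. */
export function splitUrlSuffix(raw: string): { pathPart: string; suffix: string } {
  const cut = raw.search(/[?#]/);
  return cut === -1
    ? { pathPart: raw, suffix: "" }
    : { pathPart: raw.slice(0, cut), suffix: raw.slice(cut) };
}

/**
 * Resolve only the path portion, then reattach query + fragment untouched,
 * so '?' and '#' never reach the filesystem-path encoder.
 */
export function normalizeKeepingSuffix(
  raw: string,
  resolvePath: (pathPart: string) => string, // hypothetical resolver hook
): string {
  const { pathPart, suffix } = splitUrlSuffix(raw);
  return resolvePath(pathPart) + suffix;
}
```

One trade-off of this approach is that a file whose name literally contains '?' or '#' has to be written with the character percent-encoded, since the unencoded form is now always treated as a delimiter.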

Also added bm.getCurrentViewport() and switched the size-less form of
'viewport --scale' to read from it (more reliable than page.viewportSize()
on headed/transition contexts).

Tests pass: exit 0, no failures. Build clean.

* test: integration coverage for load-html, screenshot --selector, viewport --scale, replay, aliases

Adds 28 Playwright-integration tests that close the coverage gap flagged
by the ship-workflow coverage audit (50% → expected ~80%+).

**load-html (12 tests):**
- happy path loads HTML file, page text matches
- bare HTML fragments (<div>...</div>) accepted, not just full documents
- missing file arg throws usage
- non-.html extension rejected by allowlist
- /etc/passwd.html rejected by safe-dirs policy
- ENOENT path rejected with actionable "not found" error
- directory target rejected
- binary file (PNG magic bytes) disguised as .html rejected by magic-byte check
- UTF-8 BOM stripped before magic-byte check — BOM-prefixed HTML accepted
- --wait-until networkidle exercises non-default branch
- invalid --wait-until value rejected
- unknown flag rejected

**screenshot --selector (5 tests):**
- --selector flag captures element, validates the "Screenshot saved (element)" output
- conflicts with positional selector (both = error)
- conflicts with --clip (mutually exclusive)
- composes with --base64 (returns data:image/png;base64,...)
- missing value throws usage

**viewport --scale (5 tests):**
- WxH --scale 2 produces PNG with 2x element dimensions (parses IHDR bytes 16-23)
- --scale without WxH keeps current size + applies scale
- non-finite value (abc) throws "not a finite number"
- out-of-range (4, 0.5) throws "between 1 and 3"
- missing value throws
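
For reference, reading the pixel dimensions from IHDR bytes 16-23, as the first test in this group does, might look like this. The helper name is hypothetical; the byte layout (8-byte signature, 4-byte chunk length, 4-byte "IHDR" type, then big-endian width and height) comes from the PNG format itself.

```typescript
/** Hypothetical helper: read width/height from a PNG's IHDR chunk. */
export function readPngSize(png: Uint8Array): { width: number; height: number } {
  const signature = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];
  if (png.length < 24 || !signature.every((b, i) => png[i] === b)) {
    throw new Error("not a PNG");
  }
  // Respect byteOffset so this also works on Node Buffers sliced from a pool.
  const view = new DataView(png.buffer, png.byteOffset, png.byteLength);
  return {
    width: view.getUint32(16, false), // bytes 16-19, big-endian
    height: view.getUint32(20, false), // bytes 20-23, big-endian
  };
}
```

A test can then screenshot the same element at scale 1 and scale 2 and compare the two sizes, rather than hardcoding pixel counts.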

**setContent replay across context recreation (3 tests):**
- load-html → viewport --scale 2: content survives (hits setTabContent replay path)
- double cycle 2x → 1.5x: content still survives (proves TabSession rehydration)
- goto after load-html clears replay: subsequent viewport --scale does NOT
  resurrect the stale HTML (validates the onMainFrameNavigated fix)

**Command aliases (2 tests):**
- setcontent routes to load-html via chain canonicalization
- set-content (hyphenated) also routes — both end-to-end through chain dispatch

Fixture paths use /tmp (SAFE_DIRECTORIES entry) instead of $TMPDIR, which is
/var/folders/... on macOS and outside the safe-dirs boundary. Chain result
labels use rawName→name format when an alias is resolved (matches the
meta-commands.ts chain refactor).

Full suite: exit 0, 223/223 pass.

* docs: update BROWSER.md + CHANGELOG for v1.1.0.0

BROWSER.md:
- Command reference table updated: goto now lists file:// support,
  load-html added to Navigate row, viewport flagged with --scale
  option, screenshot row shows --selector + --base64 flags
- Screenshot modes table adds the fifth mode (element crop via
  --selector flag) and notes the tag-selector-not-caught-positionally
  gotcha
- New "Retina screenshots — viewport --scale" subsection explains
  deviceScaleFactor mechanics, context recreation side effects, and
  headed-mode rejection
- New "Loading local HTML — goto file:// vs load-html" subsection
  explains the two paths, their tradeoffs (URL state, relative asset
  resolution), the safe-dirs policy, extension allowlist + magic-byte
  sniff, 50MB cap, setContent replay across recreateContext, and the
  alias routing (setcontent → load-html before scope check)

CHANGELOG.md (v1.1.0.0 security section expanded, no existing content
removed):
- State files cannot smuggle HTML or forge tab ownership (allowlist
  on disk-loaded page fields)
- Audit log records aliasOf when a canonical command was reached via
  an alias (setcontent → load-html)
- load-html content clears on real navigations (clicks, form submits,
  JS redirects) — not just explicit goto. Also notes SPA query/fragment
  preservation for goto file://

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Garry Tan
2026-04-18 23:25:33 +08:00
committed by GitHub
parent 4d2c8d94d0
commit c15b805cd8
20 changed files with 1439 additions and 92 deletions
+4
@@ -18,6 +18,9 @@ import * as fs from 'fs';
export interface AuditEntry {
ts: string;
cmd: string;
/** If the agent typed an alias (e.g. 'setcontent'), the raw input is preserved here
* while `cmd` holds the canonical name ('load-html'). Omitted when cmd === rawCmd. */
aliasOf?: string;
args: string;
origin: string;
durationMs: number;
@@ -56,6 +59,7 @@ export function writeAuditEntry(entry: AuditEntry): void {
hasCookies: entry.hasCookies,
mode: entry.mode,
};
if (entry.aliasOf) record.aliasOf = entry.aliasOf;
if (truncatedError) record.error = truncatedError;
fs.appendFileSync(auditPath, JSON.stringify(record) + '\n');
+132 -13
@@ -31,6 +31,18 @@ export interface BrowserState {
url: string;
isActive: boolean;
storage: { localStorage: Record<string, string>; sessionStorage: Record<string, string> } | null;
/**
* HTML content loaded via load-html (setContent), replayed after context recreation.
* In-memory only — never persisted to disk (HTML may contain secrets or customer data).
*/
loadedHtml?: string;
loadedHtmlWaitUntil?: 'load' | 'domcontentloaded' | 'networkidle';
/**
* Tab owner clientId for multi-agent isolation. Survives context recreation so
* scoped agents don't get locked out of their own tabs after viewport --scale.
* In-memory only.
*/
owner?: string;
}>;
}
@@ -44,6 +56,14 @@ export class BrowserManager {
private extraHeaders: Record<string, string> = {};
private customUserAgent: string | null = null;
// ─── Viewport + deviceScaleFactor (context options) ──────────
// Tracked at the manager level so recreateContext() preserves them.
// deviceScaleFactor is a *context* option, not a page-level setter — changes
// require recreateContext(). Viewport width/height can change on-page, but we
// track the latest so context recreation restores it instead of hardcoding 1280x720.
private deviceScaleFactor: number = 1;
private currentViewport: { width: number; height: number } = { width: 1280, height: 720 };
/** Server port — set after server starts, used by cookie-import-browser command */
public serverPort: number = 0;
@@ -197,7 +217,8 @@ export class BrowserManager {
});
const contextOptions: BrowserContextOptions = {
-viewport: { width: 1280, height: 720 },
+viewport: { width: this.currentViewport.width, height: this.currentViewport.height },
+deviceScaleFactor: this.deviceScaleFactor,
};
if (this.customUserAgent) {
contextOptions.userAgent = this.customUserAgent;
@@ -550,9 +571,12 @@ export class BrowserManager {
async newTab(url?: string, clientId?: string): Promise<number> {
if (!this.context) throw new Error('Browser not launched');
-// Validate URL before allocating page to avoid zombie tabs on rejection
+// Validate URL before allocating page to avoid zombie tabs on rejection.
+// Use the normalized return value for navigation — it handles file://./x and
+// file://<segment> cwd-relative forms that the standard URL parser doesn't.
+let normalizedUrl: string | undefined;
if (url) {
-await validateNavigationUrl(url);
+normalizedUrl = await validateNavigationUrl(url);
}
const page = await this.context.newPage();
@@ -569,8 +593,8 @@ export class BrowserManager {
// Wire up console/network/dialog capture
this.wirePageEvents(page);
-if (url) {
-await page.goto(url, { waitUntil: 'domcontentloaded', timeout: 15000 });
+if (normalizedUrl) {
+await page.goto(normalizedUrl, { waitUntil: 'domcontentloaded', timeout: 15000 });
}
return id;
@@ -792,6 +816,7 @@ export class BrowserManager {
// ─── Viewport ──────────────────────────────────────────────
async setViewport(width: number, height: number) {
this.currentViewport = { width, height };
await this.getPage().setViewportSize({ width, height });
}
@@ -858,10 +883,21 @@ export class BrowserManager {
sessionStorage: { ...sessionStorage },
}));
} catch {}
// Capture load-html content so a later context recreation (viewport --scale)
// can replay it via setTabContent. Never persisted to disk.
const session = this.tabSessions.get(id);
const loaded = session?.getLoadedHtml();
// Preserve tab ownership through recreation so scoped agents aren't locked out.
const owner = this.tabOwnership.get(id);
pages.push({
url: url === 'about:blank' ? '' : url,
isActive: id === this.activeTabId,
storage,
loadedHtml: loaded?.html,
loadedHtmlWaitUntil: loaded?.waitUntil,
owner,
});
}
@@ -881,25 +917,49 @@ export class BrowserManager {
await this.context.addCookies(state.cookies);
}
+// Clear stale ownership — the old tab IDs are gone. We'll re-add per-tab
+// owners below as each saved tab gets a fresh ID. Without this reset, old
+// tabId → clientId entries would linger and match new tabs with the same
+// sequential IDs, silently granting ownership to the wrong clients.
+this.tabOwnership.clear();
// Re-create pages
let activeId: number | null = null;
for (const saved of state.pages) {
const page = await this.context.newPage();
const id = this.nextTabId++;
this.pages.set(id, page);
-this.tabSessions.set(id, new TabSession(page));
+const newSession = new TabSession(page);
+this.tabSessions.set(id, newSession);
this.wirePageEvents(page);
-if (saved.url) {
-// Validate the saved URL before navigating — the state file is user-writable and
-// a tampered URL could navigate to cloud metadata endpoints or file:// URIs.
+// Restore tab ownership for the new ID — preserves scoped-agent isolation
+// across context recreation (viewport --scale, user-agent change, handoff).
+if (saved.owner) {
+this.tabOwnership.set(id, saved.owner);
+}
+if (saved.loadedHtml) {
+// Replay load-html content via setTabContent — this rehydrates
+// TabSession.loadedHtml so the next saveState sees it. page.setContent()
+// alone would restore the DOM but lose the replay metadata.
try {
-await validateNavigationUrl(saved.url);
+await newSession.setTabContent(saved.loadedHtml, { waitUntil: saved.loadedHtmlWaitUntil });
} catch (err: any) {
+console.warn(`[browse] Failed to replay loadedHtml for tab ${id}: ${err.message}`);
+}
+} else if (saved.url) {
+// Validate the saved URL before navigating — the state file is user-writable and
+// a tampered URL could navigate to cloud metadata endpoints. Use the normalized
+// return value so file:// forms get consistent treatment with live goto.
+let normalizedUrl: string;
+try {
+normalizedUrl = await validateNavigationUrl(saved.url);
+} catch (err: any) {
console.warn(`[browse] Skipping invalid URL in state file: ${saved.url}${err.message}`);
continue;
}
-await page.goto(saved.url, { waitUntil: 'domcontentloaded', timeout: 15000 }).catch(() => {});
+await page.goto(normalizedUrl, { waitUntil: 'domcontentloaded', timeout: 15000 }).catch(() => {});
}
if (saved.storage) {
@@ -960,7 +1020,8 @@ export class BrowserManager {
// 3. Create new context with updated settings
const contextOptions: BrowserContextOptions = {
-viewport: { width: 1280, height: 720 },
+viewport: { width: this.currentViewport.width, height: this.currentViewport.height },
+deviceScaleFactor: this.deviceScaleFactor,
};
if (this.customUserAgent) {
contextOptions.userAgent = this.customUserAgent;
@@ -983,7 +1044,8 @@ export class BrowserManager {
if (this.context) await this.context.close().catch(() => {});
const contextOptions: BrowserContextOptions = {
-viewport: { width: 1280, height: 720 },
+viewport: { width: this.currentViewport.width, height: this.currentViewport.height },
+deviceScaleFactor: this.deviceScaleFactor,
};
if (this.customUserAgent) {
contextOptions.userAgent = this.customUserAgent;
@@ -998,6 +1060,63 @@ export class BrowserManager {
}
}
/**
* Change deviceScaleFactor + viewport size atomically.
*
* deviceScaleFactor is a context-level option, so Playwright requires a full context
* recreation. This method validates the input, stores the new values, calls
* recreateContext(), and rolls back the fields on failure so a bad call doesn't
* leave the manager in an inconsistent state.
*
* Returns null on success, or an error string if the new context couldn't be built
* (state may have been lost, per recreateContext's fallback behavior).
*/
async setDeviceScaleFactor(scale: number, width: number, height: number): Promise<string | null> {
if (!Number.isFinite(scale)) {
throw new Error(`viewport --scale: value must be a finite number, got ${scale}`);
}
if (scale < 1 || scale > 3) {
throw new Error(`viewport --scale: value must be between 1 and 3 (gstack policy cap), got ${scale}`);
}
if (this.connectionMode === 'headed') {
throw new Error('viewport --scale is not supported in headed mode — scale is controlled by the real browser window.');
}
const prevScale = this.deviceScaleFactor;
const prevViewport = { ...this.currentViewport };
this.deviceScaleFactor = scale;
this.currentViewport = { width, height };
const err = await this.recreateContext();
if (err !== null) {
// recreateContext's fallback path built a blank context using the NEW scale +
// viewport (the fields we just set). Rolling the fields back without a second
// recreate would leave the live context at new-scale while state says old-scale.
// Roll back fields FIRST, then force a second recreate against the old values
// so live state matches tracked state.
this.deviceScaleFactor = prevScale;
this.currentViewport = prevViewport;
const rollbackErr = await this.recreateContext();
if (rollbackErr !== null) {
// Second recreate also failed. We're in a clean blank slate via fallback, but
// with old scale. Return a combined message that leads with the primary failure.
return `${err} (rollback also encountered: ${rollbackErr})`;
}
return err;
}
return null;
}
/** Read current deviceScaleFactor (for tests + debug). */
getDeviceScaleFactor(): number {
return this.deviceScaleFactor;
}
/** Read current tracked viewport (for tests + `viewport --scale` size fallback). */
getCurrentViewport(): { width: number; height: number } {
return { ...this.currentViewport };
}
// ─── Handoff: Headless → Headed ─────────────────────────────
/**
* Hand off browser control to the user by relaunching in headed mode.
+103 -3
@@ -21,6 +21,7 @@ export const READ_COMMANDS = new Set([
export const WRITE_COMMANDS = new Set([
'goto', 'back', 'forward', 'reload',
'load-html',
'click', 'fill', 'select', 'hover', 'type', 'press', 'scroll', 'wait',
'viewport', 'cookie', 'cookie-import', 'cookie-import-browser', 'header', 'useragent',
'upload', 'dialog-accept', 'dialog-dismiss',
@@ -64,7 +65,8 @@ export function wrapUntrustedContent(result: string, url: string): string {
export const COMMAND_DESCRIPTIONS: Record<string, { category: string; description: string; usage?: string }> = {
// Navigation
-'goto': { category: 'Navigation', description: 'Navigate to URL', usage: 'goto <url>' },
+'goto': { category: 'Navigation', description: 'Navigate to URL (http://, https://, or file:// scoped to cwd/TEMP_DIR)', usage: 'goto <url>' },
+'load-html': { category: 'Navigation', description: 'Load a local HTML file via setContent (no HTTP server needed). For self-contained HTML (inline CSS/JS, data URIs). For HTML on disk, goto file://... is often cleaner.', usage: 'load-html <file> [--wait-until load|domcontentloaded|networkidle]' },
'back': { category: 'Navigation', description: 'History back' },
'forward': { category: 'Navigation', description: 'History forward' },
'reload': { category: 'Navigation', description: 'Reload page' },
@@ -99,7 +101,7 @@ export const COMMAND_DESCRIPTIONS: Record<string, { category: string; descriptio
'scroll': { category: 'Interaction', description: 'Scroll element into view, or scroll to page bottom if no selector', usage: 'scroll [sel]' },
'wait': { category: 'Interaction', description: 'Wait for element, network idle, or page load (timeout: 15s)', usage: 'wait <sel|--networkidle|--load>' },
'upload': { category: 'Interaction', description: 'Upload file(s)', usage: 'upload <sel> <file> [file2...]' },
-'viewport':{ category: 'Interaction', description: 'Set viewport size', usage: 'viewport <WxH>' },
+'viewport':{ category: 'Interaction', description: 'Set viewport size and optional deviceScaleFactor (1-3, for retina screenshots). --scale requires a context rebuild.', usage: 'viewport [<WxH>] [--scale <n>]' },
'cookie': { category: 'Interaction', description: 'Set cookie on current page domain', usage: 'cookie <name>=<value>' },
'cookie-import': { category: 'Interaction', description: 'Import cookies from JSON file', usage: 'cookie-import <json>' },
'cookie-import-browser': { category: 'Interaction', description: 'Import cookies from installed Chromium browsers (opens picker, or use --domain for direct import)', usage: 'cookie-import-browser [browser] [--domain d]' },
@@ -112,7 +114,7 @@ export const COMMAND_DESCRIPTIONS: Record<string, { category: string; descriptio
'scrape': { category: 'Extraction', description: 'Bulk download all media from page. Writes manifest.json', usage: 'scrape <images|videos|media> [--selector sel] [--dir path] [--limit N]' },
'archive': { category: 'Extraction', description: 'Save complete page as MHTML via CDP', usage: 'archive [path]' },
// Visual
-'screenshot': { category: 'Visual', description: 'Save screenshot (supports element crop via CSS/@ref, --clip region, --viewport)', usage: 'screenshot [--viewport] [--clip x,y,w,h] [selector|@ref] [path]' },
+'screenshot': { category: 'Visual', description: 'Save screenshot. --selector targets a specific element (explicit flag form). Positional selectors starting with ./#/@/[ still work.', usage: 'screenshot [--selector <css>] [--viewport] [--clip x,y,w,h] [--base64] [selector|@ref] [path]' },
'pdf': { category: 'Visual', description: 'Save as PDF', usage: 'pdf [path]' },
'responsive': { category: 'Visual', description: 'Screenshots at mobile (375x812), tablet (768x1024), desktop (1280x720). Saves as {prefix}-mobile.png etc.', usage: 'responsive [prefix]' },
'diff': { category: 'Visual', description: 'Text diff between pages', usage: 'diff <url1> <url2>' },
@@ -161,3 +163,101 @@ for (const cmd of allCmds) {
for (const key of descKeys) {
if (!allCmds.has(key)) throw new Error(`COMMAND_DESCRIPTIONS has unknown command: ${key}`);
}
/**
* Command aliases — user-friendly names that route to canonical commands.
*
* Single source of truth: server.ts dispatch and meta-commands.ts chain prevalidation
* both import `canonicalizeCommand()`, so aliases resolve identically everywhere.
*
* When adding a new alias: keep the alias name guessable (e.g. setcontent → load-html
* helps agents migrating from Puppeteer's page.setContent()).
*/
export const COMMAND_ALIASES: Record<string, string> = {
'setcontent': 'load-html',
'set-content': 'load-html',
'setContent': 'load-html',
};
/** Resolve an alias to its canonical command name. Non-aliases pass through unchanged. */
export function canonicalizeCommand(cmd: string): string {
return COMMAND_ALIASES[cmd] ?? cmd;
}
/**
* Commands added in specific versions — enables future "this command was added in vX"
* upgrade hints in unknown-command errors. Only helps agents on *newer* browse builds
* that encounter typos of recently-added commands; does NOT help agents on old builds
* that type a new command (they don't have this map).
*/
export const NEW_IN_VERSION: Record<string, string> = {
'load-html': '0.19.0.0',
};
/**
* Levenshtein distance (dynamic programming).
* O(a.length * b.length) — fast for command name sizes (<20 chars).
*/
function levenshtein(a: string, b: string): number {
if (a === b) return 0;
if (a.length === 0) return b.length;
if (b.length === 0) return a.length;
const m: number[][] = [];
for (let i = 0; i <= a.length; i++) m.push([i, ...Array(b.length).fill(0)]);
for (let j = 0; j <= b.length; j++) m[0][j] = j;
for (let i = 1; i <= a.length; i++) {
for (let j = 1; j <= b.length; j++) {
const cost = a[i - 1] === b[j - 1] ? 0 : 1;
m[i][j] = Math.min(m[i - 1][j] + 1, m[i][j - 1] + 1, m[i - 1][j - 1] + cost);
}
}
return m[a.length][b.length];
}
/**
* Build an actionable error message for an unknown command.
*
* Pure function — takes the full command set + alias map + version map as args so tests
* can exercise the synthetic "older-version" case without mutating any global state.
*
* 1. Always names the input.
* 2. If Levenshtein distance ≤ 2 AND input.length ≥ 4, suggests the closest match
* (alphabetical tiebreak for determinism). Short-input guard prevents noisy
* suggestions for typos of 2-letter commands like 'js' or 'is'.
* 3. If the input appears in newInVersion, appends an upgrade hint. Honesty caveat:
* this only fires on builds that have this handler AND the map entry; agents on
* older builds hitting a newly-added command won't see it. Net benefit compounds
* as more commands land.
*/
export function buildUnknownCommandError(
command: string,
commandSet: Set<string>,
aliasMap: Record<string, string> = COMMAND_ALIASES,
newInVersion: Record<string, string> = NEW_IN_VERSION,
): string {
let msg = `Unknown command: '${command}'.`;
// Suggestion via Levenshtein, gated on input length to avoid noisy short-input matches.
// Candidates are pre-sorted alphabetically, so strict "d < bestDist" gives us the
// closest match with alphabetical tiebreak for free — first equal-distance candidate
// wins because subsequent equal-distance candidates fail the strict-less check.
if (command.length >= 4) {
let best: string | undefined;
let bestDist = 3; // sentinel: distance 3 would be rejected by the <= 2 gate below
const candidates = [...commandSet, ...Object.keys(aliasMap)].sort();
for (const cand of candidates) {
const d = levenshtein(command, cand);
if (d <= 2 && d < bestDist) {
best = cand;
bestDist = d;
}
}
if (best) msg += ` Did you mean '${best}'?`;
}
if (newInVersion[command]) {
msg += ` This command was added in browse v${newInVersion[command]}. Upgrade: cd ~/.claude/skills/gstack && git pull && bun run build.`;
}
return msg;
}
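The suggestion rule described above (distance at most 2, input length at least 4, alphabetical tiebreak) can be exercised in isolation. The following is a minimal sketch under those stated rules; `editDistance` and `suggest` are illustrative names, and the two-row distance routine is a swapped-in equivalent of the full-matrix version in the diff, not a copy of it.

```typescript
// Two-row Levenshtein distance (same result as a full DP matrix).
function editDistance(a: string, b: string): number {
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const cur: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      cur[j] = Math.min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = cur;
  }
  return prev[b.length];
}

// Returns the closest known command, or undefined when nothing qualifies.
function suggest(input: string, known: string[]): string | undefined {
  if (input.length < 4) return undefined; // short-input guard
  let best: string | undefined;
  let bestDist = 3; // sentinel just above the accepted maximum of 2
  // Sorted candidates plus a strict "<" give the alphabetical tiebreak.
  for (const cand of [...known].sort()) {
    const d = editDistance(input, cand);
    if (d <= 2 && d < bestDist) {
      best = cand;
      bestDist = d;
    }
  }
  return best;
}

console.log(suggest('relaod', ['goto', 'load-html', 'reload'])); // reload
console.log(suggest('clack', ['clock', 'click']));               // click
console.log(suggest('js', ['is', 'js']));                        // undefined
```

The second call shows the tiebreak: both candidates are one edit away, and the alphabetically earlier one wins.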
+60 -28
@@ -5,7 +5,7 @@
import type { BrowserManager } from './browser-manager';
import { handleSnapshot } from './snapshot';
import { getCleanText } from './read-commands';
import { READ_COMMANDS, WRITE_COMMANDS, META_COMMANDS, PAGE_CONTENT_COMMANDS, wrapUntrustedContent } from './commands';
import { READ_COMMANDS, WRITE_COMMANDS, META_COMMANDS, PAGE_CONTENT_COMMANDS, wrapUntrustedContent, canonicalizeCommand } from './commands';
import { validateNavigationUrl } from './url-validation';
import { checkScope, type TokenInfo } from './token-registry';
import { validateOutputPath, escapeRegExp } from './path-security';
@@ -124,11 +124,15 @@ export async function handleMetaCommand(
let base64Mode = false;
const remaining: string[] = [];
let flagSelector: string | undefined;
for (let i = 0; i < args.length; i++) {
if (args[i] === '--viewport') {
viewportOnly = true;
} else if (args[i] === '--base64') {
base64Mode = true;
} else if (args[i] === '--selector') {
flagSelector = args[++i];
if (!flagSelector) throw new Error('Usage: screenshot --selector <css> [path]');
} else if (args[i] === '--clip') {
const coords = args[++i];
if (!coords) throw new Error('Usage: screenshot --clip x,y,w,h [path]');
@@ -156,6 +160,14 @@ export async function handleMetaCommand(
}
}
// --selector flag takes precedence; conflict with positional selector.
if (flagSelector !== undefined) {
if (targetSelector !== undefined) {
throw new Error('--selector conflicts with positional selector — choose one');
}
targetSelector = flagSelector;
}
validateOutputPath(outputPath);
if (clipRect && targetSelector) {
@@ -244,27 +256,36 @@ export async function handleMetaCommand(
' or: browse chain \'goto url | click @e5 | snapshot -ic\''
);
let commands: string[][];
let rawCommands: string[][];
try {
commands = JSON.parse(jsonStr);
if (!Array.isArray(commands)) throw new Error('not array');
rawCommands = JSON.parse(jsonStr);
if (!Array.isArray(rawCommands)) throw new Error('not array');
} catch (err: any) {
// Fallback: pipe-delimited format "goto url | click @e5 | snapshot -ic"
if (!(err instanceof SyntaxError) && err?.message !== 'not array') throw err;
commands = jsonStr.split(' | ')
rawCommands = jsonStr.split(' | ')
.filter(seg => seg.trim().length > 0)
.map(seg => tokenizePipeSegment(seg.trim()));
}
// Canonicalize aliases across the whole chain. Pair canonical name with the raw
// input so result labels + error messages reflect what the user typed, but every
// dispatch path (scope check, WRITE_COMMANDS.has, watch blocking, handler lookup)
// uses the canonical name. Otherwise `chain '[["setcontent","/tmp/x.html"]]'`
// bypasses prevalidation or runs under the wrong command set.
const commands = rawCommands.map(cmd => {
const [rawName, ...cmdArgs] = cmd;
const name = canonicalizeCommand(rawName);
return { rawName, name, args: cmdArgs };
});
// Pre-validate ALL subcommands against the token's scope before executing any.
// This prevents partial execution where some subcommands succeed before a
// scope violation is hit, leaving the browser in an inconsistent state.
// Uses canonical name so aliases don't bypass scope checks.
if (tokenInfo && tokenInfo.clientId !== 'root') {
for (const cmd of commands) {
const [name] = cmd;
if (!checkScope(tokenInfo, name)) {
for (const c of commands) {
if (!checkScope(tokenInfo, c.name)) {
throw new Error(
`Chain rejected: subcommand "${name}" not allowed by your token scope (${tokenInfo.scopes.join(', ')}). ` +
`Chain rejected: subcommand "${c.rawName}" not allowed by your token scope (${tokenInfo.scopes.join(', ')}). ` +
`All subcommands must be within scope.`
);
}
@@ -280,30 +301,33 @@ export async function handleMetaCommand(
let lastWasWrite = false;
if (executeCmd) {
// Full security pipeline via handleCommandInternal
for (const cmd of commands) {
const [name, ...cmdArgs] = cmd;
// Full security pipeline via handleCommandInternal.
// Pass the canonical name so the server's own canonicalization is a no-op (already canonical).
for (const c of commands) {
const cr = await executeCmd(
{ command: name, args: cmdArgs },
{ command: c.name, args: c.args },
tokenInfo,
);
const label = c.rawName === c.name ? c.name : `${c.rawName}→${c.name}`;
if (cr.status === 200) {
results.push(`[${name}] ${cr.result}`);
results.push(`[${label}] ${cr.result}`);
} else {
// Parse error from JSON result
let errMsg = cr.result;
try { errMsg = JSON.parse(cr.result).error || cr.result; } catch (err: any) { if (!(err instanceof SyntaxError)) throw err; }
results.push(`[${name}] ERROR: ${errMsg}`);
results.push(`[${label}] ERROR: ${errMsg}`);
}
lastWasWrite = WRITE_COMMANDS.has(name);
lastWasWrite = WRITE_COMMANDS.has(c.name);
}
} else {
// Fallback: direct dispatch (CLI mode, no server context)
const { handleReadCommand } = await import('./read-commands');
const { handleWriteCommand } = await import('./write-commands');
for (const cmd of commands) {
const [name, ...cmdArgs] = cmd;
for (const c of commands) {
const name = c.name;
const cmdArgs = c.args;
const label = c.rawName === name ? name : `${c.rawName}→${name}`;
try {
let result: string;
if (WRITE_COMMANDS.has(name)) {
@@ -323,11 +347,11 @@ export async function handleMetaCommand(
result = await handleMetaCommand(name, cmdArgs, bm, shutdown, tokenInfo, opts);
lastWasWrite = false;
} else {
throw new Error(`Unknown command: ${name}`);
throw new Error(`Unknown command: ${c.rawName}`);
}
results.push(`[${name}] ${result}`);
results.push(`[${label}] ${result}`);
} catch (err: any) {
results.push(`[${name}] ERROR: ${err.message}`);
results.push(`[${label}] ERROR: ${err.message}`);
}
}
}
@@ -346,12 +370,12 @@ export async function handleMetaCommand(
if (!url1 || !url2) throw new Error('Usage: browse diff <url1> <url2>');
const page = bm.getPage();
await validateNavigationUrl(url1);
await page.goto(url1, { waitUntil: 'domcontentloaded', timeout: 15000 });
const normalizedUrl1 = await validateNavigationUrl(url1);
await page.goto(normalizedUrl1, { waitUntil: 'domcontentloaded', timeout: 15000 });
const text1 = await getCleanText(page);
await validateNavigationUrl(url2);
await page.goto(url2, { waitUntil: 'domcontentloaded', timeout: 15000 });
const normalizedUrl2 = await validateNavigationUrl(url2);
await page.goto(normalizedUrl2, { waitUntil: 'domcontentloaded', timeout: 15000 });
const text2 = await getCleanText(page);
const changes = Diff.diffLines(text1, text2);
@@ -608,9 +632,17 @@ export async function handleMetaCommand(
// Close existing pages, then restore (replace, not merge)
bm.setFrame(null);
await bm.closeAllPages();
// Allowlist disk-loaded page fields — NEVER accept loadedHtml, loadedHtmlWaitUntil,
// or owner from disk. Those are in-memory-only invariants; allowing them would let
// a tampered state file smuggle HTML past load-html's safe-dirs + magic-byte + size
// checks, or forge tab ownership for cross-agent authorization bypass.
await bm.restoreState({
cookies: validatedCookies,
pages: data.pages.map((p: any) => ({ ...p, storage: null })),
pages: data.pages.map((p: any) => ({
url: typeof p.url === 'string' ? p.url : '',
isActive: Boolean(p.isActive),
storage: null,
})),
});
return `State loaded: ${data.cookies.length} cookies, ${data.pages.length} pages`;
}
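The allowlist above can be illustrated on its own: each restored page is rebuilt from scratch with only the permitted fields, so anything else in the state file is dropped rather than spread through. A minimal sketch follows; `RestoredPage` and `sanitizePages` are hypothetical names, and the tampered record is made-up sample data.

```typescript
// Shape of a page entry after sanitization (illustrative, not the real type).
interface RestoredPage {
  url: string;
  isActive: boolean;
  storage: null;
}

// Rebuild every entry from an allowlist instead of spreading the input object.
function sanitizePages(pages: unknown[]): RestoredPage[] {
  return pages.map((p) => {
    const rec = (typeof p === 'object' && p !== null ? p : {}) as Record<string, unknown>;
    return {
      url: typeof rec.url === 'string' ? rec.url : '',
      isActive: Boolean(rec.isActive),
      storage: null,
    };
  });
}

// A tampered entry that tries to smuggle in-memory-only fields.
const tampered = [
  { url: 'https://example.com', isActive: 1, loadedHtml: '<script>x</script>', owner: 'agent-b' },
];
console.log(sanitizePages(tampered));
```

With object spread (`{ ...p, storage: null }`) the extra `loadedHtml` and `owner` keys would have survived; with the allowlist they cannot.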
+18 -4
@@ -19,7 +19,7 @@ import { handleWriteCommand } from './write-commands';
import { handleMetaCommand } from './meta-commands';
import { handleCookiePickerRoute, hasActivePicker } from './cookie-picker-routes';
import { sanitizeExtensionUrl } from './sidebar-utils';
import { COMMAND_DESCRIPTIONS, PAGE_CONTENT_COMMANDS, wrapUntrustedContent } from './commands';
import { COMMAND_DESCRIPTIONS, PAGE_CONTENT_COMMANDS, wrapUntrustedContent, canonicalizeCommand, buildUnknownCommandError, ALL_COMMANDS } from './commands';
import {
wrapUntrustedPageContent, datamarkContent,
runContentFilters, type ContentFilterResult,
@@ -916,12 +916,21 @@ async function handleCommandInternal(
tokenInfo?: TokenInfo | null,
opts?: { skipRateCheck?: boolean; skipActivity?: boolean; chainDepth?: number },
): Promise<CommandResult> {
const { command, args = [], tabId } = body;
const { args = [], tabId } = body;
const rawCommand = body.command;
if (!command) {
if (!rawCommand) {
return { status: 400, result: JSON.stringify({ error: 'Missing "command" field' }), json: true };
}
// ─── Alias canonicalization (before scope, watch, tab-ownership, dispatch) ─
// Agent-friendly names like 'setcontent' route to canonical 'load-html'. Must
// happen BEFORE scope check so a read-scoped token calling 'setcontent' is still
// rejected (load-html lives in SCOPE_WRITE). Audit logging preserves rawCommand
// so the trail records what the agent actually typed.
const command = canonicalizeCommand(rawCommand);
const isAliased = command !== rawCommand;
// ─── Recursion guard: reject nested chains ──────────────────
if (command === 'chain' && (opts?.chainDepth ?? 0) > 0) {
return { status: 400, result: JSON.stringify({ error: 'Nested chain commands are not allowed' }), json: true };
@@ -1090,10 +1099,13 @@ async function handleCommandInternal(
const helpText = generateHelpText();
return { status: 200, result: helpText };
} else {
// Use the rich unknown-command helper: names the input, suggests the closest
// match via Levenshtein (≤ 2 distance, ≥ 4 chars input), and appends an upgrade
// hint if the command is listed in NEW_IN_VERSION.
return {
status: 400, json: true,
result: JSON.stringify({
error: `Unknown command: ${command}`,
error: buildUnknownCommandError(rawCommand, ALL_COMMANDS),
hint: `Available commands: ${[...READ_COMMANDS, ...WRITE_COMMANDS, ...META_COMMANDS].sort().join(', ')}`,
}),
};
@@ -1148,6 +1160,7 @@ async function handleCommandInternal(
writeAuditEntry({
ts: new Date().toISOString(),
cmd: command,
aliasOf: isAliased ? rawCommand : undefined,
args: args.join(' '),
origin: browserManager.getCurrentUrl(),
durationMs: successDuration,
@@ -1192,6 +1205,7 @@ async function handleCommandInternal(
writeAuditEntry({
ts: new Date().toISOString(),
cmd: command,
aliasOf: isAliased ? rawCommand : undefined,
args: args.join(' '),
origin: browserManager.getCurrentUrl(),
durationMs: errorDuration,
+64 -1
@@ -24,6 +24,8 @@ export interface RefEntry {
name: string;
}
export type SetContentWaitUntil = 'load' | 'domcontentloaded' | 'networkidle';
export class TabSession {
readonly page: Page;
@@ -37,6 +39,30 @@ export class TabSession {
// ─── Frame context ─────────────────────────────────────────
private activeFrame: Frame | null = null;
// ─── Loaded HTML (for load-html replay across context recreation) ─
//
// loadedHtml lifecycle:
//
// load-html cmd ──▶ session.setTabContent(html, opts)
// ├─▶ page.setContent(html, opts)
// └─▶ this.loadedHtml = html
// this.loadedHtmlWaitUntil = opts.waitUntil
//
// goto/back/forward/reload ──▶ session.clearLoadedHtml()
// (BEFORE Playwright call, so timeouts
// don't leave stale state)
//
// viewport --scale ──▶ recreateContext()
// ├─▶ saveState() captures { url, loadedHtml } per tab
// │ (in-memory only, never to disk)
// └─▶ restoreState():
// for each tab with loadedHtml:
// newSession.setTabContent(html, opts)
// (NOT page.setContent — must rehydrate
// TabSession.loadedHtml too)
private loadedHtml: string | null = null;
private loadedHtmlWaitUntil: SetContentWaitUntil | undefined;
constructor(page: Page) {
this.page = page;
}
@@ -131,10 +157,47 @@ export class TabSession {
}
/**
* Called on main-frame navigation to clear stale refs and frame context.
* Called on main-frame navigation to clear stale refs, frame context, and any
* load-html replay metadata. Runs for every main-frame nav: explicit goto/back/
* forward/reload AND browser-emitted navigations (link clicks, form submits, JS
* redirects, OAuth). Without clearing loadedHtml here, a user who load-html'd and
* then clicked a link would silently revert to the original HTML on the next
* viewport --scale.
*/
onMainFrameNavigated(): void {
this.clearRefs();
this.activeFrame = null;
this.loadedHtml = null;
this.loadedHtmlWaitUntil = undefined;
}
// ─── Loaded HTML (load-html replay) ───────────────────────
/**
* Load HTML content into the tab AND store it for replay after context recreation
* (e.g. viewport --scale). Unlike page.setContent() alone, this rehydrates
* TabSession.loadedHtml so the next saveState()/restoreState() round-trip preserves
* the content.
*/
async setTabContent(html: string, opts: { waitUntil?: SetContentWaitUntil } = {}): Promise<void> {
const waitUntil = opts.waitUntil ?? 'domcontentloaded';
// Call setContent FIRST — only record the replay metadata after a successful load.
// If setContent throws (timeout, crash), we must not leave phantom HTML that a
// later viewport --scale would replay.
await this.page.setContent(html, { waitUntil, timeout: 15000 });
this.loadedHtml = html;
this.loadedHtmlWaitUntil = waitUntil;
}
/** Get stored HTML + waitUntil for state replay. Returns null if no load-html happened. */
getLoadedHtml(): { html: string; waitUntil?: SetContentWaitUntil } | null {
if (this.loadedHtml === null) return null;
return { html: this.loadedHtml, waitUntil: this.loadedHtmlWaitUntil };
}
/** Clear stored HTML. Called BEFORE goto/back/forward/reload navigation. */
clearLoadedHtml(): void {
this.loadedHtml = null;
this.loadedHtmlWaitUntil = undefined;
}
}
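The record-after-success contract described in the class above can be checked with a stub page. This is a sketch only: `PageLike` and `ReplaySession` are hypothetical stand-ins for Playwright's `Page` and the real `TabSession`, reduced to the three replay methods.

```typescript
type WaitUntil = 'load' | 'domcontentloaded' | 'networkidle';

// Minimal stand-in for the part of a browser page this sketch needs.
interface PageLike {
  setContent(html: string, opts: { waitUntil: WaitUntil }): Promise<void>;
}

class ReplaySession {
  private page: PageLike;
  private loadedHtml: string | null = null;
  private waitUntil: WaitUntil | undefined;

  constructor(page: PageLike) {
    this.page = page;
  }

  async setTabContent(html: string, waitUntil: WaitUntil = 'domcontentloaded'): Promise<void> {
    // Load first; replay metadata is recorded only if this does not throw.
    await this.page.setContent(html, { waitUntil });
    this.loadedHtml = html;
    this.waitUntil = waitUntil;
  }

  getLoadedHtml(): { html: string; waitUntil?: WaitUntil } | null {
    return this.loadedHtml === null ? null : { html: this.loadedHtml, waitUntil: this.waitUntil };
  }

  clearLoadedHtml(): void {
    this.loadedHtml = null;
    this.waitUntil = undefined;
  }
}

async function demo(): Promise<Array<string | null>> {
  const seen: Array<string | null> = [];
  const failing: PageLike = { setContent: async () => { throw new Error('timeout'); } };
  const ok: PageLike = { setContent: async () => {} };

  const a = new ReplaySession(failing);
  await a.setTabContent('<p>x</p>').catch(() => {});
  seen.push(a.getLoadedHtml()?.html ?? null); // failed load leaves nothing to replay

  const b = new ReplaySession(ok);
  await b.setTabContent('<p>y</p>');
  seen.push(b.getLoadedHtml()?.html ?? null); // successful load is recorded
  b.clearLoadedHtml();
  seen.push(b.getLoadedHtml()?.html ?? null); // cleared before a navigation
  return seen;
}

demo().then((seen) => console.log(seen));
```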
+1
@@ -46,6 +46,7 @@ export const SCOPE_READ = new Set([
/** Commands that modify page state or navigate */
export const SCOPE_WRITE = new Set([
'goto', 'back', 'forward', 'reload',
'load-html',
'click', 'fill', 'select', 'hover', 'type', 'press', 'scroll', 'wait',
'upload', 'viewport', 'newtab', 'closetab',
'dialog-accept', 'dialog-dismiss',
+162 -3
@@ -3,6 +3,11 @@
* Localhost and private IPs are allowed (primary use case: QA testing local dev servers).
*/
import { fileURLToPath, pathToFileURL } from 'node:url';
import * as path from 'node:path';
import * as os from 'node:os';
import { validateReadPath } from './path-security';
export const BLOCKED_METADATA_HOSTS = new Set([
'169.254.169.254', // AWS/GCP/Azure instance metadata
'fe80::1', // IPv6 link-local — common metadata endpoint alias
@@ -105,17 +110,169 @@ async function resolvesToBlockedIp(hostname: string): Promise<boolean> {
}
}
export async function validateNavigationUrl(url: string): Promise<void> {
/**
* Normalize non-standard file:// URLs into absolute form before the WHATWG URL parser
* sees them. Handles cwd-relative, home-relative, and bare-segment shapes that the
* standard parser would otherwise mis-interpret as hostnames.
*
* file:///abs/path.html → unchanged
* file://./<rel> → file://<cwd>/<rel>
* file://~/<rel> → file://<HOME>/<rel>
* file://<single-segment>/... → file://<cwd>/<single-segment>/... (cwd-relative)
* file://localhost/<abs> → unchanged
* file://<host-like>/... → unchanged (caller rejects via host heuristic)
*
* Rejects empty (file://) and root-only (file:///) URLs — these would silently
* trigger Chromium's directory listing, which is a different product surface.
*/
export function normalizeFileUrl(url: string): string {
if (!url.toLowerCase().startsWith('file:')) return url;
// Split off query + fragment BEFORE touching the path — SPAs + fixture URLs rely
// on these. pathToFileURL would percent-encode `?` and `#` as `%3F`/`%23` (and
// fileURLToPath drops query and fragment entirely), silently routing preview URLs to the
// wrong fixture. Extract, normalize the path, reattach at the end.
//
// Parse order: `?` before `#` per RFC 3986 — '?' in a fragment is literal.
// Find the FIRST `?` or `#`, whichever comes first, and take everything
// after (including the delimiter) as the trailing segment.
const qIdx = url.indexOf('?');
const hIdx = url.indexOf('#');
let delimIdx = -1;
if (qIdx >= 0 && hIdx >= 0) delimIdx = Math.min(qIdx, hIdx);
else if (qIdx >= 0) delimIdx = qIdx;
else if (hIdx >= 0) delimIdx = hIdx;
const pathPart = delimIdx >= 0 ? url.slice(0, delimIdx) : url;
const trailing = delimIdx >= 0 ? url.slice(delimIdx) : '';
const rest = pathPart.slice('file:'.length);
// file:/// or longer → standard absolute; pass through unchanged (caller validates path).
if (rest.startsWith('///')) {
// Reject bare root-only (file:/// with nothing after)
if (rest === '///' || rest === '////') {
throw new Error('Invalid file URL: file:/// has no path. Use file:///<absolute-path>.');
}
return pathPart + trailing;
}
// Everything else: must start with // (we accept file://... only)
if (!rest.startsWith('//')) {
throw new Error(`Invalid file URL: ${url}. Use file:///<absolute-path> or file://./<rel> or file://~/<rel>.`);
}
const afterDoubleSlash = rest.slice(2);
// Reject empty (file://) and trailing-slash-only (file://./ listing cwd).
if (afterDoubleSlash === '') {
throw new Error('Invalid file URL: file:// is empty. Use file:///<absolute-path>.');
}
if (afterDoubleSlash === '.' || afterDoubleSlash === './') {
throw new Error('Invalid file URL: file://./ would list the current directory. Use file://./<filename> to render a specific file.');
}
if (afterDoubleSlash === '~' || afterDoubleSlash === '~/') {
throw new Error('Invalid file URL: file://~/ would list the home directory. Use file://~/<filename> to render a specific file.');
}
// Home-relative: file://~/<rel>
if (afterDoubleSlash.startsWith('~/')) {
const rel = afterDoubleSlash.slice(2);
const absPath = path.join(os.homedir(), rel);
return pathToFileURL(absPath).href + trailing;
}
// cwd-relative with explicit ./ : file://./<rel>
if (afterDoubleSlash.startsWith('./')) {
const rel = afterDoubleSlash.slice(2);
const absPath = path.resolve(process.cwd(), rel);
return pathToFileURL(absPath).href + trailing;
}
// localhost host explicitly allowed: file://localhost/<abs> (pass through to standard parser).
if (afterDoubleSlash.toLowerCase().startsWith('localhost/')) {
return pathPart + trailing;
}
// Ambiguous: file://<segment>/<rest> — treat as cwd-relative ONLY if <segment> is a
// simple path name (no dots, no colons, no backslashes, no percent-encoding, no
// IPv6 brackets, no Windows drive letter pattern).
const firstSlash = afterDoubleSlash.indexOf('/');
const segment = firstSlash === -1 ? afterDoubleSlash : afterDoubleSlash.slice(0, firstSlash);
// Reject host-like segments: dotted names (docs.v1), IPs (127.0.0.1), IPv6 ([::1]),
// drive letters (C:), percent-encoded, or backslash paths.
const looksLikeHost = /[.:\\%]/.test(segment) || segment.startsWith('[');
if (looksLikeHost) {
throw new Error(
`Unsupported file URL host: ${segment}. Use file:///<absolute-path> for local files (network/UNC paths are not supported).`
);
}
// Simple-segment cwd-relative: file://docs/page.html → cwd/docs/page.html
const absPath = path.resolve(process.cwd(), afterDoubleSlash);
return pathToFileURL(absPath).href + trailing;
}
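Two pieces of the normalizer above are easy to check in isolation: splitting off the query/fragment at the first `?` or `#`, and the host-like heuristic for the first segment. The sketch below assumes nothing beyond those two rules; `splitTrailing` and `looksLikeHost` are illustrative names, not exports of this module.

```typescript
// Split a URL at the first '?' or '#', whichever comes first.
// Everything from the delimiter onward is kept verbatim as `trailing`.
function splitTrailing(url: string): { pathPart: string; trailing: string } {
  const idx = url.search(/[?#]/);
  return idx >= 0
    ? { pathPart: url.slice(0, idx), trailing: url.slice(idx) }
    : { pathPart: url, trailing: '' };
}

// Host-like segments: dotted names, IPs, IPv6 brackets, drive letters,
// percent-encoding, or backslashes. Simple names such as 'docs' are not.
function looksLikeHost(segment: string): boolean {
  return /[.:\\%]/.test(segment) || segment.startsWith('[');
}

console.log(splitTrailing('file://./app.html?route=home#login'));
console.log(splitTrailing('file:///tmp/a.html#x?y'));
console.log(looksLikeHost('docs'), looksLikeHost('docs.v1'), looksLikeHost('[::1]'));
```

In the second call the `?` sits inside the fragment, so the whole `#x?y` stays together as the trailing part.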
/**
* Validate a navigation URL and return a normalized version suitable for page.goto().
*
* Callers MUST use the return value: normalization of non-standard file:// forms
* only takes effect at the navigation site, not at the original URL.
*
* Callers (keep this list current, grep before removing):
* - write-commands.ts:goto
* - meta-commands.ts:diff (both URL args)
* - browser-manager.ts:newTab
* - browser-manager.ts:restoreState
*/
export async function validateNavigationUrl(url: string): Promise<string> {
// Normalize non-standard file:// shapes before the URL parser sees them.
let normalized = url;
if (url.toLowerCase().startsWith('file:')) {
normalized = normalizeFileUrl(url);
}
let parsed: URL;
try {
parsed = new URL(url);
parsed = new URL(normalized);
} catch {
throw new Error(`Invalid URL: ${url}`);
}
// file:// path: validate against safe-dirs and allow; otherwise defer to http(s) logic.
if (parsed.protocol === 'file:') {
// Reject non-empty non-localhost hosts (UNC / network paths).
if (parsed.host !== '' && parsed.host.toLowerCase() !== 'localhost') {
throw new Error(
`Unsupported file URL host: ${parsed.host}. Use file:///<absolute-path> for local files.`
);
}
// Convert URL → filesystem path with proper decoding (handles %20, %2F, etc.)
// fileURLToPath strips query + hash; we reattach them after validation so SPA
// fixture URLs like file:///tmp/app.html?route=home#login survive intact.
let fsPath: string;
try {
fsPath = fileURLToPath(parsed);
} catch (e: any) {
throw new Error(`Invalid file URL: ${url} (${e.message})`);
}
// Reject path traversal after decoding — e.g. file:///tmp/safe%2F..%2Fetc/passwd
// Note: fileURLToPath doesn't collapse .., so a literal '..' in the decoded path
// is suspicious. path.resolve will normalize it; check the result against safe dirs.
validateReadPath(fsPath);
// Return the canonical file:// URL derived from the filesystem path + original
// query + hash. This guarantees page.goto() gets a well-formed URL regardless
// of input shape while preserving SPA route/query params.
return pathToFileURL(fsPath).href + parsed.search + parsed.hash;
}
if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
throw new Error(
`Blocked: scheme "${parsed.protocol}" is not allowed. Only http: and https: URLs are permitted.`
`Blocked: scheme "${parsed.protocol}" is not allowed. Only http:, https:, and file: URLs are permitted.`
);
}
@@ -137,4 +294,6 @@ export async function validateNavigationUrl(url: string): Promise<void> {
`Blocked: ${parsed.hostname} resolves to a cloud metadata IP. Possible DNS rebinding attack.`
);
}
return url;
}
+153 -9
@@ -10,9 +10,10 @@ import type { BrowserManager } from './browser-manager';
import { findInstalledBrowsers, importCookies, importCookiesViaCdp, hasV20Cookies, listSupportedBrowserNames } from './cookie-import-browser';
import { generatePickerCode } from './cookie-picker-routes';
import { validateNavigationUrl } from './url-validation';
import { validateOutputPath } from './path-security';
import { validateOutputPath, validateReadPath } from './path-security';
import * as fs from 'fs';
import * as path from 'path';
import type { SetContentWaitUntil } from './tab-session';
import { TEMP_DIR, isPathWithin } from './platform';
import { SAFE_DIRECTORIES } from './path-security';
import { modifyStyle, undoModification, resetModifications, getModificationHistory } from './cdp-inspector';
@@ -142,30 +143,129 @@ export async function handleWriteCommand(
if (inFrame) throw new Error('Cannot use goto inside a frame. Run \'frame main\' first.');
const url = args[0];
if (!url) throw new Error('Usage: browse goto <url>');
await validateNavigationUrl(url);
const response = await page.goto(url, { waitUntil: 'domcontentloaded', timeout: 15000 });
// Clear loadedHtml BEFORE navigation — a timeout after the main-frame commit
// must not leave stale content that could resurrect on a later context recreation.
session.clearLoadedHtml();
const normalizedUrl = await validateNavigationUrl(url);
const response = await page.goto(normalizedUrl, { waitUntil: 'domcontentloaded', timeout: 15000 });
const status = response?.status() || 'unknown';
return `Navigated to ${url} (${status})`;
return `Navigated to ${normalizedUrl} (${status})`;
}
case 'back': {
if (inFrame) throw new Error('Cannot use back inside a frame. Run \'frame main\' first.');
session.clearLoadedHtml();
await page.goBack({ waitUntil: 'domcontentloaded', timeout: 15000 });
return `Back → ${page.url()}`;
}
case 'forward': {
if (inFrame) throw new Error('Cannot use forward inside a frame. Run \'frame main\' first.');
session.clearLoadedHtml();
await page.goForward({ waitUntil: 'domcontentloaded', timeout: 15000 });
return `Forward → ${page.url()}`;
}
case 'reload': {
if (inFrame) throw new Error('Cannot use reload inside a frame. Run \'frame main\' first.');
session.clearLoadedHtml();
await page.reload({ waitUntil: 'domcontentloaded', timeout: 15000 });
return `Reloaded ${page.url()}`;
}
case 'load-html': {
if (inFrame) throw new Error('Cannot use load-html inside a frame. Run \'frame main\' first.');
const filePath = args[0];
if (!filePath) throw new Error('Usage: browse load-html <file> [--wait-until load|domcontentloaded|networkidle]');
// Parse --wait-until flag
let waitUntil: SetContentWaitUntil = 'domcontentloaded';
for (let i = 1; i < args.length; i++) {
if (args[i] === '--wait-until') {
const val = args[++i];
if (val !== 'load' && val !== 'domcontentloaded' && val !== 'networkidle') {
throw new Error(`Invalid --wait-until '${val}'. Must be one of: load, domcontentloaded, networkidle.`);
}
waitUntil = val;
} else if (args[i].startsWith('--')) {
throw new Error(`Unknown flag: ${args[i]}`);
}
}
// Extension allowlist
const ALLOWED_EXT = ['.html', '.htm', '.xhtml', '.svg'];
const ext = path.extname(filePath).toLowerCase();
if (!ALLOWED_EXT.includes(ext)) {
throw new Error(
`load-html: file does not appear to be HTML. Expected .html/.htm/.xhtml/.svg, got ${ext || '(no extension)'}. Rename the file if it's really HTML.`
);
}
const absolutePath = path.resolve(filePath);
// Safe-dirs check (reuses canonical read-side policy)
try {
validateReadPath(absolutePath);
} catch (e: any) {
throw new Error(
`load-html: ${absolutePath} must be under ${SAFE_DIRECTORIES.join(' or ')} (security policy). Copy the file into the project tree or /tmp first.`
);
}
// stat check — reject non-file targets with actionable error
let stat: fs.Stats;
try {
stat = await fs.promises.stat(absolutePath);
} catch (e: any) {
if (e.code === 'ENOENT') {
throw new Error(
`load-html: file not found at ${absolutePath}. Check spelling or copy the file under ${process.cwd()} or ${TEMP_DIR}.`
);
}
throw e;
}
if (stat.isDirectory()) {
throw new Error(`load-html: ${absolutePath} is a directory, not a file. Pass a .html file.`);
}
if (!stat.isFile()) {
throw new Error(`load-html: ${absolutePath} is not a regular file.`);
}
// Size cap
const MAX_BYTES = parseInt(process.env.GSTACK_BROWSE_MAX_HTML_BYTES || '', 10) || (50 * 1024 * 1024);
if (stat.size > MAX_BYTES) {
throw new Error(
`load-html: file too large (${stat.size} bytes > ${MAX_BYTES} cap). Raise with GSTACK_BROWSE_MAX_HTML_BYTES=<N> or split the HTML.`
);
}
// Single read: Buffer → magic-byte peek → utf-8 string
const buf = await fs.promises.readFile(absolutePath);
// Magic-byte check: strip UTF-8 BOM + leading whitespace, then verify the first
// non-whitespace byte starts a markup construct. Accepts any <tag, <!doctype,
// <!-- comment, <?xml prolog — including bare HTML fragments like `<div>...</div>`
// which setContent wraps in a full document. Rejects binary files mis-renamed .html
// (first byte won't be `<`).
let peek = buf.slice(0, 200);
if (peek[0] === 0xEF && peek[1] === 0xBB && peek[2] === 0xBF) {
peek = peek.slice(3);
}
const peekStr = peek.toString('utf8').trimStart();
// Valid markup opener: '<' followed by alpha (tag), '!' (doctype/comment), or '?' (xml prolog)
const looksLikeMarkup = /^<[a-zA-Z!?]/.test(peekStr);
if (!looksLikeMarkup) {
const hexDump = Array.from(buf.slice(0, 16)).map(b => b.toString(16).padStart(2, '0')).join(' ');
throw new Error(
`load-html: ${absolutePath} has ${ext} extension but content does not look like HTML. First bytes: ${hexDump}`
);
}
const html = buf.toString('utf8');
await session.setTabContent(html, { waitUntil });
return `Loaded HTML: ${absolutePath} (${stat.size} bytes)`;
}
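The magic-byte check in the load-html case above can be sketched as a standalone predicate. This is an illustration under the same rules (strip a UTF-8 BOM, skip leading whitespace, require `<` followed by a letter, `!`, or `?`); `looksLikeMarkup` is a hypothetical helper name, and it uses `subarray`, which Node's documentation recommends over the deprecated `Buffer.slice`.

```typescript
import { Buffer } from 'node:buffer';

// True when the first non-whitespace bytes open a markup construct:
// a tag, a doctype/comment, or an XML prolog.
function looksLikeMarkup(buf: Buffer): boolean {
  let peek = buf.subarray(0, 200);
  // Skip a UTF-8 byte-order mark if present.
  if (peek[0] === 0xef && peek[1] === 0xbb && peek[2] === 0xbf) {
    peek = peek.subarray(3);
  }
  return /^<[a-zA-Z!?]/.test(peek.toString('utf8').trimStart());
}

const withBom = Buffer.concat([Buffer.from([0xef, 0xbb, 0xbf]), Buffer.from('  <div>hi</div>')]);
const pngHeader = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

console.log(looksLikeMarkup(Buffer.from('<!doctype html><p>x</p>'))); // true
console.log(looksLikeMarkup(withBom));                                // true
console.log(looksLikeMarkup(pngHeader));                              // false
```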
case 'click': {
const selector = args[0];
if (!selector) throw new Error('Usage: browse click <selector>');
@@ -343,11 +443,55 @@ export async function handleWriteCommand(
}
case 'viewport': {
const size = args[0];
if (!size || !size.includes('x')) throw new Error('Usage: browse viewport <WxH> (e.g., 375x812)');
const [rawW, rawH] = size.split('x').map(Number);
const w = Math.min(Math.max(Math.round(rawW) || 1280, 1), 16384);
const h = Math.min(Math.max(Math.round(rawH) || 720, 1), 16384);
// Parse args: [<WxH>] [--scale <n>]. Either may be omitted, but NOT both.
let sizeArg: string | undefined;
let scaleArg: number | undefined;
for (let i = 0; i < args.length; i++) {
if (args[i] === '--scale') {
const val = args[++i];
if (val === undefined || val === '') {
throw new Error('viewport --scale: missing value. Usage: viewport [WxH] --scale <n>');
}
const parsed = Number(val);
if (!Number.isFinite(parsed)) {
throw new Error(`viewport --scale: value '${val}' is not a finite number.`);
}
scaleArg = parsed;
} else if (args[i].startsWith('--')) {
throw new Error(`Unknown viewport flag: ${args[i]}`);
} else if (sizeArg === undefined) {
sizeArg = args[i];
} else {
throw new Error(`Unexpected positional arg: ${args[i]}. Usage: viewport [WxH] [--scale <n>]`);
}
}
if (sizeArg === undefined && scaleArg === undefined) {
throw new Error('Usage: browse viewport [<WxH>] [--scale <n>] (e.g. 375x812, or --scale 2 to keep current size)');
}
// Resolve width/height: either from sizeArg or from current viewport if --scale-only.
let w: number, h: number;
if (sizeArg) {
if (!sizeArg.includes('x')) throw new Error('Usage: browse viewport [<WxH>] [--scale <n>] (e.g., 375x812)');
const [rawW, rawH] = sizeArg.split('x').map(Number);
w = Math.min(Math.max(Math.round(rawW) || 1280, 1), 16384);
h = Math.min(Math.max(Math.round(rawH) || 720, 1), 16384);
} else {
// --scale without WxH → use BrowserManager's tracked viewport (source of truth
// since setViewport + launchContext keep it in sync). Falls back reliably on
// headed → headless transitions or contexts with viewport:null.
const current = bm.getCurrentViewport();
w = current.width;
h = current.height;
}
if (scaleArg !== undefined) {
const err = await bm.setDeviceScaleFactor(scaleArg, w, h);
if (err) return `Viewport partially set: ${err}`;
return `Viewport set to ${w}x${h} @ ${scaleArg}x (context recreated; refs and load-html content replayed)`;
}
await bm.setViewport(w, h);
return `Viewport set to ${w}x${h}`;
}
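The argument grammar for viewport above (`[<WxH>] [--scale <n>]`, at least one of the two required) can be sketched as a pure parser, which makes the rejection cases easy to test without a browser. `ViewportArgs` and `parseViewportArgs` are hypothetical names for illustration, and the error strings are shortened.

```typescript
interface ViewportArgs {
  size?: string;
  scale?: number;
}

// Parse [<WxH>] [--scale <n>]; either part may be omitted, but not both.
function parseViewportArgs(args: string[]): ViewportArgs {
  const out: ViewportArgs = {};
  for (let i = 0; i < args.length; i++) {
    const arg = args[i];
    if (arg === '--scale') {
      const val = args[++i];
      if (val === undefined || val === '') throw new Error('--scale: missing value');
      const n = Number(val);
      if (!Number.isFinite(n)) throw new Error(`--scale: '${val}' is not a finite number`);
      out.scale = n;
    } else if (arg.startsWith('--')) {
      throw new Error(`Unknown viewport flag: ${arg}`);
    } else if (out.size === undefined) {
      out.size = arg;
    } else {
      throw new Error(`Unexpected positional arg: ${arg}`);
    }
  }
  if (out.size === undefined && out.scale === undefined) {
    throw new Error('need <WxH>, --scale <n>, or both');
  }
  return out;
}

console.log(parseViewportArgs(['375x812', '--scale', '2'])); // { size: '375x812', scale: 2 }
console.log(parseViewportArgs(['--scale', '1.5']));          // { scale: 1.5 }
```

Note that `Number('Infinity')` parses successfully, which is why the check uses `Number.isFinite` rather than `Number.isNaN`.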