Files
gstack/browse/src/meta-commands.ts
T
Garry Tan 7ca04d8ef0 v1.42.0.0 Daegu wave: 23 community-filed bugs + PTY classifier enforcement (24 bisect commits) (#1594)
* fix(gstack-paths): guard CLAUDE_PLUGIN_DATA against cross-plugin contamination (#1569)

gstack-paths previously trusted CLAUDE_PLUGIN_DATA as a fallback for
GSTACK_STATE_ROOT whenever GSTACK_HOME was unset. When another plugin
(e.g. Codex) persists its own CLAUDE_PLUGIN_DATA into the session env
via CLAUDE_ENV_FILE, gstack picked it up and wrote checkpoints,
analytics, and learnings into that plugin's directory. Anyone with the
Codex plugin installed alongside gstack hit this silently.

Fix: guard the CLAUDE_PLUGIN_DATA branch so it only fires when
CLAUDE_PLUGIN_ROOT confirms we're running as the gstack plugin (path
contains "gstack"). Skill installs fall through to \$HOME/.gstack.

Contributed by @ElliotDrel via #1570. Closes #1569.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(gbrain-sync): sourceLocalPath handles wrapped {sources:[...]} shape from gbrain v0.20+

gbrain v0.20+ changed `gbrain sources list --json` to return
{sources: [...]} instead of a flat array. sourceLocalPath crashed
upstream with `list.find is not a function` on every /sync-gbrain
invocation against modern gbrain. Accept both shapes for
forward/backward compat, matching probeSource/sourcePageCount in
lib/gbrain-sources.ts.

Contributed by @jakehann11 via #1571. Closes #1567. Supersedes #1564
(@tonyjzhou, same fix, different shape — credit retained).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(brain-context-load): probe gbrain via execFile, not shell builtin (#1559)

gbrainAvailable() used `execFileSync("command", ["-v", "gbrain"])`,
which fails in any environment where the `command` builtin isn't on
the spawned process's PATH (most non-interactive shells). The probe
then reported gbrain as missing even when it was installed, and
context-load silently skipped vector/list queries.

Fix: probe `gbrain --version` directly with a 500ms timeout (matching
the rest of the file's MCP_TIMEOUT_MS). Same semantics, works
everywhere execFile works.

Contributed by @jbetala7 via #1560. Closes #1559.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(gbrain-doctor): pin schema_version:2 doctor parse path (#1418)

Adds an exec-path regression test that runs a fake gbrain shim emitting
the v0.25+ doctor JSON shape (schema_version: 2, status: "warnings",
exit 1 for health_score < 100, no top-level `engine` field). Confirms
freshDetectEngineTier recovers stdout from the non-zero exit and falls
back to GBRAIN_HOME/config.json for the engine label.

The pre-existing test for #1415 only stripped gbrain from PATH; this
test exercises the actual doctor parse path, closing the gap that
codex's plan review flagged.

Also documents the schema_version separation in
lib/gbrain-local-status.ts: the local CacheEntry stays at version 1,
distinct from the doctor-output schema_version which we accept across
versions in gstack-memory-helpers.

Closes #1418 (credit @mvanhorn for surfacing the doctor + schema_v2
collapse). The fix landed pre-emptively in v1.29.x; this commit pins
it with a stronger test.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(memory-ingest): pin put_page regression + scrub stale name from --help and comments (#1346)

#1346 reported that gstack-memory-ingest still called the renamed
gbrain put_page subcommand on gbrain v0.18+. The actual code migrated
to `gbrain put` and later to batch `gbrain import <dir>` before this
report landed — only documentation lag remained.

This commit:
- Updates the --help string ("Skip gbrain put calls (still updates
  state file)") so user-facing docs match the shipped subcommand
- Updates two inline comments that still referenced the old name
- Adds test/memory-ingest-no-put_page.test.ts: a regression pin that
  strips comments from bin/gstack-memory-ingest.ts and fails the build
  if "put_page" appears in any active code or string literal, plus a
  sanity check that the file still calls a supported gbrain page-write
  verb (put or import)

Closes #1346. Reporter @kylma-code surfaced the doc lag; the original
code migration credit is on the v1.27.x wave.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(resolvers): rewrite all gbrain put_page instructions to canonical put <slug>

scripts/resolvers/gbrain.ts emitted user-facing copy-paste instructions
using the renamed `gbrain put_page` subcommand across 10 skills
(office-hours, investigate, plan-ceo-review, retro, plan-eng-review,
ship, cso, design-consultation, fallback, entity-stub). Every gstack
user copying those snippets hit "unknown command: put_page" on gbrain
v0.18+.

This commit:
- Rewrites all 10 instruction templates to use `gbrain put <slug>
  --content "$(cat <<EOF...EOF)"` with title/tags moved into YAML
  frontmatter inside --content, matching the v0.18+ subcommand shape
- Updates README.md and USING_GBRAIN_WITH_GSTACK.md "common commands"
  table to reference `gbrain put` and `gbrain get`
- Adds test/resolvers-gbrain-put-rewrite.test.ts pinning two
  invariants: (a) resolver source ships only canonical instructions,
  (b) every tracked SKILL.md file is free of `gbrain put_page`

CHANGELOG entries are deliberately left untouched (historical record).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(build): extract package.json build to scripts/build.sh for Windows Bun compat (#1538, #1537, #1530, #1457, #1561)

Bun's Windows shell parser rejects multiple constructs the inline
package.json build chain used: brace groups `{ cmd; }`, subshells with
redirection `( git ... ) > path/.version`, and (in Bun 1.3.x) subshells
near redirections in general. Every Windows install + every
auto-upgrade since v1.34.2.0 has failed on `bun run build`.

Extracts the build chain to scripts/build.sh and the .version writes to
scripts/write-version-files.sh. POSIX-portable, no Bun shell parsing
involved. Also adds Windows-specific bun.exe handling for non-ASCII
PATHs (a separate Windows footgun where Bun's --compile fails when the
binary lives under a path with non-ASCII chars).

Updates test/build-script-shell-compat.test.ts to assert the new shape:
no subshells with redirections anywhere in the build chain, and build
delegates to scripts/build.sh which delegates .version writes.

Contributed by @Charlie-El via #1544. Supersedes #1531 (@scarson, fixed
in build helper), #1480 (@mikepsinn, partial overlap), #1460
(@realcarsonterry, brace-group fix subsumed) — credit retained.
Closes #1538, #1537, #1530, #1457, #1561.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(windows): .exe glob in .gitignore + .exe extension resolution in find-browse (#1554)

bun build --compile on Windows appends .exe to the output filename,
producing browse.exe instead of browse. find-browse's existsSync probe
only checked the bare path and returned null on Windows even when the
binary was correctly built. .gitignore similarly only excluded the
bare bin/gstack-global-discover path, leaving the .exe variant
tracked.

This commit:
- .gitignore: changes `bin/gstack-global-discover` →
  `bin/gstack-global-discover*` so the Windows .exe variant is ignored
- browse/src/find-browse.ts: adds isExecutable + findExecutable helpers
  that fall back to .exe/.cmd/.bat probing on Windows, mirroring the
  same helper already in make-pdf/src/browseClient.ts and pdftotext.ts

Contributed by @Mike-E-Log via #1554.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci(windows): add fresh-install E2E gate that runs bun run build on windows-latest

Adds .github/workflows/windows-setup-e2e.yml as the gate that catches
Bun shell-parser regressions in the build chain before they reach
users. Triggers on PRs touching package.json, scripts/build.sh,
scripts/write-version-files.sh, setup, browse cli/find-browse, or
gstack-paths.

What it verifies:
1. bun run build completes on Windows (the previously-broken path that
   #1538/#1537/#1530/#1457/#1561 reported)
2. All compiled binaries land on disk (browse.exe, find-browse.exe,
   design.exe, gstack-global-discover.exe)
3. find-browse resolves to the .exe variant on Windows (regression
   gate for #1554)
4. gstack-paths returns non-empty GSTACK_STATE_ROOT/PLAN_ROOT/TMP_ROOT
   on Windows (regression gate for #1570)

Complements the existing windows-free-tests.yml (curated unit subset);
this new workflow exercises the install path itself.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(codex): move diff scope into prompt instead of --base (Codex CLI 0.130+ argv conflict) (#1209)

Codex CLI ≥ 0.130.0 rejects passing a custom prompt and --base together
(mutually exclusive at argv level). Every /codex review, /review, and
/ship structured Codex review call ended with an argv error before the
model ran.

Fix: scope the diff in prompt text using
"Run git diff origin/<base>...HEAD 2>/dev/null || git diff <base>...HEAD"
instead of `--base <base>`. Preserves the filesystem boundary
instruction across all invocations and keeps Codex's review prompt
tuning.

Touches:
- codex/SKILL.md.tmpl + regenerated codex/SKILL.md
- scripts/resolvers/review.ts + regenerated review/SKILL.md, ship/SKILL.md
- test/gen-skill-docs.test.ts: new regression that fails if any of the
  five known files still contain the prompt+--base shape
- test/skill-validation.test.ts: corresponding negative + positive pin
  on the rendered SKILL.md files

Contributed by @jbetala7 via #1209. Closes #1479. Supersedes #1527
(@mvanhorn — same intent, different patch shape, CONFLICTING) and
#1449 (@Gujiassh — broader refactor, CONFLICTING). Credit retained
in CHANGELOG.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(review): diff from git merge-base, not git diff origin/<base> (#1492)

git diff origin/<base> shows everything since the common ancestor in
both directions — it includes commits that landed on origin/<base>
after this branch was created as deletions. That made /review and
/ship's pre-landing structured review report inflated diff totals and
flagged "removed" code that was actually still present in the working
tree.

Fix: compute DIFF_BASE via git merge-base origin/<base> HEAD and diff
the working tree against that point. Same coverage of uncommitted
edits, no phantom deletions from out-of-order base advancement.

Applies to /review's Step 1 (diff existence check), Step 3 (get the
diff), the build-on-intent scope-creep check, the structured review
DIFF_INS/DIFF_DEL stats, and the Claude adversarial subagent prompt.
Same change flows into ship/SKILL.md via the shared resolver.

Touches:
- review/SKILL.md.tmpl + regenerated review/SKILL.md, ship/SKILL.md
- scripts/resolvers/review.ts
- scripts/resolvers/review-army.ts

Contributed by @mvanhorn via #1492.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(codex): pin filesystem-boundary preservation across all codex review surfaces (#1503, #1522)

#1503 reported that the bare codex review --base path stripped the
filesystem boundary instruction, letting Codex spend tokens reading
.claude/skills/ and agents/. #1522 proposed adding a skill-path
detector that switched to the custom-instructions route when the diff
touched skill files.

After C10 (#1209) restructured codex review to always carry the
boundary in the prompt (the prompt+--base argv conflict forced the
restructure), the skill-path detector becomes redundant — every
default call already preserves the boundary.

This commit pins the post-#1209 invariant with a test that fails the
build if any future refactor strips the boundary from codex/SKILL.md,
review/SKILL.md, or ship/SKILL.md. Closes #1503 by regression test.

#1522 (@genisis0x) is superseded by #1209 (the prompt rewrite covers
its safety concern); credit retained in CHANGELOG.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(skills): use command -v instead of which for codex detection (#1197)

`which` is not on PATH in every shell — some Windows shells, BusyBox-
only containers, and minimal CI images all fail when skills probe
codex availability via `which codex`. `command -v` is a POSIX builtin
and always available where the skill is running.

Touched:
- codex/SKILL.md.tmpl: CODEX_BIN=$(command -v codex || echo "")
- scripts/resolvers/review.ts and scripts/resolvers/design.ts:
  3 + 3 sites each rewritten to `command -v codex >/dev/null 2>&1`
- Regenerated all 10 affected SKILL.md files (codex, review, ship,
  design-consultation, design-review, office-hours, plan-ceo-review,
  plan-design-review, plan-devex-review, plan-eng-review)
- test/skill-validation.test.ts: updated pin + defensive regression
  test that fails if `which codex` returns to codex/SKILL.md
- test/skill-e2e-plan.test.ts: updated summary regex

Contributed by @mvanhorn via #1197.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(codex): surface non-zero exits so wrappers stop reading as silent stalls (#1467, #1327)

When codex exits non-zero (parse errors, arg-shape breaks, model API
errors that propagate as non-zero status), the calling agent
previously saw an empty output and burned 30-60 minutes misdiagnosing
as a silent model/API stall. The hang-detection block only caught
exit 124 (the timeout-wrapper signal).

Adds elif blocks in all four codex invocation sites (Review default,
Challenge, Consult new-session, Consult resume) that:
- Echo "[codex exit N] <stderr first line>" to stdout
- Indent the first 20 stderr lines for inline context
- Log codex_nonzero_exit telemetry tagged with the call site

Contributed by @genisis0x via #1467. Closes #1327.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(design): disclose OpenAI key source + warn on cwd .env match (#1278, closes #1248)

The design binary previously called process.env.OPENAI_API_KEY without
checking where the key came from. If a user ran $D inside someone
else's project that had OPENAI_API_KEY in its .env, the resulting
generation billed that project's account. Silent and irreversible.

Fix: resolveApiKeyInfo() returns both the key and its source. When the
env-var path matches an OPENAI_API_KEY entry in the current
directory's .env, .env.<NODE_ENV>, or .env.local file, we set a
warning. requireApiKey() prints "Using OpenAI key from <source>" plus
the warning before the run — never the key itself.

Adds 6 unit tests covering: config-vs-env precedence, env-only (no
match), env+cwd .env match, quoted/exported values, value-mismatch
(no false positive), and the no-leak invariant for requireApiKey
stderr output.

Contributed by @jbetala7 via #1278. Closes #1248.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(browse): guard full-page screenshots against Anthropic vision API >2000px brick (#1214)

Full-page screenshots of tall pages routinely exceeded 2000px on the
longest dimension, silently bricking the agent's session: the
resulting base64 reached the Anthropic vision API which rejected the
oversized image, leaving the agent burning turns on a useless blob
with no stderr trace from the browse side.

Adds browse/src/screenshot-size-guard.ts as a shared helper:
- guardScreenshotBuffer(buf) → downscales in-memory if max(w,h) > 2000
- guardScreenshotPath(path) → file-mode variant that rewrites in place
- Aspect ratio preserved via sharp's resize fit:inside
- Stderr diagnostic on any downscale so callers can see when it fired
- Lazy sharp import so non-screenshot paths pay no startup cost

Wires the guard into all three full-page callsites codex review
flagged:
- browse/src/snapshot.ts: annotated + heatmap fullPage captures
- browse/src/meta-commands.ts: screenshot command (path + base64
  fullPage modes) plus the responsive 3-viewport sweep
- browse/src/write-commands.ts: prettyscreenshot fullPage path

Covers seven unit cases (pass-through, downscale, aspect ratio,
exactly-2000px edge, file-mode rewrite) plus a static invariant test
that fails the build if any of the three callsites stops importing the
guard.

Closes #1214.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(security): add Node sidecar entry for L4 prompt-injection classifier (#1370)

The L4 TestSavant classifier in browse/src/security-classifier.ts
can't be imported into the compiled browse server (onnxruntime-node
dlopen fails from Bun's compile extract dir per CLAUDE.md). The agent
that used to host it (sidebar-agent.ts) was removed when the PTY
proved out — leaving the classifier file shipped but with zero
callers. Exactly the gap codex flagged in #1370.

Adds browse/src/security-sidecar-entry.ts: a Node script that runs the
classifier as a subprocess of the browse server. It reads NDJSON
requests from stdin and writes id-correlated NDJSON responses to
stdout, supporting:
  - op: "scan-page-content" — full L4 classifier scan
  - op: "ping" — liveness probe for the client's health check
  - op: "status" — classifier readiness (used by /pty-inject-scan to
    surface l4 { available: bool } in its response)

Plus browse/src/find-security-sidecar.ts: a resolver that locates
node + the bundled JS entry (browse/dist/security-sidecar.js, built in
a follow-up package.json change) or falls back to the dev TS entry.
Returns null cleanly when node isn't on PATH so the calling endpoint
can degrade per D7 (extension WARN + user confirm).

C17 of the security-stack wave. C18 adds the IPC client + lifecycle
management; C19 wires the endpoint; C20 routes the extension through it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(security): sidecar IPC client with lifecycle + circuit breaker (#1370)

Adds browse/src/security-sidecar-client.ts to manage the Node L4
classifier subprocess from the compiled browse server:

- Lazy spawn on first scan; reuses the same process across requests
- Id-correlated request/response via NDJSON over stdio
- 5s default per-scan timeout; 64KB payload cap (short-circuits before
  spawn so oversized requests don't waste a process)
- 3-in-10-minutes respawn cap → trips circuit breaker; subsequent
  scans throw immediately so the /pty-inject-scan endpoint can surface
  l4 { available: false } to the extension and degrade to WARN+confirm
- process.on('exit') sends SIGTERM to the child for clean teardown
- isSidecarAvailable() lets the endpoint probe before scan calls so
  the response shape reflects degraded mode honestly

Unit tests cover the payload cap, the availability probe, and the
breaker-doesn't-crash invariant under repeated rejected calls.

C18 of the security-stack wave. C19 adds POST /pty-inject-scan; C20
routes the extension through it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(security): add POST /pty-inject-scan endpoint for pre-PTY-inject scans (#1370)

The sidebar's gstackInjectToTerminal callers (toolbar Cleanup,
Inspector "Send to Code") were piping page-derived text directly into
the live claude PTY with ZERO classifier processing — the gap codex
flagged in #1370. The documented sidebar security stack had a hole
the size of every Cleanup-button click.

Adds POST /pty-inject-scan to browse/src/server.ts:
- Local-only binding (NOT in TUNNEL_PATHS — tunnel attempts get the
  general 404 path; never reaches the scan logic)
- Root-token auth via existing validateAuth() — 401 on unauth
- 64KB request cap → 413 + payload-too-large body
- 5s scan timeout via sidecar client
- URL-blocklist forced to BLOCK in PTY context (page-derived REPL
  input is higher-risk than ordinary tool output)
- L4 ML classifier via the sidecar when available; degrades to WARN
  per D7 when sidecar is unavailable
- Response goes through JSON.stringify(..., sanitizeReplacer) per
  v1.38.0.0 Unicode-egress hardening
- Imports only from security-sidecar-client.ts, never directly from
  security-classifier.ts (which would brick the compiled Bun binary)

Seven static-invariant tests pin the POST verb, auth gate, 64KB cap,
tunnel-listener exclusion, sanitizeReplacer wrapping, l4 availability
shape, and the no-direct-classifier-import rule.

C19 of the security-stack wave. C20 routes the extension through it;
C21 adds the invariant AST check.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(extension): route gstackInjectToTerminal through /pty-inject-scan (#1370)

Closes the documented-vs-shipped gap codex flagged in #1370. The
sidebar's two PTY-injection call sites (Inspector "Send to Code" and
toolbar Cleanup) now pre-scan via the new /pty-inject-scan endpoint
before writing to the live claude REPL.

Adds window.gstackScanForPTYInject(text, origin) to
extension/sidepanel-terminal.js:
- Async, returns { allow, verdict, reasons, l4 }
- POST to /pty-inject-scan with the existing root-token auth
- WARN+confirm on scan failure (network down, sidecar absent, etc.)
  rather than silent PASS — D7 honest-degradation

gstackInjectToTerminal stays synchronous, returns boolean. Per D6:
keeping the inject sync means existing `const ok = ...?.()` callers
don't break, and the invariant test in
test/extension-pty-inject-invariant.test.ts can statically pin that
every call goes through the scan first.

extension/sidepanel.js call sites updated:
- inspectorSendBtn click → await scan, BLOCK drops + WARN prompts via
  window.confirm, PASS injects silently
- runCleanup() → same flow. Static cleanup prompt always PASSes but
  still routes through scan to honor the invariant.

C20 of the security-stack wave. C21 adds the static invariant test.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(security): invariant — extension PTY inject must be scan-gated (#1370)

Static-analysis invariant test that fails the build if any
extension/*.js path calls window.gstackInjectToTerminal without a
preceding window.gstackScanForPTYInject in the same enclosing
function. Closes the documented-vs-shipped gap codex demanded a
machine check on.

Rules:
- Rule 1: any file that calls inject must also reference scan
- Rule 2: in the enclosing function (function declaration, arrow,
  async (), event handler), a scan call must appear before the inject
  call by source position
- Exemption: sidepanel-terminal.js (the file that DEFINES the inject
  function) is exempt from Rule 2 since the definition is not a call

Plus two structural checks:
- sidepanel-terminal.js defines both the inject and scan functions
- inject stays SYNCHRONOUS (no `async` modifier) per D6 — async would
  silently break the `const ok = ...?.()` pattern at every caller

C21 of the security-stack wave. The sidecar architecture (#1370) is
complete: server-side L1-L3 + L4-via-sidecar (C17+C18+C19), extension
pre-scan wiring (C20), and now the regression gate (C21).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(browse): opt-in extended stealth mode with 6 detection-vector patches (#1112)

Rebases @garrytan's PR #1112 (Apr 2026, abandoned) onto the current
browse/src/stealth.ts contract. The existing minimal "codex narrowed"
stealth (webdriver-mask + AutomationControlled launch arg) stays the
default. PR #1112's six additional patches are added behind an opt-in
GSTACK_STEALTH=extended env flag.

Extended-mode patches (applied AFTER the default mask, in order):
  1. delete navigator.webdriver from prototype (not just the getter —
     detectors check `"webdriver" in navigator`)
  2. WebGL renderer spoof to Apple M1 Pro (SwiftShader was the #1
     software-GPU tell in containers)
  3. navigator.plugins returns a PluginArray-prototype-passing array
     with MimeType objects and namedItem()
  4. window.chrome populated with chrome.app, chrome.runtime,
     chrome.loadTimes(), chrome.csi() with realistic shapes
  5. navigator.mediaDevices backfilled when headless drops it
  6. CDP cdc_*-prefixed window globals cleared

Why opt-in: the default mode's contract is fingerprint CONSISTENCY,
which protects against detectors that flag spoofing mismatch. Extended
mode actively lies about the environment; sites that reflect on these
properties can break. Users who hit detection in default mode can flip
GSTACK_STEALTH=extended for SannySoft 100% pass-rate.

Twenty unit tests pin the env-flag semantics, all six patches' code
presence, and the applyStealth wiring order. Live SannySoft pass-rate
verification stays in the periodic-tier E2E suite.

Contributed by @garrytan via #1112 (rebased — original PR opened
before the codex-narrowed minimum landed; rebase preserves the
narrowed default while adding the SannySoft-passing path as opt-in).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(fixtures): regenerate ship-SKILL.md golden baselines after C10-C13 + C16 templates

Updates the three ship-SKILL.md golden baselines (claude, codex,
factory hosts) to match the new shape produced by:
- C10 #1209 codex argv (prompt + diff scope, no --base)
- C11 #1492 merge-base diff (DIFF_BASE= preamble)
- C13 #1197 command -v for codex detection
- C12 + boundary preservation per regen-enforcing test

Per CLAUDE.md SKILL.md workflow: edit the .tmpl, run gen:skill-docs,
commit the regenerated outputs together. Goldens are part of the
regen contract — without this commit, test/host-config.test.ts'
golden-baseline checks fail with the diff codex review surfaced.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(release): v1.41.0.0 — Daegu wave (24 bisect commits, 14 user-facing fixes)

Bumps VERSION 1.40.0.0 → 1.41.0.0. CHANGELOG entry follows the
release-summary format in CLAUDE.md: two-line headline, lead
paragraph, "The numbers that matter" table, "What this means for
builders" closer, then itemized Added/Changed/Fixed/For contributors
with inline credit to every PR author and original issue reporter.

Scale-aware bump per CLAUDE.md: 24 commits, ~6000 LOC net,
substantial new capability across security (PTY sidecar wiring),
install (Windows build chain), compat (gbrain 0.18-0.35, Codex CLI
0.130+), and quality (screenshot guard, design key disclosure,
extended stealth opt-in). MINOR is the right call.

Closes for users: #1567, #1559, #1569, #1346, #1418, #1538, #1537,
#1530, #1457, #1561, #1554, #1479, #1503, #1248, #1214, #1370, #1327,
#1193 pattern, #1152 pattern. Credit retained inline.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(find-browse): resolve source-checkout layout <git-root>/browse/dist/browse[.exe]

windows-setup-e2e.yml runs `bun browse/src/find-browse.ts` against a
freshly-built repo where binaries land at browse/dist/browse.exe (no
.claude/skills/gstack/ install layout). The previous markers chain
only matched .codex/.agents/.claude prefixed paths, so find-browse
exited "not found" even when the binary was present.

Adds a source-checkout fallback after the marker scan: if no
installed layout resolves but <git-root>/browse/dist/browse[.exe]
exists, return that. Three real callers hit this path:
- gstack repo dev workflow before `./setup` runs
- windows-setup-e2e.yml CI (the breakage that surfaced this)
- make-pdf consumers running from a sibling source checkout

Smoke-verified: a fresh git repo with browse/dist/browse on disk now
resolves through the source-checkout branch (was returning null
before this commit).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(release): bump v1.41.0.0 → v1.42.0.0 to clear queue collision with #1574

The version-gate workflow flagged a collision: PR #1574
(garrytan/colombo-v3) already claims v1.41.0.0, and #1592
(fix/audit-critical-high-bugs) claims v1.41.1.0. Per CLAUDE.md's
workspace-aware ship rule, queue-advancing past a claimed version
within the same bump level is permitted — MINOR work landing on top
of a queued MINOR still reads as MINOR relative to main.

Util's suggested next slot is v1.42.0.0; taking it. CHANGELOG entry
header bumped + dated 2026-05-19; entry body unchanged (same wave
content, same credit list).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 07:35:01 -07:00

1168 lines
48 KiB
TypeScript

/**
* Meta commands — tabs, server control, screenshots, chain, diff, snapshot
*/
import type { BrowserManager } from './browser-manager';
import { handleSnapshot } from './snapshot';
import { getCleanText } from './read-commands';
import { READ_COMMANDS, WRITE_COMMANDS, META_COMMANDS, PAGE_CONTENT_COMMANDS, wrapUntrustedContent, canonicalizeCommand } from './commands';
import { handleDomainSkillCommand } from './domain-skill-commands';
import { handleSkillCommand } from './browser-skill-commands';
import { validateNavigationUrl } from './url-validation';
import { checkScope, type TokenInfo } from './token-registry';
import { validateOutputPath, validateReadPath, SAFE_DIRECTORIES, escapeRegExp } from './path-security';
import { guardScreenshotBuffer, guardScreenshotPath } from './screenshot-size-guard';
// Re-export for backward compatibility (tests import from meta-commands)
export { validateOutputPath, escapeRegExp } from './path-security';
import * as Diff from 'diff';
import * as fs from 'fs';
import * as path from 'path';
import { writeSecureFile, mkdirSecure } from './file-permissions';
import { TEMP_DIR } from './platform';
import { resolveConfig } from './config';
import type { Frame } from 'playwright';
/** Tokenize a pipe segment respecting double-quoted strings. */
function tokenizePipeSegment(segment: string): string[] {
const tokens: string[] = [];
let current = '';
let inQuote = false;
for (let i = 0; i < segment.length; i++) {
const ch = segment[i];
if (ch === '"') {
inQuote = !inQuote;
} else if (ch === ' ' && !inQuote) {
if (current) { tokens.push(current); current = ''; }
} else {
current += ch;
}
}
if (current) tokens.push(current);
return tokens;
}
// ─── PDF flag parsing (make-pdf contract) ─────────────────────────────
//
// The $B pdf command grew from a 2-line wrapper (format: 'A4') into a real
// PDF engine frontend. make-pdf/dist/pdf shells out to `browse pdf` with
// this flag set, so the contract here has to be stable.
//
// Mutex rules enforced:
// --format vs --width/--height
// --margins vs any --margin-*
// --page-numbers vs --footer-template (page-numbers writes the footer itself)
//
// Units for dimensions: "1in" | "72pt" | "25mm" | "2.54cm". Bare numbers
// are interpreted as pixels (Playwright's default), which is almost never
// what callers want — we warn but don't reject.
//
// Large payloads: header/footer HTML and custom CSS can exceed Windows'
// 8191-char CreateProcess cap via argv. Callers pass `--from-file <path>`
// to a JSON file holding the full options. make-pdf always uses this path.
interface ParsedPdfArgs {
output: string;
format?: string;
width?: string;
height?: string;
marginTop?: string;
marginRight?: string;
marginBottom?: string;
marginLeft?: string;
headerTemplate?: string;
footerTemplate?: string;
pageNumbers?: boolean;
tagged?: boolean;
outline?: boolean;
printBackground?: boolean;
preferCSSPageSize?: boolean;
toc?: boolean;
}
function parsePdfArgs(args: string[]): ParsedPdfArgs {
// --from-file short-circuits argv parsing entirely
for (let i = 0; i < args.length; i++) {
if (args[i] === '--from-file') {
const payloadPath = args[++i];
if (!payloadPath) throw new Error('pdf: --from-file requires a path');
return parsePdfFromFile(payloadPath);
}
}
const result: ParsedPdfArgs = {
output: `${TEMP_DIR}/browse-page.pdf`,
};
let margins: string | undefined;
const positional: string[] = [];
for (let i = 0; i < args.length; i++) {
const a = args[i];
if (a === '--format') { result.format = requireValue(args, ++i, 'format'); }
else if (a === '--page-size') { result.format = requireValue(args, ++i, 'page-size'); }
else if (a === '--width') { result.width = requireValue(args, ++i, 'width'); }
else if (a === '--height') { result.height = requireValue(args, ++i, 'height'); }
else if (a === '--margins') { margins = requireValue(args, ++i, 'margins'); }
else if (a === '--margin-top') { result.marginTop = requireValue(args, ++i, 'margin-top'); }
else if (a === '--margin-right') { result.marginRight = requireValue(args, ++i, 'margin-right'); }
else if (a === '--margin-bottom') { result.marginBottom = requireValue(args, ++i, 'margin-bottom'); }
else if (a === '--margin-left') { result.marginLeft = requireValue(args, ++i, 'margin-left'); }
else if (a === '--header-template') { result.headerTemplate = requireValue(args, ++i, 'header-template'); }
else if (a === '--footer-template') { result.footerTemplate = requireValue(args, ++i, 'footer-template'); }
else if (a === '--page-numbers') { result.pageNumbers = true; }
else if (a === '--tagged') { result.tagged = true; }
else if (a === '--outline') { result.outline = true; }
else if (a === '--print-background') { result.printBackground = true; }
else if (a === '--prefer-css-page-size') { result.preferCSSPageSize = true; }
else if (a === '--toc') { result.toc = true; }
else if (a.startsWith('--')) { throw new Error(`Unknown pdf flag: ${a}`); }
else { positional.push(a); }
}
if (positional.length > 0) result.output = positional[0];
if (margins !== undefined) {
if (result.marginTop || result.marginRight || result.marginBottom || result.marginLeft) {
throw new Error('pdf: --margins is mutex with --margin-top/--margin-right/--margin-bottom/--margin-left');
}
result.marginTop = result.marginRight = result.marginBottom = result.marginLeft = margins;
}
if (result.format && (result.width || result.height)) {
throw new Error('pdf: --format is mutex with --width/--height');
}
if (result.pageNumbers && result.footerTemplate) {
throw new Error('pdf: --page-numbers is mutex with --footer-template (page-numbers writes the footer itself)');
}
return result;
}
export function parsePdfFromFile(payloadPath: string): ParsedPdfArgs {
// Parity with load-html --from-file (browse/src/write-commands.ts) and
// the direct load-html <file> path: every caller-supplied file path
// must pass validateReadPath so the safe-dirs policy can't be skirted
// by routing reads through the --from-file shortcut.
try {
validateReadPath(path.resolve(payloadPath));
} catch {
throw new Error(
`pdf: --from-file ${payloadPath} must be under ${SAFE_DIRECTORIES.join(' or ')} (security policy). Copy the payload into the project tree or /tmp first.`
);
}
const raw = fs.readFileSync(payloadPath, 'utf8');
let json: any;
try {
json = JSON.parse(raw);
} catch (err) {
const msg = err instanceof Error ? err.message : String(err);
throw new Error(`pdf: --from-file ${payloadPath} is not valid JSON (${msg}).`);
}
if (json === null || typeof json !== 'object' || Array.isArray(json)) {
throw new Error(`pdf: --from-file ${payloadPath} must be a JSON object, got ${Array.isArray(json) ? 'array' : typeof json}.`);
}
const out: ParsedPdfArgs = {
output: json.output || `${TEMP_DIR}/browse-page.pdf`,
format: json.format,
width: json.width,
height: json.height,
marginTop: json.marginTop,
marginRight: json.marginRight,
marginBottom: json.marginBottom,
marginLeft: json.marginLeft,
headerTemplate: json.headerTemplate,
footerTemplate: json.footerTemplate,
pageNumbers: json.pageNumbers === true,
tagged: json.tagged === true,
outline: json.outline === true,
printBackground: json.printBackground === true,
preferCSSPageSize: json.preferCSSPageSize === true,
toc: json.toc === true,
};
return out;
}
function requireValue(args: string[], i: number, flag: string): string {
const v = args[i];
if (v === undefined || v.startsWith('--')) {
throw new Error(`pdf: --${flag} requires a value`);
}
return v;
}
function buildPdfOptions(parsed: ParsedPdfArgs): Record<string, unknown> {
const opts: Record<string, unknown> = {};
// Page size
if (parsed.format) {
opts.format = parsed.format.charAt(0).toUpperCase() + parsed.format.slice(1).toLowerCase();
} else if (parsed.width && parsed.height) {
opts.width = parsed.width;
opts.height = parsed.height;
} else {
opts.format = 'Letter';
}
// Margins
const margin: Record<string, string> = {};
if (parsed.marginTop) margin.top = parsed.marginTop;
if (parsed.marginRight) margin.right = parsed.marginRight;
if (parsed.marginBottom) margin.bottom = parsed.marginBottom;
if (parsed.marginLeft) margin.left = parsed.marginLeft;
if (Object.keys(margin).length > 0) opts.margin = margin;
// Header/footer
const displayHeaderFooter =
!!parsed.headerTemplate || !!parsed.footerTemplate || parsed.pageNumbers === true;
if (displayHeaderFooter) {
opts.displayHeaderFooter = true;
// Provide minimum empty templates when only one is set, otherwise Chromium
// emits its default ugly URL/date in the other slot.
if (parsed.headerTemplate !== undefined) opts.headerTemplate = parsed.headerTemplate;
else if (parsed.pageNumbers || parsed.footerTemplate) opts.headerTemplate = '<div></div>';
if (parsed.pageNumbers) {
opts.footerTemplate = [
'<div style="font-size:9pt; font-family:Helvetica,Arial,sans-serif; color:#666; ',
'width:100%; text-align:center;">',
'<span class="pageNumber"></span> of <span class="totalPages"></span>',
'</div>',
].join('');
} else if (parsed.footerTemplate !== undefined) {
opts.footerTemplate = parsed.footerTemplate;
} else {
opts.footerTemplate = '<div></div>';
}
}
if (parsed.tagged === true) opts.tagged = true;
if (parsed.outline === true) opts.outline = true;
if (parsed.printBackground === true) opts.printBackground = true;
if (parsed.preferCSSPageSize === true) opts.preferCSSPageSize = true;
return opts;
}
/** Options passed from handleCommandInternal for chain routing */
export interface MetaCommandOpts {
chainDepth?: number;
/** Callback to route subcommands through the full security pipeline (handleCommandInternal) */
executeCommand?: (body: { command: string; args?: string[]; tabId?: number }, tokenInfo?: TokenInfo | null) => Promise<{ status: number; result: string; json?: boolean }>;
/** The port the daemon is listening on (needed by `$B skill run` to point spawned scripts at the daemon). */
daemonPort?: number;
}
export async function handleMetaCommand(
command: string,
args: string[],
bm: BrowserManager,
shutdown: () => Promise<void> | void,
tokenInfo?: TokenInfo | null,
opts?: MetaCommandOpts,
): Promise<string> {
// Per-tab operations use the active session; global operations use bm directly
const session = bm.getActiveSession();
switch (command) {
// ─── Tabs ──────────────────────────────────────────
case 'tabs': {
const tabs = await bm.getTabListWithTitles();
return tabs.map(t =>
`${t.active ? '→ ' : ' '}[${t.id}] ${t.title || '(untitled)'}${t.url}`
).join('\n');
}
case 'tab': {
const id = parseInt(args[0], 10);
if (isNaN(id)) throw new Error('Usage: browse tab <id>');
bm.switchTab(id);
return `Switched to tab ${id}`;
}
case 'newtab': {
// --json returns structured output (machine-parseable). Other flag-like
// tokens are treated as the url. make-pdf always passes --json.
let url: string | undefined;
let jsonMode = false;
for (const a of args) {
if (a === '--json') { jsonMode = true; }
else if (!url) { url = a; }
}
const id = await bm.newTab(url);
if (jsonMode) {
return JSON.stringify({ tabId: id, url: url ?? null });
}
return `Opened tab ${id}${url ? `${url}` : ''}`;
}
case 'closetab': {
const id = args[0] ? parseInt(args[0], 10) : undefined;
await bm.closeTab(id);
return `Closed tab${id ? ` ${id}` : ''}`;
}
case 'tab-each': {
// Fan out a single command across every open tab. Returns a JSON
// object: { results: [{tabId, url, title, status, output}], total }.
// Restores the originally active tab when done so the user's view
// doesn't shift under them.
//
// Usage: $B tab-each <command> [args...]
// $B tab-each snapshot -i → snapshot every tab
// $B tab-each text → grab clean text from every tab
// $B tab-each goto https://x.y → load the same URL in every tab
if (args.length === 0) {
throw new Error(
'Usage: browse tab-each <command> [args...]\n' +
'Example: browse tab-each snapshot -i'
);
}
const innerRaw = args[0];
const innerName = canonicalizeCommand(innerRaw);
const innerArgs = args.slice(1);
// Scope check the inner command before fanning out, so a single
// permission failure aborts the whole batch instead of partially
// mutating tabs.
if (tokenInfo && tokenInfo.clientId !== 'root' && !checkScope(tokenInfo, innerName)) {
throw new Error(
`tab-each rejected: subcommand "${innerRaw}" not allowed by your token scope (${tokenInfo.scopes.join(', ')}).`
);
}
const tabs = await bm.getTabListWithTitles();
const originalActive = tabs.find(t => t.active)?.id ?? bm.getActiveTabId();
const executeCmd = opts?.executeCommand;
const results: Array<{
tabId: number;
url: string;
title: string;
status: number;
output: string;
}> = [];
try {
for (const tab of tabs) {
// Skip chrome:// internal pages — they aren't useful targets and
// many commands fail outright on them.
if (tab.url.startsWith('chrome://') || tab.url.startsWith('chrome-extension://')) {
results.push({
tabId: tab.id,
url: tab.url,
title: tab.title || '',
status: 0,
output: 'skipped: internal page',
});
continue;
}
// Switch to the tab. Don't pull focus away — we're a background
// operation; the user shouldn't see the OS window jump.
bm.switchTab(tab.id, { bringToFront: false });
let status = 0;
let output = '';
if (executeCmd) {
const r = await executeCmd(
{ command: innerName, args: innerArgs, tabId: tab.id },
tokenInfo,
);
status = r.status;
output = r.result;
if (status !== 200) {
try { output = JSON.parse(output).error || output; } catch (err: any) { if (!(err instanceof SyntaxError)) throw err; }
}
} else {
// Fallback path (CLI / test harness without a server context).
// We don't recurse through read/write/meta directly here because
// tab-each is only meaningful with the live server; surface a
// clear error.
status = 500;
output = 'tab-each requires the browse server (no executeCommand context)';
}
results.push({
tabId: tab.id,
url: tab.url,
title: tab.title || '',
status,
output,
});
}
} finally {
// Restore the original active tab so the user's view is unchanged.
try { bm.switchTab(originalActive, { bringToFront: false }); } catch {}
}
return JSON.stringify({
command: innerName,
args: innerArgs,
total: results.length,
results,
}, null, 2);
}
// ─── Server Control ────────────────────────────────
case 'status': {
const page = bm.getPage();
const tabs = bm.getTabCount();
const mode = bm.getConnectionMode();
return [
`Status: healthy`,
`Mode: ${mode}`,
`URL: ${page.url()}`,
`Tabs: ${tabs}`,
`PID: ${process.pid}`,
].join('\n');
}
case 'url': {
return bm.getCurrentUrl();
}
case 'stop': {
await shutdown();
return 'Server stopped';
}
case 'restart': {
// Signal that we want a restart — the CLI will detect exit and restart
console.log('[browse] Restart requested. Exiting for CLI to restart.');
await shutdown();
return 'Restarting...';
}
// ─── Visual ────────────────────────────────────────
case 'screenshot': {
// Parse priority: flags (--viewport, --clip, --base64) → selector (@ref, CSS) → output path
const page = bm.getPage();
let outputPath = `${TEMP_DIR}/browse-screenshot.png`;
let clipRect: { x: number; y: number; width: number; height: number } | undefined;
let targetSelector: string | undefined;
let viewportOnly = false;
let base64Mode = false;
const remaining: string[] = [];
let flagSelector: string | undefined;
for (let i = 0; i < args.length; i++) {
if (args[i] === '--viewport') {
viewportOnly = true;
} else if (args[i] === '--base64') {
base64Mode = true;
} else if (args[i] === '--selector') {
flagSelector = args[++i];
if (!flagSelector) throw new Error('Usage: screenshot --selector <css> [path]');
} else if (args[i] === '--clip') {
const coords = args[++i];
if (!coords) throw new Error('Usage: screenshot --clip x,y,w,h [path]');
const parts = coords.split(',').map(Number);
if (parts.length !== 4 || parts.some(isNaN))
throw new Error('Usage: screenshot --clip x,y,width,height — all must be numbers');
clipRect = { x: parts[0], y: parts[1], width: parts[2], height: parts[3] };
} else if (args[i].startsWith('--')) {
throw new Error(`Unknown screenshot flag: ${args[i]}`);
} else {
remaining.push(args[i]);
}
}
// Separate target (selector/@ref) from output path
for (const arg of remaining) {
// File paths containing / and ending with an image/pdf extension are never CSS selectors
const isFilePath = arg.includes('/') && /\.(png|jpe?g|webp|pdf)$/i.test(arg);
if (isFilePath) {
outputPath = arg;
} else if (arg.startsWith('@e') || arg.startsWith('@c') || arg.startsWith('.') || arg.startsWith('#') || arg.includes('[')) {
targetSelector = arg;
} else {
outputPath = arg;
}
}
// --selector flag takes precedence; conflict with positional selector.
if (flagSelector !== undefined) {
if (targetSelector !== undefined) {
throw new Error('--selector conflicts with positional selector — choose one');
}
targetSelector = flagSelector;
}
validateOutputPath(outputPath);
if (clipRect && targetSelector) {
throw new Error('Cannot use --clip with a selector/ref — choose one');
}
if (viewportOnly && clipRect) {
throw new Error('Cannot use --viewport with --clip — choose one');
}
// --base64 mode: capture to buffer instead of disk
if (base64Mode) {
let buffer: Buffer;
if (targetSelector) {
const resolved = await bm.resolveRef(targetSelector);
const locator = 'locator' in resolved ? resolved.locator : page.locator(resolved.selector);
buffer = await locator.screenshot({ timeout: 5000 });
} else if (clipRect) {
buffer = await page.screenshot({ clip: clipRect });
} else {
buffer = await page.screenshot({ fullPage: !viewportOnly });
// Guard the most common API-bricking case (fullPage). Element /
// clip captures usually stay within the cap; we still guard the
// path-mode below for fullPage writes.
({ buffer } = await guardScreenshotBuffer(buffer));
}
if (buffer.length > 10 * 1024 * 1024) {
throw new Error('Screenshot too large for --base64 (>10MB). Use disk path instead.');
}
return `data:image/png;base64,${buffer.toString('base64')}`;
}
if (targetSelector) {
const resolved = await bm.resolveRef(targetSelector);
const locator = 'locator' in resolved ? resolved.locator : page.locator(resolved.selector);
await locator.screenshot({ path: outputPath, timeout: 5000 });
return `Screenshot saved (element): ${outputPath}`;
}
if (clipRect) {
await page.screenshot({ path: outputPath, clip: clipRect });
return `Screenshot saved (clip ${clipRect.x},${clipRect.y},${clipRect.width},${clipRect.height}): ${outputPath}`;
}
await page.screenshot({ path: outputPath, fullPage: !viewportOnly });
if (!viewportOnly) await guardScreenshotPath(outputPath);
return `Screenshot saved${viewportOnly ? ' (viewport)' : ''}: ${outputPath}`;
}
case 'pdf': {
const page = bm.getPage();
const parsed = parsePdfArgs(args);
validateOutputPath(parsed.output);
// If --toc: wait up to 3s for Paged.js to signal by setting
// window.__pagedjsAfterFired = true. If the polyfill isn't injected
// (make-pdf v1 ships without Paged.js; TOC renders without page
// numbers), we fall through silently — callers that require strict
// TOC pagination should pass --require-paged-js too.
if (parsed.toc) {
const deadline = Date.now() + 3000;
let ready = false;
while (Date.now() < deadline) {
try {
ready = await page.evaluate('!!window.__pagedjsAfterFired');
} catch { /* tab may still be hydrating */ }
if (ready) break;
await new Promise(r => setTimeout(r, 150));
}
// Intentionally non-fatal. Paged.js is optional in v1.
}
const opts = buildPdfOptions(parsed);
opts.path = parsed.output;
await page.pdf(opts);
return `PDF saved: ${parsed.output}`;
}
case 'responsive': {
const page = bm.getPage();
const prefix = args[0] || `${TEMP_DIR}/browse-responsive`;
validateOutputPath(prefix);
const viewports = [
{ name: 'mobile', width: 375, height: 812 },
{ name: 'tablet', width: 768, height: 1024 },
{ name: 'desktop', width: 1280, height: 720 },
];
const originalViewport = page.viewportSize();
const results: string[] = [];
for (const vp of viewports) {
await page.setViewportSize({ width: vp.width, height: vp.height });
const screenshotPath = `${prefix}-${vp.name}.png`;
validateOutputPath(screenshotPath);
await page.screenshot({ path: screenshotPath, fullPage: true });
await guardScreenshotPath(screenshotPath);
results.push(`${vp.name} (${vp.width}x${vp.height}): ${screenshotPath}`);
}
// Restore original viewport
if (originalViewport) {
await page.setViewportSize(originalViewport);
}
return results.join('\n');
}
// ─── Chain ─────────────────────────────────────────
case 'chain': {
// Read JSON array from args[0] (if provided) or expect it was passed as body
const jsonStr = args[0];
if (!jsonStr) throw new Error(
'Usage: echo \'[["goto","url"],["text"]]\' | browse chain\n' +
' or: browse chain \'goto url | click @e5 | snapshot -ic\''
);
let rawCommands: string[][];
try {
rawCommands = JSON.parse(jsonStr);
if (!Array.isArray(rawCommands)) throw new Error('not array');
} catch (err: any) {
// Fallback: pipe-delimited format "goto url | click @e5 | snapshot -ic"
if (!(err instanceof SyntaxError) && err?.message !== 'not array') throw err;
rawCommands = jsonStr.split(' | ')
.filter(seg => seg.trim().length > 0)
.map(seg => tokenizePipeSegment(seg.trim()));
}
// Canonicalize aliases across the whole chain. Pair canonical name with the raw
// input so result labels + error messages reflect what the user typed, but every
// dispatch path (scope check, WRITE_COMMANDS.has, watch blocking, handler lookup)
// uses the canonical name. Otherwise `chain '[["setcontent","/tmp/x.html"]]'`
// bypasses prevalidation or runs under the wrong command set.
const commands = rawCommands.map(cmd => {
const [rawName, ...cmdArgs] = cmd;
const name = canonicalizeCommand(rawName);
return { rawName, name, args: cmdArgs };
});
// Pre-validate ALL subcommands against the token's scope before executing any.
// Uses canonical name so aliases don't bypass scope checks.
if (tokenInfo && tokenInfo.clientId !== 'root') {
for (const c of commands) {
if (!checkScope(tokenInfo, c.name)) {
throw new Error(
`Chain rejected: subcommand "${c.rawName}" not allowed by your token scope (${tokenInfo.scopes.join(', ')}). ` +
`All subcommands must be within scope.`
);
}
}
}
// Route each subcommand through handleCommandInternal for full security:
// scope, domain, tab ownership, content wrapping — all enforced per subcommand.
// Chain-specific options: skip rate check (chain = 1 request), skip activity
// events (chain emits 1 event), increment chain depth (recursion guard).
const executeCmd = opts?.executeCommand;
const results: string[] = [];
let lastWasWrite = false;
if (executeCmd) {
// Full security pipeline via handleCommandInternal.
// Pass rawName so the server's own canonicalization is a no-op (already canonical).
for (const c of commands) {
const cr = await executeCmd(
{ command: c.name, args: c.args },
tokenInfo,
);
const label = c.rawName === c.name ? c.name : `${c.rawName}${c.name}`;
if (cr.status === 200) {
results.push(`[${label}] ${cr.result}`);
} else {
// Parse error from JSON result
let errMsg = cr.result;
try { errMsg = JSON.parse(cr.result).error || cr.result; } catch (err: any) { if (!(err instanceof SyntaxError)) throw err; }
results.push(`[${label}] ERROR: ${errMsg}`);
}
lastWasWrite = WRITE_COMMANDS.has(c.name);
}
} else {
// Fallback: direct dispatch (CLI mode, no server context)
const { handleReadCommand } = await import('./read-commands');
const { handleWriteCommand } = await import('./write-commands');
for (const c of commands) {
const name = c.name;
const cmdArgs = c.args;
const label = c.rawName === name ? name : `${c.rawName}${name}`;
try {
let result: string;
if (WRITE_COMMANDS.has(name)) {
if (bm.isWatching()) {
result = 'BLOCKED: write commands disabled in watch mode';
} else {
result = await handleWriteCommand(name, cmdArgs, session, bm);
}
lastWasWrite = true;
} else if (READ_COMMANDS.has(name)) {
result = await handleReadCommand(name, cmdArgs, session);
if (PAGE_CONTENT_COMMANDS.has(name)) {
result = wrapUntrustedContent(result, bm.getCurrentUrl());
}
lastWasWrite = false;
} else if (META_COMMANDS.has(name)) {
result = await handleMetaCommand(name, cmdArgs, bm, shutdown, tokenInfo, opts);
lastWasWrite = false;
} else {
throw new Error(`Unknown command: ${c.rawName}`);
}
results.push(`[${label}] ${result}`);
} catch (err: any) {
results.push(`[${label}] ERROR: ${err.message}`);
}
}
}
// Wait for network to settle after write commands before returning
if (lastWasWrite) {
await bm.getPage().waitForLoadState('networkidle', { timeout: 2000 }).catch(() => {});
}
return results.join('\n\n');
}
// ─── Diff ──────────────────────────────────────────
case 'diff': {
const [url1, url2] = args;
if (!url1 || !url2) throw new Error('Usage: browse diff <url1> <url2>');
const page = bm.getPage();
const normalizedUrl1 = await validateNavigationUrl(url1);
await page.goto(normalizedUrl1, { waitUntil: 'domcontentloaded', timeout: 15000 });
const text1 = await getCleanText(page);
const normalizedUrl2 = await validateNavigationUrl(url2);
await page.goto(normalizedUrl2, { waitUntil: 'domcontentloaded', timeout: 15000 });
const text2 = await getCleanText(page);
const changes = Diff.diffLines(text1, text2);
const output: string[] = [`--- ${url1}`, `+++ ${url2}`, ''];
for (const part of changes) {
const prefix = part.added ? '+' : part.removed ? '-' : ' ';
const lines = part.value.split('\n').filter(l => l.length > 0);
for (const line of lines) {
output.push(`${prefix} ${line}`);
}
}
return wrapUntrustedContent(output.join('\n'), `diff: ${url1} vs ${url2}`);
}
// ─── Snapshot ─────────────────────────────────────
case 'snapshot': {
const isScoped = tokenInfo && tokenInfo.clientId !== 'root';
const snapshotResult = await handleSnapshot(args, session, {
splitForScoped: !!isScoped,
});
// Scoped tokens get split format (refs outside envelope); root gets basic wrapping
if (isScoped) {
return snapshotResult; // already has envelope from split format
}
return wrapUntrustedContent(snapshotResult, bm.getCurrentUrl());
}
// ─── Handoff ────────────────────────────────────
case 'handoff': {
const message = args.join(' ') || 'User takeover requested';
return await bm.handoff(message);
}
case 'resume': {
bm.resume();
// Re-snapshot to capture current page state after human interaction
const isScoped2 = tokenInfo && tokenInfo.clientId !== 'root';
const snapshot = await handleSnapshot(['-i'], session, { splitForScoped: !!isScoped2 });
if (isScoped2) {
return `RESUMED\n${snapshot}`;
}
return `RESUMED\n${wrapUntrustedContent(snapshot, bm.getCurrentUrl())}`;
}
// ─── Headed Mode ──────────────────────────────────────
case 'connect': {
// connect is handled as a pre-server command in cli.ts
// If we get here, server is already running — tell the user
if (bm.getConnectionMode() === 'headed') {
return 'Already in headed mode with extension.';
}
return 'The connect command must be run from the CLI (not sent to a running server). Run: $B connect';
}
case 'disconnect': {
if (bm.getConnectionMode() !== 'headed') {
return 'Not in headed mode — nothing to disconnect.';
}
// Signal that we want a restart in headless mode
console.log('[browse] Disconnecting headed browser. Restarting in headless mode.');
await shutdown();
return 'Disconnected. Server will restart in headless mode on next command.';
}
case 'focus': {
if (bm.getConnectionMode() !== 'headed') {
return 'focus requires headed mode. Run `$B connect` first.';
}
try {
const { execSync } = await import('child_process');
// Try common Chromium-based browser app names to bring to foreground
const appNames = ['Comet', 'Google Chrome', 'Arc', 'Brave Browser', 'Microsoft Edge'];
let activated = false;
for (const appName of appNames) {
try {
execSync(`osascript -e 'tell application "${appName}" to activate'`, { stdio: 'pipe', timeout: 3000 });
activated = true;
break;
} catch (err: any) {
// Try next browser — osascript fails if app not found or AppleScript errors
if (err?.status === undefined && !err?.message?.includes('Command failed')) throw err;
}
}
if (!activated) {
return 'Could not bring browser to foreground. macOS only.';
}
// If a ref was passed, scroll it into view
if (args.length > 0 && args[0].startsWith('@')) {
try {
const resolved = await bm.resolveRef(args[0]);
if ('locator' in resolved) {
await resolved.locator.scrollIntoViewIfNeeded({ timeout: 5000 });
return `Browser activated. Scrolled ${args[0]} into view.`;
}
} catch (err: any) {
// Ref not found or element gone — still activated the browser
if (!err?.message?.includes('not found') && !err?.message?.includes('closed') && !err?.message?.includes('Target') && !err?.message?.includes('timeout')) throw err;
}
}
return 'Browser window activated.';
} catch (err: any) {
return `focus failed: ${err.message}. macOS only.`;
}
}
// ─── Watch ──────────────────────────────────────────
case 'watch': {
if (args[0] === 'stop') {
if (!bm.isWatching()) return 'Not currently watching.';
const result = bm.stopWatch();
const durationSec = Math.round(result.duration / 1000);
const lastSnapshot = result.snapshots.length > 0
? wrapUntrustedContent(result.snapshots[result.snapshots.length - 1], bm.getCurrentUrl())
: '(none)';
return [
`WATCH STOPPED (${durationSec}s, ${result.snapshots.length} snapshots)`,
'',
'Last snapshot:',
lastSnapshot,
].join('\n');
}
if (bm.isWatching()) return 'Already watching. Run `$B watch stop` to stop.';
if (bm.getConnectionMode() !== 'headed') {
return 'watch requires headed mode. Run `$B connect` first.';
}
bm.startWatch();
return 'WATCHING — observing user browsing. Periodic snapshots every 5s.\nRun `$B watch stop` to stop and get summary.';
}
// ─── Inbox ──────────────────────────────────────────
case 'inbox': {
const { execSync } = await import('child_process');
let gitRoot: string;
try {
gitRoot = execSync('git rev-parse --show-toplevel', { encoding: 'utf-8', stdio: ['pipe', 'pipe', 'pipe'] }).trim();
} catch (err: any) {
// execSync throws with exit status on non-git directories
if (err?.status === undefined && !err?.message?.includes('Command failed')) throw err;
return 'Not in a git repository — cannot locate inbox.';
}
const inboxDir = path.join(gitRoot, '.context', 'sidebar-inbox');
if (!fs.existsSync(inboxDir)) return 'Inbox empty.';
const files = fs.readdirSync(inboxDir)
.filter(f => f.endsWith('.json') && !f.startsWith('.'))
.sort()
.reverse(); // newest first
if (files.length === 0) return 'Inbox empty.';
const messages: { timestamp: string; url: string; userMessage: string }[] = [];
for (const file of files) {
try {
const data = JSON.parse(fs.readFileSync(path.join(inboxDir, file), 'utf-8'));
messages.push({
timestamp: data.timestamp || '',
url: data.page?.url || 'unknown',
userMessage: data.userMessage || '',
});
} catch (err: any) {
// Skip malformed JSON or unreadable files
if (!(err instanceof SyntaxError) && err?.code !== 'ENOENT' && err?.code !== 'EACCES') throw err;
}
}
if (messages.length === 0) return 'Inbox empty.';
const lines: string[] = [];
lines.push(`SIDEBAR INBOX (${messages.length} message${messages.length === 1 ? '' : 's'})`);
lines.push('────────────────────────────────');
for (const msg of messages) {
const ts = msg.timestamp ? `[${msg.timestamp}]` : '[unknown]';
lines.push(`${ts} ${wrapUntrustedContent(msg.url, 'inbox-url')}`);
lines.push(` "${wrapUntrustedContent(msg.userMessage, 'inbox-message')}"`);
lines.push('');
}
lines.push('────────────────────────────────');
// Handle --clear flag
if (args.includes('--clear')) {
for (const file of files) {
try { fs.unlinkSync(path.join(inboxDir, file)); } catch (err: any) { if (err?.code !== 'ENOENT') throw err; }
}
lines.push(`Cleared ${files.length} message${files.length === 1 ? '' : 's'}.`);
}
return lines.join('\n');
}
// ─── State ────────────────────────────────────────
case 'state': {
const [action, name] = args;
if (!action || !name) throw new Error('Usage: state save|load <name>');
// Sanitize name: alphanumeric + hyphens + underscores only
if (!/^[a-zA-Z0-9_-]+$/.test(name)) {
throw new Error('State name must be alphanumeric (a-z, 0-9, _, -)');
}
const config = resolveConfig();
const stateDir = path.join(config.stateDir, 'browse-states');
mkdirSecure(stateDir);
const statePath = path.join(stateDir, `${name}.json`);
if (action === 'save') {
const state = await bm.saveState();
// V1: cookies + URLs only (not localStorage — breaks on load-before-navigate)
const saveData = {
version: 1,
savedAt: new Date().toISOString(),
cookies: state.cookies,
pages: state.pages.map(p => ({ url: p.url, isActive: p.isActive })),
};
writeSecureFile(statePath, JSON.stringify(saveData, null, 2));
return `State saved: ${statePath} (${state.cookies.length} cookies, ${state.pages.length} pages)\n⚠️ Cookies stored in plaintext. Delete when no longer needed.`;
}
if (action === 'load') {
if (!fs.existsSync(statePath)) throw new Error(`State not found: ${statePath}`);
const data = JSON.parse(fs.readFileSync(statePath, 'utf-8'));
if (!Array.isArray(data.cookies) || !Array.isArray(data.pages)) {
throw new Error('Invalid state file: expected cookies and pages arrays');
}
// Validate and filter cookies — reject malformed or internal-network cookies
const validatedCookies = data.cookies.filter((c: any) => {
if (typeof c !== 'object' || !c) return false;
if (typeof c.name !== 'string' || typeof c.value !== 'string') return false;
if (typeof c.domain !== 'string' || !c.domain) return false;
const d = c.domain.startsWith('.') ? c.domain.slice(1) : c.domain;
if (d === 'localhost' || d.endsWith('.internal') || d === '169.254.169.254') return false;
return true;
});
if (validatedCookies.length < data.cookies.length) {
console.warn(`[browse] Filtered ${data.cookies.length - validatedCookies.length} invalid cookies from state file`);
}
// Warn on state files older than 7 days
if (data.savedAt) {
const ageMs = Date.now() - new Date(data.savedAt).getTime();
const SEVEN_DAYS = 7 * 24 * 60 * 60 * 1000;
if (ageMs > SEVEN_DAYS) {
console.warn(`[browse] Warning: State file is ${Math.round(ageMs / 86400000)} days old. Consider re-saving.`);
}
}
// Close existing pages, then restore (replace, not merge)
bm.setFrame(null);
await bm.closeAllPages();
// Allowlist disk-loaded page fields — NEVER accept loadedHtml, loadedHtmlWaitUntil,
// or owner from disk. Those are in-memory-only invariants; allowing them would let
// a tampered state file smuggle HTML past load-html's safe-dirs + magic-byte + size
// checks, or forge tab ownership for cross-agent authorization bypass.
await bm.restoreState({
cookies: validatedCookies,
pages: data.pages.map((p: any) => ({
url: typeof p.url === 'string' ? p.url : '',
isActive: Boolean(p.isActive),
storage: null,
})),
});
return `State loaded: ${data.cookies.length} cookies, ${data.pages.length} pages`;
}
throw new Error('Usage: state save|load <name>');
}
// ─── Frame ───────────────────────────────────────
case 'frame': {
const target = args[0];
if (!target) throw new Error('Usage: frame <selector|@ref|--name name|--url pattern|main>');
if (target === 'main') {
bm.setFrame(null);
bm.clearRefs();
return 'Switched to main frame';
}
const page = bm.getPage();
let frame: Frame | null = null;
if (target === '--name') {
if (!args[1]) throw new Error('Usage: frame --name <name>');
frame = page.frame({ name: args[1] });
} else if (target === '--url') {
if (!args[1]) throw new Error('Usage: frame --url <pattern>');
frame = page.frame({ url: new RegExp(escapeRegExp(args[1])) });
} else {
// CSS selector or @ref for the iframe element
const resolved = await bm.resolveRef(target);
const locator = 'locator' in resolved ? resolved.locator : page.locator(resolved.selector);
const elementHandle = await locator.elementHandle({ timeout: 5000 });
frame = await elementHandle?.contentFrame() ?? null;
await elementHandle?.dispose();
}
if (!frame) throw new Error(`Frame not found: ${target}`);
bm.setFrame(frame);
bm.clearRefs();
return `Switched to frame: ${frame.url()}`;
}
// ─── UX Audit ─────────────────────────────────────
case 'ux-audit': {
const page = bm.getPage();
// Extract page structure for UX behavioral analysis
// Agent interprets the data and applies Krug's 6 usability tests
// Uses textContent (not innerText) to avoid layout computation on large DOMs
const data = await page.evaluate(() => {
const HEADING_CAP = 50;
const INTERACTIVE_CAP = 200;
const TEXT_BLOCK_CAP = 50;
// Site ID: logo or brand element
const logoEl = document.querySelector('[class*="logo"], [id*="logo"], header img, [aria-label*="home"], a[href="/"]');
const siteId = logoEl ? {
found: true,
text: (logoEl.textContent || '').trim().slice(0, 100),
tag: logoEl.tagName,
alt: (logoEl as HTMLImageElement).alt || null,
} : { found: false, text: null, tag: null, alt: null };
// Page name: main heading
const h1 = document.querySelector('h1');
const pageName = h1 ? {
found: true,
text: h1.textContent?.trim().slice(0, 200) || '',
} : { found: false, text: null };
// Navigation: primary nav elements
const navEls = document.querySelectorAll('nav, [role="navigation"]');
const navItems: Array<{ text: string; links: number }> = [];
navEls.forEach((nav, i) => {
if (i >= 5) return;
const links = nav.querySelectorAll('a');
navItems.push({
text: (nav.getAttribute('aria-label') || `nav-${i}`).slice(0, 50),
links: links.length,
});
});
// "You are here" indicator: current/active nav items
// Scoped to nav containers to avoid false positives from animation classes
const activeNavItems = document.querySelectorAll('nav [aria-current], nav .active, nav .current, [role="navigation"] [aria-current], [role="navigation"] .active, [role="navigation"] .current');
const youAreHere = Array.from(activeNavItems).slice(0, 5).map(el => ({
text: (el.textContent || '').trim().slice(0, 50),
tag: el.tagName,
}));
// Search: search box presence
const searchEl = document.querySelector('input[type="search"], [role="search"], input[name*="search"], input[placeholder*="search" i], input[aria-label*="search" i]');
const search = { found: !!searchEl };
// Breadcrumbs
const breadcrumbEl = document.querySelector('[aria-label*="breadcrumb" i], .breadcrumb, .breadcrumbs, [class*="breadcrumb"]');
const breadcrumbs = breadcrumbEl ? {
found: true,
items: Array.from(breadcrumbEl.querySelectorAll('a, span, li')).slice(0, 10).map(el => (el.textContent || '').trim().slice(0, 30)),
} : { found: false, items: [] };
// Headings: heading hierarchy
const headings = Array.from(document.querySelectorAll('h1,h2,h3,h4,h5,h6')).slice(0, HEADING_CAP).map(h => ({
tag: h.tagName,
text: (h.textContent || '').trim().slice(0, 80),
size: getComputedStyle(h).fontSize,
}));
// Interactive elements: buttons, links, inputs
const interactiveEls = Array.from(document.querySelectorAll('a, button, input, select, textarea, [role="button"], [tabindex]')).slice(0, INTERACTIVE_CAP);
const interactive = interactiveEls.map(el => {
const rect = el.getBoundingClientRect();
return {
tag: el.tagName,
text: (el.textContent || (el as HTMLInputElement).placeholder || '').trim().slice(0, 50),
type: (el as HTMLInputElement).type || null,
role: el.getAttribute('role'),
w: Math.round(rect.width),
h: Math.round(rect.height),
visible: rect.width > 0 && rect.height > 0,
};
}).filter(el => el.visible);
// Text blocks: paragraphs and large text areas
const textBlocks = Array.from(document.querySelectorAll('p, [class*="description"], [class*="intro"], [class*="welcome"], [class*="hero"] p, main p')).slice(0, TEXT_BLOCK_CAP).map(el => ({
text: (el.textContent || '').trim().slice(0, 200),
wordCount: (el.textContent || '').trim().split(/\s+/).filter(Boolean).length,
}));
// Total visible text word count (textContent avoids layout computation)
const bodyText = (document.body?.textContent || '').trim();
const totalWords = bodyText.split(/\s+/).filter(Boolean).length;
return {
url: window.location.href,
title: document.title,
siteId,
pageName,
navigation: navItems,
youAreHere,
search,
breadcrumbs,
headings,
interactive,
textBlocks,
totalWords,
};
});
return JSON.stringify(data, null, 2);
}
case 'domain-skill': {
return await handleDomainSkillCommand(args, bm);
}
case 'skill': {
const port = opts?.daemonPort;
if (port === undefined) {
throw new Error('skill command requires daemonPort in MetaCommandOpts (server bug)');
}
return await handleSkillCommand(args, { port });
}
case 'cdp': {
// Lazy import — cdp-bridge introduces module deps we don't want loaded
// for projects that never use the CDP escape hatch.
const { handleCdpCommand } = await import('./cdp-commands');
return await handleCdpCommand(args, bm);
}
default:
throw new Error(`Unknown meta command: ${command}`);
}
}