Files
gstack/browse/src/browser-manager.ts
T
Garry Tan 00f966b3ec v1.30.0.0 fix wave: 21 community PRs + Windows CI extension + codex flag-semantics smoke (#1391)
* fix(codex): use resume-compatible flags

* fix: V-001 security vulnerability

Automated security fix generated by Orbis Security AI

* docs: align prompt-injection thresholds to security.ts (v1.6.4.0 catch-up)

CLAUDE.md:290 and ARCHITECTURE.md:159 were missed when WARN was bumped
0.60 → 0.75 in d75402bb (v1.6.4.0, "cut Haiku classifier FP from 44% to
23%, gate now enforced", #1135). browse/src/security.ts:37 has WARN: 0.75
and BROWSER.md:743 was updated alongside that commit; CLAUDE.md and
ARCHITECTURE.md still read 0.60.

Also adds the SOLO_CONTENT_BLOCK: 0.92 entry to CLAUDE.md (already in
security.ts:50 and BROWSER.md:745, missing from CLAUDE.md's threshold
table).

No code change. No behavior change. Pure doc-vs-code alignment.

Verification:
  $ grep -n "WARN" browse/src/security.ts CLAUDE.md ARCHITECTURE.md BROWSER.md
  browse/src/security.ts:37:  WARN: 0.75,
  CLAUDE.md:290: - \`WARN: 0.75\` ...
  ARCHITECTURE.md:159: ...>= \`WARN\` (0.75)...
  BROWSER.md:743: - \`WARN: 0.75\` ...

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: Korean/CJK IME input and rendering in Sidebar Terminal

Fixes #1272

This commit addresses three separate Korean/CJK bugs in the Sidebar Terminal:

**Bug 1 - IME Input**: Korean text typed via IME composition was not
reaching the PTY correctly. Added compositionstart/compositionend event
listeners to suppress partial jamo fragments and only send the final
composed string.

**Bug 2a - Font Rendering**: Added CJK monospace font fallbacks
("Noto Sans Mono CJK KR", "Malgun Gothic") to both the xterm.js
fontFamily config and the CSS --font-mono variable. This ensures
consistent cell-width calculations for Korean characters.

**Bug 2b - UTF-8 Boundary Detection**: Added buffering logic to prevent
multi-byte UTF-8 characters (Korean is 3 bytes) from being split across
WebSocket chunks. This follows the same pattern as PR #1007 which fixed
the sidebar-agent path, but extends it to the terminal-agent path.

Special thanks to @ldybob for the excellent root cause analysis and
proposed solutions in issue #1272.

Tested on WSL2 + Windows 11 with Korean IME.

* fix(ship): tighten Plan Completion gate (VAS-449 remediation)

VAS-446 shipped with a PLAN.md acceptance criterion (domain-hq has
/docs/dashboard.md) silently skipped. /ship's Plan Completion subagent
existed at ship time (added in v1.4.1.0) but the gate let the failure
through. Four structural fixes:

1. Path concreteness rule: items naming a concrete filesystem path MUST
   be classified DONE/NOT DONE via [ -f <path> ], never UNVERIFIABLE.
2. Validator detection: CONTENT-SHAPE items scan target repo's
   package.json for validate-* scripts and run them before falling back
   to UNVERIFIABLE.
3. Per-item UNVERIFIABLE confirmation: replaces blanket "I've checked
   each one" with per-item Y/N/D loop. The blanket-confirm path is the
   exact failure VAS-449 surfaced.
4. Subagent fail-closed: if Plan Completion subagent + inline fallback
   both fail, surface explicit AskUserQuestion instead of silent pass.
   Replaces the prior "Never block /ship on subagent failure" fail-open.

Locked in by test/ship-plan-completion-invariants.test.ts (5 assertions,
no LLM dependency, ~60ms).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(browse): bash.exe wrap for telemetry on Windows

reportAttemptTelemetry() in browse/src/security.ts calls spawn(bin, args)
where bin is the gstack-telemetry-log bash script. On Windows this fails
silently with ENOENT — CreateProcess can't dispatch on shebang lines.

Adopts v1.24.0.0's Bun.which + GSTACK_*_BIN override pattern (from
browse/src/claude-bin.ts:resolveClaudeCommand, introduced in #1252) for
resolving bash.exe. resolveBashBinary() honors GSTACK_BASH_BIN absolute-path
or PATH-resolvable override, falling back to Bun.which('bash') which finds
Git Bash on the standard Windows install.

buildTelemetrySpawnCommand() wraps the script invocation on win32 only;
POSIX path is bit-identical. Returns null when bash can't be resolved on
Windows so caller skips spawn — local attempts.jsonl audit trail keeps
working without surfacing a Windows-only failure.

8 new unit tests cover resolveBashBinary (POSIX bash, absolute override,
quote-stripping, BASH_BIN fallback, empty-PATH null) and buildTelemetrySpawnCommand
(POSIX pass-through, win32 bash wrap, win32 null on unresolvable, arg-array
immutability).

POSIX path is bit-identical — Bun.which('bash') on Linux/macOS returns the
same /bin/bash or /usr/bin/bash that the old hardcoded spawn relied on.

* fix(make-pdf): Bun.which-based binary resolution for browse + pdftotext on Windows

Extends v1.24.0.0's Bun.which + GSTACK_*_BIN override pattern (introduced in
browse/src/claude-bin.ts via #1252) to the two other binary resolvers in the
codebase: make-pdf/src/browseClient.ts:resolveBrowseBin and
make-pdf/src/pdftotext.ts:resolvePdftotext.

Same Windows quirks (fs.accessSync(X_OK) degrades to existence-check; `which`
isn't available outside Git Bash; bun --compile --outfile X emits X.exe), same
Bun.which-based fix shape, same env override convention.

Changes:
  - GSTACK_BROWSE_BIN / GSTACK_PDFTOTEXT_BIN as the v1.24-aligned overrides;
    BROWSE_BIN / PDFTOTEXT_BIN remain as back-compat aliases.
  - Bun.which() replaces execFileSync('which', ...) for PATH lookup. Handles
    Windows PATHEXT natively; no more `where`-vs-`which` branch.
  - findExecutable(base) helper exported from each module, probes .exe/.cmd/.bat
    after the bare-path miss on win32. Linux/macOS behavior is bit-identical
    (isExecutable short-circuits before the win32 branch ever runs).
  - macCandidates renamed posixCandidates (always was — /opt/homebrew, /usr/local,
    /usr/bin). No Windows candidates added; Poppler installs scatter across
    Scoop/Chocolatey/portable zips and guessing causes false positives.
  - Error messages get a Windows install hint (scoop install poppler / oschwartz10612)
    and `setx` example for GSTACK_*_BIN.
  - Pre-existing test 'honors BROWSE_BIN when it points at a real executable'
    was hardcoded /bin/sh — made cross-platform via a REAL_EXE constant
    (cmd.exe on win32, /bin/sh on POSIX). Was a Windows-CI blocker on its own.

Coordination: PR #1094 (@BkashJEE) covered browseClient.ts independently with a
narrower scope; this PR's pdftotext + cross-platform tests + GSTACK_*_BIN naming
are additive. Either order of merge works.

Test plan:
  - bun test make-pdf/test/browseClient.test.ts make-pdf/test/pdftotext.test.ts
    on win32 — 29 pass, 0 fail (12 new assertions: findExecutable POSIX/win32/null,
    resolveBrowseBin GSTACK_BROWSE_BIN + BROWSE_BIN + precedence + quote-strip,
    same shape for resolvePdftotext + Windows install hint in error message).
  - POSIX branch unchanged — fs.accessSync(X_OK) on Linux/macOS short-circuits
    before any win32 logic runs, matching the v1.24 claude-bin.ts pattern.

* fix(browse): NTFS ACL hardening for Windows state files via icacls

gstack's ~/.gstack/ state directory holds bearer tokens, canary tokens, agent
queue contents (with prompt history), session state, security-decision logs,
and saved cookie bundles — all written with { mode: 0o600 } / 0o700. On Windows,
those mode bits are a silent no-op: Node's fs module doesn't translate POSIX
modes to NTFS ACLs, and inherited ACLs leave every "restricted" file readable
by other principals on the machine (verified via icacls — six ACEs, the
intended user is the LAST of six).

Threat model is non-trivial on:
  - Self-hosted CI runners (different service account on the same Windows box
    can read developer tokens, canary tokens, prompt history)
  - Shared development machines (agencies, studios, lab environments)
  - Multi-tenant servers with shared home directories

Orthogonal to v1.24.0.0's binary-resolution work — complementary at the write
side. v1.24's bin/gstack-paths resolves ~/.gstack/ correctly across plugin /
global / local installs; this PR ensures files written into those resolved
paths actually get the POSIX 0o600 semantic translated to NTFS.

The fix:
  - New browse/src/file-permissions.ts (158 LOC, 5 public + 1 test-reset).
    restrictFilePermissions / restrictDirectoryPermissions wrap chmod (POSIX)
    or icacls /inheritance:r /grant:r <user>:(F) (Windows). writeSecureFile /
    appendSecureFile / mkdirSecure are drop-in wrappers for the common patterns.
  - 19 call sites converted across 9 source files: browser-manager.ts,
    browser-skill-write.ts, cli.ts, config.ts, meta-commands.ts,
    security-classifier.ts, security.ts (4 sites), server.ts (5 sites),
    terminal-agent.ts (8 sites), tunnel-denial-log.ts.
  - (OI)(CI) inheritance flags on directories mean files created via fs.write*
    *inside* an mkdirSecure-created dir inherit the owner-only ACL automatically
    — important for tunnel-denial-log.ts where appends use async fsp.appendFile.

Error handling: icacls failures (nonexistent path, missing icacls.exe, hardened
environments) log a one-shot warning to stderr and proceed. Once-per-process
gating prevents log spam if the condition persists. Filesystem stays
functional; the file just ends up with inherited ACLs.

Test plan:
  - bun test browse/test/file-permissions.test.ts — 13 pass, 0 fail (POSIX
    mode-bit assertions, Windows no-throw, mkdir idempotence, recursive
    creation, Buffer payloads, append-creates-then-reapplies-once semantics)
  - bun test browse/test/security.test.ts — 38 pass, 0 fail (existing security
    test suite plus the bash-binary resolution tests added in fix #1119; the
    converted writeFileSync/appendFileSync/mkdirSync sites in security.ts
    integrate cleanly)
  - Empirical icacls before/after on a real file — 6 ACEs → 1 ACE
  - bun build typecheck on all modified files — clean (server.ts has a
    pre-existing playwright-core/electron resolution issue unrelated to this PR)

POSIX behavior is bit-identical to old code — fs.chmodSync(path, 0o6XX) on the
helper's POSIX branch matches the inline { mode: 0o6XX } it replaces. Linux
and macOS see no behavior change.

Inviting pushback on three judgment calls (in PR description):
  1. icacls vs npm library
  2. ACL scope — just user, or user + SYSTEM?
  3. Graceful degradation — once-per-process warn, not silent, not hard-fail.

* fix(browse): declare lastConsoleFlushed to restore console-log persistence

flushBuffers() references a `lastConsoleFlushed` cursor at server.ts:337
and assigns it at :344, but the `let lastConsoleFlushed = 0;`
declaration is missing — only the network and dialog siblings are
declared at lines 327-328.

Result: every 1-second flushBuffers tick (line 376) throws
`ReferenceError: lastConsoleFlushed is not defined`, gets swallowed by
the catch at line 369 ("[browse] Buffer flush failed: ..."), and the
console branch's append never runs. browse-console.log is never
written in any production deployment since this regressed.

Discovered by stress-testing the daemon with 15 concurrent CLIs against
cold state — the race surfaced the buffer-flush error spam in one
spawned daemon's stderr. Verified by running the daemon against a real
file:// page with console.log events: in-memory `browse console`
returns the entries, but `.gstack/browse-console.log` is never created
on disk.

Regression introduced by 1a100a2a "fix: eliminate duplicate command
sets in chain, improve flush perf and type safety" — the flush refactor
switched from `Bun.write` to `fs.appendFileSync` and added the
`lastConsoleFlushed` cursor pattern alongside its network/dialog
siblings, but missed the matching `let` declaration. Tests don't
currently exercise flushBuffers, so the regression shipped silently.

Fix:
  - Declare `let lastConsoleFlushed = 0;` next to `lastNetworkFlushed`
    and `lastDialogFlushed` (browse/src/server.ts:327)
  - Add a source-level guard test
    (browse/test/server-flush-trackers.test.ts) that fails any future
    refactor that adds a fourth `last*Flushed` cursor without the
    matching declaration. Same pattern as terminal-agent.test.ts and
    dual-listener.test.ts — read source as text, assert invariant, no
    daemon required.

Test plan:
  - [x] New regression test fails on current main, passes with the fix
  - [x] `bun run build` clean
  - [x] Manual smoke: spawn daemon -> goto file:// page with
        console.log -> wait 4s -> .gstack/browse-console.log now
        exists with the expected entries (163 bytes vs zero before)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

* fix(browse): per-process state-file temp path to fix concurrent-write ENOENT

The daemon writes `.gstack/browse.json` via the standard atomic-rename
pattern: `writeFileSync(tmp, …) → renameSync(tmp, stateFile)`. Four
sites in server.ts use this pattern (initial daemon-startup state at
:2002, /tunnel/start handler at :1479, BROWSE_TUNNEL=1 inline tunnel
update at :2083, BROWSE_TUNNEL_LOCAL_ONLY=1 update at :2113), and all
four hard-code the same temp filename `${stateFile}.tmp`.

Under concurrent writers the shared filename races on the rename:

    t0  Writer A: writeFileSync(stateFile + '.tmp', payloadA)
    t1  Writer B: writeFileSync(stateFile + '.tmp', payloadB)   // overwrites A
    t2  Writer A: renameSync(stateFile + '.tmp', stateFile)    // moves B's payload
    t3  Writer B: renameSync(stateFile + '.tmp', stateFile)    // ENOENT — file gone

Reproduced empirically with 15 concurrent CLIs against a fresh `.gstack/`:

    [browse] Failed to start: ENOENT: no such file or directory,
    rename '…/.gstack/browse.json.tmp' -> '…/.gstack/browse.json'

Pre-fix success rate: **0 / 15** under cold-start race.
Post-fix success rate: **15 / 15**, zero ENOENT.

Fix:
  - New `tmpStatePath()` helper (server.ts:333) returns
    `${stateFile}.tmp.${pid}.${randomBytes(4).toString('hex')}`
  - All 4 call sites use `tmpStatePath()` instead of the shared literal
  - Atomic rename still gives last-writer-wins semantics on the final
    state.json content; only behavior change is that concurrent writers
    no longer kill each other on the rename step

Source-level guard test (browse/test/server-tmp-state-path.test.ts)
locks two invariants: (1) no remaining `stateFile + '.tmp'` literals,
(2) every state-write `writeFileSync` call uses `tmpStatePath()`. Same
read-source-as-text pattern as terminal-agent.test.ts and
dual-listener.test.ts — no daemon required, runs in tier-1 free.

Test plan:
  - [x] Targeted source-level guard test passes (3 / 0)
  - [x] `bun run build` clean
  - [x] Live regression: 15 concurrent CLIs against cold state →
        15 / 15 healthy, 0 ENOENT (vs 0 / 15 pre-fix)
  - [x] No `.tmp.*` orphans left behind after rename succeeds
  - [x] Related test cluster (server-auth, dual-listener, cdp-mutex,
        findport) — same pre-existing flakes as `main`, no new
        regressions introduced

🤖 Generated with [Claude Code](https://claude.com/claude-code)

* fix(browse): clear refs when iframe auto-detaches in getActiveFrameOrPage

Asymmetric cleanup between two equivalent staleness conditions:

  onMainFrameNavigated()  →  clearRefs() + activeFrame = null  ✓
  getActiveFrameOrPage()  →  activeFrame = null  (refs NOT cleared)  ✗

Both paths see the same staleness condition — refs were captured
against a frame that no longer exists. The main-frame path correctly
clears both pieces of state. The iframe-detach path nulls the frame
but leaves the refMap intact.

The lazy click-time check in `resolveRef` (tab-session.ts:97) partially
saves us — `entry.locator.count()` on a detached-frame locator throws
or returns 0, so the click errors out as "Ref X is stale". But the
user has no signal that frame context silently changed underfoot: the
next `snapshot` runs against `this.page` (main) while old iframe refs
still litter `refMap` with the same role+name keys. New refs collide
with stale ones, the resolver picks one at random, the user clicks
the wrong element.

TODOS.md line 816-820 documents "Detached frame auto-recovery" as a
shipped iframe-support feature in v0.12.1.0. This restores the
documented intent — the recovery should leave the session in a clean
state, not a half-cleared one.

Fix: 1 line — add `this.clearRefs()` next to `this.activeFrame = null`
inside the if-branch.

Test plan:
  - [x] New regression test: 4/4 pass
        - refs cleared when getActiveFrameOrPage detects detached iframe
        - refs preserved when active frame is still attached (no regression)
        - refs preserved when no frame set (page-level path untouched)
        - matches onMainFrameNavigated symmetry — both paths reach the
          same clean end state
  - [x] `bun run build` clean

🤖 Generated with [Claude Code](https://claude.com/claude-code)

* fix(codex): resolve python for JSON parser

* fix: add fail-fast probe for base branch in ship step 12

* fix(plan-devex-review): remove contradictory plan-mode handshake

* fix(design): honor Retry-After header in variants 429 handler

Closes #1244.

The 429 handler in `generateVariant` discarded the `Retry-After` response
header and fell straight through to a local exponential schedule (2s/4s/8s).
In image-generation batches, that burns retry attempts inside the provider's
cooldown window and the request never recovers.

Now we parse `Retry-After` per RFC 7231 — both delta-seconds (`Retry-After: 5`)
and HTTP-date (`Retry-After: Fri, 31 Dec 1999 23:59:59 GMT`). Honored waits
are capped at 60s to bound stalls from hostile or buggy headers. Delta-seconds
are validated as digits-only (rejects `2abc`). When `Retry-After` is honored
(including 0 / past-date "retry now"), the next iteration's leading exponential
sleep is skipped so we don't double-wait. Invalid or missing headers fall
through to the existing exponential schedule unchanged.

Behavior matrix:

| Header                          | Behavior                                  |
|---------------------------------|-------------------------------------------|
| Retry-After: 5                  | wait 5s, skip leading on next attempt     |
| Retry-After: 999999             | capped to 60s, skip leading               |
| Retry-After: 2abc               | invalid, fall through to exponential      |
| Retry-After: 0                  | wait 0, skip leading (retry immediately)  |
| Retry-After: <past HTTP-date>   | wait 0, skip leading                      |
| Retry-After: <future date>      | wait diff capped at 60s, skip leading     |
| no header                       | fall through to existing exponential      |

`generateVariant` now accepts an optional `fetchFn` parameter (defaults to
`globalThis.fetch`) so tests can inject a stub. Production call sites are
unchanged.

Tests cover the five behavior buckets above, asserting both the 1st-to-2nd
call timing gap and call counts. All five pass in ~8s.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(docs): correct per-skill symlink removal snippet in README uninstall

Closes #1130.

The manual-uninstall fallback in `## Uninstall` → `### Option 2` used
`find ~/.claude/skills -maxdepth 1 -type l`, which finds nothing on real
installs. Each `~/.claude/skills/<name>/` is a real directory, and only
`<name>/SKILL.md` inside it is a symlink into `gstack/`. The find never
matched, so the snippet silently removed nothing.

Replace with a directory walk that inspects each `<name>/SKILL.md`:

  find ~/.claude/skills -mindepth 1 -maxdepth 1 -type d ! -name gstack
  → check $dir/SKILL.md is a symlink → readlink it
  → if target is gstack/* or */gstack/*: rm -f the link, rmdir the dir
    (only if empty — preserves any user-added files)

Excludes the top-level `gstack/` dir from the walk; that's removed by
step 3 of the same uninstall block.

`bin/gstack-uninstall` (the script-mode path) already handles the layout
correctly via its own walk; only this manual fallback needed updating.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: reject partial browse client env integers

* fix(gemini-adapter): detect new ~/.gemini/oauth_creds.json auth path

gemini-cli >=0.30 stores OAuth credentials at ~/.gemini/oauth_creds.json
instead of the legacy ~/.config/gemini/ directory. The benchmark adapter's
availability check now succeeds for users on recent gemini-cli releases
who have authenticated via interactive login.

Both paths are accepted so users on older versions still work.

* fix(browser): add --no-sandbox for root user on Linux/WSL2

Chromium's sandbox can't initialize when running as root on Linux,
causing an immediate exit. Extend the existing CI/CONTAINER check to
also cover this case, keeping the Windows-safe `typeof getuid` guard.

* security: pass cwd to git via execFileSync, not interpolation through /bin/sh

`bin/gstack-memory-ingest.ts:632-643` ran `execSync(\`git -C ${JSON.stringify(cwd)}
remote get-url origin 2>/dev/null\`, ...)`. JSON.stringify escapes `"` and `\`
but not `$` or backticks, so a `cwd` of `"$(touch /tmp/marker)"` survived JSON
quoting and detonated under /bin/sh's command-substitution-inside-double-quotes.

`cwd` originates from transcript JSONL records under
`~/.claude/projects/<encoded-cwd>/<uuid>.jsonl` and
`~/.codex/sessions/YYYY/MM/DD/rollout-*.jsonl`. The walker grabs the first
`.cwd` it sees per session. That's an untrusted surface in the gstack threat
model — the L1-L6 sidebar security stack exists exactly because agent
transcripts can carry attacker-influenced text. Two pivots above the local
same-uid bar: (a) prompt-injection appending `cwd="$(...)"` to the active
session log turns the next /sync-gbrain run into RCE under the user's uid;
(b) cross-machine transcript share (a colleague's `.claude/projects` snippet
untar'd into HOME, a documented gbrain dogfooding shape) → RCE on first sync.

Fix swaps the one execSync for `execFileSync("git", ["-C", cwd, "remote",
"get-url", "origin"], ...)`. No shell, argv passed directly to git. The same
module already uses execFileSync for `gbrainAvailable()` (line 762 pre-patch)
and `gbrainPutPage()` (line 816 pre-patch) — this single execSync was the
outlier.

Test: `gstack-memory-ingest security: untrusted cwd cannot trigger shell
substitution` plants a Claude-Code-shaped JSONL with cwd=`$(touch <marker>)`
and asserts the marker file is not created after `--incremental --quiet`.
Negative control: with the patch reverted, the test fails (marker created);
with the patch applied, it passes (18/18 in test/gstack-memory-ingest.test.ts).

* security: gate domain-skill auto-promote on classifier_score > 0

`browse/src/domain-skill-commands.ts:140` (handleSave) writes
`classifier_score: 0` with the comment "L4 deferred to load-time / sidebar-agent
fills this in on first prompt-injection load." But CLAUDE.md "Sidebar
architecture" documents that sidebar-agent.ts was ripped, and grep for
recordSkillUse + classifierFlagged callers across browse/src/ returns zero hits
outside the module under test.

Net effect: every quarantined skill that survives three benign uses without
flag (`recordSkillUse(... , classifierFlagged: false)` x3) auto-promotes to
`active` and lands in prompt context wrapped as UNTRUSTED on every subsequent
visit to that host. The L4 score that was supposed to gate the promotion was
never written — the production save path puts 0 on disk and nothing later
updates it.

Threat model: a domain-skill body authored by an agent under the influence of
a poisoned page (the new `gstackInjectToTerminal` PTY path runs no L1-L3
either) would lose its auto-promote barrier after three uses. The exploit
isn't single-step but the bar is exactly N=3 prompt-injection-shaped uses on
a hostile page, which is well within reach.

Fix adds a single condition to the auto-promote gate in `recordSkillUse`:

    if (state === 'quarantined' && useCount >= PROMOTE_THRESHOLD &&
        flagCount === 0 && current.classifier_score > 0) {
      state = 'active';
    }

`classifier_score` is set once at writeSkill and never updated. Production
saves it as 0 (handleSave), so the gate stays closed; existing tests that
explicitly pass `classifierScore: 0.1` still auto-promote (the auto-promote
path is preserved for the day L4 is rewired).

Manual promotion via `domain-skill promote-to-global` is unaffected (it goes
through `promoteToGlobal` which has its own state-machine guard at line 337+).

Test: new regression case `does NOT auto-promote when classifier_score is 0
(production handleSave shape)` plants a skill with classifierScore=0 (matches
domain-skill-commands.ts:140), runs three uses without flag, asserts the skill
stays quarantined and readSkill returns null. Negative control: revert the
patch, the test fails with `Received: "active"`. With the patch: 15/15 pass.

* fix(ship): port #1302 SKILL.md edits to .tmpl + resolver source

PR #1302 added Verification Mode + UNVERIFIABLE classification + per-item
confirmation gate to ship/SKILL.md, but only the generated SKILL.md was
edited — not the .tmpl source or scripts/resolvers/review.ts. The next
`bun run gen:skill-docs` run would have wiped the changes.

Port the same content into the resolver and .tmpl so regeneration produces
the intended output.

* ci(windows): extend free-tests lane to cover icacls + Bun.which resolvers from fix-wave PRs

Closes #1306/#1307/#1308 validation gap. The four newly-added test files
already have process.platform guards so they run safely on both POSIX and
Windows lanes — only platform-relevant assertions execute on each.

Tests added to the windows-latest lane:
- browse/test/file-permissions.test.ts (#1308 icacls + writeSecureFile)
- browse/test/security.test.ts (#1306 bash.exe wrap pure-function path)
- make-pdf/test/browseClient.test.ts (#1307 Bun.which browse resolver)
- make-pdf/test/pdftotext.test.ts (#1307 Bun.which pdftotext resolver)

* test(codex): live flag-semantics smoke for codex exec resume

Closes #1270's regex-only test gap. PR #1270 asserted that codex/SKILL.md's
`codex exec resume` invocation drops -C/-s and uses sandbox_mode config.
That regex catches the skill template regressing, but not codex CLI itself
flipping flag semantics again.

This test probes `codex exec resume --help` and asserts the surface gstack
relies on: -c/sandbox_mode is accepted, top-level -C is absent. Skips
silently when codex isn't on PATH, so dev machines without codex installed
never see it fail.

* chore: regen SKILL.md after fix wave

One regen commit at the end of the merge wave per the plan. plan-devex-review
loses the contradictory plan-mode handshake (#1333). review/SKILL.md picks up
the Verification Mode + UNVERIFIABLE classification additions that #1302
authored against ship/SKILL.md (same resolver shared between ship and review
modes).

* fix(server.ts): keep fs.writeFileSync for state-file writes

#1308's writeSecureFile wrapper added Windows icacls hardening for the
4 state-file write sites in server.ts, but #1310's regression test grep's
for fs.writeFileSync(tmpStatePath()) calls. The two changes are technically
compatible only if the test relaxes — keeping the test strict (the safer
choice for catching regressions on the cold-start race) means the 4 state-
file sites stay on fs.writeFileSync(..., { mode: 0o600 }).

POSIX 0o600 hardening is preserved on those 4 sites. Windows icacls
hardening still applies to all the other writeSecureFile call sites
#1308 added (auth.json, mkdirSecure, etc.).

Also refreshes golden baselines after #1302 / port + minor wording tweak
in scripts/resolvers/review.ts to keep gen-skill-docs.test.ts assertion
'Cite the specific file' satisfied.

* v1.30.0.0: fix wave — 21 community PRs + 2 closing fixes for Windows + codex CI gaps

Headline release. Browse stops dropping console logs, cold-start race
fixed, codex resume works without python3, Windows hardening (icacls +
Bun.which + bash.exe wrap), ship gate gets VAS-449 remediation, two
closing fixes that put icacls/Bun.which/codex flag semantics under CI.

* test(domain-skills): cover #1369 classifier_score=0 quarantine + score>0 promote path

The pre-existing T6 test seeded skills via writeSkill (which defaults
classifier_score to 0 until L4 is rewired) and then expected 3 uses to
auto-promote. PR #1369 added `current.classifier_score > 0` to the gate
specifically to block that path — a quarantined skill written under the
influence of a poisoned page would otherwise auto-promote after three
benign uses.

Updated test asserts both halves of the new contract:
- classifier_score=0 + 3 uses → stays quarantined (the security guarantee)
- classifier_score>0 + 3 more uses → promotes to active (unblock path)

Catches both regressions: the gate going away (would re-allow the bypass)
and the unblock path breaking (would silently quarantine all skills
forever once L4 is rewired).

---------

Co-authored-by: Jayesh Betala <jayesh.betala7@gmail.com>
Co-authored-by: orbisai0security <mediratta01.pally@gmail.com>
Co-authored-by: Bryce Alan <brycealan.eth@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Terry Carson YM <cym3118288@gmail.com>
Co-authored-by: Vasko Ckorovski <vckorovski@gmail.com>
Co-authored-by: Samuel Carson <samuel.carson@gmail.com>
Co-authored-by: Yashwant Kotipalli <yashwant7kotipalli@gmail.com>
Co-authored-by: Jasper Chen <jasperchen925@gmail.com>
Co-authored-by: Stefan Neamtu <stefan.neamtu@gmail.com>
Co-authored-by: 陈家名 <chenjiaming@kezaihui.com>
Co-authored-by: Abigail Atheryon <abi@atheryon.ai>
Co-authored-by: Furkan Köykıran <furkankoykiran@gmail.com>
Co-authored-by: gus <gustavoraularagon@gmail.com>
2026-05-09 08:06:47 -07:00

1470 lines
56 KiB
TypeScript

/**
* Browser lifecycle manager
*
* Chromium crash handling:
* browser.on('disconnected') → log error → process.exit(1)
* CLI detects dead server → auto-restarts on next command
* We do NOT try to self-heal — don't hide failure.
*
* Dialog handling:
* page.on('dialog') → auto-accept by default → store in dialog buffer
* Prevents browser lockup from alert/confirm/prompt
*
* Context recreation (useragent):
* recreateContext() saves cookies/storage/URLs, creates new context,
* restores state. Falls back to clean slate on any failure.
*/
import { chromium, type Browser, type BrowserContext, type BrowserContextOptions, type Page, type Locator, type Cookie } from 'playwright';
import { writeSecureFile, mkdirSecure } from './file-permissions';
import { addConsoleEntry, addNetworkEntry, addDialogEntry, networkBuffer, type DialogEntry } from './buffers';
import { validateNavigationUrl } from './url-validation';
import { TabSession, type RefEntry } from './tab-session';
export type { RefEntry };
// Re-export TabSession for consumers
export { TabSession };
export interface BrowserState {
cookies: Cookie[];
pages: Array<{
url: string;
isActive: boolean;
storage: { localStorage: Record<string, string>; sessionStorage: Record<string, string> } | null;
/**
* HTML content loaded via load-html (setContent), replayed after context recreation.
* In-memory only — never persisted to disk (HTML may contain secrets or customer data).
*/
loadedHtml?: string;
loadedHtmlWaitUntil?: 'load' | 'domcontentloaded' | 'networkidle';
/**
* Tab owner clientId for multi-agent isolation. Survives context recreation so
* scoped agents don't get locked out of their own tabs after viewport --scale.
* In-memory only.
*/
owner?: string;
}>;
}
export class BrowserManager {
private browser: Browser | null = null;
private context: BrowserContext | null = null;
// Proxy config applied to chromium.launch() when set (D8). Set by server.ts
// at startup based on BROWSE_PROXY_URL. For SOCKS5 with auth, server.ts
// points this at the local bridge (socks5://127.0.0.1:<bridgePort>); for
// HTTP/HTTPS or unauth SOCKS5, it's the upstream URL directly.
private proxyConfig: { server: string; username?: string; password?: string } | null = null;
private pages: Map<number, Page> = new Map();
private tabSessions: Map<number, TabSession> = new Map();
private activeTabId: number = 0;
private nextTabId: number = 1;
private extraHeaders: Record<string, string> = {};
private customUserAgent: string | null = null;
// ─── Viewport + deviceScaleFactor (context options) ──────────
// Tracked at the manager level so recreateContext() preserves them.
// deviceScaleFactor is a *context* option, not a page-level setter — changes
// require recreateContext(). Viewport width/height can change on-page, but we
// track the latest so context recreation restores it instead of hardcoding 1280x720.
private deviceScaleFactor: number = 1;
private currentViewport: { width: number; height: number } = { width: 1280, height: 720 };
/** Server port — set after server starts, used by cookie-import-browser command */
public serverPort: number = 0;
// ─── Tab Ownership (multi-agent isolation) ──────────────
// Maps tabId → clientId. Unowned tabs (not in this map) are root-only for writes.
private tabOwnership: Map<number, string> = new Map();
// ─── Dialog Handling (global, not per-tab) ──────────────────
private dialogAutoAccept: boolean = true;
private dialogPromptText: string | null = null;
// ─── Cookie Origin Tracking ────────────────────────────────
private cookieImportedDomains: Set<string> = new Set();
// ─── Handoff State ─────────────────────────────────────────
private isHeaded: boolean = false;
private consecutiveFailures: number = 0;
// ─── Watch Mode ─────────────────────────────────────────
private watching = false;
public watchInterval: ReturnType<typeof setInterval> | null = null;
private watchSnapshots: string[] = [];
private watchStartTime: number = 0;
// ─── Headed State ────────────────────────────────────────
private connectionMode: 'launched' | 'headed' = 'launched';
private intentionalDisconnect = false;
// Called when the headed browser disconnects without intentional teardown
// (user closed the window). Wired up by server.ts to run full cleanup
// (sidebar-agent, state file, profile locks) before exiting with code 2.
// Returns void or a Promise; rejections are caught and fall back to exit(2).
public onDisconnect: (() => void | Promise<void>) | null = null;
getConnectionMode(): 'launched' | 'headed' { return this.connectionMode; }
// ─── Watch Mode Methods ─────────────────────────────────
isWatching(): boolean { return this.watching; }
startWatch(): void {
this.watching = true;
this.watchSnapshots = [];
this.watchStartTime = Date.now();
}
stopWatch(): { snapshots: string[]; duration: number } {
this.watching = false;
if (this.watchInterval) {
clearInterval(this.watchInterval);
this.watchInterval = null;
}
const snapshots = this.watchSnapshots;
const duration = Date.now() - this.watchStartTime;
this.watchSnapshots = [];
this.watchStartTime = 0;
return { snapshots, duration };
}
addWatchSnapshot(snapshot: string): void {
this.watchSnapshots.push(snapshot);
}
/**
* Find the gstack Chrome extension directory.
* Checks: repo root /extension, global install, dev install.
*/
private findExtensionPath(): string | null {
const fs = require('fs');
const path = require('path');
const candidates = [
// Explicit override via env var (used by GStack Browser.app bundle)
process.env.BROWSE_EXTENSIONS_DIR || '',
// Relative to this source file (dev mode: browse/src/ -> ../../extension)
path.resolve(__dirname, '..', '..', 'extension'),
// Global gstack install
path.join(process.env.HOME || '', '.claude', 'skills', 'gstack', 'extension'),
// Git repo root (detected via BROWSE_STATE_FILE location)
(() => {
const stateFile = process.env.BROWSE_STATE_FILE || '';
if (stateFile) {
const repoRoot = path.resolve(path.dirname(stateFile), '..');
return path.join(repoRoot, '.claude', 'skills', 'gstack', 'extension');
}
return '';
})(),
].filter(Boolean);
for (const candidate of candidates) {
try {
if (fs.existsSync(path.join(candidate, 'manifest.json'))) {
return candidate;
}
} catch (err: any) {
if (err?.code !== 'ENOENT' && err?.code !== 'EACCES') throw err;
}
}
return null;
}
/**
* Set the proxy config applied to chromium.launch() in launch() and
* launchHeaded(). Called by server.ts at startup once the (optional) SOCKS5
* bridge is up.
*/
setProxyConfig(cfg: { server: string; username?: string; password?: string } | null): void {
this.proxyConfig = cfg;
}
/**
* Get the ref map for external consumers (e.g., /refs endpoint).
*/
getRefMap(): Array<{ ref: string; role: string; name: string }> {
try {
return this.getActiveSession().getRefEntries();
} catch {
return [];
}
}
async launch() {
// ─── Extension Support ────────────────────────────────────
// BROWSE_EXTENSIONS_DIR points to an unpacked Chrome extension directory.
// Extensions only work in headed mode, so we use an off-screen window.
const extensionsDir = process.env.BROWSE_EXTENSIONS_DIR;
const { STEALTH_LAUNCH_ARGS } = await import('./stealth');
const launchArgs: string[] = [...STEALTH_LAUNCH_ARGS];
let useHeadless = true;
// Docker/CI/root: Chromium sandbox requires unprivileged user namespaces which
// are typically disabled in containers and are never available for the root
// user on Linux. Detect all three cases and add --no-sandbox automatically.
const isRoot = typeof process.getuid === 'function' && process.getuid() === 0;
if (process.env.CI || process.env.CONTAINER || isRoot) {
launchArgs.push('--no-sandbox');
}
if (extensionsDir) {
launchArgs.push(
`--disable-extensions-except=${extensionsDir}`,
`--load-extension=${extensionsDir}`,
'--window-position=-9999,-9999',
'--window-size=1,1',
);
useHeadless = false; // extensions require headed mode; off-screen window simulates headless
console.log(`[browse] Extensions loaded from: ${extensionsDir}`);
}
this.browser = await chromium.launch({
headless: useHeadless,
// On Windows, Chromium's sandbox fails when the server is spawned through
// the Bun→Node process chain (GitHub #276). Disable it — local daemon
// browsing user-specified URLs has marginal sandbox benefit.
chromiumSandbox: process.platform !== 'win32',
...(launchArgs.length > 0 ? { args: launchArgs } : {}),
...(this.proxyConfig ? { proxy: this.proxyConfig } : {}),
});
// Chromium crash → exit with clear message
this.browser.on('disconnected', () => {
console.error('[browse] FATAL: Chromium process crashed or was killed. Server exiting.');
console.error('[browse] Console/network logs flushed to .gstack/browse-*.log');
process.exit(1);
});
const contextOptions: BrowserContextOptions = {
viewport: { width: this.currentViewport.width, height: this.currentViewport.height },
deviceScaleFactor: this.deviceScaleFactor,
};
if (this.customUserAgent) {
contextOptions.userAgent = this.customUserAgent;
}
this.context = await this.browser.newContext(contextOptions);
if (Object.keys(this.extraHeaders).length > 0) {
await this.context.setExtraHTTPHeaders(this.extraHeaders);
}
// D7: mask navigator.webdriver only. The other 3 wintermute patches
// (plugins, languages, chrome.runtime) are intentionally NOT applied —
// faking them to fixed values can flag more bot-like to modern
// fingerprinters, not less.
const { applyStealth } = await import('./stealth');
await applyStealth(this.context);
// Create first tab
await this.newTab();
}
// ─── Headed Mode ─────────────────────────────────────────────
/**
* Launch Playwright's bundled Chromium in headed mode with the gstack
* Chrome extension auto-loaded. Uses launchPersistentContext() which
* is required for extension loading (launch() + newContext() can't
* load extensions).
*
* The browser launches headed with a visible window — the user sees
* every action Claude takes in real time.
*/
async launchHeaded(authToken?: string): Promise<void> {
// Clear old state before repopulating
this.pages.clear();
this.tabSessions.clear();
this.nextTabId = 1;
// Find the gstack extension directory for auto-loading
const extensionPath = this.findExtensionPath();
const launchArgs = [
'--hide-crash-restore-bubble',
// Anti-bot-detection: remove the navigator.webdriver flag that Playwright sets.
// Sites like Google and NYTimes check this to block automation browsers.
'--disable-blink-features=AutomationControlled',
];
if (extensionPath) {
launchArgs.push(`--disable-extensions-except=${extensionPath}`);
launchArgs.push(`--load-extension=${extensionPath}`);
// Write auth token for extension bootstrap.
// Write to ~/.gstack/.auth.json (not the extension dir, which may be read-only
// in .app bundles and breaks codesigning).
if (authToken) {
const fs = require('fs');
const path = require('path');
const gstackDir = path.join(process.env.HOME || '/tmp', '.gstack');
mkdirSecure(gstackDir);
const authFile = path.join(gstackDir, '.auth.json');
try {
writeSecureFile(authFile, JSON.stringify({ token: authToken, port: this.serverPort || 34567 }));
} catch (err: any) {
console.warn(`[browse] Could not write .auth.json: ${err.message}`);
}
}
}
// Launch headed Chromium via Playwright's persistent context.
// Extensions REQUIRE launchPersistentContext (not launch + newContext).
// Real Chrome (executablePath/channel) silently blocks --load-extension,
// so we use Playwright's bundled Chromium which reliably loads extensions.
const fs = require('fs');
const path = require('path');
const userDataDir = path.join(process.env.HOME || '/tmp', '.gstack', 'chromium-profile');
fs.mkdirSync(userDataDir, { recursive: true });
// Support custom Chromium binary via GSTACK_CHROMIUM_PATH env var.
// Used by GStack Browser.app to point at the bundled Chromium.
const executablePath = process.env.GSTACK_CHROMIUM_PATH || undefined;
// Rebrand Chromium → GStack Browser in macOS menu bar / Dock / Cmd+Tab.
// Patch the Chromium .app's Info.plist so macOS shows our name.
// This works for both dev mode (system Playwright cache) and .app bundle.
const chromePath = executablePath || chromium.executablePath();
try {
// Walk up from binary to the .app's Info.plist
// e.g. .../Google Chrome for Testing.app/Contents/MacOS/Google Chrome for Testing
// → .../Google Chrome for Testing.app/Contents/Info.plist
const chromeContentsDir = path.resolve(path.dirname(chromePath), '..');
const chromePlist = path.join(chromeContentsDir, 'Info.plist');
if (fs.existsSync(chromePlist)) {
const plistContent = fs.readFileSync(chromePlist, 'utf-8');
if (plistContent.includes('Google Chrome for Testing')) {
const patched = plistContent
.replace(/Google Chrome for Testing/g, 'GStack Browser');
fs.writeFileSync(chromePlist, patched);
}
// Replace Chromium's Dock icon with ours (Chromium's process owns the Dock icon)
const iconCandidates = [
path.join(__dirname, '..', '..', 'scripts', 'app', 'icon.icns'), // repo dev mode
path.join(process.env.HOME || '', '.claude', 'skills', 'gstack', 'scripts', 'app', 'icon.icns'), // global install
];
const iconSrc = iconCandidates.find(p => fs.existsSync(p));
if (iconSrc) {
const chromeResources = path.join(chromeContentsDir, 'Resources');
// Read original icon name from plist
const iconMatch = plistContent.match(/<key>CFBundleIconFile<\/key>\s*<string>([^<]+)<\/string>/);
let origIcon = iconMatch ? iconMatch[1] : 'app';
if (!origIcon.endsWith('.icns')) origIcon += '.icns';
const destIcon = path.join(chromeResources, origIcon);
try {
fs.copyFileSync(iconSrc, destIcon);
} catch (err: any) {
if (err?.code !== 'ENOENT' && err?.code !== 'EACCES') throw err;
}
}
}
} catch (err: any) {
// Non-fatal: app name stays as Chrome for Testing (ENOENT/EACCES expected)
if (err?.code !== 'ENOENT' && err?.code !== 'EACCES') throw err;
}
// Build custom user agent: keep Chrome version for site compatibility,
// but replace "Chrome for Testing" branding with "GStackBrowser"
let customUA: string | undefined;
if (!this.customUserAgent) {
// Detect Chrome version from the Chromium binary
const chromePath = executablePath || chromium.executablePath();
try {
const versionProc = Bun.spawnSync([chromePath, '--version'], {
stdout: 'pipe', stderr: 'pipe', timeout: 5000,
});
const versionOutput = versionProc.stdout.toString().trim();
// Output like: "Google Chrome for Testing 145.0.6422.0" or "Chromium 145.0.6422.0"
const versionMatch = versionOutput.match(/(\d+\.\d+\.\d+\.\d+)/);
const chromeVersion = versionMatch ? versionMatch[1] : '131.0.0.0';
customUA = `Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/${chromeVersion} Safari/537.36 GStackBrowser`;
} catch {
// Fallback: generic modern Chrome UA
customUA = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36 GStackBrowser';
}
}
this.context = await chromium.launchPersistentContext(userDataDir, {
headless: false,
args: launchArgs,
viewport: null, // Use browser's default viewport (real window size)
userAgent: this.customUserAgent || customUA,
...(executablePath ? { executablePath } : {}),
...(this.proxyConfig ? { proxy: this.proxyConfig } : {}),
// Playwright adds flags that block extension loading
ignoreDefaultArgs: [
'--disable-extensions',
'--disable-component-extensions-with-background-pages',
],
});
this.browser = this.context.browser();
this.connectionMode = 'headed';
this.intentionalDisconnect = false;
// ─── Anti-bot-detection patches ───────────────────────────────
// D7 (codex correction): mask navigator.webdriver only. We do NOT fake
// plugins/languages — modern fingerprinters check consistency between
// those and userAgent/platform, and synthesizing fixed values can flag
// MORE bot-like, not less. Let Chromium's natural plugins and languages
// surface unmodified.
//
// What we DO clean up are automation-specific runtime artifacts that
// shouldn't exist in a real browser at all (Permissions API quirks,
// ChromeDriver-injected window globals). Those aren't fingerprint
// synthesis — they're removing leaked automation tells.
const { applyStealth } = await import('./stealth');
await applyStealth(this.context);
await this.context.addInitScript(() => {
// Remove CDP runtime artifacts that automation detectors look for
// cdc_ prefixed vars are injected by ChromeDriver/CDP
const cleanup = () => {
for (const key of Object.keys(window)) {
if (key.startsWith('cdc_') || key.startsWith('__webdriver')) {
try {
delete (window as any)[key];
} catch (e: any) {
if (!(e instanceof TypeError)) throw e;
}
}
}
};
cleanup();
// Re-clean after a tick in case they're injected late
setTimeout(cleanup, 0);
// Override Permissions API to return 'prompt' for notifications
// (automation browsers return 'denied' which is a fingerprint)
const originalQuery = window.navigator.permissions?.query;
if (originalQuery) {
(window.navigator.permissions as any).query = (params: any) => {
if (params.name === 'notifications') {
return Promise.resolve({ state: 'prompt', onchange: null } as PermissionStatus);
}
return originalQuery.call(window.navigator.permissions, params);
};
}
});
// Inject visual indicator — subtle top-edge amber gradient
// Extension's content script handles the floating pill
const indicatorScript = () => {
const injectIndicator = () => {
if (document.getElementById('gstack-ctrl')) return;
const topLine = document.createElement('div');
topLine.id = 'gstack-ctrl';
topLine.style.cssText = `
position: fixed; top: 0; left: 0; right: 0; height: 2px;
background: linear-gradient(90deg, #F59E0B, #FBBF24, #F59E0B);
background-size: 200% 100%;
animation: gstack-shimmer 3s linear infinite;
pointer-events: none; z-index: 2147483647;
opacity: 0.8;
`;
const style = document.createElement('style');
style.textContent = `
@keyframes gstack-shimmer {
0% { background-position: 200% 0; }
100% { background-position: -200% 0; }
}
@media (prefers-reduced-motion: reduce) {
#gstack-ctrl { animation: none !important; }
}
`;
document.documentElement.appendChild(style);
document.documentElement.appendChild(topLine);
};
if (document.readyState === 'loading') {
document.addEventListener('DOMContentLoaded', injectIndicator);
} else {
injectIndicator();
}
};
await this.context.addInitScript(indicatorScript);
// Track user-created tabs automatically (Cmd+T, link opens in new tab, etc.)
this.context.on('page', (page) => {
const id = this.nextTabId++;
this.pages.set(id, page);
this.tabSessions.set(id, new TabSession(page));
this.activeTabId = id;
this.wirePageEvents(page);
// Inject indicator on the new tab
page.evaluate(indicatorScript).catch(() => {});
console.log(`[browse] New tab detected (id=${id}, total=${this.pages.size})`);
});
// Persistent context opens a default page — adopt it instead of creating a new one
const existingPages = this.context.pages();
if (existingPages.length > 0) {
const page = existingPages[0];
const id = this.nextTabId++;
this.pages.set(id, page);
this.tabSessions.set(id, new TabSession(page));
this.activeTabId = id;
this.wirePageEvents(page);
// Inject indicator on restored page (addInitScript only fires on new navigations)
try {
await page.evaluate(indicatorScript);
} catch {}
} else {
await this.newTab();
}
// Browser disconnect handler — exit code 2 distinguishes from crashes (1).
// Calls onDisconnect() to trigger full shutdown (kill sidebar-agent, save
// session, clean profile locks + state file) before exit. Falls back to
// direct process.exit(2) if no callback is wired up, or if the callback
// throws/rejects — never leave the process running with a dead browser.
if (this.browser) {
this.browser.on('disconnected', () => {
if (this.intentionalDisconnect) return;
console.error('[browse] Real browser disconnected (user closed or crashed).');
console.error('[browse] Run `$B connect` to reconnect.');
if (!this.onDisconnect) {
process.exit(2);
return;
}
try {
const result = this.onDisconnect();
if (result && typeof (result as Promise<void>).catch === 'function') {
(result as Promise<void>).catch((err) => {
console.error('[browse] onDisconnect rejected:', err);
process.exit(2);
});
}
} catch (err) {
console.error('[browse] onDisconnect threw:', err);
process.exit(2);
}
});
}
// Headed mode defaults
this.dialogAutoAccept = false; // Don't dismiss user's real dialogs
this.isHeaded = true;
this.consecutiveFailures = 0;
}
async close() {
if (this.browser || (this.connectionMode === 'headed' && this.context)) {
if (this.connectionMode === 'headed') {
// Headed/persistent context mode: close the context (which closes the browser)
this.intentionalDisconnect = true;
if (this.browser) this.browser.removeAllListeners('disconnected');
await Promise.race([
this.context ? this.context.close() : Promise.resolve(),
new Promise(resolve => setTimeout(resolve, 5000)),
]).catch(() => {});
} else {
// Launched mode: close the browser we spawned
this.browser.removeAllListeners('disconnected');
await Promise.race([
this.browser.close(),
new Promise(resolve => setTimeout(resolve, 5000)),
]).catch(() => {});
}
this.browser = null;
}
}
/** Health check — verifies Chromium is connected AND responsive */
async isHealthy(): Promise<boolean> {
if (!this.browser || !this.browser.isConnected()) return false;
try {
const page = this.pages.get(this.activeTabId);
if (!page) return true; // connected but no pages — still healthy
await Promise.race([
page.evaluate('1'),
new Promise((_, reject) => setTimeout(() => reject(new Error('timeout')), 2000)),
]);
return true;
} catch {
return false;
}
}
// ─── Tab Management ────────────────────────────────────────
async newTab(url?: string, clientId?: string): Promise<number> {
if (!this.context) throw new Error('Browser not launched');
// Validate URL before allocating page to avoid zombie tabs on rejection.
// Use the normalized return value for navigation — it handles file://./x and
// file://<segment> cwd-relative forms that the standard URL parser doesn't.
let normalizedUrl: string | undefined;
if (url) {
normalizedUrl = await validateNavigationUrl(url);
}
const page = await this.context.newPage();
const id = this.nextTabId++;
this.pages.set(id, page);
this.tabSessions.set(id, new TabSession(page));
this.activeTabId = id;
// Record tab ownership for multi-agent isolation
if (clientId) {
this.tabOwnership.set(id, clientId);
}
// Wire up console/network/dialog capture
this.wirePageEvents(page);
if (normalizedUrl) {
await page.goto(normalizedUrl, { waitUntil: 'domcontentloaded', timeout: 15000 });
}
return id;
}
async closeTab(id?: number): Promise<void> {
const tabId = id ?? this.activeTabId;
const page = this.pages.get(tabId);
if (!page) throw new Error(`Tab ${tabId} not found`);
await page.close();
this.pages.delete(tabId);
this.tabSessions.delete(tabId);
this.tabOwnership.delete(tabId);
// Switch to another tab if we closed the active one
if (tabId === this.activeTabId) {
const remaining = [...this.pages.keys()];
if (remaining.length > 0) {
this.activeTabId = remaining[remaining.length - 1];
} else {
// No tabs left — create a new blank one
await this.newTab();
}
}
}
switchTab(id: number, opts?: { bringToFront?: boolean }): void {
if (!this.tabSessions.has(id)) throw new Error(`Tab ${id} not found`);
this.activeTabId = id;
// Only bring to front when explicitly requested (user-initiated tab switch).
// Internal tab pinning (BROWSE_TAB) should NOT steal focus.
if (opts?.bringToFront !== false) {
const page = this.pages.get(id);
if (page) page.bringToFront().catch(() => {});
}
}
/**
* Sync activeTabId to match the tab whose URL matches the Chrome extension's
* active tab. Called on every /sidebar-tabs poll so manual tab switches in
* the browser are detected within ~2s.
*/
syncActiveTabByUrl(activeUrl: string): void {
if (!activeUrl || this.pages.size <= 1) return;
// Try exact match first, then fuzzy match (origin+pathname, ignoring query/fragment)
let fuzzyId: number | null = null;
let activeOriginPath = '';
try {
const u = new URL(activeUrl);
activeOriginPath = u.origin + u.pathname;
} catch (err: any) {
if (!(err instanceof TypeError)) throw err;
}
for (const [id, page] of this.pages) {
try {
const pageUrl = page.url();
// Exact match — best case
if (pageUrl === activeUrl && id !== this.activeTabId) {
this.activeTabId = id;
return;
}
// Fuzzy match — origin+pathname (handles query param / fragment differences)
if (activeOriginPath && fuzzyId === null && id !== this.activeTabId) {
try {
const pu = new URL(pageUrl);
if (pu.origin + pu.pathname === activeOriginPath) {
fuzzyId = id;
}
} catch (err: any) {
if (!(err instanceof TypeError)) throw err;
}
}
} catch {}
}
// Fall back to fuzzy match
if (fuzzyId !== null) {
this.activeTabId = fuzzyId;
}
}
getActiveTabId(): number {
return this.activeTabId;
}
getTabCount(): number {
return this.pages.size;
}
// ─── Tab Ownership (multi-agent isolation) ──────────────
/** Get the owner of a tab, or null if unowned (root-only for writes). */
getTabOwner(tabId: number): string | null {
return this.tabOwnership.get(tabId) || null;
}
/**
* Check if a client can access a tab.
*
* Two policies, distinguished by `options.ownOnly`:
*
* - **own-only (pair-agent over tunnel):** the strict mode. Token must own
* the target tab for any access (reads or writes). Unowned user tabs
* and tabs owned by other clients are off-limits. Remote agents must
* `newtab` first to get a tab they can drive.
*
* - **shared (local skill spawns, default scoped tokens):** permissive on
* tab access. The token can read/write any tab — capability is gated
* elsewhere (scope checks at /command, rate limits, the dual-listener
* allowlist for tunnel-bound traffic). Tab ownership is not a security
* boundary for shared tokens; it only matters for pair-agent isolation.
* This matches the contract documented in `skill-token.ts:79`
* ("skill scripts may switch tabs as needed").
*
* Root is unconstrained.
*
* `isWrite` is preserved in the signature for callers that want to log or
* branch on it elsewhere, but the access decision itself only depends on
* `ownOnly` + ownership map state.
*/
checkTabAccess(tabId: number, clientId: string, options: { isWrite?: boolean; ownOnly?: boolean } = {}): boolean {
if (clientId === 'root') return true;
if (options.ownOnly) {
const owner = this.tabOwnership.get(tabId);
return owner === clientId;
}
return true;
}
/** Transfer tab ownership to a different client. */
transferTab(tabId: number, toClientId: string): void {
if (!this.pages.has(tabId)) throw new Error(`Tab ${tabId} not found`);
this.tabOwnership.set(tabId, toClientId);
}
async getTabListWithTitles(): Promise<Array<{ id: number; url: string; title: string; active: boolean }>> {
const tabs: Array<{ id: number; url: string; title: string; active: boolean }> = [];
for (const [id, page] of this.pages) {
tabs.push({
id,
url: page.url(),
title: await page.title().catch(() => ''),
active: id === this.activeTabId,
});
}
return tabs;
}
// ─── Session Access ────────────────────────────────────────
/** Get the TabSession for the active tab. */
getActiveSession(): TabSession {
const session = this.tabSessions.get(this.activeTabId);
if (!session) throw new Error('No active page. Use "browse goto <url>" first.');
return session;
}
/** Get a TabSession by tab ID. Used by /batch for parallel tab execution. */
getSession(tabId: number): TabSession {
const session = this.tabSessions.get(tabId);
if (!session) throw new Error(`Tab ${tabId} not found`);
return session;
}
/** Get the underlying Page for a tab id. Returns null if the tab doesn't exist.
* Used by the CDP bridge (cdp-bridge.ts) to mint per-tab CDPSessions. */
getPageForTab(tabId: number): Page | null {
return this.pages.get(tabId) ?? null;
}
// ─── Two-tier mutex (Codex T7) ─────────────────────────────
// Per-tab and global locks for the CDP bridge. tab-scoped methods take the
// per-tab mutex; browser-scoped methods take the global lock that blocks all
// tab mutexes. Hard timeout on acquire so silent deadlock can't happen.
// Every caller MUST use try { ... } finally { release() }.
private tabLocks: Map<number, Promise<void>> = new Map();
private globalCdpLockTail: Promise<void> = Promise.resolve();
/**
* Acquire the per-tab CDP lock with a timeout. Returns a release fn.
* Locks chain: each acquire waits on the prior tail's resolution.
* Browser-scoped global lock takes precedence: while the global lock is
* held, no tab lock can be acquired (and vice versa).
*/
async acquireTabLock(tabId: number, timeoutMs: number): Promise<() => void> {
const existing = this.tabLocks.get(tabId) ?? Promise.resolve();
// Wait for any held global lock first (cross-tier serialization).
const tail = Promise.all([existing, this.globalCdpLockTail]).then(() => undefined);
let release!: () => void;
const next = new Promise<void>((resolve) => { release = resolve; });
this.tabLocks.set(tabId, tail.then(() => next));
const timeoutPromise = new Promise<never>((_, reject) =>
setTimeout(() => reject(new Error(
`CDPMutexAcquireTimeout: tab ${tabId} lock not acquired within ${timeoutMs}ms.\n` +
'Cause: a prior CDP or browser-scoped operation has held the lock too long.\n' +
'Action: retry; if this repeats, the prior operation may be hung — file a bug.'
)), timeoutMs),
);
try {
await Promise.race([tail, timeoutPromise]);
} catch (e) {
// Acquisition failed; release the slot we reserved so we don't deadlock the queue.
release();
throw e;
}
return release;
}
/**
* Acquire the global CDP lock. Blocks until all tab locks are released, and
* blocks new tab-lock acquisitions until released.
*/
async acquireGlobalCdpLock(timeoutMs: number): Promise<() => void> {
const allTabTails = Array.from(this.tabLocks.values());
const priorGlobal = this.globalCdpLockTail;
const allPrior = Promise.all([priorGlobal, ...allTabTails]).then(() => undefined);
let release!: () => void;
const next = new Promise<void>((resolve) => { release = resolve; });
this.globalCdpLockTail = allPrior.then(() => next);
const timeoutPromise = new Promise<never>((_, reject) =>
setTimeout(() => reject(new Error(
`CDPMutexAcquireTimeout: global CDP lock not acquired within ${timeoutMs}ms.\n` +
'Cause: in-flight tab operations have not completed.\n' +
'Action: retry; if this repeats, file a bug — a tab op may be hung.'
)), timeoutMs),
);
try {
await Promise.race([allPrior, timeoutPromise]);
} catch (e) {
release();
throw e;
}
return release;
}
// ─── Page Access (delegates to active session) ─────────────
getPage(): Page {
return this.getActiveSession().page;
}
getCurrentUrl(): string {
try {
return this.getPage().url();
} catch {
return 'about:blank';
}
}
// ─── Ref Map (delegates to active session) ──────────────────
setRefMap(refs: Map<string, RefEntry>) {
this.getActiveSession().setRefMap(refs);
}
clearRefs() {
this.getActiveSession().clearRefs();
}
async resolveRef(selector: string): Promise<{ locator: Locator } | { selector: string }> {
return this.getActiveSession().resolveRef(selector);
}
getRefRole(selector: string): string | null {
return this.getActiveSession().getRefRole(selector);
}
getRefCount(): number {
return this.getActiveSession().getRefCount();
}
// ─── Snapshot Diffing (delegates to active session) ─────────
setLastSnapshot(text: string | null) {
this.getActiveSession().setLastSnapshot(text);
}
getLastSnapshot(): string | null {
return this.getActiveSession().getLastSnapshot();
}
// ─── Dialog Control ───────────────────────────────────────
setDialogAutoAccept(accept: boolean) {
this.dialogAutoAccept = accept;
}
getDialogAutoAccept(): boolean {
return this.dialogAutoAccept;
}
setDialogPromptText(text: string | null) {
this.dialogPromptText = text;
}
getDialogPromptText(): string | null {
return this.dialogPromptText;
}
// ─── Cookie Origin Tracking ────────────────────────────────
trackCookieImportDomains(domains: string[]): void {
for (const d of domains) this.cookieImportedDomains.add(d);
}
getCookieImportedDomains(): ReadonlySet<string> {
return this.cookieImportedDomains;
}
hasCookieImports(): boolean {
return this.cookieImportedDomains.size > 0;
}
// ─── Viewport ──────────────────────────────────────────────
async setViewport(width: number, height: number) {
this.currentViewport = { width, height };
await this.getPage().setViewportSize({ width, height });
}
// ─── Extra Headers ─────────────────────────────────────────
async setExtraHeader(name: string, value: string) {
this.extraHeaders[name] = value;
if (this.context) {
await this.context.setExtraHTTPHeaders(this.extraHeaders);
}
}
// ─── User Agent ────────────────────────────────────────────
setUserAgent(ua: string) {
this.customUserAgent = ua;
}
getUserAgent(): string | null {
return this.customUserAgent;
}
// ─── Lifecycle helpers ───────────────────────────────
/**
* Close all open pages and clear the pages map.
* Used by state load to replace the current session.
*/
async closeAllPages(): Promise<void> {
for (const page of this.pages.values()) {
await page.close().catch(() => {});
}
this.pages.clear();
this.tabSessions.clear();
}
// ─── Frame context (delegates to active session) ────────────
setFrame(frame: import('playwright').Frame | null): void {
this.getActiveSession().setFrame(frame);
}
getFrame(): import('playwright').Frame | null {
return this.getActiveSession().getFrame();
}
getActiveFrameOrPage(): import('playwright').Page | import('playwright').Frame {
return this.getActiveSession().getActiveFrameOrPage();
}
// ─── State Save/Restore (shared by recreateContext + handoff) ─
/**
* Capture browser state: cookies, localStorage, sessionStorage, URLs, active tab.
* Skips pages that fail storage reads (e.g., already closed).
*/
async saveState(): Promise<BrowserState> {
if (!this.context) throw new Error('Browser not launched');
const cookies = await this.context.cookies();
const pages: BrowserState['pages'] = [];
for (const [id, page] of this.pages) {
const url = page.url();
let storage = null;
try {
storage = await page.evaluate(() => ({
localStorage: { ...localStorage },
sessionStorage: { ...sessionStorage },
}));
} catch {}
// Capture load-html content so a later context recreation (viewport --scale)
// can replay it via setTabContent. Never persisted to disk.
const session = this.tabSessions.get(id);
const loaded = session?.getLoadedHtml();
// Preserve tab ownership through recreation so scoped agents aren't locked out.
const owner = this.tabOwnership.get(id);
pages.push({
url: url === 'about:blank' ? '' : url,
isActive: id === this.activeTabId,
storage,
loadedHtml: loaded?.html,
loadedHtmlWaitUntil: loaded?.waitUntil,
owner,
});
}
return { cookies, pages };
}
/**
* Restore browser state into the current context: cookies, pages, storage.
* Navigates to saved URLs, restores storage, wires page events.
* Failures on individual pages are swallowed — partial restore is better than none.
*/
async restoreState(state: BrowserState): Promise<void> {
if (!this.context) throw new Error('Browser not launched');
// Restore cookies
if (state.cookies.length > 0) {
await this.context.addCookies(state.cookies);
}
// Clear stale ownership — the old tab IDs are gone. We'll re-add per-tab
// owners below as each saved tab gets a fresh ID. Without this reset, old
// tabId → clientId entries would linger and match new tabs with the same
// sequential IDs, silently granting ownership to the wrong clients.
this.tabOwnership.clear();
// Re-create pages
let activeId: number | null = null;
for (const saved of state.pages) {
const page = await this.context.newPage();
const id = this.nextTabId++;
this.pages.set(id, page);
const newSession = new TabSession(page);
this.tabSessions.set(id, newSession);
this.wirePageEvents(page);
// Restore tab ownership for the new ID — preserves scoped-agent isolation
// across context recreation (viewport --scale, user-agent change, handoff).
if (saved.owner) {
this.tabOwnership.set(id, saved.owner);
}
if (saved.loadedHtml) {
// Replay load-html content via setTabContent — this rehydrates
// TabSession.loadedHtml so the next saveState sees it. page.setContent()
// alone would restore the DOM but lose the replay metadata.
try {
await newSession.setTabContent(saved.loadedHtml, { waitUntil: saved.loadedHtmlWaitUntil });
} catch (err: any) {
console.warn(`[browse] Failed to replay loadedHtml for tab ${id}: ${err.message}`);
}
} else if (saved.url) {
// Validate the saved URL before navigating — the state file is user-writable and
// a tampered URL could navigate to cloud metadata endpoints. Use the normalized
// return value so file:// forms get consistent treatment with live goto.
let normalizedUrl: string;
try {
normalizedUrl = await validateNavigationUrl(saved.url);
} catch (err: any) {
console.warn(`[browse] Skipping invalid URL in state file: ${saved.url}${err.message}`);
continue;
}
await page.goto(normalizedUrl, { waitUntil: 'domcontentloaded', timeout: 15000 }).catch(() => {});
}
if (saved.storage) {
try {
await page.evaluate((s: { localStorage: Record<string, string>; sessionStorage: Record<string, string> }) => {
if (s.localStorage) {
for (const [k, v] of Object.entries(s.localStorage)) {
localStorage.setItem(k, v);
}
}
if (s.sessionStorage) {
for (const [k, v] of Object.entries(s.sessionStorage)) {
sessionStorage.setItem(k, v);
}
}
}, saved.storage);
} catch {}
}
if (saved.isActive) activeId = id;
}
// If no pages were saved, create a blank one
if (this.pages.size === 0) {
await this.newTab();
} else {
this.activeTabId = activeId ?? [...this.pages.keys()][0];
}
// Clear refs — pages are new, locators are stale
this.clearRefs();
}
/**
* Recreate the browser context to apply user agent changes.
* Saves and restores cookies, localStorage, sessionStorage, and open pages.
* Falls back to a clean slate on any failure.
*/
async recreateContext(): Promise<string | null> {
if (this.connectionMode === 'headed') {
throw new Error('Cannot recreate context in headed mode. Use disconnect first.');
}
if (!this.browser || !this.context) {
throw new Error('Browser not launched');
}
try {
// 1. Save state
const state = await this.saveState();
// 2. Close old pages and context
for (const page of this.pages.values()) {
await page.close().catch(() => {});
}
this.pages.clear();
this.tabSessions.clear();
await this.context.close().catch(() => {});
// 3. Create new context with updated settings
const contextOptions: BrowserContextOptions = {
viewport: { width: this.currentViewport.width, height: this.currentViewport.height },
deviceScaleFactor: this.deviceScaleFactor,
};
if (this.customUserAgent) {
contextOptions.userAgent = this.customUserAgent;
}
this.context = await this.browser.newContext(contextOptions);
if (Object.keys(this.extraHeaders).length > 0) {
await this.context.setExtraHTTPHeaders(this.extraHeaders);
}
// 4. Restore state
await this.restoreState(state);
return null; // success
} catch (err: unknown) {
// Fallback: create a clean context + blank tab
try {
this.pages.clear();
this.tabSessions.clear();
if (this.context) await this.context.close().catch(() => {});
const contextOptions: BrowserContextOptions = {
viewport: { width: this.currentViewport.width, height: this.currentViewport.height },
deviceScaleFactor: this.deviceScaleFactor,
};
if (this.customUserAgent) {
contextOptions.userAgent = this.customUserAgent;
}
this.context = await this.browser!.newContext(contextOptions);
await this.newTab();
this.clearRefs();
} catch {
// If even the fallback fails, we're in trouble — but browser is still alive
}
return `Context recreation failed: ${err instanceof Error ? err.message : String(err)}. Browser reset to blank tab.`;
}
}
/**
* Change deviceScaleFactor + viewport size atomically.
*
* deviceScaleFactor is a context-level option, so Playwright requires a full context
* recreation. This method validates the input, stores the new values, calls
* recreateContext(), and rolls back the fields on failure so a bad call doesn't
* leave the manager in an inconsistent state.
*
* Returns null on success, or an error string if the new context couldn't be built
* (state may have been lost, per recreateContext's fallback behavior).
*/
async setDeviceScaleFactor(scale: number, width: number, height: number): Promise<string | null> {
if (!Number.isFinite(scale)) {
throw new Error(`viewport --scale: value must be a finite number, got ${scale}`);
}
if (scale < 1 || scale > 3) {
throw new Error(`viewport --scale: value must be between 1 and 3 (gstack policy cap), got ${scale}`);
}
if (this.connectionMode === 'headed') {
throw new Error('viewport --scale is not supported in headed mode — scale is controlled by the real browser window.');
}
const prevScale = this.deviceScaleFactor;
const prevViewport = { ...this.currentViewport };
this.deviceScaleFactor = scale;
this.currentViewport = { width, height };
const err = await this.recreateContext();
if (err !== null) {
// recreateContext's fallback path built a blank context using the NEW scale +
// viewport (the fields we just set). Rolling the fields back without a second
// recreate would leave the live context at new-scale while state says old-scale.
// Roll back fields FIRST, then force a second recreate against the old values
// so live state matches tracked state.
this.deviceScaleFactor = prevScale;
this.currentViewport = prevViewport;
const rollbackErr = await this.recreateContext();
if (rollbackErr !== null) {
// Second recreate also failed — we're in a clean blank slate via fallback, but
// with old scale. Return the original error so the caller sees the primary failure.
return `${err} (rollback also encountered: ${rollbackErr})`;
}
return err;
}
return null;
}
/** Read current deviceScaleFactor (for tests + debug). */
getDeviceScaleFactor(): number {
return this.deviceScaleFactor;
}
/** Read current tracked viewport (for tests + `viewport --scale` size fallback). */
getCurrentViewport(): { width: number; height: number } {
return { ...this.currentViewport };
}
// ─── Handoff: Headless → Headed ─────────────────────────────
/**
* Hand off browser control to the user by relaunching in headed mode.
*
* Flow (launch-first-close-second for safe rollback):
* 1. Save state from current headless browser
* 2. Launch NEW headed browser
* 3. Restore state into new browser
* 4. Close OLD headless browser
* If step 2 fails → return error, headless browser untouched
*/
async handoff(message: string): Promise<string> {
if (this.connectionMode === 'headed' || this.isHeaded) {
return `HANDOFF: Already in headed mode at ${this.getCurrentUrl()}`;
}
if (!this.browser || !this.context) {
throw new Error('Browser not launched');
}
// 1. Save state from current browser
const state = await this.saveState();
const currentUrl = this.getCurrentUrl();
// 2. Launch new headed browser with extension (same as launchHeaded)
// Uses launchPersistentContext so the extension auto-loads.
let newContext: BrowserContext;
try {
const fs = require('fs');
const path = require('path');
const extensionPath = this.findExtensionPath();
const launchArgs = ['--hide-crash-restore-bubble'];
if (extensionPath) {
launchArgs.push(`--disable-extensions-except=${extensionPath}`);
launchArgs.push(`--load-extension=${extensionPath}`);
// Auth token is served via /health endpoint now (no file write needed).
// Extension reads token from /health on connect.
console.log(`[browse] Handoff: loading extension from ${extensionPath}`);
} else {
console.log('[browse] Handoff: extension not found — headed mode without side panel');
}
const userDataDir = path.join(process.env.HOME || '/tmp', '.gstack', 'chromium-profile');
fs.mkdirSync(userDataDir, { recursive: true });
newContext = await chromium.launchPersistentContext(userDataDir, {
headless: false,
args: launchArgs,
viewport: null,
...(this.proxyConfig ? { proxy: this.proxyConfig } : {}),
ignoreDefaultArgs: [
'--disable-extensions',
'--disable-component-extensions-with-background-pages',
],
timeout: 15000,
});
} catch (err: unknown) {
const msg = err instanceof Error ? err.message : String(err);
return `ERROR: Cannot open headed browser — ${msg}. Headless browser still running.`;
}
// 3. Restore state into new headed browser
try {
// Swap to new browser/context before restoreState (it uses this.context)
const oldBrowser = this.browser;
this.context = newContext;
this.browser = newContext.browser();
this.pages.clear();
this.tabSessions.clear();
this.connectionMode = 'headed';
if (Object.keys(this.extraHeaders).length > 0) {
await newContext.setExtraHTTPHeaders(this.extraHeaders);
}
// Register crash handler on new browser
if (this.browser) {
this.browser.on('disconnected', () => {
if (this.intentionalDisconnect) return;
console.error('[browse] FATAL: Chromium process crashed or was killed. Server exiting.');
process.exit(1);
});
}
await this.restoreState(state);
this.isHeaded = true;
this.dialogAutoAccept = false; // User controls dialogs in headed mode
// 4. Close old headless browser (fire-and-forget)
oldBrowser.removeAllListeners('disconnected');
oldBrowser.close().catch(() => {});
return [
`HANDOFF: Browser opened at ${currentUrl}`,
`MESSAGE: ${message}`,
`STATUS: Waiting for user. Run 'resume' when done.`,
].join('\n');
} catch (err: unknown) {
// Restore failed — close the new context, keep old state
await newContext.close().catch(() => {});
const msg = err instanceof Error ? err.message : String(err);
return `ERROR: Handoff failed during state restore — ${msg}. Headless browser still running.`;
}
}
/**
* Resume AI control after user handoff.
* Clears stale refs and resets failure counter.
* The meta-command handler calls handleSnapshot() after this.
*/
resume(): void {
// Clear refs and frame on the active session
try {
const session = this.getActiveSession();
session.clearRefs();
session.setFrame(null);
} catch {}
this.resetFailures();
}
getIsHeaded(): boolean {
return this.isHeaded;
}
// ─── Auto-handoff Hint (consecutive failure tracking) ───────
incrementFailures(): void {
this.consecutiveFailures++;
}
resetFailures(): void {
this.consecutiveFailures = 0;
}
getFailureHint(): string | null {
if (this.consecutiveFailures >= 3 && !this.isHeaded) {
return `HINT: ${this.consecutiveFailures} consecutive failures. Consider using 'handoff' to let the user help.`;
}
return null;
}
// ─── Console/Network/Dialog/Ref Wiring ────────────────────
private wirePageEvents(page: Page) {
// Track tab close — remove from pages and sessions maps, switch to another tab
page.on('close', () => {
for (const [id, p] of this.pages) {
if (p === page) {
this.pages.delete(id);
this.tabSessions.delete(id);
console.log(`[browse] Tab closed (id=${id}, remaining=${this.pages.size})`);
// If the closed tab was active, switch to another
if (this.activeTabId === id) {
const remaining = [...this.pages.keys()];
this.activeTabId = remaining.length > 0 ? remaining[remaining.length - 1] : 0;
}
break;
}
}
});
// Clear ref map on navigation — refs point to stale elements after page change
// (lastSnapshot is NOT cleared — it's a text baseline for diffing)
page.on('framenavigated', (frame) => {
if (frame === page.mainFrame()) {
// Find the TabSession for this page and clear its per-tab state
for (const session of this.tabSessions.values()) {
if (session.page === page) {
session.onMainFrameNavigated();
break;
}
}
}
});
// ─── Dialog auto-handling (prevents browser lockup) ─────
page.on('dialog', async (dialog) => {
const entry: DialogEntry = {
timestamp: Date.now(),
type: dialog.type(),
message: dialog.message(),
defaultValue: dialog.defaultValue() || undefined,
action: this.dialogAutoAccept ? 'accepted' : 'dismissed',
response: this.dialogAutoAccept ? (this.dialogPromptText ?? undefined) : undefined,
};
addDialogEntry(entry);
try {
if (this.dialogAutoAccept) {
await dialog.accept(this.dialogPromptText ?? undefined);
} else {
await dialog.dismiss();
}
} catch {
// Dialog may have been dismissed by navigation
}
});
page.on('console', (msg) => {
addConsoleEntry({
timestamp: Date.now(),
level: msg.type(),
text: msg.text(),
});
});
page.on('request', (req) => {
addNetworkEntry({
timestamp: Date.now(),
method: req.method(),
url: req.url(),
});
});
page.on('response', (res) => {
// Find matching request entry and update it (backward scan)
const url = res.url();
const status = res.status();
for (let i = networkBuffer.length - 1; i >= 0; i--) {
const entry = networkBuffer.get(i);
if (entry && entry.url === url && !entry.status) {
networkBuffer.set(i, { ...entry, status, duration: Date.now() - entry.timestamp });
break;
}
}
});
// Capture response sizes via response finished
page.on('requestfinished', async (req) => {
try {
const res = await req.response();
if (res) {
const url = req.url();
const body = await res.body().catch(() => null);
const size = body ? body.length : 0;
for (let i = networkBuffer.length - 1; i >= 0; i--) {
const entry = networkBuffer.get(i);
if (entry && entry.url === url && !entry.size) {
networkBuffer.set(i, { ...entry, size });
break;
}
}
}
} catch {}
});
}
}