mirror of
https://github.com/garrytan/gstack.git
synced 2026-05-01 11:17:50 +02:00
ed1e4be2f6
* build: vendor xterm@5 for the Terminal sidebar tab
Adds xterm@5 + xterm-addon-fit as devDependencies and a `vendor:xterm`
build step that copies the assets into `extension/lib/` at build time.
The vendored files are .gitignored so the npm version stays the source
of truth. xterm@5 is eval-free, so no MV3 CSP changes needed.
No runtime callers yet — this just stages the assets.
* feat(server): add pty-session-cookie module for the Terminal tab
Mirrors `sse-session-cookie.ts` exactly. Mints short-lived 30-min HttpOnly
cookies for authenticating the Terminal-tab WebSocket upgrade against
the terminal-agent. Same TTL, same opportunistic-pruning shape, same
"scoped tokens never valid as root" invariant. Two registries instead of
one because the cookie names are different (`gstack_sse` vs `gstack_pty`)
and the token spaces must not overlap.
No callers yet — wired up in the next commit.
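A minimal sketch of the mint/validate shape described above, assuming an in-memory map keyed by token. The names `mintPtyToken`, `validatePtyToken`, and `ptyTokens` are hypothetical; the real module's exports are not shown in this log.

```typescript
import { randomBytes } from "node:crypto";

// 30-minute lifetime, as stated in the commit message.
const PTY_TOKEN_TTL_MS = 30 * 60 * 1000;

// token -> expiry timestamp in ms since epoch (hypothetical registry layout).
const ptyTokens = new Map<string, number>();

/** Drop expired tokens. Called opportunistically from mint, not on a timer. */
function pruneExpired(now: number): void {
  ptyTokens.forEach((expiresAt, token) => {
    if (expiresAt <= now) ptyTokens.delete(token);
  });
}

/** Mint a fresh random token that stays valid for PTY_TOKEN_TTL_MS. */
function mintPtyToken(now: number = Date.now()): string {
  pruneExpired(now);
  const token = randomBytes(32).toString("hex");
  ptyTokens.set(token, now + PTY_TOKEN_TTL_MS);
  return token;
}

/** True only for a token minted by this registry that has not expired. */
function validatePtyToken(token: string | undefined, now: number = Date.now()): boolean {
  if (!token) return false;
  const expiresAt = ptyTokens.get(token);
  if (expiresAt === undefined) return false;
  if (expiresAt <= now) {
    ptyTokens.delete(token);
    return false;
  }
  return true;
}
```

Keeping this registry separate from the SSE one means a `gstack_sse` token can never pass `validatePtyToken`, which is the non-overlap property the message calls out.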
* feat(server): add terminal-agent.ts (PTY for the Terminal sidebar tab)
Translates phoenix gbrowser's Go PTY (cmd/gbd/terminal.go) into a Bun
non-compiled process. Lives separately from `sidebar-agent.ts` so a
WS-framing or PTY-cleanup bug can't take down the chat path (codex
outside-voice review caught the coupling risk).
Architecture:
- Bun.serve on 127.0.0.1:0 (never tunneled).
- POST /internal/grant accepts cookie tokens from the parent server over
loopback, authenticated with a per-boot internal token.
- GET /ws upgrades require BOTH (a) Origin: chrome-extension://<id> and
(b) the gstack_pty cookie minted by /pty-session. Either gate alone is
insufficient (CSWSH defense + auth defense).
- Lazy spawn: claude PTY is not started until the WS receives its first
data frame. Idle sidebar opens cost nothing.
- Bun PTY API: `terminal: { rows, cols, data(t, chunk) }` — verified at
impl time on Bun 1.3.10. proc.terminal.write() for input,
proc.terminal.resize() for resize, proc.kill() + 3s SIGKILL fallback
on close.
- process.on('uncaughtException'|'unhandledRejection') handlers so a
framing bug logs but doesn't kill the listener loop.
Test-only `BROWSE_TERMINAL_BINARY` env override lets the integration
tests spawn /bin/bash instead of requiring claude on every CI runner.
Not yet spawned by anything — wired in the next commit.
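The two-gate /ws check can be sketched as a pure function. This is an illustration under assumptions, not the agent's code: `checkWsUpgrade` and `readCookie` are hypothetical names, the Origin check here only tests the scheme prefix (the real gate may pin a specific extension id), and returning 401 for an invalid cookie is assumed by analogy with the missing-cookie case.

```typescript
type GateResult = { ok: true } | { ok: false; status: 401 | 403 };

/** Pull one cookie value out of a Cookie header; undefined if absent. */
function readCookie(header: string | null, name: string): string | undefined {
  if (!header) return undefined;
  for (const part of header.split(";")) {
    const eq = part.indexOf("=");
    if (eq === -1) continue;
    if (part.slice(0, eq).trim() === name) return part.slice(eq + 1).trim();
  }
  return undefined;
}

/** Both gates must pass; either one alone is insufficient. */
function checkWsUpgrade(
  origin: string | null,
  cookieHeader: string | null,
  isValidToken: (token: string) => boolean,
): GateResult {
  // Gate (a): CSWSH defense. A web page cannot forge a chrome-extension:// Origin.
  if (!origin || origin.indexOf("chrome-extension://") !== 0) {
    return { ok: false, status: 403 };
  }
  // Gate (b): auth defense. The caller must hold a minted gstack_pty token.
  const token = readCookie(cookieHeader, "gstack_pty");
  if (!token || !isValidToken(token)) {
    return { ok: false, status: 401 };
  }
  return { ok: true };
}
```

Checking Origin first means an unauthenticated cross-site probe learns nothing about whether a cookie would have been accepted.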
* feat(server): wire /pty-session route + spawn terminal-agent
Server-side glue connecting the Terminal sidebar tab to the new
terminal-agent process.
server.ts:
- New POST /pty-session route. Validates AUTH_TOKEN, mints a gstack_pty
HttpOnly cookie via pty-session-cookie.ts, posts the cookie value to
the agent's loopback /internal/grant. Returns the terminalPort + Set-Cookie
to the extension.
- /health response gains `terminalPort` (just the port number — never a
shell token). Tokens flow via the cookie path, never /health, because
/health already surfaces AUTH_TOKEN to localhost callers in headed mode
(that's a separate v1.1+ TODO).
- /pty-session and /terminal/* are deliberately NOT added to TUNNEL_PATHS,
so the dual-listener tunnel surface 404s by default-deny.
- Shutdown path now also pkills terminal-agent and unlinks its state files
(terminal-port + terminal-internal-token) so a reconnect doesn't try to
hit a dead port.
cli.ts:
- After spawning sidebar-agent.ts, also spawn terminal-agent.ts. Same
pattern: pkill old instances, Bun.spawn(['bun', 'run', script]) with
BROWSE_STATE_FILE + BROWSE_SERVER_PORT env. Non-fatal if the spawn
fails — chat still works without the terminal agent.
* feat(extension): Terminal as default sidebar tab
Adds a primary tab bar (Terminal | Chat) above the existing tab-content
panes. Terminal is the default-active tab; clicking Chat returns to the
existing claude -p one-shot flow which is preserved verbatim.
manifest.json: adds ws://127.0.0.1:*/ to host_permissions so MV3 doesn't
block the WebSocket upgrade.
sidepanel.html: new primary-tabs nav, new #tab-terminal pane with a
"Press any key to start Claude Code" bootstrap card, claude-not-found
install card, xterm mount point, and "session ended" restart UI. Loads
xterm.js + xterm-addon-fit + sidepanel-terminal.js. tab-chat is no
longer the .active default.
sidepanel.js: new activePrimaryPaneId() helper that reads which primary
tab is selected. Debug-close paths now route back to whichever primary
pane is active (was hardcoded to tab-chat). Primary-tab click handler
toggles .active classes and aria-selected. window.gstackServerPort and
window.gstackAuthToken exposed so sidepanel-terminal.js can build the
/pty-session POST and the WS URL.
sidepanel-terminal.js (new): xterm.js lifecycle. Lazy-spawn — first
keystroke fires POST /pty-session, then opens
ws://127.0.0.1:<terminalPort>/ws. Origin + cookie are set automatically
by the browser. A ResizeObserver sends {type:"resize"} text frames.
Also wires tab-switch hooks, the restart button, and install-card retry.
On WS close shows "Session ended, click to restart" — no auto-reconnect
(codex outside-voice flagged that as session-burning).
sidepanel.css: primary-tabs bar + Terminal pane styling (full-height
xterm container, install card, ended state).
* test: terminal-agent + cookie module + sidebar default-tab regression
Three new test files:
terminal-agent.test.ts (16 tests): pty-session-cookie mint/validate/
revoke, Set-Cookie shape (HttpOnly + SameSite=Strict + Path=/, NO Secure
since 127.0.0.1 over HTTP), source-level guards that /pty-session and
/terminal/* are NOT in TUNNEL_PATHS, /health does NOT surface ptyToken
or gstack_pty, terminal-agent binds 127.0.0.1, /ws upgrade enforces
chrome-extension:// Origin AND gstack_pty cookie, lazy-spawn invariant
(spawnClaude is called from message handler, not upgrade), uncaughtException/
unhandledRejection handlers exist, SIGINT-then-SIGKILL cleanup.
terminal-agent-integration.test.ts (7 tests): spawns the agent as a real
subprocess in a tmp state dir. Verifies /internal/grant accepts/rejects
the loopback token, /ws gates (no Origin → 403, bad Origin → 403, no
cookie → 401), real WebSocket round-trip with /bin/bash via the
BROWSE_TERMINAL_BINARY override (write 'echo hello-pty-world\n', read it
back), and resize message acceptance.
sidebar-tabs.test.ts (13 tests): structural regression suite locking the
load-bearing invariants of the default-tab change — Terminal is .active,
Chat is not, xterm assets are loaded, debug-close path no longer hardcodes
tab-chat (uses activePrimaryPaneId), primary-tab click handler exists,
chat surface is not accidentally deleted, terminal JS does NOT auto-
reconnect on close, manifest declares ws:// + http:// localhost host
permissions, no unsafe-eval.
Plan called for Playwright + extension regression; the codebase doesn't
ship Playwright extension launcher infra, so we follow the existing
extension-test pattern (source-level structural assertions). Same
load-bearing intent — locks the invariants before they regress.
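For reference, the Set-Cookie shape those assertions lock could be built roughly like this. `buildPtySetCookie` is a hypothetical helper, and the `Max-Age` attribute is an assumption derived from the 30-minute TTL; only HttpOnly, SameSite=Strict, Path=/, and the absence of Secure come from the test description.

```typescript
/** Build the Set-Cookie value for the PTY session cookie (illustrative only). */
function buildPtySetCookie(token: string, maxAgeSeconds: number = 30 * 60): string {
  return [
    `gstack_pty=${token}`,
    "HttpOnly", // not readable from page JavaScript
    "SameSite=Strict", // never attached to cross-site requests
    "Path=/",
    `Max-Age=${maxAgeSeconds}`, // assumption: mirrors the 30-minute token TTL
    // No "Secure": the server listens on plain HTTP at 127.0.0.1.
  ].join("; ");
}
```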
* docs: Terminal flow + threat model + v1.1 follow-ups
SIDEBAR_MESSAGE_FLOW.md: new "Terminal flow" section. Documents the WS
upgrade path (/pty-session cookie mint → /ws Origin + cookie gate →
lazy claude spawn), the dual-token model (AUTH_TOKEN for /pty-session,
gstack_pty cookie for /ws, INTERNAL_TOKEN for server↔agent loopback),
and the threat-model boundary — the Terminal tab bypasses the entire
prompt-injection security stack on purpose; user keystrokes are the
trust source. That trust assumption is load-bearing on three transport
guarantees: local-only listener, Origin gate, cookie auth. Drop any
one of those three and the tab becomes unsafe.
CLAUDE.md: extends the "Sidebar architecture" note to include
terminal-agent.ts in the read-this-first list. Adds a "Terminal tab is
its own process" note so a future contributor doesn't bolt PTY logic
onto sidebar-agent.ts.
TODOS.md: three new follow-ups under a new "Sidebar Terminal" section:
- v1.1: PTY session survives sidebar reload (Issue 1C deferred).
- v1.1+: audit /health AUTH_TOKEN distribution (codex finding #2 —
a pre-existing soft leak that cc-pty-import sidesteps but doesn't
fix).
- v1.1+: apply terminal-agent's process.on exception handlers to
sidebar-agent.ts (codex finding #4 — chat path has no fatal
handlers).
* feat(extension): Terminal-only sidebar — auth fix, UX polish, chat rip
The chat queue path is gone. The Chrome side panel is now just an
interactive claude PTY in xterm.js. Activity / Refs / Inspector still
exist behind the `debug` toggle in the footer.
Three threads of change, all from dogfood iteration on top of
cc-pty-import:
1. fix(server): cross-port WS auth via Sec-WebSocket-Protocol
- Browsers can't set Authorization on a WebSocket upgrade. We had
been minting an HttpOnly gstack_pty cookie via /pty-session, but
SameSite=Strict cookies don't survive the cross-port jump from
server.ts:34567 to the agent's random port from a chrome-extension
origin. The WS opened then immediately closed → "Session ended."
- /pty-session now also returns ptySessionToken in the JSON body.
- Extension calls `new WebSocket(url, [`gstack-pty.<token>`])`.
Browser sends Sec-WebSocket-Protocol on the upgrade.
- Agent reads the protocol header, validates against validTokens,
and MUST echo the protocol back (Chromium closes the connection
immediately if a server doesn't pick one of the offered protocols).
- Cookie path is kept as a fallback for non-browser callers (curl,
integration tests).
- New integration test exercises the full protocol-auth round-trip
via raw fetch+Upgrade so a future regression of this exact class
fails in CI.
2. fix(extension): UX polish on the Terminal pane
- Eager auto-connect when the sidebar opens — no "Press any key to
start" friction every reload.
- Always-visible ↻ Restart button in the terminal toolbar (not
gated on the ENDED state) so the user can force a fresh claude
mid-session.
- MutationObserver on #tab-terminal's class attribute drives a
fitAddon.fit() + term.refresh() when the pane becomes visible
again — xterm doesn't auto-redraw after display:none → display:flex.
3. feat(extension): rip the chat tab + sidebar-agent.ts
- Sidebar is Terminal-only. No more Terminal | Chat primary nav.
- sidebar-agent.ts deleted. /sidebar-command, /sidebar-chat,
/sidebar-agent/event, /sidebar-tabs* and friends all deleted.
- The pickSidebarModel router (sonnet vs opus) is gone — the live
PTY uses whatever model the user's `claude` CLI is configured with.
- Quick-actions (🧹 Cleanup / 📸 Screenshot / 🍪 Cookies) survive
in the Terminal toolbar. Cleanup now injects its prompt into the
live PTY via window.gstackInjectToTerminal — no more
/sidebar-command POST. The Inspector "Send to Code" action uses
the same injection path.
- clear-chat button removed from the footer.
- sidepanel.js shed ~900 lines of chat polling, optimistic UI,
stop-agent, etc.
Net diff: -3.4k lines across 16 files. CLAUDE.md, TODOS.md, and
docs/designs/SIDEBAR_MESSAGE_FLOW.md rewritten to match. The sidebar
regression test (browse/test/sidebar-tabs.test.ts) is rewritten as 27
structural assertions locking the new layout — Terminal sole pane,
no chat input, quick-actions in toolbar, eager-connect, MutationObserver
repaint, restart helper.
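The server half of the Sec-WebSocket-Protocol handshake from thread 1 can be sketched as below. `selectPtyProtocol` is a hypothetical name and the token validator is passed in; the `gstack-pty.` prefix is taken from the message above.

```typescript
const PTY_PROTOCOL_PREFIX = "gstack-pty.";

/**
 * Given the raw Sec-WebSocket-Protocol request header, return the exact
 * subprotocol string to echo back, or null if no offered protocol carries
 * a valid token.
 */
function selectPtyProtocol(
  headerValue: string | null,
  isValidToken: (token: string) => boolean,
): string | null {
  if (!headerValue) return null;
  // The header is a comma-separated list of offered subprotocols.
  for (const offered of headerValue.split(",")) {
    const protocol = offered.trim();
    if (protocol.indexOf(PTY_PROTOCOL_PREFIX) !== 0) continue;
    const token = protocol.slice(PTY_PROTOCOL_PREFIX.length);
    if (token && isValidToken(token)) return protocol;
  }
  return null;
}
```

A non-null result would be sent back verbatim in the response's Sec-WebSocket-Protocol header; a null result would fall through to the cookie path or be rejected. On the client side, the matching call is the `new WebSocket(url, [...])` form quoted above.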
* feat: live tab awareness for the Terminal pane
claude in the PTY now has continuous tab-aware context. Three pieces:
1. Live state files. background.js listens to chrome.tabs.onActivated /
onCreated / onRemoved / onUpdated (throttled to URL/title/status==
complete so loading spinners don't spam) and pushes a snapshot. The
sidepanel relays it as a custom event; sidepanel-terminal.js sends
{type:"tabState"} text frames over the live PTY WebSocket.
terminal-agent.ts writes:
<stateDir>/tabs.json all open tabs (id, url, title, active,
pinned, audible, windowId)
<stateDir>/active-tab.json current active tab (skips chrome:// and
chrome-extension:// internal pages)
Atomic write via tmp + rename so claude never reads a half-written
document. A fresh snapshot is pushed on WS open so the files exist by
the time claude finishes booting.
2. New $B tab-each <command> [args...] meta-command. Fans out a single
command across every open tab, returns
{command, args, total, results: [{tabId, url, title, status, output}]}.
Skips chrome:// pages; restores the originally active tab in a finally
block (so a mid-batch error doesn't leave the user looking at a
different tab); uses bringToFront: false so the OS window doesn't
jump on every fanout. Scope-checks the inner command BEFORE the loop.
3. --append-system-prompt hint at spawn time. Claude is told about both
the state files and the $B tab-each command up front, so it doesn't
have to discover the surface by trial. Passed via the --append-system-
prompt CLI flag, NOT as a leading PTY write — the hint stays out of
the visible transcript.
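The tmp + rename write from piece 1 can be sketched as follows. `writeJsonAtomic` is a hypothetical helper name and the temp-file naming scheme is an assumption; the technique is the standard one of writing a sibling temp file and renaming it over the target.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

/**
 * Write JSON so a concurrent reader sees either the previous file or the
 * new one, never a half-written document.
 */
function writeJsonAtomic(filePath: string, value: unknown): void {
  const dir = path.dirname(filePath);
  // Keep the temp file in the same directory so the rename stays on one filesystem.
  const tmpPath = path.join(dir, `.${path.basename(filePath)}.${process.pid}.tmp`);
  fs.writeFileSync(tmpPath, JSON.stringify(value, null, 2));
  fs.renameSync(tmpPath, filePath); // replaces the target in a single step
}

// Usage with a throwaway state directory (stand-in for <stateDir>).
const stateDir = fs.mkdtempSync(path.join(os.tmpdir(), "gstack-state-"));
writeJsonAtomic(path.join(stateDir, "active-tab.json"), {
  id: 7,
  url: "https://example.com/",
  title: "Example",
});
```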
Tests:
- browse/test/tab-each.test.ts (new) — registration + source-level
invariants (scope check before loop, finally-restore, bringToFront:false,
chrome:// skip) + behavior tests with a mock BrowserManager that verify
iteration order, JSON shape, error handling, and active-tab restore.
- browse/test/terminal-agent.test.ts — three new assertions for
tabState handler shape, atomic-write pattern, and the
--append-system-prompt wiring at spawn.
Verified live: opened 5 tabs, ran $B tab-each url against the live
server, got per-tab JSON results back, original active tab restored
without OS focus stealing.
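The fan-out shape of tab-each can be sketched against a mock host. Everything here is illustrative: `TabHost`, `tabEach`, and the "ok"/"error" status values are assumptions, not the real BrowserManager API. Only the result shape, the chrome:// skip, the scope check before the loop, the finally-restore, and `bringToFront: false` come from the description above.

```typescript
interface Tab { id: number; url: string; title: string }
interface TabResult { tabId: number; url: string; title: string; status: "ok" | "error"; output: string }
interface TabEachReport { command: string; args: string[]; total: number; results: TabResult[] }

// Hypothetical stand-in for the browser manager.
interface TabHost {
  listTabs(): Tab[];
  activeTabId(): number;
  switchTab(id: number, opts: { bringToFront: boolean }): Promise<void>;
}

async function tabEach(
  host: TabHost,
  command: string,
  args: string[],
  checkScope: (command: string) => void, // throws if the command is not allowed
  run: (command: string, args: string[], tab: Tab) => Promise<string>,
): Promise<TabEachReport> {
  checkScope(command); // before the loop, so nothing runs on a denied command
  const originalId = host.activeTabId();
  const tabs = host.listTabs().filter((t) => t.url.indexOf("chrome://") !== 0);
  const results: TabResult[] = [];
  try {
    for (const tab of tabs) {
      await host.switchTab(tab.id, { bringToFront: false });
      try {
        const output = await run(command, args, tab);
        results.push({ tabId: tab.id, url: tab.url, title: tab.title, status: "ok", output });
      } catch (err) {
        const output = err instanceof Error ? err.message : String(err);
        results.push({ tabId: tab.id, url: tab.url, title: tab.title, status: "error", output });
      }
    }
  } finally {
    // Runs even if switching tabs throws mid-batch.
    await host.switchTab(originalId, { bringToFront: false });
  }
  return { command, args, total: results.length, results };
}
```

Recording per-tab failures as results, rather than aborting, keeps one broken tab from hiding the output of the others.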
* chore: drop sidebar-agent test refs after chat rip
Six test files / describe blocks targeted the deleted chat path:
- browse/test/security-e2e-fullstack.test.ts (full-stack chat-pipeline E2E
with mock claude — whole file gone)
- browse/test/security-review-fullstack.test.ts (review-flow E2E with real
classifier — whole file gone)
- browse/test/security-review-sidepanel-e2e.test.ts (Playwright E2E for
the security event banner that was ripped from sidepanel.html)
- browse/test/security-audit-r2.test.ts (5 describe blocks: agent queue
permissions, isValidQueueEntry stateFile traversal, loadSession session-ID
validation, switchChatTab DocumentFragment, pollChat reentrancy guard,
/sidebar-tabs URL sanitization, sidebar-agent SIGTERM→SIGKILL escalation,
AGENT_SRC top-level read converted to graceful fallback)
- browse/test/security-adversarial-fixes.test.ts (canary stream-chunk split
detection on detectCanaryLeak; one tool-output test on sidebar-agent)
- test/skill-validation.test.ts (sidebar agent #584 describe block)
These all assumed sidebar-agent.ts existed and tested chat-queue plumbing,
chat-tab DOM round-trip, chat-polling reentrancy, or per-message classifier
canary detection. With the live PTY there is no chat queue, no chat tab,
no LLM stream to canary-scan, and no per-message subprocess. The Terminal
pane's invariants are covered by the new browse/test/sidebar-tabs.test.ts
(27 structural assertions), browse/test/terminal-agent.test.ts, and
browse/test/terminal-agent-integration.test.ts.
bun test → exit 0, 0 failures.
* chore: bump version and changelog (v1.14.0.0)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(extension): xterm fills the full Terminal panel height
The Terminal pane only rendered into the top portion of the panel — most
of the panel below the prompt was an empty black gap. Three layered
issues, all about xterm.js measuring dimensions during a layout state
that wasn't ready yet:
1. order-of-operations in connect(): ensureXterm() ran BEFORE
setState(LIVE), so term.open() measured els.mount while it was still
display:none. xterm caches a 0-size viewport synchronously inside
open() and never auto-recovers when the container goes visible.
Flipped: setState(LIVE) → ensureXterm.
2. first fit() ran synchronously before the browser had applied the
.active class transition. Wrapped in requestAnimationFrame so layout
has settled before fit() reads clientHeight.
3. CSS flex-overflow trap: .terminal-mount has flex:1 inside the
flex-column #tab-terminal, but .tab-content's `overflow-y: auto` and
the lack of `min-height: 0` on .terminal-mount meant the item
couldn't shrink below content size. flex:1 then refused to expand
into available space and xterm rendered into whatever its initial
2x2 measurement happened to be.
Fixes:
- extension/sidepanel-terminal.js: reorder + RAF fit
- extension/sidepanel.css: .terminal-mount gets `flex: 1 1 0` +
`min-height: 0` + `position: relative`. #tab-terminal overrides
.tab-content's `overflow-y: auto` to `overflow: hidden` (xterm has
its own viewport scroll; the parent shouldn't compete) and explicitly
re-declares `display: flex; flex-direction: column` for #tab-terminal.active.
bun test browse/test/sidebar-tabs.test.ts → 27/27 pass.
Manually verified: side panel opens → Terminal fills full panel height,
xterm scrollback works, debug-tab toggle still repaints correctly.
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
535 lines
24 KiB
TypeScript
/**
 * Security audit round-2 tests: static source checks + behavioral verification.
 *
 * These tests verify that security fixes are present at the source level and
 * behave correctly at runtime. Source-level checks guard against regressions
 * that could silently remove a fix without breaking compilation.
 */

import { describe, it, expect, beforeAll, afterAll } from 'bun:test';
import * as fs from 'fs';
import * as path from 'path';
import * as os from 'os';

// ─── Shared source reads (used across multiple test sections) ───────────────
const META_SRC = fs.readFileSync(path.join(import.meta.dir, '../src/meta-commands.ts'), 'utf-8');
const WRITE_SRC = fs.readFileSync(path.join(import.meta.dir, '../src/write-commands.ts'), 'utf-8');
const SERVER_SRC = fs.readFileSync(path.join(import.meta.dir, '../src/server.ts'), 'utf-8');
// sidebar-agent.ts was ripped (chat queue replaced by interactive PTY).
// AGENT_SRC kept as empty string so the legacy describe block below skips
// without crashing module load on a missing file.
const AGENT_SRC = (() => {
  try { return fs.readFileSync(path.join(import.meta.dir, '../src/sidebar-agent.ts'), 'utf-8'); }
  catch { return ''; }
})();
const SNAPSHOT_SRC = fs.readFileSync(path.join(import.meta.dir, '../src/snapshot.ts'), 'utf-8');
const PATH_SECURITY_SRC = fs.readFileSync(path.join(import.meta.dir, '../src/path-security.ts'), 'utf-8');

// ─── Helper ─────────────────────────────────────────────────────────────────

/**
 * Extract the source text between two string markers.
 */
function sliceBetween(src: string, startMarker: string, endMarker: string): string {
  const start = src.indexOf(startMarker);
  if (start === -1) return '';
  const end = src.indexOf(endMarker, start + startMarker.length);
  if (end === -1) return src.slice(start);
  return src.slice(start, end + endMarker.length);
}

/**
 * Extract a function body by name: finds `function name(` or `export function name(`
 * and returns the full balanced-brace block.
 */
function extractFunction(src: string, name: string): string {
  const pattern = new RegExp(`(?:export\\s+)?function\\s+${name}\\s*\\(`);
  const match = pattern.exec(src);
  if (!match) return '';
  let depth = 0;
  let inBody = false;
  const start = match.index;
  for (let i = start; i < src.length; i++) {
    if (src[i] === '{') { depth++; inBody = true; }
    else if (src[i] === '}') { depth--; }
    if (inBody && depth === 0) return src.slice(start, i + 1);
  }
  return src.slice(start);
}

// ─── Agent queue security ──────────────────────────────────────────────────
// Original block validated the chat queue's filesystem permissions and
// schema validator on sidebar-agent.ts. Both are gone (chat queue ripped
// in favor of the interactive Terminal PTY). The remaining 0o700 / 0o600
// invariants on extension queue paths are now covered by terminal-agent
// integration tests and the sidebar-tabs regression suite.

// ─── Shared source reads for CSS validator tests ────────────────────────────
const CDP_SRC = fs.readFileSync(path.join(import.meta.dir, '../src/cdp-inspector.ts'), 'utf-8');
const EXTENSION_SRC = fs.readFileSync(
  path.join(import.meta.dir, '../../extension/inspector.js'),
  'utf-8'
);

// ─── Task 2: Shared CSS value validator ─────────────────────────────────────

describe('Task 2: CSS value validator blocks dangerous patterns', () => {
  describe('source-level checks', () => {
    it('write-commands.ts style handler contains DANGEROUS_CSS url check', () => {
      const styleBlock = sliceBetween(WRITE_SRC, "case 'style':", "case 'cleanup'");
      expect(styleBlock).toMatch(/url\\s\*\\\(/);
    });

    it('write-commands.ts style handler blocks expression()', () => {
      const styleBlock = sliceBetween(WRITE_SRC, "case 'style':", "case 'cleanup'");
      expect(styleBlock).toMatch(/expression\\s\*\\\(/);
    });

    it('write-commands.ts style handler blocks @import', () => {
      const styleBlock = sliceBetween(WRITE_SRC, "case 'style':", "case 'cleanup'");
      expect(styleBlock).toContain('@import');
    });

    it('cdp-inspector.ts modifyStyle contains DANGEROUS_CSS url check', () => {
      const fn = extractFunction(CDP_SRC, 'modifyStyle');
      expect(fn).toBeTruthy();
      expect(fn).toMatch(/url\\s\*\\\(/);
    });

    it('cdp-inspector.ts modifyStyle blocks @import', () => {
      const fn = extractFunction(CDP_SRC, 'modifyStyle');
      expect(fn).toContain('@import');
    });

    it('extension injectCSS validates id format', () => {
      const fn = extractFunction(EXTENSION_SRC, 'injectCSS');
      expect(fn).toBeTruthy();
      // Should contain a regex test for valid id characters
      expect(fn).toMatch(/\^?\[a-zA-Z0-9_-\]/);
    });

    it('extension injectCSS blocks dangerous CSS patterns', () => {
      const fn = extractFunction(EXTENSION_SRC, 'injectCSS');
      expect(fn).toMatch(/url\\s\*\\\(/);
    });

    it('extension toggleClass validates className format', () => {
      const fn = extractFunction(EXTENSION_SRC, 'toggleClass');
      expect(fn).toBeTruthy();
      expect(fn).toMatch(/\^?\[a-zA-Z0-9_-\]/);
    });
  });
});

// ─── Task 1: Harden validateOutputPath to use realpathSync ──────────────────

describe('Task 1: validateOutputPath uses realpathSync', () => {
  describe('source-level checks', () => {
    it('path-security.ts validateOutputPath contains realpathSync', () => {
      const fn = extractFunction(PATH_SECURITY_SRC, 'validateOutputPath');
      expect(fn).toBeTruthy();
      expect(fn).toContain('realpathSync');
    });

    it('path-security.ts SAFE_DIRECTORIES resolves with realpathSync', () => {
      const safeBlock = sliceBetween(PATH_SECURITY_SRC, 'const SAFE_DIRECTORIES', ';');
      expect(safeBlock).toContain('realpathSync');
    });

    it('meta-commands.ts re-exports validateOutputPath from path-security', () => {
      expect(META_SRC).toContain("from './path-security'");
      expect(META_SRC).toContain('validateOutputPath');
    });

    it('write-commands.ts imports validateOutputPath from path-security', () => {
      expect(WRITE_SRC).toContain("from './path-security'");
      expect(WRITE_SRC).toContain('validateOutputPath');
    });
  });

  describe('behavioral checks', () => {
    let tmpDir: string;
    let symlinkPath: string;

    beforeAll(() => {
      tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'gstack-sec-test-'));
      symlinkPath = path.join(tmpDir, 'evil-link');
      try {
        fs.symlinkSync('/etc', symlinkPath);
      } catch {
        symlinkPath = '';
      }
    });

    afterAll(() => {
      try {
        if (symlinkPath) fs.unlinkSync(symlinkPath);
        fs.rmdirSync(tmpDir);
      } catch {
        // best-effort cleanup
      }
    });

    it('meta-commands validateOutputPath rejects path through /etc symlink', async () => {
      if (!symlinkPath) {
        console.warn('Skipping: symlink creation failed');
        return;
      }
      const mod = await import('../src/meta-commands.ts');
      const attackPath = path.join(symlinkPath, 'passwd');
      expect(() => mod.validateOutputPath(attackPath)).toThrow();
    });

    it('realpathSync on symlink-to-/etc resolves to /etc (out of safe dirs)', () => {
      if (!symlinkPath) {
        console.warn('Skipping: symlink creation failed');
        return;
      }
      const resolvedLink = fs.realpathSync(symlinkPath);
      // macOS: /etc -> /private/etc
      expect(resolvedLink).toBe(fs.realpathSync('/etc'));
      const TEMP_DIR_VAL = process.platform === 'win32' ? os.tmpdir() : '/tmp';
      const safeDirs = [TEMP_DIR_VAL, process.cwd()].map(d => {
        try { return fs.realpathSync(d); } catch { return d; }
      });
      const passwdReal = path.join(resolvedLink, 'passwd');
      const isSafe = safeDirs.some(d => passwdReal === d || passwdReal.startsWith(d + path.sep));
      expect(isSafe).toBe(false);
    });

    it('meta-commands validateOutputPath accepts legitimate tmpdir paths', async () => {
      const mod = await import('../src/meta-commands.ts');
      // Use /tmp (which resolves to /private/tmp on macOS); this matches SAFE_DIRECTORIES
      const tmpBase = process.platform === 'darwin' ? '/tmp' : os.tmpdir();
      const legitimatePath = path.join(tmpBase, 'gstack-screenshot.png');
      expect(() => mod.validateOutputPath(legitimatePath)).not.toThrow();
    });

    it('meta-commands validateOutputPath accepts paths in cwd', async () => {
      const mod = await import('../src/meta-commands.ts');
      const cwdPath = path.join(process.cwd(), 'output.png');
      expect(() => mod.validateOutputPath(cwdPath)).not.toThrow();
    });

    it('meta-commands validateOutputPath rejects paths outside safe dirs', async () => {
      const mod = await import('../src/meta-commands.ts');
      expect(() => mod.validateOutputPath('/home/user/secret.png')).toThrow(/Path must be within/);
      expect(() => mod.validateOutputPath('/var/log/access.log')).toThrow(/Path must be within/);
    });
  });
});

// ─── Round-2 review findings: applyStyle CSS check ──────────────────────────

describe('Round-2 finding 1: extension applyStyle blocks dangerous CSS values', () => {
  const INSPECTOR_SRC = fs.readFileSync(
    path.join(import.meta.dir, '../../extension/inspector.js'),
    'utf-8'
  );

  it('applyStyle function exists in inspector.js', () => {
    const fn = extractFunction(INSPECTOR_SRC, 'applyStyle');
    expect(fn).toBeTruthy();
  });

  it('applyStyle validates CSS value with url() block', () => {
    const fn = extractFunction(INSPECTOR_SRC, 'applyStyle');
    // Source contains literal regex /url\s*\(/; match the source-level escape sequence
    expect(fn).toMatch(/url\\s\*\\\(/);
  });

  it('applyStyle blocks expression()', () => {
    const fn = extractFunction(INSPECTOR_SRC, 'applyStyle');
    expect(fn).toMatch(/expression\\s\*\\\(/);
  });

  it('applyStyle blocks @import', () => {
    const fn = extractFunction(INSPECTOR_SRC, 'applyStyle');
    expect(fn).toContain('@import');
  });

  it('applyStyle blocks javascript: scheme', () => {
    const fn = extractFunction(INSPECTOR_SRC, 'applyStyle');
    expect(fn).toContain('javascript:');
  });

  it('applyStyle blocks data: scheme', () => {
    const fn = extractFunction(INSPECTOR_SRC, 'applyStyle');
    expect(fn).toContain('data:');
  });

  it('applyStyle value check appears before setProperty call', () => {
    const fn = extractFunction(INSPECTOR_SRC, 'applyStyle');
    // Check that the CSS value guard (url\s*\() appears before setProperty
    const valueCheckIdx = fn.search(/url\\s\*\\\(/);
    const setPropIdx = fn.indexOf('setProperty');
    expect(valueCheckIdx).toBeGreaterThan(-1);
    expect(setPropIdx).toBeGreaterThan(-1);
    expect(valueCheckIdx).toBeLessThan(setPropIdx);
  });
});

// ─── Round-2 finding 2: snapshot.ts annotated path uses realpathSync ────────

describe('Round-2 finding 2: snapshot.ts annotated path uses realpathSync', () => {
  it('snapshot.ts annotated screenshot section contains realpathSync', () => {
    // Slice the annotated screenshot block from the source
    const annotateStart = SNAPSHOT_SRC.indexOf('opts.annotate');
    expect(annotateStart).toBeGreaterThan(-1);
    const annotateBlock = SNAPSHOT_SRC.slice(annotateStart, annotateStart + 2000);
    expect(annotateBlock).toContain('realpathSync');
  });

  it('snapshot.ts annotated path validation resolves safe dirs with realpathSync', () => {
    const annotateStart = SNAPSHOT_SRC.indexOf('opts.annotate');
    const annotateBlock = SNAPSHOT_SRC.slice(annotateStart, annotateStart + 2000);
    // safeDirs array must be built with .map() that calls realpathSync
    // Pattern: [TEMP_DIR, process.cwd()].map(...realpathSync...)
    expect(annotateBlock).toContain('[TEMP_DIR, process.cwd()].map');
    expect(annotateBlock).toContain('realpathSync');
  });
});

// ─── Round-2 finding 3: stateFile path traversal check ─────────────────────
// Tested isValidQueueEntry's stateFile validator on sidebar-agent.ts. Both
// the function and the file are gone (chat queue ripped). The terminal-agent
// PTY path no longer takes a queue entry; it accepts WebSocket frames
// gated on Origin + session token, no on-disk queue to traverse. Path
// traversal in browse-server's tab-state writer is covered by
// browse/test/terminal-agent.test.ts (handleTabState atomic-write tests).

// ─── Task 5: /health endpoint must not expose sensitive fields ───────────────

describe('/health endpoint security', () => {
  it('must not expose currentMessage', () => {
    const block = sliceBetween(SERVER_SRC, "url.pathname === '/health'", "url.pathname === '/refs'");
    expect(block).not.toContain('currentMessage');
  });
  it('must not expose currentUrl', () => {
    const block = sliceBetween(SERVER_SRC, "url.pathname === '/health'", "url.pathname === '/refs'");
    expect(block).not.toContain('currentUrl');
  });
});

// ─── Task 6: frame --url ReDoS fix ──────────────────────────────────────────

describe('frame --url ReDoS fix', () => {
  it('frame --url section does not pass raw user input to new RegExp()', () => {
    const block = sliceBetween(META_SRC, "target === '--url'", 'else {');
    expect(block).not.toMatch(/new RegExp\(args\[/);
  });

  it('frame --url section uses escapeRegExp before constructing RegExp', () => {
    const block = sliceBetween(META_SRC, "target === '--url'", 'else {');
    expect(block).toContain('escapeRegExp');
  });

  it('escapeRegExp neutralizes catastrophic patterns (behavioral)', async () => {
    const mod = await import('../src/meta-commands.ts');
    const { escapeRegExp } = mod as any;
    expect(typeof escapeRegExp).toBe('function');
    const evil = '(a+)+$';
    const escaped = escapeRegExp(evil);
    const start = Date.now();
    new RegExp(escaped).test('aaaaaaaaaaaaaaaaaaaaaaaaaaa!');
    expect(Date.now() - start).toBeLessThan(100);
  });
});

// ─── Task 7: watch-mode guard in chain command ───────────────────────────────

describe('chain command watch-mode guard', () => {
  it('chain loop contains isWatching() guard before write dispatch', () => {
    // Post-alias refactor: loop iterates over canonicalized `c of commands`.
    const block = sliceBetween(META_SRC, 'for (const c of commands)', 'Wait for network to settle');
    expect(block).toContain('isWatching');
  });

  it('chain loop BLOCKED message appears for write commands in watch mode', () => {
    const block = sliceBetween(META_SRC, 'for (const c of commands)', 'Wait for network to settle');
    expect(block).toContain('BLOCKED: write commands disabled in watch mode');
  });
});

// ─── Task 8: Cookie domain validation ───────────────────────────────────────
|
|
|
|
describe('cookie-import domain validation', () => {
|
|
it('cookie-import handler validates cookie domain against page domain', () => {
|
|
const block = sliceBetween(WRITE_SRC, "case 'cookie-import':", "case 'cookie-import-browser':");
|
|
expect(block).toContain('cookieDomain');
|
|
expect(block).toContain('defaultDomain');
|
|
expect(block).toContain('does not match current page domain');
|
|
});
|
|
|
|
it('cookie-import-browser handler validates --domain against page hostname', () => {
|
|
const block = sliceBetween(WRITE_SRC, "case 'cookie-import-browser':", "case 'style':");
|
|
expect(block).toContain('normalizedDomain');
|
|
expect(block).toContain('pageHostname');
|
|
expect(block).toContain('does not match current page domain');
|
|
});
|
|
});

// loadSession session ID validation — loadSession lived inside the chat
// agent state block (sidebar-agent.ts session persistence). Chat queue
// is gone, so the function and its session-ID validator are gone. The
// terminal-agent's PTY session has no on-disk session ID — the WebSocket
// holds the session for its lifetime.

// ─── Task 10: Responsive screenshot path validation ──────────────────────────

describe('Task 10: responsive screenshot path validation', () => {
  it('responsive loop contains validateOutputPath before page.screenshot()', () => {
    // Extract the responsive case block
    const block = sliceBetween(META_SRC, "case 'responsive':", 'Restore original viewport');
    expect(block).toBeTruthy();
    expect(block).toContain('validateOutputPath');
  });

  it('responsive loop calls validateOutputPath on the per-viewport path, not just the prefix', () => {
    const block = sliceBetween(META_SRC, 'for (const vp of viewports)', 'Restore original viewport');
    expect(block).toContain('validateOutputPath');
  });

  it('validateOutputPath appears before page.screenshot() in the loop', () => {
    const block = sliceBetween(META_SRC, 'for (const vp of viewports)', 'Restore original viewport');
    const validateIdx = block.indexOf('validateOutputPath');
    const screenshotIdx = block.indexOf('page.screenshot');
    expect(validateIdx).toBeGreaterThan(-1);
    expect(screenshotIdx).toBeGreaterThan(-1);
    expect(validateIdx).toBeLessThan(screenshotIdx);
  });

  it('results.push is present in the loop block (loop structure intact)', () => {
    const block = sliceBetween(META_SRC, 'for (const vp of viewports)', 'Restore original viewport');
    expect(block).toContain('results.push');
  });
});

// ─── Task 11: State load — cookie + page URL validation ──────────────────────

const BROWSER_MANAGER_SRC = fs.readFileSync(path.join(import.meta.dir, '../src/browser-manager.ts'), 'utf-8');

describe('Task 11: state load cookie validation', () => {
  it('state load block filters cookies by domain and type', () => {
    const block = sliceBetween(META_SRC, "action === 'load'", "throw new Error('Usage: state save|load");
    expect(block).toContain('cookie');
    expect(block).toContain('domain');
    expect(block).toContain('filter');
  });

  it('state load block checks for localhost and .internal in cookie domains', () => {
    const block = sliceBetween(META_SRC, "action === 'load'", "throw new Error('Usage: state save|load");
    expect(block).toContain('localhost');
    expect(block).toContain('.internal');
  });

  it('state load block uses validatedCookies when calling restoreState', () => {
    const block = sliceBetween(META_SRC, "action === 'load'", "throw new Error('Usage: state save|load");
    expect(block).toContain('validatedCookies');
    // Must pass validatedCookies to restoreState, not the raw data.cookies
    const restoreIdx = block.indexOf('restoreState');
    const restoreBlock = block.slice(restoreIdx, restoreIdx + 200);
    expect(restoreBlock).toContain('validatedCookies');
  });

  it('browser-manager restoreState validates page URL before goto', () => {
    // restoreState is a class method — use sliceBetween to extract the method body
    const restoreFn = sliceBetween(BROWSER_MANAGER_SRC, 'async restoreState(', 'async recreateContext(');
    expect(restoreFn).toBeTruthy();
    expect(restoreFn).toContain('validateNavigationUrl');
  });

  it('browser-manager restoreState skips invalid URLs with a warning', () => {
    const restoreFn = sliceBetween(BROWSER_MANAGER_SRC, 'async restoreState(', 'async recreateContext(');
    expect(restoreFn).toContain('Skipping invalid URL');
    expect(restoreFn).toContain('continue');
  });

  it('validateNavigationUrl call appears before page.goto in restoreState', () => {
    const restoreFn = sliceBetween(BROWSER_MANAGER_SRC, 'async restoreState(', 'async recreateContext(');
    const validateIdx = restoreFn.indexOf('validateNavigationUrl');
    const gotoIdx = restoreFn.indexOf('page.goto');
    expect(validateIdx).toBeGreaterThan(-1);
    expect(gotoIdx).toBeGreaterThan(-1);
    expect(validateIdx).toBeLessThan(gotoIdx);
  });
});
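
// Illustrative sketch only. Several tests in this file assert ordering with
// the same three indexOf checks (both markers present, first before second).
// A hypothetical helper, here called appearsBefore, could express that in one
// place. It is not part of the original suite.
function appearsBefore(source: string, first: string, second: string): boolean {
  const firstIdx = source.indexOf(first);
  const secondIdx = source.indexOf(second);
  // Both markers must exist, and the first occurrence of `first` must precede
  // the first occurrence of `second`.
  return firstIdx > -1 && secondIdx > -1 && firstIdx < secondIdx;
}

// Usage, with a made-up source string rather than the real restoreState body:
const orderingSketch = appearsBefore(
  'validateNavigationUrl(u); await page.goto(u);',
  'validateNavigationUrl',
  'page.goto',
);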

// activeTabUrl sanitized before syncActiveTabByUrl — tested URL sanitization
// on the now-deleted /sidebar-tabs and /sidebar-command routes. The
// terminal-agent reads tab URLs from the live tabs.json file (atomic write
// from background.js), and chrome:// / chrome-extension:// pages are
// filtered server-side in handleTabState — see browse/test/terminal-agent.test.ts.

// ─── Task 13: Inbox output wrapped as untrusted ──────────────────────────────

describe('Task 13: inbox output wrapped as untrusted content', () => {
  it('inbox handler wraps userMessage with wrapUntrustedContent', () => {
    const block = sliceBetween(META_SRC, "case 'inbox':", "case 'state':");
    expect(block).toContain('wrapUntrustedContent');
  });

  it('inbox handler applies wrapUntrustedContent to userMessage', () => {
    const block = sliceBetween(META_SRC, "case 'inbox':", "case 'state':");
    // Should wrap userMessage
    expect(block).toMatch(/wrapUntrustedContent.*userMessage|userMessage.*wrapUntrustedContent/);
  });

  it('inbox handler applies wrapUntrustedContent to url', () => {
    const block = sliceBetween(META_SRC, "case 'inbox':", "case 'state':");
    // Should also wrap url
    expect(block).toMatch(/wrapUntrustedContent.*msg\.url|msg\.url.*wrapUntrustedContent/);
  });

  it('wrapUntrustedContent calls appear in the message formatting loop', () => {
    const block = sliceBetween(META_SRC, 'for (const msg of messages)', 'Handle --clear flag');
    expect(block).toContain('wrapUntrustedContent');
  });
});

// switchChatTab DocumentFragment + pollChat reentrancy guard tests targeted
// now-deleted chat-tab DOM logic and chat-polling reentrancy. Both are gone
// (Terminal pane is the sole sidebar surface; xterm.js owns its own DOM
// lifecycle, and the WebSocket has no reentrancy hazard).

// ─── Task 16: SIGKILL escalation ────────────────────────────────────────────
// Originally tested sidebar-agent's SIDEBAR_AGENT_TIMEOUT block. The chat
// queue and its watchdog are gone. terminal-agent.ts disposes claude with
// the same SIGINT-then-SIGKILL-after-3s pattern; that's covered by
// browse/test/terminal-agent.test.ts ("cleanup escalates SIGINT to SIGKILL
// after 3s on close").

// ─── Task 17: viewport and wait bounds clamping ──────────────────────────────

describe('Task 17: viewport dimensions and wait timeouts are clamped', () => {
  it('viewport case clamps width and height with Math.min/Math.max', () => {
    const block = sliceBetween(WRITE_SRC, "case 'viewport':", "case 'cookie':");
    expect(block).toBeTruthy();
    expect(block).toMatch(/Math\.min|Math\.max/);
  });

  it('viewport case uses rawW/rawH before clamping (not direct destructure)', () => {
    const block = sliceBetween(WRITE_SRC, "case 'viewport':", "case 'cookie':");
    expect(block).toContain('rawW');
    expect(block).toContain('rawH');
  });

  it('wait case (networkidle branch) clamps timeout with MAX_WAIT_MS', () => {
    const block = sliceBetween(WRITE_SRC, "case 'wait':", "case 'viewport':");
    expect(block).toBeTruthy();
    expect(block).toMatch(/MAX_WAIT_MS/);
  });

  it('wait case (element branch) also clamps timeout', () => {
    const block = sliceBetween(WRITE_SRC, "case 'wait':", "case 'viewport':");
    // Both the networkidle and element branches declare MAX_WAIT_MS
    const maxWaitCount = (block.match(/MAX_WAIT_MS/g) || []).length;
    expect(maxWaitCount).toBeGreaterThanOrEqual(2);
  });

  it('wait case uses MIN_WAIT_MS as a floor', () => {
    const block = sliceBetween(WRITE_SRC, "case 'wait':", "case 'viewport':");
    expect(block).toContain('MIN_WAIT_MS');
  });
});
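
// Illustrative sketch only. The tests above check that the viewport and wait
// handlers mention Math.min/Math.max, MIN_WAIT_MS and MAX_WAIT_MS, but not how
// the handlers combine them. Assuming the usual floor-and-ceiling clamp, the
// pattern looks like this. clampSketch and the bounds below are hypothetical,
// not values taken from the source behind WRITE_SRC.
function clampSketch(value: number, min: number, max: number): number {
  // Math.max applies the floor, Math.min applies the ceiling.
  return Math.min(Math.max(value, min), max);
}

// Usage with made-up bounds: a 10-minute request is capped at 5 minutes.
const SKETCH_MIN_WAIT_MS = 1_000;
const SKETCH_MAX_WAIT_MS = 300_000;
const clampedWaitMs = clampSketch(600_000, SKETCH_MIN_WAIT_MS, SKETCH_MAX_WAIT_MS); // 300000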