Files
gstack/browse/src/sidebar-agent.ts
T
Garry Tan dc0bae82d3 fix: sidebar agent uses real tab URL instead of stale Playwright URL (v0.12.6.0) (#544)
* fix: sidebar agent uses extension's activeTabUrl instead of stale Playwright URL

When the user navigates manually in headed Chrome, Playwright's page.url()
stays on the old page. The sidebar agent was using this stale URL in its
system prompt, causing it to navigate to the wrong page (e.g., Hacker News
instead of the user's current page).

The Chrome extension now captures the active tab URL via chrome.tabs.query()
and sends it as activeTabUrl in the /sidebar-command POST body. The server
prefers this over Playwright's URL. The URL is sanitized (http/https only,
control chars stripped, 2048 char limit) to prevent prompt injection.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: connect-chrome pre-flight cleanup + improved onboarding docs

Adds Step 0 pre-flight cleanup that kills stale browse servers and cleans
Chromium profile locks before connecting. Improves the onboarding flow with
clearer instructions for finding the extension, opening the Side Panel, and
troubleshooting connection issues. Fixes Mode check from cdp to headed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: sidebar agent test suite (layers 1-2)

Layer 1 (unit): 18 tests for URL sanitization in sidebar-utils.ts — http/https
pass, chrome:// rejected, javascript: rejected, control chars stripped, truncation.

Layer 2 (integration): 13 tests for server HTTP endpoints — auth, sidebar-command
queue writes, activeTabUrl override/fallback, event relay to chat buffer, message
queuing, queue overflow (429), chat clear, agent kill.

Source changes for testability:
- Extract sanitizeExtensionUrl() to browse/src/sidebar-utils.ts
- Add BROWSE_HEADLESS_SKIP env var to skip browser launch in HTTP-only tests
- Add SIDEBAR_QUEUE_PATH env var to both server.ts and sidebar-agent.ts
- Add SIDEBAR_AGENT_TIMEOUT env var to sidebar-agent.ts
- Sync package.json version to match VERSION (0.12.2.0)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: sidebar agent round-trip tests with mock claude (layer 3)

Starts server + sidebar-agent together with a mock claude binary (shell script
outputting canned stream-json). Verifies the full queue-based message flow:

- Full round-trip: POST /sidebar-command → queue → agent → mock claude → events → chat
- Claude crash recovery: mock exits 1, agent_error appears, status returns to idle
- Sequential queue drain: two rapid messages both process in order

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: sidebar agent E2E tests with real Claude (layer 4)

Two E2E tests that exercise the full sidebar agent flow with real Claude:

- sidebar-navigate: POST /sidebar-command asking Claude to describe a fixture
  page, verify it responds with page content through the chat buffer
- sidebar-url-accuracy: POST with activeTabUrl differing from Playwright URL,
  verify the queue prompt uses the extension URL (the core bug fix)

Both registered as periodic tier (~$0.80 total, non-deterministic).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: sidebar E2E tests — sequential execution + eval collector fix

Both tests now pass:
- sidebar-url-accuracy: deterministic queue file check (no Claude needed)
- sidebar-navigate: real Claude responds through sidebar agent queue

Fixed: testIfSelected (sequential, not concurrent) to avoid queue file
conflicts. Added cost_usd field for eval collector compatibility.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: kill stale sidebar-agent processes before starting new one

Each /connect-chrome starts a new sidebar-agent subprocess with unref()
but never kills the previous one. Old agents accumulate as zombies with
stale auth tokens. When they pick up queue entries, their event relay
fails (401), so the server never receives agent_done and marks the agent
as "hung". The user sees the sidebar freeze.

Fix: pkill any existing sidebar-agent.ts processes before spawning.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.12.6.0)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: add P1 TODO for sidebar Write tool + error visibility

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 22:07:03 -06:00

280 lines
9.8 KiB
TypeScript

/**
* Sidebar Agent — polls agent-queue from server, spawns claude -p for each
* message, streams live events back to the server via /sidebar-agent/event.
*
* This runs as a NON-COMPILED bun process because compiled bun binaries
* cannot posix_spawn external executables. The server writes to the queue
* file, this process reads it and spawns claude.
*
* Usage: BROWSE_BIN=/path/to/browse bun run browse/src/sidebar-agent.ts
*/
import { spawn } from 'child_process';
import * as fs from 'fs';
import * as path from 'path';
const QUEUE = process.env.SIDEBAR_QUEUE_PATH || path.join(process.env.HOME || '/tmp', '.gstack', 'sidebar-agent-queue.jsonl');
const SERVER_PORT = parseInt(process.env.BROWSE_SERVER_PORT || '34567', 10);
const SERVER_URL = `http://127.0.0.1:${SERVER_PORT}`;
const POLL_MS = 500; // Fast polling — server already did the user-facing response
const B = process.env.BROWSE_BIN || path.resolve(__dirname, '../../.claude/skills/gstack/browse/dist/browse');
let lastLine = 0;
let authToken: string | null = null;
let isProcessing = false;
// ─── File drop relay ──────────────────────────────────────────
function getGitRoot(): string | null {
try {
const { execSync } = require('child_process');
return execSync('git rev-parse --show-toplevel', { encoding: 'utf-8', stdio: ['pipe', 'pipe', 'pipe'] }).trim();
} catch {
return null;
}
}
function writeToInbox(message: string, pageUrl?: string, sessionId?: string): void {
const gitRoot = getGitRoot();
if (!gitRoot) {
console.error('[sidebar-agent] Cannot write to inbox — not in a git repo');
return;
}
const inboxDir = path.join(gitRoot, '.context', 'sidebar-inbox');
fs.mkdirSync(inboxDir, { recursive: true });
const now = new Date();
const timestamp = now.toISOString().replace(/:/g, '-');
const filename = `${timestamp}-observation.json`;
const tmpFile = path.join(inboxDir, `.${filename}.tmp`);
const finalFile = path.join(inboxDir, filename);
const inboxMessage = {
type: 'observation',
timestamp: now.toISOString(),
page: { url: pageUrl || 'unknown', title: '' },
userMessage: message,
sidebarSessionId: sessionId || 'unknown',
};
fs.writeFileSync(tmpFile, JSON.stringify(inboxMessage, null, 2));
fs.renameSync(tmpFile, finalFile);
console.log(`[sidebar-agent] Wrote inbox message: ${filename}`);
}
// ─── Auth ────────────────────────────────────────────────────────
async function refreshToken(): Promise<string | null> {
try {
const resp = await fetch(`${SERVER_URL}/health`, { signal: AbortSignal.timeout(3000) });
if (!resp.ok) return null;
const data = await resp.json() as any;
authToken = data.token || null;
return authToken;
} catch {
return null;
}
}
// ─── Event relay to server ──────────────────────────────────────
async function sendEvent(event: Record<string, any>): Promise<void> {
if (!authToken) await refreshToken();
if (!authToken) return;
try {
await fetch(`${SERVER_URL}/sidebar-agent/event`, {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': `Bearer ${authToken}`,
},
body: JSON.stringify(event),
});
} catch (err) {
console.error('[sidebar-agent] Failed to send event:', err);
}
}
// ─── Claude subprocess ──────────────────────────────────────────
function shorten(str: string): string {
return str
.replace(new RegExp(B.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'), 'g'), '$B')
.replace(/\/Users\/[^/]+/g, '~')
.replace(/\/conductor\/workspaces\/[^/]+\/[^/]+/g, '')
.replace(/\.claude\/skills\/gstack\//g, '')
.replace(/browse\/dist\/browse/g, '$B');
}
function summarizeToolInput(tool: string, input: any): string {
if (!input) return '';
if (tool === 'Bash' && input.command) {
let cmd = shorten(input.command);
return cmd.length > 80 ? cmd.slice(0, 80) + '…' : cmd;
}
if (tool === 'Read' && input.file_path) return shorten(input.file_path);
if (tool === 'Edit' && input.file_path) return shorten(input.file_path);
if (tool === 'Write' && input.file_path) return shorten(input.file_path);
if (tool === 'Grep' && input.pattern) return `/${input.pattern}/`;
if (tool === 'Glob' && input.pattern) return input.pattern;
try { return shorten(JSON.stringify(input)).slice(0, 60); } catch { return ''; }
}
async function handleStreamEvent(event: any): Promise<void> {
if (event.type === 'system' && event.session_id) {
// Relay claude session ID for --resume support
await sendEvent({ type: 'system', claudeSessionId: event.session_id });
}
if (event.type === 'assistant' && event.message?.content) {
for (const block of event.message.content) {
if (block.type === 'tool_use') {
await sendEvent({ type: 'tool_use', tool: block.name, input: summarizeToolInput(block.name, block.input) });
} else if (block.type === 'text' && block.text) {
await sendEvent({ type: 'text', text: block.text });
}
}
}
if (event.type === 'content_block_start' && event.content_block?.type === 'tool_use') {
await sendEvent({ type: 'tool_use', tool: event.content_block.name, input: summarizeToolInput(event.content_block.name, event.content_block.input) });
}
if (event.type === 'content_block_delta' && event.delta?.type === 'text_delta' && event.delta.text) {
await sendEvent({ type: 'text_delta', text: event.delta.text });
}
if (event.type === 'result') {
await sendEvent({ type: 'result', text: event.result || '' });
}
}
async function askClaude(queueEntry: any): Promise<void> {
const { prompt, args, stateFile, cwd } = queueEntry;
isProcessing = true;
await sendEvent({ type: 'agent_start' });
return new Promise((resolve) => {
// Build args fresh — don't trust --resume from queue (session may be stale)
let claudeArgs = ['-p', prompt, '--output-format', 'stream-json', '--verbose',
'--allowedTools', 'Bash,Read,Glob,Grep'];
// Validate cwd exists — queue may reference a stale worktree
let effectiveCwd = cwd || process.cwd();
try { fs.accessSync(effectiveCwd); } catch { effectiveCwd = process.cwd(); }
const proc = spawn('claude', claudeArgs, {
stdio: ['pipe', 'pipe', 'pipe'],
cwd: effectiveCwd,
env: { ...process.env, BROWSE_STATE_FILE: stateFile || '' },
});
proc.stdin.end();
let buffer = '';
proc.stdout.on('data', (data: Buffer) => {
buffer += data.toString();
const lines = buffer.split('\n');
buffer = lines.pop() || '';
for (const line of lines) {
if (!line.trim()) continue;
try { handleStreamEvent(JSON.parse(line)); } catch {}
}
});
proc.stderr.on('data', () => {}); // Claude logs to stderr, ignore
proc.on('close', (code) => {
if (buffer.trim()) {
try { handleStreamEvent(JSON.parse(buffer)); } catch {}
}
sendEvent({ type: 'agent_done' }).then(() => {
isProcessing = false;
resolve();
});
});
proc.on('error', (err) => {
sendEvent({ type: 'agent_error', error: err.message }).then(() => {
isProcessing = false;
resolve();
});
});
// Timeout (default 300s / 5 min — multi-page tasks need time)
const timeoutMs = parseInt(process.env.SIDEBAR_AGENT_TIMEOUT || '300000', 10);
setTimeout(() => {
try { proc.kill(); } catch {}
sendEvent({ type: 'agent_error', error: `Timed out after ${timeoutMs / 1000}s` }).then(() => {
isProcessing = false;
resolve();
});
}, timeoutMs);
});
}
// ─── Poll loop ───────────────────────────────────────────────────
function countLines(): number {
try {
return fs.readFileSync(QUEUE, 'utf-8').split('\n').filter(Boolean).length;
} catch { return 0; }
}
function readLine(n: number): string | null {
try {
const lines = fs.readFileSync(QUEUE, 'utf-8').split('\n').filter(Boolean);
return lines[n - 1] || null;
} catch { return null; }
}
async function poll() {
if (isProcessing) return; // One at a time — server handles queuing
const current = countLines();
if (current <= lastLine) return;
while (lastLine < current && !isProcessing) {
lastLine++;
const line = readLine(lastLine);
if (!line) continue;
let entry: any;
try { entry = JSON.parse(line); } catch { continue; }
if (!entry.message && !entry.prompt) continue;
console.log(`[sidebar-agent] Processing: "${entry.message}"`);
// Write to inbox so workspace agent can pick it up
writeToInbox(entry.message || entry.prompt, entry.pageUrl, entry.sessionId);
try {
await askClaude(entry);
} catch (err) {
console.error(`[sidebar-agent] Error:`, err);
await sendEvent({ type: 'agent_error', error: String(err) });
}
}
}
// ─── Main ────────────────────────────────────────────────────────
async function main() {
const dir = path.dirname(QUEUE);
fs.mkdirSync(dir, { recursive: true });
if (!fs.existsSync(QUEUE)) fs.writeFileSync(QUEUE, '');
lastLine = countLines();
await refreshToken();
console.log(`[sidebar-agent] Started. Watching ${QUEUE} from line ${lastLine}`);
console.log(`[sidebar-agent] Server: ${SERVER_URL}`);
console.log(`[sidebar-agent] Browse binary: ${B}`);
setInterval(poll, POLL_MS);
}
main().catch(console.error);