Both tests now pass:
- sidebar-url-accuracy: deterministic queue file check (no Claude needed)
- sidebar-navigate: real Claude responds through sidebar agent queue
Fixed: switched to testIfSelected (sequential, not concurrent) to avoid queue
file conflicts. Added a cost_usd field for eval collector compatibility.
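
Why sequential matters here, as a minimal sketch rather than the actual test
harness: when two tests do a read-modify-write on one shared queue, concurrent
runs can overwrite each other's entries. The in-memory queueFile, enqueue,
runConcurrent, and runSequential below are hypothetical stand-ins for
illustration; they are not names from this repo.

```typescript
// Hypothetical in-memory stand-in for the shared queue file.
let queueFile: string[] = [];

// Read-modify-write with an await in the middle, as real file I/O would have.
async function enqueue(entry: string): Promise<void> {
  const snapshot = [...queueFile];   // read the current queue contents
  await Promise.resolve();           // yield, simulating I/O latency
  queueFile = [...snapshot, entry];  // write back snapshot plus the new entry
}

// All writers start before any of them finishes: each one reads the same
// empty snapshot, so later writes clobber earlier ones.
async function runConcurrent(entries: string[]): Promise<string[]> {
  queueFile = [];
  await Promise.all(entries.map((e) => enqueue(e)));
  return queueFile;
}

// Each writer finishes before the next one reads, so nothing is lost.
async function runSequential(entries: string[]): Promise<string[]> {
  queueFile = [];
  for (const e of entries) {
    await enqueue(e);
  }
  return queueFile;
}
```

With two entries, the concurrent run keeps only one of them while the
sequential run keeps both, which is the kind of conflict the switch avoids.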
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two E2E tests that exercise the full sidebar agent flow with real Claude:
- sidebar-navigate: POST to /sidebar-command asking Claude to describe a
  fixture page, then verify that it responds with page content through the
  chat buffer
- sidebar-url-accuracy: POST with an activeTabUrl that differs from the
  Playwright URL, then verify that the queue prompt uses the extension URL
  (the core bug fix)
Both are registered in the periodic tier (~$0.80 total, non-deterministic).
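
The property sidebar-url-accuracy checks can be sketched roughly as follows.
This is an illustration under assumptions, not the server's actual code: only
activeTabUrl comes from the description above; SidebarCommand, resolvePageUrl,
buildQueuePrompt, the prompt wording, and the fallback to the Playwright URL
when no extension URL is sent are all hypothetical.

```typescript
// Hypothetical request shape; only activeTabUrl is named in this commit.
interface SidebarCommand {
  message: string;
  activeTabUrl?: string; // URL reported by the browser extension
}

// Prefer the extension-reported URL. Falling back to the Playwright URL when
// none is provided is an assumption made for this sketch.
function resolvePageUrl(cmd: SidebarCommand, playwrightUrl: string): string {
  const fromExtension = cmd.activeTabUrl?.trim();
  return fromExtension ? fromExtension : playwrightUrl;
}

// Hypothetical prompt format, used only to show which URL ends up in it.
function buildQueuePrompt(cmd: SidebarCommand, playwrightUrl: string): string {
  const url = resolvePageUrl(cmd, playwrightUrl);
  return `The user is viewing ${url}.\n\n${cmd.message}`;
}

// Usage: the two URLs differ, and the prompt should carry the extension's.
const prompt = buildQueuePrompt(
  { message: "Describe this page.", activeTabUrl: "https://example.com/extension-tab" },
  "https://example.com/playwright-tab",
);
```

The check is deterministic because it only inspects the generated prompt
string and never needs a model response.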
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>