The test that proves "changes on the website propagate to Claude Code."
Opens the comparison board in a real headless browser with __GSTACK_SERVER_URL
injected, simulates user clicks (Submit, Regenerate, More Like This), and
verifies that feedback.json / feedback-pending.json land on disk with the
correct structured data.
6 tests covering: submit → feedback.json, post-submit UI lockdown,
regenerate → feedback-pending.json, more-like-this → feedback-pending.json,
regenerate spinner display, and full regen → reload → submit round-trip.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The agent backgrounds $D serve (Claude Code can't block on a subprocess
and do other work simultaneously). With stdout-only feedback delivery,
the agent never sees regenerate/remix feedback.
Fix: write feedback-pending.json (regenerate/remix) and feedback.json
(submit) to disk next to the board HTML. Agent polls the filesystem
instead of reading stdout. Both channels (stdout + disk) are always
active so foreground mode still works.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
7 tests: empty dir, nonexistent dir, single session with approved variant,
multiple sessions sorted newest-first, corrupted approved.json handled
gracefully, session without approved.json, self-contained HTML (no
external dependencies).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
11 tests covering: HTML serving with injected server URL, /api/progress
state reporting, submit → done lifecycle, regenerate → regenerating state,
remix with remixSpec, malformed JSON rejection, /api/reload HTML swapping,
missing file validation, and full regenerate → reload → submit round-trip.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>