Files
gstack/test
Garry Tan f458f18f42 fix: broaden session-awareness E2E assertion to accept more LLM phrasings
The test checked for exact keywords like "RECOMMENDATION", "option a",
"which approach" but the model sometimes phrases options as "A)" or
references "Checkout" vs "Elements" directly without using the word
"recommend". Added: "option b", regex for "a)"/"b)", and the actual
decision terms (checkout, elements, hosted, embedded).

Failed 3/3 retries in CI because the assertion was too narrow for
non-deterministic LLM output.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 15:45:26 -07:00
..