gstack/test
Garry Tan 91c0b31a78 test: drop strict "Choose" regex from AUQ format checks; judge covers presence
Periodic-tier eval surfaced that Opus 4.7 writes "Recommendation: A) SCOPE
EXPANSION because..." (option label, no "Choose" prefix), which the
generate-ask-user-format.ts spec actually mandates — `Recommendation: <choice>
because <reason>` where <choice> is the bare option label. The legacy regex
`/[Rr]ecommendation:[*\s]*Choose/` pinned down a per-skill template-example
phrasing that the canonical spec doesn't require, so it false-failed on
correctly-formatted captures.
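
A minimal sketch of the false failure described above. The `legacyChoose` pattern is the one quoted in this message; `canonicalShape`, the sample strings, and the `results` object are hypothetical illustrations, not the actual implementation behind judgeRecommendation.present.

```typescript
// Legacy pattern quoted in the commit message: requires a literal "Choose".
const legacyChoose = /[Rr]ecommendation:[*\s]*Choose/;

// Hypothetical stand-in for the canonical shape
// `Recommendation: <choice> because <reason>` (not the project's real regex).
const canonicalShape = /[Rr]ecommendation:[*\s]*(\S.*?)\s+because\s+(\S.*)/;

// Illustrative captures; the reason text is invented for this example.
const bareLabel =
  "Recommendation: A) SCOPE EXPANSION because it covers the follow-up work.";
const withChoose =
  "Recommendation: Choose A) SCOPE EXPANSION because it covers the follow-up work.";

export const results = {
  // The bare option label has no "Choose" prefix, so the legacy regex rejects it.
  legacyOnBare: legacyChoose.test(bareLabel),
  legacyOnChoose: legacyChoose.test(withChoose),
  // The canonical-shape sketch accepts both phrasings.
  canonicalOnBare: canonicalShape.test(bareLabel),
  canonicalOnChoose: canonicalShape.test(withChoose),
  // First capture group: the <choice> part of the bare-label line.
  choice: canonicalShape.exec(bareLabel)?.[1],
};
```

Under these assumptions the legacy pattern rejects the bare-label line while the canonical-shape sketch accepts both phrasings.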

judgeRecommendation.present (deterministic regex over the canonical shape)
plus has_because and reason_substance >= 4 cover the recommendation surface
end-to-end. Drop the redundant strict regex from all five wired call sites
(four plan-format cases + new office-hours Phase 4 test).
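
The three remaining checks could be combined roughly as follows. This is a sketch only: the field names `present`, `has_because`, and `reason_substance` come from this message, but the `RecommendationJudgment` type, the `passesRecommendationChecks` helper, and the numeric values are assumptions, and the real judge's return shape may differ.

```typescript
// Hypothetical result shape; only the three field names appear in the commit message.
interface RecommendationJudgment {
  present: boolean;         // deterministic regex over the canonical shape
  has_because: boolean;     // a "because <reason>" clause was found
  reason_substance: number; // graded substance of the reason
}

// Threshold taken from the commit message ("reason_substance >= 4").
const MIN_REASON_SUBSTANCE = 4;

// Hypothetical helper: true when all three judge checks pass.
export function passesRecommendationChecks(j: RecommendationJudgment): boolean {
  return j.present && j.has_because && j.reason_substance >= MIN_REASON_SUBSTANCE;
}

// Illustrative values, not taken from any captured run.
export const substantive = passesRecommendationChecks({
  present: true,
  has_because: true,
  reason_substance: 4,
});
export const thinReason = passesRecommendationChecks({
  present: true,
  has_because: true,
  reason_substance: 2,
});
```

With these checks in place, a line that omits "Choose" still passes as long as it carries a substantive reason, which is the behavior the canonical spec asks for.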

Verified by re-reading the captured AUQs from both failing periodic runs:
both contained substantive Recommendation lines that the spec accepts and
the judge correctly grades at substance >= 4.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 14:23:07 -07:00