mirror of
https://github.com/garrytan/gstack.git
synced 2026-06-17 23:30:09 +02:00
4c2bcf5c17
Adds a periodic-tier E2E that catches the May 2026 transcript bug shape the existing single-finding gate-tier floor test cannot detect: a model that fires one AskUserQuestion and then batches the remaining findings into a single "## Decisions to confirm" plan write + ExitPlanMode. Why a separate test from skill-e2e-plan-eng-finding-floor: the gate-tier floor (runPlanSkillFloorCheck) exits on the first AUQ render and returns success, so a once-then-batch model would pass it trivially. This test uses runPlanSkillCounting at periodic tier with N-AUQ tracking and asserts >= 3 distinct review-phase AUQs on a 4-finding seeded plan. - test/fixtures/forcing-finding-seeds.ts: FORCING_BATCHING_ENG fixture (4 distinct non-trivial findings spread across Architecture, Code Quality, Tests, Performance — mirrors the D1-D4 transcript shape) - test/skill-e2e-plan-eng-multi-finding-batching.test.ts: new test - test/helpers/touchfiles.ts: registered in BOTH E2E_TOUCHFILES and E2E_TIERS (touchfiles.test.ts asserts exact equality) Test will fail on baseline today because today's model uses the preamble fallback to batch findings; passes after the architectural fix lands in a follow-up commit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>