Files
gstack/test
Garry Tan c9cead34e2 test: codex skill validation (12 stub tests) + E2E eval test
Stub tests (free tier): verify template content — three modes, gate verdict,
session continuity, cost tracking, cross-model comparison, binary discovery,
error handling, mktemp usage, and integrations into /review, /ship, /plan-eng-review.

E2E test (paid tier): runs /codex review on vulnerable fixture repo via
session-runner, verifies output contains findings and GATE verdict.
2026-03-18 21:21:02 -07:00
..