diff --git a/CHANGELOG.md b/CHANGELOG.md index 3f8cffae1..503433f11 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,5 +1,91 @@ # Changelog +## [1.57.10.0] - 2026-06-10 + +## **Codex review now runs by default everywhere it matters.** +## **One switch governs it, and it falls back to Claude when Codex is missing or unauthed.** + +Codex cross-model review used to be inconsistent. `/review` and `/ship` ran it +automatically, but plan reviews hid it behind a "Want an outside voice?" question +you had to say yes to every time, `/document-release` never ran it at all, and every +entry point only checked whether the `codex` binary existed, not whether it was +logged in. Now `codex_reviews` is one master switch (default `enabled`) that governs +Codex review across `/review`, `/ship`, `/plan-ceo-review`, `/plan-eng-review`, +`/plan-design-review`, `/plan-devex-review`, `/document-release`, and `/autoplan`. +The plan-review outside voice runs automatically. `/document-release` gets a new +Codex pass that checks your docs against what actually shipped. Every call site now +detects install AND auth separately, and degrades to a Claude subagent with a clear +one-line reason instead of silently skipping. Turn the whole thing off with one +command: `gstack-config set codex_reviews disabled`. + +### The numbers that matter + +Verified by the gate-tier E2E evals that exercise these exact paths +(`codex-offered-ceo-review`, `codex-offered-eng-review`, `document-release`, +`codex-review-findings`), all green this run. + +| Metric | Before | After | Δ | +|--------|--------|-------|---| +| Skills where Codex review runs by default | 2 | 8 | +6 | +| Prompts to get a plan-review outside voice | 1 (opt-in each time) | 0 (automatic) | -1 | +| Codex readiness detection | install only | install + auth | sharper | +| Master switches to disable it all | 0 (per-skill only) | 1 (`codex_reviews`) | +1 | +| `/document-release` Codex doc audit | none | doc-vs-diff pass | new | + +When Codex is installed but not logged in, you used to get nothing on the paths that +checked only `command -v codex`. Now you get a named reason ("Codex installed but not +authenticated, using Claude subagent") and the review still happens. A typo on the +switch (`gstack-config set codex_reviews disabledd`) is rejected and your existing +setting is preserved, so a fat-finger can never silently turn paid Codex calls on or +off. + +### What this means for you + +If you run gstack day to day, you stop deciding whether to get a second model's eyes +on every plan and every release. It is just there, on by default, the way the strong +reviewers already worked on diffs. If you do not have Codex set up, nothing breaks: +you get the Claude outside voice instead, with a one-line note telling you how to add +Codex for true cross-model coverage. If you want it gone, one command turns off all +eight surfaces at once. + +### Itemized changes + +#### Added +- **`codex_reviews` as the master switch** for Codex review across `/review`, `/ship`, + `/document-release`, all four plan reviews, and `/autoplan` (`bin/gstack-config`). + Default `enabled`. Invalid values on `set` are rejected with the existing value + preserved, so a typo cannot flip paid Codex calls. +- **`/document-release` Codex doc audit** (`generateCodexDocReview`): reviews the + docs you touched against the release diff for stale claims, undocumented new + surface, and over/under-sold CHANGELOG entries. Informational, with an explicit + apply-fixes decision point. Never auto-edits docs. +- **`codexPreflight()` shared helper** (`scripts/resolvers/constants.ts`): one + self-contained bash block that reads the switch, sources the probe, checks install + and auth, and emits a single canonical mode (`ready` / `not_installed` / + `not_authed` / `disabled`). + +#### Changed +- **Plan-review outside voice is default-on**, not opt-in. The "Want an outside + voice?" question is gone; it runs automatically and falls back to a Claude subagent + when Codex is unavailable. Incorporating its findings still requires your explicit + approval (cross-model tension is presented, never auto-applied). +- **Adversarial review detects auth, not just install** (`generateAdversarialStep`): + distinct "not installed" vs "not authenticated" guidance. The 200-line threshold + for the heavier structured `codex review` is unchanged. +- **`/autoplan` honors `codex_reviews=disabled`** in its Phase 0.5 preflight, so the + switch is truly global. + +#### Fixed +- Three `gstack-config` tests asserted `get`/`list` print empty for unset keys; the + tool falls back to the documented defaults table. Assertions now match real behavior. + +#### For contributors +- Size-budget guards widened for the default-on outside-voice prose, each with a + rationale comment (`test/helpers/carve-guards.ts`, `test/helpers/parity-harness.ts`). +- Static guards added: plan reviews must not carry the opt-in question and must render + the default-on voice; `/document-release` must carry the doc review; the codex host + strips all of it (`test/skill-validation.test.ts`). + ## [1.57.9.0] - 2026-06-09 ## **Your gstack checkout stays clean when gbrain is installed.** diff --git a/VERSION b/VERSION index 10521b5e0..e535f0937 100644 --- a/VERSION +++ b/VERSION @@ -1 +1 @@ -1.57.9.0 +1.57.10.0 diff --git a/package.json b/package.json index 53da1d736..86801282f 100644 --- a/package.json +++ b/package.json @@ -1,6 +1,6 @@ { "name": "gstack", - "version": "1.57.9.0", + "version": "1.57.10.0", "description": "Garry's Stack — Claude Code skills + fast headless browser. One repo, one install, entire AI engineering workflow.", "license": "MIT", "type": "module",