feat: always-full eng review + ship review gate persistence (v0.5.4) (#135)

Remove SMALL/BIG CHANGE menu from /plan-eng-review — every plan gets the full interactive review. Scope reduction is now proactive (only when complexity check triggers) rather than a menu item.

Add review gate override persistence to /ship — when the user says "ship anyway" or "not relevant", that decision is saved to the branch's reviews.jsonl so subsequent /ship runs don't re-ask.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-01 19:25:10 +02:00 · 2026-03-17 12:41:44 -05:00
parent 5e9f0e78f2
commit b65a464d37
7 changed files with 68 additions and 29 deletions
@@ -1,5 +1,16 @@
 # Changelog

+## 0.5.4 — 2026-03-17
+
+- **Engineering review is always the full review now.** `/plan-eng-review` no longer asks you to choose between "big change" and "small change" modes. Every plan gets the full interactive walkthrough (architecture, code quality, tests, performance). Scope reduction is only suggested when the complexity check actually triggers — not as a standing menu option.
+- **Ship stops asking about reviews once you've answered.** When `/ship` asks about missing reviews and you say "ship anyway" or "not relevant," that decision is saved for the branch. No more getting re-asked every time you re-run `/ship` after a pre-landing fix.
+
+### For contributors
+
+- Removed SMALL_CHANGE / BIG_CHANGE / SCOPE_REDUCTION menu from `plan-eng-review/SKILL.md.tmpl`. Scope reduction is now proactive (triggered by complexity check) rather than a menu item.
+- Added review gate override persistence to `ship/SKILL.md.tmpl` — writes `ship-review-override` entries to `$BRANCH-reviews.jsonl` so subsequent `/ship` runs skip the gate.
+- Updated 2 E2E test prompts to match new flow.
+
 ## 0.5.3 — 2026-03-17

 - **You're always in control — even when dreaming big.** `/plan-ceo-review` now presents every scope expansion as an individual decision you opt into. EXPANSION mode recommends enthusiastically, but you say yes or no to each idea. No more "the agent went wild and added 5 features I didn't ask for."
@@ -1 +1 @@
-0.5.3
+0.5.4
@@ -110,12 +110,11 @@ Before reviewing anything, answer these questions:
 3. **Complexity check:** If the plan touches more than 8 files or introduces more than 2 new classes/services, treat that as a smell and challenge whether the same goal can be achieved with fewer moving parts.
 4. **TODOS cross-reference:** Read `TODOS.md` if it exists. Are any deferred items blocking this plan? Can any deferred items be bundled into this PR without expanding scope? Does this plan create new work that should be captured as a TODO?

-Then ask if I want one of three options:
-1. **SCOPE REDUCTION:** The plan is overbuilt. Propose a minimal version that achieves the core goal, then review that.
-2. **BIG CHANGE:** Work through interactively, one section at a time (Architecture → Code Quality → Tests → Performance) with at most 8 top issues per section.
-3. **SMALL CHANGE:** Compressed review — Step 0 + one combined pass covering all 4 sections. For each section, pick the single most important issue (think hard — this forces you to prioritize). Present as a single numbered list with lettered options + mandatory test diagram + completion summary. One AskUserQuestion round at the end. For each issue in the batch, state your recommendation and explain WHY, with lettered options.
+If the complexity check triggers (more than 8 files or more than 2 new classes/services), proactively recommend scope reduction via AskUserQuestion — explain what's overbuilt, propose a minimal version that achieves the core goal, and ask whether to reduce or proceed as-is. If the complexity check does not trigger, present your Step 0 findings and proceed directly to Section 1.

-**Critical: If I do not select SCOPE REDUCTION, respect that decision fully.** Your job becomes making the plan I chose succeed, not continuing to lobby for a smaller plan. Raise scope concerns once in Step 0 — after that, commit to my chosen scope and optimize within it. Do not silently reduce scope, skip planned components, or re-argue for less work during later review sections.
+Always work through the full interactive review: one section at a time (Architecture → Code Quality → Tests → Performance) with at most 8 top issues per section.
+
+**Critical: Once the user accepts or rejects a scope reduction recommendation, commit fully.** Do not re-argue for smaller scope during later review sections. Do not silently reduce scope or skip planned components.
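
The Step 0 branching above can be sketched in a few lines of bash. This is only an illustration: the function names `complexity_check_triggers` and `next_step` are hypothetical and not part of gstack, and the sketch assumes the strict "more than 8 files" and "more than 2 new classes/services" thresholds from the complexity check in Step 0.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of gstack): succeeds when the plan is
# large enough that Step 0 should recommend scope reduction.
# Assumed thresholds: more than 8 files, or more than 2 new classes/services.
complexity_check_triggers() {
  local files_touched=$1 new_services=$2
  [ "$files_touched" -gt 8 ] || [ "$new_services" -gt 2 ]
}

# Prints the next step the reviewer would take for a plan of the given size.
next_step() {
  if complexity_check_triggers "$1" "$2"; then
    echo "recommend_scope_reduction"
  else
    echo "proceed_to_section_1"
  fi
}

next_step 12 1   # large plan: recommend scope reduction once, then commit
next_step 5 2    # small plan: present findings and go straight to Section 1
```

Either way the full four-section review still runs afterwards; the check only decides whether a scope question is asked first.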

 ## Review Sections (after scope is agreed)

@@ -201,7 +200,6 @@ Follow the AskUserQuestion format from the Preamble above. Additional rules for
 * **Map the reasoning to my engineering preferences above.** One sentence connecting your recommendation to a specific preference (DRY, explicit > clever, minimal diff, etc.).
 * Label with issue NUMBER + option LETTER (e.g., "3A", "3B").
 * **Escape hatch:** If a section has no issues, say so and move on. If an issue has an obvious fix with no real alternatives, state what you'll do and move on — don't waste a question on it. Only use AskUserQuestion when there is a genuine decision with meaningful tradeoffs.
-* **Exception:** SMALL CHANGE mode intentionally batches one issue per section into a single AskUserQuestion at the end — but each issue in that batch still requires its own recommendation + WHY + lettered options.

 ## Required outputs

@@ -239,7 +237,7 @@ If any failure mode has no test AND no error handling AND would be silent, flag

 ### Completion summary
 At the end of the review, fill in and display this summary so the user can see all findings at a glance:
- Step 0: Scope Challenge (user chose: ___)
+- Step 0: Scope Challenge — ___ (scope accepted as-is / scope reduced per recommendation)
 - Architecture Review: ___ issues found
 - Code Quality Review: ___ issues found
 - Test Review: diagram produced, ___ gaps identified
@@ -273,7 +271,7 @@ Substitute values from the Completion Summary:
 - **STATUS**: "clean" if 0 unresolved decisions AND 0 critical gaps; otherwise "issues_open"
 - **unresolved**: number from "Unresolved decisions" count
 - **critical_gaps**: number from "Failure modes: ___ critical gaps flagged"
- **MODE**: SCOPE_REDUCTION / BIG_CHANGE / SMALL_CHANGE
+- **MODE**: FULL_REVIEW / SCOPE_REDUCED
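
As a rough sketch of the substitution rules above, the two derived fields could be computed as follows. The helper names `review_status` and `review_mode` are hypothetical, and mapping "scope reduced per recommendation" to `SCOPE_REDUCED` (and everything else to `FULL_REVIEW`) is an assumption drawn from the completion summary wording rather than something the template states outright.

```bash
#!/usr/bin/env bash
# Hypothetical helpers (not part of gstack) for the review log fields.

# STATUS: "clean" only when both counts are zero, otherwise "issues_open".
review_status() {
  local unresolved=$1 critical_gaps=$2
  if [ "$unresolved" -eq 0 ] && [ "$critical_gaps" -eq 0 ]; then
    echo "clean"
  else
    echo "issues_open"
  fi
}

# MODE: assumed mapping from the Step 0 outcome.
# $1 is "yes" when the user accepted the scope reduction recommendation.
review_mode() {
  if [ "$1" = "yes" ]; then
    echo "SCOPE_REDUCED"
  else
    echo "FULL_REVIEW"
  fi
}

review_status 0 0
review_status 2 0
review_mode no
```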

 ## Review Readiness Dashboard

@@ -45,12 +45,11 @@ Before reviewing anything, answer these questions:
 3. **Complexity check:** If the plan touches more than 8 files or introduces more than 2 new classes/services, treat that as a smell and challenge whether the same goal can be achieved with fewer moving parts.
 4. **TODOS cross-reference:** Read `TODOS.md` if it exists. Are any deferred items blocking this plan? Can any deferred items be bundled into this PR without expanding scope? Does this plan create new work that should be captured as a TODO?

-Then ask if I want one of three options:
-1. **SCOPE REDUCTION:** The plan is overbuilt. Propose a minimal version that achieves the core goal, then review that.
-2. **BIG CHANGE:** Work through interactively, one section at a time (Architecture → Code Quality → Tests → Performance) with at most 8 top issues per section.
-3. **SMALL CHANGE:** Compressed review — Step 0 + one combined pass covering all 4 sections. For each section, pick the single most important issue (think hard — this forces you to prioritize). Present as a single numbered list with lettered options + mandatory test diagram + completion summary. One AskUserQuestion round at the end. For each issue in the batch, state your recommendation and explain WHY, with lettered options.
+If the complexity check triggers (more than 8 files or more than 2 new classes/services), proactively recommend scope reduction via AskUserQuestion — explain what's overbuilt, propose a minimal version that achieves the core goal, and ask whether to reduce or proceed as-is. If the complexity check does not trigger, present your Step 0 findings and proceed directly to Section 1.

-**Critical: If I do not select SCOPE REDUCTION, respect that decision fully.** Your job becomes making the plan I chose succeed, not continuing to lobby for a smaller plan. Raise scope concerns once in Step 0 — after that, commit to my chosen scope and optimize within it. Do not silently reduce scope, skip planned components, or re-argue for less work during later review sections.
+Always work through the full interactive review: one section at a time (Architecture → Code Quality → Tests → Performance) with at most 8 top issues per section.
+
+**Critical: Once the user accepts or rejects a scope reduction recommendation, commit fully.** Do not re-argue for smaller scope during later review sections. Do not silently reduce scope or skip planned components.

 ## Review Sections (after scope is agreed)

@@ -136,7 +135,6 @@ Follow the AskUserQuestion format from the Preamble above. Additional rules for
 * **Map the reasoning to my engineering preferences above.** One sentence connecting your recommendation to a specific preference (DRY, explicit > clever, minimal diff, etc.).
 * Label with issue NUMBER + option LETTER (e.g., "3A", "3B").
 * **Escape hatch:** If a section has no issues, say so and move on. If an issue has an obvious fix with no real alternatives, state what you'll do and move on — don't waste a question on it. Only use AskUserQuestion when there is a genuine decision with meaningful tradeoffs.
-* **Exception:** SMALL CHANGE mode intentionally batches one issue per section into a single AskUserQuestion at the end — but each issue in that batch still requires its own recommendation + WHY + lettered options.

 ## Required outputs

@@ -174,7 +172,7 @@ If any failure mode has no test AND no error handling AND would be silent, flag

 ### Completion summary
 At the end of the review, fill in and display this summary so the user can see all findings at a glance:
- Step 0: Scope Challenge (user chose: ___)
+- Step 0: Scope Challenge — ___ (scope accepted as-is / scope reduced per recommendation)
 - Architecture Review: ___ issues found
 - Code Quality Review: ___ issues found
 - Test Review: diagram produced, ___ gaps identified
@@ -208,7 +206,7 @@ Substitute values from the Completion Summary:
 - **STATUS**: "clean" if 0 unresolved decisions AND 0 critical gaps; otherwise "issues_open"
 - **unresolved**: number from "Unresolved decisions" count
 - **critical_gaps**: number from "Failure modes: ___ critical gaps flagged"
- **MODE**: SCOPE_REDUCTION / BIG_CHANGE / SMALL_CHANGE
+- **MODE**: FULL_REVIEW / SCOPE_REDUCED

 {{REVIEW_DASHBOARD}}

@@ -172,11 +172,27 @@ Parse the output. Find the most recent entry for each skill (plan-ceo-review, pl
 - CEO and Design reviews are shown for context but never block shipping
 - If \`skip_eng_review\` config is \`true\`, Eng Review shows "SKIPPED (global)" and verdict is CLEARED

-If the verdict is NOT "CLEARED", use AskUserQuestion:
- Show that Eng Review is missing or has open issues
- RECOMMENDATION: Choose B (run eng review first) unless the change is obviously trivial (<20 lines, typo fix, config-only)
- Options: A) Ship anyway  B) Abort — run /plan-eng-review first  C) Change is too small to need eng review
- If CEO/Design reviews are missing, mention them as informational ("CEO Review not run — recommended for product changes") but do NOT block or recommend aborting for them
+If the Eng Review is NOT "CLEAR":
+
+1. **Check for a prior override on this branch:**
+   ```bash
+   eval $(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
+   grep '"skill":"ship-review-override"' ~/.gstack/projects/$SLUG/$BRANCH-reviews.jsonl 2>/dev/null || echo "NO_OVERRIDE"
+   ```
+   If an override exists, display the dashboard and note "Review gate previously accepted — continuing." Do NOT ask again.
+
+2. **If no override exists,** use AskUserQuestion:
+   - Show that Eng Review is missing or has open issues
+   - RECOMMENDATION: Choose C if the change is obviously trivial (< 20 lines, typo fix, config-only); choose B for larger changes
+   - Options: A) Ship anyway  B) Abort — run /plan-eng-review first  C) Change is too small to need eng review
+   - If CEO/Design reviews are missing, mention them as informational ("CEO Review not run — recommended for product changes") but do NOT block or recommend aborting for them
+
+3. **If the user chooses A or C,** persist the decision so future `/ship` runs on this branch skip the gate:
+   ```bash
+   eval $(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
+   echo '{"skill":"ship-review-override","timestamp":"'"$(date -u +%Y-%m-%dT%H:%M:%SZ)"'","decision":"USER_CHOICE"}' >> ~/.gstack/projects/$SLUG/$BRANCH-reviews.jsonl
+   ```
+   Substitute USER_CHOICE with "ship_anyway" or "not_relevant".
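
The three steps above amount to a check-then-append pattern on a JSONL file, which can be tried in isolation with the sketch below. It is not the skill's actual implementation: `REVIEWS_FILE` stands in for `~/.gstack/projects/$SLUG/$BRANCH-reviews.jsonl`, and the function names `has_override`, `record_override`, and `gate` are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only. REVIEWS_FILE is a stand-in for the per-branch reviews.jsonl;
# a temp file is used here so the example does not touch ~/.gstack.
REVIEWS_FILE="${REVIEWS_FILE:-$(mktemp)}"

# Step 1: succeeds if a prior override entry exists for this branch.
has_override() {
  grep -q '"skill":"ship-review-override"' "$REVIEWS_FILE" 2>/dev/null
}

# Step 3: append one JSONL entry; only the two documented decisions are accepted.
record_override() {
  local decision=$1
  case "$decision" in
    ship_anyway|not_relevant) ;;
    *) echo "unknown decision: $decision" >&2; return 1 ;;
  esac
  printf '{"skill":"ship-review-override","timestamp":"%s","decision":"%s"}\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$decision" >> "$REVIEWS_FILE"
}

# Steps 1 and 2 combined: skip the question when an override is on file.
gate() {
  if has_override; then
    echo "SKIP_GATE"
  else
    echo "ASK_USER"
  fi
}

gate                         # first run on the branch: the user is asked
record_override ship_anyway  # user picked option A
gate                         # later runs: the gate is skipped
```

Because entries are only appended, the file keeps a timestamped record of each override alongside the other review entries for the branch.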

 ---

@@ -54,11 +54,27 @@ You are running the `/ship` workflow. This is a **non-interactive, fully automat

 {{REVIEW_DASHBOARD}}

-If the verdict is NOT "CLEARED", use AskUserQuestion:
- Show that Eng Review is missing or has open issues
- RECOMMENDATION: Choose B (run eng review first) unless the change is obviously trivial (<20 lines, typo fix, config-only)
- Options: A) Ship anyway  B) Abort — run /plan-eng-review first  C) Change is too small to need eng review
- If CEO/Design reviews are missing, mention them as informational ("CEO Review not run — recommended for product changes") but do NOT block or recommend aborting for them
+If the Eng Review is NOT "CLEAR":
+
+1. **Check for a prior override on this branch:**
+   ```bash
+   eval $(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
+   grep '"skill":"ship-review-override"' ~/.gstack/projects/$SLUG/$BRANCH-reviews.jsonl 2>/dev/null || echo "NO_OVERRIDE"
+   ```
+   If an override exists, display the dashboard and note "Review gate previously accepted — continuing." Do NOT ask again.
+
+2. **If no override exists,** use AskUserQuestion:
+   - Show that Eng Review is missing or has open issues
+   - RECOMMENDATION: Choose C if the change is obviously trivial (< 20 lines, typo fix, config-only); choose B for larger changes
+   - Options: A) Ship anyway  B) Abort — run /plan-eng-review first  C) Change is too small to need eng review
+   - If CEO/Design reviews are missing, mention them as informational ("CEO Review not run — recommended for product changes") but do NOT block or recommend aborting for them
+
+3. **If the user chooses A or C,** persist the decision so future `/ship` runs on this branch skip the gate:
+   ```bash
+   eval $(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
+   echo '{"skill":"ship-review-override","timestamp":"'"$(date -u +%Y-%m-%dT%H:%M:%SZ)"'","decision":"USER_CHOICE"}' >> ~/.gstack/projects/$SLUG/$BRANCH-reviews.jsonl
+   ```
+   Substitute USER_CHOICE with "ship_anyway" or "not_relevant".

 ---

@@ -1003,7 +1003,7 @@ Replace session-cookie auth with JWT tokens. Currently using express-session + R

 Read plan.md — that's the plan to review. This is a standalone plan document, not a codebase — skip any codebase exploration steps.

-Choose SMALL CHANGE mode. Skip any AskUserQuestion calls — this is non-interactive.
+Proceed directly to the full review. Skip any AskUserQuestion calls — this is non-interactive.
 Write your complete review directly to ${planDir}/review-output.md

 Focus on architecture, code quality, tests, and performance sections.`,
@@ -1404,7 +1404,7 @@ export function main() { return Dashboard(); }

 Read plan.md — that's the plan to review. This is a standalone plan with source code in app.ts and dashboard.ts.

-Choose SMALL CHANGE mode. Skip any AskUserQuestion calls — this is non-interactive.
+Proceed directly to the full review. Skip any AskUserQuestion calls — this is non-interactive.

 IMPORTANT: After your review, you MUST write the test-plan artifact as described in the "Test Plan Artifact" section of SKILL.md. The remote-slug shim is at ${planDir}/browse/bin/remote-slug.
@@ -1 +1 @@
-0.5.3
+0.5.4