v1.57.7.0 feat: GSTACK REVIEW REPORT always declares unresolved decisions (#1916)

* fix(plan-devex-review): add missing gstack-review-log step

plan-devex-review carried the EXIT PLAN MODE GATE but never wrote a
review-log entry, so the gate's 'review log was called' check was
structurally unsatisfiable and the Review Readiness Dashboard / GSTACK
REVIEW REPORT had no plan-devex-review data to read. Add a Review Log
section before the dashboard read, logging the devex fields the report
parser already expects (status, scores, product_type, tthw, persona,
competitive_tier, unresolved, commit).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(review): make unresolved-decisions status mandatory in GSTACK REVIEW REPORT

The report's UNRESOLVED line was optional ('omit if empty') and the EXIT
PLAN MODE GATE only checked it 'if applicable', so a plan could ship with
no statement about open decisions at all — a missed ambiguity read
identically to a clean plan. Now every report ends with a mandatory
unresolved-decisions status as its final line: either the exact unbolded
sentinel 'NO UNRESOLVED DECISIONS', or a '**UNRESOLVED DECISIONS:**' block
of bullets. The gate blocks ExitPlanMode unless that final line is present.

generatePlanFileReviewReport: current-review items are listed from context;
prior reviews contribute an aggregate count computed as latest-fresh-row-
per-skill minus the current run (no double-count, dashboard 7-day window).
generateExitPlanModeGate: check #3 is now blocking with no 'if applicable'
escape; bolded sentinel does not satisfy it.

Tests: static guard in gen-skill-docs.test.ts asserts the mandatory status
across all six report consumers and the gate across gate-bearing skills;
skill-e2e-plan.test.ts asserts the written report's final line is the
status (and fixes a stale 'four review rows' -> five-row prompt).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* refactor(review): compress unresolved-status prose to fit parity budget

After merging origin/main (v1.57.3.0), plan-devex-review exceeded the 1.05x
parity ratio vs the v1.53.0.0 baseline. Rather than rebase the baseline,
compressed the new prose to stay under the cap honestly: the report's
unresolved-status block (~32 -> ~9 lines) and the EXIT PLAN MODE GATE's
final-line check (~7 -> ~5 lines), plus the plan-devex-review review-log
step. All load-bearing rules and the exact gate-checkable tokens are
preserved; the static guards in gen-skill-docs.test.ts still pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test: regenerate stale ship golden fixtures (#1909 follow-up)

#1909 (v1.57.3.0) added the always-loaded PR-title-version rule to ship's
template and committed the regenerated ship/SKILL.md, but did not refresh the
three ship golden fixtures, leaving the golden-file regression test red on
main. Regenerate them from current output. The diff is purely #1909 content:
the PR-title invariant line plus a previously-unresolved ${ctx.paths.binDir}
placeholder that current generation correctly resolves. No feature content
from this branch leaks into ship (ship does not consume the review report
resolvers).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(plan-devex-review): restore TIMESTAMP fill instruction in review-log

Adversarial review caught that compressing the devex review-log block dropped
the TIMESTAMP substitution guidance the three sibling plan-review skills carry.
A literal "timestamp":"TIMESTAMP" parses as JSON but is an unparseable date,
so the Review Readiness Dashboard's 7-day freshness window silently drops the
plan-devex-review row (and the report's prior-review aggregation loses it).
Restore the one-line instruction. Also: the plan-review-report E2E now derives
its last-line check from the report slice, not the whole file, so a mis-placed
report surfaces the real trailing content in the failure message.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test(parity): rebase parity baseline v1.53.0.0 -> v1.57.7.0

The v1.53 anchor is four minor versions stale. v1.54-v1.57 (ship/plan carving,
carve-guards, AUQ prose fallback, the cross-session decision-log preamble) plus
this branch's mandatory unresolved-decisions status line pushed the three
plan-review skills past the 5% ratchet even after exhaustive compression. The
new baseline captures current UNION sizes (skeleton + sections/*.md, matching
what parity-harness measures) so the per-skill 1.05 ratio keeps catching future
bloat. The frozen v1.44.1 integrity anchor and the v1.47 size-budget baseline
are untouched.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v1.57.7.0)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
Garry Tan
2026-06-08 21:17:18 -07:00
committed by GitHub
parent 9cc41b7163
commit 1626d4857b
19 changed files with 945 additions and 51 deletions
+19 -2
View File
@@ -692,7 +692,7 @@ Read plan.md — that's the plan to review. This is a standalone plan document,
Proceed directly to the full review. Skip any AskUserQuestion calls — this is non-interactive.
Skip the preamble bash block, lake intro, telemetry, and contributor mode sections.
CRITICAL REQUIREMENT: plan.md IS the plan file for this review session. After completing your review, you MUST write a "## GSTACK REVIEW REPORT" section to the END of plan.md, exactly as described in the "Plan File Review Report" section of SKILL.md. If gstack-review-read is not available or returns NO_REVIEWS, write the placeholder table with all four review rows (CEO, Codex, Eng, Design). Use the Edit tool to append to plan.md — do NOT overwrite the existing plan content.
CRITICAL REQUIREMENT: plan.md IS the plan file for this review session. After completing your review, you MUST write a "## GSTACK REVIEW REPORT" section to the END of plan.md, exactly as described in the "Plan File Review Report" section of SKILL.md. If gstack-review-read is not available or returns NO_REVIEWS, write the placeholder table with all five review rows (CEO, Codex, Eng, Design, DX). The report MUST end with the mandatory unresolved-decisions status as its final line — the exact unbolded line NO UNRESOLVED DECISIONS when nothing is open, or a "**UNRESOLVED DECISIONS:**" block of bullets when items remain. Nothing may follow it. Use the Edit tool to append to plan.md — do NOT overwrite the existing plan content.
This review report at the bottom of the plan is the MOST IMPORTANT deliverable of this test.`,
workingDirectory: planDir,
@@ -741,7 +741,24 @@ This review report at the bottom of the plan is the MOST IMPORTANT deliverable o
expect(afterReport).toContain('Eng Review');
expect(afterReport).toContain('Design Review');
console.log('Plan review report found at bottom of plan.md');
// Mandatory unresolved-decisions status (plan-flag-unresolved-issues): the report's
// final non-whitespace line must be the unresolved status — the exact sentinel or a
// bullet of an UNRESOLVED DECISIONS block, with nothing (CODEX/CROSS-MODEL/VERDICT/
// prose) after it.
expect(afterReport).toContain('UNRESOLVED DECISIONS');
// Compute from afterReport (the report section to EOF), not the whole file, so a
// mid-file report surfaces the real trailing content in the failure message.
const nonEmpty = afterReport.split('\n').map(l => l.trim()).filter(l => l !== '');
const lastLine = nonEmpty[nonEmpty.length - 1];
const isSentinel = lastLine === 'NO UNRESOLVED DECISIONS';
const isUnresolvedBullet =
/^[-*]\s+/.test(lastLine) && !/VERDICT/i.test(lastLine) && afterReport.includes('UNRESOLVED DECISIONS:');
expect(
isSentinel || isUnresolvedBullet,
`report must end with the unresolved-decisions status; last line was: ${lastLine}`,
).toBe(true);
console.log('Plan review report found at bottom of plan.md (ends with unresolved status)');
}, 420_000);
});