v1.57.7.0 feat: GSTACK REVIEW REPORT always declares unresolved decisions (#1916)

* fix(plan-devex-review): add missing gstack-review-log step

plan-devex-review carried the EXIT PLAN MODE GATE but never wrote a
review-log entry, so the gate's 'review log was called' check was
structurally unsatisfiable and the Review Readiness Dashboard / GSTACK
REVIEW REPORT had no plan-devex-review data to read. Add a Review Log
section before the dashboard read, logging the devex fields the report
parser already expects (status, scores, product_type, tthw, persona,
competitive_tier, unresolved, commit).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(review): make unresolved-decisions status mandatory in GSTACK REVIEW REPORT

The report's UNRESOLVED line was optional ('omit if empty') and the EXIT
PLAN MODE GATE only checked it 'if applicable', so a plan could ship with
no statement about open decisions at all — a missed ambiguity read
identically to a clean plan. Now every report ends with a mandatory
unresolved-decisions status as its final line: either the exact unbolded
sentinel 'NO UNRESOLVED DECISIONS', or a '**UNRESOLVED DECISIONS:**' block
of bullets. The gate blocks ExitPlanMode unless that final line is present.

generatePlanFileReviewReport: current-review items are listed from context;
prior reviews contribute an aggregate count computed as latest-fresh-row-
per-skill minus the current run (no double-count, dashboard 7-day window).
generateExitPlanModeGate: check #3 is now blocking with no 'if applicable'
escape; bolded sentinel does not satisfy it.

Tests: static guard in gen-skill-docs.test.ts asserts the mandatory status
across all six report consumers and the gate across gate-bearing skills;
skill-e2e-plan.test.ts asserts the written report's final line is the
status (and fixes a stale 'four review rows' -> five-row prompt).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* refactor(review): compress unresolved-status prose to fit parity budget

After merging origin/main (v1.57.3.0), plan-devex-review exceeded the 1.05x
parity ratio vs the v1.53.0.0 baseline. Rather than rebase the baseline,
compressed the new prose to stay under the cap honestly: the report's
unresolved-status block (~32 -> ~9 lines) and the EXIT PLAN MODE GATE's
final-line check (~7 -> ~5 lines), plus the plan-devex-review review-log
step. All load-bearing rules and the exact gate-checkable tokens are
preserved; the static guards in gen-skill-docs.test.ts still pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test: regenerate stale ship golden fixtures (#1909 follow-up)

#1909 (v1.57.3.0) added the always-loaded PR-title-version rule to ship's
template and committed the regenerated ship/SKILL.md, but did not refresh the
three ship golden fixtures, leaving the golden-file regression test red on
main. Regenerate them from current output. The diff is purely #1909 content:
the PR-title invariant line plus a previously-unresolved ${ctx.paths.binDir}
placeholder that current generation correctly resolves. No feature content
from this branch leaks into ship (ship does not consume the review report
resolvers).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(plan-devex-review): restore TIMESTAMP fill instruction in review-log

Adversarial review caught that compressing the devex review-log block dropped
the TIMESTAMP substitution guidance the three sibling plan-review skills carry.
A literal "timestamp":"TIMESTAMP" parses as JSON but is an unparseable date,
so the Review Readiness Dashboard's 7-day freshness window silently drops the
plan-devex-review row (and the report's prior-review aggregation loses it).
Restore the one-line instruction. Also: the plan-review-report E2E now derives
its last-line check from the report slice, not the whole file, so a mis-placed
report surfaces the real trailing content in the failure message.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test(parity): rebase parity baseline v1.53.0.0 -> v1.57.7.0

The v1.53 anchor is four minor versions stale. v1.54-v1.57 (ship/plan carving,
carve-guards, AUQ prose fallback, the cross-session decision-log preamble) plus
this branch's mandatory unresolved-decisions status line pushed the three
plan-review skills past the 5% ratchet even after exhaustive compression. The
new baseline captures current UNION sizes (skeleton + sections/*.md, matching
what parity-harness measures) so the per-skill 1.05 ratio keeps catching future
bloat. The frozen v1.44.1 integrity anchor and the v1.47 size-budget baseline
are untouched.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v1.57.7.0)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Garry Tan
2026-06-08 21:17:18 -07:00
committed by GitHub
parent 9cc41b7163
commit 1626d4857b
19 changed files with 945 additions and 51 deletions
+59
@@ -1,5 +1,64 @@
# Changelog
## [1.57.7.0] - 2026-06-08
## **Every plan review now ends by telling you, in one line, whether anything is still unresolved.**
## **The GSTACK REVIEW REPORT closes with the open decisions, or "NO UNRESOLVED DECISIONS" in plain sight, before you approve.**
When a plan-review skill (/plan-ceo-review, /plan-eng-review, /plan-design-review,
/plan-devex-review, and /codex) finishes and hands you the plan to approve, its report
now ends with a mandatory unresolved-decisions verdict. If decisions are still open, it
lists each one and what breaks if you ship it deferred. If nothing is open, it prints the
exact line NO UNRESOLVED DECISIONS. A token-reduction pass had made this line optional, so
a clean plan and a plan hiding an open question rendered the same. Now the line is never
omitted, it is always the last thing you read before the approval prompt, and the approval
gate refuses to let the plan through without it.
### What changed, before and after
| At plan-approval time | Before | After |
|---|---|---|
| Clean plan | usually no unresolved line | `NO UNRESOLVED DECISIONS` as the final line |
| Plan with open decisions | unresolved line optional, often dropped | `**UNRESOLVED DECISIONS:**` + one bullet per open item |
| Approval gate (ExitPlanMode) | checked the line "if applicable" | blocks unless the unresolved status is the final line |
| /plan-devex-review review log | never written, gate uncheckable | written, so the dashboard and report see its data |
The count of unresolved decisions carried over from prior reviews sums the latest fresh
row per skill, excluding the skill that just ran so it is not double-counted, and uses the
same 7-day freshness window as the Review Readiness Dashboard.
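The aggregation rule can be sketched in TypeScript as follows. This is an illustration only: in the generated skills the rule is stated as prose instructions, and the `ReviewRow` shape, the `priorUnresolved` name, and the choice to skip rows with unparseable timestamps are assumptions, not code from `scripts/resolvers/review.ts`.

```typescript
// Hypothetical row shape; only the fields this sketch needs.
interface ReviewRow {
  skill: string;
  timestamp: string; // expected to be an ISO 8601 datetime
  unresolved: number;
}

const WINDOW_MS = 7 * 24 * 60 * 60 * 1000; // 7-day freshness window

// Sum `unresolved` over the latest fresh row per skill, excluding the
// skill that is currently running so its items are not counted twice.
function priorUnresolved(rows: ReviewRow[], currentSkill: string, now: Date): number {
  const latest = new Map<string, { at: number; unresolved: number }>();
  for (const row of rows) {
    if (row.skill === currentSkill) continue; // drop the current skill's rows
    const at = Date.parse(row.timestamp);
    if (Number.isNaN(at)) continue; // unparseable date: row cannot be fresh
    if (now.getTime() - at > WINDOW_MS) continue; // older than the window
    const seen = latest.get(row.skill);
    if (!seen || at > seen.at) {
      latest.set(row.skill, { at, unresolved: row.unresolved });
    }
  }
  let total = 0;
  latest.forEach((entry) => {
    total += entry.unresolved;
  });
  return total;
}
```

Under these assumptions, the sentinel would be emitted only when this total is zero and the current review has no open items of its own.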
### What this means for you
Every approve-plan moment now carries an explicit verdict on open questions, so a missed
ambiguity cannot slip through looking like a clean plan. If you run the plan-review skills
or /autoplan, you will see the unresolved status as the closing line of every report.
Nothing to configure. Upgrade and your next plan review shows it.
### Itemized changes
#### Added
- **Mandatory unresolved-decisions status in the GSTACK REVIEW REPORT.** Generated into
all six report consumers (/plan-ceo-review, /plan-eng-review, /plan-design-review,
/plan-devex-review, /codex, /devex-review) from `scripts/resolvers/review.ts`. The report
always ends with either the exact unbolded sentinel `NO UNRESOLVED DECISIONS` or a
`**UNRESOLVED DECISIONS:**` bullet block listing each open item; never omitted, always
the final line.
- **Blocking approval gate.** The EXIT PLAN MODE GATE now refuses ExitPlanMode unless the
report's final non-whitespace line is the unresolved status (no "if applicable" escape).
- Static and E2E tests pinning the mandatory status across every report consumer and
gate-bearing skill, so a future compression pass cannot silently drop it again.
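As a rough illustration of what the final-line rule accepts and rejects, here is a minimal TypeScript sketch. The `checkUnresolvedStatus` helper, its return shape, and the bullet-detection rule are hypothetical; the gate itself is expressed as instructions in the skill docs, and the repository's tests may check it differently.

```typescript
// Exact tokens taken from the changelog; everything else is an assumption.
const SENTINEL = "NO UNRESOLVED DECISIONS";
const BLOCK_HEADER = "**UNRESOLVED DECISIONS:**";

interface StatusCheck {
  ok: boolean;
  reason: string;
}

// Assumed bullet syntax: a line starting with "- " or "* ".
function isBullet(line: string): boolean {
  return line.startsWith("- ") || line.startsWith("* ");
}

// Pass only when the report's final non-whitespace line is the unbolded
// sentinel, or the last bullet of a trailing UNRESOLVED DECISIONS block.
function checkUnresolvedStatus(report: string): StatusCheck {
  const lines = report
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  const last = lines[lines.length - 1] ?? "";
  if (last === "") return { ok: false, reason: "empty report" };
  if (last === SENTINEL) return { ok: true, reason: "sentinel" };
  if (!isBullet(last)) return { ok: false, reason: "final line is not the status" };

  // Walk back over the contiguous bullets; the line above them must be the header.
  let i = lines.length - 1;
  while (i >= 0 && isBullet(lines[i] ?? "")) i--;
  if (i >= 0 && lines[i] === BLOCK_HEADER) return { ok: true, reason: "bullet block" };
  return { ok: false, reason: "trailing bullets are not under the status header" };
}
```

A bolded `**NO UNRESOLVED DECISIONS**` fails here because it matches neither the exact sentinel nor a bullet, which mirrors the rule that a bolded sentinel does not count.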
#### Fixed
- **/plan-devex-review never logged a review entry.** It carried the approval gate but
never called `gstack-review-log`, so the gate's "review log was called" check was
structurally unsatisfiable and its data was invisible to the Review Readiness Dashboard
and the report. It now logs with the correct timestamp and DX fields.
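To show why the timestamp matters, here is a small TypeScript sketch that builds a review-log payload with a real ISO 8601 timestamp. The `buildDevexLogEntry` function and the reduced field set are assumptions for illustration, and reading "score 8+" as the overall score is also an assumption; the skill itself passes a JSON string to `gstack-review-log` with more fields than shown.

```typescript
// Subset of the logged fields; the real entry carries more DX fields.
interface DevexLogEntry {
  skill: "plan-devex-review";
  timestamp: string;
  status: "clean" | "issues_open";
  overall_score: number;
  unresolved: number;
  commit: string;
}

// Hypothetical builder: "clean" only when the score is 8 or higher and
// nothing is unresolved, otherwise "issues_open".
function buildDevexLogEntry(
  overallScore: number,
  unresolved: number,
  commit: string,
  now: Date,
): DevexLogEntry {
  return {
    skill: "plan-devex-review",
    timestamp: now.toISOString(), // a parseable date, never the literal "TIMESTAMP"
    status: overallScore >= 8 && unresolved === 0 ? "clean" : "issues_open",
    overall_score: overallScore,
    unresolved,
    commit,
  };
}

// A placeholder left unfilled is valid JSON but not a date, so a
// freshness filter based on Date.parse cannot keep the row.
const placeholderIsUnparseable = Number.isNaN(Date.parse("TIMESTAMP"));
```

The JSON produced by `JSON.stringify(buildDevexLogEntry(...))` would then be the argument handed to the logging command.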
#### For contributors
- Rebased the parity-suite size baseline from v1.53.0.0 to v1.57.7.0 (captures current union
sizes; keeps the per-skill 1.05 ratio so future bloat is still caught). Regenerated the
three ship golden fixtures left stale by #1909. The frozen v1.44.1 integrity anchor and
the v1.47 size-budget baseline are untouched.
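The per-skill ratio check amounts to simple arithmetic, sketched below in TypeScript. The function names, the byte-based measure, and treating exactly 1.05 as still within the cap are assumptions; the parity harness may measure and compare sizes differently.

```typescript
const PARITY_RATIO_CAP = 1.05; // per-skill cap named in the changelog

// Ratio of a skill's current size to its size in the baseline.
function parityRatio(currentBytes: number, baselineBytes: number): number {
  if (baselineBytes <= 0) throw new Error("baseline size must be positive");
  return currentBytes / baselineBytes;
}

// Hypothetical guard: true when the skill has grown past the cap.
function exceedsParityCap(currentBytes: number, baselineBytes: number): boolean {
  return parityRatio(currentBytes, baselineBytes) > PARITY_RATIO_CAP;
}
```

For example, against a baseline of 117907 bytes the cap sits at about 123802 bytes, so a skill could grow by roughly 5.9 KB before the check would fail under these assumptions.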
## [1.57.6.0] - 2026-06-07
## **Eight community-filed bugs fixed in one wave, four of them security guards that were quietly failing open.**
+1 -1
@@ -1 +1 @@
1.57.6.0
1.57.7.0
+21 -6
@@ -1112,14 +1112,24 @@ Produce this markdown table:
| DX Review | \`/plan-devex-review\` | Developer experience gaps | {runs} | {status} | {findings} |
\`\`\`
Below the table, add these lines (omit any that are empty/not applicable):
Below the table, add these lines. **CODEX** and **CROSS-MODEL** are optional (omit when
empty); **VERDICT** is always present:
- **CODEX:** (only if codex-review ran) — one-line summary of codex fixes
- **CROSS-MODEL:** (only if both Claude and Codex reviews exist) — overlap analysis
- **UNRESOLVED:** total unresolved decisions across all reviews
- **VERDICT:** list reviews that are CLEAR (e.g., "CEO + ENG CLEARED — ready to implement").
If Eng Review is not CLEAR and not skipped globally, append "eng review required".
**Unresolved-decisions status (MANDATORY — never omitted; the report's final non-whitespace
line).** After VERDICT, end the report (content under the \`## GSTACK REVIEW REPORT\`
heading — a bold label, never a new \`## \` heading; exempt from the "omit when empty"
rule) with exactly one: the exact unbolded line \`NO UNRESOLVED DECISIONS\` (a bolded one
does NOT count), OR a \`**UNRESOLVED DECISIONS:**\` header + one bullet per open item
(last bullet = final line; add \`+ N unresolved from prior reviews\` only when N > 0).
This avoids double-counting: list THIS review's open items from context; for prior reviews
sum \`unresolved\` over the latest fresh row per skill (dashboard 7-day window) after you
DROP the current skill's row; emit the sentinel only when both are zero.
### Write to the plan file
**PLAN MODE EXCEPTION — ALWAYS RUN:** This writes to the plan file, which is the one
@@ -1160,12 +1170,17 @@ missing work — do NOT call ExitPlanMode:
In-body prose that mentions "outside voice", "codex findings", or similar
does NOT count — only the structured `## GSTACK REVIEW REPORT` section
satisfies this check.
3. Confirm the report contains: a Runs / Status / Findings table, a VERDICT
line, and absorbs CODEX / CROSS-MODEL / UNRESOLVED lines if applicable.
4. If a plan file is in context for this skill invocation: confirm
3. Confirm the report has a Runs / Status / Findings table and a VERDICT line
(CODEX / CROSS-MODEL absorbed if applicable).
4. Confirm the report's FINAL non-whitespace line is the unresolved-decisions
status: the exact unbolded `NO UNRESOLVED DECISIONS`, or a bullet of a final
`**UNRESOLVED DECISIONS:**` block. BLOCKING, no "if applicable" escape — a
bolded sentinel, any trailing CODEX/CROSS-MODEL/VERDICT/prose, or a missing
status each FAILS the gate.
5. If a plan file is in context for this skill invocation: confirm
`gstack-review-log` was called and `gstack-review-read` was run at least
once. If no plan file is in context (e.g. `/codex consult` against a
diff with no plan), this check short-circuits — checks 1-3 already
diff with no plan), this check short-circuits — checks 1-4 already
short-circuit when no plan file exists.
Failing this gate and calling ExitPlanMode anyway is a contract violation —
+12 -2
@@ -1176,14 +1176,24 @@ Produce this markdown table:
| DX Review | \`/plan-devex-review\` | Developer experience gaps | {runs} | {status} | {findings} |
\`\`\`
Below the table, add these lines (omit any that are empty/not applicable):
Below the table, add these lines. **CODEX** and **CROSS-MODEL** are optional (omit when
empty); **VERDICT** is always present:
- **CODEX:** (only if codex-review ran) — one-line summary of codex fixes
- **CROSS-MODEL:** (only if both Claude and Codex reviews exist) — overlap analysis
- **UNRESOLVED:** total unresolved decisions across all reviews
- **VERDICT:** list reviews that are CLEAR (e.g., "CEO + ENG CLEARED — ready to implement").
If Eng Review is not CLEAR and not skipped globally, append "eng review required".
**Unresolved-decisions status (MANDATORY — never omitted; the report's final non-whitespace
line).** After VERDICT, end the report (content under the \`## GSTACK REVIEW REPORT\`
heading — a bold label, never a new \`## \` heading; exempt from the "omit when empty"
rule) with exactly one: the exact unbolded line \`NO UNRESOLVED DECISIONS\` (a bolded one
does NOT count), OR a \`**UNRESOLVED DECISIONS:**\` header + one bullet per open item
(last bullet = final line; add \`+ N unresolved from prior reviews\` only when N > 0).
This avoids double-counting: list THIS review's open items from context; for prior reviews
sum \`unresolved\` over the latest fresh row per skill (dashboard 7-day window) after you
DROP the current skill's row; emit the sentinel only when both are zero.
### Write to the plan file
**PLAN MODE EXCEPTION — ALWAYS RUN:** This writes to the plan file, which is the one
+1 -1
@@ -1,6 +1,6 @@
{
"name": "gstack",
"version": "1.57.6.0",
"version": "1.57.7.0",
"description": "Garry's Stack — Claude Code skills + fast headless browser. One repo, one install, entire AI engineering workflow.",
"license": "MIT",
"type": "module",
+9 -4
@@ -1413,12 +1413,17 @@ missing work — do NOT call ExitPlanMode:
In-body prose that mentions "outside voice", "codex findings", or similar
does NOT count — only the structured `## GSTACK REVIEW REPORT` section
satisfies this check.
3. Confirm the report contains: a Runs / Status / Findings table, a VERDICT
line, and absorbs CODEX / CROSS-MODEL / UNRESOLVED lines if applicable.
4. If a plan file is in context for this skill invocation: confirm
3. Confirm the report has a Runs / Status / Findings table and a VERDICT line
(CODEX / CROSS-MODEL absorbed if applicable).
4. Confirm the report's FINAL non-whitespace line is the unresolved-decisions
status: the exact unbolded `NO UNRESOLVED DECISIONS`, or a bullet of a final
`**UNRESOLVED DECISIONS:**` block. BLOCKING, no "if applicable" escape — a
bolded sentinel, any trailing CODEX/CROSS-MODEL/VERDICT/prose, or a missing
status each FAILS the gate.
5. If a plan file is in context for this skill invocation: confirm
`gstack-review-log` was called and `gstack-review-read` was run at least
once. If no plan file is in context (e.g. `/codex consult` against a
diff with no plan), this check short-circuits — checks 1-3 already
diff with no plan), this check short-circuits — checks 1-4 already
short-circuit when no plan file exists.
Failing this gate and calling ExitPlanMode anyway is a contract violation —
+12 -2
@@ -712,14 +712,24 @@ Produce this markdown table:
| DX Review | \`/plan-devex-review\` | Developer experience gaps | {runs} | {status} | {findings} |
\`\`\`
Below the table, add these lines (omit any that are empty/not applicable):
Below the table, add these lines. **CODEX** and **CROSS-MODEL** are optional (omit when
empty); **VERDICT** is always present:
- **CODEX:** (only if codex-review ran) — one-line summary of codex fixes
- **CROSS-MODEL:** (only if both Claude and Codex reviews exist) — overlap analysis
- **UNRESOLVED:** total unresolved decisions across all reviews
- **VERDICT:** list reviews that are CLEAR (e.g., "CEO + ENG CLEARED — ready to implement").
If Eng Review is not CLEAR and not skipped globally, append "eng review required".
**Unresolved-decisions status (MANDATORY — never omitted; the report's final non-whitespace
line).** After VERDICT, end the report (content under the \`## GSTACK REVIEW REPORT\`
heading — a bold label, never a new \`## \` heading; exempt from the "omit when empty"
rule) with exactly one: the exact unbolded line \`NO UNRESOLVED DECISIONS\` (a bolded one
does NOT count), OR a \`**UNRESOLVED DECISIONS:**\` header + one bullet per open item
(last bullet = final line; add \`+ N unresolved from prior reviews\` only when N > 0).
This avoids double-counting: list THIS review's open items from context; for prior reviews
sum \`unresolved\` over the latest fresh row per skill (dashboard 7-day window) after you
DROP the current skill's row; emit the sentinel only when both are zero.
### Write to the plan file
**PLAN MODE EXCEPTION — ALWAYS RUN:** This writes to the plan file, which is the one
+9 -4
@@ -1434,12 +1434,17 @@ missing work — do NOT call ExitPlanMode:
In-body prose that mentions "outside voice", "codex findings", or similar
does NOT count — only the structured `## GSTACK REVIEW REPORT` section
satisfies this check.
3. Confirm the report contains: a Runs / Status / Findings table, a VERDICT
line, and absorbs CODEX / CROSS-MODEL / UNRESOLVED lines if applicable.
4. If a plan file is in context for this skill invocation: confirm
3. Confirm the report has a Runs / Status / Findings table and a VERDICT line
(CODEX / CROSS-MODEL absorbed if applicable).
4. Confirm the report's FINAL non-whitespace line is the unresolved-decisions
status: the exact unbolded `NO UNRESOLVED DECISIONS`, or a bullet of a final
`**UNRESOLVED DECISIONS:**` block. BLOCKING, no "if applicable" escape — a
bolded sentinel, any trailing CODEX/CROSS-MODEL/VERDICT/prose, or a missing
status each FAILS the gate.
5. If a plan file is in context for this skill invocation: confirm
`gstack-review-log` was called and `gstack-review-read` was run at least
once. If no plan file is in context (e.g. `/codex consult` against a
diff with no plan), this check short-circuits — checks 1-3 already
diff with no plan), this check short-circuits — checks 1-4 already
short-circuit when no plan file exists.
Failing this gate and calling ExitPlanMode anyway is a contract violation —
+12 -2
@@ -458,14 +458,24 @@ Produce this markdown table:
| DX Review | \`/plan-devex-review\` | Developer experience gaps | {runs} | {status} | {findings} |
\`\`\`
Below the table, add these lines (omit any that are empty/not applicable):
Below the table, add these lines. **CODEX** and **CROSS-MODEL** are optional (omit when
empty); **VERDICT** is always present:
- **CODEX:** (only if codex-review ran) — one-line summary of codex fixes
- **CROSS-MODEL:** (only if both Claude and Codex reviews exist) — overlap analysis
- **UNRESOLVED:** total unresolved decisions across all reviews
- **VERDICT:** list reviews that are CLEAR (e.g., "CEO + ENG CLEARED — ready to implement").
If Eng Review is not CLEAR and not skipped globally, append "eng review required".
**Unresolved-decisions status (MANDATORY — never omitted; the report's final non-whitespace
line).** After VERDICT, end the report (content under the \`## GSTACK REVIEW REPORT\`
heading — a bold label, never a new \`## \` heading; exempt from the "omit when empty"
rule) with exactly one: the exact unbolded line \`NO UNRESOLVED DECISIONS\` (a bolded one
does NOT count), OR a \`**UNRESOLVED DECISIONS:**\` header + one bullet per open item
(last bullet = final line; add \`+ N unresolved from prior reviews\` only when N > 0).
This avoids double-counting: list THIS review's open items from context; for prior reviews
sum \`unresolved\` over the latest fresh row per skill (dashboard 7-day window) after you
DROP the current skill's row; emit the sentinel only when both are zero.
### Write to the plan file
**PLAN MODE EXCEPTION — ALWAYS RUN:** This writes to the plan file, which is the one
+9 -4
@@ -1397,12 +1397,17 @@ missing work — do NOT call ExitPlanMode:
In-body prose that mentions "outside voice", "codex findings", or similar
does NOT count — only the structured `## GSTACK REVIEW REPORT` section
satisfies this check.
3. Confirm the report contains: a Runs / Status / Findings table, a VERDICT
line, and absorbs CODEX / CROSS-MODEL / UNRESOLVED lines if applicable.
4. If a plan file is in context for this skill invocation: confirm
3. Confirm the report has a Runs / Status / Findings table and a VERDICT line
(CODEX / CROSS-MODEL absorbed if applicable).
4. Confirm the report's FINAL non-whitespace line is the unresolved-decisions
status: the exact unbolded `NO UNRESOLVED DECISIONS`, or a bullet of a final
`**UNRESOLVED DECISIONS:**` block. BLOCKING, no "if applicable" escape — a
bolded sentinel, any trailing CODEX/CROSS-MODEL/VERDICT/prose, or a missing
status each FAILS the gate.
5. If a plan file is in context for this skill invocation: confirm
`gstack-review-log` was called and `gstack-review-read` was run at least
once. If no plan file is in context (e.g. `/codex consult` against a
diff with no plan), this check short-circuits — checks 1-3 already
diff with no plan), this check short-circuits — checks 1-4 already
short-circuit when no plan file exists.
Failing this gate and calling ExitPlanMode anyway is a contract violation —
+23 -2
@@ -576,6 +576,17 @@ this run (an empty file means "ran, no findings" — distinct from "didn't run")
### Unresolved Decisions
If any AskUserQuestion goes unanswered, note here. Never silently default.
## Review Log
Persist after the DX Scorecard — the dashboard, the GSTACK REVIEW REPORT, and the EXIT
PLAN MODE GATE's "review log was called" check depend on it. **PLAN MODE EXCEPTION — ALWAYS RUN** (writes to `~/.gstack/`, not project files):
```bash
~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"plan-devex-review","timestamp":"TIMESTAMP","status":"STATUS","initial_score":N,"overall_score":N,"product_type":"PRODUCT_TYPE","tthw_current":"TTHW_CURRENT","tthw_target":"TTHW_TARGET","mode":"MODE","persona":"PERSONA","competitive_tier":"COMPETITIVE_TIER","unresolved":N,"commit":"COMMIT"}'
```
TIMESTAMP = current ISO 8601 datetime; STATUS = "clean" if score 8+ AND 0 unresolved, else "issues_open"; other fields from the DX Scorecard + Step 0; COMMIT = `git rev-parse --short HEAD`.
## Review Readiness Dashboard
After completing the review, read the review log and config to display the dashboard.
@@ -675,14 +686,24 @@ Produce this markdown table:
| DX Review | \`/plan-devex-review\` | Developer experience gaps | {runs} | {status} | {findings} |
\`\`\`
Below the table, add these lines (omit any that are empty/not applicable):
Below the table, add these lines. **CODEX** and **CROSS-MODEL** are optional (omit when
empty); **VERDICT** is always present:
- **CODEX:** (only if codex-review ran) — one-line summary of codex fixes
- **CROSS-MODEL:** (only if both Claude and Codex reviews exist) — overlap analysis
- **UNRESOLVED:** total unresolved decisions across all reviews
- **VERDICT:** list reviews that are CLEAR (e.g., "CEO + ENG CLEARED — ready to implement").
If Eng Review is not CLEAR and not skipped globally, append "eng review required".
**Unresolved-decisions status (MANDATORY — never omitted; the report's final non-whitespace
line).** After VERDICT, end the report (content under the \`## GSTACK REVIEW REPORT\`
heading — a bold label, never a new \`## \` heading; exempt from the "omit when empty"
rule) with exactly one: the exact unbolded line \`NO UNRESOLVED DECISIONS\` (a bolded one
does NOT count), OR a \`**UNRESOLVED DECISIONS:**\` header + one bullet per open item
(last bullet = final line; add \`+ N unresolved from prior reviews\` only when N > 0).
This avoids double-counting: list THIS review's open items from context; for prior reviews
sum \`unresolved\` over the latest fresh row per skill (dashboard 7-day window) after you
DROP the current skill's row; emit the sentinel only when both are zero.
### Write to the plan file
**PLAN MODE EXCEPTION — ALWAYS RUN:** This writes to the plan file, which is the one
@@ -334,6 +334,17 @@ DX IMPLEMENTATION CHECKLIST
### Unresolved Decisions
If any AskUserQuestion goes unanswered, note here. Never silently default.
## Review Log
Persist after the DX Scorecard — the dashboard, the GSTACK REVIEW REPORT, and the EXIT
PLAN MODE GATE's "review log was called" check depend on it. **PLAN MODE EXCEPTION — ALWAYS RUN** (writes to `~/.gstack/`, not project files):
```bash
~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"plan-devex-review","timestamp":"TIMESTAMP","status":"STATUS","initial_score":N,"overall_score":N,"product_type":"PRODUCT_TYPE","tthw_current":"TTHW_CURRENT","tthw_target":"TTHW_TARGET","mode":"MODE","persona":"PERSONA","competitive_tier":"COMPETITIVE_TIER","unresolved":N,"commit":"COMMIT"}'
```
TIMESTAMP = current ISO 8601 datetime; STATUS = "clean" if score 8+ AND 0 unresolved, else "issues_open"; other fields from the DX Scorecard + Step 0; COMMIT = `git rev-parse --short HEAD`.
{{REVIEW_DASHBOARD}}
{{PLAN_FILE_REVIEW_REPORT}}
+9 -4
@@ -969,12 +969,17 @@ missing work — do NOT call ExitPlanMode:
In-body prose that mentions "outside voice", "codex findings", or similar
does NOT count — only the structured `## GSTACK REVIEW REPORT` section
satisfies this check.
3. Confirm the report contains: a Runs / Status / Findings table, a VERDICT
line, and absorbs CODEX / CROSS-MODEL / UNRESOLVED lines if applicable.
4. If a plan file is in context for this skill invocation: confirm
3. Confirm the report has a Runs / Status / Findings table and a VERDICT line
(CODEX / CROSS-MODEL absorbed if applicable).
4. Confirm the report's FINAL non-whitespace line is the unresolved-decisions
status: the exact unbolded `NO UNRESOLVED DECISIONS`, or a bullet of a final
`**UNRESOLVED DECISIONS:**` block. BLOCKING, no "if applicable" escape — a
bolded sentinel, any trailing CODEX/CROSS-MODEL/VERDICT/prose, or a missing
status each FAILS the gate.
5. If a plan file is in context for this skill invocation: confirm
`gstack-review-log` was called and `gstack-review-read` was run at least
once. If no plan file is in context (e.g. `/codex consult` against a
diff with no plan), this check short-circuits — checks 1-3 already
diff with no plan), this check short-circuits — checks 1-4 already
short-circuit when no plan file exists.
Failing this gate and calling ExitPlanMode anyway is a contract violation —
+12 -2
@@ -766,14 +766,24 @@ Produce this markdown table:
| DX Review | \`/plan-devex-review\` | Developer experience gaps | {runs} | {status} | {findings} |
\`\`\`
Below the table, add these lines (omit any that are empty/not applicable):
Below the table, add these lines. **CODEX** and **CROSS-MODEL** are optional (omit when
empty); **VERDICT** is always present:
- **CODEX:** (only if codex-review ran) — one-line summary of codex fixes
- **CROSS-MODEL:** (only if both Claude and Codex reviews exist) — overlap analysis
- **UNRESOLVED:** total unresolved decisions across all reviews
- **VERDICT:** list reviews that are CLEAR (e.g., "CEO + ENG CLEARED — ready to implement").
If Eng Review is not CLEAR and not skipped globally, append "eng review required".
**Unresolved-decisions status (MANDATORY — never omitted; the report's final non-whitespace
line).** After VERDICT, end the report (content under the \`## GSTACK REVIEW REPORT\`
heading — a bold label, never a new \`## \` heading; exempt from the "omit when empty"
rule) with exactly one: the exact unbolded line \`NO UNRESOLVED DECISIONS\` (a bolded one
does NOT count), OR a \`**UNRESOLVED DECISIONS:**\` header + one bullet per open item
(last bullet = final line; add \`+ N unresolved from prior reviews\` only when N > 0).
This avoids double-counting: list THIS review's open items from context; for prior reviews
sum \`unresolved\` over the latest fresh row per skill (dashboard 7-day window) after you
DROP the current skill's row; emit the sentinel only when both are zero.
### Write to the plan file
**PLAN MODE EXCEPTION — ALWAYS RUN:** This writes to the plan file, which is the one
+21 -6
@@ -119,14 +119,24 @@ Produce this markdown table:
| DX Review | \\\`/plan-devex-review\\\` | Developer experience gaps | {runs} | {status} | {findings} |
\\\`\\\`\\\`
Below the table, add these lines (omit any that are empty/not applicable):
Below the table, add these lines. **CODEX** and **CROSS-MODEL** are optional (omit when
empty); **VERDICT** is always present:
- **CODEX:** (only if codex-review ran) one-line summary of codex fixes
- **CROSS-MODEL:** (only if both Claude and Codex reviews exist) overlap analysis
- **UNRESOLVED:** total unresolved decisions across all reviews
- **VERDICT:** list reviews that are CLEAR (e.g., "CEO + ENG CLEARED — ready to implement").
If Eng Review is not CLEAR and not skipped globally, append "eng review required".
**Unresolved-decisions status (MANDATORY never omitted; the report's final non-whitespace
line).** After VERDICT, end the report (content under the \\\`## GSTACK REVIEW REPORT\\\`
heading a bold label, never a new \\\`## \\\` heading; exempt from the "omit when empty"
rule) with exactly one: the exact unbolded line \\\`NO UNRESOLVED DECISIONS\\\` (a bolded one
does NOT count), OR a \\\`**UNRESOLVED DECISIONS:**\\\` header + one bullet per open item
(last bullet = final line; add \\\`+ N unresolved from prior reviews\\\` only when N > 0).
This avoids double-counting: list THIS review's open items from context; for prior reviews
sum \\\`unresolved\\\` over the latest fresh row per skill (dashboard 7-day window) after you
DROP the current skill's row; emit the sentinel only when both are zero.
### Write to the plan file
**PLAN MODE EXCEPTION ALWAYS RUN:** This writes to the plan file, which is the one
@@ -169,12 +179,17 @@ missing work — do NOT call ExitPlanMode:
In-body prose that mentions "outside voice", "codex findings", or similar
does NOT count only the structured \`## GSTACK REVIEW REPORT\` section
satisfies this check.
3. Confirm the report contains: a Runs / Status / Findings table, a VERDICT
line, and absorbs CODEX / CROSS-MODEL / UNRESOLVED lines if applicable.
4. If a plan file is in context for this skill invocation: confirm
3. Confirm the report has a Runs / Status / Findings table and a VERDICT line
(CODEX / CROSS-MODEL absorbed if applicable).
4. Confirm the report's FINAL non-whitespace line is the unresolved-decisions
status: the exact unbolded \`NO UNRESOLVED DECISIONS\`, or a bullet of a final
\`**UNRESOLVED DECISIONS:**\` block. BLOCKING, no "if applicable" escape — a
bolded sentinel, any trailing CODEX/CROSS-MODEL/VERDICT/prose, or a missing
status each FAILS the gate.
5. If a plan file is in context for this skill invocation: confirm
\`gstack-review-log\` was called and \`gstack-review-read\` was run at least
once. If no plan file is in context (e.g. \`/codex consult\` against a
diff with no plan), this check short-circuits checks 1-3 already
diff with no plan), this check short-circuits checks 1-4 already
short-circuit when no plan file exists.
Failing this gate and calling ExitPlanMode anyway is a contract violation
+633
@@ -0,0 +1,633 @@
{
"tag": "v1.57.7.0",
"capturedAt": "2026-05-30T18:00:56.209Z",
"capturedFromCommit": "49035bdd",
"capturedFromBranch": "garrytan/plan-flag-unresolved-issues",
"totalSkills": 52,
"totalCorpusBytes": 3359373,
"estTotalCatalogTokens": 4116,
"topHeaviest": [
{
"skill": "ship",
"skillMdBytes": 174407,
"skillMdLines": 3137,
"estTokens": 43602,
"tmplBytes": 53240,
"descriptionLen": 291,
"hasGateEval": true,
"hasPeriodicEval": true
},
{
"skill": "plan-ceo-review",
"skillMdBytes": 144411,
"skillMdLines": 2349,
"estTokens": 36103,
"tmplBytes": 63461,
"descriptionLen": 794,
"hasGateEval": true,
"hasPeriodicEval": true
},
{
"skill": "office-hours",
"skillMdBytes": 123037,
"skillMdLines": 2200,
"estTokens": 30759,
"tmplBytes": 55534,
"descriptionLen": 860,
"hasGateEval": true,
"hasPeriodicEval": false
},
{
"skill": "plan-design-review",
"skillMdBytes": 118532,
"skillMdLines": 2073,
"estTokens": 29633,
"tmplBytes": 28717,
"descriptionLen": 218,
"hasGateEval": true,
"hasPeriodicEval": true
},
{
"skill": "plan-devex-review",
"skillMdBytes": 117907,
"skillMdLines": 2277,
"estTokens": 29477,
"tmplBytes": 35773,
"descriptionLen": 250,
"hasGateEval": true,
"hasPeriodicEval": true
},
{
"skill": "spec",
"skillMdBytes": 117382,
"skillMdLines": 2276,
"estTokens": 29346,
"tmplBytes": 30590,
"descriptionLen": 282,
"hasGateEval": true,
"hasPeriodicEval": false
},
{
"skill": "plan-eng-review",
"skillMdBytes": 114209,
"skillMdLines": 1906,
"estTokens": 28552,
"tmplBytes": 26302,
"descriptionLen": 231,
"hasGateEval": true,
"hasPeriodicEval": true
},
{
"skill": "design-review",
"skillMdBytes": 100149,
"skillMdLines": 1953,
"estTokens": 25037,
"tmplBytes": 11674,
"descriptionLen": 304,
"hasGateEval": true,
"hasPeriodicEval": false
},
{
"skill": "review",
"skillMdBytes": 99573,
"skillMdLines": 1787,
"estTokens": 24893,
"tmplBytes": 14099,
"descriptionLen": 205,
"hasGateEval": true,
"hasPeriodicEval": false
},
{
"skill": "land-and-deploy",
"skillMdBytes": 96379,
"skillMdLines": 1877,
"estTokens": 24095,
"tmplBytes": 48624,
"descriptionLen": 160,
"hasGateEval": true,
"hasPeriodicEval": false
}
],
"skills": {
"autoplan": {
"skill": "autoplan",
"skillMdBytes": 95365,
"skillMdLines": 1805,
"estTokens": 23841,
"tmplBytes": 45271,
"descriptionLen": 366,
"hasGateEval": true,
"hasPeriodicEval": true
},
"benchmark": {
"skill": "benchmark",
"skillMdBytes": 33646,
"skillMdLines": 750,
"estTokens": 8412,
"tmplBytes": 9378,
"descriptionLen": 213,
"hasGateEval": true,
"hasPeriodicEval": false
},
"benchmark-models": {
"skill": "benchmark-models",
"skillMdBytes": 29713,
"skillMdLines": 625,
"estTokens": 7428,
"tmplBytes": 6631,
"descriptionLen": 217,
"hasGateEval": false,
"hasPeriodicEval": false
},
"browse": {
"skill": "browse",
"skillMdBytes": 48531,
"skillMdLines": 933,
"estTokens": 12133,
"tmplBytes": 10805,
"descriptionLen": 181,
"hasGateEval": true,
"hasPeriodicEval": false
},
"canary": {
"skill": "canary",
"skillMdBytes": 51598,
"skillMdLines": 1011,
"estTokens": 12900,
"tmplBytes": 8033,
"descriptionLen": 180,
"hasGateEval": true,
"hasPeriodicEval": false
},
"careful": {
"skill": "careful",
"skillMdBytes": 2567,
"skillMdLines": 68,
"estTokens": 642,
"tmplBytes": 2435,
"descriptionLen": 315,
"hasGateEval": false,
"hasPeriodicEval": false
},
"codex": {
"skill": "codex",
"skillMdBytes": 85212,
"skillMdLines": 1555,
"estTokens": 21303,
"tmplBytes": 34143,
"descriptionLen": 187,
"hasGateEval": true,
"hasPeriodicEval": false
},
"context-restore": {
"skill": "context-restore",
"skillMdBytes": 45986,
"skillMdLines": 869,
"estTokens": 11497,
"tmplBytes": 5255,
"descriptionLen": 238,
"hasGateEval": true,
"hasPeriodicEval": false
},
"context-save": {
"skill": "context-save",
"skillMdBytes": 50183,
"skillMdLines": 987,
"estTokens": 12546,
"tmplBytes": 9293,
"descriptionLen": 168,
"hasGateEval": true,
"hasPeriodicEval": false
},
"cso": {
"skill": "cso",
"skillMdBytes": 83808,
"skillMdLines": 1498,
"estTokens": 20952,
"tmplBytes": 35646,
"descriptionLen": 196,
"hasGateEval": true,
"hasPeriodicEval": false
},
"design-consultation": {
"skill": "design-consultation",
"skillMdBytes": 84683,
"skillMdLines": 1598,
"estTokens": 21171,
"tmplBytes": 25899,
"descriptionLen": 888,
"hasGateEval": true,
"hasPeriodicEval": false
},
"design-html": {
"skill": "design-html",
"skillMdBytes": 71042,
"skillMdLines": 1470,
"estTokens": 17761,
"tmplBytes": 22567,
"descriptionLen": 233,
"hasGateEval": false,
"hasPeriodicEval": false
},
"design-review": {
"skill": "design-review",
"skillMdBytes": 100149,
"skillMdLines": 1953,
"estTokens": 25037,
"tmplBytes": 11674,
"descriptionLen": 304,
"hasGateEval": true,
"hasPeriodicEval": false
},
"design-shotgun": {
"skill": "design-shotgun",
"skillMdBytes": 67331,
"skillMdLines": 1332,
"estTokens": 16833,
"tmplBytes": 13331,
"descriptionLen": 786,
"hasGateEval": false,
"hasPeriodicEval": false
},
"devex-review": {
"skill": "devex-review",
"skillMdBytes": 69681,
"skillMdLines": 1264,
"estTokens": 17420,
"tmplBytes": 7984,
"descriptionLen": 201,
"hasGateEval": false,
"hasPeriodicEval": false
},
"document-generate": {
"skill": "document-generate",
"skillMdBytes": 58327,
"skillMdLines": 1211,
"estTokens": 14582,
"tmplBytes": 15939,
"descriptionLen": 334,
"hasGateEval": false,
"hasPeriodicEval": false
},
"document-release": {
"skill": "document-release",
"skillMdBytes": 64403,
"skillMdLines": 1281,
"estTokens": 16101,
"tmplBytes": 20974,
"descriptionLen": 192,
"hasGateEval": true,
"hasPeriodicEval": false
},
"freeze": {
"skill": "freeze",
"skillMdBytes": 3184,
"skillMdLines": 92,
"estTokens": 796,
"tmplBytes": 3038,
"descriptionLen": 503,
"hasGateEval": false,
"hasPeriodicEval": false
},
"gstack-upgrade": {
"skill": "gstack-upgrade",
"skillMdBytes": 10817,
"skillMdLines": 285,
"estTokens": 2704,
"tmplBytes": 10667,
"descriptionLen": 163,
"hasGateEval": true,
"hasPeriodicEval": false
},
"guard": {
"skill": "guard",
"skillMdBytes": 3314,
"skillMdLines": 91,
"estTokens": 829,
"tmplBytes": 3181,
"descriptionLen": 686,
"hasGateEval": false,
"hasPeriodicEval": false
},
"health": {
"skill": "health",
"skillMdBytes": 52409,
"skillMdLines": 1035,
"estTokens": 13102,
"tmplBytes": 11617,
"descriptionLen": 184,
"hasGateEval": true,
"hasPeriodicEval": false
},
"investigate": {
"skill": "investigate",
"skillMdBytes": 54902,
"skillMdLines": 1033,
"estTokens": 13726,
"tmplBytes": 11561,
"descriptionLen": 1379,
"hasGateEval": true,
"hasPeriodicEval": false
},
"ios-clean": {
"skill": "ios-clean",
"skillMdBytes": 45540,
"skillMdLines": 834,
"estTokens": 11385,
"tmplBytes": 3851,
"descriptionLen": 252,
"hasGateEval": false,
"hasPeriodicEval": false
},
"ios-design-review": {
"skill": "ios-design-review",
"skillMdBytes": 46124,
"skillMdLines": 836,
"estTokens": 11531,
"tmplBytes": 4417,
"descriptionLen": 209,
"hasGateEval": false,
"hasPeriodicEval": false
},
"ios-fix": {
"skill": "ios-fix",
"skillMdBytes": 45253,
"skillMdLines": 832,
"estTokens": 11313,
"tmplBytes": 3574,
"descriptionLen": 187,
"hasGateEval": false,
"hasPeriodicEval": false
},
"ios-qa": {
"skill": "ios-qa",
"skillMdBytes": 51764,
"skillMdLines": 952,
"estTokens": 12941,
"tmplBytes": 10090,
"descriptionLen": 223,
"hasGateEval": true,
"hasPeriodicEval": false
},
"ios-sync": {
"skill": "ios-sync",
"skillMdBytes": 45230,
"skillMdLines": 825,
"estTokens": 11308,
"tmplBytes": 3544,
"descriptionLen": 269,
"hasGateEval": false,
"hasPeriodicEval": false
},
"land-and-deploy": {
"skill": "land-and-deploy",
"skillMdBytes": 96379,
"skillMdLines": 1877,
"estTokens": 24095,
"tmplBytes": 48624,
"descriptionLen": 160,
"hasGateEval": true,
"hasPeriodicEval": false
},
"landing-report": {
"skill": "landing-report",
"skillMdBytes": 48478,
"skillMdLines": 895,
"estTokens": 12120,
"tmplBytes": 6806,
"descriptionLen": 195,
"hasGateEval": false,
"hasPeriodicEval": false
},
"learn": {
"skill": "learn",
"skillMdBytes": 46215,
"skillMdLines": 912,
"estTokens": 11554,
"tmplBytes": 5594,
"descriptionLen": 178,
"hasGateEval": true,
"hasPeriodicEval": false
},
"make-pdf": {
"skill": "make-pdf",
"skillMdBytes": 30270,
"skillMdLines": 673,
"estTokens": 7568,
"tmplBytes": 5546,
"descriptionLen": 177,
"hasGateEval": false,
"hasPeriodicEval": false
},
"office-hours": {
"skill": "office-hours",
"skillMdBytes": 123037,
"skillMdLines": 2200,
"estTokens": 30759,
"tmplBytes": 55534,
"descriptionLen": 860,
"hasGateEval": true,
"hasPeriodicEval": false
},
"open-gstack-browser": {
"skill": "open-gstack-browser",
"skillMdBytes": 50624,
"skillMdLines": 975,
"estTokens": 12656,
"tmplBytes": 7702,
"descriptionLen": 204,
"hasGateEval": false,
"hasPeriodicEval": false
},
"pair-agent": {
"skill": "pair-agent",
"skillMdBytes": 51432,
"skillMdLines": 1031,
"estTokens": 12858,
"tmplBytes": 8548,
"descriptionLen": 167,
"hasGateEval": false,
"hasPeriodicEval": false
},
"plan-ceo-review": {
"skill": "plan-ceo-review",
"skillMdBytes": 144411,
"skillMdLines": 2349,
"estTokens": 36103,
"tmplBytes": 63461,
"descriptionLen": 794,
"hasGateEval": true,
"hasPeriodicEval": true
},
"plan-design-review": {
"skill": "plan-design-review",
"skillMdBytes": 118532,
"skillMdLines": 2073,
"estTokens": 29633,
"tmplBytes": 28717,
"descriptionLen": 218,
"hasGateEval": true,
"hasPeriodicEval": true
},
"plan-devex-review": {
"skill": "plan-devex-review",
"skillMdBytes": 117907,
"skillMdLines": 2277,
"estTokens": 29477,
"tmplBytes": 35773,
"descriptionLen": 250,
"hasGateEval": true,
"hasPeriodicEval": true
},
"plan-eng-review": {
"skill": "plan-eng-review",
"skillMdBytes": 114209,
"skillMdLines": 1906,
"estTokens": 28552,
"tmplBytes": 26302,
"descriptionLen": 231,
"hasGateEval": true,
"hasPeriodicEval": true
},
"plan-tune": {
"skill": "plan-tune",
"skillMdBytes": 67548,
"skillMdLines": 1372,
"estTokens": 16887,
"tmplBytes": 26922,
"descriptionLen": 325,
"hasGateEval": true,
"hasPeriodicEval": false
},
"qa": {
"skill": "qa",
"skillMdBytes": 78356,
"skillMdLines": 1643,
"estTokens": 19589,
"tmplBytes": 12701,
"descriptionLen": 218,
"hasGateEval": true,
"hasPeriodicEval": false
},
"qa-only": {
"skill": "qa-only",
"skillMdBytes": 60914,
"skillMdLines": 1215,
"estTokens": 15229,
"tmplBytes": 3851,
"descriptionLen": 165,
"hasGateEval": true,
"hasPeriodicEval": false
},
"retro": {
"skill": "retro",
"skillMdBytes": 87382,
"skillMdLines": 1771,
"estTokens": 21846,
"tmplBytes": 42427,
"descriptionLen": 648,
"hasGateEval": true,
"hasPeriodicEval": false
},
"review": {
"skill": "review",
"skillMdBytes": 99573,
"skillMdLines": 1787,
"estTokens": 24893,
"tmplBytes": 14099,
"descriptionLen": 205,
"hasGateEval": true,
"hasPeriodicEval": false
},
"scrape": {
"skill": "scrape",
"skillMdBytes": 48134,
"skillMdLines": 908,
"estTokens": 12034,
"tmplBytes": 5220,
"descriptionLen": 167,
"hasGateEval": true,
"hasPeriodicEval": false
},
"setup-browser-cookies": {
"skill": "setup-browser-cookies",
"skillMdBytes": 26998,
"skillMdLines": 597,
"estTokens": 6750,
"tmplBytes": 2724,
"descriptionLen": 222,
"hasGateEval": false,
"hasPeriodicEval": false
},
"setup-deploy": {
"skill": "setup-deploy",
"skillMdBytes": 48420,
"skillMdLines": 940,
"estTokens": 12105,
"tmplBytes": 7780,
"descriptionLen": 197,
"hasGateEval": true,
"hasPeriodicEval": false
},
"setup-gbrain": {
"skill": "setup-gbrain",
"skillMdBytes": 85495,
"skillMdLines": 1794,
"estTokens": 21374,
"tmplBytes": 44851,
"descriptionLen": 323,
"hasGateEval": true,
"hasPeriodicEval": false
},
"ship": {
"skill": "ship",
"skillMdBytes": 174407,
"skillMdLines": 3137,
"estTokens": 43602,
"tmplBytes": 53240,
"descriptionLen": 291,
"hasGateEval": true,
"hasPeriodicEval": true
},
"skillify": {
"skill": "skillify",
"skillMdBytes": 58027,
"skillMdLines": 1189,
"estTokens": 14507,
"tmplBytes": 15107,
"descriptionLen": 233,
"hasGateEval": true,
"hasPeriodicEval": false
},
"spec": {
"skill": "spec",
"skillMdBytes": 117382,
"skillMdLines": 2276,
"estTokens": 29346,
"tmplBytes": 30590,
"descriptionLen": 282,
"hasGateEval": true,
"hasPeriodicEval": false
},
"sync-gbrain": {
"skill": "sync-gbrain",
"skillMdBytes": 62977,
"skillMdLines": 1191,
"estTokens": 15744,
"tmplBytes": 16077,
"descriptionLen": 299,
"hasGateEval": false,
"hasPeriodicEval": false
},
"unfreeze": {
"skill": "unfreeze",
"skillMdBytes": 1504,
"skillMdLines": 49,
"estTokens": 376,
"tmplBytes": 1386,
"descriptionLen": 199,
"hasGateEval": false,
"hasPeriodicEval": false
}
}
}
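The `estTokens` values in this fixture are consistent with `skillMdBytes` divided by four and rounded to the nearest integer. The capture script is not shown here, so the formula below is an inference from the numbers rather than the confirmed implementation, and `estimateTokens` is a hypothetical name.

```typescript
// Assumption: estTokens = round(skillMdBytes / 4). Inferred from the fixture
// values above, not taken from the capture script.
function estimateTokens(skillMdBytes: number): number {
  return Math.round(skillMdBytes / 4);
}

// [skill, skillMdBytes, estTokens] triples copied from the fixture.
const samples: Array<[string, number, number]> = [
  ["ship", 174407, 43602],
  ["office-hours", 123037, 30759],
  ["benchmark", 33646, 8412],
  ["careful", 2567, 642],
  ["unfreeze", 1504, 376],
];

for (const [skill, bytes, expected] of samples) {
  console.log(skill, estimateTokens(bytes) === expected);
}
```

The `office-hours` row (30759.25 rounds down) rules out a ceiling, and the `benchmark` row (8411.5 rounds up) rules out a floor.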
+59
@@ -3239,3 +3239,62 @@ describe('EXIT PLAN MODE GATE placement', () => {
    expect(codex).toContain('Failing this gate and calling ExitPlanMode anyway is a contract violation');
  });
});

describe('GSTACK REVIEW REPORT mandatory unresolved-decisions status', () => {
  // Report text rides in PLAN_FILE_REVIEW_REPORT → every report consumer gets it.
  // devex-review is a report consumer but NOT a gate consumer, so the two target
  // sets differ (CP5/CX5). Regression guard: a future token-cut that drops the
  // unresolved-status line again fails here. See plan-flag-unresolved-issues.
  const REPORT_CONSUMERS = [
    'plan-ceo-review',
    'plan-eng-review',
    'plan-design-review',
    'plan-devex-review',
    'codex',
    'devex-review',
  ];
  // Gate text rides in EXIT_PLAN_MODE_GATE (lives in SKILL.md, not sections).
  const GATE_SKILLS = [
    'plan-ceo-review',
    'plan-eng-review',
    'plan-design-review',
    'plan-devex-review',
    'codex',
  ];

  for (const skill of REPORT_CONSUMERS) {
    test(`${skill}: report mandates the unresolved-decisions status as final content`, () => {
      const content = readSkillUnion(skill);
      expect(content).toContain('NO UNRESOLVED DECISIONS');
      // The "never omit / always final" contract must be present, not just the phrase.
      expect(content).toContain('Unresolved-decisions status (MANDATORY');
      expect(content).toMatch(/never omitted/);
      // \s+ tolerates prose line-wraps within "final non-whitespace line".
      expect(content).toMatch(/final\s+non-whitespace\s+line/);
    });
  }

  for (const skill of GATE_SKILLS) {
    test(`${skill}: exit gate blocks unless the unresolved status is the final line`, () => {
      const md = fs.readFileSync(path.join(ROOT, skill, 'SKILL.md'), 'utf-8');
      // Gate check #4 — present, sentinel named, and explicitly blocking (no escape).
      expect(md).toContain('NO UNRESOLVED DECISIONS');
      expect(md).toContain('FINAL non-whitespace line is the unresolved-decisions');
      expect(md).toContain('FAILS the gate');
    });
  }

  test('scripts/resolvers/review.ts source carries the mandatory block + blocking gate', () => {
    const src = fs.readFileSync(path.join(ROOT, 'scripts', 'resolvers', 'review.ts'), 'utf-8');
    // Report resolver: mandatory, never-omitted, exact sentinel, anti-double-count algorithm.
    expect(src).toContain('Unresolved-decisions status (MANDATORY');
    expect(src).toContain('NO UNRESOLVED DECISIONS');
    expect(src).toContain('avoids double-counting');
    expect(src).toContain('DROP the current skill');
    // Gate resolver: the blocking final-line check with no "if applicable" escape.
    expect(src).toContain('FINAL non-whitespace line is the unresolved-decisions');
    expect(src).toContain('FAILS the gate');
    // The old soft wording must be gone from the gate.
    expect(src).not.toContain('absorbs CODEX / CROSS-MODEL / UNRESOLVED lines if applicable');
  });
});
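The anti-double-count aggregate that this test guards is described in the commit message as latest-fresh-row-per-skill minus the current run, within the dashboard's 7-day window. A rough sketch of that counting rule follows. The `ReviewRow` shape, its field names, and `priorUnresolvedCount` are hypothetical stand-ins for illustration and are not the resolver's actual data model.

```typescript
// Hypothetical row shape: the real review-log schema is not shown in this diff.
interface ReviewRow {
  skill: string;
  timestamp: string; // ISO 8601
  unresolved: number;
}

const WINDOW_MS = 7 * 24 * 60 * 60 * 1000; // dashboard 7-day window

function priorUnresolvedCount(rows: ReviewRow[], currentSkill: string, now: Date): number {
  const latest = new Map<string, ReviewRow>();
  for (const row of rows) {
    // Drop the current skill: its items are listed from context instead,
    // so counting its logged row too would double-count them.
    if (row.skill === currentSkill) continue;
    const ts = Date.parse(row.timestamp);
    if (Number.isNaN(ts) || now.getTime() - ts > WINDOW_MS) continue; // stale row
    const seen = latest.get(row.skill);
    if (!seen || Date.parse(seen.timestamp) < ts) latest.set(row.skill, row);
  }
  // Sum only the freshest surviving row per skill.
  return Array.from(latest.values()).reduce((sum, row) => sum + row.unresolved, 0);
}

const rows: ReviewRow[] = [
  { skill: "plan-eng-review", timestamp: "2026-05-28T10:00:00Z", unresolved: 2 },
  { skill: "plan-eng-review", timestamp: "2026-05-29T10:00:00Z", unresolved: 1 },
  { skill: "plan-ceo-review", timestamp: "2026-05-30T10:00:00Z", unresolved: 3 },
  { skill: "plan-design-review", timestamp: "2026-05-01T10:00:00Z", unresolved: 4 },
];
console.log(priorUnresolvedCount(rows, "plan-ceo-review", new Date("2026-05-30T18:00:00Z")));
```

With `plan-ceo-review` as the current run, only the newer `plan-eng-review` row contributes: the older eng row is superseded, the design row is outside the window, and the CEO row is dropped.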
+13 -9
@@ -2,15 +2,19 @@
* Cathedral parity suite, gate-tier (free, structural + content checks).
*
* Runs every PARITY_INVARIANTS check against the current SKILL.md output
* vs the v1.53.0.0 baseline. Failures get an actionable, per-skill report
* vs the v1.57.7.0 baseline. Failures get an actionable, per-skill report
* showing missing phrases, missing headings, and size ratios.
*
* Baseline rebased v1.44.1 → v1.53.0.0: the brain-aware-planning releases
* (v1.49–v1.52) plus the v1.53 redaction guard pushed five planning skills
* past the 5% ratchet on the frozen v1.44.1 anchor. Rebasing absorbs that
* legitimate growth at HEAD while keeping the per-skill 1.05 ratio so future
* bloat is still caught. Historical v1.44.1 / v1.46.0.0 / v1.47.0.0 baselines
* are retained in test/fixtures/ for the v1→v2 audit trail.
* Baseline rebased v1.53.0.0 → v1.57.7.0: the v1.54–v1.57 releases (ship/plan
* carving, carve-guards, AUQ prose fallback, the cross-session decision-log
* preamble) plus the mandatory unresolved-decisions status added to every
* GSTACK REVIEW REPORT pushed the three plan-review skills past the 5% ratchet
* on the v1.53 anchor even after exhaustive compression. The v1.57.7.0 baseline
* captures current UNION sizes (skeleton + sections/*.md, matching what the
* harness measures) so the per-skill 1.05 ratio still catches future bloat.
* Earlier rebase v1.44.1 → v1.53.0.0: brain-aware-planning (v1.49–v1.52) + the
* v1.53 redaction guard. Historical v1.44.1 / v1.46.0.0 / v1.47.0.0 / v1.53.0.0
* baselines are retained in test/fixtures/ for the audit trail.
*
* Periodic-tier LLM-judge parity (paid) lands in Phase B (v2.0.0.0)
* alongside the sections/ extraction. Plumbing is in parity-harness.ts.
@@ -23,9 +27,9 @@ import { runParityChecks, PARITY_INVARIANTS } from './helpers/parity-harness';
import type { ParityBaseline } from './helpers/capture-parity-baseline';
const REPO_ROOT = path.resolve(import.meta.dir, '..');
const BASELINE_PATH = path.join(REPO_ROOT, 'test', 'fixtures', 'parity-baseline-v1.53.0.0.json');
const BASELINE_PATH = path.join(REPO_ROOT, 'test', 'fixtures', 'parity-baseline-v1.57.7.0.json');
describe('parity suite vs v1.53.0.0 baseline (gate, free)', () => {
describe('parity suite vs v1.57.7.0 baseline (gate, free)', () => {
test('baseline exists', () => {
expect(fs.existsSync(BASELINE_PATH)).toBe(true);
});
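The per-skill 1.05 ratio named in the header comment can be sketched as a simple size comparison against the baseline. This is an assumption-laden illustration: `exceedsRatchet` is a hypothetical name, the real check lives in `parity-harness.ts` (not shown here), and whether the harness uses a strict `>` at the boundary is a guess.

```typescript
// Assumption: a skill fails the ratchet when its current size is more than
// 5% above the baseline size. The real harness may differ at the boundary.
const MAX_RATIO = 1.05;

function exceedsRatchet(currentBytes: number, baselineBytes: number): boolean {
  return currentBytes / baselineBytes > MAX_RATIO;
}

// Using the `ship` baseline from the v1.57.7.0 fixture (174407 bytes):
// the 5% ceiling is 174407 * 1.05 = 183127.35 bytes.
console.log(exceedsRatchet(183127, 174407)); // just under the ceiling
console.log(exceedsRatchet(183128, 174407)); // just over the ceiling
```

Rebasing the fixture resets `baselineBytes` to current sizes, which is why growth already absorbed at v1.57.7.0 no longer trips the check while further bloat still would.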
+19 -2
@@ -692,7 +692,7 @@ Read plan.md — that's the plan to review. This is a standalone plan document,
Proceed directly to the full review. Skip any AskUserQuestion calls; this is non-interactive.
Skip the preamble bash block, lake intro, telemetry, and contributor mode sections.
CRITICAL REQUIREMENT: plan.md IS the plan file for this review session. After completing your review, you MUST write a "## GSTACK REVIEW REPORT" section to the END of plan.md, exactly as described in the "Plan File Review Report" section of SKILL.md. If gstack-review-read is not available or returns NO_REVIEWS, write the placeholder table with all four review rows (CEO, Codex, Eng, Design). Use the Edit tool to append to plan.md; do NOT overwrite the existing plan content.
CRITICAL REQUIREMENT: plan.md IS the plan file for this review session. After completing your review, you MUST write a "## GSTACK REVIEW REPORT" section to the END of plan.md, exactly as described in the "Plan File Review Report" section of SKILL.md. If gstack-review-read is not available or returns NO_REVIEWS, write the placeholder table with all five review rows (CEO, Codex, Eng, Design, DX). The report MUST end with the mandatory unresolved-decisions status as its final line: the exact unbolded line NO UNRESOLVED DECISIONS when nothing is open, or a "**UNRESOLVED DECISIONS:**" block of bullets when items remain. Nothing may follow it. Use the Edit tool to append to plan.md; do NOT overwrite the existing plan content.
This review report at the bottom of the plan is the MOST IMPORTANT deliverable of this test.`,
workingDirectory: planDir,
@@ -741,7 +741,24 @@ This review report at the bottom of the plan is the MOST IMPORTANT deliverable o
expect(afterReport).toContain('Eng Review');
expect(afterReport).toContain('Design Review');
console.log('Plan review report found at bottom of plan.md');
// Mandatory unresolved-decisions status (plan-flag-unresolved-issues): the report's
// final non-whitespace line must be the unresolved status — the exact sentinel or a
// bullet of an UNRESOLVED DECISIONS block, with nothing (CODEX/CROSS-MODEL/VERDICT/
// prose) after it.
expect(afterReport).toContain('UNRESOLVED DECISIONS');
// Compute from afterReport (the report section to EOF), not the whole file, so a
// mid-file report surfaces the real trailing content in the failure message.
const nonEmpty = afterReport.split('\n').map(l => l.trim()).filter(l => l !== '');
const lastLine = nonEmpty[nonEmpty.length - 1];
const isSentinel = lastLine === 'NO UNRESOLVED DECISIONS';
const isUnresolvedBullet =
/^[-*]\s+/.test(lastLine) && !/VERDICT/i.test(lastLine) && afterReport.includes('UNRESOLVED DECISIONS:');
expect(
isSentinel || isUnresolvedBullet,
`report must end with the unresolved-decisions status; last line was: ${lastLine}`,
).toBe(true);
console.log('Plan review report found at bottom of plan.md (ends with unresolved status)');
}, 420_000);
});