Merge remote-tracking branch 'origin/main' into garrytan/codex-review-default-on

This commit is contained in:
Garry Tan
2026-06-09 11:21:32 -07:00
19 changed files with 945 additions and 51 deletions
+59
View File
@@ -1,5 +1,64 @@
# Changelog
## [1.57.7.0] - 2026-06-08
## **Every plan review now ends by telling you, in one line, whether anything is still unresolved.**
## **The GSTACK REVIEW REPORT closes with the open decisions, or "NO UNRESOLVED DECISIONS" in plain sight, before you approve.**
When a plan-review skill (/plan-ceo-review, /plan-eng-review, /plan-design-review,
/plan-devex-review, and /codex) finishes and hands you the plan to approve, its report
now ends with a mandatory unresolved-decisions verdict. If decisions are still open, it
lists each one and what breaks if you ship it deferred. If nothing is open, it prints the
exact line NO UNRESOLVED DECISIONS. A token-reduction pass had made this line optional, so
a clean plan and a plan hiding an open question rendered the same. Now the line is never
omitted, it is always the last thing you read before the approval prompt, and the approval
gate refuses to let the plan through without it.
### What changed, before and after
| At plan-approval time | Before | After |
|---|---|---|
| Clean plan | usually no unresolved line | `NO UNRESOLVED DECISIONS` as the final line |
| Plan with open decisions | unresolved line optional, often dropped | `**UNRESOLVED DECISIONS:**` + one bullet per open item |
| Approval gate (ExitPlanMode) | checked the line "if applicable" | blocks unless the unresolved status is the final line |
| /plan-devex-review review log | never written, gate uncheckable | written, so the dashboard and report see its data |
The unresolved count across reviews is computed without double-counting the review that
just ran, using the same 7-day freshness window as the Review Readiness Dashboard.
### What this means for you
Every approve-plan moment now carries an explicit verdict on open questions, so a missed
ambiguity cannot slip through looking like a clean plan. If you run the plan-review skills
or /autoplan, you will see the unresolved status as the closing line of every report.
Nothing to configure. Upgrade and your next plan review shows it.
### Itemized changes
#### Added
- **Mandatory unresolved-decisions status in the GSTACK REVIEW REPORT.** Generated into
all six report consumers (/plan-ceo-review, /plan-eng-review, /plan-design-review,
/plan-devex-review, /codex, /devex-review) from `scripts/resolvers/review.ts`. The report
always ends with either the exact unbolded sentinel `NO UNRESOLVED DECISIONS` or a
`**UNRESOLVED DECISIONS:**` bullet block listing each open item; never omitted, always
the final line.
- **Blocking approval gate.** The EXIT PLAN MODE GATE now refuses ExitPlanMode unless the
report's final non-whitespace line is the unresolved status (no "if applicable" escape).
- Static and E2E tests pinning the mandatory status across every report consumer and
gate-bearing skill, so a future compression pass cannot silently drop it again.
#### Fixed
- **/plan-devex-review never logged a review entry.** It carried the approval gate but
never called `gstack-review-log`, so the gate's "review log was called" check was
structurally unsatisfiable and its data was invisible to the Review Readiness Dashboard
and the report. It now logs with the correct timestamp and DX fields.
#### For contributors
- Rebased the parity-suite size baseline v1.53.0.0 to v1.57.7.0 (captures current union
sizes; keeps the per-skill 1.05 ratio so future bloat is still caught). Regenerated the
three ship golden fixtures left stale by #1909. The frozen v1.44.1 integrity anchor and
the v1.47 size-budget baseline are untouched.
## [1.57.6.0] - 2026-06-07
## **Eight community-filed bugs fixed in one wave, four of them security guards that were quietly failing open.**
+1 -1
View File
@@ -1 +1 @@
1.57.6.0
1.57.7.0
+21 -6
View File
@@ -1112,14 +1112,24 @@ Produce this markdown table:
| DX Review | \`/plan-devex-review\` | Developer experience gaps | {runs} | {status} | {findings} |
\`\`\`
Below the table, add these lines (omit any that are empty/not applicable):
Below the table, add these lines. **CODEX** and **CROSS-MODEL** are optional (omit when
empty); **VERDICT** is always present:
- **CODEX:** (only if codex-review ran) — one-line summary of codex fixes
- **CROSS-MODEL:** (only if both Claude and Codex reviews exist) — overlap analysis
- **UNRESOLVED:** total unresolved decisions across all reviews
- **VERDICT:** list reviews that are CLEAR (e.g., "CEO + ENG CLEARED — ready to implement").
If Eng Review is not CLEAR and not skipped globally, append "eng review required".
**Unresolved-decisions status (MANDATORY — never omitted; the report's final non-whitespace
line).** After VERDICT, end the report (content under the \`## GSTACK REVIEW REPORT\`
heading — a bold label, never a new \`## \` heading; exempt from the "omit when empty"
rule) with exactly one: the exact unbolded line \`NO UNRESOLVED DECISIONS\` (a bolded one
does NOT count), OR a \`**UNRESOLVED DECISIONS:**\` header + one bullet per open item
(last bullet = final line; add \`+ N unresolved from prior reviews\` only when N > 0).
This avoids double-counting: list THIS review's open items from context; for prior reviews
sum \`unresolved\` over the latest fresh row per skill (dashboard 7-day window) after you
DROP the current skill's row; emit the sentinel only when both are zero.
### Write to the plan file
**PLAN MODE EXCEPTION — ALWAYS RUN:** This writes to the plan file, which is the one
@@ -1160,12 +1170,17 @@ missing work — do NOT call ExitPlanMode:
In-body prose that mentions "outside voice", "codex findings", or similar
does NOT count — only the structured `## GSTACK REVIEW REPORT` section
satisfies this check.
3. Confirm the report contains: a Runs / Status / Findings table, a VERDICT
line, and absorbs CODEX / CROSS-MODEL / UNRESOLVED lines if applicable.
4. If a plan file is in context for this skill invocation: confirm
3. Confirm the report has a Runs / Status / Findings table and a VERDICT line
(CODEX / CROSS-MODEL absorbed if applicable).
4. Confirm the report's FINAL non-whitespace line is the unresolved-decisions
status: the exact unbolded `NO UNRESOLVED DECISIONS`, or a bullet of a final
`**UNRESOLVED DECISIONS:**` block. BLOCKING, no "if applicable" escape — a
bolded sentinel, any trailing CODEX/CROSS-MODEL/VERDICT/prose, or a missing
status each FAILS the gate.
5. If a plan file is in context for this skill invocation: confirm
`gstack-review-log` was called and `gstack-review-read` was run at least
once. If no plan file is in context (e.g. `/codex consult` against a
diff with no plan), this check short-circuits — checks 1-3 already
diff with no plan), this check short-circuits — checks 1-4 already
short-circuit when no plan file exists.
Failing this gate and calling ExitPlanMode anyway is a contract violation —
+12 -2
View File
@@ -1176,14 +1176,24 @@ Produce this markdown table:
| DX Review | \`/plan-devex-review\` | Developer experience gaps | {runs} | {status} | {findings} |
\`\`\`
Below the table, add these lines (omit any that are empty/not applicable):
Below the table, add these lines. **CODEX** and **CROSS-MODEL** are optional (omit when
empty); **VERDICT** is always present:
- **CODEX:** (only if codex-review ran) — one-line summary of codex fixes
- **CROSS-MODEL:** (only if both Claude and Codex reviews exist) — overlap analysis
- **UNRESOLVED:** total unresolved decisions across all reviews
- **VERDICT:** list reviews that are CLEAR (e.g., "CEO + ENG CLEARED — ready to implement").
If Eng Review is not CLEAR and not skipped globally, append "eng review required".
**Unresolved-decisions status (MANDATORY — never omitted; the report's final non-whitespace
line).** After VERDICT, end the report (content under the \`## GSTACK REVIEW REPORT\`
heading — a bold label, never a new \`## \` heading; exempt from the "omit when empty"
rule) with exactly one: the exact unbolded line \`NO UNRESOLVED DECISIONS\` (a bolded one
does NOT count), OR a \`**UNRESOLVED DECISIONS:**\` header + one bullet per open item
(last bullet = final line; add \`+ N unresolved from prior reviews\` only when N > 0).
This avoids double-counting: list THIS review's open items from context; for prior reviews
sum \`unresolved\` over the latest fresh row per skill (dashboard 7-day window) after you
DROP the current skill's row; emit the sentinel only when both are zero.
### Write to the plan file
**PLAN MODE EXCEPTION — ALWAYS RUN:** This writes to the plan file, which is the one
+1 -1
View File
@@ -1,6 +1,6 @@
{
"name": "gstack",
"version": "1.57.6.0",
"version": "1.57.7.0",
"description": "Garry's Stack — Claude Code skills + fast headless browser. One repo, one install, entire AI engineering workflow.",
"license": "MIT",
"type": "module",
+9 -4
View File
@@ -1413,12 +1413,17 @@ missing work — do NOT call ExitPlanMode:
In-body prose that mentions "outside voice", "codex findings", or similar
does NOT count — only the structured `## GSTACK REVIEW REPORT` section
satisfies this check.
3. Confirm the report contains: a Runs / Status / Findings table, a VERDICT
line, and absorbs CODEX / CROSS-MODEL / UNRESOLVED lines if applicable.
4. If a plan file is in context for this skill invocation: confirm
3. Confirm the report has a Runs / Status / Findings table and a VERDICT line
(CODEX / CROSS-MODEL absorbed if applicable).
4. Confirm the report's FINAL non-whitespace line is the unresolved-decisions
status: the exact unbolded `NO UNRESOLVED DECISIONS`, or a bullet of a final
`**UNRESOLVED DECISIONS:**` block. BLOCKING, no "if applicable" escape — a
bolded sentinel, any trailing CODEX/CROSS-MODEL/VERDICT/prose, or a missing
status each FAILS the gate.
5. If a plan file is in context for this skill invocation: confirm
`gstack-review-log` was called and `gstack-review-read` was run at least
once. If no plan file is in context (e.g. `/codex consult` against a
diff with no plan), this check short-circuits — checks 1-3 already
diff with no plan), this check short-circuits — checks 1-4 already
short-circuit when no plan file exists.
Failing this gate and calling ExitPlanMode anyway is a contract violation —
+12 -2
View File
@@ -722,14 +722,24 @@ Produce this markdown table:
| DX Review | \`/plan-devex-review\` | Developer experience gaps | {runs} | {status} | {findings} |
\`\`\`
Below the table, add these lines (omit any that are empty/not applicable):
Below the table, add these lines. **CODEX** and **CROSS-MODEL** are optional (omit when
empty); **VERDICT** is always present:
- **CODEX:** (only if codex-review ran) — one-line summary of codex fixes
- **CROSS-MODEL:** (only if both Claude and Codex reviews exist) — overlap analysis
- **UNRESOLVED:** total unresolved decisions across all reviews
- **VERDICT:** list reviews that are CLEAR (e.g., "CEO + ENG CLEARED — ready to implement").
If Eng Review is not CLEAR and not skipped globally, append "eng review required".
**Unresolved-decisions status (MANDATORY — never omitted; the report's final non-whitespace
line).** After VERDICT, end the report (content under the \`## GSTACK REVIEW REPORT\`
heading — a bold label, never a new \`## \` heading; exempt from the "omit when empty"
rule) with exactly one: the exact unbolded line \`NO UNRESOLVED DECISIONS\` (a bolded one
does NOT count), OR a \`**UNRESOLVED DECISIONS:**\` header + one bullet per open item
(last bullet = final line; add \`+ N unresolved from prior reviews\` only when N > 0).
This avoids double-counting: list THIS review's open items from context; for prior reviews
sum \`unresolved\` over the latest fresh row per skill (dashboard 7-day window) after you
DROP the current skill's row; emit the sentinel only when both are zero.
### Write to the plan file
**PLAN MODE EXCEPTION — ALWAYS RUN:** This writes to the plan file, which is the one
+9 -4
View File
@@ -1434,12 +1434,17 @@ missing work — do NOT call ExitPlanMode:
In-body prose that mentions "outside voice", "codex findings", or similar
does NOT count — only the structured `## GSTACK REVIEW REPORT` section
satisfies this check.
3. Confirm the report contains: a Runs / Status / Findings table, a VERDICT
line, and absorbs CODEX / CROSS-MODEL / UNRESOLVED lines if applicable.
4. If a plan file is in context for this skill invocation: confirm
3. Confirm the report has a Runs / Status / Findings table and a VERDICT line
(CODEX / CROSS-MODEL absorbed if applicable).
4. Confirm the report's FINAL non-whitespace line is the unresolved-decisions
status: the exact unbolded `NO UNRESOLVED DECISIONS`, or a bullet of a final
`**UNRESOLVED DECISIONS:**` block. BLOCKING, no "if applicable" escape — a
bolded sentinel, any trailing CODEX/CROSS-MODEL/VERDICT/prose, or a missing
status each FAILS the gate.
5. If a plan file is in context for this skill invocation: confirm
`gstack-review-log` was called and `gstack-review-read` was run at least
once. If no plan file is in context (e.g. `/codex consult` against a
diff with no plan), this check short-circuits — checks 1-3 already
diff with no plan), this check short-circuits — checks 1-4 already
short-circuit when no plan file exists.
Failing this gate and calling ExitPlanMode anyway is a contract violation —
+12 -2
View File
@@ -458,14 +458,24 @@ Produce this markdown table:
| DX Review | \`/plan-devex-review\` | Developer experience gaps | {runs} | {status} | {findings} |
\`\`\`
Below the table, add these lines (omit any that are empty/not applicable):
Below the table, add these lines. **CODEX** and **CROSS-MODEL** are optional (omit when
empty); **VERDICT** is always present:
- **CODEX:** (only if codex-review ran) — one-line summary of codex fixes
- **CROSS-MODEL:** (only if both Claude and Codex reviews exist) — overlap analysis
- **UNRESOLVED:** total unresolved decisions across all reviews
- **VERDICT:** list reviews that are CLEAR (e.g., "CEO + ENG CLEARED — ready to implement").
If Eng Review is not CLEAR and not skipped globally, append "eng review required".
**Unresolved-decisions status (MANDATORY — never omitted; the report's final non-whitespace
line).** After VERDICT, end the report (content under the \`## GSTACK REVIEW REPORT\`
heading — a bold label, never a new \`## \` heading; exempt from the "omit when empty"
rule) with exactly one: the exact unbolded line \`NO UNRESOLVED DECISIONS\` (a bolded one
does NOT count), OR a \`**UNRESOLVED DECISIONS:**\` header + one bullet per open item
(last bullet = final line; add \`+ N unresolved from prior reviews\` only when N > 0).
This avoids double-counting: list THIS review's open items from context; for prior reviews
sum \`unresolved\` over the latest fresh row per skill (dashboard 7-day window) after you
DROP the current skill's row; emit the sentinel only when both are zero.
### Write to the plan file
**PLAN MODE EXCEPTION — ALWAYS RUN:** This writes to the plan file, which is the one
+9 -4
View File
@@ -1397,12 +1397,17 @@ missing work — do NOT call ExitPlanMode:
In-body prose that mentions "outside voice", "codex findings", or similar
does NOT count — only the structured `## GSTACK REVIEW REPORT` section
satisfies this check.
3. Confirm the report contains: a Runs / Status / Findings table, a VERDICT
line, and absorbs CODEX / CROSS-MODEL / UNRESOLVED lines if applicable.
4. If a plan file is in context for this skill invocation: confirm
3. Confirm the report has a Runs / Status / Findings table and a VERDICT line
(CODEX / CROSS-MODEL absorbed if applicable).
4. Confirm the report's FINAL non-whitespace line is the unresolved-decisions
status: the exact unbolded `NO UNRESOLVED DECISIONS`, or a bullet of a final
`**UNRESOLVED DECISIONS:**` block. BLOCKING, no "if applicable" escape — a
bolded sentinel, any trailing CODEX/CROSS-MODEL/VERDICT/prose, or a missing
status each FAILS the gate.
5. If a plan file is in context for this skill invocation: confirm
`gstack-review-log` was called and `gstack-review-read` was run at least
once. If no plan file is in context (e.g. `/codex consult` against a
diff with no plan), this check short-circuits — checks 1-3 already
diff with no plan), this check short-circuits — checks 1-4 already
short-circuit when no plan file exists.
Failing this gate and calling ExitPlanMode anyway is a contract violation —
+23 -2
View File
@@ -586,6 +586,17 @@ this run (an empty file means "ran, no findings" — distinct from "didn't run")
### Unresolved Decisions
If any AskUserQuestion goes unanswered, note here. Never silently default.
## Review Log
Persist after the DX Scorecard — the dashboard, the GSTACK REVIEW REPORT, and the EXIT
PLAN MODE GATE's "review log was called" check depend on it. **PLAN MODE EXCEPTION — ALWAYS RUN** (writes to `~/.gstack/`, not project files):
```bash
~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"plan-devex-review","timestamp":"TIMESTAMP","status":"STATUS","initial_score":N,"overall_score":N,"product_type":"PRODUCT_TYPE","tthw_current":"TTHW_CURRENT","tthw_target":"TTHW_TARGET","mode":"MODE","persona":"PERSONA","competitive_tier":"COMPETITIVE_TIER","unresolved":N,"commit":"COMMIT"}'
```
TIMESTAMP = current ISO 8601 datetime; STATUS = "clean" if score 8+ AND 0 unresolved, else "issues_open"; other fields from the DX Scorecard + Step 0; COMMIT = `git rev-parse --short HEAD`.
## Review Readiness Dashboard
After completing the review, read the review log and config to display the dashboard.
@@ -685,14 +696,24 @@ Produce this markdown table:
| DX Review | \`/plan-devex-review\` | Developer experience gaps | {runs} | {status} | {findings} |
\`\`\`
Below the table, add these lines (omit any that are empty/not applicable):
Below the table, add these lines. **CODEX** and **CROSS-MODEL** are optional (omit when
empty); **VERDICT** is always present:
- **CODEX:** (only if codex-review ran) — one-line summary of codex fixes
- **CROSS-MODEL:** (only if both Claude and Codex reviews exist) — overlap analysis
- **UNRESOLVED:** total unresolved decisions across all reviews
- **VERDICT:** list reviews that are CLEAR (e.g., "CEO + ENG CLEARED — ready to implement").
If Eng Review is not CLEAR and not skipped globally, append "eng review required".
**Unresolved-decisions status (MANDATORY — never omitted; the report's final non-whitespace
line).** After VERDICT, end the report (content under the \`## GSTACK REVIEW REPORT\`
heading — a bold label, never a new \`## \` heading; exempt from the "omit when empty"
rule) with exactly one: the exact unbolded line \`NO UNRESOLVED DECISIONS\` (a bolded one
does NOT count), OR a \`**UNRESOLVED DECISIONS:**\` header + one bullet per open item
(last bullet = final line; add \`+ N unresolved from prior reviews\` only when N > 0).
This avoids double-counting: list THIS review's open items from context; for prior reviews
sum \`unresolved\` over the latest fresh row per skill (dashboard 7-day window) after you
DROP the current skill's row; emit the sentinel only when both are zero.
### Write to the plan file
**PLAN MODE EXCEPTION — ALWAYS RUN:** This writes to the plan file, which is the one
@@ -334,6 +334,17 @@ DX IMPLEMENTATION CHECKLIST
### Unresolved Decisions
If any AskUserQuestion goes unanswered, note here. Never silently default.
## Review Log
Persist after the DX Scorecard — the dashboard, the GSTACK REVIEW REPORT, and the EXIT
PLAN MODE GATE's "review log was called" check depend on it. **PLAN MODE EXCEPTION — ALWAYS RUN** (writes to `~/.gstack/`, not project files):
```bash
~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"plan-devex-review","timestamp":"TIMESTAMP","status":"STATUS","initial_score":N,"overall_score":N,"product_type":"PRODUCT_TYPE","tthw_current":"TTHW_CURRENT","tthw_target":"TTHW_TARGET","mode":"MODE","persona":"PERSONA","competitive_tier":"COMPETITIVE_TIER","unresolved":N,"commit":"COMMIT"}'
```
TIMESTAMP = current ISO 8601 datetime; STATUS = "clean" if score 8+ AND 0 unresolved, else "issues_open"; other fields from the DX Scorecard + Step 0; COMMIT = `git rev-parse --short HEAD`.
{{REVIEW_DASHBOARD}}
{{PLAN_FILE_REVIEW_REPORT}}
+9 -4
View File
@@ -969,12 +969,17 @@ missing work — do NOT call ExitPlanMode:
In-body prose that mentions "outside voice", "codex findings", or similar
does NOT count — only the structured `## GSTACK REVIEW REPORT` section
satisfies this check.
3. Confirm the report contains: a Runs / Status / Findings table, a VERDICT
line, and absorbs CODEX / CROSS-MODEL / UNRESOLVED lines if applicable.
4. If a plan file is in context for this skill invocation: confirm
3. Confirm the report has a Runs / Status / Findings table and a VERDICT line
(CODEX / CROSS-MODEL absorbed if applicable).
4. Confirm the report's FINAL non-whitespace line is the unresolved-decisions
status: the exact unbolded `NO UNRESOLVED DECISIONS`, or a bullet of a final
`**UNRESOLVED DECISIONS:**` block. BLOCKING, no "if applicable" escape — a
bolded sentinel, any trailing CODEX/CROSS-MODEL/VERDICT/prose, or a missing
status each FAILS the gate.
5. If a plan file is in context for this skill invocation: confirm
`gstack-review-log` was called and `gstack-review-read` was run at least
once. If no plan file is in context (e.g. `/codex consult` against a
diff with no plan), this check short-circuits — checks 1-3 already
diff with no plan), this check short-circuits — checks 1-4 already
short-circuit when no plan file exists.
Failing this gate and calling ExitPlanMode anyway is a contract violation —
+12 -2
View File
@@ -776,14 +776,24 @@ Produce this markdown table:
| DX Review | \`/plan-devex-review\` | Developer experience gaps | {runs} | {status} | {findings} |
\`\`\`
Below the table, add these lines (omit any that are empty/not applicable):
Below the table, add these lines. **CODEX** and **CROSS-MODEL** are optional (omit when
empty); **VERDICT** is always present:
- **CODEX:** (only if codex-review ran) — one-line summary of codex fixes
- **CROSS-MODEL:** (only if both Claude and Codex reviews exist) — overlap analysis
- **UNRESOLVED:** total unresolved decisions across all reviews
- **VERDICT:** list reviews that are CLEAR (e.g., "CEO + ENG CLEARED — ready to implement").
If Eng Review is not CLEAR and not skipped globally, append "eng review required".
**Unresolved-decisions status (MANDATORY — never omitted; the report's final non-whitespace
line).** After VERDICT, end the report (content under the \`## GSTACK REVIEW REPORT\`
heading — a bold label, never a new \`## \` heading; exempt from the "omit when empty"
rule) with exactly one: the exact unbolded line \`NO UNRESOLVED DECISIONS\` (a bolded one
does NOT count), OR a \`**UNRESOLVED DECISIONS:**\` header + one bullet per open item
(last bullet = final line; add \`+ N unresolved from prior reviews\` only when N > 0).
This avoids double-counting: list THIS review's open items from context; for prior reviews
sum \`unresolved\` over the latest fresh row per skill (dashboard 7-day window) after you
DROP the current skill's row; emit the sentinel only when both are zero.
### Write to the plan file
**PLAN MODE EXCEPTION — ALWAYS RUN:** This writes to the plan file, which is the one
+21 -6
View File
@@ -120,14 +120,24 @@ Produce this markdown table:
| DX Review | \\\`/plan-devex-review\\\` | Developer experience gaps | {runs} | {status} | {findings} |
\\\`\\\`\\\`
Below the table, add these lines (omit any that are empty/not applicable):
Below the table, add these lines. **CODEX** and **CROSS-MODEL** are optional (omit when
empty); **VERDICT** is always present:
- **CODEX:** (only if codex-review ran) one-line summary of codex fixes
- **CROSS-MODEL:** (only if both Claude and Codex reviews exist) overlap analysis
- **UNRESOLVED:** total unresolved decisions across all reviews
- **VERDICT:** list reviews that are CLEAR (e.g., "CEO + ENG CLEARED — ready to implement").
If Eng Review is not CLEAR and not skipped globally, append "eng review required".
**Unresolved-decisions status (MANDATORY never omitted; the report's final non-whitespace
line).** After VERDICT, end the report (content under the \\\`## GSTACK REVIEW REPORT\\\`
heading a bold label, never a new \\\`## \\\` heading; exempt from the "omit when empty"
rule) with exactly one: the exact unbolded line \\\`NO UNRESOLVED DECISIONS\\\` (a bolded one
does NOT count), OR a \\\`**UNRESOLVED DECISIONS:**\\\` header + one bullet per open item
(last bullet = final line; add \\\`+ N unresolved from prior reviews\\\` only when N > 0).
This avoids double-counting: list THIS review's open items from context; for prior reviews
sum \\\`unresolved\\\` over the latest fresh row per skill (dashboard 7-day window) after you
DROP the current skill's row; emit the sentinel only when both are zero.
### Write to the plan file
**PLAN MODE EXCEPTION ALWAYS RUN:** This writes to the plan file, which is the one
@@ -170,12 +180,17 @@ missing work — do NOT call ExitPlanMode:
In-body prose that mentions "outside voice", "codex findings", or similar
does NOT count only the structured \`## GSTACK REVIEW REPORT\` section
satisfies this check.
3. Confirm the report contains: a Runs / Status / Findings table, a VERDICT
line, and absorbs CODEX / CROSS-MODEL / UNRESOLVED lines if applicable.
4. If a plan file is in context for this skill invocation: confirm
3. Confirm the report has a Runs / Status / Findings table and a VERDICT line
(CODEX / CROSS-MODEL absorbed if applicable).
4. Confirm the report's FINAL non-whitespace line is the unresolved-decisions
status: the exact unbolded \`NO UNRESOLVED DECISIONS\`, or a bullet of a final
\`**UNRESOLVED DECISIONS:**\` block. BLOCKING, no "if applicable" escape — a
bolded sentinel, any trailing CODEX/CROSS-MODEL/VERDICT/prose, or a missing
status each FAILS the gate.
5. If a plan file is in context for this skill invocation: confirm
\`gstack-review-log\` was called and \`gstack-review-read\` was run at least
once. If no plan file is in context (e.g. \`/codex consult\` against a
diff with no plan), this check short-circuits checks 1-3 already
diff with no plan), this check short-circuits checks 1-4 already
short-circuit when no plan file exists.
Failing this gate and calling ExitPlanMode anyway is a contract violation
+633
View File
@@ -0,0 +1,633 @@
{
"tag": "v1.57.7.0",
"capturedAt": "2026-05-30T18:00:56.209Z",
"capturedFromCommit": "49035bdd",
"capturedFromBranch": "garrytan/plan-flag-unresolved-issues",
"totalSkills": 52,
"totalCorpusBytes": 3359373,
"estTotalCatalogTokens": 4116,
"topHeaviest": [
{
"skill": "ship",
"skillMdBytes": 174407,
"skillMdLines": 3137,
"estTokens": 43602,
"tmplBytes": 53240,
"descriptionLen": 291,
"hasGateEval": true,
"hasPeriodicEval": true
},
{
"skill": "plan-ceo-review",
"skillMdBytes": 144411,
"skillMdLines": 2349,
"estTokens": 36103,
"tmplBytes": 63461,
"descriptionLen": 794,
"hasGateEval": true,
"hasPeriodicEval": true
},
{
"skill": "office-hours",
"skillMdBytes": 123037,
"skillMdLines": 2200,
"estTokens": 30759,
"tmplBytes": 55534,
"descriptionLen": 860,
"hasGateEval": true,
"hasPeriodicEval": false
},
{
"skill": "plan-design-review",
"skillMdBytes": 118532,
"skillMdLines": 2073,
"estTokens": 29633,
"tmplBytes": 28717,
"descriptionLen": 218,
"hasGateEval": true,
"hasPeriodicEval": true
},
{
"skill": "plan-devex-review",
"skillMdBytes": 117907,
"skillMdLines": 2277,
"estTokens": 29477,
"tmplBytes": 35773,
"descriptionLen": 250,
"hasGateEval": true,
"hasPeriodicEval": true
},
{
"skill": "spec",
"skillMdBytes": 117382,
"skillMdLines": 2276,
"estTokens": 29346,
"tmplBytes": 30590,
"descriptionLen": 282,
"hasGateEval": true,
"hasPeriodicEval": false
},
{
"skill": "plan-eng-review",
"skillMdBytes": 114209,
"skillMdLines": 1906,
"estTokens": 28552,
"tmplBytes": 26302,
"descriptionLen": 231,
"hasGateEval": true,
"hasPeriodicEval": true
},
{
"skill": "design-review",
"skillMdBytes": 100149,
"skillMdLines": 1953,
"estTokens": 25037,
"tmplBytes": 11674,
"descriptionLen": 304,
"hasGateEval": true,
"hasPeriodicEval": false
},
{
"skill": "review",
"skillMdBytes": 99573,
"skillMdLines": 1787,
"estTokens": 24893,
"tmplBytes": 14099,
"descriptionLen": 205,
"hasGateEval": true,
"hasPeriodicEval": false
},
{
"skill": "land-and-deploy",
"skillMdBytes": 96379,
"skillMdLines": 1877,
"estTokens": 24095,
"tmplBytes": 48624,
"descriptionLen": 160,
"hasGateEval": true,
"hasPeriodicEval": false
}
],
"skills": {
"autoplan": {
"skill": "autoplan",
"skillMdBytes": 95365,
"skillMdLines": 1805,
"estTokens": 23841,
"tmplBytes": 45271,
"descriptionLen": 366,
"hasGateEval": true,
"hasPeriodicEval": true
},
"benchmark": {
"skill": "benchmark",
"skillMdBytes": 33646,
"skillMdLines": 750,
"estTokens": 8412,
"tmplBytes": 9378,
"descriptionLen": 213,
"hasGateEval": true,
"hasPeriodicEval": false
},
"benchmark-models": {
"skill": "benchmark-models",
"skillMdBytes": 29713,
"skillMdLines": 625,
"estTokens": 7428,
"tmplBytes": 6631,
"descriptionLen": 217,
"hasGateEval": false,
"hasPeriodicEval": false
},
"browse": {
"skill": "browse",
"skillMdBytes": 48531,
"skillMdLines": 933,
"estTokens": 12133,
"tmplBytes": 10805,
"descriptionLen": 181,
"hasGateEval": true,
"hasPeriodicEval": false
},
"canary": {
"skill": "canary",
"skillMdBytes": 51598,
"skillMdLines": 1011,
"estTokens": 12900,
"tmplBytes": 8033,
"descriptionLen": 180,
"hasGateEval": true,
"hasPeriodicEval": false
},
"careful": {
"skill": "careful",
"skillMdBytes": 2567,
"skillMdLines": 68,
"estTokens": 642,
"tmplBytes": 2435,
"descriptionLen": 315,
"hasGateEval": false,
"hasPeriodicEval": false
},
"codex": {
"skill": "codex",
"skillMdBytes": 85212,
"skillMdLines": 1555,
"estTokens": 21303,
"tmplBytes": 34143,
"descriptionLen": 187,
"hasGateEval": true,
"hasPeriodicEval": false
},
"context-restore": {
"skill": "context-restore",
"skillMdBytes": 45986,
"skillMdLines": 869,
"estTokens": 11497,
"tmplBytes": 5255,
"descriptionLen": 238,
"hasGateEval": true,
"hasPeriodicEval": false
},
"context-save": {
"skill": "context-save",
"skillMdBytes": 50183,
"skillMdLines": 987,
"estTokens": 12546,
"tmplBytes": 9293,
"descriptionLen": 168,
"hasGateEval": true,
"hasPeriodicEval": false
},
"cso": {
"skill": "cso",
"skillMdBytes": 83808,
"skillMdLines": 1498,
"estTokens": 20952,
"tmplBytes": 35646,
"descriptionLen": 196,
"hasGateEval": true,
"hasPeriodicEval": false
},
"design-consultation": {
"skill": "design-consultation",
"skillMdBytes": 84683,
"skillMdLines": 1598,
"estTokens": 21171,
"tmplBytes": 25899,
"descriptionLen": 888,
"hasGateEval": true,
"hasPeriodicEval": false
},
"design-html": {
"skill": "design-html",
"skillMdBytes": 71042,
"skillMdLines": 1470,
"estTokens": 17761,
"tmplBytes": 22567,
"descriptionLen": 233,
"hasGateEval": false,
"hasPeriodicEval": false
},
"design-review": {
"skill": "design-review",
"skillMdBytes": 100149,
"skillMdLines": 1953,
"estTokens": 25037,
"tmplBytes": 11674,
"descriptionLen": 304,
"hasGateEval": true,
"hasPeriodicEval": false
},
"design-shotgun": {
"skill": "design-shotgun",
"skillMdBytes": 67331,
"skillMdLines": 1332,
"estTokens": 16833,
"tmplBytes": 13331,
"descriptionLen": 786,
"hasGateEval": false,
"hasPeriodicEval": false
},
"devex-review": {
"skill": "devex-review",
"skillMdBytes": 69681,
"skillMdLines": 1264,
"estTokens": 17420,
"tmplBytes": 7984,
"descriptionLen": 201,
"hasGateEval": false,
"hasPeriodicEval": false
},
"document-generate": {
"skill": "document-generate",
"skillMdBytes": 58327,
"skillMdLines": 1211,
"estTokens": 14582,
"tmplBytes": 15939,
"descriptionLen": 334,
"hasGateEval": false,
"hasPeriodicEval": false
},
"document-release": {
"skill": "document-release",
"skillMdBytes": 64403,
"skillMdLines": 1281,
"estTokens": 16101,
"tmplBytes": 20974,
"descriptionLen": 192,
"hasGateEval": true,
"hasPeriodicEval": false
},
"freeze": {
"skill": "freeze",
"skillMdBytes": 3184,
"skillMdLines": 92,
"estTokens": 796,
"tmplBytes": 3038,
"descriptionLen": 503,
"hasGateEval": false,
"hasPeriodicEval": false
},
"gstack-upgrade": {
"skill": "gstack-upgrade",
"skillMdBytes": 10817,
"skillMdLines": 285,
"estTokens": 2704,
"tmplBytes": 10667,
"descriptionLen": 163,
"hasGateEval": true,
"hasPeriodicEval": false
},
"guard": {
"skill": "guard",
"skillMdBytes": 3314,
"skillMdLines": 91,
"estTokens": 829,
"tmplBytes": 3181,
"descriptionLen": 686,
"hasGateEval": false,
"hasPeriodicEval": false
},
"health": {
"skill": "health",
"skillMdBytes": 52409,
"skillMdLines": 1035,
"estTokens": 13102,
"tmplBytes": 11617,
"descriptionLen": 184,
"hasGateEval": true,
"hasPeriodicEval": false
},
"investigate": {
"skill": "investigate",
"skillMdBytes": 54902,
"skillMdLines": 1033,
"estTokens": 13726,
"tmplBytes": 11561,
"descriptionLen": 1379,
"hasGateEval": true,
"hasPeriodicEval": false
},
"ios-clean": {
"skill": "ios-clean",
"skillMdBytes": 45540,
"skillMdLines": 834,
"estTokens": 11385,
"tmplBytes": 3851,
"descriptionLen": 252,
"hasGateEval": false,
"hasPeriodicEval": false
},
"ios-design-review": {
"skill": "ios-design-review",
"skillMdBytes": 46124,
"skillMdLines": 836,
"estTokens": 11531,
"tmplBytes": 4417,
"descriptionLen": 209,
"hasGateEval": false,
"hasPeriodicEval": false
},
"ios-fix": {
"skill": "ios-fix",
"skillMdBytes": 45253,
"skillMdLines": 832,
"estTokens": 11313,
"tmplBytes": 3574,
"descriptionLen": 187,
"hasGateEval": false,
"hasPeriodicEval": false
},
"ios-qa": {
"skill": "ios-qa",
"skillMdBytes": 51764,
"skillMdLines": 952,
"estTokens": 12941,
"tmplBytes": 10090,
"descriptionLen": 223,
"hasGateEval": true,
"hasPeriodicEval": false
},
"ios-sync": {
"skill": "ios-sync",
"skillMdBytes": 45230,
"skillMdLines": 825,
"estTokens": 11308,
"tmplBytes": 3544,
"descriptionLen": 269,
"hasGateEval": false,
"hasPeriodicEval": false
},
"land-and-deploy": {
"skill": "land-and-deploy",
"skillMdBytes": 96379,
"skillMdLines": 1877,
"estTokens": 24095,
"tmplBytes": 48624,
"descriptionLen": 160,
"hasGateEval": true,
"hasPeriodicEval": false
},
"landing-report": {
"skill": "landing-report",
"skillMdBytes": 48478,
"skillMdLines": 895,
"estTokens": 12120,
"tmplBytes": 6806,
"descriptionLen": 195,
"hasGateEval": false,
"hasPeriodicEval": false
},
"learn": {
"skill": "learn",
"skillMdBytes": 46215,
"skillMdLines": 912,
"estTokens": 11554,
"tmplBytes": 5594,
"descriptionLen": 178,
"hasGateEval": true,
"hasPeriodicEval": false
},
"make-pdf": {
"skill": "make-pdf",
"skillMdBytes": 30270,
"skillMdLines": 673,
"estTokens": 7568,
"tmplBytes": 5546,
"descriptionLen": 177,
"hasGateEval": false,
"hasPeriodicEval": false
},
"office-hours": {
"skill": "office-hours",
"skillMdBytes": 123037,
"skillMdLines": 2200,
"estTokens": 30759,
"tmplBytes": 55534,
"descriptionLen": 860,
"hasGateEval": true,
"hasPeriodicEval": false
},
"open-gstack-browser": {
"skill": "open-gstack-browser",
"skillMdBytes": 50624,
"skillMdLines": 975,
"estTokens": 12656,
"tmplBytes": 7702,
"descriptionLen": 204,
"hasGateEval": false,
"hasPeriodicEval": false
},
"pair-agent": {
"skill": "pair-agent",
"skillMdBytes": 51432,
"skillMdLines": 1031,
"estTokens": 12858,
"tmplBytes": 8548,
"descriptionLen": 167,
"hasGateEval": false,
"hasPeriodicEval": false
},
"plan-ceo-review": {
"skill": "plan-ceo-review",
"skillMdBytes": 144411,
"skillMdLines": 2349,
"estTokens": 36103,
"tmplBytes": 63461,
"descriptionLen": 794,
"hasGateEval": true,
"hasPeriodicEval": true
},
"plan-design-review": {
"skill": "plan-design-review",
"skillMdBytes": 118532,
"skillMdLines": 2073,
"estTokens": 29633,
"tmplBytes": 28717,
"descriptionLen": 218,
"hasGateEval": true,
"hasPeriodicEval": true
},
"plan-devex-review": {
"skill": "plan-devex-review",
"skillMdBytes": 117907,
"skillMdLines": 2277,
"estTokens": 29477,
"tmplBytes": 35773,
"descriptionLen": 250,
"hasGateEval": true,
"hasPeriodicEval": true
},
"plan-eng-review": {
"skill": "plan-eng-review",
"skillMdBytes": 114209,
"skillMdLines": 1906,
"estTokens": 28552,
"tmplBytes": 26302,
"descriptionLen": 231,
"hasGateEval": true,
"hasPeriodicEval": true
},
"plan-tune": {
"skill": "plan-tune",
"skillMdBytes": 67548,
"skillMdLines": 1372,
"estTokens": 16887,
"tmplBytes": 26922,
"descriptionLen": 325,
"hasGateEval": true,
"hasPeriodicEval": false
},
"qa": {
"skill": "qa",
"skillMdBytes": 78356,
"skillMdLines": 1643,
"estTokens": 19589,
"tmplBytes": 12701,
"descriptionLen": 218,
"hasGateEval": true,
"hasPeriodicEval": false
},
"qa-only": {
"skill": "qa-only",
"skillMdBytes": 60914,
"skillMdLines": 1215,
"estTokens": 15229,
"tmplBytes": 3851,
"descriptionLen": 165,
"hasGateEval": true,
"hasPeriodicEval": false
},
"retro": {
"skill": "retro",
"skillMdBytes": 87382,
"skillMdLines": 1771,
"estTokens": 21846,
"tmplBytes": 42427,
"descriptionLen": 648,
"hasGateEval": true,
"hasPeriodicEval": false
},
"review": {
"skill": "review",
"skillMdBytes": 99573,
"skillMdLines": 1787,
"estTokens": 24893,
"tmplBytes": 14099,
"descriptionLen": 205,
"hasGateEval": true,
"hasPeriodicEval": false
},
"scrape": {
"skill": "scrape",
"skillMdBytes": 48134,
"skillMdLines": 908,
"estTokens": 12034,
"tmplBytes": 5220,
"descriptionLen": 167,
"hasGateEval": true,
"hasPeriodicEval": false
},
"setup-browser-cookies": {
"skill": "setup-browser-cookies",
"skillMdBytes": 26998,
"skillMdLines": 597,
"estTokens": 6750,
"tmplBytes": 2724,
"descriptionLen": 222,
"hasGateEval": false,
"hasPeriodicEval": false
},
"setup-deploy": {
"skill": "setup-deploy",
"skillMdBytes": 48420,
"skillMdLines": 940,
"estTokens": 12105,
"tmplBytes": 7780,
"descriptionLen": 197,
"hasGateEval": true,
"hasPeriodicEval": false
},
"setup-gbrain": {
"skill": "setup-gbrain",
"skillMdBytes": 85495,
"skillMdLines": 1794,
"estTokens": 21374,
"tmplBytes": 44851,
"descriptionLen": 323,
"hasGateEval": true,
"hasPeriodicEval": false
},
"ship": {
"skill": "ship",
"skillMdBytes": 174407,
"skillMdLines": 3137,
"estTokens": 43602,
"tmplBytes": 53240,
"descriptionLen": 291,
"hasGateEval": true,
"hasPeriodicEval": true
},
"skillify": {
"skill": "skillify",
"skillMdBytes": 58027,
"skillMdLines": 1189,
"estTokens": 14507,
"tmplBytes": 15107,
"descriptionLen": 233,
"hasGateEval": true,
"hasPeriodicEval": false
},
"spec": {
"skill": "spec",
"skillMdBytes": 117382,
"skillMdLines": 2276,
"estTokens": 29346,
"tmplBytes": 30590,
"descriptionLen": 282,
"hasGateEval": true,
"hasPeriodicEval": false
},
"sync-gbrain": {
"skill": "sync-gbrain",
"skillMdBytes": 62977,
"skillMdLines": 1191,
"estTokens": 15744,
"tmplBytes": 16077,
"descriptionLen": 299,
"hasGateEval": false,
"hasPeriodicEval": false
},
"unfreeze": {
"skill": "unfreeze",
"skillMdBytes": 1504,
"skillMdLines": 49,
"estTokens": 376,
"tmplBytes": 1386,
"descriptionLen": 199,
"hasGateEval": false,
"hasPeriodicEval": false
}
}
}
+59
View File
@@ -3239,3 +3239,62 @@ describe('EXIT PLAN MODE GATE placement', () => {
expect(codex).toContain('Failing this gate and calling ExitPlanMode anyway is a contract violation');
});
});
describe('GSTACK REVIEW REPORT mandatory unresolved-decisions status', () => {
// Report text rides in PLAN_FILE_REVIEW_REPORT → every report consumer gets it.
// devex-review is a report consumer but NOT a gate consumer, so the two target
// sets differ (CP5/CX5). Regression guard: a future token-cut that drops the
// unresolved-status line again fails here. See plan-flag-unresolved-issues.
const REPORT_CONSUMERS = [
'plan-ceo-review',
'plan-eng-review',
'plan-design-review',
'plan-devex-review',
'codex',
'devex-review',
];
// Gate text rides in EXIT_PLAN_MODE_GATE (lives in SKILL.md, not sections).
const GATE_SKILLS = [
'plan-ceo-review',
'plan-eng-review',
'plan-design-review',
'plan-devex-review',
'codex',
];
for (const skill of REPORT_CONSUMERS) {
test(`${skill}: report mandates the unresolved-decisions status as final content`, () => {
const content = readSkillUnion(skill);
expect(content).toContain('NO UNRESOLVED DECISIONS');
// The "never omit / always final" contract must be present, not just the phrase.
expect(content).toContain('Unresolved-decisions status (MANDATORY');
expect(content).toMatch(/never omitted/);
// \s+ tolerates prose line-wraps within "final non-whitespace line".
expect(content).toMatch(/final\s+non-whitespace\s+line/);
});
}
for (const skill of GATE_SKILLS) {
test(`${skill}: exit gate blocks unless the unresolved status is the final line`, () => {
const md = fs.readFileSync(path.join(ROOT, skill, 'SKILL.md'), 'utf-8');
// Gate check #4 — present, sentinel named, and explicitly blocking (no escape).
expect(md).toContain('NO UNRESOLVED DECISIONS');
expect(md).toContain('FINAL non-whitespace line is the unresolved-decisions');
expect(md).toContain('FAILS the gate');
});
}
test('scripts/resolvers/review.ts source carries the mandatory block + blocking gate', () => {
const src = fs.readFileSync(path.join(ROOT, 'scripts', 'resolvers', 'review.ts'), 'utf-8');
// Report resolver: mandatory, never-omitted, exact sentinel, anti-double-count algorithm.
expect(src).toContain('Unresolved-decisions status (MANDATORY');
expect(src).toContain('NO UNRESOLVED DECISIONS');
expect(src).toContain('avoids double-counting');
expect(src).toContain('DROP the current skill');
// Gate resolver: the blocking final-line check with no "if applicable" escape.
expect(src).toContain('FINAL non-whitespace line is the unresolved-decisions');
expect(src).toContain('FAILS the gate');
// The old soft wording must be gone from the gate.
expect(src).not.toContain('absorbs CODEX / CROSS-MODEL / UNRESOLVED lines if applicable');
});
});
+13 -9
View File
@@ -2,15 +2,19 @@
* Cathedral parity suite gate-tier (free, structural + content checks).
*
* Runs every PARITY_INVARIANTS check against the current SKILL.md output
* vs the v1.53.0.0 baseline. Failures get an actionable, per-skill report
* vs the v1.57.7.0 baseline. Failures get an actionable, per-skill report
* showing missing phrases, missing headings, and size ratios.
*
* Baseline rebased v1.44.1 v1.53.0.0: the brain-aware-planning releases
* (v1.49v1.52) plus the v1.53 redaction guard pushed five planning skills
* past the 5% ratchet on the frozen v1.44.1 anchor. Rebasing absorbs that
* legitimate growth at HEAD while keeping the per-skill 1.05 ratio so future
* bloat is still caught. Historical v1.44.1 / v1.46.0.0 / v1.47.0.0 baselines
* are retained in test/fixtures/ for the v1v2 audit trail.
* Baseline rebased v1.53.0.0 v1.57.7.0: the v1.54v1.57 releases (ship/plan
* carving, carve-guards, AUQ prose fallback, the cross-session decision-log
* preamble) plus the mandatory unresolved-decisions status added to every
* GSTACK REVIEW REPORT pushed the three plan-review skills past the 5% ratchet
* on the v1.53 anchor even after exhaustive compression. The v1.57.7.0 baseline
* captures current UNION sizes (skeleton + sections/*.md, matching what the
* harness measures) so the per-skill 1.05 ratio still catches future bloat.
* Earlier rebase v1.44.1 v1.53.0.0: brain-aware-planning (v1.49v1.52) + the
* v1.53 redaction guard. Historical v1.44.1 / v1.46.0.0 / v1.47.0.0 / v1.53.0.0
* baselines are retained in test/fixtures/ for the audit trail.
*
* Periodic-tier LLM-judge parity (paid) lands in Phase B (v2.0.0.0)
* alongside the sections/ extraction. Plumbing is in parity-harness.ts.
@@ -23,9 +27,9 @@ import { runParityChecks, PARITY_INVARIANTS } from './helpers/parity-harness';
import type { ParityBaseline } from './helpers/capture-parity-baseline';
const REPO_ROOT = path.resolve(import.meta.dir, '..');
const BASELINE_PATH = path.join(REPO_ROOT, 'test', 'fixtures', 'parity-baseline-v1.53.0.0.json');
const BASELINE_PATH = path.join(REPO_ROOT, 'test', 'fixtures', 'parity-baseline-v1.57.7.0.json');
describe('parity suite vs v1.53.0.0 baseline (gate, free)', () => {
describe('parity suite vs v1.57.7.0 baseline (gate, free)', () => {
test('baseline exists', () => {
expect(fs.existsSync(BASELINE_PATH)).toBe(true);
});
+19 -2
View File
@@ -692,7 +692,7 @@ Read plan.md — that's the plan to review. This is a standalone plan document,
Proceed directly to the full review. Skip any AskUserQuestion calls this is non-interactive.
Skip the preamble bash block, lake intro, telemetry, and contributor mode sections.
CRITICAL REQUIREMENT: plan.md IS the plan file for this review session. After completing your review, you MUST write a "## GSTACK REVIEW REPORT" section to the END of plan.md, exactly as described in the "Plan File Review Report" section of SKILL.md. If gstack-review-read is not available or returns NO_REVIEWS, write the placeholder table with all four review rows (CEO, Codex, Eng, Design). Use the Edit tool to append to plan.md do NOT overwrite the existing plan content.
CRITICAL REQUIREMENT: plan.md IS the plan file for this review session. After completing your review, you MUST write a "## GSTACK REVIEW REPORT" section to the END of plan.md, exactly as described in the "Plan File Review Report" section of SKILL.md. If gstack-review-read is not available or returns NO_REVIEWS, write the placeholder table with all five review rows (CEO, Codex, Eng, Design, DX). The report MUST end with the mandatory unresolved-decisions status as its final line the exact unbolded line NO UNRESOLVED DECISIONS when nothing is open, or a "**UNRESOLVED DECISIONS:**" block of bullets when items remain. Nothing may follow it. Use the Edit tool to append to plan.md do NOT overwrite the existing plan content.
This review report at the bottom of the plan is the MOST IMPORTANT deliverable of this test.`,
workingDirectory: planDir,
@@ -741,7 +741,24 @@ This review report at the bottom of the plan is the MOST IMPORTANT deliverable o
expect(afterReport).toContain('Eng Review');
expect(afterReport).toContain('Design Review');
console.log('Plan review report found at bottom of plan.md');
// Mandatory unresolved-decisions status (plan-flag-unresolved-issues): the report's
// final non-whitespace line must be the unresolved status — the exact sentinel or a
// bullet of an UNRESOLVED DECISIONS block, with nothing (CODEX/CROSS-MODEL/VERDICT/
// prose) after it.
expect(afterReport).toContain('UNRESOLVED DECISIONS');
// Compute from afterReport (the report section to EOF), not the whole file, so a
// mid-file report surfaces the real trailing content in the failure message.
const nonEmpty = afterReport.split('\n').map(l => l.trim()).filter(l => l !== '');
const lastLine = nonEmpty[nonEmpty.length - 1];
const isSentinel = lastLine === 'NO UNRESOLVED DECISIONS';
const isUnresolvedBullet =
/^[-*]\s+/.test(lastLine) && !/VERDICT/i.test(lastLine) && afterReport.includes('UNRESOLVED DECISIONS:');
expect(
isSentinel || isUnresolvedBullet,
`report must end with the unresolved-decisions status; last line was: ${lastLine}`,
).toBe(true);
console.log('Plan review report found at bottom of plan.md (ends with unresolved status)');
}, 420_000);
});