From 54f94bec2e23853c3f86bbf0e7052b1f8cc07a3f Mon Sep 17 00:00:00 2001 From: Garry Tan Date: Sun, 7 Jun 2026 22:48:53 -0700 Subject: [PATCH] feat(review): make unresolved-decisions status mandatory in GSTACK REVIEW REPORT MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The report's UNRESOLVED line was optional ('omit if empty') and the EXIT PLAN MODE GATE only checked it 'if applicable', so a plan could ship with no statement about open decisions at all — a missed ambiguity read identically to a clean plan. Now every report ends with a mandatory unresolved-decisions status as its final line: either the exact unbolded sentinel 'NO UNRESOLVED DECISIONS', or a '**UNRESOLVED DECISIONS:**' block of bullets. The gate blocks ExitPlanMode unless that final line is present. generatePlanFileReviewReport: current-review items are listed from context; prior reviews contribute an aggregate count computed as latest-fresh-row- per-skill minus the current run (no double-count, dashboard 7-day window). generateExitPlanModeGate: check #3 is now blocking with no 'if applicable' escape; bolded sentinel does not satisfy it. Tests: static guard in gen-skill-docs.test.ts asserts the mandatory status across all six report consumers and the gate across gate-bearing skills; skill-e2e-plan.test.ts asserts the written report's final line is the status (and fixes a stale 'four review rows' -> five-row prompt). Co-Authored-By: Claude Opus 4.8 (1M context) --- codex/SKILL.md | 52 ++++++++++++++-- devex-review/SKILL.md | 37 +++++++++++- plan-ceo-review/SKILL.md | 15 +++-- plan-ceo-review/sections/review-sections.md | 37 +++++++++++- plan-design-review/SKILL.md | 15 +++-- .../sections/review-sections.md | 37 +++++++++++- plan-devex-review/SKILL.md | 15 +++-- plan-devex-review/sections/review-sections.md | 37 +++++++++++- plan-eng-review/SKILL.md | 15 +++-- plan-eng-review/sections/review-sections.md | 37 +++++++++++- scripts/resolvers/review.ts | 52 ++++++++++++++-- test/gen-skill-docs.test.ts | 59 +++++++++++++++++++ test/skill-e2e-plan.test.ts | 19 +++++- 13 files changed, 387 insertions(+), 40 deletions(-) diff --git a/codex/SKILL.md b/codex/SKILL.md index 7f602147d..259b29239 100644 --- a/codex/SKILL.md +++ b/codex/SKILL.md @@ -1082,14 +1082,47 @@ Produce this markdown table: | DX Review | \`/plan-devex-review\` | Developer experience gaps | {runs} | {status} | {findings} | \`\`\` -Below the table, add these lines (omit any that are empty/not applicable): +Below the table, add these lines. **CODEX** and **CROSS-MODEL** are optional (omit when +empty); **VERDICT** is always present: - **CODEX:** (only if codex-review ran) — one-line summary of codex fixes - **CROSS-MODEL:** (only if both Claude and Codex reviews exist) — overlap analysis -- **UNRESOLVED:** total unresolved decisions across all reviews - **VERDICT:** list reviews that are CLEAR (e.g., "CEO + ENG CLEARED — ready to implement"). If Eng Review is not CLEAR and not skipped globally, append "eng review required". +**Unresolved-decisions status (MANDATORY — the final content of every report).** +After the VERDICT line, the report MUST end with an unresolved-decisions status. This is +never omitted — it does NOT fall under the "omit when empty" rule above. It is the final +non-whitespace line(s) of the report section, written as content under the +\`## GSTACK REVIEW REPORT\` heading (a bold label — never a new \`## \` heading, +which would break the terminal-heading requirement). It is the last thing the user reads +before the approval prompt: either what is still open, or an explicit all-clear. + +Emit exactly one of: + +- **None open:** a single line containing exactly \`NO UNRESOLVED DECISIONS\`, unbolded + and with no markdown emphasis. A bolded \`**NO UNRESOLVED DECISIONS**\` does NOT count + — keep it plain so the gate's exact-line check can't be gamed by emphasis. +- **Some open:** a bold header line \`**UNRESOLVED DECISIONS:**\` followed by one bullet + per unresolved item; the last bullet is the final line of the report. Each bullet names + the decision and what breaks if it ships deferred (e.g. + "- Auth provider choice — if deferred: the login flow can't be built"). If prior + reviews carry unresolved items, add one final bullet + "+ N unresolved from prior reviews — see review rows above". + +Compute the status (this avoids double-counting the current review): +1. **Current-review items:** list every unanswered AskUserQuestion / deferred decision + from THIS review's Unresolved Decisions section, one bullet each. You have these in + context — do not reconstruct them from the log. +2. **Prior-reviews count:** the log stores only \`unresolved:N\` counts, and the current + review's row is already written before this report runs. Take the latest fresh row per + review skill (same 7-day window as the dashboard), DROP the current skill's row + entirely (do NOT backfill an older row for it), and sum \`unresolved\` across the + rest. Add the "+ N unresolved from prior reviews" bullet only when that sum is > 0. + Never sum all log rows — reruns would double-count. +3. **Sentinel rule:** emit \`NO UNRESOLVED DECISIONS\` if and only if step 1 produced + zero items AND the step 2 count is zero. + ### Write to the plan file **PLAN MODE EXCEPTION — ALWAYS RUN:** This writes to the plan file, which is the one @@ -1130,12 +1163,19 @@ missing work — do NOT call ExitPlanMode: In-body prose that mentions "outside voice", "codex findings", or similar does NOT count — only the structured `## GSTACK REVIEW REPORT` section satisfies this check. -3. Confirm the report contains: a Runs / Status / Findings table, a VERDICT - line, and absorbs CODEX / CROSS-MODEL / UNRESOLVED lines if applicable. -4. If a plan file is in context for this skill invocation: confirm +3. Confirm the report contains a Runs / Status / Findings table and a VERDICT + line (CODEX / CROSS-MODEL lines are absorbed if applicable). +4. Confirm the report's FINAL non-whitespace line is the unresolved-decisions + status — either the exact unbolded sentinel `NO UNRESOLVED DECISIONS`, or a + bullet belonging to a `**UNRESOLVED DECISIONS:**` block that has at least one + bullet. This check is BLOCKING with no "if applicable" escape: a report with no + unresolved status, or with anything (CODEX / CROSS-MODEL / VERDICT / prose) + after it, FAILS the gate. A bolded `**NO UNRESOLVED DECISIONS**` does NOT + satisfy the sentinel — it must be the plain unbolded line. +5. If a plan file is in context for this skill invocation: confirm `gstack-review-log` was called and `gstack-review-read` was run at least once. If no plan file is in context (e.g. `/codex consult` against a - diff with no plan), this check short-circuits — checks 1-3 already + diff with no plan), this check short-circuits — checks 1-4 already short-circuit when no plan file exists. Failing this gate and calling ExitPlanMode anyway is a contract violation — diff --git a/devex-review/SKILL.md b/devex-review/SKILL.md index 00ad9eba9..6803e5a5d 100644 --- a/devex-review/SKILL.md +++ b/devex-review/SKILL.md @@ -1146,14 +1146,47 @@ Produce this markdown table: | DX Review | \`/plan-devex-review\` | Developer experience gaps | {runs} | {status} | {findings} | \`\`\` -Below the table, add these lines (omit any that are empty/not applicable): +Below the table, add these lines. **CODEX** and **CROSS-MODEL** are optional (omit when +empty); **VERDICT** is always present: - **CODEX:** (only if codex-review ran) — one-line summary of codex fixes - **CROSS-MODEL:** (only if both Claude and Codex reviews exist) — overlap analysis -- **UNRESOLVED:** total unresolved decisions across all reviews - **VERDICT:** list reviews that are CLEAR (e.g., "CEO + ENG CLEARED — ready to implement"). If Eng Review is not CLEAR and not skipped globally, append "eng review required". +**Unresolved-decisions status (MANDATORY — the final content of every report).** +After the VERDICT line, the report MUST end with an unresolved-decisions status. This is +never omitted — it does NOT fall under the "omit when empty" rule above. It is the final +non-whitespace line(s) of the report section, written as content under the +\`## GSTACK REVIEW REPORT\` heading (a bold label — never a new \`## \` heading, +which would break the terminal-heading requirement). It is the last thing the user reads +before the approval prompt: either what is still open, or an explicit all-clear. + +Emit exactly one of: + +- **None open:** a single line containing exactly \`NO UNRESOLVED DECISIONS\`, unbolded + and with no markdown emphasis. A bolded \`**NO UNRESOLVED DECISIONS**\` does NOT count + — keep it plain so the gate's exact-line check can't be gamed by emphasis. +- **Some open:** a bold header line \`**UNRESOLVED DECISIONS:**\` followed by one bullet + per unresolved item; the last bullet is the final line of the report. Each bullet names + the decision and what breaks if it ships deferred (e.g. + "- Auth provider choice — if deferred: the login flow can't be built"). If prior + reviews carry unresolved items, add one final bullet + "+ N unresolved from prior reviews — see review rows above". + +Compute the status (this avoids double-counting the current review): +1. **Current-review items:** list every unanswered AskUserQuestion / deferred decision + from THIS review's Unresolved Decisions section, one bullet each. You have these in + context — do not reconstruct them from the log. +2. **Prior-reviews count:** the log stores only \`unresolved:N\` counts, and the current + review's row is already written before this report runs. Take the latest fresh row per + review skill (same 7-day window as the dashboard), DROP the current skill's row + entirely (do NOT backfill an older row for it), and sum \`unresolved\` across the + rest. Add the "+ N unresolved from prior reviews" bullet only when that sum is > 0. + Never sum all log rows — reruns would double-count. +3. **Sentinel rule:** emit \`NO UNRESOLVED DECISIONS\` if and only if step 1 produced + zero items AND the step 2 count is zero. + ### Write to the plan file **PLAN MODE EXCEPTION — ALWAYS RUN:** This writes to the plan file, which is the one diff --git a/plan-ceo-review/SKILL.md b/plan-ceo-review/SKILL.md index e7d4bf68d..5c71e7ef0 100644 --- a/plan-ceo-review/SKILL.md +++ b/plan-ceo-review/SKILL.md @@ -1383,12 +1383,19 @@ missing work — do NOT call ExitPlanMode: In-body prose that mentions "outside voice", "codex findings", or similar does NOT count — only the structured `## GSTACK REVIEW REPORT` section satisfies this check. -3. Confirm the report contains: a Runs / Status / Findings table, a VERDICT - line, and absorbs CODEX / CROSS-MODEL / UNRESOLVED lines if applicable. -4. If a plan file is in context for this skill invocation: confirm +3. Confirm the report contains a Runs / Status / Findings table and a VERDICT + line (CODEX / CROSS-MODEL lines are absorbed if applicable). +4. Confirm the report's FINAL non-whitespace line is the unresolved-decisions + status — either the exact unbolded sentinel `NO UNRESOLVED DECISIONS`, or a + bullet belonging to a `**UNRESOLVED DECISIONS:**` block that has at least one + bullet. This check is BLOCKING with no "if applicable" escape: a report with no + unresolved status, or with anything (CODEX / CROSS-MODEL / VERDICT / prose) + after it, FAILS the gate. A bolded `**NO UNRESOLVED DECISIONS**` does NOT + satisfy the sentinel — it must be the plain unbolded line. +5. If a plan file is in context for this skill invocation: confirm `gstack-review-log` was called and `gstack-review-read` was run at least once. If no plan file is in context (e.g. `/codex consult` against a - diff with no plan), this check short-circuits — checks 1-3 already + diff with no plan), this check short-circuits — checks 1-4 already short-circuit when no plan file exists. Failing this gate and calling ExitPlanMode anyway is a contract violation — diff --git a/plan-ceo-review/sections/review-sections.md b/plan-ceo-review/sections/review-sections.md index 9da3ee88e..afce3c990 100644 --- a/plan-ceo-review/sections/review-sections.md +++ b/plan-ceo-review/sections/review-sections.md @@ -709,14 +709,47 @@ Produce this markdown table: | DX Review | \`/plan-devex-review\` | Developer experience gaps | {runs} | {status} | {findings} | \`\`\` -Below the table, add these lines (omit any that are empty/not applicable): +Below the table, add these lines. **CODEX** and **CROSS-MODEL** are optional (omit when +empty); **VERDICT** is always present: - **CODEX:** (only if codex-review ran) — one-line summary of codex fixes - **CROSS-MODEL:** (only if both Claude and Codex reviews exist) — overlap analysis -- **UNRESOLVED:** total unresolved decisions across all reviews - **VERDICT:** list reviews that are CLEAR (e.g., "CEO + ENG CLEARED — ready to implement"). If Eng Review is not CLEAR and not skipped globally, append "eng review required". +**Unresolved-decisions status (MANDATORY — the final content of every report).** +After the VERDICT line, the report MUST end with an unresolved-decisions status. This is +never omitted — it does NOT fall under the "omit when empty" rule above. It is the final +non-whitespace line(s) of the report section, written as content under the +\`## GSTACK REVIEW REPORT\` heading (a bold label — never a new \`## \` heading, +which would break the terminal-heading requirement). It is the last thing the user reads +before the approval prompt: either what is still open, or an explicit all-clear. + +Emit exactly one of: + +- **None open:** a single line containing exactly \`NO UNRESOLVED DECISIONS\`, unbolded + and with no markdown emphasis. A bolded \`**NO UNRESOLVED DECISIONS**\` does NOT count + — keep it plain so the gate's exact-line check can't be gamed by emphasis. +- **Some open:** a bold header line \`**UNRESOLVED DECISIONS:**\` followed by one bullet + per unresolved item; the last bullet is the final line of the report. Each bullet names + the decision and what breaks if it ships deferred (e.g. + "- Auth provider choice — if deferred: the login flow can't be built"). If prior + reviews carry unresolved items, add one final bullet + "+ N unresolved from prior reviews — see review rows above". + +Compute the status (this avoids double-counting the current review): +1. **Current-review items:** list every unanswered AskUserQuestion / deferred decision + from THIS review's Unresolved Decisions section, one bullet each. You have these in + context — do not reconstruct them from the log. +2. **Prior-reviews count:** the log stores only \`unresolved:N\` counts, and the current + review's row is already written before this report runs. Take the latest fresh row per + review skill (same 7-day window as the dashboard), DROP the current skill's row + entirely (do NOT backfill an older row for it), and sum \`unresolved\` across the + rest. Add the "+ N unresolved from prior reviews" bullet only when that sum is > 0. + Never sum all log rows — reruns would double-count. +3. **Sentinel rule:** emit \`NO UNRESOLVED DECISIONS\` if and only if step 1 produced + zero items AND the step 2 count is zero. + ### Write to the plan file **PLAN MODE EXCEPTION — ALWAYS RUN:** This writes to the plan file, which is the one diff --git a/plan-design-review/SKILL.md b/plan-design-review/SKILL.md index 502b13ae9..2c83388be 100644 --- a/plan-design-review/SKILL.md +++ b/plan-design-review/SKILL.md @@ -1404,12 +1404,19 @@ missing work — do NOT call ExitPlanMode: In-body prose that mentions "outside voice", "codex findings", or similar does NOT count — only the structured `## GSTACK REVIEW REPORT` section satisfies this check. -3. Confirm the report contains: a Runs / Status / Findings table, a VERDICT - line, and absorbs CODEX / CROSS-MODEL / UNRESOLVED lines if applicable. -4. If a plan file is in context for this skill invocation: confirm +3. Confirm the report contains a Runs / Status / Findings table and a VERDICT + line (CODEX / CROSS-MODEL lines are absorbed if applicable). +4. Confirm the report's FINAL non-whitespace line is the unresolved-decisions + status — either the exact unbolded sentinel `NO UNRESOLVED DECISIONS`, or a + bullet belonging to a `**UNRESOLVED DECISIONS:**` block that has at least one + bullet. This check is BLOCKING with no "if applicable" escape: a report with no + unresolved status, or with anything (CODEX / CROSS-MODEL / VERDICT / prose) + after it, FAILS the gate. A bolded `**NO UNRESOLVED DECISIONS**` does NOT + satisfy the sentinel — it must be the plain unbolded line. +5. If a plan file is in context for this skill invocation: confirm `gstack-review-log` was called and `gstack-review-read` was run at least once. If no plan file is in context (e.g. `/codex consult` against a - diff with no plan), this check short-circuits — checks 1-3 already + diff with no plan), this check short-circuits — checks 1-4 already short-circuit when no plan file exists. Failing this gate and calling ExitPlanMode anyway is a contract violation — diff --git a/plan-design-review/sections/review-sections.md b/plan-design-review/sections/review-sections.md index 0d641198d..2b291715e 100644 --- a/plan-design-review/sections/review-sections.md +++ b/plan-design-review/sections/review-sections.md @@ -458,14 +458,47 @@ Produce this markdown table: | DX Review | \`/plan-devex-review\` | Developer experience gaps | {runs} | {status} | {findings} | \`\`\` -Below the table, add these lines (omit any that are empty/not applicable): +Below the table, add these lines. **CODEX** and **CROSS-MODEL** are optional (omit when +empty); **VERDICT** is always present: - **CODEX:** (only if codex-review ran) — one-line summary of codex fixes - **CROSS-MODEL:** (only if both Claude and Codex reviews exist) — overlap analysis -- **UNRESOLVED:** total unresolved decisions across all reviews - **VERDICT:** list reviews that are CLEAR (e.g., "CEO + ENG CLEARED — ready to implement"). If Eng Review is not CLEAR and not skipped globally, append "eng review required". +**Unresolved-decisions status (MANDATORY — the final content of every report).** +After the VERDICT line, the report MUST end with an unresolved-decisions status. This is +never omitted — it does NOT fall under the "omit when empty" rule above. It is the final +non-whitespace line(s) of the report section, written as content under the +\`## GSTACK REVIEW REPORT\` heading (a bold label — never a new \`## \` heading, +which would break the terminal-heading requirement). It is the last thing the user reads +before the approval prompt: either what is still open, or an explicit all-clear. + +Emit exactly one of: + +- **None open:** a single line containing exactly \`NO UNRESOLVED DECISIONS\`, unbolded + and with no markdown emphasis. A bolded \`**NO UNRESOLVED DECISIONS**\` does NOT count + — keep it plain so the gate's exact-line check can't be gamed by emphasis. +- **Some open:** a bold header line \`**UNRESOLVED DECISIONS:**\` followed by one bullet + per unresolved item; the last bullet is the final line of the report. Each bullet names + the decision and what breaks if it ships deferred (e.g. + "- Auth provider choice — if deferred: the login flow can't be built"). If prior + reviews carry unresolved items, add one final bullet + "+ N unresolved from prior reviews — see review rows above". + +Compute the status (this avoids double-counting the current review): +1. **Current-review items:** list every unanswered AskUserQuestion / deferred decision + from THIS review's Unresolved Decisions section, one bullet each. You have these in + context — do not reconstruct them from the log. +2. **Prior-reviews count:** the log stores only \`unresolved:N\` counts, and the current + review's row is already written before this report runs. Take the latest fresh row per + review skill (same 7-day window as the dashboard), DROP the current skill's row + entirely (do NOT backfill an older row for it), and sum \`unresolved\` across the + rest. Add the "+ N unresolved from prior reviews" bullet only when that sum is > 0. + Never sum all log rows — reruns would double-count. +3. **Sentinel rule:** emit \`NO UNRESOLVED DECISIONS\` if and only if step 1 produced + zero items AND the step 2 count is zero. + ### Write to the plan file **PLAN MODE EXCEPTION — ALWAYS RUN:** This writes to the plan file, which is the one diff --git a/plan-devex-review/SKILL.md b/plan-devex-review/SKILL.md index af3a3eeec..b268c6cbd 100644 --- a/plan-devex-review/SKILL.md +++ b/plan-devex-review/SKILL.md @@ -1367,12 +1367,19 @@ missing work — do NOT call ExitPlanMode: In-body prose that mentions "outside voice", "codex findings", or similar does NOT count — only the structured `## GSTACK REVIEW REPORT` section satisfies this check. -3. Confirm the report contains: a Runs / Status / Findings table, a VERDICT - line, and absorbs CODEX / CROSS-MODEL / UNRESOLVED lines if applicable. -4. If a plan file is in context for this skill invocation: confirm +3. Confirm the report contains a Runs / Status / Findings table and a VERDICT + line (CODEX / CROSS-MODEL lines are absorbed if applicable). +4. Confirm the report's FINAL non-whitespace line is the unresolved-decisions + status — either the exact unbolded sentinel `NO UNRESOLVED DECISIONS`, or a + bullet belonging to a `**UNRESOLVED DECISIONS:**` block that has at least one + bullet. This check is BLOCKING with no "if applicable" escape: a report with no + unresolved status, or with anything (CODEX / CROSS-MODEL / VERDICT / prose) + after it, FAILS the gate. A bolded `**NO UNRESOLVED DECISIONS**` does NOT + satisfy the sentinel — it must be the plain unbolded line. +5. If a plan file is in context for this skill invocation: confirm `gstack-review-log` was called and `gstack-review-read` was run at least once. If no plan file is in context (e.g. `/codex consult` against a - diff with no plan), this check short-circuits — checks 1-3 already + diff with no plan), this check short-circuits — checks 1-4 already short-circuit when no plan file exists. Failing this gate and calling ExitPlanMode anyway is a contract violation — diff --git a/plan-devex-review/sections/review-sections.md b/plan-devex-review/sections/review-sections.md index 17a3811cf..5979acd86 100644 --- a/plan-devex-review/sections/review-sections.md +++ b/plan-devex-review/sections/review-sections.md @@ -701,14 +701,47 @@ Produce this markdown table: | DX Review | \`/plan-devex-review\` | Developer experience gaps | {runs} | {status} | {findings} | \`\`\` -Below the table, add these lines (omit any that are empty/not applicable): +Below the table, add these lines. **CODEX** and **CROSS-MODEL** are optional (omit when +empty); **VERDICT** is always present: - **CODEX:** (only if codex-review ran) — one-line summary of codex fixes - **CROSS-MODEL:** (only if both Claude and Codex reviews exist) — overlap analysis -- **UNRESOLVED:** total unresolved decisions across all reviews - **VERDICT:** list reviews that are CLEAR (e.g., "CEO + ENG CLEARED — ready to implement"). If Eng Review is not CLEAR and not skipped globally, append "eng review required". +**Unresolved-decisions status (MANDATORY — the final content of every report).** +After the VERDICT line, the report MUST end with an unresolved-decisions status. This is +never omitted — it does NOT fall under the "omit when empty" rule above. It is the final +non-whitespace line(s) of the report section, written as content under the +\`## GSTACK REVIEW REPORT\` heading (a bold label — never a new \`## \` heading, +which would break the terminal-heading requirement). It is the last thing the user reads +before the approval prompt: either what is still open, or an explicit all-clear. + +Emit exactly one of: + +- **None open:** a single line containing exactly \`NO UNRESOLVED DECISIONS\`, unbolded + and with no markdown emphasis. A bolded \`**NO UNRESOLVED DECISIONS**\` does NOT count + — keep it plain so the gate's exact-line check can't be gamed by emphasis. +- **Some open:** a bold header line \`**UNRESOLVED DECISIONS:**\` followed by one bullet + per unresolved item; the last bullet is the final line of the report. Each bullet names + the decision and what breaks if it ships deferred (e.g. + "- Auth provider choice — if deferred: the login flow can't be built"). If prior + reviews carry unresolved items, add one final bullet + "+ N unresolved from prior reviews — see review rows above". + +Compute the status (this avoids double-counting the current review): +1. **Current-review items:** list every unanswered AskUserQuestion / deferred decision + from THIS review's Unresolved Decisions section, one bullet each. You have these in + context — do not reconstruct them from the log. +2. **Prior-reviews count:** the log stores only \`unresolved:N\` counts, and the current + review's row is already written before this report runs. Take the latest fresh row per + review skill (same 7-day window as the dashboard), DROP the current skill's row + entirely (do NOT backfill an older row for it), and sum \`unresolved\` across the + rest. Add the "+ N unresolved from prior reviews" bullet only when that sum is > 0. + Never sum all log rows — reruns would double-count. +3. **Sentinel rule:** emit \`NO UNRESOLVED DECISIONS\` if and only if step 1 produced + zero items AND the step 2 count is zero. + ### Write to the plan file **PLAN MODE EXCEPTION — ALWAYS RUN:** This writes to the plan file, which is the one diff --git a/plan-eng-review/SKILL.md b/plan-eng-review/SKILL.md index 1b5b3d90d..348d8a8eb 100644 --- a/plan-eng-review/SKILL.md +++ b/plan-eng-review/SKILL.md @@ -939,12 +939,19 @@ missing work — do NOT call ExitPlanMode: In-body prose that mentions "outside voice", "codex findings", or similar does NOT count — only the structured `## GSTACK REVIEW REPORT` section satisfies this check. -3. Confirm the report contains: a Runs / Status / Findings table, a VERDICT - line, and absorbs CODEX / CROSS-MODEL / UNRESOLVED lines if applicable. -4. If a plan file is in context for this skill invocation: confirm +3. Confirm the report contains a Runs / Status / Findings table and a VERDICT + line (CODEX / CROSS-MODEL lines are absorbed if applicable). +4. Confirm the report's FINAL non-whitespace line is the unresolved-decisions + status — either the exact unbolded sentinel `NO UNRESOLVED DECISIONS`, or a + bullet belonging to a `**UNRESOLVED DECISIONS:**` block that has at least one + bullet. This check is BLOCKING with no "if applicable" escape: a report with no + unresolved status, or with anything (CODEX / CROSS-MODEL / VERDICT / prose) + after it, FAILS the gate. A bolded `**NO UNRESOLVED DECISIONS**` does NOT + satisfy the sentinel — it must be the plain unbolded line. +5. If a plan file is in context for this skill invocation: confirm `gstack-review-log` was called and `gstack-review-read` was run at least once. If no plan file is in context (e.g. `/codex consult` against a - diff with no plan), this check short-circuits — checks 1-3 already + diff with no plan), this check short-circuits — checks 1-4 already short-circuit when no plan file exists. Failing this gate and calling ExitPlanMode anyway is a contract violation — diff --git a/plan-eng-review/sections/review-sections.md b/plan-eng-review/sections/review-sections.md index fefa4907a..d97861c3e 100644 --- a/plan-eng-review/sections/review-sections.md +++ b/plan-eng-review/sections/review-sections.md @@ -763,14 +763,47 @@ Produce this markdown table: | DX Review | \`/plan-devex-review\` | Developer experience gaps | {runs} | {status} | {findings} | \`\`\` -Below the table, add these lines (omit any that are empty/not applicable): +Below the table, add these lines. **CODEX** and **CROSS-MODEL** are optional (omit when +empty); **VERDICT** is always present: - **CODEX:** (only if codex-review ran) — one-line summary of codex fixes - **CROSS-MODEL:** (only if both Claude and Codex reviews exist) — overlap analysis -- **UNRESOLVED:** total unresolved decisions across all reviews - **VERDICT:** list reviews that are CLEAR (e.g., "CEO + ENG CLEARED — ready to implement"). If Eng Review is not CLEAR and not skipped globally, append "eng review required". +**Unresolved-decisions status (MANDATORY — the final content of every report).** +After the VERDICT line, the report MUST end with an unresolved-decisions status. This is +never omitted — it does NOT fall under the "omit when empty" rule above. It is the final +non-whitespace line(s) of the report section, written as content under the +\`## GSTACK REVIEW REPORT\` heading (a bold label — never a new \`## \` heading, +which would break the terminal-heading requirement). It is the last thing the user reads +before the approval prompt: either what is still open, or an explicit all-clear. + +Emit exactly one of: + +- **None open:** a single line containing exactly \`NO UNRESOLVED DECISIONS\`, unbolded + and with no markdown emphasis. A bolded \`**NO UNRESOLVED DECISIONS**\` does NOT count + — keep it plain so the gate's exact-line check can't be gamed by emphasis. +- **Some open:** a bold header line \`**UNRESOLVED DECISIONS:**\` followed by one bullet + per unresolved item; the last bullet is the final line of the report. Each bullet names + the decision and what breaks if it ships deferred (e.g. + "- Auth provider choice — if deferred: the login flow can't be built"). If prior + reviews carry unresolved items, add one final bullet + "+ N unresolved from prior reviews — see review rows above". + +Compute the status (this avoids double-counting the current review): +1. **Current-review items:** list every unanswered AskUserQuestion / deferred decision + from THIS review's Unresolved Decisions section, one bullet each. You have these in + context — do not reconstruct them from the log. +2. **Prior-reviews count:** the log stores only \`unresolved:N\` counts, and the current + review's row is already written before this report runs. Take the latest fresh row per + review skill (same 7-day window as the dashboard), DROP the current skill's row + entirely (do NOT backfill an older row for it), and sum \`unresolved\` across the + rest. Add the "+ N unresolved from prior reviews" bullet only when that sum is > 0. + Never sum all log rows — reruns would double-count. +3. **Sentinel rule:** emit \`NO UNRESOLVED DECISIONS\` if and only if step 1 produced + zero items AND the step 2 count is zero. + ### Write to the plan file **PLAN MODE EXCEPTION — ALWAYS RUN:** This writes to the plan file, which is the one diff --git a/scripts/resolvers/review.ts b/scripts/resolvers/review.ts index 0c7cb8230..1b590cff2 100644 --- a/scripts/resolvers/review.ts +++ b/scripts/resolvers/review.ts @@ -119,14 +119,47 @@ Produce this markdown table: | DX Review | \\\`/plan-devex-review\\\` | Developer experience gaps | {runs} | {status} | {findings} | \\\`\\\`\\\` -Below the table, add these lines (omit any that are empty/not applicable): +Below the table, add these lines. **CODEX** and **CROSS-MODEL** are optional (omit when +empty); **VERDICT** is always present: - **CODEX:** (only if codex-review ran) — one-line summary of codex fixes - **CROSS-MODEL:** (only if both Claude and Codex reviews exist) — overlap analysis -- **UNRESOLVED:** total unresolved decisions across all reviews - **VERDICT:** list reviews that are CLEAR (e.g., "CEO + ENG CLEARED — ready to implement"). If Eng Review is not CLEAR and not skipped globally, append "eng review required". +**Unresolved-decisions status (MANDATORY — the final content of every report).** +After the VERDICT line, the report MUST end with an unresolved-decisions status. This is +never omitted — it does NOT fall under the "omit when empty" rule above. It is the final +non-whitespace line(s) of the report section, written as content under the +\\\`## GSTACK REVIEW REPORT\\\` heading (a bold label — never a new \\\`## \\\` heading, +which would break the terminal-heading requirement). It is the last thing the user reads +before the approval prompt: either what is still open, or an explicit all-clear. + +Emit exactly one of: + +- **None open:** a single line containing exactly \\\`NO UNRESOLVED DECISIONS\\\`, unbolded + and with no markdown emphasis. A bolded \\\`**NO UNRESOLVED DECISIONS**\\\` does NOT count + — keep it plain so the gate's exact-line check can't be gamed by emphasis. +- **Some open:** a bold header line \\\`**UNRESOLVED DECISIONS:**\\\` followed by one bullet + per unresolved item; the last bullet is the final line of the report. Each bullet names + the decision and what breaks if it ships deferred (e.g. + "- Auth provider choice — if deferred: the login flow can't be built"). If prior + reviews carry unresolved items, add one final bullet + "+ N unresolved from prior reviews — see review rows above". + +Compute the status (this avoids double-counting the current review): +1. **Current-review items:** list every unanswered AskUserQuestion / deferred decision + from THIS review's Unresolved Decisions section, one bullet each. You have these in + context — do not reconstruct them from the log. +2. **Prior-reviews count:** the log stores only \\\`unresolved:N\\\` counts, and the current + review's row is already written before this report runs. Take the latest fresh row per + review skill (same 7-day window as the dashboard), DROP the current skill's row + entirely (do NOT backfill an older row for it), and sum \\\`unresolved\\\` across the + rest. Add the "+ N unresolved from prior reviews" bullet only when that sum is > 0. + Never sum all log rows — reruns would double-count. +3. **Sentinel rule:** emit \\\`NO UNRESOLVED DECISIONS\\\` if and only if step 1 produced + zero items AND the step 2 count is zero. + ### Write to the plan file **PLAN MODE EXCEPTION — ALWAYS RUN:** This writes to the plan file, which is the one @@ -169,12 +202,19 @@ missing work — do NOT call ExitPlanMode: In-body prose that mentions "outside voice", "codex findings", or similar does NOT count — only the structured \`## GSTACK REVIEW REPORT\` section satisfies this check. -3. Confirm the report contains: a Runs / Status / Findings table, a VERDICT - line, and absorbs CODEX / CROSS-MODEL / UNRESOLVED lines if applicable. -4. If a plan file is in context for this skill invocation: confirm +3. Confirm the report contains a Runs / Status / Findings table and a VERDICT + line (CODEX / CROSS-MODEL lines are absorbed if applicable). +4. Confirm the report's FINAL non-whitespace line is the unresolved-decisions + status — either the exact unbolded sentinel \`NO UNRESOLVED DECISIONS\`, or a + bullet belonging to a \`**UNRESOLVED DECISIONS:**\` block that has at least one + bullet. This check is BLOCKING with no "if applicable" escape: a report with no + unresolved status, or with anything (CODEX / CROSS-MODEL / VERDICT / prose) + after it, FAILS the gate. A bolded \`**NO UNRESOLVED DECISIONS**\` does NOT + satisfy the sentinel — it must be the plain unbolded line. +5. If a plan file is in context for this skill invocation: confirm \`gstack-review-log\` was called and \`gstack-review-read\` was run at least once. If no plan file is in context (e.g. \`/codex consult\` against a - diff with no plan), this check short-circuits — checks 1-3 already + diff with no plan), this check short-circuits — checks 1-4 already short-circuit when no plan file exists. Failing this gate and calling ExitPlanMode anyway is a contract violation — diff --git a/test/gen-skill-docs.test.ts b/test/gen-skill-docs.test.ts index 3554094ca..a676190a2 100644 --- a/test/gen-skill-docs.test.ts +++ b/test/gen-skill-docs.test.ts @@ -3214,3 +3214,62 @@ describe('EXIT PLAN MODE GATE placement', () => { expect(codex).toContain('Failing this gate and calling ExitPlanMode anyway is a contract violation'); }); }); + +describe('GSTACK REVIEW REPORT mandatory unresolved-decisions status', () => { + // Report text rides in PLAN_FILE_REVIEW_REPORT → every report consumer gets it. + // devex-review is a report consumer but NOT a gate consumer, so the two target + // sets differ (CP5/CX5). Regression guard: a future token-cut that drops the + // unresolved-status line again fails here. See plan-flag-unresolved-issues. + const REPORT_CONSUMERS = [ + 'plan-ceo-review', + 'plan-eng-review', + 'plan-design-review', + 'plan-devex-review', + 'codex', + 'devex-review', + ]; + // Gate text rides in EXIT_PLAN_MODE_GATE (lives in SKILL.md, not sections). + const GATE_SKILLS = [ + 'plan-ceo-review', + 'plan-eng-review', + 'plan-design-review', + 'plan-devex-review', + 'codex', + ]; + + for (const skill of REPORT_CONSUMERS) { + test(`${skill}: report mandates the unresolved-decisions status as final content`, () => { + const content = readSkillUnion(skill); + expect(content).toContain('NO UNRESOLVED DECISIONS'); + // The "never omit / always final" contract must be present, not just the phrase. + expect(content).toContain('Unresolved-decisions status (MANDATORY'); + expect(content).toMatch(/never omitted/); + // \s+ tolerates the prose line-wrap between "final" and "non-whitespace". + expect(content).toMatch(/final\s+non-whitespace line/); + }); + } + + for (const skill of GATE_SKILLS) { + test(`${skill}: exit gate blocks unless the unresolved status is the final line`, () => { + const md = fs.readFileSync(path.join(ROOT, skill, 'SKILL.md'), 'utf-8'); + // Gate check #4 — present, sentinel named, and explicitly blocking (no escape). + expect(md).toContain('NO UNRESOLVED DECISIONS'); + expect(md).toContain('FINAL non-whitespace line is the unresolved-decisions'); + expect(md).toContain('FAILS the gate'); + }); + } + + test('scripts/resolvers/review.ts source carries the mandatory block + blocking gate', () => { + const src = fs.readFileSync(path.join(ROOT, 'scripts', 'resolvers', 'review.ts'), 'utf-8'); + // Report resolver: mandatory, never-omitted, exact sentinel, anti-double-count algorithm. + expect(src).toContain('Unresolved-decisions status (MANDATORY'); + expect(src).toContain('NO UNRESOLVED DECISIONS'); + expect(src).toContain('avoids double-counting'); + expect(src).toContain('DROP the current skill'); + // Gate resolver: the blocking final-line check with no "if applicable" escape. + expect(src).toContain('FINAL non-whitespace line is the unresolved-decisions'); + expect(src).toContain('FAILS the gate'); + // The old soft wording must be gone from the gate. + expect(src).not.toContain('absorbs CODEX / CROSS-MODEL / UNRESOLVED lines if applicable'); + }); +}); diff --git a/test/skill-e2e-plan.test.ts b/test/skill-e2e-plan.test.ts index 98fded4bb..042cfd069 100644 --- a/test/skill-e2e-plan.test.ts +++ b/test/skill-e2e-plan.test.ts @@ -692,7 +692,7 @@ Read plan.md — that's the plan to review. This is a standalone plan document, Proceed directly to the full review. Skip any AskUserQuestion calls — this is non-interactive. Skip the preamble bash block, lake intro, telemetry, and contributor mode sections. -CRITICAL REQUIREMENT: plan.md IS the plan file for this review session. After completing your review, you MUST write a "## GSTACK REVIEW REPORT" section to the END of plan.md, exactly as described in the "Plan File Review Report" section of SKILL.md. If gstack-review-read is not available or returns NO_REVIEWS, write the placeholder table with all four review rows (CEO, Codex, Eng, Design). Use the Edit tool to append to plan.md — do NOT overwrite the existing plan content. +CRITICAL REQUIREMENT: plan.md IS the plan file for this review session. After completing your review, you MUST write a "## GSTACK REVIEW REPORT" section to the END of plan.md, exactly as described in the "Plan File Review Report" section of SKILL.md. If gstack-review-read is not available or returns NO_REVIEWS, write the placeholder table with all five review rows (CEO, Codex, Eng, Design, DX). The report MUST end with the mandatory unresolved-decisions status as its final line — the exact unbolded line NO UNRESOLVED DECISIONS when nothing is open, or a "**UNRESOLVED DECISIONS:**" block of bullets when items remain. Nothing may follow it. Use the Edit tool to append to plan.md — do NOT overwrite the existing plan content. This review report at the bottom of the plan is the MOST IMPORTANT deliverable of this test.`, workingDirectory: planDir, @@ -741,7 +741,22 @@ This review report at the bottom of the plan is the MOST IMPORTANT deliverable o expect(afterReport).toContain('Eng Review'); expect(afterReport).toContain('Design Review'); - console.log('Plan review report found at bottom of plan.md'); + // Mandatory unresolved-decisions status (plan-flag-unresolved-issues): the report's + // final non-whitespace line must be the unresolved status — the exact sentinel or a + // bullet of an UNRESOLVED DECISIONS block, with nothing (CODEX/CROSS-MODEL/VERDICT/ + // prose) after it. + expect(afterReport).toContain('UNRESOLVED DECISIONS'); + const nonEmpty = planContent.split('\n').map(l => l.trim()).filter(l => l !== ''); + const lastLine = nonEmpty[nonEmpty.length - 1]; + const isSentinel = lastLine === 'NO UNRESOLVED DECISIONS'; + const isUnresolvedBullet = + /^[-*]\s+/.test(lastLine) && !/VERDICT/i.test(lastLine) && afterReport.includes('UNRESOLVED DECISIONS:'); + expect( + isSentinel || isUnresolvedBullet, + `report must end with the unresolved-decisions status; last line was: ${lastLine}`, + ).toBe(true); + + console.log('Plan review report found at bottom of plan.md (ends with unresolved status)'); }, 420_000); });