From 62c25f8b6471c58214463ed0b410c120eab5673c Mon Sep 17 00:00:00 2001 From: Garry Tan Date: Sat, 21 Mar 2026 14:42:03 -0700 Subject: [PATCH] feat: comprehensive pre-merge readiness gate in /land-and-deploy MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Step 3.5 now checks 5 dimensions before allowing merge: 1. Review staleness — compares review commit hash against HEAD, flags if significant code changes happened after last review 2. Tests — runs free tests inline, checks today's E2E and LLM eval results from ~/.gstack-dev/evals/ 3. PR body accuracy — compares PR description against actual commits, flags missing features or stale descriptions 4. Document-release — checks if CHANGELOG/VERSION were updated when new features are present in the diff 5. Full readiness report — ASCII dashboard with warnings/blockers, explicit AskUserQuestion confirmation required before merge Co-Authored-By: Claude Opus 4.6 (1M context) --- .../skills/gstack-land-and-deploy/SKILL.md | 174 +++++++++++++----- land-and-deploy/SKILL.md | 174 +++++++++++++----- land-and-deploy/SKILL.md.tmpl | 174 +++++++++++++----- 3 files changed, 399 insertions(+), 123 deletions(-) diff --git a/.agents/skills/gstack-land-and-deploy/SKILL.md b/.agents/skills/gstack-land-and-deploy/SKILL.md index 68f81f68..3f98480a 100644 --- a/.agents/skills/gstack-land-and-deploy/SKILL.md +++ b/.agents/skills/gstack-land-and-deploy/SKILL.md @@ -369,79 +369,171 @@ If timeout (15 min): **STOP.** "CI has been running for 15 minutes. Investigate ## Step 3.5: Pre-merge readiness gate -**This is the critical safety check before an irreversible merge.** Gather ALL evidence, -present it to the user, and get explicit confirmation before merging. +**This is the critical safety check before an irreversible merge.** The merge cannot +be undone without a revert commit. Gather ALL evidence, build a readiness report, +and get explicit user confirmation before proceeding. -### 3.5a: Review Readiness Dashboard +Collect evidence for each check below. Track warnings (yellow) and blockers (red). + +### 3.5a: Review staleness check ```bash ~/.codex/skills/gstack/bin/gstack-review-read 2>/dev/null ``` -Parse the output. Find the most recent entry for each review skill within the last 7 days. -Check staleness: compare each review's commit hash against the current HEAD. +Parse the output. For each review skill (plan-eng-review, plan-ceo-review, +plan-design-review, design-review-lite, codex-review): -Display the dashboard: +1. Find the most recent entry within the last 7 days. +2. Extract its `commit` field. +3. Compare against current HEAD: `git rev-list --count STORED_COMMIT..HEAD` +**Staleness rules:** +- 0 commits since review → CURRENT +- 1-3 commits since review → RECENT (yellow if those commits touch code, not just docs) +- 4+ commits since review → STALE (red — review may not reflect current code) +- No review found → NOT RUN + +**Critical check:** Look at what changed AFTER the last review. Run: +```bash +git log --oneline STORED_COMMIT..HEAD ``` -PRE-MERGE READINESS CHECK -══════════════════════════ -Review | Status | Last Run | Stale? -----------------|-----------|---------------------|-------- -Eng Review | CLEAR/— | 2026-03-21 15:00 | N commits behind -CEO Review | CLEAR/— | — | — -Design Review | CLEAR/— | — | — -Codex Review | CLEAR/— | — | — -``` +If any commits after the review contain words like "fix", "refactor", "rewrite", +"overhaul", or touch more than 5 files — flag as **STALE (significant changes +since review)**. The review was done on different code than what's about to merge. ### 3.5b: Test results -Check for recent E2E eval results: +**Free tests — run them now:** + +Read CLAUDE.md to find the project's test command. If not specified, use `bun test`. +Run the test command and capture the exit code and output. + +```bash +bun test 2>&1 | tail -10 +``` + +If tests fail: **BLOCKER.** Cannot merge with failing tests. + +**E2E tests — check recent results:** ```bash ls -t ~/.gstack-dev/evals/*-e2e-*-$(date +%Y-%m-%d)*.json 2>/dev/null | head -20 ``` -For each eval file found from today, parse pass/fail counts and show a summary: +For each eval file from today, parse pass/fail counts. Show: +- Total tests, pass count, fail count +- How long ago the run finished (from file timestamp) +- Total cost +- Names of any failing tests -``` -E2E Tests | 52/52 pass | 25 min ago | $8.50 -Free Tests | (run bun test to check) -``` +If no E2E results from today: **WARNING — no E2E tests run today.** +If E2E results exist but have failures: **WARNING — N tests failed.** List them. -If no E2E results from today exist, note: "No E2E tests run today." - -Also run free tests inline: +**LLM judge evals — check recent results:** ```bash -bun test 2>&1 | tail -5 +ls -t ~/.gstack-dev/evals/*-llm-judge-*-$(date +%Y-%m-%d)*.json 2>/dev/null | head -5 ``` -### 3.5c: Document-release check +If found, parse and show pass/fail. If not found, note "No LLM evals run today." -Check if `/document-release` has been run on this branch: +### 3.5c: PR body accuracy check + +Read the current PR body: +```bash +gh pr view --json body -q .body +``` + +Read the current diff summary: +```bash +git log --oneline $(gh pr view --json baseRefName -q .baseRefName 2>/dev/null || echo main)..HEAD | head -20 +``` + +Compare the PR body against the actual commits. Check for: +1. **Missing features** — commits that add significant functionality not mentioned in the PR +2. **Stale descriptions** — PR body mentions things that were later changed or reverted +3. **Wrong version** — PR title or body references a version that doesn't match VERSION file + +If the PR body looks stale or incomplete: **WARNING — PR body may not reflect current +changes.** List what's missing or stale. + +### 3.5d: Document-release check + +Check if documentation was updated on this branch: ```bash -git log --oneline --grep="docs: update project documentation" HEAD~5..HEAD +git log --oneline --all-match --grep="docs:" $(gh pr view --json baseRefName -q .baseRefName 2>/dev/null || echo main)..HEAD | head -5 ``` -If no doc update commit found: flag "WARNING: /document-release has not been run." +Also check if key doc files were modified: +```bash +git diff --name-only $(gh pr view --json baseRefName -q .baseRefName 2>/dev/null || echo main)...HEAD -- README.md CHANGELOG.md ARCHITECTURE.md CONTRIBUTING.md CLAUDE.md VERSION +``` -### 3.5d: Merge confirmation +If CHANGELOG.md and VERSION were NOT modified on this branch and the diff includes +new features (new files, new commands, new skills): **WARNING — /document-release +likely not run. CHANGELOG and VERSION not updated despite new features.** -Present all evidence in a single AskUserQuestion: +If only docs changed (no code): skip this check. -- **Re-ground:** "About to merge PR #NNN on branch X to base branch Y." -- Show the review dashboard, test results, and doc-release status from above. -- If ANY of these are red flags (eng review missing/stale, E2E failures, no doc update), - list them explicitly as warnings. -- **RECOMMENDATION:** Choose A if everything is green. Choose B if there are warnings - to address first. Choose C to skip checks and merge anyway. -- A) Merge — all checks look good -- B) Don't merge yet — fix the flagged issues first -- C) Merge anyway — I accept the risks +### 3.5e: Readiness report and confirmation + +Build the full readiness report: + +``` +╔══════════════════════════════════════════════════════════╗ +║ PRE-MERGE READINESS REPORT ║ +╠══════════════════════════════════════════════════════════╣ +║ ║ +║ PR: #NNN — title ║ +║ Branch: feature → main ║ +║ ║ +║ REVIEWS ║ +║ ├─ Eng Review: CURRENT / STALE (N commits) / — ║ +║ ├─ CEO Review: CURRENT / — (optional) ║ +║ ├─ Design Review: CURRENT / — (optional) ║ +║ └─ Codex Review: CURRENT / — (optional) ║ +║ ║ +║ TESTS ║ +║ ├─ Free tests: PASS / FAIL (blocker) ║ +║ ├─ E2E tests: 52/52 pass (25 min ago) / NOT RUN ║ +║ └─ LLM evals: PASS / NOT RUN ║ +║ ║ +║ DOCUMENTATION ║ +║ ├─ CHANGELOG: Updated / NOT UPDATED (warning) ║ +║ ├─ VERSION: 0.9.8.0 / NOT BUMPED (warning) ║ +║ └─ Doc release: Run / NOT RUN (warning) ║ +║ ║ +║ PR BODY ║ +║ └─ Accuracy: Current / STALE (warning) ║ +║ ║ +║ WARNINGS: N | BLOCKERS: N ║ +╚══════════════════════════════════════════════════════════╝ +``` + +If there are BLOCKERS (failing free tests): list them and recommend B. +If there are WARNINGS but no blockers: list each warning and recommend A if +warnings are minor, or B if warnings are significant. +If everything is green: recommend A. + +Use AskUserQuestion: + +- **Re-ground:** "About to merge PR #NNN (title) from branch X to Y. Here's the + readiness report." Show the report above. +- List each warning and blocker explicitly. +- **RECOMMENDATION:** Choose A if green. Choose B if there are significant warnings. + Choose C only if the user understands the risks. +- A) Merge — readiness checks passed (Completeness: 10/10) +- B) Don't merge yet — address the warnings first (Completeness: 10/10) +- C) Merge anyway — I understand the risks (Completeness: 3/10) + +If the user chooses B: **STOP.** List exactly what needs to be done: +- If reviews are stale: "Re-run /plan-eng-review (or /review) to review current code." +- If E2E not run: "Run `bun run test:e2e` to verify." +- If docs not updated: "Run /document-release to update documentation." +- If PR body stale: "Update the PR body to reflect current changes." -If the user chooses B: **STOP** with a summary of what needs to be done. If the user chooses A or C: continue to Step 4. --- diff --git a/land-and-deploy/SKILL.md b/land-and-deploy/SKILL.md index 6b4271e4..d37798bf 100644 --- a/land-and-deploy/SKILL.md +++ b/land-and-deploy/SKILL.md @@ -376,79 +376,171 @@ If timeout (15 min): **STOP.** "CI has been running for 15 minutes. Investigate ## Step 3.5: Pre-merge readiness gate -**This is the critical safety check before an irreversible merge.** Gather ALL evidence, -present it to the user, and get explicit confirmation before merging. +**This is the critical safety check before an irreversible merge.** The merge cannot +be undone without a revert commit. Gather ALL evidence, build a readiness report, +and get explicit user confirmation before proceeding. -### 3.5a: Review Readiness Dashboard +Collect evidence for each check below. Track warnings (yellow) and blockers (red). + +### 3.5a: Review staleness check ```bash ~/.claude/skills/gstack/bin/gstack-review-read 2>/dev/null ``` -Parse the output. Find the most recent entry for each review skill within the last 7 days. -Check staleness: compare each review's commit hash against the current HEAD. +Parse the output. For each review skill (plan-eng-review, plan-ceo-review, +plan-design-review, design-review-lite, codex-review): -Display the dashboard: +1. Find the most recent entry within the last 7 days. +2. Extract its `commit` field. +3. Compare against current HEAD: `git rev-list --count STORED_COMMIT..HEAD` +**Staleness rules:** +- 0 commits since review → CURRENT +- 1-3 commits since review → RECENT (yellow if those commits touch code, not just docs) +- 4+ commits since review → STALE (red — review may not reflect current code) +- No review found → NOT RUN + +**Critical check:** Look at what changed AFTER the last review. Run: +```bash +git log --oneline STORED_COMMIT..HEAD ``` -PRE-MERGE READINESS CHECK -══════════════════════════ -Review | Status | Last Run | Stale? -----------------|-----------|---------------------|-------- -Eng Review | CLEAR/— | 2026-03-21 15:00 | N commits behind -CEO Review | CLEAR/— | — | — -Design Review | CLEAR/— | — | — -Codex Review | CLEAR/— | — | — -``` +If any commits after the review contain words like "fix", "refactor", "rewrite", +"overhaul", or touch more than 5 files — flag as **STALE (significant changes +since review)**. The review was done on different code than what's about to merge. ### 3.5b: Test results -Check for recent E2E eval results: +**Free tests — run them now:** + +Read CLAUDE.md to find the project's test command. If not specified, use `bun test`. +Run the test command and capture the exit code and output. + +```bash +bun test 2>&1 | tail -10 +``` + +If tests fail: **BLOCKER.** Cannot merge with failing tests. + +**E2E tests — check recent results:** ```bash ls -t ~/.gstack-dev/evals/*-e2e-*-$(date +%Y-%m-%d)*.json 2>/dev/null | head -20 ``` -For each eval file found from today, parse pass/fail counts and show a summary: +For each eval file from today, parse pass/fail counts. Show: +- Total tests, pass count, fail count +- How long ago the run finished (from file timestamp) +- Total cost +- Names of any failing tests -``` -E2E Tests | 52/52 pass | 25 min ago | $8.50 -Free Tests | (run bun test to check) -``` +If no E2E results from today: **WARNING — no E2E tests run today.** +If E2E results exist but have failures: **WARNING — N tests failed.** List them. -If no E2E results from today exist, note: "No E2E tests run today." - -Also run free tests inline: +**LLM judge evals — check recent results:** ```bash -bun test 2>&1 | tail -5 +ls -t ~/.gstack-dev/evals/*-llm-judge-*-$(date +%Y-%m-%d)*.json 2>/dev/null | head -5 ``` -### 3.5c: Document-release check +If found, parse and show pass/fail. If not found, note "No LLM evals run today." -Check if `/document-release` has been run on this branch: +### 3.5c: PR body accuracy check + +Read the current PR body: +```bash +gh pr view --json body -q .body +``` + +Read the current diff summary: +```bash +git log --oneline $(gh pr view --json baseRefName -q .baseRefName 2>/dev/null || echo main)..HEAD | head -20 +``` + +Compare the PR body against the actual commits. Check for: +1. **Missing features** — commits that add significant functionality not mentioned in the PR +2. **Stale descriptions** — PR body mentions things that were later changed or reverted +3. **Wrong version** — PR title or body references a version that doesn't match VERSION file + +If the PR body looks stale or incomplete: **WARNING — PR body may not reflect current +changes.** List what's missing or stale. + +### 3.5d: Document-release check + +Check if documentation was updated on this branch: ```bash -git log --oneline --grep="docs: update project documentation" HEAD~5..HEAD +git log --oneline --all-match --grep="docs:" $(gh pr view --json baseRefName -q .baseRefName 2>/dev/null || echo main)..HEAD | head -5 ``` -If no doc update commit found: flag "WARNING: /document-release has not been run." +Also check if key doc files were modified: +```bash +git diff --name-only $(gh pr view --json baseRefName -q .baseRefName 2>/dev/null || echo main)...HEAD -- README.md CHANGELOG.md ARCHITECTURE.md CONTRIBUTING.md CLAUDE.md VERSION +``` -### 3.5d: Merge confirmation +If CHANGELOG.md and VERSION were NOT modified on this branch and the diff includes +new features (new files, new commands, new skills): **WARNING — /document-release +likely not run. CHANGELOG and VERSION not updated despite new features.** -Present all evidence in a single AskUserQuestion: +If only docs changed (no code): skip this check. -- **Re-ground:** "About to merge PR #NNN on branch X to base branch Y." -- Show the review dashboard, test results, and doc-release status from above. -- If ANY of these are red flags (eng review missing/stale, E2E failures, no doc update), - list them explicitly as warnings. -- **RECOMMENDATION:** Choose A if everything is green. Choose B if there are warnings - to address first. Choose C to skip checks and merge anyway. -- A) Merge — all checks look good -- B) Don't merge yet — fix the flagged issues first -- C) Merge anyway — I accept the risks +### 3.5e: Readiness report and confirmation + +Build the full readiness report: + +``` +╔══════════════════════════════════════════════════════════╗ +║ PRE-MERGE READINESS REPORT ║ +╠══════════════════════════════════════════════════════════╣ +║ ║ +║ PR: #NNN — title ║ +║ Branch: feature → main ║ +║ ║ +║ REVIEWS ║ +║ ├─ Eng Review: CURRENT / STALE (N commits) / — ║ +║ ├─ CEO Review: CURRENT / — (optional) ║ +║ ├─ Design Review: CURRENT / — (optional) ║ +║ └─ Codex Review: CURRENT / — (optional) ║ +║ ║ +║ TESTS ║ +║ ├─ Free tests: PASS / FAIL (blocker) ║ +║ ├─ E2E tests: 52/52 pass (25 min ago) / NOT RUN ║ +║ └─ LLM evals: PASS / NOT RUN ║ +║ ║ +║ DOCUMENTATION ║ +║ ├─ CHANGELOG: Updated / NOT UPDATED (warning) ║ +║ ├─ VERSION: 0.9.8.0 / NOT BUMPED (warning) ║ +║ └─ Doc release: Run / NOT RUN (warning) ║ +║ ║ +║ PR BODY ║ +║ └─ Accuracy: Current / STALE (warning) ║ +║ ║ +║ WARNINGS: N | BLOCKERS: N ║ +╚══════════════════════════════════════════════════════════╝ +``` + +If there are BLOCKERS (failing free tests): list them and recommend B. +If there are WARNINGS but no blockers: list each warning and recommend A if +warnings are minor, or B if warnings are significant. +If everything is green: recommend A. + +Use AskUserQuestion: + +- **Re-ground:** "About to merge PR #NNN (title) from branch X to Y. Here's the + readiness report." Show the report above. +- List each warning and blocker explicitly. +- **RECOMMENDATION:** Choose A if green. Choose B if there are significant warnings. + Choose C only if the user understands the risks. +- A) Merge — readiness checks passed (Completeness: 10/10) +- B) Don't merge yet — address the warnings first (Completeness: 10/10) +- C) Merge anyway — I understand the risks (Completeness: 3/10) + +If the user chooses B: **STOP.** List exactly what needs to be done: +- If reviews are stale: "Re-run /plan-eng-review (or /review) to review current code." +- If E2E not run: "Run `bun run test:e2e` to verify." +- If docs not updated: "Run /document-release to update documentation." +- If PR body stale: "Update the PR body to reflect current changes." -If the user chooses B: **STOP** with a summary of what needs to be done. If the user chooses A or C: continue to Step 4. --- diff --git a/land-and-deploy/SKILL.md.tmpl b/land-and-deploy/SKILL.md.tmpl index a026eca2..d1ddd7b7 100644 --- a/land-and-deploy/SKILL.md.tmpl +++ b/land-and-deploy/SKILL.md.tmpl @@ -118,79 +118,171 @@ If timeout (15 min): **STOP.** "CI has been running for 15 minutes. Investigate ## Step 3.5: Pre-merge readiness gate -**This is the critical safety check before an irreversible merge.** Gather ALL evidence, -present it to the user, and get explicit confirmation before merging. +**This is the critical safety check before an irreversible merge.** The merge cannot +be undone without a revert commit. Gather ALL evidence, build a readiness report, +and get explicit user confirmation before proceeding. -### 3.5a: Review Readiness Dashboard +Collect evidence for each check below. Track warnings (yellow) and blockers (red). + +### 3.5a: Review staleness check ```bash ~/.claude/skills/gstack/bin/gstack-review-read 2>/dev/null ``` -Parse the output. Find the most recent entry for each review skill within the last 7 days. -Check staleness: compare each review's commit hash against the current HEAD. +Parse the output. For each review skill (plan-eng-review, plan-ceo-review, +plan-design-review, design-review-lite, codex-review): -Display the dashboard: +1. Find the most recent entry within the last 7 days. +2. Extract its `commit` field. +3. Compare against current HEAD: `git rev-list --count STORED_COMMIT..HEAD` +**Staleness rules:** +- 0 commits since review → CURRENT +- 1-3 commits since review → RECENT (yellow if those commits touch code, not just docs) +- 4+ commits since review → STALE (red — review may not reflect current code) +- No review found → NOT RUN + +**Critical check:** Look at what changed AFTER the last review. Run: +```bash +git log --oneline STORED_COMMIT..HEAD ``` -PRE-MERGE READINESS CHECK -══════════════════════════ -Review | Status | Last Run | Stale? -----------------|-----------|---------------------|-------- -Eng Review | CLEAR/— | 2026-03-21 15:00 | N commits behind -CEO Review | CLEAR/— | — | — -Design Review | CLEAR/— | — | — -Codex Review | CLEAR/— | — | — -``` +If any commits after the review contain words like "fix", "refactor", "rewrite", +"overhaul", or touch more than 5 files — flag as **STALE (significant changes +since review)**. The review was done on different code than what's about to merge. ### 3.5b: Test results -Check for recent E2E eval results: +**Free tests — run them now:** + +Read CLAUDE.md to find the project's test command. If not specified, use `bun test`. +Run the test command and capture the exit code and output. + +```bash +bun test 2>&1 | tail -10 +``` + +If tests fail: **BLOCKER.** Cannot merge with failing tests. + +**E2E tests — check recent results:** ```bash ls -t ~/.gstack-dev/evals/*-e2e-*-$(date +%Y-%m-%d)*.json 2>/dev/null | head -20 ``` -For each eval file found from today, parse pass/fail counts and show a summary: +For each eval file from today, parse pass/fail counts. Show: +- Total tests, pass count, fail count +- How long ago the run finished (from file timestamp) +- Total cost +- Names of any failing tests -``` -E2E Tests | 52/52 pass | 25 min ago | $8.50 -Free Tests | (run bun test to check) -``` +If no E2E results from today: **WARNING — no E2E tests run today.** +If E2E results exist but have failures: **WARNING — N tests failed.** List them. -If no E2E results from today exist, note: "No E2E tests run today." - -Also run free tests inline: +**LLM judge evals — check recent results:** ```bash -bun test 2>&1 | tail -5 +ls -t ~/.gstack-dev/evals/*-llm-judge-*-$(date +%Y-%m-%d)*.json 2>/dev/null | head -5 ``` -### 3.5c: Document-release check +If found, parse and show pass/fail. If not found, note "No LLM evals run today." -Check if `/document-release` has been run on this branch: +### 3.5c: PR body accuracy check + +Read the current PR body: +```bash +gh pr view --json body -q .body +``` + +Read the current diff summary: +```bash +git log --oneline $(gh pr view --json baseRefName -q .baseRefName 2>/dev/null || echo main)..HEAD | head -20 +``` + +Compare the PR body against the actual commits. Check for: +1. **Missing features** — commits that add significant functionality not mentioned in the PR +2. **Stale descriptions** — PR body mentions things that were later changed or reverted +3. **Wrong version** — PR title or body references a version that doesn't match VERSION file + +If the PR body looks stale or incomplete: **WARNING — PR body may not reflect current +changes.** List what's missing or stale. + +### 3.5d: Document-release check + +Check if documentation was updated on this branch: ```bash -git log --oneline --grep="docs: update project documentation" HEAD~5..HEAD +git log --oneline --all-match --grep="docs:" $(gh pr view --json baseRefName -q .baseRefName 2>/dev/null || echo main)..HEAD | head -5 ``` -If no doc update commit found: flag "WARNING: /document-release has not been run." +Also check if key doc files were modified: +```bash +git diff --name-only $(gh pr view --json baseRefName -q .baseRefName 2>/dev/null || echo main)...HEAD -- README.md CHANGELOG.md ARCHITECTURE.md CONTRIBUTING.md CLAUDE.md VERSION +``` -### 3.5d: Merge confirmation +If CHANGELOG.md and VERSION were NOT modified on this branch and the diff includes +new features (new files, new commands, new skills): **WARNING — /document-release +likely not run. CHANGELOG and VERSION not updated despite new features.** -Present all evidence in a single AskUserQuestion: +If only docs changed (no code): skip this check. -- **Re-ground:** "About to merge PR #NNN on branch X to base branch Y." -- Show the review dashboard, test results, and doc-release status from above. -- If ANY of these are red flags (eng review missing/stale, E2E failures, no doc update), - list them explicitly as warnings. -- **RECOMMENDATION:** Choose A if everything is green. Choose B if there are warnings - to address first. Choose C to skip checks and merge anyway. -- A) Merge — all checks look good -- B) Don't merge yet — fix the flagged issues first -- C) Merge anyway — I accept the risks +### 3.5e: Readiness report and confirmation + +Build the full readiness report: + +``` +╔══════════════════════════════════════════════════════════╗ +║ PRE-MERGE READINESS REPORT ║ +╠══════════════════════════════════════════════════════════╣ +║ ║ +║ PR: #NNN — title ║ +║ Branch: feature → main ║ +║ ║ +║ REVIEWS ║ +║ ├─ Eng Review: CURRENT / STALE (N commits) / — ║ +║ ├─ CEO Review: CURRENT / — (optional) ║ +║ ├─ Design Review: CURRENT / — (optional) ║ +║ └─ Codex Review: CURRENT / — (optional) ║ +║ ║ +║ TESTS ║ +║ ├─ Free tests: PASS / FAIL (blocker) ║ +║ ├─ E2E tests: 52/52 pass (25 min ago) / NOT RUN ║ +║ └─ LLM evals: PASS / NOT RUN ║ +║ ║ +║ DOCUMENTATION ║ +║ ├─ CHANGELOG: Updated / NOT UPDATED (warning) ║ +║ ├─ VERSION: 0.9.8.0 / NOT BUMPED (warning) ║ +║ └─ Doc release: Run / NOT RUN (warning) ║ +║ ║ +║ PR BODY ║ +║ └─ Accuracy: Current / STALE (warning) ║ +║ ║ +║ WARNINGS: N | BLOCKERS: N ║ +╚══════════════════════════════════════════════════════════╝ +``` + +If there are BLOCKERS (failing free tests): list them and recommend B. +If there are WARNINGS but no blockers: list each warning and recommend A if +warnings are minor, or B if warnings are significant. +If everything is green: recommend A. + +Use AskUserQuestion: + +- **Re-ground:** "About to merge PR #NNN (title) from branch X to Y. Here's the + readiness report." Show the report above. +- List each warning and blocker explicitly. +- **RECOMMENDATION:** Choose A if green. Choose B if there are significant warnings. + Choose C only if the user understands the risks. +- A) Merge — readiness checks passed (Completeness: 10/10) +- B) Don't merge yet — address the warnings first (Completeness: 10/10) +- C) Merge anyway — I understand the risks (Completeness: 3/10) + +If the user chooses B: **STOP.** List exactly what needs to be done: +- If reviews are stale: "Re-run /plan-eng-review (or /review) to review current code." +- If E2E not run: "Run `bun run test:e2e` to verify." +- If docs not updated: "Run /document-release to update documentation." +- If PR body stale: "Update the PR body to reflect current changes." -If the user chooses B: **STOP** with a summary of what needs to be done. If the user chooses A or C: continue to Step 4. ---