From b4c33261c725a23f75d75a94ab672874e13e0e36 Mon Sep 17 00:00:00 2001 From: Garry Tan Date: Mon, 16 Mar 2026 10:10:00 -0500 Subject: [PATCH] feat: add scope drift detection + verification of claims to /review MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Step 1.5: Before reviewing code quality, check if the diff matches stated intent. Flags scope creep and missing requirements (INFORMATIONAL). Step 5 addition: Every review claim must cite evidence — "this pattern is safe" needs a line reference, "tests cover this" needs a test name. Co-Authored-By: Claude Opus 4.6 (1M context) --- review/SKILL.md | 69 ++++++++++++++++++++++++++++++++++++++++++++ review/SKILL.md.tmpl | 44 ++++++++++++++++++++++++++++ 2 files changed, 113 insertions(+) diff --git a/review/SKILL.md b/review/SKILL.md index 32c597a3..c05e68aa 100644 --- a/review/SKILL.md +++ b/review/SKILL.md @@ -73,6 +73,31 @@ Then run: `mkdir -p ~/.gstack/contributor-logs && open ~/.gstack/contributor-log Slug: lowercase, hyphens, max 60 chars (e.g. `browse-snapshot-ref-gap`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" +## Completion Status Protocol + +When completing a skill workflow, report status using one of: +- **DONE** — All steps completed successfully. Evidence provided for each claim. +- **DONE_WITH_CONCERNS** — Completed, but with issues the user should know about. List each concern. +- **BLOCKED** — Cannot proceed. State what is blocking and what was tried. +- **NEEDS_CONTEXT** — Missing information required to continue. State exactly what you need. + +### Escalation + +It is always OK to stop and say "this is too hard for me" or "I'm not confident in this result." + +Bad work is worse than no work. You will not be penalized for escalating. +- If you have attempted a task 3 times without success, STOP and escalate. +- If you are uncertain about a security-sensitive change, STOP and escalate. +- If the scope of work exceeds what you can verify, STOP and escalate. + +Escalation format: +``` +STATUS: BLOCKED | NEEDS_CONTEXT +REASON: [1-2 sentences] +ATTEMPTED: [what you tried] +RECOMMENDATION: [what the user should do next] +``` + # Pre-Landing PR Review You are running the `/review` workflow. Analyze the current branch's diff against main for structural issues that tests don't catch. @@ -87,6 +112,40 @@ You are running the `/review` workflow. Analyze the current branch's diff agains --- +## Step 1.5: Scope Drift Detection + +Before reviewing code quality, check: **did they build what was requested — nothing more, nothing less?** + +1. Read `TODOS.md` (if it exists). Read PR description (`gh pr view --json body --jq .body 2>/dev/null || true`). + Read commit messages (`git log origin/main..HEAD --oneline`). + **If no PR exists:** rely on commit messages and TODOS.md for stated intent — this is the common case since /review runs before /ship creates the PR. +2. Identify the **stated intent** — what was this branch supposed to accomplish? +3. Run `git diff origin/main --stat` and compare the files changed against the stated intent. +4. Evaluate with skepticism: + + **SCOPE CREEP detection:** + - Files changed that are unrelated to the stated intent + - New features or refactors not mentioned in the plan + - "While I was in there..." changes that expand blast radius + + **MISSING REQUIREMENTS detection:** + - Requirements from TODOS.md/PR description not addressed in the diff + - Test coverage gaps for stated requirements + - Partial implementations (started but not finished) + +5. Output (before the main review begins): + ``` + Scope Check: [CLEAN / DRIFT DETECTED / REQUIREMENTS MISSING] + Intent: <1-line summary of what was requested> + Delivered: <1-line summary of what the diff actually does> + [If drift: list each out-of-scope change] + [If missing: list each unaddressed requirement] + ``` + +6. This is **INFORMATIONAL** — does not block the review. Proceed to Step 2. + +--- + ## Step 2: Read the checklist Read `.claude/skills/review/checklist.md`. @@ -139,6 +198,16 @@ Follow the output format specified in the checklist. Respect the suppressions - If only non-critical issues found: output findings. No further action needed. - If no issues found: output `Pre-Landing Review: No issues found.` +### Verification of claims + +Before producing the final review output: +- If you claim "this pattern is safe" → cite the specific line proving safety +- If you claim "this is handled elsewhere" → read and cite the handling code +- If you claim "tests cover this" → name the test file and method +- Never say "likely handled" or "probably tested" — verify or flag as unknown + +**Rationalization prevention:** "This looks fine" is not a finding. Either cite evidence it IS fine, or flag it as unverified. + ### Greptile comment resolution After outputting your own findings, if Greptile comments were classified in Step 2.5: diff --git a/review/SKILL.md.tmpl b/review/SKILL.md.tmpl index 124a5393..14afe6f7 100644 --- a/review/SKILL.md.tmpl +++ b/review/SKILL.md.tmpl @@ -30,6 +30,40 @@ You are running the `/review` workflow. Analyze the current branch's diff agains --- +## Step 1.5: Scope Drift Detection + +Before reviewing code quality, check: **did they build what was requested — nothing more, nothing less?** + +1. Read `TODOS.md` (if it exists). Read PR description (`gh pr view --json body --jq .body 2>/dev/null || true`). + Read commit messages (`git log origin/main..HEAD --oneline`). + **If no PR exists:** rely on commit messages and TODOS.md for stated intent — this is the common case since /review runs before /ship creates the PR. +2. Identify the **stated intent** — what was this branch supposed to accomplish? +3. Run `git diff origin/main --stat` and compare the files changed against the stated intent. +4. Evaluate with skepticism: + + **SCOPE CREEP detection:** + - Files changed that are unrelated to the stated intent + - New features or refactors not mentioned in the plan + - "While I was in there..." changes that expand blast radius + + **MISSING REQUIREMENTS detection:** + - Requirements from TODOS.md/PR description not addressed in the diff + - Test coverage gaps for stated requirements + - Partial implementations (started but not finished) + +5. Output (before the main review begins): + ``` + Scope Check: [CLEAN / DRIFT DETECTED / REQUIREMENTS MISSING] + Intent: <1-line summary of what was requested> + Delivered: <1-line summary of what the diff actually does> + [If drift: list each out-of-scope change] + [If missing: list each unaddressed requirement] + ``` + +6. This is **INFORMATIONAL** — does not block the review. Proceed to Step 2. + +--- + ## Step 2: Read the checklist Read `.claude/skills/review/checklist.md`. @@ -82,6 +116,16 @@ Follow the output format specified in the checklist. Respect the suppressions - If only non-critical issues found: output findings. No further action needed. - If no issues found: output `Pre-Landing Review: No issues found.` +### Verification of claims + +Before producing the final review output: +- If you claim "this pattern is safe" → cite the specific line proving safety +- If you claim "this is handled elsewhere" → read and cite the handling code +- If you claim "tests cover this" → name the test file and method +- Never say "likely handled" or "probably tested" — verify or flag as unknown + +**Rationalization prevention:** "This looks fine" is not a finding. Either cite evidence it IS fine, or flag it as unverified. + ### Greptile comment resolution After outputting your own findings, if Greptile comments were classified in Step 2.5: