mirror of
https://github.com/garrytan/gstack.git
synced 2026-05-01 19:25:10 +02:00
feat: adaptive gating + cross-review dedup for review army (v0.15.2.0) (#760)
* feat: add test_stub optional field to specialist finding schema

  All specialist prompts now document test_stub as an optional output field, enabling specialists to suggest test code alongside findings.

* feat: adaptive gating + test framework detection for review army

  Adds a gstack-specialist-stats binary for tracking specialist hit rates. The resolver now detects the test framework for test_stub generation, applies adaptive gating to skip silent specialists, and compiles per-specialist stats for the review-log entry.

* feat: cross-review finding dedup + test stub override + enriched review-log

  Step 5.0 suppresses findings previously skipped by the user when the relevant code hasn't changed. Test-stub findings force ASK classification so users approve test creation. The review-log now includes quality_score, per-specialist stats, and per-finding action records.

* chore: bump version and changelog (v0.15.2.0)

* fix: bash operator precedence in test framework detection

  `[ -f a ] || [ -f b ] && X="y"` is easy to misread with C-style precedence as `A || (B && C)`; in POSIX shell, `||` and `&&` actually have equal precedence and associate left to right, so the list parses as `(A || B) && C`. Wrapping the OR group in braces, `{ [ -f a ] || [ -f b ]; } && X="y"`, makes that grouping explicit.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
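The precedence behavior referenced in the fix commit can be checked directly; this is a standalone sketch, not part of the repo:

```shell
# In POSIX shell, A || B && C parses as (A || B) && C, not A || (B && C):
# && and || have equal precedence and evaluate left to right.
false || true && echo "first"    # (false || true) succeeds, so echo runs
true || false && echo "second"   # (true || false) succeeds, so echo runs
# Under C-style precedence the second line would print nothing; in shell
# both lines print.
```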
@@ -1,5 +1,16 @@
# Changelog

## [0.15.8.0] - 2026-04-04 — Smarter Reviews

Code reviews now learn from your decisions. Skip a finding once and it stays quiet until the code changes. Specialists auto-suggest test stubs alongside their findings. And silent specialists that never find anything get auto-gated so reviews stay fast.

### Added

- **Cross-review finding dedup.** When you skip a finding in one review, gstack remembers. On the next review, if the relevant code hasn't changed, the finding stays suppressed. No more re-skipping the same intentional pattern every PR.
- **Test stub suggestions.** Specialists can now include a skeleton test alongside each finding. The test uses your project's detected framework (Jest, Vitest, RSpec, pytest, Go test). Findings with test stubs get surfaced as ASK items so you decide whether to create the test.
- **Adaptive specialist gating.** Specialists that have been dispatched 10+ times with zero findings get auto-gated. Security and data-migration are exempt (insurance policies always run). Force any specialist back with `--security`, `--performance`, etc.
- **Per-specialist stats in review log.** Every review now records which specialists ran, how many findings each produced, and which were skipped or gated. This powers the adaptive gating and gives /retro richer data.

## [0.15.7.0] - 2026-04-05 — Security Wave 1

Fourteen fixes for the security audit (#783). Design server no longer binds all interfaces. Path traversal, auth bypass, CORS wildcard, world-readable files, prompt injection, and symlink race conditions all closed. Community PRs from @Gonzih and @garagon included.
Executable
+65
@@ -0,0 +1,65 @@
#!/usr/bin/env bash
# gstack-specialist-stats — compute per-specialist hit rates from review history
# Usage: gstack-specialist-stats
#
# Reads all *-reviews.jsonl files across branches, parses specialist fields,
# and outputs hit rates. Tags specialists as GATE_CANDIDATE (0 findings in 10+
# dispatches) or NEVER_GATE (security, data-migration — insurance policy).
set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
eval "$("$SCRIPT_DIR/gstack-slug" 2>/dev/null)"
GSTACK_HOME="${GSTACK_HOME:-$HOME/.gstack}"
PROJECT_DIR="$GSTACK_HOME/projects/$SLUG"

if [ ! -d "$PROJECT_DIR" ]; then
  echo "SPECIALIST_STATS: 0 reviews analyzed"
  exit 0
fi

# Collect all review JSONL files (strip ---CONFIG--- and ---HEAD--- footers)
COMBINED=""
for f in "$PROJECT_DIR"/*-reviews.jsonl; do
  [ -f "$f" ] || continue
  COMBINED="$COMBINED$(sed '/^---/,$d' "$f" 2>/dev/null)
"
done

if [ -z "$COMBINED" ]; then
  echo "SPECIALIST_STATS: 0 reviews analyzed"
  exit 0
fi

printf '%s' "$COMBINED" | bun -e "
const lines = (await Bun.stdin.text()).trim().split('\n').filter(Boolean);
const NEVER_GATE = new Set(['security', 'data-migration']);
const stats = {};
let reviewed = 0;

for (const line of lines) {
  try {
    const e = JSON.parse(line);
    if (!e.specialists) continue;
    reviewed++;
    for (const [name, info] of Object.entries(e.specialists)) {
      if (!stats[name]) stats[name] = { dispatched: 0, findings: 0 };
      if (info.dispatched) {
        stats[name].dispatched++;
        stats[name].findings += (info.findings || 0);
      }
    }
  } catch {}
}

console.log('SPECIALIST_STATS: ' + reviewed + ' reviews analyzed');
const sorted = Object.entries(stats).sort((a, b) => a[0].localeCompare(b[0]));
for (const [name, s] of sorted) {
  const pct = s.dispatched > 0 ? Math.round(100 * s.findings / s.dispatched) : 0;
  let tag = '';
  if (NEVER_GATE.has(name)) {
    tag = ' [NEVER_GATE]';
  } else if (s.dispatched >= 10 && s.findings === 0) {
    tag = ' [GATE_CANDIDATE]';
  }
  console.log(name + ': ' + s.dispatched + '/' + reviewed + ' dispatched, ' + s.findings + ' findings (' + pct + '%)' + tag);
}
" 2>/dev/null || { echo "SPECIALIST_STATS: 0 reviews analyzed"; exit 0; }

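The `sed '/^---/,$d'` footer stripping used in the collection loop can be exercised on its own; the sample lines below are invented:

```shell
# Delete everything from the first line starting with --- through EOF,
# keeping only the JSONL entries above the footer.
printf '%s\n' '{"a":1}' '{"b":2}' '---CONFIG---' 'x=1' | sed '/^---/,$d'
# → {"a":1}
# → {"b":2}
```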
+1
-1
@@ -1,6 +1,6 @@
{
  "name": "gstack",
  "version": "0.15.6.0",
  "version": "0.15.8.0",
  "description": "Garry's Stack — Claude Code skills + fast headless browser. One repo, one install, entire AI engineering workflow.",
  "license": "MIT",
  "type": "module",
+86
-4
@@ -894,6 +894,20 @@ STACK=""
echo "STACK: ${STACK:-unknown}"
DIFF_LINES=$(git diff origin/<base> --stat | tail -1 | grep -oE '[0-9]+ insertion' | grep -oE '[0-9]+' || echo "0")
echo "DIFF_LINES: $DIFF_LINES"
# Detect test framework for specialist test stub generation
TEST_FW=""
{ [ -f jest.config.ts ] || [ -f jest.config.js ]; } && TEST_FW="jest"
[ -f vitest.config.ts ] && TEST_FW="vitest"
{ [ -f spec/spec_helper.rb ] || [ -f .rspec ]; } && TEST_FW="rspec"
{ [ -f pytest.ini ] || [ -f conftest.py ]; } && TEST_FW="pytest"
[ -f go.mod ] && TEST_FW="go-test"
echo "TEST_FW: ${TEST_FW:-unknown}"
```
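One subtlety of the detection block above: the checks run in sequence, so the last matching check wins. A throwaway sanity check (hypothetical temp directory, not part of the skill):

```shell
# A repo containing both a Jest config and go.mod reports go-test,
# because the go.mod check runs last and overwrites TEST_FW.
tmp=$(mktemp -d)
cd "$tmp"
touch jest.config.js go.mod
TEST_FW=""
{ [ -f jest.config.ts ] || [ -f jest.config.js ]; } && TEST_FW="jest"
[ -f vitest.config.ts ] && TEST_FW="vitest"
[ -f go.mod ] && TEST_FW="go-test"
echo "TEST_FW: ${TEST_FW:-unknown}"   # → TEST_FW: go-test
```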

### Read specialist hit rates (adaptive gating)

```bash
~/.claude/skills/gstack/bin/gstack-specialist-stats 2>/dev/null || true
```

### Select specialists
@@ -913,8 +927,18 @@ Based on the scope signals above, select which specialists to dispatch.
6. **API Contract** — if SCOPE_API=true. Read `~/.claude/skills/gstack/review/specialists/api-contract.md`
7. **Design** — if SCOPE_FRONTEND=true. Use the existing design review checklist at `~/.claude/skills/gstack/review/design-checklist.md`

Note which specialists were selected and which were skipped. Print the selection:
"Dispatching N specialists: [names]. Skipped: [names] (scope not detected)."

### Adaptive gating

After scope-based selection, apply adaptive gating based on specialist hit rates:

For each conditional specialist that passed scope gating, check the `gstack-specialist-stats` output above:
- If tagged `[GATE_CANDIDATE]` (0 findings in 10+ dispatches): skip it. Print: "[specialist] auto-gated (0 findings in N reviews)."
- If tagged `[NEVER_GATE]`: always dispatch regardless of hit rate. Security and data-migration are insurance policy specialists — they should run even when silent.

**Force flags:** If the user's prompt includes `--security`, `--performance`, `--testing`, `--maintainability`, `--data-migration`, `--api-contract`, `--design`, or `--all-specialists`, force-include that specialist regardless of gating.

Note which specialists were selected, gated, and skipped. Print the selection:
"Dispatching N specialists: [names]. Skipped: [names] (scope not detected). Gated: [names] (0 findings in N+ reviews)."

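The stats tags are designed to be trivially machine-readable. A minimal sketch of consuming them (the sample stats lines are invented):

```shell
STATS='security: 5/12 dispatched, 3 findings (60%) [NEVER_GATE]
performance: 11/12 dispatched, 0 findings (0%) [GATE_CANDIDATE]'
# Auto-gated specialists are exactly the GATE_CANDIDATE-tagged lines;
# the specialist name is everything before the first colon.
GATED=$(printf '%s\n' "$STATS" | grep -F '[GATE_CANDIDATE]' | cut -d: -f1)
echo "$GATED"   # → performance
```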
---
@@ -947,7 +971,11 @@ For each finding, output a JSON object on its own line:
{\"severity\":\"CRITICAL|INFORMATIONAL\",\"confidence\":N,\"path\":\"file\",\"line\":N,\"category\":\"category\",\"summary\":\"description\",\"fix\":\"recommended fix\",\"fingerprint\":\"path:line:category\",\"specialist\":\"name\"}

Required fields: severity, confidence, path, category, summary, specialist.
Optional: line, fix, fingerprint, evidence.
Optional: line, fix, fingerprint, evidence, test_stub.

If you can write a test that would catch this issue, include it in the `test_stub` field.
Use the detected test framework ({TEST_FW}). Write a minimal skeleton — describe/it/test
blocks with clear intent. Skip test_stub for architectural or design-only findings.

If no findings: output `NO FINDINGS` and nothing else.
Do not output anything else — no preamble, no summary, no commentary.
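A hypothetical finding carrying a `test_stub` might look like the following; all field values are illustrative, and the snippet only checks that the object parses as JSON:

```shell
cat <<'EOF' | python3 -c 'import json,sys; f=json.load(sys.stdin); print(f["specialist"], "test_stub" in f)'
{"severity":"INFORMATIONAL","confidence":80,"path":"src/sum.ts","line":4,
 "category":"testing","summary":"empty-input edge case uncovered",
 "fix":"add a test","fingerprint":"src/sum.ts:4:testing","specialist":"testing",
 "test_stub":"it('handles empty input', () => { expect(sum([])).toBe(0); });"}
EOF
# → testing True
```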
@@ -1014,6 +1042,17 @@ PR Quality Score: X/10
These findings flow into Step 5 Fix-First alongside the CRITICAL pass findings from Step 4.
The Fix-First heuristic applies identically — specialist findings follow the same AUTO-FIX vs ASK classification.

**Compile per-specialist stats:**
After merging findings, compile a `specialists` object for the review-log entry in Step 5.8.
For each specialist (testing, maintainability, security, performance, data-migration, api-contract, design, red-team):
- If dispatched: `{"dispatched": true, "findings": N, "critical": N, "informational": N}`
- If skipped by scope: `{"dispatched": false, "reason": "scope"}`
- If skipped by gating: `{"dispatched": false, "reason": "gated"}`
- If not applicable (e.g., red-team not activated): omit from the object

Include the Design specialist even though it uses `design-checklist.md` instead of the specialist schema files.
Remember these stats — you will need them for the review-log entry in Step 5.8.
---

### Red Team dispatch (conditional)
@@ -1046,6 +1085,38 @@ If the Red Team subagent fails or times out, skip silently and continue.
**Every finding gets action — not just critical ones.**

### Step 5.0: Cross-review finding dedup

Before classifying findings, check if any were previously skipped by the user in a prior review on this branch.

```bash
~/.claude/skills/gstack/bin/gstack-review-read
```

Parse the output: only lines BEFORE `---CONFIG---` are JSONL entries (the output also contains `---CONFIG---` and `---HEAD---` footer sections that are not JSONL — ignore those).

For each JSONL entry that has a `findings` array:
1. Collect all fingerprints where `action: "skipped"`
2. Note the `commit` field from that entry

If skipped fingerprints exist, get the list of files changed since that review:

```bash
git diff --name-only <prior-review-commit> HEAD
```

For each current finding (from both Step 4 critical pass and Step 4.5-4.6 specialists), check:
- Does its fingerprint match a previously skipped finding?
- Is the finding's file path NOT in the changed-files set?

If both conditions are true: suppress the finding. It was intentionally skipped and the relevant code hasn't changed.

Print: "Suppressed N findings from prior reviews (previously skipped by user)"

**Only suppress `skipped` findings — never `fixed` or `auto-fixed`** (those might regress and should be re-checked).

If no prior reviews exist or none have a `findings` array, skip this step silently.
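The two-condition check above can be sketched as a small helper; variable names and sample data are illustrative, not part of the skill:

```shell
# A finding is suppressed only if (1) its fingerprint appears in the
# previously-skipped set AND (2) its file is NOT in the changed-files list.
suppress_finding() {  # args: fingerprint path; reads $SKIPPED_FPS, $CHANGED_FILES
  printf '%s\n' "$SKIPPED_FPS"   | grep -qxF "$1" || return 1  # never skipped
  printf '%s\n' "$CHANGED_FILES" | grep -qxF "$2" && return 1  # file changed
  return 0                                                     # suppress
}

SKIPPED_FPS='src/a.ts:12:security'
CHANGED_FILES='src/b.ts'
suppress_finding 'src/a.ts:12:security' 'src/a.ts' && echo "suppressed"
# → suppressed
```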

Output a summary header: `Pre-Landing Review: N issues (X critical, Y informational)`
### Step 5a: Classify each finding
@@ -1054,6 +1125,14 @@ For each finding, classify as AUTO-FIX or ASK per the Fix-First Heuristic in
checklist.md. Critical findings lean toward ASK; informational findings lean
toward AUTO-FIX.

**Test stub override:** Any finding that has a `test_stub` field (generated by a specialist)
is reclassified as ASK regardless of its original classification. When presenting the ASK
item, show the proposed test file path and the test code. The user approves or skips the
test creation. If approved, write the fix + test file. Derive the test file path from
the finding's `path` using project conventions (`spec/` for RSpec, `__tests__/` for
Jest/Vitest, `test_` prefix for pytest, `_test.go` suffix for Go). If the test file
already exists, append the new test. Output: `[FIXED + TEST] [file:line] Problem -> fix + test at [test_path]`
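The path derivation can be sketched for the Jest/Vitest case; the helper name is hypothetical, not part of the skill:

```shell
# For Jest/Vitest, the test lives in a sibling __tests__/ directory,
# named <base>.test.<ext> after the source file.
derive_jest_test_path() {
  local src="$1" dir base
  dir=$(dirname "$src")
  base=$(basename "$src")
  echo "$dir/__tests__/${base%.*}.test.${src##*.}"
}
derive_jest_test_path "src/utils/sum.ts"   # → src/utils/__tests__/sum.test.ts
```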

### Step 5b: Auto-fix all AUTO-FIX items

Apply each fix directly. For each one, output a one-line summary:
@@ -1287,7 +1366,7 @@ recognize that Eng Review was run on this branch.
Run:

```bash
~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"review","timestamp":"TIMESTAMP","status":"STATUS","issues_found":N,"critical":N,"informational":N,"commit":"COMMIT"}'
~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"review","timestamp":"TIMESTAMP","status":"STATUS","issues_found":N,"critical":N,"informational":N,"quality_score":SCORE,"specialists":SPECIALISTS_JSON,"findings":FINDINGS_JSON,"commit":"COMMIT"}'
```

Substitute:
@@ -1296,6 +1375,9 @@ Substitute:
- `issues_found` = total remaining unresolved findings
- `critical` = remaining unresolved critical findings
- `informational` = remaining unresolved informational findings
- `quality_score` = the PR Quality Score computed in Step 4.6 (e.g., 7.5). If specialists were skipped (small diff), use `10.0`
- `specialists` = the per-specialist stats object compiled in Step 4.6. Each specialist that was considered gets an entry: `{"dispatched":true/false,"findings":N,"critical":N,"informational":N}` if dispatched, or `{"dispatched":false,"reason":"scope|gated"}` if skipped. Include Design specialist. Example: `{"testing":{"dispatched":true,"findings":2,"critical":0,"informational":2},"security":{"dispatched":false,"reason":"scope"}}`
- `findings` = array of per-finding records from Step 5. For each finding (from critical pass and specialists), include: `{"fingerprint":"path:line:category","severity":"CRITICAL|INFORMATIONAL","action":"ACTION"}`. ACTION is `"auto-fixed"` (Step 5b), `"fixed"` (user approved in Step 5d), or `"skipped"` (user chose Skip in Step 5c). Suppressed findings from Step 5.0 are NOT included (they were already recorded in a prior review entry).
- `COMMIT` = output of `git rev-parse --short HEAD`
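A fully substituted entry might look like the following; all values are invented, and the snippet only validates the JSON shape rather than invoking the real binary:

```shell
ENTRY='{"skill":"review","timestamp":"2026-04-04T12:00:00Z","status":"pass","issues_found":1,"critical":0,"informational":1,"quality_score":8.5,"specialists":{"testing":{"dispatched":true,"findings":2,"critical":0,"informational":2},"security":{"dispatched":false,"reason":"scope"}},"findings":[{"fingerprint":"src/a.ts:12:testing","severity":"INFORMATIONAL","action":"skipped"}],"commit":"abc1234"}'
# Check the shape before logging; the real call would pass "$ENTRY" to
# gstack-review-log instead.
printf '%s' "$ENTRY" | python3 -m json.tool > /dev/null && echo "valid JSON"
# → valid JSON
```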

## Capture Learnings
+44
-1
@@ -103,6 +103,38 @@ Follow the output format specified in the checklist. Respect the suppressions
**Every finding gets action — not just critical ones.**

### Step 5.0: Cross-review finding dedup

Before classifying findings, check if any were previously skipped by the user in a prior review on this branch.

```bash
~/.claude/skills/gstack/bin/gstack-review-read
```

Parse the output: only lines BEFORE `---CONFIG---` are JSONL entries (the output also contains `---CONFIG---` and `---HEAD---` footer sections that are not JSONL — ignore those).

For each JSONL entry that has a `findings` array:
1. Collect all fingerprints where `action: "skipped"`
2. Note the `commit` field from that entry

If skipped fingerprints exist, get the list of files changed since that review:

```bash
git diff --name-only <prior-review-commit> HEAD
```

For each current finding (from both Step 4 critical pass and Step 4.5-4.6 specialists), check:
- Does its fingerprint match a previously skipped finding?
- Is the finding's file path NOT in the changed-files set?

If both conditions are true: suppress the finding. It was intentionally skipped and the relevant code hasn't changed.

Print: "Suppressed N findings from prior reviews (previously skipped by user)"

**Only suppress `skipped` findings — never `fixed` or `auto-fixed`** (those might regress and should be re-checked).

If no prior reviews exist or none have a `findings` array, skip this step silently.

Output a summary header: `Pre-Landing Review: N issues (X critical, Y informational)`

### Step 5a: Classify each finding

@@ -111,6 +143,14 @@ For each finding, classify as AUTO-FIX or ASK per the Fix-First Heuristic in
checklist.md. Critical findings lean toward ASK; informational findings lean
toward AUTO-FIX.

**Test stub override:** Any finding that has a `test_stub` field (generated by a specialist)
is reclassified as ASK regardless of its original classification. When presenting the ASK
item, show the proposed test file path and the test code. The user approves or skips the
test creation. If approved, write the fix + test file. Derive the test file path from
the finding's `path` using project conventions (`spec/` for RSpec, `__tests__/` for
Jest/Vitest, `test_` prefix for pytest, `_test.go` suffix for Go). If the test file
already exists, append the new test. Output: `[FIXED + TEST] [file:line] Problem -> fix + test at [test_path]`

### Step 5b: Auto-fix all AUTO-FIX items

Apply each fix directly. For each one, output a one-line summary:

@@ -221,7 +261,7 @@ recognize that Eng Review was run on this branch.
Run:

```bash
~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"review","timestamp":"TIMESTAMP","status":"STATUS","issues_found":N,"critical":N,"informational":N,"commit":"COMMIT"}'
~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"review","timestamp":"TIMESTAMP","status":"STATUS","issues_found":N,"critical":N,"informational":N,"quality_score":SCORE,"specialists":SPECIALISTS_JSON,"findings":FINDINGS_JSON,"commit":"COMMIT"}'
```

Substitute:

@@ -230,6 +270,9 @@ Substitute:
- `issues_found` = total remaining unresolved findings
- `critical` = remaining unresolved critical findings
- `informational` = remaining unresolved informational findings
- `quality_score` = the PR Quality Score computed in Step 4.6 (e.g., 7.5). If specialists were skipped (small diff), use `10.0`
- `specialists` = the per-specialist stats object compiled in Step 4.6. Each specialist that was considered gets an entry: `{"dispatched":true/false,"findings":N,"critical":N,"informational":N}` if dispatched, or `{"dispatched":false,"reason":"scope|gated"}` if skipped. Include Design specialist. Example: `{"testing":{"dispatched":true,"findings":2,"critical":0,"informational":2},"security":{"dispatched":false,"reason":"scope"}}`
- `findings` = array of per-finding records from Step 5. For each finding (from critical pass and specialists), include: `{"fingerprint":"path:line:category","severity":"CRITICAL|INFORMATIONAL","action":"ACTION"}`. ACTION is `"auto-fixed"` (Step 5b), `"fixed"` (user approved in Step 5d), or `"skipped"` (user chose Skip in Step 5c). Suppressed findings from Step 5.0 are NOT included (they were already recorded in a prior review entry).
- `COMMIT` = output of `git rev-parse --short HEAD`

{{LEARNINGS_LOG}}
@@ -58,6 +58,8 @@ Design Review: N issues (X auto-fixable, Y need input, Z possible)
- [file:line] Possible issue — verify with /design-review
```

Optional: `test_stub` — skeleton test code for this finding using the project's test framework.

If no issues found: `Design Review: No issues found.`

If no frontend files changed: skip silently, no output.
@@ -3,6 +3,7 @@
Scope: When SCOPE_API=true
Output: JSON objects, one finding per line. Schema:
{"severity":"CRITICAL|INFORMATIONAL","confidence":N,"path":"file","line":N,"category":"api-contract","summary":"...","fix":"...","fingerprint":"path:line:api-contract","specialist":"api-contract"}
Optional: line, fix, fingerprint, evidence, test_stub.
If no findings: output `NO FINDINGS` and nothing else.

---

@@ -3,6 +3,7 @@
Scope: When SCOPE_MIGRATIONS=true
Output: JSON objects, one finding per line. Schema:
{"severity":"CRITICAL|INFORMATIONAL","confidence":N,"path":"file","line":N,"category":"data-migration","summary":"...","fix":"...","fingerprint":"path:line:data-migration","specialist":"data-migration"}
Optional: line, fix, fingerprint, evidence, test_stub.
If no findings: output `NO FINDINGS` and nothing else.

---

@@ -3,6 +3,7 @@
Scope: Always-on (every review)
Output: JSON objects, one finding per line. Schema:
{"severity":"INFORMATIONAL","confidence":N,"path":"file","line":N,"category":"maintainability","summary":"...","fix":"...","fingerprint":"path:line:maintainability","specialist":"maintainability"}
Optional: line, fix, fingerprint, evidence, test_stub.
If no findings: output `NO FINDINGS` and nothing else.

---

@@ -3,6 +3,7 @@
Scope: When SCOPE_BACKEND=true OR SCOPE_FRONTEND=true
Output: JSON objects, one finding per line. Schema:
{"severity":"CRITICAL|INFORMATIONAL","confidence":N,"path":"file","line":N,"category":"performance","summary":"...","fix":"...","fingerprint":"path:line:performance","specialist":"performance"}
Optional: line, fix, fingerprint, evidence, test_stub.
If no findings: output `NO FINDINGS` and nothing else.

---

@@ -3,6 +3,7 @@
Scope: When diff > 200 lines OR security specialist found CRITICAL findings. Runs AFTER other specialists.
Output: JSON objects, one finding per line. Schema:
{"severity":"CRITICAL|INFORMATIONAL","confidence":N,"path":"file","line":N,"category":"red-team","summary":"...","fix":"...","fingerprint":"path:line:red-team","specialist":"red-team"}
Optional: line, fix, fingerprint, evidence, test_stub.
If no findings: output `NO FINDINGS` and nothing else.

---

@@ -3,6 +3,7 @@
Scope: When SCOPE_AUTH=true OR (SCOPE_BACKEND=true AND diff > 100 lines)
Output: JSON objects, one finding per line. Schema:
{"severity":"CRITICAL|INFORMATIONAL","confidence":N,"path":"file","line":N,"category":"security","summary":"...","fix":"...","fingerprint":"path:line:security","specialist":"security"}
Optional: line, fix, fingerprint, evidence, test_stub.
If no findings: output `NO FINDINGS` and nothing else.

---

@@ -3,6 +3,7 @@
Scope: Always-on (every review)
Output: JSON objects, one finding per line. Schema:
{"severity":"CRITICAL|INFORMATIONAL","confidence":N,"path":"file","line":N,"category":"testing","summary":"...","fix":"...","fingerprint":"path:line:testing","specialist":"testing"}
Optional: line, fix, fingerprint, evidence, test_stub.
If no findings: output `NO FINDINGS` and nothing else.

---
@@ -28,6 +28,20 @@ STACK=""
echo "STACK: \${STACK:-unknown}"
DIFF_LINES=$(git diff origin/<base> --stat | tail -1 | grep -oE '[0-9]+ insertion' | grep -oE '[0-9]+' || echo "0")
echo "DIFF_LINES: $DIFF_LINES"
# Detect test framework for specialist test stub generation
TEST_FW=""
{ [ -f jest.config.ts ] || [ -f jest.config.js ]; } && TEST_FW="jest"
[ -f vitest.config.ts ] && TEST_FW="vitest"
{ [ -f spec/spec_helper.rb ] || [ -f .rspec ]; } && TEST_FW="rspec"
{ [ -f pytest.ini ] || [ -f conftest.py ]; } && TEST_FW="pytest"
[ -f go.mod ] && TEST_FW="go-test"
echo "TEST_FW: \${TEST_FW:-unknown}"
\`\`\`

### Read specialist hit rates (adaptive gating)

\`\`\`bash
${ctx.paths.binDir}/gstack-specialist-stats 2>/dev/null || true
\`\`\`

### Select specialists

@@ -47,8 +61,18 @@ Based on the scope signals above, select which specialists to dispatch.
6. **API Contract** — if SCOPE_API=true. Read \`${ctx.paths.skillRoot}/review/specialists/api-contract.md\`
7. **Design** — if SCOPE_FRONTEND=true. Use the existing design review checklist at \`${ctx.paths.skillRoot}/review/design-checklist.md\`

Note which specialists were selected and which were skipped. Print the selection:
"Dispatching N specialists: [names]. Skipped: [names] (scope not detected)."`;

### Adaptive gating

After scope-based selection, apply adaptive gating based on specialist hit rates:

For each conditional specialist that passed scope gating, check the \`gstack-specialist-stats\` output above:
- If tagged \`[GATE_CANDIDATE]\` (0 findings in 10+ dispatches): skip it. Print: "[specialist] auto-gated (0 findings in N reviews)."
- If tagged \`[NEVER_GATE]\`: always dispatch regardless of hit rate. Security and data-migration are insurance policy specialists — they should run even when silent.

**Force flags:** If the user's prompt includes \`--security\`, \`--performance\`, \`--testing\`, \`--maintainability\`, \`--data-migration\`, \`--api-contract\`, \`--design\`, or \`--all-specialists\`, force-include that specialist regardless of gating.

Note which specialists were selected, gated, and skipped. Print the selection:
"Dispatching N specialists: [names]. Skipped: [names] (scope not detected). Gated: [names] (0 findings in N+ reviews)."`;
}

function generateSpecialistDispatch(ctx: TemplateContext): string {

@@ -81,7 +105,11 @@ For each finding, output a JSON object on its own line:
{\\"severity\\":\\"CRITICAL|INFORMATIONAL\\",\\"confidence\\":N,\\"path\\":\\"file\\",\\"line\\":N,\\"category\\":\\"category\\",\\"summary\\":\\"description\\",\\"fix\\":\\"recommended fix\\",\\"fingerprint\\":\\"path:line:category\\",\\"specialist\\":\\"name\\"}

Required fields: severity, confidence, path, category, summary, specialist.
Optional: line, fix, fingerprint, evidence.
Optional: line, fix, fingerprint, evidence, test_stub.

If you can write a test that would catch this issue, include it in the \`test_stub\` field.
Use the detected test framework ({TEST_FW}). Write a minimal skeleton — describe/it/test
blocks with clear intent. Skip test_stub for architectural or design-only findings.

If no findings: output \`NO FINDINGS\` and nothing else.
Do not output anything else — no preamble, no summary, no commentary.

@@ -146,7 +174,18 @@ PR Quality Score: X/10
\`\`\`

These findings flow into Step 5 Fix-First alongside the CRITICAL pass findings from Step 4.
The Fix-First heuristic applies identically — specialist findings follow the same AUTO-FIX vs ASK classification.`;
The Fix-First heuristic applies identically — specialist findings follow the same AUTO-FIX vs ASK classification.

**Compile per-specialist stats:**
After merging findings, compile a \`specialists\` object for the review-log entry in Step 5.8.
For each specialist (testing, maintainability, security, performance, data-migration, api-contract, design, red-team):
- If dispatched: \`{"dispatched": true, "findings": N, "critical": N, "informational": N}\`
- If skipped by scope: \`{"dispatched": false, "reason": "scope"}\`
- If skipped by gating: \`{"dispatched": false, "reason": "gated"}\`
- If not applicable (e.g., red-team not activated): omit from the object

Include the Design specialist even though it uses \`design-checklist.md\` instead of the specialist schema files.
Remember these stats — you will need them for the review-log entry in Step 5.8.`;
}

function generateRedTeam(ctx: TemplateContext): string {