mirror of
https://github.com/garrytan/gstack.git
synced 2026-05-07 05:56:41 +02:00
chore: regenerate SKILL.md files from updated templates
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -931,6 +931,8 @@ Required ASCII diagram: user flow showing screens/states and transitions.
|
||||
If this plan has significant UI scope, recommend: "Consider running /plan-design-review for a deep design review of this plan before implementation."
|
||||
**STOP.** AskUserQuestion once per issue. Do NOT batch. Recommend + WHY. If no issues or fix is obvious, state what you'll do and move on — don't waste a question. Do NOT proceed until user responds.
|
||||
|
||||
|
||||
|
||||
## Post-Implementation Design Audit (if UI scope detected)
|
||||
After implementation, run `/design-review` on the live site to catch visual issues that can only be evaluated with rendered output.
|
||||
|
||||
@@ -1025,6 +1027,7 @@ List every ASCII diagram in files this plan touches. Still accurate?
|
||||
| TODOS.md updates | ___ items proposed |
|
||||
| Scope proposals | ___ proposed, ___ accepted (EXP + SEL) |
|
||||
| CEO plan | written / skipped (HOLD/REDUCTION) |
|
||||
| Outside voice | ran (codex/claude) / skipped |
|
||||
| Lake Score | X/Y recommendations chose complete option |
|
||||
| Diagrams produced | ___ (list types) |
|
||||
| Stale diagrams found | ___ |
|
||||
@@ -1078,7 +1081,7 @@ After completing the review, read the review log and config to display the dashb
|
||||
~/.codex/skills/gstack/bin/gstack-review-read
|
||||
```
|
||||
|
||||
Parse the output. Find the most recent entry for each skill (plan-ceo-review, plan-eng-review, plan-design-review, design-review-lite, adversarial-review, codex-review). Ignore entries with timestamps older than 7 days. For the Adversarial row, show whichever is more recent between `adversarial-review` (new auto-scaled) and `codex-review` (legacy). For Design Review, show whichever is more recent between `plan-design-review` (full visual audit) and `design-review-lite` (code-level check). Append "(FULL)" or "(LITE)" to the status to distinguish. Display:
|
||||
Parse the output. Find the most recent entry for each skill (plan-ceo-review, plan-eng-review, plan-design-review, design-review-lite, adversarial-review, codex-review, codex-plan-review). Ignore entries with timestamps older than 7 days. For the Adversarial row, show whichever is more recent between `adversarial-review` (new auto-scaled) and `codex-review` (legacy). For Design Review, show whichever is more recent between `plan-design-review` (full visual audit) and `design-review-lite` (code-level check). Append "(FULL)" or "(LITE)" to the status to distinguish. Display:
|
||||
|
||||
```
|
||||
+====================================================================+
|
||||
@@ -1090,6 +1093,7 @@ Parse the output. Find the most recent entry for each skill (plan-ceo-review, pl
|
||||
| CEO Review | 0 | — | — | no |
|
||||
| Design Review | 0 | — | — | no |
|
||||
| Adversarial | 0 | — | — | no |
|
||||
| Outside Voice | 0 | — | — | no |
|
||||
+--------------------------------------------------------------------+
|
||||
| VERDICT: CLEARED — Eng Review passed |
|
||||
+====================================================================+
|
||||
@@ -1100,6 +1104,7 @@ Parse the output. Find the most recent entry for each skill (plan-ceo-review, pl
|
||||
- **CEO Review (optional):** Use your judgment. Recommend it for big product/business changes, new user-facing features, or scope decisions. Skip for bug fixes, refactors, infra, and cleanup.
|
||||
- **Design Review (optional):** Use your judgment. Recommend it for UI/UX changes. Skip for backend-only, infra, or prompt-only changes.
|
||||
- **Adversarial Review (automatic):** Auto-scales by diff size. Small diffs (<50 lines) skip adversarial. Medium diffs (50–199) get cross-model adversarial. Large diffs (200+) get all 4 passes: Claude structured, Codex structured, Claude adversarial subagent, Codex adversarial. No configuration needed.
|
||||
- **Outside Voice (optional):** Independent plan review from a different AI model. Offered after all review sections complete in /plan-ceo-review and /plan-eng-review. Falls back to Claude subagent if Codex is unavailable. Never gates shipping.
|
||||
|
||||
**Verdict logic:**
|
||||
- **CLEARED**: Eng Review has >= 1 entry within 7 days with status "clean" (or \`skip_eng_review\` is \`true\`)
|
||||
|
||||
@@ -528,7 +528,7 @@ After completing the review, read the review log and config to display the dashb
|
||||
~/.codex/skills/gstack/bin/gstack-review-read
|
||||
```
|
||||
|
||||
Parse the output. Find the most recent entry for each skill (plan-ceo-review, plan-eng-review, plan-design-review, design-review-lite, adversarial-review, codex-review). Ignore entries with timestamps older than 7 days. For the Adversarial row, show whichever is more recent between `adversarial-review` (new auto-scaled) and `codex-review` (legacy). For Design Review, show whichever is more recent between `plan-design-review` (full visual audit) and `design-review-lite` (code-level check). Append "(FULL)" or "(LITE)" to the status to distinguish. Display:
|
||||
Parse the output. Find the most recent entry for each skill (plan-ceo-review, plan-eng-review, plan-design-review, design-review-lite, adversarial-review, codex-review, codex-plan-review). Ignore entries with timestamps older than 7 days. For the Adversarial row, show whichever is more recent between `adversarial-review` (new auto-scaled) and `codex-review` (legacy). For Design Review, show whichever is more recent between `plan-design-review` (full visual audit) and `design-review-lite` (code-level check). Append "(FULL)" or "(LITE)" to the status to distinguish. Display:
|
||||
|
||||
```
|
||||
+====================================================================+
|
||||
@@ -540,6 +540,7 @@ Parse the output. Find the most recent entry for each skill (plan-ceo-review, pl
|
||||
| CEO Review | 0 | — | — | no |
|
||||
| Design Review | 0 | — | — | no |
|
||||
| Adversarial | 0 | — | — | no |
|
||||
| Outside Voice | 0 | — | — | no |
|
||||
+--------------------------------------------------------------------+
|
||||
| VERDICT: CLEARED — Eng Review passed |
|
||||
+====================================================================+
|
||||
@@ -550,6 +551,7 @@ Parse the output. Find the most recent entry for each skill (plan-ceo-review, pl
|
||||
- **CEO Review (optional):** Use your judgment. Recommend it for big product/business changes, new user-facing features, or scope decisions. Skip for bug fixes, refactors, infra, and cleanup.
|
||||
- **Design Review (optional):** Use your judgment. Recommend it for UI/UX changes. Skip for backend-only, infra, or prompt-only changes.
|
||||
- **Adversarial Review (automatic):** Auto-scales by diff size. Small diffs (<50 lines) skip adversarial. Medium diffs (50–199) get cross-model adversarial. Large diffs (200+) get all 4 passes: Claude structured, Codex structured, Claude adversarial subagent, Codex adversarial. No configuration needed.
|
||||
- **Outside Voice (optional):** Independent plan review from a different AI model. Offered after all review sections complete in /plan-ceo-review and /plan-eng-review. Falls back to Claude subagent if Codex is unavailable. Never gates shipping.
|
||||
|
||||
**Verdict logic:**
|
||||
- **CLEARED**: Eng Review has >= 1 entry within 7 days with status "clean" (or \`skip_eng_review\` is \`true\`)
|
||||
|
||||
@@ -327,28 +327,7 @@ Before reviewing anything, answer these questions:
|
||||
|
||||
If the complexity check triggers (8+ files or 2+ new classes/services), proactively recommend scope reduction via AskUserQuestion — explain what's overbuilt, propose a minimal version that achieves the core goal, and ask whether to reduce or proceed as-is. If the complexity check does not trigger, present your Step 0 findings and proceed directly to Section 1.
|
||||
|
||||
### Step 0.5: Codex plan review (optional)
|
||||
|
||||
Check if the Codex CLI is available: `which codex 2>/dev/null`
|
||||
|
||||
If available, after presenting Step 0 findings, use AskUserQuestion:
|
||||
```
|
||||
Want an independent Codex (OpenAI) review of this plan before the detailed review?
|
||||
A) Yes — let Codex critique the plan independently
|
||||
B) No — proceed with the Claude review only
|
||||
```
|
||||
|
||||
If the user chooses A: tell Codex to read the plan file itself (avoids ARG_MAX limits for large plans):
|
||||
```bash
|
||||
codex exec "You are a brutally honest technical reviewer. Read the plan file at <plan-file-path> and review it for: logical gaps and unstated assumptions, missing error handling or edge cases, overcomplexity (is there a simpler approach?), feasibility risks (what could go wrong?), and missing dependencies or sequencing issues. Be direct. Be terse. No compliments. Just the problems." -s read-only -c 'model_reasoning_effort="high"' --enable web_search_cached
|
||||
```
|
||||
|
||||
Replace `<plan-file-path>` with the actual path to the plan file detected earlier. Codex has filesystem access in read-only mode and will read the file itself.
|
||||
|
||||
Present the full output under a `CODEX SAYS (plan review):` header. Note any concerns
|
||||
that should inform the subsequent engineering review sections.
|
||||
|
||||
If Codex is not available, skip silently.
|
||||
|
||||
Always work through the full interactive review: one section at a time (Architecture → Code Quality → Tests → Performance) with at most 8 top issues per section.
|
||||
|
||||
@@ -483,6 +462,7 @@ At the end of the review, fill in and display this summary so the user can see a
|
||||
- What already exists: written
|
||||
- TODOS.md updates: ___ items proposed to user
|
||||
- Failure modes: ___ critical gaps flagged
|
||||
- Outside voice: ran (codex/claude) / skipped
|
||||
- Lake Score: X/Y recommendations chose complete option
|
||||
|
||||
## Retrospective learning
|
||||
@@ -525,7 +505,7 @@ After completing the review, read the review log and config to display the dashb
|
||||
~/.codex/skills/gstack/bin/gstack-review-read
|
||||
```
|
||||
|
||||
Parse the output. Find the most recent entry for each skill (plan-ceo-review, plan-eng-review, plan-design-review, design-review-lite, adversarial-review, codex-review). Ignore entries with timestamps older than 7 days. For the Adversarial row, show whichever is more recent between `adversarial-review` (new auto-scaled) and `codex-review` (legacy). For Design Review, show whichever is more recent between `plan-design-review` (full visual audit) and `design-review-lite` (code-level check). Append "(FULL)" or "(LITE)" to the status to distinguish. Display:
|
||||
Parse the output. Find the most recent entry for each skill (plan-ceo-review, plan-eng-review, plan-design-review, design-review-lite, adversarial-review, codex-review, codex-plan-review). Ignore entries with timestamps older than 7 days. For the Adversarial row, show whichever is more recent between `adversarial-review` (new auto-scaled) and `codex-review` (legacy). For Design Review, show whichever is more recent between `plan-design-review` (full visual audit) and `design-review-lite` (code-level check). Append "(FULL)" or "(LITE)" to the status to distinguish. Display:
|
||||
|
||||
```
|
||||
+====================================================================+
|
||||
@@ -537,6 +517,7 @@ Parse the output. Find the most recent entry for each skill (plan-ceo-review, pl
|
||||
| CEO Review | 0 | — | — | no |
|
||||
| Design Review | 0 | — | — | no |
|
||||
| Adversarial | 0 | — | — | no |
|
||||
| Outside Voice | 0 | — | — | no |
|
||||
+--------------------------------------------------------------------+
|
||||
| VERDICT: CLEARED — Eng Review passed |
|
||||
+====================================================================+
|
||||
@@ -547,6 +528,7 @@ Parse the output. Find the most recent entry for each skill (plan-ceo-review, pl
|
||||
- **CEO Review (optional):** Use your judgment. Recommend it for big product/business changes, new user-facing features, or scope decisions. Skip for bug fixes, refactors, infra, and cleanup.
|
||||
- **Design Review (optional):** Use your judgment. Recommend it for UI/UX changes. Skip for backend-only, infra, or prompt-only changes.
|
||||
- **Adversarial Review (automatic):** Auto-scales by diff size. Small diffs (<50 lines) skip adversarial. Medium diffs (50–199) get cross-model adversarial. Large diffs (200+) get all 4 passes: Claude structured, Codex structured, Claude adversarial subagent, Codex adversarial. No configuration needed.
|
||||
- **Outside Voice (optional):** Independent plan review from a different AI model. Offered after all review sections complete in /plan-ceo-review and /plan-eng-review. Falls back to Claude subagent if Codex is unavailable. Never gates shipping.
|
||||
|
||||
**Verdict logic:**
|
||||
- **CLEARED**: Eng Review has >= 1 entry within 7 days with status "clean" (or \`skip_eng_review\` is \`true\`)
|
||||
|
||||
@@ -294,7 +294,7 @@ After completing the review, read the review log and config to display the dashb
|
||||
~/.codex/skills/gstack/bin/gstack-review-read
|
||||
```
|
||||
|
||||
Parse the output. Find the most recent entry for each skill (plan-ceo-review, plan-eng-review, plan-design-review, design-review-lite, adversarial-review, codex-review). Ignore entries with timestamps older than 7 days. For the Adversarial row, show whichever is more recent between `adversarial-review` (new auto-scaled) and `codex-review` (legacy). For Design Review, show whichever is more recent between `plan-design-review` (full visual audit) and `design-review-lite` (code-level check). Append "(FULL)" or "(LITE)" to the status to distinguish. Display:
|
||||
Parse the output. Find the most recent entry for each skill (plan-ceo-review, plan-eng-review, plan-design-review, design-review-lite, adversarial-review, codex-review, codex-plan-review). Ignore entries with timestamps older than 7 days. For the Adversarial row, show whichever is more recent between `adversarial-review` (new auto-scaled) and `codex-review` (legacy). For Design Review, show whichever is more recent between `plan-design-review` (full visual audit) and `design-review-lite` (code-level check). Append "(FULL)" or "(LITE)" to the status to distinguish. Display:
|
||||
|
||||
```
|
||||
+====================================================================+
|
||||
@@ -306,6 +306,7 @@ Parse the output. Find the most recent entry for each skill (plan-ceo-review, pl
|
||||
| CEO Review | 0 | — | — | no |
|
||||
| Design Review | 0 | — | — | no |
|
||||
| Adversarial | 0 | — | — | no |
|
||||
| Outside Voice | 0 | — | — | no |
|
||||
+--------------------------------------------------------------------+
|
||||
| VERDICT: CLEARED — Eng Review passed |
|
||||
+====================================================================+
|
||||
@@ -316,6 +317,7 @@ Parse the output. Find the most recent entry for each skill (plan-ceo-review, pl
|
||||
- **CEO Review (optional):** Use your judgment. Recommend it for big product/business changes, new user-facing features, or scope decisions. Skip for bug fixes, refactors, infra, and cleanup.
|
||||
- **Design Review (optional):** Use your judgment. Recommend it for UI/UX changes. Skip for backend-only, infra, or prompt-only changes.
|
||||
- **Adversarial Review (automatic):** Auto-scales by diff size. Small diffs (<50 lines) skip adversarial. Medium diffs (50–199) get cross-model adversarial. Large diffs (200+) get all 4 passes: Claude structured, Codex structured, Claude adversarial subagent, Codex adversarial. No configuration needed.
|
||||
- **Outside Voice (optional):** Independent plan review from a different AI model. Offered after all review sections complete in /plan-ceo-review and /plan-eng-review. Falls back to Claude subagent if Codex is unavailable. Never gates shipping.
|
||||
|
||||
**Verdict logic:**
|
||||
- **CLEARED**: Eng Review has >= 1 entry within 7 days with status "clean" (or \`skip_eng_review\` is \`true\`)
|
||||
|
||||
+123
-1
@@ -940,6 +940,125 @@ Required ASCII diagram: user flow showing screens/states and transitions.
|
||||
If this plan has significant UI scope, recommend: "Consider running /plan-design-review for a deep design review of this plan before implementation."
|
||||
**STOP.** AskUserQuestion once per issue. Do NOT batch. Recommend + WHY. If no issues or fix is obvious, state what you'll do and move on — don't waste a question. Do NOT proceed until user responds.
|
||||
|
||||
## Outside Voice — Independent Plan Challenge (optional, recommended)
|
||||
|
||||
After all review sections are complete, offer an independent second opinion from a
|
||||
different AI system. Two models agreeing on a plan is stronger signal than one model's
|
||||
thorough review.
|
||||
|
||||
**Check tool availability:**
|
||||
|
||||
```bash
|
||||
which codex 2>/dev/null && echo "CODEX_AVAILABLE" || echo "CODEX_NOT_AVAILABLE"
|
||||
```
|
||||
|
||||
Use AskUserQuestion:
|
||||
|
||||
> "All review sections are complete. Want an outside voice? A different AI system can
|
||||
> give a brutally honest, independent challenge of this plan — logical gaps, feasibility
|
||||
> risks, and blind spots that are hard to catch from inside the review. Takes about 2
|
||||
> minutes."
|
||||
>
|
||||
> RECOMMENDATION: Choose A — an independent second opinion catches structural blind
|
||||
> spots. Two different AI models agreeing on a plan is stronger signal than one model's
|
||||
> thorough review. Completeness: A=9/10, B=7/10.
|
||||
|
||||
Options:
|
||||
- A) Get the outside voice (recommended)
|
||||
- B) Skip — proceed to outputs
|
||||
|
||||
**If B:** Print "Skipping outside voice." and continue to the next section.
|
||||
|
||||
**If A:** Construct the plan review prompt. Read the plan file being reviewed (the file
|
||||
the user pointed this review at, or the branch diff scope). If a CEO plan document
|
||||
was written in Step 0D-POST, read that too — it contains the scope decisions and vision.
|
||||
|
||||
Construct this prompt (substitute the actual plan content — if plan content exceeds 30KB,
|
||||
truncate to the first 30KB and note "Plan truncated for size"):
|
||||
|
||||
"You are a brutally honest technical reviewer examining a development plan that has
|
||||
already been through a multi-section review. Your job is NOT to repeat that review.
|
||||
Instead, find what it missed. Look for: logical gaps and unstated assumptions that
|
||||
survived the review scrutiny, overcomplexity (is there a fundamentally simpler
|
||||
approach the review was too deep in the weeds to see?), feasibility risks the review
|
||||
took for granted, missing dependencies or sequencing issues, and strategic
|
||||
miscalibration (is this the right thing to build at all?). Be direct. Be terse. No
|
||||
compliments. Just the problems.
|
||||
|
||||
THE PLAN:
|
||||
<plan content>"
|
||||
|
||||
**If CODEX_AVAILABLE:**
|
||||
|
||||
```bash
|
||||
TMPERR_PV=$(mktemp /tmp/codex-planreview-XXXXXXXX)
|
||||
codex exec "<prompt>" -s read-only -c 'model_reasoning_effort="xhigh"' --enable web_search_cached 2>"$TMPERR_PV"
|
||||
```
|
||||
|
||||
Use a 5-minute timeout (`timeout: 300000`). After the command completes, read stderr:
|
||||
```bash
|
||||
cat "$TMPERR_PV"
|
||||
```
|
||||
|
||||
Present the full output verbatim:
|
||||
|
||||
```
|
||||
CODEX SAYS (plan review — outside voice):
|
||||
════════════════════════════════════════════════════════════
|
||||
<full codex output, verbatim — do not truncate or summarize>
|
||||
════════════════════════════════════════════════════════════
|
||||
```
|
||||
|
||||
**Error handling:** All errors are non-blocking — the outside voice is informational.
|
||||
- Auth failure (stderr contains "auth", "login", "unauthorized"): "Codex auth failed. Run \`codex login\` to authenticate."
|
||||
- Timeout: "Codex timed out after 5 minutes."
|
||||
- Empty response: "Codex returned no response."
|
||||
|
||||
On any Codex error, fall back to the Claude adversarial subagent.
|
||||
|
||||
**If CODEX_NOT_AVAILABLE (or Codex errored):**
|
||||
|
||||
Dispatch via the Agent tool. The subagent has fresh context — genuine independence.
|
||||
|
||||
Subagent prompt: same plan review prompt as above.
|
||||
|
||||
Present findings under an `OUTSIDE VOICE (Claude subagent):` header.
|
||||
|
||||
If the subagent fails or times out: "Outside voice unavailable. Continuing to outputs."
|
||||
|
||||
**Cross-model tension:**
|
||||
|
||||
After presenting the outside voice findings, note any points where the outside voice
|
||||
disagrees with the review findings from earlier sections. Flag these as:
|
||||
|
||||
```
|
||||
CROSS-MODEL TENSION:
|
||||
[Topic]: Review said X. Outside voice says Y. [Your assessment of who's right.]
|
||||
```
|
||||
|
||||
For each substantive tension point, auto-propose as a TODO via AskUserQuestion:
|
||||
|
||||
> "Cross-model disagreement on [topic]. The review found [X] but the outside voice
|
||||
> argues [Y]. Worth investigating further?"
|
||||
|
||||
Options:
|
||||
- A) Add to TODOS.md
|
||||
- B) Skip — not substantive
|
||||
|
||||
If no tension points exist, note: "No cross-model tension — both reviewers agree."
|
||||
|
||||
**Persist the result:**
|
||||
```bash
|
||||
~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"codex-plan-review","timestamp":"'"$(date -u +%Y-%m-%dT%H:%M:%SZ)"'","status":"STATUS","source":"SOURCE","commit":"'"$(git rev-parse --short HEAD)"'"}'
|
||||
```
|
||||
|
||||
Substitute: STATUS = "clean" if no findings, "issues_found" if findings exist.
|
||||
SOURCE = "codex" if Codex ran, "claude" if subagent ran.
|
||||
|
||||
**Cleanup:** Run `rm -f "$TMPERR_PV"` after processing (if Codex was used).
|
||||
|
||||
---
|
||||
|
||||
## Post-Implementation Design Audit (if UI scope detected)
|
||||
After implementation, run `/design-review` on the live site to catch visual issues that can only be evaluated with rendered output.
|
||||
|
||||
@@ -1034,6 +1153,7 @@ List every ASCII diagram in files this plan touches. Still accurate?
|
||||
| TODOS.md updates | ___ items proposed |
|
||||
| Scope proposals | ___ proposed, ___ accepted (EXP + SEL) |
|
||||
| CEO plan | written / skipped (HOLD/REDUCTION) |
|
||||
| Outside voice | ran (codex/claude) / skipped |
|
||||
| Lake Score | X/Y recommendations chose complete option |
|
||||
| Diagrams produced | ___ (list types) |
|
||||
| Stale diagrams found | ___ |
|
||||
@@ -1087,7 +1207,7 @@ After completing the review, read the review log and config to display the dashb
|
||||
~/.claude/skills/gstack/bin/gstack-review-read
|
||||
```
|
||||
|
||||
Parse the output. Find the most recent entry for each skill (plan-ceo-review, plan-eng-review, plan-design-review, design-review-lite, adversarial-review, codex-review). Ignore entries with timestamps older than 7 days. For the Adversarial row, show whichever is more recent between `adversarial-review` (new auto-scaled) and `codex-review` (legacy). For Design Review, show whichever is more recent between `plan-design-review` (full visual audit) and `design-review-lite` (code-level check). Append "(FULL)" or "(LITE)" to the status to distinguish. Display:
|
||||
Parse the output. Find the most recent entry for each skill (plan-ceo-review, plan-eng-review, plan-design-review, design-review-lite, adversarial-review, codex-review, codex-plan-review). Ignore entries with timestamps older than 7 days. For the Adversarial row, show whichever is more recent between `adversarial-review` (new auto-scaled) and `codex-review` (legacy). For Design Review, show whichever is more recent between `plan-design-review` (full visual audit) and `design-review-lite` (code-level check). Append "(FULL)" or "(LITE)" to the status to distinguish. Display:
|
||||
|
||||
```
|
||||
+====================================================================+
|
||||
@@ -1099,6 +1219,7 @@ Parse the output. Find the most recent entry for each skill (plan-ceo-review, pl
|
||||
| CEO Review | 0 | — | — | no |
|
||||
| Design Review | 0 | — | — | no |
|
||||
| Adversarial | 0 | — | — | no |
|
||||
| Outside Voice | 0 | — | — | no |
|
||||
+--------------------------------------------------------------------+
|
||||
| VERDICT: CLEARED — Eng Review passed |
|
||||
+====================================================================+
|
||||
@@ -1109,6 +1230,7 @@ Parse the output. Find the most recent entry for each skill (plan-ceo-review, pl
|
||||
- **CEO Review (optional):** Use your judgment. Recommend it for big product/business changes, new user-facing features, or scope decisions. Skip for bug fixes, refactors, infra, and cleanup.
|
||||
- **Design Review (optional):** Use your judgment. Recommend it for UI/UX changes. Skip for backend-only, infra, or prompt-only changes.
|
||||
- **Adversarial Review (automatic):** Auto-scales by diff size. Small diffs (<50 lines) skip adversarial. Medium diffs (50–199) get cross-model adversarial. Large diffs (200+) get all 4 passes: Claude structured, Codex structured, Claude adversarial subagent, Codex adversarial. No configuration needed.
|
||||
- **Outside Voice (optional):** Independent plan review from a different AI model. Offered after all review sections complete in /plan-ceo-review and /plan-eng-review. Falls back to Claude subagent if Codex is unavailable. Never gates shipping.
|
||||
|
||||
**Verdict logic:**
|
||||
- **CLEARED**: Eng Review has >= 1 entry within 7 days with status "clean" (or \`skip_eng_review\` is \`true\`)
|
||||
|
||||
@@ -536,7 +536,7 @@ After completing the review, read the review log and config to display the dashb
|
||||
~/.claude/skills/gstack/bin/gstack-review-read
|
||||
```
|
||||
|
||||
Parse the output. Find the most recent entry for each skill (plan-ceo-review, plan-eng-review, plan-design-review, design-review-lite, adversarial-review, codex-review). Ignore entries with timestamps older than 7 days. For the Adversarial row, show whichever is more recent between `adversarial-review` (new auto-scaled) and `codex-review` (legacy). For Design Review, show whichever is more recent between `plan-design-review` (full visual audit) and `design-review-lite` (code-level check). Append "(FULL)" or "(LITE)" to the status to distinguish. Display:
|
||||
Parse the output. Find the most recent entry for each skill (plan-ceo-review, plan-eng-review, plan-design-review, design-review-lite, adversarial-review, codex-review, codex-plan-review). Ignore entries with timestamps older than 7 days. For the Adversarial row, show whichever is more recent between `adversarial-review` (new auto-scaled) and `codex-review` (legacy). For Design Review, show whichever is more recent between `plan-design-review` (full visual audit) and `design-review-lite` (code-level check). Append "(FULL)" or "(LITE)" to the status to distinguish. Display:
|
||||
|
||||
```
|
||||
+====================================================================+
|
||||
@@ -548,6 +548,7 @@ Parse the output. Find the most recent entry for each skill (plan-ceo-review, pl
|
||||
| CEO Review | 0 | — | — | no |
|
||||
| Design Review | 0 | — | — | no |
|
||||
| Adversarial | 0 | — | — | no |
|
||||
| Outside Voice | 0 | — | — | no |
|
||||
+--------------------------------------------------------------------+
|
||||
| VERDICT: CLEARED — Eng Review passed |
|
||||
+====================================================================+
|
||||
@@ -558,6 +559,7 @@ Parse the output. Find the most recent entry for each skill (plan-ceo-review, pl
|
||||
- **CEO Review (optional):** Use your judgment. Recommend it for big product/business changes, new user-facing features, or scope decisions. Skip for bug fixes, refactors, infra, and cleanup.
|
||||
- **Design Review (optional):** Use your judgment. Recommend it for UI/UX changes. Skip for backend-only, infra, or prompt-only changes.
|
||||
- **Adversarial Review (automatic):** Auto-scales by diff size. Small diffs (<50 lines) skip adversarial. Medium diffs (50–199) get cross-model adversarial. Large diffs (200+) get all 4 passes: Claude structured, Codex structured, Claude adversarial subagent, Codex adversarial. No configuration needed.
|
||||
- **Outside Voice (optional):** Independent plan review from a different AI model. Offered after all review sections complete in /plan-ceo-review and /plan-eng-review. Falls back to Claude subagent if Codex is unavailable. Never gates shipping.
|
||||
|
||||
**Verdict logic:**
|
||||
- **CLEARED**: Eng Review has >= 1 entry within 7 days with status "clean" (or \`skip_eng_review\` is \`true\`)
|
||||
|
||||
+114
-15
@@ -337,28 +337,124 @@ Before reviewing anything, answer these questions:
|
||||
|
||||
If the complexity check triggers (8+ files or 2+ new classes/services), proactively recommend scope reduction via AskUserQuestion — explain what's overbuilt, propose a minimal version that achieves the core goal, and ask whether to reduce or proceed as-is. If the complexity check does not trigger, present your Step 0 findings and proceed directly to Section 1.
|
||||
|
||||
### Step 0.5: Codex plan review (optional)
|
||||
## Outside Voice — Independent Plan Challenge (optional, recommended)
|
||||
|
||||
Check if the Codex CLI is available: `which codex 2>/dev/null`
|
||||
After all review sections are complete, offer an independent second opinion from a
|
||||
different AI system. Two models agreeing on a plan is stronger signal than one model's
|
||||
thorough review.
|
||||
|
||||
If available, after presenting Step 0 findings, use AskUserQuestion:
|
||||
```
|
||||
Want an independent Codex (OpenAI) review of this plan before the detailed review?
|
||||
A) Yes — let Codex critique the plan independently
|
||||
B) No — proceed with the Claude review only
|
||||
```
|
||||
**Check tool availability:**
|
||||
|
||||
If the user chooses A: tell Codex to read the plan file itself (avoids ARG_MAX limits for large plans):
|
||||
```bash
|
||||
codex exec "You are a brutally honest technical reviewer. Read the plan file at <plan-file-path> and review it for: logical gaps and unstated assumptions, missing error handling or edge cases, overcomplexity (is there a simpler approach?), feasibility risks (what could go wrong?), and missing dependencies or sequencing issues. Be direct. Be terse. No compliments. Just the problems." -s read-only -c 'model_reasoning_effort="high"' --enable web_search_cached
|
||||
which codex 2>/dev/null && echo "CODEX_AVAILABLE" || echo "CODEX_NOT_AVAILABLE"
|
||||
```
|
||||
|
||||
Replace `<plan-file-path>` with the actual path to the plan file detected earlier. Codex has filesystem access in read-only mode and will read the file itself.
|
||||
Use AskUserQuestion:
|
||||
|
||||
Present the full output under a `CODEX SAYS (plan review):` header. Note any concerns
|
||||
that should inform the subsequent engineering review sections.
|
||||
> "All review sections are complete. Want an outside voice? A different AI system can
|
||||
> give a brutally honest, independent challenge of this plan — logical gaps, feasibility
|
||||
> risks, and blind spots that are hard to catch from inside the review. Takes about 2
|
||||
> minutes."
|
||||
>
|
||||
> RECOMMENDATION: Choose A — an independent second opinion catches structural blind
|
||||
> spots. Two different AI models agreeing on a plan is stronger signal than one model's
|
||||
> thorough review. Completeness: A=9/10, B=7/10.
|
||||
|
||||
If Codex is not available, skip silently.
|
||||
Options:
|
||||
- A) Get the outside voice (recommended)
|
||||
- B) Skip — proceed to outputs
|
||||
|
||||
**If B:** Print "Skipping outside voice." and continue to the next section.
|
||||
|
||||
**If A:** Construct the plan review prompt. Read the plan file being reviewed (the file
|
||||
the user pointed this review at, or the branch diff scope). If a CEO plan document
|
||||
was written in Step 0D-POST, read that too — it contains the scope decisions and vision.
|
||||
|
||||
Construct this prompt (substitute the actual plan content — if plan content exceeds 30KB,
|
||||
truncate to the first 30KB and note "Plan truncated for size"):
|
||||
|
||||
"You are a brutally honest technical reviewer examining a development plan that has
|
||||
already been through a multi-section review. Your job is NOT to repeat that review.
|
||||
Instead, find what it missed. Look for: logical gaps and unstated assumptions that
|
||||
survived the review scrutiny, overcomplexity (is there a fundamentally simpler
|
||||
approach the review was too deep in the weeds to see?), feasibility risks the review
|
||||
took for granted, missing dependencies or sequencing issues, and strategic
|
||||
miscalibration (is this the right thing to build at all?). Be direct. Be terse. No
|
||||
compliments. Just the problems.
|
||||
|
||||
THE PLAN:
|
||||
<plan content>"
|
||||
|
||||
**If CODEX_AVAILABLE:**
|
||||
|
||||
```bash
|
||||
TMPERR_PV=$(mktemp /tmp/codex-planreview-XXXXXXXX)
|
||||
codex exec "<prompt>" -s read-only -c 'model_reasoning_effort="xhigh"' --enable web_search_cached 2>"$TMPERR_PV"
|
||||
```
|
||||
|
||||
Use a 5-minute timeout (`timeout: 300000`). After the command completes, read stderr:
|
||||
```bash
|
||||
cat "$TMPERR_PV"
|
||||
```
|
||||
|
||||
Present the full output verbatim:
|
||||
|
||||
```
|
||||
CODEX SAYS (plan review — outside voice):
|
||||
════════════════════════════════════════════════════════════
|
||||
<full codex output, verbatim — do not truncate or summarize>
|
||||
════════════════════════════════════════════════════════════
|
||||
```
|
||||
|
||||
**Error handling:** All errors are non-blocking — the outside voice is informational.
|
||||
- Auth failure (stderr contains "auth", "login", "unauthorized"): "Codex auth failed. Run \`codex login\` to authenticate."
|
||||
- Timeout: "Codex timed out after 5 minutes."
|
||||
- Empty response: "Codex returned no response."
|
||||
|
||||
On any Codex error, fall back to the Claude adversarial subagent.
|
||||
|
||||
**If CODEX_NOT_AVAILABLE (or Codex errored):**
|
||||
|
||||
Dispatch via the Agent tool. The subagent has fresh context — genuine independence.
|
||||
|
||||
Subagent prompt: same plan review prompt as above.
|
||||
|
||||
Present findings under an `OUTSIDE VOICE (Claude subagent):` header.
|
||||
|
||||
If the subagent fails or times out: "Outside voice unavailable. Continuing to outputs."
|
||||
|
||||
**Cross-model tension:**
|
||||
|
||||
After presenting the outside voice findings, note any points where the outside voice
|
||||
disagrees with the review findings from earlier sections. Flag these as:
|
||||
|
||||
```
|
||||
CROSS-MODEL TENSION:
|
||||
[Topic]: Review said X. Outside voice says Y. [Your assessment of who's right.]
|
||||
```
|
||||
|
||||
For each substantive tension point, auto-propose as a TODO via AskUserQuestion:
|
||||
|
||||
> "Cross-model disagreement on [topic]. The review found [X] but the outside voice
|
||||
> argues [Y]. Worth investigating further?"
|
||||
|
||||
Options:
|
||||
- A) Add to TODOS.md
|
||||
- B) Skip — not substantive
|
||||
|
||||
If no tension points exist, note: "No cross-model tension — both reviewers agree."
|
||||
|
||||
**Persist the result:**
|
||||
```bash
|
||||
~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"codex-plan-review","timestamp":"'"$(date -u +%Y-%m-%dT%H:%M:%SZ)"'","status":"STATUS","source":"SOURCE","commit":"'"$(git rev-parse --short HEAD)"'"}'
|
||||
```
|
||||
|
||||
Substitute: STATUS = "clean" if no findings, "issues_found" if findings exist.
|
||||
SOURCE = "codex" if Codex ran, "claude" if subagent ran.
|
||||
|
||||
**Cleanup:** Run `rm -f "$TMPERR_PV"` after processing (if Codex was used).
|
||||
|
||||
---
|
||||
|
||||
Always work through the full interactive review: one section at a time (Architecture → Code Quality → Tests → Performance) with at most 8 top issues per section.
|
||||
|
||||
@@ -493,6 +589,7 @@ At the end of the review, fill in and display this summary so the user can see a
|
||||
- What already exists: written
|
||||
- TODOS.md updates: ___ items proposed to user
|
||||
- Failure modes: ___ critical gaps flagged
|
||||
- Outside voice: ran (codex/claude) / skipped
|
||||
- Lake Score: X/Y recommendations chose complete option
|
||||
|
||||
## Retrospective learning
|
||||
@@ -535,7 +632,7 @@ After completing the review, read the review log and config to display the dashb
|
||||
~/.claude/skills/gstack/bin/gstack-review-read
|
||||
```
|
||||
|
||||
Parse the output. Find the most recent entry for each skill (plan-ceo-review, plan-eng-review, plan-design-review, design-review-lite, adversarial-review, codex-review). Ignore entries with timestamps older than 7 days. For the Adversarial row, show whichever is more recent between `adversarial-review` (new auto-scaled) and `codex-review` (legacy). For Design Review, show whichever is more recent between `plan-design-review` (full visual audit) and `design-review-lite` (code-level check). Append "(FULL)" or "(LITE)" to the status to distinguish. Display:
|
||||
Parse the output. Find the most recent entry for each skill (plan-ceo-review, plan-eng-review, plan-design-review, design-review-lite, adversarial-review, codex-review, codex-plan-review). Ignore entries with timestamps older than 7 days. For the Adversarial row, show whichever is more recent between `adversarial-review` (new auto-scaled) and `codex-review` (legacy). For Design Review, show whichever is more recent between `plan-design-review` (full visual audit) and `design-review-lite` (code-level check). Append "(FULL)" or "(LITE)" to the status to distinguish. Display:
|
||||
|
||||
```
|
||||
+====================================================================+
|
||||
@@ -547,6 +644,7 @@ Parse the output. Find the most recent entry for each skill (plan-ceo-review, pl
|
||||
| CEO Review | 0 | — | — | no |
|
||||
| Design Review | 0 | — | — | no |
|
||||
| Adversarial | 0 | — | — | no |
|
||||
| Outside Voice | 0 | — | — | no |
|
||||
+--------------------------------------------------------------------+
|
||||
| VERDICT: CLEARED — Eng Review passed |
|
||||
+====================================================================+
|
||||
@@ -557,6 +655,7 @@ Parse the output. Find the most recent entry for each skill (plan-ceo-review, pl
|
||||
- **CEO Review (optional):** Use your judgment. Recommend it for big product/business changes, new user-facing features, or scope decisions. Skip for bug fixes, refactors, infra, and cleanup.
|
||||
- **Design Review (optional):** Use your judgment. Recommend it for UI/UX changes. Skip for backend-only, infra, or prompt-only changes.
|
||||
- **Adversarial Review (automatic):** Auto-scales by diff size. Small diffs (<50 lines) skip adversarial. Medium diffs (50–199) get cross-model adversarial. Large diffs (200+) get all 4 passes: Claude structured, Codex structured, Claude adversarial subagent, Codex adversarial. No configuration needed.
|
||||
- **Outside Voice (optional):** Independent plan review from a different AI model. Offered after all review sections complete in /plan-ceo-review and /plan-eng-review. Falls back to Claude subagent if Codex is unavailable. Never gates shipping.
|
||||
|
||||
**Verdict logic:**
|
||||
- **CLEARED**: Eng Review has >= 1 entry within 7 days with status "clean" (or \`skip_eng_review\` is \`true\`)
|
||||
|
||||
+3
-1
@@ -305,7 +305,7 @@ After completing the review, read the review log and config to display the dashb
|
||||
~/.claude/skills/gstack/bin/gstack-review-read
|
||||
```
|
||||
|
||||
Parse the output. Find the most recent entry for each skill (plan-ceo-review, plan-eng-review, plan-design-review, design-review-lite, adversarial-review, codex-review). Ignore entries with timestamps older than 7 days. For the Adversarial row, show whichever is more recent between `adversarial-review` (new auto-scaled) and `codex-review` (legacy). For Design Review, show whichever is more recent between `plan-design-review` (full visual audit) and `design-review-lite` (code-level check). Append "(FULL)" or "(LITE)" to the status to distinguish. Display:
|
||||
Parse the output. Find the most recent entry for each skill (plan-ceo-review, plan-eng-review, plan-design-review, design-review-lite, adversarial-review, codex-review, codex-plan-review). Ignore entries with timestamps older than 7 days. For the Adversarial row, show whichever is more recent between `adversarial-review` (new auto-scaled) and `codex-review` (legacy). For Design Review, show whichever is more recent between `plan-design-review` (full visual audit) and `design-review-lite` (code-level check). Append "(FULL)" or "(LITE)" to the status to distinguish. Display:
|
||||
|
||||
```
|
||||
+====================================================================+
|
||||
@@ -317,6 +317,7 @@ Parse the output. Find the most recent entry for each skill (plan-ceo-review, pl
|
||||
| CEO Review | 0 | — | — | no |
|
||||
| Design Review | 0 | — | — | no |
|
||||
| Adversarial | 0 | — | — | no |
|
||||
| Outside Voice | 0 | — | — | no |
|
||||
+--------------------------------------------------------------------+
|
||||
| VERDICT: CLEARED — Eng Review passed |
|
||||
+====================================================================+
|
||||
@@ -327,6 +328,7 @@ Parse the output. Find the most recent entry for each skill (plan-ceo-review, pl
|
||||
- **CEO Review (optional):** Use your judgment. Recommend it for big product/business changes, new user-facing features, or scope decisions. Skip for bug fixes, refactors, infra, and cleanup.
|
||||
- **Design Review (optional):** Use your judgment. Recommend it for UI/UX changes. Skip for backend-only, infra, or prompt-only changes.
|
||||
- **Adversarial Review (automatic):** Auto-scales by diff size. Small diffs (<50 lines) skip adversarial. Medium diffs (50–199) get cross-model adversarial. Large diffs (200+) get all 4 passes: Claude structured, Codex structured, Claude adversarial subagent, Codex adversarial. No configuration needed.
|
||||
- **Outside Voice (optional):** Independent plan review from a different AI model. Offered after all review sections complete in /plan-ceo-review and /plan-eng-review. Falls back to Claude subagent if Codex is unavailable. Never gates shipping.
|
||||
|
||||
**Verdict logic:**
|
||||
- **CLEARED**: Eng Review has >= 1 entry within 7 days with status "clean" (or \`skip_eng_review\` is \`true\`)
|
||||
|
||||
Reference in New Issue
Block a user