mirror of
https://github.com/garrytan/gstack.git
synced 2026-05-11 15:27:22 +02:00
feat: auto-scaled adversarial review (v0.9.5.0) (#297)
* feat: auto-scaled adversarial review by diff size Replace config-driven Codex review step with automatic adversarial review that scales by diff size: small (<50 lines) skips adversarial, medium (50-199) gets cross-model adversarial, large (200+) gets all 4 passes. Adds Claude adversarial subagent as fallback when Codex unavailable. Review log uses new "adversarial-review" skill name with source/tier fields. Dashboard updated to read both adversarial-review and legacy codex-review. * chore: regenerate SKILL.md files for auto-scaled adversarial review * chore: bump version and changelog (v0.9.5.0) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: allow user override of adversarial review tier Users can now say "paranoid review", "run all passes", "full adversarial", etc. to force the large tier regardless of diff size. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
@@ -1075,7 +1075,7 @@ After completing the review, read the review log and config to display the dashb
|
||||
~/.codex/skills/gstack/bin/gstack-review-read
|
||||
```
|
||||
|
||||
Parse the output. Find the most recent entry for each skill (plan-ceo-review, plan-eng-review, plan-design-review, design-review-lite, codex-review). Ignore entries with timestamps older than 7 days. For Design Review, show whichever is more recent between `plan-design-review` (full visual audit) and `design-review-lite` (code-level check). Append "(FULL)" or "(LITE)" to the status to distinguish. Display:
|
||||
Parse the output. Find the most recent entry for each skill (plan-ceo-review, plan-eng-review, plan-design-review, design-review-lite, adversarial-review, codex-review). Ignore entries with timestamps older than 7 days. For the Adversarial row, show whichever is more recent between `adversarial-review` (new auto-scaled) and `codex-review` (legacy). For Design Review, show whichever is more recent between `plan-design-review` (full visual audit) and `design-review-lite` (code-level check). Append "(FULL)" or "(LITE)" to the status to distinguish. Display:
|
||||
|
||||
```
|
||||
+====================================================================+
|
||||
@@ -1086,7 +1086,7 @@ Parse the output. Find the most recent entry for each skill (plan-ceo-review, pl
|
||||
| Eng Review | 1 | 2026-03-16 15:00 | CLEAR | YES |
|
||||
| CEO Review | 0 | — | — | no |
|
||||
| Design Review | 0 | — | — | no |
|
||||
| Codex Review | 0 | — | — | no |
|
||||
| Adversarial | 0 | — | — | no |
|
||||
+--------------------------------------------------------------------+
|
||||
| VERDICT: CLEARED — Eng Review passed |
|
||||
+====================================================================+
|
||||
@@ -1096,7 +1096,7 @@ Parse the output. Find the most recent entry for each skill (plan-ceo-review, pl
|
||||
- **Eng Review (required by default):** The only review that gates shipping. Covers architecture, code quality, tests, performance. Can be disabled globally with \`gstack-config set skip_eng_review true\` (the "don't bother me" setting).
|
||||
- **CEO Review (optional):** Use your judgment. Recommend it for big product/business changes, new user-facing features, or scope decisions. Skip for bug fixes, refactors, infra, and cleanup.
|
||||
- **Design Review (optional):** Use your judgment. Recommend it for UI/UX changes. Skip for backend-only, infra, or prompt-only changes.
|
||||
- **Codex Review (enabled by default when Codex CLI is installed):** Independent review + adversarial challenge from OpenAI Codex CLI. Shows pass/fail gate. Runs automatically when enabled — configure with \`gstack-config set codex_reviews enabled|disabled\`.
|
||||
- **Adversarial Review (automatic):** Auto-scales by diff size. Small diffs (<50 lines) skip adversarial. Medium diffs (50–199) get cross-model adversarial. Large diffs (200+) get all 4 passes: Claude structured, Codex structured, Claude adversarial subagent, Codex adversarial. No configuration needed.
|
||||
|
||||
**Verdict logic:**
|
||||
- **CLEARED**: Eng Review has >= 1 entry within 7 days with status "clean" (or \`skip_eng_review\` is \`true\`)
|
||||
|
||||
@@ -527,7 +527,7 @@ After completing the review, read the review log and config to display the dashb
|
||||
~/.codex/skills/gstack/bin/gstack-review-read
|
||||
```
|
||||
|
||||
Parse the output. Find the most recent entry for each skill (plan-ceo-review, plan-eng-review, plan-design-review, design-review-lite, codex-review). Ignore entries with timestamps older than 7 days. For Design Review, show whichever is more recent between `plan-design-review` (full visual audit) and `design-review-lite` (code-level check). Append "(FULL)" or "(LITE)" to the status to distinguish. Display:
|
||||
Parse the output. Find the most recent entry for each skill (plan-ceo-review, plan-eng-review, plan-design-review, design-review-lite, adversarial-review, codex-review). Ignore entries with timestamps older than 7 days. For the Adversarial row, show whichever is more recent between `adversarial-review` (new auto-scaled) and `codex-review` (legacy). For Design Review, show whichever is more recent between `plan-design-review` (full visual audit) and `design-review-lite` (code-level check). Append "(FULL)" or "(LITE)" to the status to distinguish. Display:
|
||||
|
||||
```
|
||||
+====================================================================+
|
||||
@@ -538,7 +538,7 @@ Parse the output. Find the most recent entry for each skill (plan-ceo-review, pl
|
||||
| Eng Review | 1 | 2026-03-16 15:00 | CLEAR | YES |
|
||||
| CEO Review | 0 | — | — | no |
|
||||
| Design Review | 0 | — | — | no |
|
||||
| Codex Review | 0 | — | — | no |
|
||||
| Adversarial | 0 | — | — | no |
|
||||
+--------------------------------------------------------------------+
|
||||
| VERDICT: CLEARED — Eng Review passed |
|
||||
+====================================================================+
|
||||
@@ -548,7 +548,7 @@ Parse the output. Find the most recent entry for each skill (plan-ceo-review, pl
|
||||
- **Eng Review (required by default):** The only review that gates shipping. Covers architecture, code quality, tests, performance. Can be disabled globally with \`gstack-config set skip_eng_review true\` (the "don't bother me" setting).
|
||||
- **CEO Review (optional):** Use your judgment. Recommend it for big product/business changes, new user-facing features, or scope decisions. Skip for bug fixes, refactors, infra, and cleanup.
|
||||
- **Design Review (optional):** Use your judgment. Recommend it for UI/UX changes. Skip for backend-only, infra, or prompt-only changes.
|
||||
- **Codex Review (enabled by default when Codex CLI is installed):** Independent review + adversarial challenge from OpenAI Codex CLI. Shows pass/fail gate. Runs automatically when enabled — configure with \`gstack-config set codex_reviews enabled|disabled\`.
|
||||
- **Adversarial Review (automatic):** Auto-scales by diff size. Small diffs (<50 lines) skip adversarial. Medium diffs (50–199) get cross-model adversarial. Large diffs (200+) get all 4 passes: Claude structured, Codex structured, Claude adversarial subagent, Codex adversarial. No configuration needed.
|
||||
|
||||
**Verdict logic:**
|
||||
- **CLEARED**: Eng Review has >= 1 entry within 7 days with status "clean" (or \`skip_eng_review\` is \`true\`)
|
||||
|
||||
@@ -524,7 +524,7 @@ After completing the review, read the review log and config to display the dashb
|
||||
~/.codex/skills/gstack/bin/gstack-review-read
|
||||
```
|
||||
|
||||
Parse the output. Find the most recent entry for each skill (plan-ceo-review, plan-eng-review, plan-design-review, design-review-lite, codex-review). Ignore entries with timestamps older than 7 days. For Design Review, show whichever is more recent between `plan-design-review` (full visual audit) and `design-review-lite` (code-level check). Append "(FULL)" or "(LITE)" to the status to distinguish. Display:
|
||||
Parse the output. Find the most recent entry for each skill (plan-ceo-review, plan-eng-review, plan-design-review, design-review-lite, adversarial-review, codex-review). Ignore entries with timestamps older than 7 days. For the Adversarial row, show whichever is more recent between `adversarial-review` (new auto-scaled) and `codex-review` (legacy). For Design Review, show whichever is more recent between `plan-design-review` (full visual audit) and `design-review-lite` (code-level check). Append "(FULL)" or "(LITE)" to the status to distinguish. Display:
|
||||
|
||||
```
|
||||
+====================================================================+
|
||||
@@ -535,7 +535,7 @@ Parse the output. Find the most recent entry for each skill (plan-ceo-review, pl
|
||||
| Eng Review | 1 | 2026-03-16 15:00 | CLEAR | YES |
|
||||
| CEO Review | 0 | — | — | no |
|
||||
| Design Review | 0 | — | — | no |
|
||||
| Codex Review | 0 | — | — | no |
|
||||
| Adversarial | 0 | — | — | no |
|
||||
+--------------------------------------------------------------------+
|
||||
| VERDICT: CLEARED — Eng Review passed |
|
||||
+====================================================================+
|
||||
@@ -545,7 +545,7 @@ Parse the output. Find the most recent entry for each skill (plan-ceo-review, pl
|
||||
- **Eng Review (required by default):** The only review that gates shipping. Covers architecture, code quality, tests, performance. Can be disabled globally with \`gstack-config set skip_eng_review true\` (the "don't bother me" setting).
|
||||
- **CEO Review (optional):** Use your judgment. Recommend it for big product/business changes, new user-facing features, or scope decisions. Skip for bug fixes, refactors, infra, and cleanup.
|
||||
- **Design Review (optional):** Use your judgment. Recommend it for UI/UX changes. Skip for backend-only, infra, or prompt-only changes.
|
||||
- **Codex Review (enabled by default when Codex CLI is installed):** Independent review + adversarial challenge from OpenAI Codex CLI. Shows pass/fail gate. Runs automatically when enabled — configure with \`gstack-config set codex_reviews enabled|disabled\`.
|
||||
- **Adversarial Review (automatic):** Auto-scales by diff size. Small diffs (<50 lines) skip adversarial. Medium diffs (50–199) get cross-model adversarial. Large diffs (200+) get all 4 passes: Claude structured, Codex structured, Claude adversarial subagent, Codex adversarial. No configuration needed.
|
||||
|
||||
**Verdict logic:**
|
||||
- **CLEARED**: Eng Review has >= 1 entry within 7 days with status "clean" (or \`skip_eng_review\` is \`true\`)
|
||||
|
||||
@@ -294,7 +294,7 @@ After completing the review, read the review log and config to display the dashb
|
||||
~/.codex/skills/gstack/bin/gstack-review-read
|
||||
```
|
||||
|
||||
Parse the output. Find the most recent entry for each skill (plan-ceo-review, plan-eng-review, plan-design-review, design-review-lite, codex-review). Ignore entries with timestamps older than 7 days. For Design Review, show whichever is more recent between `plan-design-review` (full visual audit) and `design-review-lite` (code-level check). Append "(FULL)" or "(LITE)" to the status to distinguish. Display:
|
||||
Parse the output. Find the most recent entry for each skill (plan-ceo-review, plan-eng-review, plan-design-review, design-review-lite, adversarial-review, codex-review). Ignore entries with timestamps older than 7 days. For the Adversarial row, show whichever is more recent between `adversarial-review` (new auto-scaled) and `codex-review` (legacy). For Design Review, show whichever is more recent between `plan-design-review` (full visual audit) and `design-review-lite` (code-level check). Append "(FULL)" or "(LITE)" to the status to distinguish. Display:
|
||||
|
||||
```
|
||||
+====================================================================+
|
||||
@@ -305,7 +305,7 @@ Parse the output. Find the most recent entry for each skill (plan-ceo-review, pl
|
||||
| Eng Review | 1 | 2026-03-16 15:00 | CLEAR | YES |
|
||||
| CEO Review | 0 | — | — | no |
|
||||
| Design Review | 0 | — | — | no |
|
||||
| Codex Review | 0 | — | — | no |
|
||||
| Adversarial | 0 | — | — | no |
|
||||
+--------------------------------------------------------------------+
|
||||
| VERDICT: CLEARED — Eng Review passed |
|
||||
+====================================================================+
|
||||
@@ -315,7 +315,7 @@ Parse the output. Find the most recent entry for each skill (plan-ceo-review, pl
|
||||
- **Eng Review (required by default):** The only review that gates shipping. Covers architecture, code quality, tests, performance. Can be disabled globally with \`gstack-config set skip_eng_review true\` (the "don't bother me" setting).
|
||||
- **CEO Review (optional):** Use your judgment. Recommend it for big product/business changes, new user-facing features, or scope decisions. Skip for bug fixes, refactors, infra, and cleanup.
|
||||
- **Design Review (optional):** Use your judgment. Recommend it for UI/UX changes. Skip for backend-only, infra, or prompt-only changes.
|
||||
- **Codex Review (enabled by default when Codex CLI is installed):** Independent review + adversarial challenge from OpenAI Codex CLI. Shows pass/fail gate. Runs automatically when enabled — configure with \`gstack-config set codex_reviews enabled|disabled\`.
|
||||
- **Adversarial Review (automatic):** Auto-scales by diff size. Small diffs (<50 lines) skip adversarial. Medium diffs (50–199) get cross-model adversarial. Large diffs (200+) get all 4 passes: Claude structured, Codex structured, Claude adversarial subagent, Codex adversarial. No configuration needed.
|
||||
|
||||
**Verdict logic:**
|
||||
- **CLEARED**: Eng Review has >= 1 entry within 7 days with status "clean" (or \`skip_eng_review\` is \`true\`)
|
||||
@@ -1098,7 +1098,7 @@ doc updates — the user runs `/ship` and documentation stays current without a
|
||||
- **Never skip tests.** If tests fail, stop.
|
||||
- **Never skip the pre-landing review.** If checklist.md is unreadable, stop.
|
||||
- **Never force push.** Use regular `git push` only.
|
||||
- **Never ask for trivial confirmations** (e.g., "ready to push?", "create PR?"). DO stop for: version bumps (MINOR/MAJOR), pre-landing review findings (ASK items), Codex critical findings ([P1]), and the one-time Codex adoption prompt.
|
||||
- **Never ask for trivial confirmations** (e.g., "ready to push?", "create PR?"). DO stop for: version bumps (MINOR/MAJOR), pre-landing review findings (ASK items), and Codex structured review [P1] findings (large diffs only).
|
||||
- **Always use the 4-digit version format** from the VERSION file.
|
||||
- **Date format in CHANGELOG:** `YYYY-MM-DD`
|
||||
- **Split commits for bisectability** — each commit = one logical change.
|
||||
|
||||
Reference in New Issue
Block a user