mirror of
https://github.com/garrytan/gstack.git
synced 2026-05-01 19:25:10 +02:00
feat: default codex reviews in /ship and /review (v0.9.4.0) (#256)
* feat: default codex reviews in /ship and /review with xhigh reasoning
Codex code reviews are now opt-in-once-then-always-on via a one-time
adoption prompt. When enabled, both review + adversarial run automatically
on every /ship and /review — no more choosing between them.
Key changes:
- New {{CODEX_REVIEW_STEP}} resolver centralizes Codex review logic (DRY)
- Three-state config: enabled/not-set/disabled via gstack-config
- P1 findings default to "Investigate and fix" instead of "Ship anyway"
- All reasoning bumped to xhigh (review, adversarial, consult)
- Codex review step stripped from codex-host variants (no self-invocation)
- Ship "Never ask" rule updated to accurately list quality-gate stops
- Error handling for auth, timeout, empty response (all non-blocking)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: update touchfiles test for plan-ceo-review-benefits dependency
The merge from main added plan-ceo-review-benefits to E2E_TOUCHFILES,
which means plan-ceo-review/SKILL.md now selects 3 tests, not 2.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat: default codex reviews in /ship and /review (v0.9.4.0)
Codex code reviews now run automatically — both review + adversarial
challenge — with a one-time opt-in prompt for new users. All modes use
xhigh reasoning. Codex-host builds strip the step to prevent recursion.
Fixes from Codex review: TMPERR properly defined, stderr captured for
both review and adversarial, error handling before log persist, commit
hash included in review log for staleness tracking.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
+5
-8
@@ -300,13 +300,13 @@ TMPERR=$(mktemp /tmp/codex-err-XXXXXX.txt)
|
||||
|
||||
2. Run the review (5-minute timeout):
|
||||
```bash
|
||||
codex review --base <base> -c 'model_reasoning_effort="high"' --enable web_search_cached 2>"$TMPERR"
|
||||
codex review --base <base> -c 'model_reasoning_effort="xhigh"' --enable web_search_cached 2>"$TMPERR"
|
||||
```
|
||||
|
||||
Use `timeout: 300000` on the Bash call. If the user provided custom instructions
|
||||
(e.g., `/codex review focus on security`), pass them as the prompt argument:
|
||||
```bash
|
||||
codex review "focus on security" --base <base> -c 'model_reasoning_effort="high"' --enable web_search_cached 2>"$TMPERR"
|
||||
codex review "focus on security" --base <base> -c 'model_reasoning_effort="xhigh"' --enable web_search_cached 2>"$TMPERR"
|
||||
```
|
||||
|
||||
3. Capture the output. Then parse cost from stderr:
|
||||
@@ -461,7 +461,7 @@ THE PLAN:
|
||||
|
||||
For a **new session:**
|
||||
```bash
|
||||
codex exec "<prompt>" -s read-only -c 'model_reasoning_effort="high"' --enable web_search_cached --json 2>"$TMPERR" | python3 -c "
|
||||
codex exec "<prompt>" -s read-only -c 'model_reasoning_effort="xhigh"' --enable web_search_cached --json 2>"$TMPERR" | python3 -c "
|
||||
import sys, json
|
||||
for line in sys.stdin:
|
||||
line = line.strip()
|
||||
@@ -494,7 +494,7 @@ for line in sys.stdin:
|
||||
|
||||
For a **resumed session** (user chose "Continue"):
|
||||
```bash
|
||||
codex exec resume <session-id> "<prompt>" -s read-only -c 'model_reasoning_effort="high"' --enable web_search_cached --json 2>"$TMPERR" | python3 -c "
|
||||
codex exec resume <session-id> "<prompt>" -s read-only -c 'model_reasoning_effort="xhigh"' --enable web_search_cached --json 2>"$TMPERR" | python3 -c "
|
||||
<same python streaming parser as above>
|
||||
"
|
||||
```
|
||||
@@ -530,10 +530,7 @@ Session saved — run /codex again to continue this conversation.
|
||||
agentic coding model). This means as OpenAI ships newer models, /codex automatically
|
||||
uses them. If the user wants a specific model, pass `-m` through to codex.
|
||||
|
||||
**Reasoning effort** varies by mode — use the right level for each task:
|
||||
- **Review mode:** `high` — thorough but not slow. Diff review benefits from depth but doesn't need maximum compute.
|
||||
- **Challenge (adversarial) mode:** `xhigh` — maximum reasoning power. When trying to break code, you want the model thinking as hard as possible.
|
||||
- **Consult mode:** `high` — good balance of depth and speed for conversations.
|
||||
**Reasoning effort:** All modes use `xhigh` — maximum reasoning power. When reviewing code, breaking code, or consulting on architecture, you want the model thinking as hard as possible.
|
||||
|
||||
**Web search:** All codex commands use `--enable web_search_cached` so Codex can look up
|
||||
docs and APIs during review. This is OpenAI's cached index — fast, no extra cost.
|
||||
|
||||
Reference in New Issue
Block a user