feat: per-mode reasoning (high for review/consult, xhigh for challenge) + web search

Review and consult use high reasoning — thorough but not slow.
Challenge (adversarial) uses xhigh — maximum depth for breaking code.
All modes enable web_search_cached so Codex can look up docs/APIs.
Garry Tan
2026-03-18 21:29:07 -07:00
parent 4c60be711d
commit 6294c5a74a
2 changed files with 38 additions and 24 deletions
+19 -12
@@ -221,13 +221,13 @@ TMPERR=$(mktemp /tmp/codex-err-XXXXXX.txt)
2. Run the review (5-minute timeout):
```bash
-codex review --base <base> -c 'model_reasoning_effort="xhigh"' 2>"$TMPERR"
+codex review --base <base> -c 'model_reasoning_effort="high"' --enable web_search_cached 2>"$TMPERR"
```
Use `timeout: 300000` on the Bash call. If the user provided custom instructions
(e.g., `/codex review focus on security`), pass them as the prompt argument:
```bash
-codex review "focus on security" --base <base> -c 'model_reasoning_effort="xhigh"' 2>"$TMPERR"
+codex review "focus on security" --base <base> -c 'model_reasoning_effort="high"' --enable web_search_cached 2>"$TMPERR"
```
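The `mktemp` files created in these steps are easy to leak if a step fails partway. A minimal sketch of pairing the temp file with an exit trap (illustrative, not part of the skill's numbered steps; the `echo` stands in for the real codex call):

```shell
# Create the stderr capture file, and remove it when the shell exits,
# whether the codex call succeeds, fails, or times out.
TMPERR=$(mktemp /tmp/codex-err-XXXXXX.txt)
trap 'rm -f "$TMPERR"' EXIT

echo "example stderr output" >"$TMPERR"   # stand-in for: codex ... 2>"$TMPERR"
cat "$TMPERR"
```

The trap fires on any exit path, so the cleanup does not depend on reaching the end of the step.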
3. Capture the output. Then parse cost from stderr:
@@ -306,7 +306,7 @@ With focus (e.g., "security"):
3. Run codex exec (5-minute timeout):
```bash
-codex exec "<prompt>" -s read-only -c 'model_reasoning_effort="xhigh"' -o "$TMPRESP" 2>"$TMPERR"
+codex exec "<prompt>" -s read-only -c 'model_reasoning_effort="xhigh"' --enable web_search_cached -o "$TMPRESP" 2>"$TMPERR"
```
4. Read the response and parse cost:
@@ -369,12 +369,12 @@ THE PLAN:
For a **new session:**
```bash
-codex exec "<prompt>" -s read-only -c 'model_reasoning_effort="xhigh"' -o "$TMPRESP" 2>"$TMPERR"
+codex exec "<prompt>" -s read-only -c 'model_reasoning_effort="high"' --enable web_search_cached -o "$TMPRESP" 2>"$TMPERR"
```
For a **resumed session** (user chose "Continue"):
```bash
-codex exec resume <session-id> "<prompt>" -s read-only -c 'model_reasoning_effort="xhigh"' -o "$TMPRESP" 2>"$TMPERR"
+codex exec resume <session-id> "<prompt>" -s read-only -c 'model_reasoning_effort="high"' --enable web_search_cached -o "$TMPRESP" 2>"$TMPERR"
```
5. Capture and save session ID:
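One hedged sketch of this step, assuming the `-o` response file is JSON carrying a session identifier. The `session_id` key is a guess for illustration only; check the actual Codex output format before relying on it:

```shell
# HYPOTHETICAL: the real key in codex's -o output may differ. This only
# illustrates "read the response file, pull out an id, keep it for resume".
TMPRESP=$(mktemp /tmp/codex-resp-XXXXXX.json)
printf '%s' '{"session_id": "abc123", "response": "..."}' >"$TMPRESP"  # stand-in for codex -o output
SESSION_ID=$(python3 -c 'import json,sys; print(json.load(open(sys.argv[1]))["session_id"])' "$TMPRESP")
echo "$SESSION_ID"   # -> abc123
```

Whatever the real field is called, the saved value is what gets passed to `codex exec resume <session-id>` on the next `/codex` invocation.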
@@ -410,15 +410,22 @@ Session saved — run /codex again to continue this conversation.
## Model & Reasoning
-Codex with ChatGPT login only supports `gpt-5.2-codex` (the default). Other models
-(o3, o4-mini, gpt-4o) require API key auth and are not available with ChatGPT login.
+**Models** (with ChatGPT login):
+- `gpt-5.2-codex` (default) — frontier agentic coding model, best overall
+- `gpt-5.2` — deeper reasoning for non-coding analysis
+- `gpt-5.1-codex-max` — deep and fast reasoning, 30% cheaper
+- `gpt-5.1-codex-mini` — faster and cheaper, less capable
-All codex commands use `-c 'model_reasoning_effort="xhigh"'` because the whole point of
-this skill is deep, thorough analysis. We want maximum reasoning power — this is your
-"200 IQ autistic developer" second opinion, not a quick sanity check.
+**Reasoning effort** varies by mode — use the right level for each task:
+- **Review mode:** `high` — thorough but not slow. Diff review benefits from depth but doesn't need maximum compute.
+- **Challenge (adversarial) mode:** `xhigh` — maximum reasoning power. When trying to break code, you want the model thinking as hard as possible.
+- **Consult mode:** `high` — good balance of depth and speed for conversations.
-If the user has API key auth and wants a different model, they can say
-`/codex review -m o3` and the `-m` flag should be passed through to codex.
+**Web search:** All codex commands use `--enable web_search_cached` so Codex can look up
+docs and APIs during review. This is OpenAI's cached index — fast, no extra cost.
+If the user specifies a model (e.g., `/codex review -m gpt-5.1-codex-max`),
+pass the `-m` flag through to codex.
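The mode-to-effort mapping introduced in this commit can be sketched as a small shell helper. The function name and mode labels are illustrative, not part of the skill itself:

```shell
# Illustrative helper: pick a reasoning effort per mode.
# Mirrors the mapping above: review/consult -> high, challenge -> xhigh.
effort_for_mode() {
  case "$1" in
    challenge)      echo "xhigh" ;;   # adversarial mode: maximum depth
    review|consult) echo "high"  ;;   # thorough but not slow
    *)              echo "high"  ;;   # default for unrecognized modes
  esac
}

effort_for_mode review     # -> high
effort_for_mode challenge  # -> xhigh
```

Centralizing the mapping like this keeps the `-c 'model_reasoning_effort="..."'` value in one place if the per-mode levels are tuned again later.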
---
+19 -12
@@ -75,13 +75,13 @@ TMPERR=$(mktemp /tmp/codex-err-XXXXXX.txt)
2. Run the review (5-minute timeout):
```bash
-codex review --base <base> -c 'model_reasoning_effort="xhigh"' 2>"$TMPERR"
+codex review --base <base> -c 'model_reasoning_effort="high"' --enable web_search_cached 2>"$TMPERR"
```
Use `timeout: 300000` on the Bash call. If the user provided custom instructions
(e.g., `/codex review focus on security`), pass them as the prompt argument:
```bash
-codex review "focus on security" --base <base> -c 'model_reasoning_effort="xhigh"' 2>"$TMPERR"
+codex review "focus on security" --base <base> -c 'model_reasoning_effort="high"' --enable web_search_cached 2>"$TMPERR"
```
3. Capture the output. Then parse cost from stderr:
@@ -160,7 +160,7 @@ With focus (e.g., "security"):
3. Run codex exec (5-minute timeout):
```bash
-codex exec "<prompt>" -s read-only -c 'model_reasoning_effort="xhigh"' -o "$TMPRESP" 2>"$TMPERR"
+codex exec "<prompt>" -s read-only -c 'model_reasoning_effort="xhigh"' --enable web_search_cached -o "$TMPRESP" 2>"$TMPERR"
```
4. Read the response and parse cost:
@@ -223,12 +223,12 @@ THE PLAN:
For a **new session:**
```bash
-codex exec "<prompt>" -s read-only -c 'model_reasoning_effort="xhigh"' -o "$TMPRESP" 2>"$TMPERR"
+codex exec "<prompt>" -s read-only -c 'model_reasoning_effort="high"' --enable web_search_cached -o "$TMPRESP" 2>"$TMPERR"
```
For a **resumed session** (user chose "Continue"):
```bash
-codex exec resume <session-id> "<prompt>" -s read-only -c 'model_reasoning_effort="xhigh"' -o "$TMPRESP" 2>"$TMPERR"
+codex exec resume <session-id> "<prompt>" -s read-only -c 'model_reasoning_effort="high"' --enable web_search_cached -o "$TMPRESP" 2>"$TMPERR"
```
5. Capture and save session ID:
@@ -264,15 +264,22 @@ Session saved — run /codex again to continue this conversation.
## Model & Reasoning
-Codex with ChatGPT login only supports `gpt-5.2-codex` (the default). Other models
-(o3, o4-mini, gpt-4o) require API key auth and are not available with ChatGPT login.
+**Models** (with ChatGPT login):
+- `gpt-5.2-codex` (default) — frontier agentic coding model, best overall
+- `gpt-5.2` — deeper reasoning for non-coding analysis
+- `gpt-5.1-codex-max` — deep and fast reasoning, 30% cheaper
+- `gpt-5.1-codex-mini` — faster and cheaper, less capable
-All codex commands use `-c 'model_reasoning_effort="xhigh"'` because the whole point of
-this skill is deep, thorough analysis. We want maximum reasoning power — this is your
-"200 IQ autistic developer" second opinion, not a quick sanity check.
+**Reasoning effort** varies by mode — use the right level for each task:
+- **Review mode:** `high` — thorough but not slow. Diff review benefits from depth but doesn't need maximum compute.
+- **Challenge (adversarial) mode:** `xhigh` — maximum reasoning power. When trying to break code, you want the model thinking as hard as possible.
+- **Consult mode:** `high` — good balance of depth and speed for conversations.
-If the user has API key auth and wants a different model, they can say
-`/codex review -m o3` and the `-m` flag should be passed through to codex.
+**Web search:** All codex commands use `--enable web_search_cached` so Codex can look up
+docs and APIs during review. This is OpenAI's cached index — fast, no extra cost.
+If the user specifies a model (e.g., `/codex review -m gpt-5.1-codex-max`),
+pass the `-m` flag through to codex.
---