feat: per-mode reasoning (high for review/consult, xhigh for challenge) + web search

Review and consult use high reasoning — thorough but not slow.
Challenge (adversarial) uses xhigh — maximum depth for breaking code.
All modes enable web_search_cached so Codex can look up docs/APIs.
Garry Tan
2026-03-18 21:29:07 -07:00
parent 4c60be711d
commit 6294c5a74a
2 changed files with 38 additions and 24 deletions
+19 -12
@@ -221,13 +221,13 @@ TMPERR=$(mktemp /tmp/codex-err-XXXXXX.txt)
2. Run the review (5-minute timeout):
```bash
-codex review --base <base> -c 'model_reasoning_effort="xhigh"' 2>"$TMPERR"
+codex review --base <base> -c 'model_reasoning_effort="high"' --enable web_search_cached 2>"$TMPERR"
```
Use `timeout: 300000` on the Bash call. If the user provided custom instructions
(e.g., `/codex review focus on security`), pass them as the prompt argument:
```bash
-codex review "focus on security" --base <base> -c 'model_reasoning_effort="xhigh"' 2>"$TMPERR"
+codex review "focus on security" --base <base> -c 'model_reasoning_effort="high"' --enable web_search_cached 2>"$TMPERR"
```
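The `mktemp` files created in these steps are easy to leak if a step fails partway. A minimal sketch of pairing the temp file with an exit trap (illustrative, not part of the skill's numbered steps; the `echo` stands in for the real codex call):

```shell
# Create the stderr capture file, and remove it when the shell exits,
# whether the codex call succeeds, fails, or times out.
TMPERR=$(mktemp /tmp/codex-err-XXXXXX.txt)
trap 'rm -f "$TMPERR"' EXIT

echo "example stderr output" >"$TMPERR"   # stand-in for: codex ... 2>"$TMPERR"
cat "$TMPERR"
```

The trap fires on any exit path, so the cleanup does not depend on reaching the end of the step.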
3. Capture the output. Then parse cost from stderr:
@@ -306,7 +306,7 @@ With focus (e.g., "security"):
3. Run codex exec (5-minute timeout):
```bash
-codex exec "<prompt>" -s read-only -c 'model_reasoning_effort="xhigh"' -o "$TMPRESP" 2>"$TMPERR"
+codex exec "<prompt>" -s read-only -c 'model_reasoning_effort="xhigh"' --enable web_search_cached -o "$TMPRESP" 2>"$TMPERR"
```
4. Read the response and parse cost:
@@ -369,12 +369,12 @@ THE PLAN:
For a **new session:**
```bash
-codex exec "<prompt>" -s read-only -c 'model_reasoning_effort="xhigh"' -o "$TMPRESP" 2>"$TMPERR"
+codex exec "<prompt>" -s read-only -c 'model_reasoning_effort="high"' --enable web_search_cached -o "$TMPRESP" 2>"$TMPERR"
```
For a **resumed session** (user chose "Continue"):
```bash
-codex exec resume <session-id> "<prompt>" -s read-only -c 'model_reasoning_effort="xhigh"' -o "$TMPRESP" 2>"$TMPERR"
+codex exec resume <session-id> "<prompt>" -s read-only -c 'model_reasoning_effort="high"' --enable web_search_cached -o "$TMPRESP" 2>"$TMPERR"
```
5. Capture and save session ID:
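One hedged sketch of this step, assuming the `-o` response file is JSON carrying a session identifier. The `session_id` key is a guess for illustration only; check the actual Codex output format before relying on it:

```shell
# HYPOTHETICAL: the real key in codex's -o output may differ. This only
# illustrates "read the response file, pull out an id, keep it for resume".
TMPRESP=$(mktemp /tmp/codex-resp-XXXXXX.json)
printf '%s' '{"session_id": "abc123", "response": "..."}' >"$TMPRESP"  # stand-in for codex -o output
SESSION_ID=$(python3 -c 'import json,sys; print(json.load(open(sys.argv[1]))["session_id"])' "$TMPRESP")
echo "$SESSION_ID"   # -> abc123
```

Whatever the real field is called, the saved value is what gets passed to `codex exec resume <session-id>` on the next `/codex` invocation.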
@@ -410,15 +410,22 @@ Session saved — run /codex again to continue this conversation.
## Model & Reasoning
-Codex with ChatGPT login only supports `gpt-5.2-codex` (the default). Other models
-(o3, o4-mini, gpt-4o) require API key auth and are not available with ChatGPT login.
+**Models** (with ChatGPT login):
+- `gpt-5.2-codex` (default) — frontier agentic coding model, best overall
+- `gpt-5.2` — deeper reasoning for non-coding analysis
+- `gpt-5.1-codex-max` — deep and fast reasoning, 30% cheaper
+- `gpt-5.1-codex-mini` — faster and cheaper, less capable
-All codex commands use `-c 'model_reasoning_effort="xhigh"'` because the whole point of
-this skill is deep, thorough analysis. We want maximum reasoning power — this is your
-"200 IQ autistic developer" second opinion, not a quick sanity check.
+**Reasoning effort** varies by mode — use the right level for each task:
+- **Review mode:** `high` — thorough but not slow. Diff review benefits from depth but doesn't need maximum compute.
+- **Challenge (adversarial) mode:** `xhigh` — maximum reasoning power. When trying to break code, you want the model thinking as hard as possible.
+- **Consult mode:** `high` — good balance of depth and speed for conversations.
-If the user has API key auth and wants a different model, they can say
-`/codex review -m o3` and the `-m` flag should be passed through to codex.
+**Web search:** All codex commands use `--enable web_search_cached` so Codex can look up
+docs and APIs during review. This is OpenAI's cached index — fast, no extra cost.
+If the user specifies a model (e.g., `/codex review -m gpt-5.1-codex-max`),
+pass the `-m` flag through to codex.
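The mode-to-effort mapping introduced in this commit can be sketched as a small shell helper. The function name and mode labels are illustrative, not part of the skill itself:

```shell
# Illustrative helper: pick a reasoning effort per mode.
# Mirrors the mapping above: review/consult -> high, challenge -> xhigh.
effort_for_mode() {
  case "$1" in
    challenge)      echo "xhigh" ;;   # adversarial mode: maximum depth
    review|consult) echo "high"  ;;   # thorough but not slow
    *)              echo "high"  ;;   # default for unrecognized modes
  esac
}

effort_for_mode review     # -> high
effort_for_mode challenge  # -> xhigh
```

Centralizing the mapping like this keeps the `-c 'model_reasoning_effort="..."'` value in one place if the per-mode levels are tuned again later.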
---
+19 -12
@@ -75,13 +75,13 @@ TMPERR=$(mktemp /tmp/codex-err-XXXXXX.txt)
2. Run the review (5-minute timeout):
```bash
-codex review --base <base> -c 'model_reasoning_effort="xhigh"' 2>"$TMPERR"
+codex review --base <base> -c 'model_reasoning_effort="high"' --enable web_search_cached 2>"$TMPERR"
```
Use `timeout: 300000` on the Bash call. If the user provided custom instructions
(e.g., `/codex review focus on security`), pass them as the prompt argument:
```bash
-codex review "focus on security" --base <base> -c 'model_reasoning_effort="xhigh"' 2>"$TMPERR"
+codex review "focus on security" --base <base> -c 'model_reasoning_effort="high"' --enable web_search_cached 2>"$TMPERR"
```
3. Capture the output. Then parse cost from stderr:
@@ -160,7 +160,7 @@ With focus (e.g., "security"):
3. Run codex exec (5-minute timeout):
```bash
-codex exec "<prompt>" -s read-only -c 'model_reasoning_effort="xhigh"' -o "$TMPRESP" 2>"$TMPERR"
+codex exec "<prompt>" -s read-only -c 'model_reasoning_effort="xhigh"' --enable web_search_cached -o "$TMPRESP" 2>"$TMPERR"
```
4. Read the response and parse cost:
@@ -223,12 +223,12 @@ THE PLAN:
For a **new session:**
```bash
-codex exec "<prompt>" -s read-only -c 'model_reasoning_effort="xhigh"' -o "$TMPRESP" 2>"$TMPERR"
+codex exec "<prompt>" -s read-only -c 'model_reasoning_effort="high"' --enable web_search_cached -o "$TMPRESP" 2>"$TMPERR"
```
For a **resumed session** (user chose "Continue"):
```bash
-codex exec resume <session-id> "<prompt>" -s read-only -c 'model_reasoning_effort="xhigh"' -o "$TMPRESP" 2>"$TMPERR"
+codex exec resume <session-id> "<prompt>" -s read-only -c 'model_reasoning_effort="high"' --enable web_search_cached -o "$TMPRESP" 2>"$TMPERR"
```
5. Capture and save session ID:
@@ -264,15 +264,22 @@ Session saved — run /codex again to continue this conversation.
## Model & Reasoning
-Codex with ChatGPT login only supports `gpt-5.2-codex` (the default). Other models
-(o3, o4-mini, gpt-4o) require API key auth and are not available with ChatGPT login.
+**Models** (with ChatGPT login):
+- `gpt-5.2-codex` (default) — frontier agentic coding model, best overall
+- `gpt-5.2` — deeper reasoning for non-coding analysis
+- `gpt-5.1-codex-max` — deep and fast reasoning, 30% cheaper
+- `gpt-5.1-codex-mini` — faster and cheaper, less capable
-All codex commands use `-c 'model_reasoning_effort="xhigh"'` because the whole point of
-this skill is deep, thorough analysis. We want maximum reasoning power — this is your
-"200 IQ autistic developer" second opinion, not a quick sanity check.
+**Reasoning effort** varies by mode — use the right level for each task:
+- **Review mode:** `high` — thorough but not slow. Diff review benefits from depth but doesn't need maximum compute.
+- **Challenge (adversarial) mode:** `xhigh` — maximum reasoning power. When trying to break code, you want the model thinking as hard as possible.
+- **Consult mode:** `high` — good balance of depth and speed for conversations.
-If the user has API key auth and wants a different model, they can say
-`/codex review -m o3` and the `-m` flag should be passed through to codex.
+**Web search:** All codex commands use `--enable web_search_cached` so Codex can look up
+docs and APIs during review. This is OpenAI's cached index — fast, no extra cost.
+If the user specifies a model (e.g., `/codex review -m gpt-5.1-codex-max`),
+pass the `-m` flag through to codex.
---