From 6294c5a74a4c929cce6e6ae3c3aebcf460a21adb Mon Sep 17 00:00:00 2001 From: Garry Tan Date: Wed, 18 Mar 2026 21:29:07 -0700 Subject: [PATCH] feat: per-mode reasoning (high for review/consult, xhigh for challenge) + web search MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Review and consult use high reasoning — thorough but not slow. Challenge (adversarial) uses xhigh — maximum depth for breaking code. All modes enable web_search_cached so Codex can look up docs/APIs. --- codex/SKILL.md | 31 +++++++++++++++++++------------ codex/SKILL.md.tmpl | 31 +++++++++++++++++++------------ 2 files changed, 38 insertions(+), 24 deletions(-) diff --git a/codex/SKILL.md b/codex/SKILL.md index 83e6d6a9..04e85664 100644 --- a/codex/SKILL.md +++ b/codex/SKILL.md @@ -221,13 +221,13 @@ TMPERR=$(mktemp /tmp/codex-err-XXXXXX.txt) 2. Run the review (5-minute timeout): ```bash -codex review --base -c 'model_reasoning_effort="xhigh"' 2>"$TMPERR" +codex review --base -c 'model_reasoning_effort="high"' --enable web_search_cached 2>"$TMPERR" ``` Use `timeout: 300000` on the Bash call. If the user provided custom instructions (e.g., `/codex review focus on security`), pass them as the prompt argument: ```bash -codex review "focus on security" --base -c 'model_reasoning_effort="xhigh"' 2>"$TMPERR" +codex review "focus on security" --base -c 'model_reasoning_effort="high"' --enable web_search_cached 2>"$TMPERR" ``` 3. Capture the output. Then parse cost from stderr: @@ -306,7 +306,7 @@ With focus (e.g., "security"): 3. Run codex exec (5-minute timeout): ```bash -codex exec "" -s read-only -c 'model_reasoning_effort="xhigh"' -o "$TMPRESP" 2>"$TMPERR" +codex exec "" -s read-only -c 'model_reasoning_effort="xhigh"' --enable web_search_cached -o "$TMPRESP" 2>"$TMPERR" ``` 4. 
Read the response and parse cost: @@ -369,12 +369,12 @@ THE PLAN: For a **new session:** ```bash -codex exec "" -s read-only -c 'model_reasoning_effort="xhigh"' -o "$TMPRESP" 2>"$TMPERR" +codex exec "" -s read-only -c 'model_reasoning_effort="high"' --enable web_search_cached -o "$TMPRESP" 2>"$TMPERR" ``` For a **resumed session** (user chose "Continue"): ```bash -codex exec resume "" -s read-only -c 'model_reasoning_effort="xhigh"' -o "$TMPRESP" 2>"$TMPERR" +codex exec resume "" -s read-only -c 'model_reasoning_effort="high"' --enable web_search_cached -o "$TMPRESP" 2>"$TMPERR" ``` 5. Capture and save session ID: @@ -410,15 +410,22 @@ Session saved — run /codex again to continue this conversation. ## Model & Reasoning -Codex with ChatGPT login only supports `gpt-5.2-codex` (the default). Other models -(o3, o4-mini, gpt-4o) require API key auth and are not available with ChatGPT login. +**Models** (with ChatGPT login): +- `gpt-5.2-codex` (default) — frontier agentic coding model, best overall +- `gpt-5.2` — deeper reasoning for non-coding analysis +- `gpt-5.1-codex-max` — deep and fast reasoning, 30% cheaper +- `gpt-5.1-codex-mini` — faster and cheaper, less capable -All codex commands use `-c 'model_reasoning_effort="xhigh"'` because the whole point of -this skill is deep, thorough analysis. We want maximum reasoning power — this is your -"200 IQ autistic developer" second opinion, not a quick sanity check. +**Reasoning effort** varies by mode — use the right level for each task: +- **Review mode:** `high` — thorough but not slow. Diff review benefits from depth but doesn't need maximum compute. +- **Challenge (adversarial) mode:** `xhigh` — maximum reasoning power. When trying to break code, you want the model thinking as hard as possible. +- **Consult mode:** `high` — good balance of depth and speed for conversations. 
-If the user has API key auth and wants a different model, they can say -`/codex review -m o3` and the `-m` flag should be passed through to codex. +**Web search:** All codex commands use `--enable web_search_cached` so Codex can look up +docs and APIs during review. This is OpenAI's cached index — fast, no extra cost. + +If the user specifies a model (e.g., `/codex review -m gpt-5.1-codex-max`), +pass the `-m` flag through to codex. --- diff --git a/codex/SKILL.md.tmpl b/codex/SKILL.md.tmpl index 1b30a236..bfd1fa20 100644 --- a/codex/SKILL.md.tmpl +++ b/codex/SKILL.md.tmpl @@ -75,13 +75,13 @@ TMPERR=$(mktemp /tmp/codex-err-XXXXXX.txt) 2. Run the review (5-minute timeout): ```bash -codex review --base -c 'model_reasoning_effort="xhigh"' 2>"$TMPERR" +codex review --base -c 'model_reasoning_effort="high"' --enable web_search_cached 2>"$TMPERR" ``` Use `timeout: 300000` on the Bash call. If the user provided custom instructions (e.g., `/codex review focus on security`), pass them as the prompt argument: ```bash -codex review "focus on security" --base -c 'model_reasoning_effort="xhigh"' 2>"$TMPERR" +codex review "focus on security" --base -c 'model_reasoning_effort="high"' --enable web_search_cached 2>"$TMPERR" ``` 3. Capture the output. Then parse cost from stderr: @@ -160,7 +160,7 @@ With focus (e.g., "security"): 3. Run codex exec (5-minute timeout): ```bash -codex exec "" -s read-only -c 'model_reasoning_effort="xhigh"' -o "$TMPRESP" 2>"$TMPERR" +codex exec "" -s read-only -c 'model_reasoning_effort="xhigh"' --enable web_search_cached -o "$TMPRESP" 2>"$TMPERR" ``` 4. 
Read the response and parse cost: @@ -223,12 +223,12 @@ THE PLAN: For a **new session:** ```bash -codex exec "" -s read-only -c 'model_reasoning_effort="xhigh"' -o "$TMPRESP" 2>"$TMPERR" +codex exec "" -s read-only -c 'model_reasoning_effort="high"' --enable web_search_cached -o "$TMPRESP" 2>"$TMPERR" ``` For a **resumed session** (user chose "Continue"): ```bash -codex exec resume "" -s read-only -c 'model_reasoning_effort="xhigh"' -o "$TMPRESP" 2>"$TMPERR" +codex exec resume "" -s read-only -c 'model_reasoning_effort="high"' --enable web_search_cached -o "$TMPRESP" 2>"$TMPERR" ``` 5. Capture and save session ID: @@ -264,15 +264,22 @@ Session saved — run /codex again to continue this conversation. ## Model & Reasoning -Codex with ChatGPT login only supports `gpt-5.2-codex` (the default). Other models -(o3, o4-mini, gpt-4o) require API key auth and are not available with ChatGPT login. +**Models** (with ChatGPT login): +- `gpt-5.2-codex` (default) — frontier agentic coding model, best overall +- `gpt-5.2` — deeper reasoning for non-coding analysis +- `gpt-5.1-codex-max` — deep and fast reasoning, 30% cheaper +- `gpt-5.1-codex-mini` — faster and cheaper, less capable -All codex commands use `-c 'model_reasoning_effort="xhigh"'` because the whole point of -this skill is deep, thorough analysis. We want maximum reasoning power — this is your -"200 IQ autistic developer" second opinion, not a quick sanity check. +**Reasoning effort** varies by mode — use the right level for each task: +- **Review mode:** `high` — thorough but not slow. Diff review benefits from depth but doesn't need maximum compute. +- **Challenge (adversarial) mode:** `xhigh` — maximum reasoning power. When trying to break code, you want the model thinking as hard as possible. +- **Consult mode:** `high` — good balance of depth and speed for conversations. 
-If the user has API key auth and wants a different model, they can say -`/codex review -m o3` and the `-m` flag should be passed through to codex. +**Web search:** All codex commands use `--enable web_search_cached` so Codex can look up +docs and APIs during review. This is OpenAI's cached index — fast, no extra cost. + +If the user specifies a model (e.g., `/codex review -m gpt-5.1-codex-max`), +pass the `-m` flag through to codex. ---
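
The mode-to-effort mapping this patch introduces (review/consult use `high`, challenge uses `xhigh`, all modes share `--enable web_search_cached`) can be sketched as a small shell helper. This is an illustrative sketch only: the helper names `effort_for_mode` and `codex_flags` are hypothetical and do not appear in the skill, which passes the flags inline as shown in the hunks above.

```shell
#!/bin/sh
# Map a skill mode to its reasoning-effort level, per the patch:
# review and consult use "high", challenge uses "xhigh".
effort_for_mode() {
  case "$1" in
    review|consult) echo "high" ;;
    challenge)      echo "xhigh" ;;
    *) echo "unknown mode: $1" >&2; return 1 ;;
  esac
}

# Build the shared flag suffix for a given mode: the reasoning-effort
# config override plus the web-search toggle every mode enables.
codex_flags() {
  printf -- "-c 'model_reasoning_effort=\"%s\"' --enable web_search_cached" \
    "$(effort_for_mode "$1")"
}

# Example: print the flags for each mode.
for mode in review challenge consult; do
  printf '%s: %s\n' "$mode" "$(codex_flags "$mode")"
done
```

A helper like this would keep the per-mode effort levels in one place if the skill ever grows more modes; as written, the patch simply repeats the flags in each command, which is also fine for three modes.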