feat: per-mode reasoning (high for review/consult, xhigh for challenge) + web search
Review and consult use high reasoning — thorough but not slow. Challenge (adversarial) uses xhigh — maximum depth for breaking code. All modes enable web_search_cached so Codex can look up docs/APIs.
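Taken together, the change amounts to a small per-mode flag table. The sketch below only restates the flags visible in the diff; the `MODE` variable and the `case` dispatch are illustrative scaffolding, not part of the skill file. Only the challenge (adversarial) mode stays at `xhigh`.

```bash
# Illustrative only: restates the per-mode flags this commit describes.
# MODE stands in for whatever the /codex skill dispatched on.
MODE="${1:-review}"

case "$MODE" in
  challenge)       EFFORT="xhigh" ;;  # adversarial mode keeps maximum reasoning
  review|consult)  EFFORT="high"  ;;  # thorough but not slow
  *) echo "unknown mode: $MODE" >&2; exit 1 ;;
esac

# All modes enable OpenAI's cached web search index.
COMMON_FLAGS=(-c "model_reasoning_effort=\"$EFFORT\"" --enable web_search_cached)
echo "codex flags for $MODE mode: ${COMMON_FLAGS[*]}"
```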
@@ -221,13 +221,13 @@ TMPERR=$(mktemp /tmp/codex-err-XXXXXX.txt)

2. Run the review (5-minute timeout):
   ```bash
-  codex review --base <base> -c 'model_reasoning_effort="xhigh"' 2>"$TMPERR"
+  codex review --base <base> -c 'model_reasoning_effort="high"' --enable web_search_cached 2>"$TMPERR"
   ```

   Use `timeout: 300000` on the Bash call. If the user provided custom instructions
   (e.g., `/codex review focus on security`), pass them as the prompt argument:
   ```bash
-  codex review "focus on security" --base <base> -c 'model_reasoning_effort="xhigh"' 2>"$TMPERR"
+  codex review "focus on security" --base <base> -c 'model_reasoning_effort="high"' --enable web_search_cached 2>"$TMPERR"
   ```

3. Capture the output. Then parse cost from stderr:
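Assembling this hunk into a runnable step, the review call looks roughly like the sketch below. It assumes the `TMPERR` mktemp pattern from the hunk's context line and uses coreutils `timeout 300` as a stand-in for the `timeout: 300000` Bash-tool parameter; `<base>` remains a placeholder for the base branch.

```bash
# Sketch of the review step, assuming the TMPERR setup shown in the hunk context.
TMPERR=$(mktemp /tmp/codex-err-XXXXXX.txt)

# 5-minute cap; `timeout 300` stands in for the Bash tool's `timeout: 300000`.
timeout 300 codex review --base "<base>" \
  -c 'model_reasoning_effort="high"' \
  --enable web_search_cached \
  2>"$TMPERR"

# Cost details land on stderr and are parsed in step 3 of the skill.
cat "$TMPERR"
```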
@@ -306,7 +306,7 @@ With focus (e.g., "security"):

3. Run codex exec (5-minute timeout):
   ```bash
-  codex exec "<prompt>" -s read-only -c 'model_reasoning_effort="xhigh"' -o "$TMPRESP" 2>"$TMPERR"
+  codex exec "<prompt>" -s read-only -c 'model_reasoning_effort="xhigh"' --enable web_search_cached -o "$TMPRESP" 2>"$TMPERR"
   ```

4. Read the response and parse cost:
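The exec-based mode above writes the model's answer to a file via `-o` instead of stdout. A minimal sketch, assuming `TMPRESP` is created with the same mktemp pattern as `TMPERR` (the exact temp-file path is an assumption):

```bash
# Assumes TMPRESP/TMPERR follow the mktemp pattern shown earlier in the skill file.
TMPRESP=$(mktemp /tmp/codex-resp-XXXXXX.txt)
TMPERR=$(mktemp /tmp/codex-err-XXXXXX.txt)

timeout 300 codex exec "<prompt>" -s read-only \
  -c 'model_reasoning_effort="xhigh"' \
  --enable web_search_cached \
  -o "$TMPRESP" 2>"$TMPERR"

# Step 4 of the skill then reads the response file and parses cost from stderr.
cat "$TMPRESP"
```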
@@ -369,12 +369,12 @@ THE PLAN:

For a **new session:**
```bash
-codex exec "<prompt>" -s read-only -c 'model_reasoning_effort="xhigh"' -o "$TMPRESP" 2>"$TMPERR"
+codex exec "<prompt>" -s read-only -c 'model_reasoning_effort="high"' --enable web_search_cached -o "$TMPRESP" 2>"$TMPERR"
```

For a **resumed session** (user chose "Continue"):
```bash
-codex exec resume <session-id> "<prompt>" -s read-only -c 'model_reasoning_effort="xhigh"' -o "$TMPRESP" 2>"$TMPERR"
+codex exec resume <session-id> "<prompt>" -s read-only -c 'model_reasoning_effort="high"' --enable web_search_cached -o "$TMPRESP" 2>"$TMPERR"
```

5. Capture and save session ID:
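The new-session versus resumed-session split above reduces to whether a saved session ID exists. In the sketch below the `SESSION_FILE` path is hypothetical; only the `codex exec` and `codex exec resume` invocations come from the diff.

```bash
# Hypothetical persistence: SESSION_FILE is illustrative, not part of the skill file.
SESSION_FILE=".codex-session-id"

if [[ -s "$SESSION_FILE" ]]; then
  # User chose "Continue": resume the saved session.
  codex exec resume "$(cat "$SESSION_FILE")" "<prompt>" -s read-only \
    -c 'model_reasoning_effort="high"' --enable web_search_cached \
    -o "$TMPRESP" 2>"$TMPERR"
else
  # No saved session: start a new one.
  codex exec "<prompt>" -s read-only \
    -c 'model_reasoning_effort="high"' --enable web_search_cached \
    -o "$TMPRESP" 2>"$TMPERR"
fi
```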
@@ -410,15 +410,22 @@ Session saved — run /codex again to continue this conversation.

## Model & Reasoning

-Codex with ChatGPT login only supports `gpt-5.2-codex` (the default). Other models
-(o3, o4-mini, gpt-4o) require API key auth and are not available with ChatGPT login.
+**Models** (with ChatGPT login):
+- `gpt-5.2-codex` (default) — frontier agentic coding model, best overall
+- `gpt-5.2` — deeper reasoning for non-coding analysis
+- `gpt-5.1-codex-max` — deep and fast reasoning, 30% cheaper
+- `gpt-5.1-codex-mini` — faster and cheaper, less capable

-All codex commands use `-c 'model_reasoning_effort="xhigh"'` because the whole point of
-this skill is deep, thorough analysis. We want maximum reasoning power — this is your
-"200 IQ autistic developer" second opinion, not a quick sanity check.
+**Reasoning effort** varies by mode — use the right level for each task:
+- **Review mode:** `high` — thorough but not slow. Diff review benefits from depth but doesn't need maximum compute.
+- **Challenge (adversarial) mode:** `xhigh` — maximum reasoning power. When trying to break code, you want the model thinking as hard as possible.
+- **Consult mode:** `high` — good balance of depth and speed for conversations.

-If the user has API key auth and wants a different model, they can say
-`/codex review -m o3` and the `-m` flag should be passed through to codex.
+**Web search:** All codex commands use `--enable web_search_cached` so Codex can look up
+docs and APIs during review. This is OpenAI's cached index — fast, no extra cost.
+
+If the user specifies a model (e.g., `/codex review -m gpt-5.1-codex-max`),
+pass the `-m` flag through to codex.

---
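The rewritten "Model & Reasoning" section keeps the rule that a user-supplied model is passed straight through with `-m`. One way that passthrough could look, with `USER_MODEL` as an illustrative placeholder:

```bash
# Illustrative: append -m only when the user named a model,
# e.g. `/codex review -m gpt-5.1-codex-max`.
EXTRA_FLAGS=()
if [[ -n "${USER_MODEL:-}" ]]; then
  EXTRA_FLAGS+=(-m "$USER_MODEL")
fi

codex review --base "<base>" \
  -c 'model_reasoning_effort="high"' \
  --enable web_search_cached \
  "${EXTRA_FLAGS[@]}" 2>"$TMPERR"
```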
@@ -75,13 +75,13 @@ TMPERR=$(mktemp /tmp/codex-err-XXXXXX.txt)

2. Run the review (5-minute timeout):
   ```bash
-  codex review --base <base> -c 'model_reasoning_effort="xhigh"' 2>"$TMPERR"
+  codex review --base <base> -c 'model_reasoning_effort="high"' --enable web_search_cached 2>"$TMPERR"
   ```

   Use `timeout: 300000` on the Bash call. If the user provided custom instructions
   (e.g., `/codex review focus on security`), pass them as the prompt argument:
   ```bash
-  codex review "focus on security" --base <base> -c 'model_reasoning_effort="xhigh"' 2>"$TMPERR"
+  codex review "focus on security" --base <base> -c 'model_reasoning_effort="high"' --enable web_search_cached 2>"$TMPERR"
   ```

3. Capture the output. Then parse cost from stderr:
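Since both copies of the skill redirect stderr to a temp file, a failed run can pass silently. An optional hardening sketch; the `trap` cleanup and exit-code check are additions for illustration, not something this commit introduces:

```bash
TMPERR=$(mktemp /tmp/codex-err-XXXXXX.txt)
trap 'rm -f "$TMPERR"' EXIT   # clean up the temp file on exit (optional hardening)

if ! timeout 300 codex review --base "<base>" \
    -c 'model_reasoning_effort="high"' --enable web_search_cached 2>"$TMPERR"; then
  echo "codex review failed; stderr follows:" >&2
  cat "$TMPERR" >&2
fi
```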
@@ -160,7 +160,7 @@ With focus (e.g., "security"):

3. Run codex exec (5-minute timeout):
   ```bash
-  codex exec "<prompt>" -s read-only -c 'model_reasoning_effort="xhigh"' -o "$TMPRESP" 2>"$TMPERR"
+  codex exec "<prompt>" -s read-only -c 'model_reasoning_effort="xhigh"' --enable web_search_cached -o "$TMPRESP" 2>"$TMPERR"
   ```

4. Read the response and parse cost:
@@ -223,12 +223,12 @@ THE PLAN:

For a **new session:**
```bash
-codex exec "<prompt>" -s read-only -c 'model_reasoning_effort="xhigh"' -o "$TMPRESP" 2>"$TMPERR"
+codex exec "<prompt>" -s read-only -c 'model_reasoning_effort="high"' --enable web_search_cached -o "$TMPRESP" 2>"$TMPERR"
```

For a **resumed session** (user chose "Continue"):
```bash
-codex exec resume <session-id> "<prompt>" -s read-only -c 'model_reasoning_effort="xhigh"' -o "$TMPRESP" 2>"$TMPERR"
+codex exec resume <session-id> "<prompt>" -s read-only -c 'model_reasoning_effort="high"' --enable web_search_cached -o "$TMPRESP" 2>"$TMPERR"
```

5. Capture and save session ID:
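Step 5 ("Capture and save session ID") is truncated by the hunk boundary, so the capture itself isn't shown. Assuming a `SESSION_ID` variable has already been filled in by that step, persisting it for the next `/codex` run could look like this hypothetical sketch; the file location is not defined anywhere in the skill.

```bash
# Hypothetical: SESSION_ID is assumed to be captured by the truncated step above.
SESSION_FILE=".codex-session-id"   # illustrative location, not part of the skill file

if [[ -n "${SESSION_ID:-}" ]]; then
  printf '%s\n' "$SESSION_ID" > "$SESSION_FILE"
fi
```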
@@ -264,15 +264,22 @@ Session saved — run /codex again to continue this conversation.

## Model & Reasoning

-Codex with ChatGPT login only supports `gpt-5.2-codex` (the default). Other models
-(o3, o4-mini, gpt-4o) require API key auth and are not available with ChatGPT login.
+**Models** (with ChatGPT login):
+- `gpt-5.2-codex` (default) — frontier agentic coding model, best overall
+- `gpt-5.2` — deeper reasoning for non-coding analysis
+- `gpt-5.1-codex-max` — deep and fast reasoning, 30% cheaper
+- `gpt-5.1-codex-mini` — faster and cheaper, less capable

-All codex commands use `-c 'model_reasoning_effort="xhigh"'` because the whole point of
-this skill is deep, thorough analysis. We want maximum reasoning power — this is your
-"200 IQ autistic developer" second opinion, not a quick sanity check.
+**Reasoning effort** varies by mode — use the right level for each task:
+- **Review mode:** `high` — thorough but not slow. Diff review benefits from depth but doesn't need maximum compute.
+- **Challenge (adversarial) mode:** `xhigh` — maximum reasoning power. When trying to break code, you want the model thinking as hard as possible.
+- **Consult mode:** `high` — good balance of depth and speed for conversations.

-If the user has API key auth and wants a different model, they can say
-`/codex review -m o3` and the `-m` flag should be passed through to codex.
+**Web search:** All codex commands use `--enable web_search_cached` so Codex can look up
+docs and APIs during review. This is OpenAI's cached index — fast, no extra cost.
+
+If the user specifies a model (e.g., `/codex review -m gpt-5.1-codex-max`),
+pass the `-m` flag through to codex.

---