From 57e3e8a453e0030ad4e345c00164936c923f76ea Mon Sep 17 00:00:00 2001
From: Garry Tan <garrytan@gmail.com>
Date: Sat, 21 Mar 2026 11:34:25 -0700
Subject: [PATCH] =?UTF-8?q?fix:=20address=20Codex=20review=20=E2=80=94=20s?=
 =?UTF-8?q?anitize=20search,=20privacy=20gate,=20ETHOS.md=20sidecar?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Three fixes from adversarial Codex review:
- /investigate: sanitize error messages before searching (strip hostnames,
  IPs, file paths, SQL, customer data). Skip search if unsanitizable.
- /office-hours: add privacy gate before landscape search. Use generalized
  category terms, never the user's specific product name or stealth idea.
- setup: link ETHOS.md into .agents/skills/gstack/ sidecar so workspace-
  local Codex sessions can find the builder philosophy.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---
 .agents/skills/gstack-investigate/SKILL.md  |  2 +-
 .agents/skills/gstack-office-hours/SKILL.md |  6 ++++++
 investigate/SKILL.md                        |  2 +-
 investigate/SKILL.md.tmpl                   |  2 +-
 office-hours/SKILL.md                       |  6 ++++++
 office-hours/SKILL.md.tmpl                  |  6 ++++++
 setup                                       | 11 +++++++++++
 7 files changed, 32 insertions(+), 3 deletions(-)

diff --git a/.agents/skills/gstack-investigate/SKILL.md b/.agents/skills/gstack-investigate/SKILL.md
index edfb21b2..13c43518 100644
--- a/.agents/skills/gstack-investigate/SKILL.md
+++ b/.agents/skills/gstack-investigate/SKILL.md
@@ -323,7 +323,7 @@ Before writing ANY fix, verify your hypothesis.
 
 1. **Confirm the hypothesis:** Add a temporary log statement, assertion, or debug output at the suspected root cause. Run the reproduction. Does the evidence match?
 
-2. **If the hypothesis is wrong:** Before forming the next hypothesis, WebSearch for the exact error message (quoted) and "{component} {unexpected behavior} {framework version}". This often surfaces version-specific regressions or known issues that save hypothesis cycles. If WebSearch is unavailable, skip and proceed. Then return to Phase 1. Gather more evidence. Do not guess.
+2. **If the hypothesis is wrong:** Before forming the next hypothesis, consider searching for the error. **Sanitize first** — strip hostnames, IPs, file paths, SQL fragments, customer identifiers, and any internal/proprietary data from the error message. Search only the generic error type and framework context: "{component} {sanitized error type} {framework version}". If the error message is too specific to sanitize safely, skip the search. If WebSearch is unavailable, skip and proceed. Then return to Phase 1. Gather more evidence. Do not guess.
 
 3. **3-strike rule:** If 3 hypotheses fail, **STOP**. Use AskUserQuestion:
    ```
diff --git a/.agents/skills/gstack-office-hours/SKILL.md b/.agents/skills/gstack-office-hours/SKILL.md
index 4c3adbf2..c464c88c 100644
--- a/.agents/skills/gstack-office-hours/SKILL.md
+++ b/.agents/skills/gstack-office-hours/SKILL.md
@@ -473,6 +473,12 @@ Read ETHOS.md for the full Search Before Building framework (three layers, eurek
 
 After understanding the problem through questioning, search for what the world thinks. This is NOT competitive research (that's /design-consultation's job). This is understanding conventional wisdom so you can evaluate where it's wrong.
 
+**Privacy gate:** Before searching, use AskUserQuestion: "I'd like to search for what the world thinks about this space to inform our discussion. This sends generalized category terms (not your specific idea) to a search provider. OK to proceed?"
+Options: A) Yes, search away  B) Skip — keep this session private
+If B: skip this phase entirely and proceed to Phase 3. Use only in-distribution knowledge.
+
+When searching, use **generalized category terms** — never the user's specific product name, proprietary concept, or stealth idea. For example, search "task management app landscape" not "SuperTodo AI-powered task killer."
+
 If WebSearch is unavailable, skip this phase and note: "Search unavailable — proceeding with in-distribution knowledge only."
 
 **Startup mode:** WebSearch for:
diff --git a/investigate/SKILL.md b/investigate/SKILL.md
index 429300f8..04c98dd0 100644
--- a/investigate/SKILL.md
+++ b/investigate/SKILL.md
@@ -343,7 +343,7 @@ Before writing ANY fix, verify your hypothesis.
 
 1. **Confirm the hypothesis:** Add a temporary log statement, assertion, or debug output at the suspected root cause. Run the reproduction. Does the evidence match?
 
-2. **If the hypothesis is wrong:** Before forming the next hypothesis, WebSearch for the exact error message (quoted) and "{component} {unexpected behavior} {framework version}". This often surfaces version-specific regressions or known issues that save hypothesis cycles. If WebSearch is unavailable, skip and proceed. Then return to Phase 1. Gather more evidence. Do not guess.
+2. **If the hypothesis is wrong:** Before forming the next hypothesis, consider searching for the error. **Sanitize first** — strip hostnames, IPs, file paths, SQL fragments, customer identifiers, and any internal/proprietary data from the error message. Search only the generic error type and framework context: "{component} {sanitized error type} {framework version}". If the error message is too specific to sanitize safely, skip the search. If WebSearch is unavailable, skip and proceed. Then return to Phase 1. Gather more evidence. Do not guess.
 
 3. **3-strike rule:** If 3 hypotheses fail, **STOP**. Use AskUserQuestion:
    ```
diff --git a/investigate/SKILL.md.tmpl b/investigate/SKILL.md.tmpl
index 532e4b7a..7678e8e1 100644
--- a/investigate/SKILL.md.tmpl
+++ b/investigate/SKILL.md.tmpl
@@ -119,7 +119,7 @@ Before writing ANY fix, verify your hypothesis.
 
 1. **Confirm the hypothesis:** Add a temporary log statement, assertion, or debug output at the suspected root cause. Run the reproduction. Does the evidence match?
 
-2. **If the hypothesis is wrong:** Before forming the next hypothesis, WebSearch for the exact error message (quoted) and "{component} {unexpected behavior} {framework version}". This often surfaces version-specific regressions or known issues that save hypothesis cycles. If WebSearch is unavailable, skip and proceed. Then return to Phase 1. Gather more evidence. Do not guess.
+2. **If the hypothesis is wrong:** Before forming the next hypothesis, consider searching for the error. **Sanitize first** — strip hostnames, IPs, file paths, SQL fragments, customer identifiers, and any internal/proprietary data from the error message. Search only the generic error type and framework context: "{component} {sanitized error type} {framework version}". If the error message is too specific to sanitize safely, skip the search. If WebSearch is unavailable, skip and proceed. Then return to Phase 1. Gather more evidence. Do not guess.
 
 3. **3-strike rule:** If 3 hypotheses fail, **STOP**. Use AskUserQuestion:
    ```
diff --git a/office-hours/SKILL.md b/office-hours/SKILL.md
index 6f7c7f86..218fe133 100644
--- a/office-hours/SKILL.md
+++ b/office-hours/SKILL.md
@@ -483,6 +483,12 @@ Read ETHOS.md for the full Search Before Building framework (three layers, eurek
 
 After understanding the problem through questioning, search for what the world thinks. This is NOT competitive research (that's /design-consultation's job). This is understanding conventional wisdom so you can evaluate where it's wrong.
 
+**Privacy gate:** Before searching, use AskUserQuestion: "I'd like to search for what the world thinks about this space to inform our discussion. This sends generalized category terms (not your specific idea) to a search provider. OK to proceed?"
+Options: A) Yes, search away  B) Skip — keep this session private
+If B: skip this phase entirely and proceed to Phase 3. Use only in-distribution knowledge.
+
+When searching, use **generalized category terms** — never the user's specific product name, proprietary concept, or stealth idea. For example, search "task management app landscape" not "SuperTodo AI-powered task killer."
+
 If WebSearch is unavailable, skip this phase and note: "Search unavailable — proceeding with in-distribution knowledge only."
 
 **Startup mode:** WebSearch for:
diff --git a/office-hours/SKILL.md.tmpl b/office-hours/SKILL.md.tmpl
index 2db4c72f..7dbc6d32 100644
--- a/office-hours/SKILL.md.tmpl
+++ b/office-hours/SKILL.md.tmpl
@@ -242,6 +242,12 @@ Read ETHOS.md for the full Search Before Building framework (three layers, eurek
 
 After understanding the problem through questioning, search for what the world thinks. This is NOT competitive research (that's /design-consultation's job). This is understanding conventional wisdom so you can evaluate where it's wrong.
 
+**Privacy gate:** Before searching, use AskUserQuestion: "I'd like to search for what the world thinks about this space to inform our discussion. This sends generalized category terms (not your specific idea) to a search provider. OK to proceed?"
+Options: A) Yes, search away  B) Skip — keep this session private
+If B: skip this phase entirely and proceed to Phase 3. Use only in-distribution knowledge.
+
+When searching, use **generalized category terms** — never the user's specific product name, proprietary concept, or stealth idea. For example, search "task management app landscape" not "SuperTodo AI-powered task killer."
+
 If WebSearch is unavailable, skip this phase and note: "Search unavailable — proceeding with in-distribution knowledge only."
 
 **Startup mode:** WebSearch for:
diff --git a/setup b/setup
index 09d2282f..d67bdec1 100755
--- a/setup
+++ b/setup
@@ -205,6 +205,17 @@ create_agents_sidecar() {
       fi
     fi
   done
+
+  # Sidecar files that skills reference at runtime
+  for file in ETHOS.md; do
+    local src="$GSTACK_DIR/$file"
+    local dst="$agents_gstack/$file"
+    if [ -f "$src" ]; then
+      if [ -L "$dst" ] || [ ! -e "$dst" ]; then
+        ln -snf "$src" "$dst"
+      fi
+    fi
+  done
 }
 
 # 4. Install for Claude (default)