docs(CLAUDE.md): source ANTHROPIC/OPENAI keys from ~/.zshrc for paid evals

Conductor workspaces don't inherit the interactive shell env, so both API keys are absent from the default process env even though they're set in ~/.zshrc. Documents the source-from-zshrc pattern (grep + eval, never echo the value) plus the Agent SDK gotcha: do NOT pass env as an object to runAgentSdkTest — mutate process.env ambiently and restore in finally. Discovered this during the brain-privacy-gate E2E. First run failed at SDK auth with 401; second failed because explicit env handoff bypassed the SDK's own auth routing. Fix pattern now codified so the next paid-eval session in a Conductor workspace doesn't hit the same two dead ends. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-17 07:10:12 +02:00 · 2026-04-24 01:26:24 -07:00
parent 371a7e684a
commit 3139cf40fb
1 changed files with 20 additions and 0 deletions
@@ -26,6 +26,26 @@ bun run slop:diff     # slop findings in files changed on this branch only

 `test:evals` requires `ANTHROPIC_API_KEY`. Codex E2E tests (`test/codex-e2e.test.ts`)
 use Codex's own auth from `~/.codex/` config — no `OPENAI_API_KEY` env var needed.
+
+**Where the keys live on this machine.** Conductor workspaces don't inherit the
+user's interactive shell env, so `ANTHROPIC_API_KEY` and `OPENAI_API_KEY` aren't
+in the default process env. Before running any paid eval / E2E, source them from
+`~/.zshrc` (that's where Garry keeps them):
+
+```bash
+bash -c '
+  eval "$(grep -E "^export (ANTHROPIC_API_KEY|OPENAI_API_KEY)=" ~/.zshrc)"
+  export ANTHROPIC_API_KEY OPENAI_API_KEY
+  EVALS=1 EVALS_TIER=periodic bun test test/skill-e2e-<whatever>.test.ts
+'
+```
+
+Do not echo the key value anywhere (stdout, logs, shell history). The grep+eval
+pattern keeps it in process env only. When passing to a test's Agent SDK, do NOT
+pass `env: {...}` to `runAgentSdkTest` — the SDK's auth pipeline doesn't pick up
+the key the same way when env is supplied as an object (confirmed failure mode).
+Instead, mutate `process.env.ANTHROPIC_API_KEY` ambiently before the call and
+restore in `finally`.
 E2E tests stream progress in real-time (tool-by-tool via `--output-format stream-json
 --verbose`). Results are persisted to `~/.gstack-dev/evals/` with auto-comparison
 against the previous run.