feat(hooks): PostToolUse capture hook for AskUserQuestion

Plan-tune cathedral T5. Closes the substrate hole that motivated this
entire branch: agent-compliance-only logging produced zero events in weeks
of dogfood. PostToolUse hook captures every AUQ fire deterministically.

What ships:
- hosts/claude/hooks/question-log-hook.ts — TS hook that reads Claude
  Code's hook stdin, walks tool_input.questions[*], extracts user choice
  + recommended option from tool_response, spawns gstack-question-log per
  question.
- hosts/claude/hooks/question-log-hook — bash shim Claude Code's hook
  runner invokes; execs bun against the .ts file.
- Marker-first question_id extraction (D18 progressive markers):
  <gstack-qid:foo-bar> stripped from question text, used as the id.
  Hash fallback hook-<sha1[:10]> for unmarked questions (observed-only,
  never used as preference key — D18 hash drift mitigation).
- (recommended) label parsing for the user_choice/recommended fields,
  with refuse-on-ambiguous when two labels are present (D2 safety).
- Free-text capture: source=auq-other + free_text field when user picks
  Other and types (Layer 8 dream cycle input).
- Matcher covers both native AskUserQuestion and mcp__*__AskUserQuestion
  (Codex/Conductor catch from outside voice review).
- Crash safety: always exits 0; errors land in ~/.gstack/hook-errors.log
  so the user's session is never blocked by a hook failure.

gstack-question-log extended to:
- Accept `source` field (default 'agent', new values: hook, auq-other,
  auto-decided, codex-import-marker, codex-import-pattern).
- Accept `tool_use_id` (<=128 chars) for dedup.
- Composite dedup on (source, tool_use_id) across the last 100 lines —
  protects against hook + preamble both firing on the same tool call
  (D3 belt+suspenders).
- Async fire `gstack-developer-profile --derive` after each successful
  write so inferred.sample_size actually grows (D17 — without this, the
  cathedral's "before 0, after >0" metric never moves).
- GSTACK_QUESTION_LOG_NO_DERIVE=1 escape hatch for tests.

9 new unit tests covering capture, marker extraction, MCP variant,
free-text, dedup, ambiguous-recommended safety, crash paths. All pass
plus the existing 88 tests across related files.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Garry Tan
2026-05-27 07:42:46 -07:00
parent 2147532c07
commit a8a0447870
4 changed files with 658 additions and 2 deletions
+80 -2
View File
@@ -50,12 +50,48 @@ if (!j.skill || !/^[a-z0-9-]+\$/.test(j.skill)) {
process.exit(1);
}
// Required: question_id (kebab-case, <=64 chars)
// Required: question_id (kebab-case, <=64 chars).
// Cathedral T5: hook-sourced events use 'hook-<10-char-hash>' which is
// kebab-case-compatible and passes the same regex.
if (!j.question_id || !/^[a-z0-9-]+\$/.test(j.question_id) || j.question_id.length > 64) {
process.stderr.write('gstack-question-log: invalid question_id, must be kebab-case <=64 chars\n');
process.exit(1);
}
// Optional: source — tags which writer produced this event.
// 'agent' (default) — preamble-driven write from inside the running agent
// 'hook' — PostToolUse hook captured it deterministically (T5)
// 'auq-other' — user picked 'Other' and typed free text (Layer 8)
// 'auto-decided' — PreToolUse enforcement hook substituted the answer (T6)
// 'codex-import-marker' / 'codex-import-pattern' — T9 backfill from Codex
const ALLOWED_SOURCES = ['agent', 'hook', 'auq-other', 'auto-decided', 'codex-import-marker', 'codex-import-pattern'];
if (j.source !== undefined) {
if (!ALLOWED_SOURCES.includes(j.source)) {
process.stderr.write('gstack-question-log: invalid source, must be one of: ' + ALLOWED_SOURCES.join(', ') + '\n');
process.exit(1);
}
} else {
j.source = 'agent';
}
// Optional: tool_use_id — Claude Code hook stdin field; used for dedup.
if (j.tool_use_id !== undefined) {
if (typeof j.tool_use_id !== 'string' || j.tool_use_id.length > 128) {
process.stderr.write('gstack-question-log: tool_use_id must be string <=128 chars\n');
process.exit(1);
}
}
// Optional: free_text — sanitize (no newlines, <=300 chars).
if (j.free_text !== undefined) {
if (typeof j.free_text !== 'string') {
process.stderr.write('gstack-question-log: free_text must be string\n');
process.exit(1);
}
if (j.free_text.length > 300) j.free_text = j.free_text.slice(0, 300);
j.free_text = j.free_text.replace(/\n+/g, ' ');
}
// Required: question_summary (non-empty, <=200 chars, no newlines)
if (typeof j.question_summary !== 'string' || !j.question_summary.length) {
process.stderr.write('gstack-question-log: question_summary required\n');
@@ -165,7 +201,49 @@ if [ $VALIDATE_RC -ne 0 ] || [ -z "$VALIDATED" ]; then
exit 1
fi
echo "$VALIDATED" >> "$GSTACK_HOME/projects/$SLUG/question-log.jsonl"
LOG_FILE="$GSTACK_HOME/projects/$SLUG/question-log.jsonl"
# Cathedral T5: composite-source dedup. If this exact (source, tool_use_id)
# was already logged within the last 100 lines, skip — protects against
# hook + agent both writing the same fire (D3 plan-tune cathedral decision).
# Lookup is bounded so the bin stays cheap on hot paths.
DEDUP_SKIP=""
if [ -f "$LOG_FILE" ]; then
DEDUP_SKIP=$(VALIDATED_JSON="$VALIDATED" LOG_FILE_PATH="$LOG_FILE" bun -e '
const fs = require("fs");
const j = JSON.parse(process.env.VALIDATED_JSON);
if (!j.tool_use_id) { console.log(""); process.exit(0); }
const want = j.source + ":" + j.tool_use_id;
const lines = fs.readFileSync(process.env.LOG_FILE_PATH, "utf-8").trim().split("\n").slice(-100);
for (const ln of lines) {
try {
const p = JSON.parse(ln);
if (p.source && p.tool_use_id && (p.source + ":" + p.tool_use_id) === want) {
console.log("dup");
process.exit(0);
}
} catch {}
}
console.log("");
' 2>/dev/null)
fi
if [ "$DEDUP_SKIP" = "dup" ]; then
echo "DEDUP: skipped (source=$(echo "$VALIDATED" | bun -e 'const j=JSON.parse(await Bun.stdin.text()); console.log(j.source);'), tool_use_id duplicate)"
exit 0
fi
echo "$VALIDATED" >> "$LOG_FILE"
# Cathedral T5: fire-and-forget --derive so inferred dimensions stay current
# without per-event latency (D17). Sub-second op; output suppressed; never
# blocks the hook caller. Skipped via GSTACK_QUESTION_LOG_NO_DERIVE=1 for
# tests that don't want the side effect.
if [ -z "${GSTACK_QUESTION_LOG_NO_DERIVE:-}" ]; then
(
nohup "$SCRIPT_DIR/gstack-developer-profile" --derive >/dev/null 2>&1 &
) >/dev/null 2>&1
fi
# NOTE: question-log.jsonl is deliberately NOT enqueued for gbrain-sync.
# Per Codex v2 review, audit/derivation data stays local alongside the