Mirror of https://github.com/garrytan/gstack.git (synced 2026-06-26 03:30:05 +02:00)
test(plan-tune): 5 cathedral E2E scenarios + touchfile registration
Plan-tune cathedral T16 (per D12 — all 5 in gate tier). One consolidated
file with five describeIfSelected scenarios, each selectable by its own
touchfile entry so they only run when the relevant code changes (or
EVALS_ALL=1 forces all):
  plan-tune-hook-capture: PostToolUse hook fires → question-log fills
  plan-tune-enforcement:  never-ask + marker + 2-way → deny+reason
                          + auto-decided event logged
  plan-tune-annotation:   declared profile + memory nugget
                          → additionalContext surfaced on defer
  plan-tune-codex-import: synthetic JSONL → import bin → log with
                          source=codex-import-marker
  plan-tune-dream-cycle:  apply proposal → re-fire question
                          → memory injected via additionalContext
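
The touchfile gating described above can be sketched roughly as follows. This is a minimal illustration, not the repo's actual registry: the `TOUCHFILES` map, the path prefixes in it, and the `selectScenarios` helper are all hypothetical names chosen for the example. Only the scenario names and the `EVALS_ALL=1` override come from this commit message.

```typescript
// Hypothetical registry: scenario name -> path prefixes that should trigger it.
// The prefixes are illustrative assumptions, not the real touchfile entries.
const TOUCHFILES: Record<string, string[]> = {
  'plan-tune-hook-capture': ['hosts/claude/hooks/question-log-hook', 'bin/gstack-question-log'],
  'plan-tune-enforcement': ['hosts/claude/hooks/question-preference-hook'],
};

// Return the scenarios to run for a set of changed files.
// EVALS_ALL=1 forces every registered scenario regardless of the diff.
function selectScenarios(
  changedFiles: string[],
  env: Record<string, string | undefined> = {},
): string[] {
  if (env.EVALS_ALL === '1') return Object.keys(TOUCHFILES);
  return Object.entries(TOUCHFILES)
    .filter(([, prefixes]) =>
      prefixes.some((prefix) => changedFiles.some((file) => file.startsWith(prefix))),
    )
    .map(([name]) => name);
}
```

A scenario whose prefixes match none of the changed files is skipped, which is what keeps these tests off PRs that do not touch cathedral code.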
Each scenario sets up an isolated git repo plus bins, scripts, and hooks
under tmp, then exercises the cathedral chain end-to-end against real
on-disk binaries (no mocks at the bin layer). Setting GSTACK_STATE_ROOT
to a path inside the fixture keeps the user's real ~/.gstack untouched.
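
As a rough sketch of that isolation, a fixture helper might look like the following. The `makeFixture` name, the `Fixture` shape, and the `repo`/`state` directory layout are assumptions for illustration; only `GSTACK_STATE_ROOT` comes from this commit. A real scenario would also run `git init` in the repo directory and copy in bins, scripts, and hooks, which is omitted here.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

// Hypothetical fixture shape; not the test file's actual types.
interface Fixture {
  root: string;
  repoDir: string;
  stateRoot: string;
  env: Record<string, string | undefined>;
  cleanup: () => void;
}

function makeFixture(prefix = 'plan-tune-e2e-'): Fixture {
  // Everything lives under one fresh temp directory so cleanup is a single rm.
  const root = fs.mkdtempSync(path.join(os.tmpdir(), prefix));
  const repoDir = path.join(root, 'repo');
  const stateRoot = path.join(root, 'state');
  fs.mkdirSync(repoDir, { recursive: true });
  fs.mkdirSync(stateRoot, { recursive: true });

  // Child processes spawned with this env write state under the fixture,
  // not under the user's home directory.
  const env = { ...process.env, GSTACK_STATE_ROOT: stateRoot };

  return {
    root,
    repoDir,
    stateRoot,
    env,
    cleanup: () => fs.rmSync(root, { recursive: true, force: true }),
  };
}
```

Passing `fixture.env` as the `env` option of every spawned bin is what keeps each scenario hermetic.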
These five complement the existing unit tests by proving the full
sub-process chain works (not just individual functions in isolation).
They DON'T spawn claude -p because the cathedral's substrate behavior is
deterministic — agent compliance is no longer the variable. The existing
test/skill-e2e-plan-tune.test.ts (plan-tune-inspect) still covers the
LLM-driven intent-routing behavior.
Cost: each scenario runs in ~1s at $0, because none of them invoke
claude -p. They are touchfile-gated, so they only run on PRs that touch
cathedral code.
Also fixes a bug found by the E2E: question-log-hook didn't pass the
incoming tool call's cwd to spawnSync when invoking gstack-question-log,
so the bin used the hook process's cwd (the repo root) instead of the
session's cwd. Result: log writes landed in the wrong project bucket.
Fix mirrors the same cwd-passing pattern from question-preference-hook.
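
The effect of the `cwd` option can be seen in isolation with a small sketch. It does not call gstack-question-log; the child here is just the current runtime printing its own working directory, and `childCwd` is a hypothetical helper name. The `cwd` guard has the same shape as the one in the diff below.

```typescript
import { spawnSync } from 'node:child_process';
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

// Spawn a child and return the working directory it observed.
function childCwd(cwd?: string): string {
  const res = spawnSync(process.execPath, ['-e', 'process.stdout.write(process.cwd())'], {
    encoding: 'utf-8',
    stdio: ['ignore', 'pipe', 'pipe'],
    timeout: 10_000,
    // Only pass cwd when it exists; undefined means "inherit the parent's cwd",
    // which is the behavior the hook had before the fix.
    cwd: cwd && fs.existsSync(cwd) ? cwd : undefined,
  });
  if (res.status !== 0) {
    throw new Error(`child exited ${res.status}: ${res.stderr || res.stdout}`);
  }
  return res.stdout;
}

// Stand-in for the session directory reported by the incoming tool call.
const sessionDir = fs.realpathSync(fs.mkdtempSync(path.join(os.tmpdir(), 'session-')));
console.log('with cwd:   ', childCwd(sessionDir));
console.log('without cwd:', childCwd());
```

Without the option the child reports the parent's directory, which for the hook meant the repo root rather than the project the session was actually in.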
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -204,7 +204,7 @@ function detectSkill(cwd: string | undefined): string {
   return 'unknown';
 }
 
-function spawnLog(payload: Record<string, unknown>): void {
+function spawnLog(payload: Record<string, unknown>, cwd?: string): void {
   // Locate the bin relative to this script's directory.
   const here = path.dirname(new URL(import.meta.url).pathname);
   // hosts/claude/hooks/ -> ../../../bin/
@@ -214,6 +214,9 @@ function spawnLog(payload: Record<string, unknown>): void {
     encoding: 'utf-8',
     stdio: ['ignore', 'pipe', 'pipe'],
     timeout: 3000,
+    // Run from the originating tool call's cwd so gstack-slug resolves to
+    // the project the user is actually in, not the hook script's location.
+    cwd: cwd && fs.existsSync(cwd) ? cwd : undefined,
   });
   if (res.status !== 0) {
     logHookError(`gstack-question-log exited ${res.status}: ${res.stderr || res.stdout}`);
@@ -274,7 +277,7 @@ async function main(): Promise<void> {
   if (recommended) payload.recommended = recommended.slice(0, 64);
   if (choice.free_text) payload.free_text = String(choice.free_text);
 
-  spawnLog(payload);
+  spawnLog(payload, stdin.cwd);
 }
 
 process.exit(0);