Add 8 periodic-tier E2E tests that spawn claude -p with the Skill tool
enabled and the skill installed in .claude/skills/. These exercise
the ROUTING path — the actual thing that broke with /checkpoint.
Prior tests hand-fed the Save section as a prompt; these invoke the
slash-command for real and verify the Skill tool was called.

Tests (~$0.20-$0.40 each, ~$2 total per run):

1. context-save-routing
   Prompts "/context-save wintermute progress". Asserts the Skill
   tool was invoked with skill:"context-save" AND a file landed in
   the checkpoints dir. Guards against future upstream collisions
   (if Claude Code ships /context-save as a built-in, this fails).

2. context-save-then-restore-roundtrip
   Two slash commands in one session: /context-save <marker>, then
   /context-restore. Asserts both Skill invocations happened AND
   restore output contains the magic marker from the save.

3. context-restore-fragment-match
   Seeds three saves (alpha, middle-payments, omega). Runs
   /context-restore payments. Asserts the payments file loaded and
   the other two did NOT leak into output. Proves fragment-matching
   works (previously untested; we only tested the "newest" default).

4. context-restore-empty-state
   No saves seeded. /context-restore should produce a graceful
   "no saved contexts yet"-style message, not crash or list cwd.

5. context-restore-list-delegates
   /context-restore list should redirect to /context-save list
   (our explicit design: list lives on the save side). Asserts
   the output mentions "context-save list".

6. context-restore-legacy-compat
   Seeds a pre-rename save file (old /checkpoint format) in the
   checkpoints/ dir. Runs /context-restore. Asserts the legacy
   content loads cleanly. Proves the storage-path stability
   promise (users' old saves still work).

7. context-save-list-current-branch
   Seeds saves on 3 branches (main, feat/alpha, feat/beta).
   Current branch is main. Asserts list shows main, hides others.

8. context-save-list-all-branches
   Same seed. /context-save list --all. Asserts all 3 branches
   show up in output.

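The "Skill tool was invoked" assertion shared by the routing tests can be
sketched as below. This is a minimal illustration, not the test helpers from
this commit: the event shape (newline-delimited JSON whose assistant messages
carry tool_use blocks) is an assumption about the transcript format, and
parseTranscript and skillWasInvoked are hypothetical names.

```typescript
// Assumed, simplified shape of one transcript event. The real output of
// `claude -p` may differ; adjust these types to match what the harness sees.
interface ToolUseBlock {
  type: "tool_use";
  name: string;
  input: Record<string, unknown>;
}
interface TextBlock {
  type: "text";
  text: string;
}
interface TranscriptEvent {
  type: string;
  message?: { content?: Array<ToolUseBlock | TextBlock> };
}

// Hypothetical helper: parses newline-delimited JSON, skipping blank lines.
function parseTranscript(raw: string): TranscriptEvent[] {
  return raw
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as TranscriptEvent);
}

// Hypothetical helper: true if any event contains a Skill tool call whose
// `skill` input equals the expected skill name.
function skillWasInvoked(events: TranscriptEvent[], skill: string): boolean {
  return events.some((event) =>
    (event.message?.content ?? []).some(
      (block) =>
        block.type === "tool_use" &&
        block.name === "Skill" &&
        block.input.skill === skill,
    ),
  );
}

// Synthetic two-line transcript, for illustration only.
const sample = [
  JSON.stringify({
    type: "assistant",
    message: {
      content: [
        { type: "tool_use", name: "Skill", input: { skill: "context-save" } },
      ],
    },
  }),
  JSON.stringify({ type: "result" }),
].join("\n");

const events = parseTranscript(sample);
console.log(skillWasInvoked(events, "context-save")); // true
console.log(skillWasInvoked(events, "context-restore")); // false
```

Checking the tool call itself, rather than only the saved file, is what
separates a routing test from one that hand-feeds the Save section as a prompt.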
touchfiles.ts: all 8 registered in both E2E_TOUCHFILES and E2E_TIERS
as 'periodic'. Touchfile deps scoped per-test (save-only tests don't
run when only context-restore changes, etc.).
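
A rough sketch of what per-test touchfile scoping could look like. Only the
names E2E_TOUCHFILES, E2E_TIERS, and the 'periodic' tier come from this commit;
the paths, the prefix-matching rule, and selectTests are assumptions made for
illustration and may not match the real touchfiles.ts.

```typescript
// Illustrative entries only: these paths are hypothetical, not the repo's
// actual layout.
const E2E_TOUCHFILES: Record<string, string[]> = {
  "context-save-routing": ["context-save/"],
  "context-restore-empty-state": ["context-restore/"],
  "context-save-then-restore-roundtrip": ["context-save/", "context-restore/"],
};

const E2E_TIERS: Record<string, string> = {
  "context-save-routing": "periodic",
  "context-restore-empty-state": "periodic",
  "context-save-then-restore-roundtrip": "periodic",
};

// Hypothetical selector: a test runs when it belongs to the requested tier
// and any changed file sits under one of its touchfile prefixes. The real
// matching may use globs instead of prefixes.
function selectTests(changedFiles: string[], tier: string): string[] {
  return Object.keys(E2E_TOUCHFILES).filter(
    (name) =>
      E2E_TIERS[name] === tier &&
      E2E_TOUCHFILES[name].some((prefix) =>
        changedFiles.some((file) => file.startsWith(prefix)),
      ),
  );
}

// A change under context-restore/ selects the restore-only and roundtrip
// tests and skips context-save-routing.
console.log(selectTests(["context-restore/SKILL.md"], "periodic"));
```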

Coverage jump: smoke-test level (~5/10) → truly E2E (~9.5/10) for the
context-skills surface area. Combined with the 21 Tier-2 hardening
tests (free, 142ms) from the prior commit, every non-trivial code
path has either a live-fire assertion or a bash-level unit test.