The basic bash detach fixed SIGTERM but a real run on a shared dev box hit
three more killers: cross-worktree API saturation (15-way concurrency x a
sibling worktree mass-timed-out the suite), a silent hang (periodic bun died
with no exit marker), and shared-/tmp log contamination (a concurrent
worktree's agent output bled into the log). Rewrite as a portable python3 tool
that bakes in all four fixes:
- fork + setsid: SIGTERM-proof (own session, survives harness polite-quit)
- caffeinate -i on macOS: no idle-sleep death
- --lock NAME (fcntl, machine-wide): concurrent worktrees SERIALIZE instead of
saturating the shared model API
- run-scoped default log (~/.gstack-dev/eval-runs/<label>-<slug>-<branch>-<ts>-<pid>):
no cross-worktree collision/contamination
- --timeout watchdog + a guaranteed '### gstack-detach EXIT=<code> ###' sentinel
on every terminal path: no silent hang, finished-vs-died always detectable
Guard test pins all four: detached pgid differs + outlives launcher, run-scoped
log path, watchdog EXIT=timeout, and lock serialization (second run WAITS).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Long agent-run jobs (30-60 min evals, benchmarks) die when the harness sends
SIGTERM to a background task's process group on turn boundaries / monitor
stops / interruptions (observed: 'script test:gate terminated by signal
SIGTERM'). gstack-detach runs the command in a fresh session (python3
os.setsid, or setsid on Linux, nohup fallback) so a group SIGTERM can't reach
it, and wraps it in caffeinate -i on macOS so idle-sleep can't kill it either.
Returns immediately; caller polls the logfile. Secrets stay in env, never argv.
The guard test pins the contract: the command runs in a different process
group than the caller and outlives the launching shell.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>