test(eval): bump maxTurns to 15 for claude-dedicated-tools-vs-bash

The prompt 'list every TypeScript file under src/ and tell me what
each exports' needs 1 turn for Glob + ~5 for Reads + 1 for summary.
Default maxTurns=5 was not enough; prior run threw from the SDK on
this fixture and tanked the whole periodic eval.

Bumping to 15 gives headroom. The runner now also handles max-turns
gracefully even if a future fixture underestimates, so this is belt
and suspenders.
This commit is contained in:
Garry Tan
2026-04-23 11:34:51 -07:00
parent 5294c65777
commit a9a3ac6d2a
+3
View File
@@ -253,6 +253,9 @@ export const OVERLAY_FIXTURES: OverlayFixture[] = [
trials: 10,
concurrency: 3,
direction: 'lower_is_better',
// 5 files + summary = needs more than default 5 turns. SDK throws
// instead of returning a result when it hits the cap.
maxTurns: 15,
setupWorkspace: (dir) => {
fs.mkdirSync(path.join(dir, 'src'), { recursive: true });
fs.writeFileSync(path.join(dir, 'src', 'index.ts'), "export const x = 1;\n");