Files
Garry Tan 0ac7ef4e81 fix: harden planted-bug eval prompt for reliable form testing
Phase 3 was too vague ("click every nav link") causing the agent to
wander instead of systematically testing form fields. Now explicitly
directs: fill every input, clear it, try invalid values, submit and
check console. Added Phase 4 finalize step to ensure report is updated
with all findings.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-14 13:28:18 -05:00
..