mirror of
https://github.com/garrytan/gstack.git
synced 2026-05-01 19:25:10 +02:00
fix: /qa never refuses browser testing on backend-only changes (#202)
* feat: QA skill never refuses browser testing Add anti-refusal guardrails to /qa and /qa-only skills. When the user invokes /qa, the skill must always use the browser — even if the diff shows only backend/config changes with no obvious UI surface. Falls back to Quick mode (homepage + top 5 nav targets) when no specific pages are identified from the diff. Adds LLM-as-judge eval to verify the anti-refusal behavior. * chore: bump version and changelog (v0.8.1) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
@@ -302,6 +302,8 @@ This is the **primary mode** for developers verifying their work. When the user
|
||||
- API endpoints → test them directly with \`$B js "await fetch('/api/...')"\`
|
||||
- Static pages (markdown, HTML) → navigate to them directly
|
||||
|
||||
**If no obvious pages/routes are identified from the diff:** Do not skip browser testing. The user invoked /qa because they want browser-based verification. Fall back to Quick mode — navigate to the homepage, follow the top 5 navigation targets, check console for errors, and test any interactive elements found. Backend, config, and infrastructure changes affect app behavior — always verify the app still works.
|
||||
|
||||
3. **Detect the running app** — check common local dev ports:
|
||||
\`\`\`bash
|
||||
$B goto http://localhost:3000 2>/dev/null && echo "Found app on :3000" || \\
|
||||
@@ -555,7 +557,8 @@ Minimum 0 per category.
|
||||
8. **Depth over breadth.** 5-10 well-documented issues with evidence > 20 vague descriptions.
|
||||
9. **Never delete output files.** Screenshots and reports accumulate — that's intentional.
|
||||
10. **Use \`snapshot -C\` for tricky UIs.** Finds clickable divs that the accessibility tree misses.
|
||||
11. **Show screenshots to the user.** After every \`$B screenshot\`, \`$B snapshot -a -o\`, or \`$B responsive\` command, use the Read tool on the output file(s) so the user can see them inline. For \`responsive\` (3 files), Read all three. This is critical — without it, screenshots are invisible to the user.`;
|
||||
11. **Show screenshots to the user.** After every \`$B screenshot\`, \`$B snapshot -a -o\`, or \`$B responsive\` command, use the Read tool on the output file(s) so the user can see them inline. For \`responsive\` (3 files), Read all three. This is critical — without it, screenshots are invisible to the user.
|
||||
12. **Never refuse to use the browser.** When the user invokes /qa or /qa-only, they are requesting browser-based testing. Never suggest evals, unit tests, or other alternatives as a substitute. Even if the diff appears to have no UI changes, backend changes affect app behavior — always open the browser and test.`;
|
||||
}
|
||||
|
||||
function generateDesignReviewLite(_ctx: TemplateContext): string {
|
||||
|
||||
Reference in New Issue
Block a user