mirror of
https://github.com/garrytan/gstack.git
synced 2026-05-02 03:35:09 +02:00
76803d789a
Adds comprehensive eval infrastructure:
- Tier 1 (free): 13 new static tests: cross-skill path consistency, QA structure validation, greptile format, planted-bug fixture validation
- Tier 2 (Agent SDK E2E): /qa quick, /review with pre-built git repo, 3 planted-bug outcome evals (static, SPA, checkout; each with 5 bugs)
- Tier 3 (LLM judge): QA workflow quality, health rubric clarity, cross-skill consistency, baseline score pinning

New fixtures: 3 HTML pages with 15 total planted bugs, ground truth JSON, review-eval-vuln.rb, eval-baselines.json.
Shared llm-judge.ts helper (DRY).
Unified EVALS=1 flag replaces SKILL_E2E + ANTHROPIC_API_KEY checks.
`bun run test:evals` runs everything that costs money (~$4/run).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
52 lines
1.7 KiB
HTML
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>QA Eval — Widget Dashboard</title>
  <style>
    body { font-family: sans-serif; padding: 20px; }
    nav { margin-bottom: 20px; }
    nav a { margin-right: 15px; color: #0066cc; }
    form { margin: 20px 0; padding: 15px; border: 1px solid #ccc; border-radius: 4px; }
    input { display: block; margin: 8px 0; padding: 6px; }
    button { padding: 8px 16px; margin-top: 8px; }
    .stats { margin: 20px 0; }
    img { display: block; margin: 20px 0; }
  </style>
</head>
<body>
  <nav>
    <a href="/">Home</a>
    <a href="/about">About</a>
    <a href="/nonexistent-404-page">Resources</a> <!-- BUG 1: broken link (404) -->
  </nav>

  <h1>Widget Dashboard</h1>

  <form id="contact">
    <h2>Contact Us</h2>
    <input type="text" name="name" placeholder="Name" required>
    <input type="email" name="email" placeholder="Email" required>
    <button type="submit" disabled>Submit</button> <!-- BUG 2: submit button permanently disabled -->
  </form>

  <div class="stats" style="width: 400px; overflow: hidden;">
    <h2>Statistics</h2>
    <p style="white-space: nowrap; width: 600px;">
      Revenue: $1,234,567.89 | Users: 45,678 | Conversion: 3.2% | Growth: +12.5% MoM | Retention: 87.3%
    </p> <!-- BUG 3: content overflow/clipping — text wider than container with overflow:hidden -->
  </div>

  <img src="/logo.png"> <!-- BUG 4: missing alt text on image -->

  <footer>
    <p>© 2026 Widget Co. All rights reserved.</p>
  </footer>

  <script>
    console.error("TypeError: Cannot read properties of undefined (reading 'map')");
    // BUG 5: console error on page load
  </script>
</body>
</html>
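A Tier 1 static check of this fixture could confirm that every planted bug is still annotated before the more expensive evals run. The sketch below is hypothetical and is not the repository's actual test: `plantedBugIds` and `hasAllPlantedBugs` are invented names, and the code assumes bugs are marked with `BUG <n>:` inside HTML or JavaScript comments, numbered consecutively from 1, as they are in this file.

```javascript
// Hypothetical helper (not from the repository): collect the numeric IDs
// of planted-bug annotations of the form "BUG <n>:" found in a fixture.
function plantedBugIds(html) {
  const ids = [];
  // Matches "BUG 1:", "BUG 2:", ... in both <!-- --> and // comments.
  for (const match of html.matchAll(/\bBUG (\d+):/g)) {
    ids.push(Number(match[1]));
  }
  // Sort numerically so the result does not depend on source order.
  return ids.sort((a, b) => a - b);
}

// Assumed invariant: the fixture annotates exactly bugs 1..expectedCount.
function hasAllPlantedBugs(html, expectedCount) {
  const ids = plantedBugIds(html);
  return ids.length === expectedCount && ids.every((id, i) => id === i + 1);
}
```

For this page, `hasAllPlantedBugs(fixtureSource, 5)` would be the relevant call, where `fixtureSource` is the file's text loaded by whatever means the test runner provides. The check only verifies the annotations, not that the bugs themselves still reproduce in a browser.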