mirror of
https://github.com/garrytan/gstack.git
synced 2026-06-25 02:59:59 +02:00
test(make-pdf)+feat(diagram): review-wave test pins + skill transport hardening

Tests: indented-fence byte-for-byte replay + no-extraction-in-lists,
drive-letter local-path routing, $-pattern slot immunity, base64 source
round-trip ('A --> B' exact), existing-style merge preservation, DOCX
rasterize-failure surfaces source, srcSha256 + font-stack drift guards,
landscape veto asserted as some-portrait/no-landscape (layout-order-proof),
judge rubric cap lowered to 5 so it actually fails, vacuous error-shape test
removed honestly, tmpdir cleanup.

/diagram skill: base64 transport (template literals corrupted backticks/${
in sources), content-addressed staging with hash verification, and --tab-id
pinned on every browse call so a concurrent /qa session can't be clobbered.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
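
The base64 transport and hash verification described in the commit message could look roughly like the sketch below. It is not taken from the gstack source: the names `stageSource` and `decodeStaged` are hypothetical, and SHA-256 over the UTF-8 bytes is assumed from the `srcSha256` guard the message mentions. The point it illustrates is that a base64 payload contains no backticks or `${`, so it survives embedding in a template literal, and that the receiver can reject a payload whose hash does not match.

```typescript
import { createHash } from "node:crypto";

// Hypothetical shape of a staged diagram source (names are assumptions).
interface StagedSource {
  b64: string;    // base64 of the UTF-8 source bytes
  sha256: string; // hex digest of the same bytes, usable as a content address
}

// Encode the source so it can pass through template literals untouched.
function stageSource(source: string): StagedSource {
  const bytes = Buffer.from(source, "utf8");
  return {
    b64: bytes.toString("base64"),
    sha256: createHash("sha256").update(bytes).digest("hex"),
  };
}

// Decode and verify; throws if the bytes do not match the expected digest.
function decodeStaged(b64: string, expectedSha256: string): string {
  const bytes = Buffer.from(b64, "base64");
  const actual = createHash("sha256").update(bytes).digest("hex");
  if (actual !== expectedSha256) {
    throw new Error(`source hash mismatch: expected ${expectedSha256}, got ${actual}`);
  }
  return bytes.toString("utf8");
}
```

Keying the staged file by its digest would also let the reader detect a stale or overwritten file, since the name and the content must agree.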
@@ -138,7 +138,8 @@ ${mmd}
 Score 1-10 on: faithfulness to the ask (are the named stages present and
 correctly ordered?), label quality (short node labels, detail on edges),
 and readable size (5-15 nodes, not a wall). A diagram that misses the
-failure/diagnostic path entirely caps at 6.
+failure/diagnostic path entirely caps at 5 — that path is an explicitly
+named requirement, so omitting it must fail the run.
 
 Respond with JSON: {"score": N, "reasoning": "..."}`,
 );
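
For illustration, a test consuming the judge's reply might parse and gate it as sketched below. Only the reply shape `{"score": N, "reasoning": "..."}` and the 1-10 range come from the prompt in the diff. The names `parseVerdict` and `passes` are hypothetical, and the pass threshold of 6 is an assumption inferred from the commit message, which says a cap of 5 "actually fails" where the old cap of 6 did not.

```typescript
// Reply shape requested by the judge prompt.
interface JudgeVerdict {
  score: number;
  reasoning: string;
}

// Assumed threshold: not stated in the source, chosen so 5 fails and 6 passes.
const PASS_THRESHOLD = 6;

// Parse and validate the judge's raw JSON reply; throws on malformed input.
function parseVerdict(raw: string): JudgeVerdict {
  const parsed: unknown = JSON.parse(raw);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("judge response is not a JSON object");
  }
  const { score, reasoning } = parsed as Record<string, unknown>;
  if (typeof score !== "number" || !Number.isFinite(score) || score < 1 || score > 10) {
    throw new Error("judge score must be a number from 1 to 10");
  }
  if (typeof reasoning !== "string") {
    throw new Error("judge reasoning must be a string");
  }
  return { score, reasoning };
}

// A capped diagram (score <= 5) fails under the assumed threshold.
function passes(verdict: JudgeVerdict, threshold: number = PASS_THRESHOLD): boolean {
  return verdict.score >= threshold;
}
```

Validating the range before comparing keeps a malformed reply from being silently treated as a pass or a fail.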