test(make-pdf)+feat(diagram): review-wave test pins + skill transport hardening

Tests: indented-fence byte-for-byte replay + no-extraction-in-lists,
drive-letter local-path routing, $-pattern slot immunity, base64 source
round-trip ('A --> B' exact), existing-style merge preservation, DOCX
rasterize-failure surfaces source, srcSha256 + font-stack drift guards,
landscape veto asserted as some-portrait/no-landscape (layout-order-proof),
judge rubric cap lowered to 5 so it actually fails, vacuous error-shape test
removed honestly, tmpdir cleanup.

/diagram skill: base64 transport (template literals corrupted backticks/${
in sources), content-addressed staging with hash verification, and --tab-id
pinned on every browse call so a concurrent /qa session can't be clobbered.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
This commit is contained in:
Garry Tan
2026-06-12 07:57:42 -07:00
parent f2a03d43cb
commit 91ba37530d
8 changed files with 174 additions and 70 deletions
+34 -21
View File
@@ -789,6 +789,9 @@ Decide the output directory: `./diagrams/` when the cwd is a git repo
## Step 2 — Stage the render bundle (once per session)
The staged copy is content-addressed (same convention as make-pdf's pre-pass),
so concurrent sessions and mixed gstack versions never clobber each other:
```bash
BUNDLE=""
for c in "$HOME/.claude/skills/gstack/lib/diagram-render/dist/diagram-render.html" \
@@ -796,37 +799,46 @@ for c in "$HOME/.claude/skills/gstack/lib/diagram-render/dist/diagram-render.htm
[ -f "$c" ] && BUNDLE="$c" && break
done
[ -z "$BUNDLE" ] && echo "BUNDLE_MISSING — run: cd ~/.claude/skills/gstack && bun run build:diagram-render" && exit 1
STAGED="/tmp/gstack-diagram-render-skill.html"
cp "$BUNDLE" "$STAGED"
$B newtab >/dev/null 2>&1 || true
$B load-html "$STAGED"
$B wait '#done'
echo "RENDER_TAB_READY"
SHA=$(shasum -a 256 "$BUNDLE" | cut -c1-16)
STAGED="/tmp/gstack-diagram-render-$SHA.html"
[ -f "$STAGED" ] && shasum -a 256 "$STAGED" | grep -q "^$SHA" || { cp "$BUNDLE" "$STAGED.$$" && mv "$STAGED.$$" "$STAGED"; }
TAB=$($B newtab --json | sed -n 's/.*"tabId":\s*\([0-9]*\).*/\1/p')
[ -z "$TAB" ] && echo "TAB_OPEN_FAILED — daemon busy? check browse status" && exit 1
$B load-html "$STAGED" --tab-id "$TAB"
$B wait '#done' --tab-id "$TAB"
echo "RENDER_TAB_READY: tab $TAB"
```
Remember `$TAB`**every** `$B js` / `$B wait` / `$B closetab` below MUST pass
`--tab-id $TAB`. Without it, calls hit whatever tab is active, which may be a
live /qa or /scrape session sharing the daemon.
If `BUNDLE_MISSING`: stop and show the user the build command. Do not improvise
a CDN fallback — offline is the contract.
## Step 3 — Render the triplet
Write the mermaid source to `<outdir>/<slug>.mmd` first (Write tool). Then,
with `MMD` holding the mermaid text (escape for a JS string literal — the
safest path is reading it back inside the page is NOT possible; pass it via a
single-quoted JS template through `$B js`):
Write the mermaid source to `<outdir>/<slug>.mmd` first (Write tool). The page
cannot read files itself, so ship the source in via **base64** — never splice
file contents into a JS template literal (backticks, `${`, and backslashes in
the source would be interpreted and corrupt it):
```bash
# SVG (always)
$B js "window.__renderMermaid('diagram-1', \`$(cat <outdir>/<slug>.mmd)\`).then(s => { window.__svg = s; return 'SVG OK ' + s.length })"
$B js "window.__svg" --out <outdir>/<slug>.svg
# SVG (always). atob() decodes the base64 inside the page.
$B js --tab-id "$TAB" "window.__renderMermaid('diagram-1', atob('$(base64 < <outdir>/<slug>.mmd | tr -d '\n')')).then(s => { window.__svg = s; return 'SVG OK ' + s.length })"
$B js --tab-id "$TAB" "window.__svg" --out <outdir>/<slug>.svg
# PNG at 300dpi of a 6.5in placement (1950px)
$B js "window.__rasterize(window.__svg, 1950)" --out <outdir>/<slug>.png
$B js --tab-id "$TAB" "window.__rasterize(window.__svg, 1950)" --out <outdir>/<slug>.png
# Editable scene (flowcharts only)
$B js "window.__mermaidToExcalidraw(\`$(cat <outdir>/<slug>.mmd)\`).then(j => { window.__scene = j; return 'SCENE OK ' + JSON.parse(j).elements.length + ' elements' })"
$B js "window.__scene" --out <outdir>/<slug>.excalidraw
$B js --tab-id "$TAB" "window.__mermaidToExcalidraw(atob('$(base64 < <outdir>/<slug>.mmd | tr -d '\n')')).then(j => { window.__scene = j; return 'SCENE OK ' + JSON.parse(j).elements.length + ' elements' })"
$B js --tab-id "$TAB" "window.__scene" --out <outdir>/<slug>.excalidraw
```
Note: `atob()` yields Latin-1; for sources with non-ASCII labels use
`decodeURIComponent(escape(atob('…')))` to recover UTF-8 exactly.
If the mermaid render returns an error, show the parse error to the user, fix
the mermaid, and retry — do not hand the user a broken source file. If
`__mermaidToExcalidraw` fails on a non-flowchart type, skip the `.excalidraw`
@@ -842,12 +854,13 @@ artifact and deliver the rest with the limitation note from Step 1.
source is the single source of truth.
Re-rendering an EDITED `.excalidraw` (user round-trip): load the scene file
and export without touching the mermaid:
and export without touching the mermaid — base64 transport again, since scene
JSON is full of quotes and backslashes:
```bash
$B js "window.__excalidrawToSvg(\`$(cat <outdir>/<slug>.excalidraw)\`).then(s => { window.__svg = s; return 'OK' })"
$B js "window.__svg" --out <outdir>/<slug>.svg
$B js "window.__rasterize(window.__svg, 1950)" --out <outdir>/<slug>.png
$B js --tab-id "$TAB" "window.__excalidrawToSvg(atob('$(base64 < <outdir>/<slug>.excalidraw | tr -d '\n')')).then(s => { window.__svg = s; return 'OK' })"
$B js --tab-id "$TAB" "window.__svg" --out <outdir>/<slug>.svg
$B js --tab-id "$TAB" "window.__rasterize(window.__svg, 1950)" --out <outdir>/<slug>.png
```
## Rules
@@ -856,7 +869,7 @@ $B js "window.__rasterize(window.__svg, 1950)" --out <outdir>/<slug>.png
a diagram. If rendering is impossible (bundle missing, browse down), say so
and stop.
- **Cleanup:** close the render tab when the conversation's diagram work is
done (`$B closetab`), not between diagrams.
done (`$B closetab $TAB`), not between diagrams.
- For diagrams destined for a PDF: remind the user that `make-pdf` renders
` ```mermaid ` fences natively — embedding the `.mmd` in their markdown is
better than embedding the PNG.
+34 -21
View File
@@ -58,6 +58,9 @@ Decide the output directory: `./diagrams/` when the cwd is a git repo
## Step 2 — Stage the render bundle (once per session)
The staged copy is content-addressed (same convention as make-pdf's pre-pass),
so concurrent sessions and mixed gstack versions never clobber each other:
```bash
BUNDLE=""
for c in "$HOME/.claude/skills/gstack/lib/diagram-render/dist/diagram-render.html" \
@@ -65,37 +68,46 @@ for c in "$HOME/.claude/skills/gstack/lib/diagram-render/dist/diagram-render.htm
[ -f "$c" ] && BUNDLE="$c" && break
done
[ -z "$BUNDLE" ] && echo "BUNDLE_MISSING — run: cd ~/.claude/skills/gstack && bun run build:diagram-render" && exit 1
STAGED="/tmp/gstack-diagram-render-skill.html"
cp "$BUNDLE" "$STAGED"
$B newtab >/dev/null 2>&1 || true
$B load-html "$STAGED"
$B wait '#done'
echo "RENDER_TAB_READY"
SHA=$(shasum -a 256 "$BUNDLE" | cut -c1-16)
STAGED="/tmp/gstack-diagram-render-$SHA.html"
[ -f "$STAGED" ] && shasum -a 256 "$STAGED" | grep -q "^$SHA" || { cp "$BUNDLE" "$STAGED.$$" && mv "$STAGED.$$" "$STAGED"; }
TAB=$($B newtab --json | sed -n 's/.*"tabId":\s*\([0-9]*\).*/\1/p')
[ -z "$TAB" ] && echo "TAB_OPEN_FAILED — daemon busy? check browse status" && exit 1
$B load-html "$STAGED" --tab-id "$TAB"
$B wait '#done' --tab-id "$TAB"
echo "RENDER_TAB_READY: tab $TAB"
```
Remember `$TAB` — **every** `$B js` / `$B wait` / `$B closetab` below MUST pass
`--tab-id $TAB`. Without it, calls hit whatever tab is active, which may be a
live /qa or /scrape session sharing the daemon.
If `BUNDLE_MISSING`: stop and show the user the build command. Do not improvise
a CDN fallback — offline is the contract.
## Step 3 — Render the triplet
Write the mermaid source to `<outdir>/<slug>.mmd` first (Write tool). Then,
with `MMD` holding the mermaid text (escape for a JS string literal — the
safest path is reading it back inside the page is NOT possible; pass it via a
single-quoted JS template through `$B js`):
Write the mermaid source to `<outdir>/<slug>.mmd` first (Write tool). The page
cannot read files itself, so ship the source in via **base64** — never splice
file contents into a JS template literal (backticks, `${`, and backslashes in
the source would be interpreted and corrupt it):
```bash
# SVG (always)
$B js "window.__renderMermaid('diagram-1', \`$(cat <outdir>/<slug>.mmd)\`).then(s => { window.__svg = s; return 'SVG OK ' + s.length })"
$B js "window.__svg" --out <outdir>/<slug>.svg
# SVG (always). atob() decodes the base64 inside the page.
$B js --tab-id "$TAB" "window.__renderMermaid('diagram-1', atob('$(base64 < <outdir>/<slug>.mmd | tr -d '\n')')).then(s => { window.__svg = s; return 'SVG OK ' + s.length })"
$B js --tab-id "$TAB" "window.__svg" --out <outdir>/<slug>.svg
# PNG at 300dpi of a 6.5in placement (1950px)
$B js "window.__rasterize(window.__svg, 1950)" --out <outdir>/<slug>.png
$B js --tab-id "$TAB" "window.__rasterize(window.__svg, 1950)" --out <outdir>/<slug>.png
# Editable scene (flowcharts only)
$B js "window.__mermaidToExcalidraw(\`$(cat <outdir>/<slug>.mmd)\`).then(j => { window.__scene = j; return 'SCENE OK ' + JSON.parse(j).elements.length + ' elements' })"
$B js "window.__scene" --out <outdir>/<slug>.excalidraw
$B js --tab-id "$TAB" "window.__mermaidToExcalidraw(atob('$(base64 < <outdir>/<slug>.mmd | tr -d '\n')')).then(j => { window.__scene = j; return 'SCENE OK ' + JSON.parse(j).elements.length + ' elements' })"
$B js --tab-id "$TAB" "window.__scene" --out <outdir>/<slug>.excalidraw
```
Note: `atob()` yields Latin-1; for sources with non-ASCII labels use
`decodeURIComponent(escape(atob('…')))` to recover UTF-8 exactly.
If the mermaid render returns an error, show the parse error to the user, fix
the mermaid, and retry — do not hand the user a broken source file. If
`__mermaidToExcalidraw` fails on a non-flowchart type, skip the `.excalidraw`
@@ -111,12 +123,13 @@ artifact and deliver the rest with the limitation note from Step 1.
source is the single source of truth.
Re-rendering an EDITED `.excalidraw` (user round-trip): load the scene file
and export without touching the mermaid:
and export without touching the mermaid — base64 transport again, since scene
JSON is full of quotes and backslashes:
```bash
$B js "window.__excalidrawToSvg(\`$(cat <outdir>/<slug>.excalidraw)\`).then(s => { window.__svg = s; return 'OK' })"
$B js "window.__svg" --out <outdir>/<slug>.svg
$B js "window.__rasterize(window.__svg, 1950)" --out <outdir>/<slug>.png
$B js --tab-id "$TAB" "window.__excalidrawToSvg(atob('$(base64 < <outdir>/<slug>.excalidraw | tr -d '\n')')).then(s => { window.__svg = s; return 'OK' })"
$B js --tab-id "$TAB" "window.__svg" --out <outdir>/<slug>.svg
$B js --tab-id "$TAB" "window.__rasterize(window.__svg, 1950)" --out <outdir>/<slug>.png
```
## Rules
@@ -125,7 +138,7 @@ $B js "window.__rasterize(window.__svg, 1950)" --out <outdir>/<slug>.png
a diagram. If rendering is impossible (bundle missing, browse down), say so
and stop.
- **Cleanup:** close the render tab when the conversation's diagram work is
done (`$B closetab`), not between diagrams.
done (`$B closetab $TAB`), not between diagrams.
- For diagrams destined for a PDF: remind the user that `make-pdf` renders
` ```mermaid ` fences natively — embedding the `.mmd` in their markdown is
better than embedding the PNG.
+18 -18
View File
@@ -110,11 +110,19 @@ describe("rasterizeDiagramFigures (mock tab)", () => {
expect(out).toContain('src="data:image/png;base64,AAAA"');
});
test("figure rasterization failure keeps the figure (never silent loss)", () => {
test("figure rasterization failure surfaces the SOURCE as text (never silent loss)", () => {
// Returning the figure unchanged would make the diagram vanish in DOCX
// (the converter drops <figure>/<svg>) — the failure must be visible.
const { tab } = mockTab(() => { throw new RenderCallError("tainted"); });
const warnings: string[] = [];
const out = rasterizeDiagramFigures(figure, tab, 6.5, (m) => warnings.push(m));
expect(out).toContain('<figure class="diagram"');
const srcFigure = figure.replace(
'<figure class="diagram"',
`<figure class="diagram" data-gstack-source="${Buffer.from("graph LR\n A --> B").toString("base64")}"`,
);
const out = rasterizeDiagramFigures(srcFigure, tab, 6.5, (m) => warnings.push(m));
expect(out).toContain("could not be rasterized");
expect(out).toContain("A --&gt; B"); // source visible (escaped), not dropped
expect(out).not.toContain("<figure");
expect(warnings[0]).toContain("rasterization failed");
});
@@ -196,25 +204,17 @@ describe("pure-function stragglers", () => {
fs.unlinkSync(tmp);
}
});
test("resolveBundlePath error names every candidate and the fix", () => {
let message = "";
try {
resolveBundlePath({ GSTACK_DIAGRAM_BUNDLE: "/definitely/not/here.html", HOME: "/nonexistent-home" } as NodeJS.ProcessEnv);
} catch (err: any) {
message = err.message;
}
// The repo-relative candidates exist in this checkout, so force a miss is
// not possible portably — accept either a path return or the error shape.
if (message) {
expect(message).toContain("diagram-render bundle not found");
expect(message).toContain("build:diagram-render");
}
});
// NOTE: resolveBundlePath's not-found error shape is untestable from inside
// this checkout (the repo-relative candidate always exists), and a vacuous
// if-guarded assertion was worse than none. The env-override test above is
// the honest coverage; the error path is exercised manually via
// GSTACK_DIAGRAM_BUNDLE pointing at a missing file outside a repo.
test("screenCss is media-scoped and readable-width", () => {
const css = screenCss();
expect(css).toContain("@media screen");
expect(css).toContain("max-width: 52em");
// 42em at 12pt ≈ 70-75 chars/line — the readable ceiling (design review).
expect(css).toContain("max-width: 42em");
expect(css).toContain(".watermark { display: none; }");
});
});
+41 -1
View File
@@ -4,7 +4,7 @@
* byte-level image dimension prober. No browse daemon required — the tab
* factory returns null so downscale paths are exercised as no-ops.
*/
import { describe, expect, test } from "bun:test";
import { afterAll, describe, expect, test } from "bun:test";
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";
@@ -20,6 +20,7 @@ import {
inlineLocalImages,
parseInfoString,
substituteSlots,
decodeFigureSource,
} from "../src/diagram-prepass";
import { imageDims } from "../src/image-size";
@@ -146,6 +147,18 @@ describe("diagnostic + figure blocks", () => {
expect(fig).toContain('aria-label="Auth flow"');
expect(fig).toContain("diagram-caption");
});
test("embedded source round-trips mermaid arrows exactly", () => {
const source = "graph LR\n A --> B\n B -->|label with $& and `ticks`| C";
const fig = buildDiagramFigure({ ...fence, source }, "<svg></svg>");
expect(decodeFigureSource(fig)).toBe(source);
});
test("slot substitution is immune to $-replacement patterns in labels", () => {
const slotHtml = `<figure>label says $' and $& here</figure>`;
const out = substituteSlots("<p>tok-x</p><p>tail</p>", new Map([["tok-x", slotHtml]]));
expect(out).toContain("label says $' and $& here");
expect(out).toContain("<p>tail</p>");
expect(out).not.toContain("tailtail"); // $' expansion would duplicate the tail
});
});
// ─── image dimension probing ──────────────────────────────────────────
@@ -229,6 +242,9 @@ describe("content width", () => {
describe("inlineLocalImages", () => {
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "prepass-img-"));
fs.writeFileSync(path.join(dir, "ok.png"), tinyPng(40, 20));
afterAll(() => {
try { fs.rmSync(dir, { recursive: true, force: true }); } catch { /* best-effort */ }
});
const base = {
inputDir: dir,
@@ -290,6 +306,30 @@ describe("inlineLocalImages", () => {
expect(out).toContain('data-gstack-px-height="44"');
});
test("Windows drive-letter src is treated as a local path, not a URL scheme", () => {
// C:/x.png matches the single-letter-scheme regex — it must reach the
// local-path branch (and the missing-file placeholder), never silently
// pass through as an unknown URL.
const warnings: string[] = [];
const out = inlineLocalImages(`<img src="C:/missing/x.png">`, { ...base, warn: (m) => warnings.push(m) });
expect(out).toContain("image-missing");
expect(warnings.length).toBe(1);
});
test("indented fences inside lists replay byte-for-byte (no list splitting)", () => {
const md = "- item\n\n ```js\n code();\n ```\n\n- next";
const { markdown, fences } = extractDiagramFences(md);
expect(fences).toHaveLength(0);
expect(markdown).toBe(md);
});
test("indented mermaid fences are NOT extracted (column-0 placeholder would split the list)", () => {
const md = "- item\n\n ```mermaid\n graph LR\n ```\n";
const { markdown, fences } = extractDiagramFences(md);
expect(fences).toHaveLength(0);
expect(markdown).toBe(md);
});
test("oversized raster without a tab inlines at full size with no downscale", () => {
// 6000px-wide PNG header (body irrelevant for probing; file must exist)
fs.writeFileSync(path.join(dir, "wide.png"), tinyPng(6000, 100));
+10 -8
View File
@@ -91,15 +91,17 @@ describe("landscape promotion gate", () => {
expect(portrait.length).toBeGreaterThanOrEqual(2);
expect(isLandscape(boxes[0])).toBe(false);
// The veto'd diagram rendered (its labels exist) on a PORTRAIT page.
// The veto'd diagram rendered on SOME portrait page and NO landscape
// page — the actual invariant. (Asserting a specific page index breaks
// spuriously when font metrics shift pagination.)
const pdftotext = resolvePopplerTool("pdftotext")!;
const lastPortrait = portrait[portrait.length - 1];
const vetoText = execFileSync(
pdftotext,
["-f", String(lastPortrait.page), "-l", String(lastPortrait.page), outputPdf, "-"],
{ encoding: "utf8", timeout: CHILD_TIMEOUT_MS },
);
expect(vetoText).toContain("vetoalpha");
const pageText = (page: number) =>
execFileSync(pdftotext, ["-f", String(page), "-l", String(page), outputPdf, "-"], {
encoding: "utf8",
timeout: CHILD_TIMEOUT_MS,
});
expect(portrait.some((b) => pageText(b.page).includes("vetoalpha"))).toBe(true);
expect(landscape.some((b) => pageText(b.page).includes("vetoalpha"))).toBe(false);
} finally {
try { fs.rmSync(workDir, { recursive: true, force: true }); } catch { /* ignore */ }
}
+8
View File
@@ -75,6 +75,14 @@ describe("width styles", () => {
const { html } = applyImagePolicy(img(`src="x" data-gstack-width="3in"`), OPTS);
expect(html).toContain("width: 3in");
});
test("width directive merges with an existing style attribute, preserving it", () => {
const { html } = applyImagePolicy(
img(`src="x" style="border: 1px solid" data-gstack-width="50%"`),
OPTS,
);
expect(html).toContain("border: 1px solid");
expect(html).toContain("width: 50%");
});
test("no directive → no inline style (CSS max-width owns the default)", () => {
const { html } = applyImagePolicy(img(`src="x" data-gstack-px-width="40" data-gstack-px-height="20"`), OPTS);
expect(html).not.toContain("style=");
+27
View File
@@ -36,6 +36,33 @@ describe("diagram-render bundle drift", () => {
expect(info.deps).toEqual(pkg.dependencies);
});
test("BUILD_INFO srcSha256 matches src on disk (edited-src-forgot-rebuild guard)", async () => {
// The deep rebuild check below needs node_modules, which CI doesn't
// install for this nested package — this tier-1.5 fingerprint catches a
// src edit committed without a rebuild using nothing but file hashes.
const info = await Bun.file(BUILD_INFO).json();
const srcSha = createHash("sha256")
.update(await Bun.file(path.join(ROOT, "src", "entry.ts")).text())
.update(await Bun.file(path.join(ROOT, "scripts", "build.ts")).text())
.digest("hex");
expect(srcSha).toBe(info.srcSha256);
});
test("bundle font stack matches print-css (text-measurement drift guard)", async () => {
const entrySrc = await Bun.file(path.join(ROOT, "src", "entry.ts")).text();
// Every family print-css composes into the body stack must appear in the
// bundle's PRINT_SANS literal — mermaid measures text with these fonts and
// the print document lays it out with print-css's; drift = overflowing
// labels (eng-review D3).
for (const family of [
"Helvetica", "Liberation Sans", "Arial",
"Hiragino Kaku Gothic ProN", "Noto Sans CJK JP", "Microsoft YaHei",
"Apple Color Emoji", "Segoe UI Emoji", "Noto Color Emoji",
]) {
expect(entrySrc).toContain(family);
}
});
test("page invariants: module script, base href, escaped terminators, error trap", async () => {
const html = await Bun.file(DIST_HTML).text();
expect(html).toContain('<script type="module">');
+2 -1
View File
@@ -138,7 +138,8 @@ ${mmd}
Score 1-10 on: faithfulness to the ask (are the named stages present and
correctly ordered?), label quality (short node labels, detail on edges),
and readable size (5-15 nodes, not a wall). A diagram that misses the
failure/diagnostic path entirely caps at 6.
failure/diagnostic path entirely caps at 5 — that path is an explicitly
named requirement, so omitting it must fail the run.
Respond with JSON: {"score": N, "reasoning": "..."}`,
);