mirror of
https://github.com/garrytan/gstack.git
synced 2026-05-06 21:46:40 +02:00
3af86348f6
Turn markdown into publication-quality PDFs. $P generate input.md out.pdf
produces a PDF with 1in margins, intelligent page breaks, page numbers,
running header, CONFIDENTIAL footer, and curly quotes/em dashes — all on
Helvetica so copy-paste extraction works ("S ai li ng" bug avoided).
Architecture (per Codex round 2):
  markdown → render.ts (marked + sanitize + smartypants) → orchestrator
    → $B newtab --json → $B load-html --tab-id → $B js (poll Paged.js)
    → $B pdf --tab-id → $B closetab
browseClient.ts shells out to the compiled browse CLI rather than
duplicating Playwright. --tab-id isolation per render means parallel
$P generate calls don't race on the active tab. try/finally tab cleanup
survives Paged.js timeouts, browser crashes, and output-path failures.
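The tab lifecycle described above can be sketched as follows. This is a minimal illustration, not the actual orchestrator: `renderPdf`, the `Run` type, the `{ "tabId": ... }` JSON shape, the `window.__pagedDone` readiness flag, and the exact argument order of each browse subcommand are all assumptions. Only the subcommand names and the `--json` / `--tab-id` flags come from the pipeline above.

```typescript
// Hypothetical runner: takes browse CLI arguments and returns stdout.
// A real one would presumably wrap a child-process call to the browse binary.
type Run = (args: string[]) => string;

interface RenderOptions {
  html: string;                 // assembled HTML (how it is passed is an assumption)
  outputPath: string;
  maxPolls?: number;            // upper bound on readiness polls
  pause?: (ms: number) => void; // injectable so tests do not have to sleep
}

// Blocking sleep, used only to keep this sketch synchronous.
function blockingPause(ms: number): void {
  Atomics.wait(new Int32Array(new SharedArrayBuffer(4)), 0, 0, ms);
}

function renderPdf(run: Run, opts: RenderOptions): void {
  const pause = opts.pause ?? blockingPause;
  const maxPolls = opts.maxPolls ?? 50;

  // Assumed output shape of `newtab --json`: { "tabId": <number> }.
  const { tabId } = JSON.parse(run(["newtab", "--json"])) as { tabId: number };
  const tab = String(tabId);

  try {
    run(["load-html", opts.html, "--tab-id", tab]);

    // Poll a flag the page is assumed to set once Paged.js has finished.
    let ready = false;
    for (let i = 0; i < maxPolls && !ready; i++) {
      const out = run(["js", "window.__pagedDone === true", "--tab-id", tab]);
      ready = out.trim() === "true";
      if (!ready) pause(100);
    }
    if (!ready) throw new Error("Paged.js did not finish in time");

    run(["pdf", opts.outputPath, "--tab-id", tab]);
  } finally {
    // Runs on success, timeout, and write failure alike. Close errors are
    // swallowed so they cannot mask the original failure.
    try { run(["closetab", tab]); } catch { /* ignore */ }
  }
}
```

Because every call carries its own tab id, two renders running at once never depend on which tab happens to be active, and the `finally` block keeps a failed render from leaking its tab.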
Features in v1:
  --cover              left-aligned cover page (eyebrow + title + hairline rule)
  --toc                clickable static TOC (Paged.js page numbers deferred)
  --watermark <text>   diagonal DRAFT/CONFIDENTIAL layer
  --no-chapter-breaks  opt out of H1-starts-new-page
  --page-numbers       "N of M" footer (default on)
  --tagged --outline   accessible PDF + bookmark outline (default on)
  --allow-network      opt in to external image loading (default off for privacy)
  --quiet --verbose    stderr control
Design decisions locked from the /plan-design-review pass:
  - Helvetica everywhere (Chromium emits single-word Tj operators for
    system fonts; bundled webfonts emit per-glyph and break extraction).
  - Left-aligned body, flush-left paragraphs, no text-indent, 12pt gap.
  - Cover shares 1in margins with body pages; no flexbox-center, no
    inset padding.
  - The reference HTMLs at .context/designs/*.html are the implementation
    source of truth for print-css.ts.
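As a rough illustration of how the page-number and watermark options and the locked design decisions might map to print CSS, here is a small sketch. `buildPrintCss` and `PrintCssOptions` are hypothetical names and not the contents of print-css.ts. The footer position, watermark size, and opacity are assumptions. The sketch relies on standard CSS Paged Media features (`@page` margin boxes, `counter(page)`, `counter(pages)`).

```typescript
interface PrintCssOptions {
  pageNumbers: boolean; // mirrors --page-numbers
  watermark?: string;   // mirrors --watermark <text>
}

function buildPrintCss(opts: PrintCssOptions): string {
  const rules: string[] = [];

  // 1in margins on every page; optional "N of M" footer in a margin box.
  const footer = opts.pageNumbers
    ? `  @bottom-center { content: counter(page) " of " counter(pages); }\n`
    : "";
  rules.push(`@page {\n  margin: 1in;\n${footer}}`);

  // Helvetica everywhere, left-aligned body, no text-indent, 12pt gap.
  rules.push(`body { font-family: Helvetica, sans-serif; text-align: left; }`);
  rules.push(`p { text-indent: 0; margin: 0 0 12pt; }`);

  if (opts.watermark !== undefined) {
    // Escape backslashes and double quotes so the text is a valid CSS string.
    const text = opts.watermark.replace(/\\/g, "\\\\").replace(/"/g, '\\"');
    rules.push(
      `body::before {\n` +
        `  content: "${text}";\n` +
        `  position: fixed;\n` +
        `  top: 50%;\n` +
        `  left: 50%;\n` +
        `  transform: translate(-50%, -50%) rotate(-45deg);\n` +
        `  font-size: 96pt;\n` +
        `  opacity: 0.08;\n` +
        `}`,
    );
  }

  return rules.join("\n\n");
}

console.log(buildPrintCss({ pageNumbers: true, watermark: "DRAFT" }));
```

Building the stylesheet from a small options object keeps each flag's effect in one place, which makes rule-by-rule unit tests like the print-css ones straightforward to write.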
Tests (56 unit + 1 E2E combined-features gate):
  - smartypants: code/URL-safe, verified against 10 fixtures
  - sanitizer: strips <script>/<iframe>/on*/javascript: URLs
  - render: HTML assembly, CJK fallback, cover/TOC/chapter wrap
  - print-css: all @page rules, margin variants, watermark
  - pdftotext: normalize()+copyPasteGate() cross-OS tolerance
  - browseClient: binary resolution + typed error propagation
  - combined-features gate (P0): 2-chapter fixture with smartypants +
    hyphens + ligatures + bold/italic + inline code + lists + blockquote
    passes through PDF → pdftotext → expected.txt diff
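The idea behind a cross-OS-tolerant extraction gate can be sketched on plain strings. This is not the real `normalize()` or `copyPasteGate()` from src/pdftotext, whose rules are not shown here: `normalizeText`, `compareExtracted`, and the specific normalization steps are assumptions chosen for illustration.

```typescript
interface GateResult {
  ok: boolean;
  reasons: string[];
}

// Hypothetical normalization: remove differences that vary by OS or
// pdftotext build but do not affect what a user would copy and paste.
function normalizeText(text: string): string {
  return text
    .normalize("NFC")
    .replace(/\r\n?/g, "\n")  // Windows and classic Mac line endings
    .replace(/\f/g, "\n")     // pdftotext separates pages with form feeds
    .replace(/[ \t]+/g, " ")  // collapse runs of spaces and tabs
    .replace(/ *\n */g, "\n") // trim spaces around line breaks
    .replace(/\n{2,}/g, "\n") // collapse blank lines
    .trim();
}

// Compare line by line and report every mismatch, so a CI log shows
// where the extraction diverged instead of just "not equal".
function compareExtracted(extracted: string, expected: string): GateResult {
  const got = normalizeText(extracted).split("\n");
  const want = normalizeText(expected).split("\n");
  const reasons: string[] = [];
  const lineCount = Math.max(got.length, want.length);
  for (let i = 0; i < lineCount; i++) {
    if (got[i] !== want[i]) {
      reasons.push(
        `line ${i + 1}: expected ${JSON.stringify(want[i] ?? null)}, ` +
          `got ${JSON.stringify(got[i] ?? null)}`,
      );
    }
  }
  return { ok: reasons.length === 0, reasons };
}

// Line endings and page breaks are tolerated...
console.log(compareExtracted("Sailing\r\n\r\n  home\f", "Sailing\nhome").ok);
// ...but per-glyph extraction such as "S ai li ng" is not.
console.log(compareExtracted("S ai li ng", "Sailing").ok);
```

Whitespace inside a line is only collapsed, never removed, so the fragmented-word failure the gate exists to catch still shows up as a mismatch.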
Deferred to Phase 4 (future PR): Paged.js vendored for accurate TOC page
numbers, highlight.js for syntax highlighting, drop caps, pull quotes,
two-column, CMYK, watermark visual-diff acceptance.
Plan: .context/ceo-plans/2026-04-19-perfect-pdf-generator.md
References: .context/designs/make-pdf-*.html
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
77 lines
3.3 KiB
TypeScript
/**
 * Combined-features copy-paste gate — the P0 CI gate.
 *
 * This test runs the compiled `make-pdf/dist/pdf` binary against a fixture
 * that has every v1 typography feature on (smartypants, hyphens, chapter
 * breaks, bold/italic, inline code, blockquote, lists, headings). It then
 * pipes the output through pdftotext and asserts the extracted text
 * matches the handwritten expected.txt.
 *
 * Codex round 2 told us this (not per-feature gates) is the real gate a
 * user actually cares about — features interact, and the combined
 * extraction is what predicts production quality.
 *
 * Gating: only runs when the compiled binary + browse + pdftotext are all
 * available. Skipped cleanly otherwise (local dev without full install).
 */

import { describe, expect, test } from "bun:test";
import { execFileSync } from "node:child_process";
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

import { copyPasteGate, resolvePdftotext } from "../../src/pdftotext";

const FIXTURE = path.resolve(__dirname, "../fixtures/combined-gate.md");
const EXPECTED = path.resolve(__dirname, "../fixtures/combined-gate.expected.txt");
const ROOT = path.resolve(__dirname, "../../..");
const PDF_BIN = path.join(ROOT, "make-pdf/dist/pdf");
const BROWSE_BIN = path.join(ROOT, "browse/dist/browse");

function prerequisitesAvailable(): { ok: true } | { ok: false; reason: string } {
  if (!fs.existsSync(PDF_BIN)) return { ok: false, reason: `make-pdf binary missing (${PDF_BIN}). Run bun run build.` };
  if (!fs.existsSync(BROWSE_BIN)) return { ok: false, reason: `browse binary missing (${BROWSE_BIN}).` };
  if (!fs.existsSync(FIXTURE)) return { ok: false, reason: `fixture missing (${FIXTURE}).` };
  if (!fs.existsSync(EXPECTED)) return { ok: false, reason: `expected.txt missing (${EXPECTED}).` };
  try { resolvePdftotext(); } catch (err: any) { return { ok: false, reason: err.message }; }
  return { ok: true };
}

describe("combined-features copy-paste gate", () => {
  const avail = prerequisitesAvailable();

  test.skipIf(!avail.ok)("fixture PDF extracts cleanly through pdftotext", () => {
    if (!avail.ok) return; // satisfies the type checker
    // Use /tmp directly (browse's validateOutputPath allows /private/tmp,
    // which macOS resolves /tmp to). os.tmpdir() returns /var/folders/...
    // which is outside the safe-dirs allowlist.
    const outputPdf = `/tmp/make-pdf-combined-gate-${process.pid}.pdf`;
    try {
      execFileSync(PDF_BIN, ["generate", FIXTURE, outputPdf, "--quiet"], {
        encoding: "utf8",
        env: { ...process.env, BROWSE_BIN },
        stdio: ["ignore", "pipe", "pipe"],
      });
      expect(fs.existsSync(outputPdf)).toBe(true);

      const expected = fs.readFileSync(EXPECTED, "utf8");
      const result = copyPasteGate(outputPdf, expected);
      if (!result.ok) {
        // Attach the extracted text so CI logs make the failure diagnosable
        process.stderr.write(`\n--- EXTRACTED ---\n${result.extracted}\n--- END ---\n\n`);
        process.stderr.write(`--- REASONS ---\n${result.reasons.join("\n")}\n--- END ---\n`);
      }
      expect(result.ok).toBe(true);
    } finally {
      try { fs.unlinkSync(outputPdf); } catch { /* ignore */ }
    }
  }, 30000);

  if (!avail.ok) {
    test("prerequisites check", () => {
      console.warn(`[skip] ${avail.reason}`);
    });
  }
});