From 8241949357263be64013a8410171def68cff920c Mon Sep 17 00:00:00 2001 From: Garry Tan Date: Tue, 9 Jun 2026 22:29:23 -0700 Subject: [PATCH 1/4] v1.57.9.0 feat: source-clean gbrain render (dev-setup --out-dir + machine-wide gbrain-refresh) (#1951) * feat(gbrain-detect): add --is-ok live-detection exit-code gate Single source of truth for 'is gbrain usable'. Runs live detection (never reads the possibly-stale gbrain-detection.json) and exits 0 iff status is ok, so setup, bin/dev-setup, and gstack-config can gate brain-aware rendering on one shared check instead of re-grepping the JSON. Co-Authored-By: Claude Opus 4.8 (1M context) * feat(gen-skill-docs): add --out-dir with surgical section-path rewrite --out-dir mirrors the Claude skill tree (SKILL.md + sections) into a separate directory instead of writing in place, and rewrites the literal section-base path (~/.claude/skills/gstack//sections/) in generated content to point at the out-dir. The rewrite is surgical: only /sections/ paths move; bin/, browse/, docs/ references stay pointed at the global install. Global extras (proactive-suggestions.json) are skipped in out-dir mode. Default (no flag) behavior is unchanged. Co-Authored-By: Claude Opus 4.8 (1M context) * feat(dev-setup): render gbrain :user variant to an untracked workspace dir Stops the dev/Conductor workspace from dirtying tracked SKILL.md source. setup honors GSTACK_SKIP_GBRAIN_REGEN (passed inline by dev-setup, never exported) and skips the in-place :user regen; detection is still persisted (PID-unique tmp so concurrent workspaces can't clobber it). dev-setup instead renders the :user variant into .claude/gstack-rendered (gitignored, per-workspace) and repoints the workspace SKILL.md symlinks at it, so the workspace gets brain-aware blocks while the worktree stays canonical. dev-teardown removes the render. 
Co-Authored-By: Claude Opus 4.8 (1M context) * feat(dev-skill): refresh the untracked brain-aware render on template change After the default in-place regen (which keeps the worktree canonical and runs validation), also re-render the :user variant into .claude/gstack-rendered when it exists, so live template edits reflect at the workspace's runtime. Never creates the render dir during plain template dev. Co-Authored-By: Claude Opus 4.8 (1M context) * feat(gstack-config): gbrain-refresh renders brain-aware blocks into the install Extends gbrain-refresh to render the :user variant into the global install (~/.claude/skills/gstack) so every project's Claude sessions get brain-aware blocks, not just the gstack dev workspace. Guarded against mutating the wrong directory: the target must exist, not be a symlink (a symlinked install points at a dev worktree), and look like a real gstack clone (VERSION + package.json). Idempotent and self-documenting. CLAUDE.md's deploy section now notes that 'git reset --hard' reverts the blocks and to re-run gbrain-refresh. Co-Authored-By: Claude Opus 4.8 (1M context) * test: cover gstack-gbrain-detect --is-ok + dev-skill render refresh Fills the two automated-coverage gaps from the eng review: --is-ok exit-code gate (no-cli -> nonzero, healthy -> 0, plus an agrees-with-JSON no-skew check reusing the deterministic fake-gbrain harness) and a static tripwire that dev-skill re-renders the :user variant into the workspace render dir only when it already exists. 
Co-Authored-By: Claude Opus 4.8 (1M context) * chore: bump version and changelog (v1.57.9.0) Co-Authored-By: Claude Opus 4.8 (1M context) * docs: document brain-aware dev-setup render for v1.57.9.0 Co-Authored-By: Claude Opus 4.8 (1M context) --------- Co-authored-by: Claude Opus 4.8 (1M context) --- .gitignore | 1 + CHANGELOG.md | 71 +++++++++++++++++ CLAUDE.md | 6 ++ CONTRIBUTING.md | 20 ++++- VERSION | 2 +- bin/dev-setup | 46 ++++++++++- bin/dev-teardown | 9 ++- bin/gstack-config | 25 +++++- bin/gstack-gbrain-detect | 10 +++ package.json | 2 +- scripts/dev-skill.ts | 18 +++++ scripts/gen-skill-docs.ts | 59 +++++++++++++- setup | 33 +++++--- test/dev-setup-render-isolation.test.ts | 91 ++++++++++++++++++++++ test/gbrain-detect-shape.test.ts | 75 +++++++++++++++++- test/gbrain-refresh-install-render.test.ts | 60 ++++++++++++++ test/gen-skill-docs-out-dir.test.ts | 84 ++++++++++++++++++++ 17 files changed, 590 insertions(+), 22 deletions(-) create mode 100644 test/dev-setup-render-isolation.test.ts create mode 100644 test/gbrain-refresh-install-render.test.ts create mode 100644 test/gen-skill-docs-out-dir.test.ts diff --git a/.gitignore b/.gitignore index 42b2c2a04..6eac08f36 100644 --- a/.gitignore +++ b/.gitignore @@ -7,6 +7,7 @@ make-pdf/dist/ bin/gstack-global-discover* .gstack/ .claude/skills/ +.claude/gstack-rendered/ .claude/scheduled_tasks.lock .claude/*.lock .agents/ diff --git a/CHANGELOG.md b/CHANGELOG.md index 438536cad..3f8cffae1 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,5 +1,76 @@ # Changelog +## [1.57.9.0] - 2026-06-09 + +## **Your gstack checkout stays clean when gbrain is installed.** +## **Brain-aware skill blocks render to an untracked spot, never into tracked source.** + +Before this, finishing a Conductor or dev-workspace setup with gbrain installed +rewrote 16 planning and review SKILL.md files in place, adding 326 lines of +brain-aware blocks straight into tracked source. 
Your working tree came back dirty, +one stray `git add` away from committing a token regression for everyone who does +not run gbrain. Now `gen-skill-docs --out-dir` renders the brain-aware variant into +an untracked per-workspace directory, and `bin/dev-setup` repoints the workspace's +skill symlinks at it. The dev workspace gets the full gbrain experience (context-load +and save-to-brain blocks live at runtime), while the tracked SKILL.md files stay +byte-for-byte canonical. To turn the blocks on across all your projects' Claude +sessions, `gstack-config gbrain-refresh` now renders them into your global install, +guarded so it never mutates a symlinked or non-gstack directory. + +### The numbers that matter + +Structural facts of the change, verifiable from the diff plus `bun run gen:skill-docs` +(zero drift) and the new behavioral test (`test/gen-skill-docs-out-dir.test.ts`). + +| When gbrain is installed | Before | After | +|---|---|---| +| Tracked SKILL.md files dirtied by dev-setup | 16 (+326 lines) | 0 | +| Where brain-aware blocks render in a dev workspace | in-place, tracked source | `.claude/gstack-rendered/`, untracked | +| Brain-aware blocks across other projects | re-run `./setup` or hand-edit | `gstack-config gbrain-refresh` (idempotent) | +| "Is gbrain usable" check | per-caller JSON grep, can read stale state | `gstack-gbrain-detect --is-ok` (one live gate) | + +The section-path rewrite is surgical: only `~/.claude/skills/gstack//sections/` +references move to the render dir, so `bin/` and `docs/` references still resolve to +the install. + +### What this means for you + +If you develop gstack with gbrain on, `git status` is clean again after setup, and +you can stop fishing brain-block drift out of your commits. After a +`git reset --hard` deploy of your install, re-run `gstack-config gbrain-refresh` to +restore the machine-wide blocks (it is idempotent, and the deploy note in CLAUDE.md +spells this out). 
+ +### Itemized changes + +#### Added +- `gen-skill-docs --out-dir `: render the Claude SKILL.md + sections into a + separate directory instead of in place, rewriting only the section-base path so + section reads resolve to the render. Default (no flag) output is unchanged. +- `gstack-gbrain-detect --is-ok`: live-detection exit-code gate (0 iff gbrain is + usable), so setup, dev-setup, and gstack-config share one check. +- `gstack-config gbrain-refresh` now renders brain-aware blocks into the global + install (`~/.claude/skills/gstack`), guarded against symlinked or non-gstack + targets and self-documenting about the `reset --hard` re-run cycle. + +#### Changed +- `bin/dev-setup` renders the brain-aware variant into `.claude/gstack-rendered` + (gitignored) and repoints workspace skill symlinks at it; the worktree stays + canonical. `GSTACK_SKIP_GBRAIN_REGEN` is passed inline to the nested setup, never + exported. +- `setup` honors `GSTACK_SKIP_GBRAIN_REGEN` (skips the in-place brain regen on dev + trees) and writes detection state to a PID-unique tmp so concurrent workspaces + cannot clobber it. +- `scripts/dev-skill.ts` refreshes the workspace render on template change, only + when the render dir already exists. +- `bin/dev-teardown` removes the untracked render. + +#### For contributors +- New tests: `test/gen-skill-docs-out-dir.test.ts` (behavioral: worktree unchanged, + blocks rendered, section paths rewritten), `test/dev-setup-render-isolation.test.ts` + and `test/gbrain-refresh-install-render.test.ts` (static tripwires), plus + `--is-ok` coverage in `test/gbrain-detect-shape.test.ts`. + ## [1.57.8.0] - 2026-06-09 ## **`browse` is now the one Chromium on the box, for offline rendering too.** diff --git a/CLAUDE.md b/CLAUDE.md index 41db0093e..03384ae79 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -883,6 +883,12 @@ The active skill lives at `~/.claude/skills/gstack/`. After making changes: 2. 
Fetch and reset in the skill directory: `cd ~/.claude/skills/gstack && git fetch origin && git reset --hard origin/main` 3. Rebuild: `cd ~/.claude/skills/gstack && bun run build` +**If you use gbrain:** the `git reset --hard` in step 2 reverts the brain-aware +(`GBRAIN_CONTEXT_LOAD` / `GBRAIN_SAVE_RESULTS`) blocks that `gstack-config +gbrain-refresh` renders into the install (those generated blocks differ from +`main` by design). After deploying, re-run `gstack-config gbrain-refresh` to +restore them across all your projects' Claude sessions. It's idempotent. + Or copy the binaries directly: - `cp browse/dist/browse ~/.claude/skills/gstack/browse/dist/browse` - `cp design/dist/design ~/.claude/skills/gstack/design/dist/design` diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index a4872fc47..5a56ef5d3 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -106,6 +106,22 @@ bun run build bin/dev-teardown ``` +### Brain-aware blocks in a dev workspace (gbrain installed) + +If gbrain is installed and usable (`bin/gstack-gbrain-detect --is-ok` exits 0), +`bin/dev-setup` keeps your tracked `SKILL.md` files canonical and renders the +brain-aware variant (the `GBRAIN_CONTEXT_LOAD` / `GBRAIN_SAVE_RESULTS` blocks) +into `.claude/gstack-rendered/` (gitignored, per-workspace). It then repoints the +workspace's `SKILL.md` symlinks at that render, so your Claude sessions get the +full gbrain experience while `git status` stays clean. Under the hood, dev-setup +passes `GSTACK_SKIP_GBRAIN_REGEN=1` inline to the nested `./setup` (so it never +dirties tracked source) and runs `gen:skill-docs:user --out-dir .claude/gstack-rendered`, +which rewrites only the section-base paths to point at the render. `bin/dev-teardown` +removes the render. To make the blocks live across your *other* projects' Claude +sessions, run `gstack-config gbrain-refresh`, which renders them into the global +install (`~/.claude/skills/gstack`), guarded so it never touches a symlinked or +non-gstack directory. 
+ ## Testing & evals ### Setup @@ -334,8 +350,8 @@ If you're using [Conductor](https://conductor.build) to run multiple Claude Code | Hook | Script | What it does | |------|--------|-------------| -| `setup` | `bin/dev-setup` | Copies `.env` from main worktree, installs deps, symlinks skills, runs `./setup` non-interactively | -| `archive` | `bin/dev-teardown` | Removes skill symlinks, cleans up `.claude/` directory | +| `setup` | `bin/dev-setup` | Copies `.env` from main worktree, installs deps, symlinks skills, runs `./setup` non-interactively, and (if gbrain is installed) renders brain-aware blocks into `.claude/gstack-rendered/` without dirtying tracked source | +| `archive` | `bin/dev-teardown` | Removes skill symlinks, the `.claude/gstack-rendered/` render, and cleans up `.claude/` directory | When Conductor creates a new workspace, `bin/dev-setup` runs automatically. It detects the main worktree (via `git worktree list`), copies your `.env` so API keys carry over, and sets up dev mode — no manual steps needed. diff --git a/VERSION b/VERSION index caf2638d9..10521b5e0 100644 --- a/VERSION +++ b/VERSION @@ -1 +1 @@ -1.57.8.0 +1.57.9.0 diff --git a/bin/dev-setup b/bin/dev-setup index 0d8460f91..00a286706 100755 --- a/bin/dev-setup +++ b/bin/dev-setup @@ -72,7 +72,48 @@ fi # no-op skip (no install, no decline marker). A dev workspace must never mutate # global settings.json. To install the hooks, run `./setup --plan-tune-hooks` # directly (outside dev-setup). Saved prefix/other config preferences still apply. -"$GSTACK_LINK/setup" --plan-tune-hooks=prompt /dev/null; then + echo "" + echo "gbrain detected — rendering brain-aware skills into .claude/gstack-rendered (workspace-only, untracked)..." + rm -rf "$RENDER_DIR" + if ( cd "$REPO_ROOT" && bun run gen:skill-docs:user --host claude --out-dir "$RENDER_DIR" >/dev/null 2>&1 ); then + # Repoint each project-local SKILL.md symlink whose worktree target has a + # rendered counterpart. 
The skill DIRECTORY name (basename of the symlink + # target's dir) maps to RENDER_DIR//SKILL.md, which is robust to + # frontmatter renames and the gstack- prefix on the link name. + repointed=0 + for skill_link in "$REPO_ROOT"/.claude/skills/*/SKILL.md; do + [ -L "$skill_link" ] || continue + target="$(readlink "$skill_link")" + skilldir="$(basename "$(dirname "$target")")" + rendered="$RENDER_DIR/$skilldir/SKILL.md" + if [ -f "$rendered" ]; then ln -snf "$rendered" "$skill_link"; repointed=$((repointed + 1)); fi + done + echo " $repointed workspace skills now serve brain-aware blocks (worktree stays canonical)." + else + echo " warning: brain-aware render failed — workspace uses canonical skills." + fi +fi echo "" echo "Dev mode active. Skills resolve from this working tree." @@ -80,4 +121,7 @@ echo " .claude/skills/gstack → $REPO_ROOT" echo " .agents/skills/gstack → $REPO_ROOT" echo "Edit any SKILL.md and test immediately — no copy/deploy needed." echo "" +echo "To make brain-aware blocks live across your OTHER projects too, run:" +echo " gstack-config gbrain-refresh" +echo "" echo "To tear down: bin/dev-teardown" diff --git a/bin/dev-teardown b/bin/dev-teardown index dc8f74260..06189e189 100755 --- a/bin/dev-teardown +++ b/bin/dev-teardown @@ -24,9 +24,16 @@ if [ -d "$CLAUDE_SKILLS" ]; then fi rmdir "$CLAUDE_SKILLS" 2>/dev/null || true - rmdir "$REPO_ROOT/.claude" 2>/dev/null || true fi +# ─── Clean up the untracked brain-aware render (bin/dev-setup step 7) ── +RENDER_DIR="$REPO_ROOT/.claude/gstack-rendered" +if [ -d "$RENDER_DIR" ]; then + rm -rf "$RENDER_DIR" + removed+=("claude/gstack-rendered") +fi +rmdir "$REPO_ROOT/.claude" 2>/dev/null || true + # ─── Clean up .agents/skills/ ──────────────────────────────── AGENTS_SKILLS="$REPO_ROOT/.agents/skills" if [ -d "$AGENTS_SKILLS" ]; then diff --git a/bin/gstack-config b/bin/gstack-config index 01defcef8..ec465b281 100755 --- a/bin/gstack-config +++ b/bin/gstack-config @@ -396,8 +396,29 @@ case "${1:-}" in case 
"$STATUS" in ok) - echo "Detected gbrain v$VERSION → brain-aware blocks will render in planning-skill SKILL.md files." - echo "Run 'bun run gen:skill-docs' in the gstack repo (or re-run ./setup) to regenerate now." + echo "Detected gbrain v$VERSION." + # Render brain-aware blocks INTO the global install so EVERY project's + # Claude sessions get them (other projects read SKILL.md + sections from + # ~/.claude/skills/gstack via absolute paths baked at gen time). Guards + # (never mutate an arbitrary directory): the target must exist, not be a + # symlink (a symlinked install points at a dev worktree — rendering there + # would dirty tracked source), and look like a real gstack clone. + INSTALL_DIR="$HOME/.claude/skills/gstack" + if [ ! -d "$INSTALL_DIR" ]; then + echo "No global install at $INSTALL_DIR — nothing to render. (Dev workspaces get blocks via bin/dev-setup.)" + elif [ -L "$INSTALL_DIR" ]; then + echo "Skip: $INSTALL_DIR is a symlink (likely a dev worktree). Rendering there would dirty tracked source — run bin/dev-setup in that worktree instead." + elif [ ! -f "$INSTALL_DIR/VERSION" ] || [ ! -f "$INSTALL_DIR/package.json" ]; then + echo "Skip: $INSTALL_DIR doesn't look like a gstack clone (missing VERSION/package.json) — refusing to modify it." + elif ! command -v bun >/dev/null 2>&1; then + echo "Skip: bun not on PATH — can't render. Install bun, then re-run 'gstack-config gbrain-refresh'." + elif ( cd "$INSTALL_DIR" && bun run gen:skill-docs:user --host claude >/dev/null 2>&1 ); then + echo "Rendered brain-aware blocks into $INSTALL_DIR — now live across all your projects' Claude sessions." + echo "Note: this dirties the install's git tree (generated blocks differ from main, by design)." + echo " A 'git reset --hard origin/main' there reverts them; re-run 'gstack-config gbrain-refresh' to restore." + else + echo "Warning: render failed. Run 'cd $INSTALL_DIR && bun run gen:skill-docs:user --host claude' manually to see the error." 
+ fi ;; *) echo "gbrain not detected (local-status: $STATUS) → brain-aware blocks will be suppressed in planning-skill SKILL.md files." diff --git a/bin/gstack-gbrain-detect b/bin/gstack-gbrain-detect index 897bec243..4eee753da 100755 --- a/bin/gstack-gbrain-detect +++ b/bin/gstack-gbrain-detect @@ -234,4 +234,14 @@ function main(): void { process.stdout.write(JSON.stringify(out, null, 2) + "\n"); } +// --is-ok: live engine-status gate. Exits 0 iff gbrain is usable ("ok"), 1 +// otherwise. Runs detection live (never reads the possibly-stale +// gbrain-detection.json), so callers — setup, bin/dev-setup, and +// `gstack-config gbrain-refresh` — can decide whether to render the gbrain +// :user variant without duplicating the JSON grep. Prints nothing on stdout. +if (process.argv.includes("--is-ok")) { + const noCache = process.env.GSTACK_DETECT_NO_CACHE === "1"; + process.exit(localEngineStatus({ noCache }) === "ok" ? 0 : 1); +} + main(); diff --git a/package.json b/package.json index 789aa8db8..53da1d736 100644 --- a/package.json +++ b/package.json @@ -1,6 +1,6 @@ { "name": "gstack", - "version": "1.57.8.0", + "version": "1.57.9.0", "description": "Garry's Stack — Claude Code skills + fast headless browser. One repo, one install, entire AI engineering workflow.", "license": "MIT", "type": "module", diff --git a/scripts/dev-skill.ts b/scripts/dev-skill.ts index ae6ba30ad..f585f57b5 100644 --- a/scripts/dev-skill.ts +++ b/scripts/dev-skill.ts @@ -50,6 +50,24 @@ function regenerateAndValidate() { console.log(` [check] \u2705 ${output} — ${totalValid} commands, all valid`); } } + + // Dev workspace render isolation: the default in-place regen above keeps the + // worktree canonical. If bin/dev-setup set up an untracked brain-aware render + // (.claude/gstack-rendered), refresh it too so live template edits reflect at + // this workspace's runtime. Only runs when the render dir already exists — we + // never create it during plain template dev. 
+ const RENDER_DIR = path.join(ROOT, '.claude', 'gstack-rendered'); + if (fs.existsSync(RENDER_DIR)) { + try { + execSync( + `bun run scripts/gen-skill-docs.ts --respect-detection --host claude --out-dir ${JSON.stringify(RENDER_DIR)}`, + { cwd: ROOT, stdio: 'pipe' }, + ); + console.log(' [render] refreshed .claude/gstack-rendered (brain-aware workspace copy)'); + } catch (err: any) { + console.log(` [render] ERROR: ${err.stderr?.toString().trim() || err.message}`); + } + } } // Initial run diff --git a/scripts/gen-skill-docs.ts b/scripts/gen-skill-docs.ts index 5fea07713..45f617a1b 100644 --- a/scripts/gen-skill-docs.ts +++ b/scripts/gen-skill-docs.ts @@ -137,6 +137,39 @@ const EXPLAIN_LEVEL: 'default' | 'terse' = (() => { return val; })(); +// ─── Out-dir (dev workspace render isolation) ─────────────── +// --out-dir redirects Claude SKILL.md + section output to a separate +// (untracked) directory instead of writing in place, AND rewrites the literal +// section-base path (`~/.claude/skills/gstack//sections/`) inside the +// generated content to point at the out-dir, so section Reads resolve to the +// rendered copy rather than the global install. Used by bin/dev-setup to render +// the gbrain `:user` variant for a Conductor workspace without dirtying tracked +// source. Default (unset) = in-place, behavior unchanged. Claude host only. +const OUT_DIR_ARG = process.argv.find(a => a.startsWith('--out-dir')); +const OUT_DIR: string | null = (() => { + if (!OUT_DIR_ARG) return null; + const val = OUT_DIR_ARG.includes('=') + ? OUT_DIR_ARG.split('=')[1] + : process.argv[process.argv.indexOf(OUT_DIR_ARG) + 1]; + if (!val) throw new Error('--out-dir requires a directory path'); + return path.resolve(val); +})(); + +/** + * When rendering to an out-dir, repoint the literal section-base path at the + * out-dir so section Reads resolve to the rendered copy, not the global install. 
+ * Surgical: ONLY paths containing `/sections/` are rewritten — bin/, browse/, + * docs/ references keep pointing at `~/.claude/skills/gstack` (the global + * install, which still works). No-op when --out-dir is unset. + */ +function rewriteSectionBase(content: string): string { + if (!OUT_DIR) return content; + return content.replace( + /~\/\.claude\/skills\/gstack\/([^\s)`"'*]+\/sections\/)/g, + `${OUT_DIR}/$1`, + ); +} + // HostPaths, HOST_PATHS, and TemplateContext imported from ./resolvers/types (line 7-8) // Design constants (AI_SLOP_BLACKLIST, OPENAI_HARD_REJECTIONS, OPENAI_LITMUS_CHECKS) // live in ./resolvers/constants and are consumed by resolvers directly. @@ -768,6 +801,12 @@ function processTemplate(tmplPath: string, host: Host = 'claude'): { outputPath: // Determine skill directory relative to ROOT const skillDir = path.relative(ROOT, path.dirname(tmplPath)); + // --out-dir (Claude only): mirror the skill tree into the out-dir instead of + // writing in place. External hosts compute their own paths below. + if (OUT_DIR && host === 'claude') { + outputPath = path.join(OUT_DIR, skillDir, path.basename(tmplPath).replace(/\.tmpl$/, '')); + } + // Extract name/description: name drives external skill naming + setup symlinks // (and TemplateContext.skillName via buildContext); description feeds external // host metadata. When frontmatter name: differs from directory name (e.g. @@ -822,6 +861,9 @@ function processTemplate(tmplPath: string, host: Host = 'claude'): { outputPath: } } + // --out-dir: repoint section-base paths to the out-dir (no-op otherwise). + if (host === 'claude') content = rewriteSectionBase(content); + return { outputPath, content, symlinkLoop, catalogParts }; } @@ -860,6 +902,10 @@ function processSectionTemplate( // External hosts: rewrite cross-reference paths/tools (no frontmatter to transform). 
if (host !== 'claude') { content = applyHostRewrites(content, hostConfig); + } else { + // --out-dir: a section may cross-reference another section by absolute path; + // repoint those to the out-dir too (no-op when --out-dir is unset). + content = rewriteSectionBase(content); } // Plain generated header (no frontmatter to insert after). @@ -868,7 +914,7 @@ function processSectionTemplate( const fileName = path.basename(sectionTmplPath).replace(/\.tmpl$/, ''); let outputPath: string; if (host === 'claude') { - outputPath = path.join(ROOT, skillDir, 'sections', fileName); + outputPath = path.join(OUT_DIR || ROOT, skillDir, 'sections', fileName); } else { const externalName = externalSkillName(skillDir, parentName); outputPath = path.join(ROOT, hostConfig.hostSubdir, 'skills', externalName, 'sections', fileName); @@ -933,7 +979,7 @@ for (const currentHost of hostsToRun) { voice_line: catalogParts.voiceLine, }; } - const relOutput = path.relative(ROOT, outputPath); + const relOutput = path.relative(OUT_DIR || ROOT, outputPath); if (symlinkLoop) { console.log(`SKIPPED (symlink loop): ${relOutput}`); @@ -946,6 +992,9 @@ for (const currentHost of hostsToRun) { console.log(`FRESH: ${relOutput}`); } } else { + // In-place writes land in existing dirs; --out-dir needs the mirrored + // skill dir created first. + if (OUT_DIR) fs.mkdirSync(path.dirname(outputPath), { recursive: true }); fs.writeFileSync(outputPath, content); console.log(`GENERATED: ${relOutput}`); } @@ -982,7 +1031,7 @@ for (const currentHost of hostsToRun) { currentHostConfig.generation.skipSkills.includes(sec.skillDir)) continue; const { outputPath, content } = processSectionTemplate(path.join(ROOT, sec.tmpl), sec.skillDir, currentHost); - const relOutput = path.relative(ROOT, outputPath); + const relOutput = path.relative(OUT_DIR || ROOT, outputPath); if (DRY_RUN) { const existing = fs.existsSync(outputPath) ? 
fs.readFileSync(outputPath, 'utf-8') : ''; @@ -1079,7 +1128,9 @@ The orchestrator will persist the plan link to its own memory/knowledge store. // No timestamp field — keeps the file content-deterministic across runs so // CI dry-run freshness checks don't flap on regen. If a per-run timestamp // is ever needed for debugging, write it to a separate `.gen-stamp` file. - if (currentHost === 'claude' && CATALOG_MODE === 'trim' && Object.keys(proactiveAggregate).length > 0 && !DRY_RUN) { + // Skip the global proactive-suggestions.json in --out-dir mode: it lives at + // a repo path (scripts/) and the dev workspace render doesn't need it. + if (currentHost === 'claude' && CATALOG_MODE === 'trim' && Object.keys(proactiveAggregate).length > 0 && !DRY_RUN && !OUT_DIR) { const proactivePath = path.join(ROOT, 'scripts', 'proactive-suggestions.json'); // Sort keys alphabetically so the serialized JSON is identical across // machines regardless of filesystem-iteration order. Without this, CI diff --git a/setup b/setup index 0c180f7bf..ec1db22b7 100755 --- a/setup +++ b/setup @@ -1286,22 +1286,37 @@ fi DETECT_BIN="$SOURCE_GSTACK_DIR/bin/gstack-gbrain-detect" GBRAIN_STATE_DIR="${GSTACK_HOME:-$HOME/.gstack}" DETECTION_FILE="$GBRAIN_STATE_DIR/gbrain-detection.json" +# PID-unique tmp so concurrent setups (parallel Conductor workspaces) can't +# clobber each other's in-flight detection write. +DETECTION_TMP="$DETECTION_FILE.$$.tmp" mkdir -p "$GBRAIN_STATE_DIR" if [ -x "$DETECT_BIN" ]; then - if "$DETECT_BIN" > "$DETECTION_FILE.tmp" 2>/dev/null; then - mv "$DETECTION_FILE.tmp" "$DETECTION_FILE" - if grep -q '"gbrain_local_status": "ok"' "$DETECTION_FILE" 2>/dev/null; then - log "gbrain detected — regenerating Claude SKILL.md with brain-aware blocks (~250 token overhead per planning skill)..." 
- ( - cd "$SOURCE_GSTACK_DIR" - bun_cmd run gen:skill-docs:user --host claude 2>&1 | tail -3 - ) || log " warning: gen:skill-docs:user failed — run 'bun run gen:skill-docs:user' manually if you want brain-aware blocks" + if "$DETECT_BIN" > "$DETECTION_TMP" 2>/dev/null; then + mv "$DETECTION_TMP" "$DETECTION_FILE" + # Single source of truth for "is gbrain usable" — `--is-ok` runs live + # detection (exit 0 iff ok), so setup, bin/dev-setup, and gstack-config + # all gate on the same check instead of re-grepping the JSON. + if "$DETECT_BIN" --is-ok 2>/dev/null; then + if [ -n "${GSTACK_SKIP_GBRAIN_REGEN:-}" ]; then + # Dev/source tree (set by bin/dev-setup): never regenerate tracked + # SKILL.md in place — that dirties checked-in source. Detection is + # still persisted above; the dev workspace renders the :user variant + # into an untracked dir, and other projects get blocks via + # `gstack-config gbrain-refresh`. + log "gbrain detected — GSTACK_SKIP_GBRAIN_REGEN set: leaving tracked SKILL.md canonical (dev/source tree)." + else + log "gbrain detected — regenerating Claude SKILL.md with brain-aware blocks (~250 token overhead per planning skill)..." + ( + cd "$SOURCE_GSTACK_DIR" + bun_cmd run gen:skill-docs:user --host claude 2>&1 | tail -3 + ) || log " warning: gen:skill-docs:user failed — run 'bun run gen:skill-docs:user' manually if you want brain-aware blocks" + fi else log "gbrain not detected — brain-aware blocks suppressed in planning-skill SKILL.md files (zero token overhead)." log " To enable: install gbrain via /setup-gbrain, then re-run ./setup or 'gstack-config gbrain-refresh'." 
fi else - rm -f "$DETECTION_FILE.tmp" + rm -f "$DETECTION_TMP" log " warning: gstack-gbrain-detect failed — brain-aware blocks will stay suppressed" fi fi diff --git a/test/dev-setup-render-isolation.test.ts b/test/dev-setup-render-isolation.test.ts new file mode 100644 index 000000000..fbfeb790c --- /dev/null +++ b/test/dev-setup-render-isolation.test.ts @@ -0,0 +1,91 @@ +import { describe, test, expect } from 'bun:test'; +import * as path from 'path'; +import * as fs from 'fs'; + +// Static tripwires for the B2 render-isolation wiring. These fail CI if a +// refactor drops a load-bearing line, re-introducing the "dev-setup dirties +// tracked SKILL.md" drift (or worse, leaks the skip-guard into real installs). +const ROOT = path.resolve(import.meta.dir, '..'); +const read = (rel: string) => fs.readFileSync(path.join(ROOT, rel), 'utf-8'); + +describe('dev-setup: worktree stays canonical', () => { + const devSetup = read('bin/dev-setup'); + + test('passes GSTACK_SKIP_GBRAIN_REGEN inline on the nested setup call', () => { + expect(devSetup).toContain('GSTACK_SKIP_GBRAIN_REGEN=1 "$GSTACK_LINK/setup"'); + }); + + test('never exports GSTACK_SKIP_GBRAIN_REGEN (would leak into other setup paths)', () => { + expect(devSetup).not.toMatch(/export\s+GSTACK_SKIP_GBRAIN_REGEN/); + }); + + test('renders the :user variant into an out-dir, not in place', () => { + expect(devSetup).toContain('--out-dir'); + expect(devSetup).toContain('.claude/gstack-rendered'); + }); + + test('gates the render on gstack-gbrain-detect --is-ok', () => { + expect(devSetup).toContain('--is-ok'); + }); +}); + +describe('setup: honors GSTACK_SKIP_GBRAIN_REGEN', () => { + const setup = read('setup'); + + test('skips the in-place :user regen when the guard is set', () => { + expect(setup).toContain('${GSTACK_SKIP_GBRAIN_REGEN:-}'); + // The guard must wrap the in-place render, not the detection persist. 
+ const idx = setup.indexOf('GSTACK_SKIP_GBRAIN_REGEN'); + const after = setup.slice(idx, idx + 600); + expect(after).toContain('leaving tracked SKILL.md canonical'); + }); + + test('uses a PID-unique detection tmp (no concurrent clobber)', () => { + expect(setup).toContain('$DETECTION_FILE.$$.tmp'); + }); + + test('gates detection on the shared --is-ok check', () => { + expect(setup).toContain('"$DETECT_BIN" --is-ok'); + }); +}); + +describe('gen-skill-docs: section rewrite is gated on --out-dir', () => { + const gen = read('scripts/gen-skill-docs.ts'); + + test('rewriteSectionBase is a no-op without --out-dir', () => { + expect(gen).toContain('function rewriteSectionBase'); + const idx = gen.indexOf('function rewriteSectionBase'); + const body = gen.slice(idx, idx + 400); + expect(body).toContain('if (!OUT_DIR) return content'); + expect(body).toContain('sections'); // surgical: regex targets only /sections/ paths + }); +}); + +describe('dev-teardown: removes the untracked render', () => { + const teardown = read('bin/dev-teardown'); + + test('rm -rf the gstack-rendered dir', () => { + expect(teardown).toContain('gstack-rendered'); + expect(teardown).toMatch(/rm -rf .*RENDER_DIR/); + }); +}); + +describe('.gitignore: render dir is declared untracked', () => { + test('.claude/gstack-rendered/ is ignored', () => { + expect(read('.gitignore')).toContain('.claude/gstack-rendered/'); + }); +}); + +describe('dev-skill: refreshes the render on template change', () => { + const devSkill = read('scripts/dev-skill.ts'); + + test('re-renders the :user variant into the workspace render dir', () => { + expect(devSkill).toContain('gstack-rendered'); + expect(devSkill).toContain('--out-dir'); + expect(devSkill).toContain('--respect-detection'); + }); + + test('only refreshes when the render dir already exists (never creates it during plain dev)', () => { + expect(devSkill).toContain('fs.existsSync(RENDER_DIR)'); + }); +}); diff --git a/test/gbrain-detect-shape.test.ts 
b/test/gbrain-detect-shape.test.ts index 465e55623..e2f67ee07 100644 --- a/test/gbrain-detect-shape.test.ts +++ b/test/gbrain-detect-shape.test.ts @@ -16,7 +16,7 @@ */ import { describe, it, expect } from "bun:test"; -import { execFileSync } from "child_process"; +import { execFileSync, spawnSync } from "child_process"; import { mkdtempSync, mkdirSync, @@ -47,6 +47,16 @@ function runDetect(env: Partial): string { }); } +/** Run detect with --is-ok and return its exit code (never throws). */ +function runIsOk(env: Partial): number { + const r = spawnSync(BUN_BIN, ["run", DETECT_BIN, "--is-ok"], { + timeout: 15_000, + stdio: ["ignore", "pipe", "pipe"], + env: { ...process.env, ...env }, + }); + return r.status ?? 1; +} + interface DetectShape { gbrain_on_path: boolean; gbrain_version: string | null; @@ -244,3 +254,66 @@ exit 0 } }); }); + +describe("bin/gstack-gbrain-detect --is-ok — live gate", () => { + it("exits non-zero when gbrain is not on PATH (no-cli)", () => { + const tmp = mkdtempSync(join(tmpdir(), "detect-isok-")); + try { + const code = runIsOk({ + HOME: tmp, + PATH: "/usr/bin:/bin", // no gbrain + GSTACK_HOME: tmp, + GSTACK_DETECT_NO_CACHE: "1", + }); + expect(code).not.toBe(0); + } finally { + rmSync(tmp, { recursive: true, force: true }); + } + }); + + it("exits 0 when a fake gbrain reports a healthy engine (ok)", () => { + const tmp = mkdtempSync(join(tmpdir(), "detect-isok-")); + const bindir = join(tmp, "bin"); + const home = join(tmp, "home"); + const configDir = join(home, ".gbrain"); + try { + mkdirSync(bindir, { recursive: true }); + mkdirSync(configDir, { recursive: true }); + writeFileSync(join(configDir, "config.json"), JSON.stringify({ engine: "pglite" })); + const fake = `#!/bin/sh +case "$1 $2" in + "--version ") echo "gbrain 0.33.1.0"; exit 0 ;; + "sources list") echo '{"sources":[]}'; exit 0 ;; + "doctor "*) echo '{"status":"ok","checks":[]}'; exit 0 ;; +esac +exit 0 +`; + const gbrainPath = join(bindir, "gbrain"); + 
writeFileSync(gbrainPath, fake); + chmodSync(gbrainPath, 0o755); + + const code = runIsOk({ + HOME: home, + PATH: `${bindir}:/usr/bin:/bin`, + GSTACK_HOME: tmp, + GSTACK_DETECT_NO_CACHE: "1", + }); + expect(code).toBe(0); + } finally { + rmSync(tmp, { recursive: true, force: true }); + } + }); + + it("exit code agrees with the JSON gbrain_local_status (no skew)", () => { + // Run both surfaces against the same env and assert they never disagree. + const tmp = mkdtempSync(join(tmpdir(), "detect-isok-")); + try { + const env = { HOME: tmp, PATH: "/usr/bin:/bin", GSTACK_HOME: tmp, GSTACK_DETECT_NO_CACHE: "1" }; + const status = (JSON.parse(runDetect(env)) as DetectShape).gbrain_local_status; + const code = runIsOk(env); + expect(code === 0).toBe(status === "ok"); + } finally { + rmSync(tmp, { recursive: true, force: true }); + } + }); +}); diff --git a/test/gbrain-refresh-install-render.test.ts b/test/gbrain-refresh-install-render.test.ts new file mode 100644 index 000000000..f1494d47d --- /dev/null +++ b/test/gbrain-refresh-install-render.test.ts @@ -0,0 +1,60 @@ +import { describe, test, expect } from 'bun:test'; +import * as path from 'path'; +import * as fs from 'fs'; + +// Static tripwires for the C (machine-wide) render in `gstack-config +// gbrain-refresh`. The render mutates the shared global install, so the guards +// that stop it from touching the wrong directory are load-bearing — these fail +// CI if any guard is dropped. +const ROOT = path.resolve(import.meta.dir, '..'); +const SRC = fs.readFileSync(path.join(ROOT, 'bin', 'gstack-config'), 'utf-8'); + +// Pull out just the gbrain-refresh `ok)` branch so assertions can't be +// satisfied by unrelated text elsewhere in the file. 
+function okBranch(): string { + const start = SRC.indexOf('gbrain-refresh)'); + const ok = SRC.indexOf('ok)', start); + const end = SRC.indexOf(';;', ok); + if (start < 0 || ok < 0 || end < 0) throw new Error('Could not locate gbrain-refresh ok) branch'); + return SRC.slice(ok, end); +} + +describe('gstack-config gbrain-refresh: machine-wide render guards', () => { + const branch = okBranch(); + + test('targets the global install', () => { + expect(branch).toContain('$HOME/.claude/skills/gstack'); + }); + + test('refuses a symlinked install (would dirty a dev worktree)', () => { + expect(branch).toMatch(/\[ -L "\$INSTALL_DIR" \]/); + }); + + test('verifies it is a real gstack clone before mutating it', () => { + expect(branch).toContain('$INSTALL_DIR/VERSION'); + expect(branch).toContain('$INSTALL_DIR/package.json'); + }); + + test('requires bun on PATH', () => { + expect(branch).toContain('command -v bun'); + }); + + test('renders the :user variant in place into the install', () => { + expect(branch).toContain('gen:skill-docs:user --host claude'); + }); + + test('is self-documenting about the reset --hard / re-run cycle', () => { + expect(branch).toContain('reset --hard'); + expect(branch).toContain('gbrain-refresh'); + }); +}); + +describe('CLAUDE.md: deploy section documents the re-run', () => { + test('notes re-running gbrain-refresh after reset --hard', () => { + const claudeMd = fs.readFileSync(path.join(ROOT, 'CLAUDE.md'), 'utf-8'); + const idx = claudeMd.indexOf('## Deploying to the active skill'); + expect(idx).toBeGreaterThan(-1); + const section = claudeMd.slice(idx, idx + 1200); + expect(section).toContain('gbrain-refresh'); + }); +}); diff --git a/test/gen-skill-docs-out-dir.test.ts b/test/gen-skill-docs-out-dir.test.ts new file mode 100644 index 000000000..fbc1345e4 --- /dev/null +++ b/test/gen-skill-docs-out-dir.test.ts @@ -0,0 +1,84 @@ +import { describe, test, expect } from 'bun:test'; +import { spawnSync } from 'child_process'; +import { 
createHash } from 'crypto'; +import * as path from 'path'; +import * as fs from 'fs'; +import * as os from 'os'; + +const ROOT = path.resolve(import.meta.dir, '..'); + +// Render the gbrain `:user` variant into a temp out-dir, forcing detection ON +// via a crafted GSTACK_HOME so the test is deterministic regardless of whether +// the dev machine actually has gbrain installed. Asserts the B2 contract: +// (a) the worktree SKILL.md is byte-unchanged (source stays canonical), +// (b) the out-dir SKILL.md gained the inline Brain Context Load block, +// (c) its section refs point at the out-dir, not ~/.claude/skills/gstack, +// (d) bin/ refs are left pointing at the global install, +// (e) the out-dir section file gained the Save Results to Brain block. +describe('gen-skill-docs --out-dir (B2 render isolation)', () => { + function hashFile(p: string): string { + return createHash('sha256').update(fs.readFileSync(p)).digest('hex'); + } + + test('renders :user to out-dir, rewrites section paths, leaves worktree canonical', () => { + const tmpHome = fs.mkdtempSync(path.join(os.tmpdir(), 'gstack-home-')); + const outDir = fs.mkdtempSync(path.join(os.tmpdir(), 'gstack-out-')); + const worktreeSkill = path.join(ROOT, 'ship', 'SKILL.md'); + const beforeHash = hashFile(worktreeSkill); + try { + // Force gbrain detection ON for --respect-detection. 
+ fs.writeFileSync( + path.join(tmpHome, 'gbrain-detection.json'), + JSON.stringify({ gbrain_local_status: 'ok', gbrain_version: '9.9.9' }), + ); + + const res = spawnSync( + 'bun', + ['run', 'scripts/gen-skill-docs.ts', '--respect-detection', '--host', 'claude', '--out-dir', outDir], + { cwd: ROOT, encoding: 'utf-8', timeout: 120_000, env: { ...process.env, GSTACK_HOME: tmpHome } }, + ); + expect(res.status).toBe(0); + + const outSkill = path.join(outDir, 'ship', 'SKILL.md'); + const outSection = path.join(outDir, 'ship', 'sections', 'adversarial.md'); + expect(fs.existsSync(outSkill)).toBe(true); + const skillContent = fs.readFileSync(outSkill, 'utf-8'); + + // (a) worktree byte-unchanged + expect(hashFile(worktreeSkill)).toBe(beforeHash); + + // (b) inline block present in the rendered SKILL.md + expect(skillContent).toContain('Brain Context Load'); + + // (c) section refs repointed to the out-dir; none left pointing at the install + expect(skillContent).toContain(`${outDir}/ship/sections/`); + expect(skillContent).not.toContain('~/.claude/skills/gstack/ship/sections/'); + + // (d) bin refs are NOT rewritten — they still resolve to the global install + expect(skillContent).toContain('~/.claude/skills/gstack/bin/'); + + // (e) the SAVE block landed in the rendered section file + expect(fs.existsSync(outSection)).toBe(true); + expect(fs.readFileSync(outSection, 'utf-8')).toContain('Save Results to Brain'); + } finally { + fs.rmSync(tmpHome, { recursive: true, force: true }); + fs.rmSync(outDir, { recursive: true, force: true }); + } + }); + + test('global extras (proactive-suggestions.json) are NOT written in out-dir mode', () => { + const outDir = fs.mkdtempSync(path.join(os.tmpdir(), 'gstack-out-')); + try { + const res = spawnSync( + 'bun', + ['run', 'scripts/gen-skill-docs.ts', '--host', 'claude', '--out-dir', outDir], + { cwd: ROOT, encoding: 'utf-8', timeout: 120_000 }, + ); + expect(res.status).toBe(0); + // proactive-suggestions.json lives at a repo path; 
out-dir mode must skip it. + expect(fs.existsSync(path.join(outDir, 'scripts', 'proactive-suggestions.json'))).toBe(false); + } finally { + fs.rmSync(outDir, { recursive: true, force: true }); + } + }); +}); From a5833c413f98b13f105beac96262e8098b628461 Mon Sep 17 00:00:00 2001 From: Garry Tan Date: Wed, 10 Jun 2026 21:14:58 -0700 Subject: [PATCH 2/4] v1.57.10.0 feat: Codex review default-on across review/ship/plan/docs (#1966) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * feat(config): make codex_reviews the master switch for all Codex review Broaden the codex_reviews doc to describe it governing /review, /ship, /document-release, plan reviews, and /autoplan. Reject invalid values on set (preserving the existing value) so a typo can never silently flip paid Codex calls on or off. Co-Authored-By: Claude Opus 4.8 (1M context) * feat(review): Codex review default-on across review/ship/plan/docs Add a shared codexPreflight() helper (constants.ts) that, in one bash block, reads codex_reviews, sources gstack-codex-probe, checks install + auth, and echoes a single canonical mode (ready/not_installed/not_authed/ disabled). All Codex resolvers route through it. - generateCodexPlanReview: opt-in question removed; the outside voice now runs automatically (default-on), falling back to a Claude subagent when Codex is missing/unauthed. Cross-model tension still gates on user approval (sovereignty preserved). - generateAdversarialStep: probe-based availability (install AND auth), distinct not-installed vs not-authed guidance; 200-line structured-review threshold unchanged. - generateCodexDocReview (new, wired via CODEX_DOC_REVIEW): reviews the release's docs against the shipped diff range, informational + an explicit apply-fixes decision point, never auto-edits. - autoplan Phase 0.5 now honors codex_reviews=disabled so the switch is truly global. 
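
For illustration, the mode precedence described above can be sketched as a pure function. This is a minimal sketch, not the helper itself: the real codexPreflight() emits a bash block, and the names CodexMode, ProbeResult, and resolveCodexMode are hypothetical. The ordering (config switch first, then install, then auth) is assumed from the preflight block this patch renders into the skill docs.

```typescript
// Hypothetical sketch: CodexMode, ProbeResult, and resolveCodexMode are
// illustrative names and do not appear in the patch.
type CodexMode = "ready" | "not_installed" | "not_authed" | "disabled";

interface ProbeResult {
  installed: boolean; // stands in for `command -v codex` succeeding
  authed: boolean;    // stands in for the sourced auth probe succeeding
}

function resolveCodexMode(configValue: string, probe: ProbeResult): CodexMode {
  // Master switch first: a disabled config wins before any probing.
  if (configValue === "disabled") return "disabled";
  // Install is checked before auth so the two failures get distinct guidance.
  if (!probe.installed) return "not_installed";
  if (!probe.authed) return "not_authed";
  return "ready";
}

// Example: Codex is installed but has no credentials.
const mode: CodexMode = resolveCodexMode("enabled", { installed: true, authed: false });
console.log(`CODEX_MODE: ${mode}`);
```

Checking the switch before either probe means a disabled config never touches the Codex CLI, and keeping install and auth as separate modes lets each call site print a specific reason when it falls back.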
Co-Authored-By: Claude Opus 4.8 (1M context) * chore(docs): regenerate SKILL docs + refresh ship golden Output of gen:skill-docs for the Codex-default-on resolver/template changes. Refreshes the factory-ship golden fixture (codex-host output unchanged — resolvers strip for the codex host). Co-Authored-By: Claude Opus 4.8 (1M context) * test(infra): widen size-budget guards for default-on Codex outside-voice The codexPreflight() block + CODEX_MODE branch prose (replacing the smaller opt-in question) grows plan-ceo/eng/devex-review and review by 5-7% over baseline. Each bump carries a comment justifying it as intentional capability, not slop. Co-Authored-By: Claude Opus 4.8 (1M context) * test: guard Codex default-on + config reject-on-set skill-validation: assert plan reviews no longer carry the opt-in question and render the default-on outside-voice, document-release carries the doc review, and the codex host strips all of it. gstack-config: codex_reviews defaults to enabled, accepts enabled/disabled, and rejects an invalid value while preserving the existing one. Co-Authored-By: Claude Opus 4.8 (1M context) * fix(test): align gstack-config tests with defaults-fallback behavior Three tests (last touched v0.13.7.0) asserted get/list print empty for unset keys, but gstack-config falls back to the documented defaults table (get returns the default, list shows the active-values block). Update the assertions to the real behavior and split out an unknown-key case that does still return empty. Pre-existing red, unrelated to codex review. Co-Authored-By: Claude Opus 4.8 (1M context) * v1.57.10.0 feat: Codex review default-on across review/ship/plan/docs Codex cross-model review now runs by default on /review, /ship, all four plan reviews, /document-release, and /autoplan, governed by one master switch (codex_reviews, default enabled). 
Plan-review outside voice is default-on; /document-release gets a new Codex doc-vs-diff audit; every call site detects install AND auth and falls back to a Claude subagent with a clear reason. Disable everything with: gstack-config set codex_reviews disabled Co-Authored-By: Claude Opus 4.8 (1M context) --------- Co-authored-by: Claude Opus 4.8 (1M context) --- CHANGELOG.md | 86 +++++++++ VERSION | 2 +- autoplan/SKILL.md | 8 +- autoplan/SKILL.md.tmpl | 8 +- bin/gstack-config | 18 +- browse/test/gstack-config.test.ts | 42 ++++- document-release/sections/release-body.md | 117 ++++++++++++ .../sections/release-body.md.tmpl | 4 + package.json | 2 +- plan-ceo-review/sections/review-sections.md | 72 ++++--- plan-devex-review/sections/review-sections.md | 72 ++++--- plan-eng-review/sections/review-sections.md | 72 ++++--- review/SKILL.md | 46 +++-- scripts/resolvers/constants.ts | 58 ++++++ scripts/resolvers/index.ts | 3 +- scripts/resolvers/review.ts | 178 +++++++++++++----- ship/sections/adversarial.md | 46 +++-- test/fixtures/golden/factory-ship-SKILL.md | 46 +++-- test/helpers/carve-guards.ts | 28 ++- test/helpers/parity-harness.ts | 6 +- test/skill-validation.test.ts | 48 ++++- 21 files changed, 766 insertions(+), 196 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 3f8cffae1..503433f11 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,5 +1,91 @@ # Changelog +## [1.57.10.0] - 2026-06-10 + +## **Codex review now runs by default everywhere it matters.** +## **One switch governs it, and it falls back to Claude when Codex is missing or unauthed.** + +Codex cross-model review used to be inconsistent. `/review` and `/ship` ran it +automatically, but plan reviews hid it behind a "Want an outside voice?" question +you had to say yes to every time, `/document-release` never ran it at all, and every +entry point only checked whether the `codex` binary existed, not whether it was +logged in. 
Now `codex_reviews` is one master switch (default `enabled`) that governs +Codex review across `/review`, `/ship`, `/plan-ceo-review`, `/plan-eng-review`, +`/plan-design-review`, `/plan-devex-review`, `/document-release`, and `/autoplan`. +The plan-review outside voice runs automatically. `/document-release` gets a new +Codex pass that checks your docs against what actually shipped. Every call site now +detects install AND auth separately, and degrades to a Claude subagent with a clear +one-line reason instead of silently skipping. Turn the whole thing off with one +command: `gstack-config set codex_reviews disabled`. + +### The numbers that matter + +Verified by the gate-tier E2E evals that exercise these exact paths +(`codex-offered-ceo-review`, `codex-offered-eng-review`, `document-release`, +`codex-review-findings`), all green this run. + +| Metric | Before | After | Δ | +|--------|--------|-------|---| +| Skills where Codex review runs by default | 2 | 8 | +6 | +| Prompts to get a plan-review outside voice | 1 (opt-in each time) | 0 (automatic) | -1 | +| Codex readiness detection | install only | install + auth | sharper | +| Master switches to disable it all | 0 (per-skill only) | 1 (`codex_reviews`) | +1 | +| `/document-release` Codex doc audit | none | doc-vs-diff pass | new | + +When Codex is installed but not logged in, you used to get nothing on the paths that +checked only `command -v codex`. Now you get a named reason ("Codex installed but not +authenticated, using Claude subagent") and the review still happens. A typo on the +switch (`gstack-config set codex_reviews disabledd`) is rejected and your existing +setting is preserved, so a fat-finger can never silently turn paid Codex calls on or +off. + +### What this means for you + +If you run gstack day to day, you stop deciding whether to get a second model's eyes +on every plan and every release. It is just there, on by default, the way the strong +reviewers already worked on diffs. 
If you do not have Codex set up, nothing breaks: +you get the Claude outside voice instead, with a one-line note telling you how to add +Codex for true cross-model coverage. If you want it gone, one command turns off all +eight surfaces at once. + +### Itemized changes + +#### Added +- **`codex_reviews` as the master switch** for Codex review across `/review`, `/ship`, + `/document-release`, all four plan reviews, and `/autoplan` (`bin/gstack-config`). + Default `enabled`. Invalid values on `set` are rejected with the existing value + preserved, so a typo cannot flip paid Codex calls. +- **`/document-release` Codex doc audit** (`generateCodexDocReview`): reviews the + docs you touched against the release diff for stale claims, undocumented new + surface, and over/under-sold CHANGELOG entries. Informational, with an explicit + apply-fixes decision point. Never auto-edits docs. +- **`codexPreflight()` shared helper** (`scripts/resolvers/constants.ts`): one + self-contained bash block that reads the switch, sources the probe, checks install + and auth, and emits a single canonical mode (`ready` / `not_installed` / + `not_authed` / `disabled`). + +#### Changed +- **Plan-review outside voice is default-on**, not opt-in. The "Want an outside + voice?" question is gone; it runs automatically and falls back to a Claude subagent + when Codex is unavailable. Incorporating its findings still requires your explicit + approval (cross-model tension is presented, never auto-applied). +- **Adversarial review detects auth, not just install** (`generateAdversarialStep`): + distinct "not installed" vs "not authenticated" guidance. The 200-line threshold + for the heavier structured `codex review` is unchanged. +- **`/autoplan` honors `codex_reviews=disabled`** in its Phase 0.5 preflight, so the + switch is truly global. + +#### Fixed +- Three `gstack-config` tests asserted `get`/`list` print empty for unset keys; the + tool falls back to the documented defaults table. 
Assertions now match real behavior. + +#### For contributors +- Size-budget guards widened for the default-on outside-voice prose, each with a + rationale comment (`test/helpers/carve-guards.ts`, `test/helpers/parity-harness.ts`). +- Static guards added: plan reviews must not carry the opt-in question and must render + the default-on voice; `/document-release` must carry the doc review; the codex host + strips all of it (`test/skill-validation.test.ts`). + ## [1.57.9.0] - 2026-06-09 ## **Your gstack checkout stays clean when gbrain is installed.** diff --git a/VERSION b/VERSION index 10521b5e0..e535f0937 100644 --- a/VERSION +++ b/VERSION @@ -1 +1 @@ -1.57.9.0 +1.57.10.0 diff --git a/autoplan/SKILL.md b/autoplan/SKILL.md index bd372a4c3..de7174b03 100644 --- a/autoplan/SKILL.md +++ b/autoplan/SKILL.md @@ -1065,11 +1065,17 @@ workflow. ```bash _TEL=$(~/.claude/skills/gstack/bin/gstack-config get telemetry 2>/dev/null || echo off) +_CODEX_CFG=$(~/.claude/skills/gstack/bin/gstack-config get codex_reviews 2>/dev/null || echo enabled) source ~/.claude/skills/gstack/bin/gstack-codex-probe +# Master switch first: codex_reviews=disabled turns off ALL Codex work globally, +# including autoplan's own dual-voice orchestration. Honor it before probing. +if [ "$_CODEX_CFG" = "disabled" ]; then + echo "[codex disabled by config — Claude-only voices] Re-enable: gstack-config set codex_reviews enabled" + _CODEX_AVAILABLE=false # Check Codex binary. If missing, tag the degradation matrix and continue # with Claude subagent only (autoplan's existing degradation fallback). -if ! command -v codex >/dev/null 2>&1; then +elif ! 
command -v codex >/dev/null 2>&1; then _gstack_codex_log_event "codex_cli_missing" echo "[codex-unavailable: binary not found] — proceeding with Claude subagent only" _CODEX_AVAILABLE=false diff --git a/autoplan/SKILL.md.tmpl b/autoplan/SKILL.md.tmpl index 2e67eb9e1..b2eaca9fd 100644 --- a/autoplan/SKILL.md.tmpl +++ b/autoplan/SKILL.md.tmpl @@ -243,11 +243,17 @@ workflow. ```bash _TEL=$(~/.claude/skills/gstack/bin/gstack-config get telemetry 2>/dev/null || echo off) +_CODEX_CFG=$(~/.claude/skills/gstack/bin/gstack-config get codex_reviews 2>/dev/null || echo enabled) source ~/.claude/skills/gstack/bin/gstack-codex-probe +# Master switch first: codex_reviews=disabled turns off ALL Codex work globally, +# including autoplan's own dual-voice orchestration. Honor it before probing. +if [ "$_CODEX_CFG" = "disabled" ]; then + echo "[codex disabled by config — Claude-only voices] Re-enable: gstack-config set codex_reviews enabled" + _CODEX_AVAILABLE=false # Check Codex binary. If missing, tag the degradation matrix and continue # with Claude subagent only (autoplan's existing degradation fallback). -if ! command -v codex >/dev/null 2>&1; then +elif ! command -v codex >/dev/null 2>&1; then _gstack_codex_log_event "codex_cli_missing" echo "[codex-unavailable: binary not found] — proceeding with Claude subagent only" _CODEX_AVAILABLE=false diff --git a/bin/gstack-config b/bin/gstack-config index ec465b281..5442183b3 100755 --- a/bin/gstack-config +++ b/bin/gstack-config @@ -86,7 +86,16 @@ CONFIG_HEADER='# gstack configuration — edit freely, changes take effect on ne # # --no-plan-tune-hooks, or env GSTACK_PLAN_TUNE_HOOKS. # # ─── Advanced ──────────────────────────────────────────────────────── -# codex_reviews: enabled # disabled = skip Codex adversarial reviews in /ship +# codex_reviews: enabled # Master switch for Codex cross-model review. 
enabled = +# # Codex runs as a standard step in /review, /ship, +# # /document-release, plan reviews, and /autoplan (auto +# # falls back to a Claude subagent if Codex is missing or +# # not authenticated). disabled = skip all Codex passes. +# # Asymmetry on disabled: diff-review (/review, /ship) still +# # runs the free Claude adversarial subagent; plan-review and +# # /document-release skip the outside-voice step entirely. +# # An invalid value is REJECTED (existing value preserved) so +# # a typo cannot silently turn paid Codex calls on or off. # gstack_contributor: false # true = file field reports when gstack misbehaves # skip_eng_review: false # true = skip eng review gate in /ship (not recommended) # @@ -302,6 +311,13 @@ case "${1:-}" in echo "Warning: plan_tune_hooks '$VALUE' not recognized. Valid values: prompt, yes, no. Using prompt." >&2 VALUE="prompt" fi + # codex_reviews controls PAID Codex calls. Unlike the warn-and-default keys above, + # an invalid value is REJECTED and the existing setting is left unchanged — a typo + # must never silently flip the switch and turn paid Codex calls on or off. + if [ "$KEY" = "codex_reviews" ] && [ "$VALUE" != "enabled" ] && [ "$VALUE" != "disabled" ]; then + echo "Error: codex_reviews '$VALUE' not recognized. Valid values: enabled, disabled. Existing value left unchanged." >&2 + exit 1 + fi mkdir -p "$STATE_DIR" # Write annotated header on first creation if [ ! -f "$CONFIG_FILE" ]; then diff --git a/browse/test/gstack-config.test.ts b/browse/test/gstack-config.test.ts index a00af6096..e342a50b7 100644 --- a/browse/test/gstack-config.test.ts +++ b/browse/test/gstack-config.test.ts @@ -41,9 +41,16 @@ afterEach(() => { describe('gstack-config', () => { // ─── get ────────────────────────────────────────────────── - test('get on missing file returns empty, exit 0', () => { + test('get on missing file returns the default, exit 0', () => { + // auto_upgrade has a default of false; get falls back to the defaults table. 
const { exitCode, stdout } = run(['get', 'auto_upgrade']); expect(exitCode).toBe(0); + expect(stdout).toBe('false'); + }); + + test('get unknown key on missing file returns empty, exit 0', () => { + const { exitCode, stdout } = run(['get', 'some_unknown_key']); + expect(exitCode).toBe(0); expect(stdout).toBe(''); }); @@ -110,10 +117,12 @@ describe('gstack-config', () => { expect(stdout).toContain('update_check: false'); }); - test('list on missing file returns empty, exit 0', () => { + test('list on missing file shows defaults, exit 0', () => { + // list prints the active-values block with defaults for unset keys. const { exitCode, stdout } = run(['list']); expect(exitCode).toBe(0); - expect(stdout).toBe(''); + expect(stdout).toContain('proactive:'); + expect(stdout).toContain('(default)'); }); // ─── usage ──────────────────────────────────────────────── @@ -151,6 +160,29 @@ describe('gstack-config', () => { expect(content).toContain('skip_eng_review:'); }); + // ─── codex_reviews (paid-calls switch: reject-on-set, preserve existing) ── + test('codex_reviews defaults to enabled', () => { + const { exitCode, stdout } = run(['get', 'codex_reviews']); + expect(exitCode).toBe(0); + expect(stdout).toBe('enabled'); + }); + + test('codex_reviews accepts enabled and disabled', () => { + expect(run(['set', 'codex_reviews', 'disabled']).exitCode).toBe(0); + expect(run(['get', 'codex_reviews']).stdout).toBe('disabled'); + expect(run(['set', 'codex_reviews', 'enabled']).exitCode).toBe(0); + expect(run(['get', 'codex_reviews']).stdout).toBe('enabled'); + }); + + test('codex_reviews rejects an invalid value and preserves the existing one', () => { + run(['set', 'codex_reviews', 'disabled']); + const { exitCode, stderr } = run(['set', 'codex_reviews', 'disabledd']); + expect(exitCode).not.toBe(0); // rejected, not warn-and-default + expect(stderr).toContain('not recognized'); + // existing value must be untouched — a typo never silently flips paid Codex on/off + 
expect(run(['get', 'codex_reviews']).stdout).toBe('disabled'); + }); + test('header written only once, not duplicated on second set', () => { run(['set', 'foo', 'bar']); run(['set', 'baz', 'qux']); @@ -176,9 +208,9 @@ describe('gstack-config', () => { }); // ─── routing_declined ────────────────────────────────────── - test('routing_declined defaults to empty (not set)', () => { + test('routing_declined defaults to false (not set)', () => { const { stdout } = run(['get', 'routing_declined']); - expect(stdout).toBe(''); + expect(stdout).toBe('false'); }); test('routing_declined can be set and read', () => { diff --git a/document-release/sections/release-body.md b/document-release/sections/release-body.md index f391eb8a3..0f05f13b5 100644 --- a/document-release/sections/release-body.md +++ b/document-release/sections/release-body.md @@ -358,3 +358,120 @@ Diagram drift: ``` If all coverage is complete and no diagrams drifted, output: "Coverage: all shipped features have adequate documentation." + +--- + +## Codex Documentation Review (default-on) + +After the documentation updates above are written, run an independent cross-model pass that +checks the docs against what actually shipped. This is a standard part of /document-release, +not an opt-in. The user turns it off only by asking explicitly +(`gstack-config set codex_reviews disabled`). + +**Preflight — decide whether and how the doc review runs:** + +```bash +# Codex preflight: one block (functions sourced here don't persist to later blocks). +_TEL=$(~/.claude/skills/gstack/bin/gstack-config get telemetry 2>/dev/null || echo off) +_CODEX_CFG=$(~/.claude/skills/gstack/bin/gstack-config get codex_reviews 2>/dev/null || echo enabled) +source ~/.claude/skills/gstack/bin/gstack-codex-probe 2>/dev/null || true +if [ "$_CODEX_CFG" = "disabled" ]; then + _CODEX_MODE="disabled" +elif ! 
command -v codex >/dev/null 2>&1; then + _CODEX_MODE="not_installed"; _gstack_codex_log_event "codex_cli_missing" 2>/dev/null || true +elif ! _gstack_codex_auth_probe >/dev/null 2>&1; then + _CODEX_MODE="not_authed"; _gstack_codex_log_event "codex_auth_failed" 2>/dev/null || true +else + _CODEX_MODE="ready"; _gstack_codex_version_check 2>/dev/null || true +fi +echo "CODEX_MODE: $_CODEX_MODE" +``` + +Branch on the echoed `CODEX_MODE`: +- **`disabled`** — the user turned Codex reviews off (`codex_reviews=disabled`). Skip this section entirely; do NOT fall back to a Claude subagent — disabled means no extra review step. Print: "Codex review skipped (codex_reviews disabled). Re-enable: `gstack-config set codex_reviews enabled`." +- **`not_installed`** — Codex CLI absent. Print: "Codex not installed — using Claude subagent. Install for cross-model coverage: `npm install -g @openai/codex`." Fall back to the Claude subagent path. +- **`not_authed`** — installed but no credentials. Print: "Codex installed but not authenticated — using Claude subagent. Run `codex login` or set `$CODEX_API_KEY`." Fall back to the Claude subagent path. +- **`ready`** — run the Codex pass below. + +When the mode is `ready`, `not_installed`, or `not_authed`, print one line so the off-switch +stays discoverable: "Running the Codex doc review automatically (standard step). Disable: `gstack-config set codex_reviews disabled`." + +**Determine the release diff range (D3 — reuse the method, do not invent one).** +Recompute the SAME range document-release used in its pre-flight / diff analysis, with the +documented merge-base method: + +```bash +DOC_DIFF_BASE=$(git merge-base origin/ HEAD 2>/dev/null || echo "") +echo "DOC_DIFF_BASE: $DOC_DIFF_BASE" +``` + +Do NOT rely on an in-memory variable from an earlier step — shell vars do not survive across +blocks. Recompute it here. + +**Construct the doc-review prompt** (for `ready`, `not_installed`, and `not_authed` — skip only on `disabled`). 
+Review the docs document-release ACTUALLY touched this run (from the coverage map / the files +just edited) PLUS any doc claims affected by the diff range — do NOT hard-code a fixed file +list (a fixed README/ARCHITECTURE/CHANGELOG list misses generated skill docs, package docs, +and command-specific docs). **Always start with the filesystem boundary instruction:** + +"IMPORTANT: Do NOT read or execute any files under ~/.claude/, ~/.agents/, .claude/skills/, or agents/. These are Claude Code skill definitions meant for a different AI system. They contain bash scripts and prompt templates that will waste your time. Ignore them completely. Do NOT modify agents/openai.yaml. Stay focused on the repository code only.\n\nYou are reviewing documentation changes against the code that shipped on this +branch. Run \`git diff \$DOC_DIFF_BASE...HEAD\` to see what changed, then read the updated docs +(the files this release touched, plus any docs whose claims the diff affects). Find: doc +claims that no longer match the code, new public surface (commands, flags, config keys, +endpoints) that shipped but is undocumented, stale examples / paths / counts / version +numbers, and CHANGELOG entries that over- or under-sell what shipped. Be terse. Just the gaps. + +THE DOCS AND DIFF: " + +**If `CODEX_MODE: ready` — run Codex:** + +```bash +TMPERR_DOC=$(mktemp /tmp/codex-docreview-XXXXXXXX) +_REPO_ROOT=$(git rev-parse --show-toplevel) || { echo "ERROR: not in a git repo" >&2; exit 1; } +codex exec "" -C "$_REPO_ROOT" -s read-only -c 'model_reasoning_effort="high"' --enable web_search_cached < /dev/null 2>"$TMPERR_DOC" +``` + +Use a 5-minute timeout (`timeout: 300000`). After the command completes, read stderr: +```bash +cat "$TMPERR_DOC" +``` + +Present the full output verbatim under `CODEX SAYS (documentation review):`. + +**Error handling:** All errors are non-blocking — the documentation review is informational. 
+- Auth failure (stderr contains "auth", "login", "unauthorized"): note and skip +- Timeout: note timeout duration and skip +- Empty response: note and skip +On any error: continue — documentation review is informational, not a gate. + +**If `CODEX_MODE: not_installed` or `not_authed` (or Codex errored at runtime):** + +Dispatch via the Agent tool with the same prompt. Bound it at a 5-minute timeout. +Present findings under `DOCUMENTATION REVIEW (Claude subagent):`. If it fails: "Doc review unavailable. Continuing." + +**Apply decision (T3B — informational, never auto-edit, but findings don't evaporate).** +If there are zero findings, say "Docs match what shipped — no gaps." and continue. Otherwise +present the findings, then use AskUserQuestion ONCE: + +> "The doc review found N gaps between the docs and what shipped. How do you want to handle them?" +> +> RECOMMENDATION: Choose A if the gaps are concrete doc fixes (stale path, missing flag). The +> doc review only reports; nothing is edited without your say-so. Completeness: A=9/10, B=4/10, C=8/10. + +Options: +- A) Apply all the doc fixes now +- B) Skip — leave docs as-is +- C) Decide per-finding + +On A or per-finding approvals, make the approved edits yourself (the tool never silently +rewrites docs). On B, note the gaps in the output so they're visible. + +**Persist the result:** +```bash +~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"codex-doc-review","timestamp":"'"$(date -u +%Y-%m-%dT%H:%M:%SZ)"'","status":"STATUS","source":"SOURCE","commit":"'"$(git rev-parse --short HEAD)"'"}' +``` +Substitute: STATUS = "clean" if no gaps, "issues_found" if gaps exist. SOURCE = "codex" if Codex ran, "claude" if the subagent ran. + +**Cleanup:** Run `rm -f "$TMPERR_DOC"` after processing (if Codex was used). 
+ +--- diff --git a/document-release/sections/release-body.md.tmpl b/document-release/sections/release-body.md.tmpl index ea5a54524..475e8b258 100644 --- a/document-release/sections/release-body.md.tmpl +++ b/document-release/sections/release-body.md.tmpl @@ -356,3 +356,7 @@ Diagram drift: ``` If all coverage is complete and no diagrams drifted, output: "Coverage: all shipped features have adequate documentation." + +--- + +{{CODEX_DOC_REVIEW}} diff --git a/package.json b/package.json index 53da1d736..86801282f 100644 --- a/package.json +++ b/package.json @@ -1,6 +1,6 @@ { "name": "gstack", - "version": "1.57.9.0", + "version": "1.57.10.0", "description": "Garry's Stack — Claude Code skills + fast headless browser. One repo, one install, entire AI engineering workflow.", "license": "MIT", "type": "module", diff --git a/plan-ceo-review/sections/review-sections.md b/plan-ceo-review/sections/review-sections.md index 517125b39..71bae4d93 100644 --- a/plan-ceo-review/sections/review-sections.md +++ b/plan-ceo-review/sections/review-sections.md @@ -253,38 +253,46 @@ If this plan has significant UI scope, recommend: "Consider running /plan-design **STOP.** AskUserQuestion once per issue. Do NOT batch. Recommend + WHY. If this section turned up zero findings, state "No issues, moving on" and proceed. If the section has findings, you MUST call AskUserQuestion as a tool_use — a finding with an "obvious fix" is still a finding and still needs user approval before any change lands in the plan. Do NOT proceed until the user responds. **Reminder: Do NOT make any code changes. Review only.** -## Outside Voice — Independent Plan Challenge (optional, recommended) +## Outside Voice — Independent Plan Challenge (default-on) -After all review sections are complete, offer an independent second opinion from a -different AI system. Two models agreeing on a plan is stronger signal than one model's -thorough review. 
+After all review sections are complete, run an independent second opinion from a +different AI system automatically — it is a standard part of plan review, not an +opt-in. Two models agreeing on a plan is stronger signal than one model's thorough +review. The user turns this off only by asking explicitly +(`gstack-config set codex_reviews disabled`). -**Check tool availability:** +**Preflight — decide whether and how the outside voice runs:** ```bash -command -v codex >/dev/null 2>&1 && echo "CODEX_AVAILABLE" || echo "CODEX_NOT_AVAILABLE" +# Codex preflight: one block (functions sourced here don't persist to later blocks). +_TEL=$(~/.claude/skills/gstack/bin/gstack-config get telemetry 2>/dev/null || echo off) +_CODEX_CFG=$(~/.claude/skills/gstack/bin/gstack-config get codex_reviews 2>/dev/null || echo enabled) +source ~/.claude/skills/gstack/bin/gstack-codex-probe 2>/dev/null || true +if [ "$_CODEX_CFG" = "disabled" ]; then + _CODEX_MODE="disabled" +elif ! command -v codex >/dev/null 2>&1; then + _CODEX_MODE="not_installed"; _gstack_codex_log_event "codex_cli_missing" 2>/dev/null || true +elif ! _gstack_codex_auth_probe >/dev/null 2>&1; then + _CODEX_MODE="not_authed"; _gstack_codex_log_event "codex_auth_failed" 2>/dev/null || true +else + _CODEX_MODE="ready"; _gstack_codex_version_check 2>/dev/null || true +fi +echo "CODEX_MODE: $_CODEX_MODE" ``` -Use AskUserQuestion: +Branch on the echoed `CODEX_MODE`: +- **`disabled`** — the user turned Codex reviews off (`codex_reviews=disabled`). Skip this section entirely; do NOT fall back to a Claude subagent — disabled means no extra review step. Print: "Codex review skipped (codex_reviews disabled). Re-enable: `gstack-config set codex_reviews enabled`." +- **`not_installed`** — Codex CLI absent. Print: "Codex not installed — using Claude subagent. Install for cross-model coverage: `npm install -g @openai/codex`." Fall back to the Claude subagent path. +- **`not_authed`** — installed but no credentials. 
Print: "Codex installed but not authenticated — using Claude subagent. Run `codex login` or set `$CODEX_API_KEY`." Fall back to the Claude subagent path. +- **`ready`** — run the Codex pass below. -> "All review sections are complete. Want an outside voice? A different AI system can -> give a brutally honest, independent challenge of this plan — logical gaps, feasibility -> risks, and blind spots that are hard to catch from inside the review. Takes about 2 -> minutes." -> -> RECOMMENDATION: Choose A — an independent second opinion catches structural blind -> spots. Two different AI models agreeing on a plan is stronger signal than one model's -> thorough review. Completeness: A=9/10, B=7/10. +When the mode is `ready`, `not_installed`, or `not_authed`, print one line so the off-switch +stays discoverable: "Running the outside voice automatically (standard step). Disable: `gstack-config set codex_reviews disabled`." -Options: -- A) Get the outside voice (recommended) -- B) Skip — proceed to outputs - -**If B:** Print "Skipping outside voice." and continue to the next section. - -**If A:** Construct the plan review prompt. Read the plan file being reviewed (the file -the user pointed this review at, or the branch diff scope). If a CEO plan document -was written in Step 0D-POST, read that too — it contains the scope decisions and vision. +**Construct the plan review prompt** (for `ready`, `not_installed`, and `not_authed` — skip only on `disabled`). +Read the plan file being reviewed (the file the user pointed this review at, or the branch +diff scope). If a CEO plan document was written in Step 0D-POST, read that too — it contains +the scope decisions and vision. Construct this prompt (substitute the actual plan content — if plan content exceeds 30KB, truncate to the first 30KB and note "Plan truncated for size"). **Always start with the @@ -302,7 +310,7 @@ compliments. Just the problems. 
THE PLAN: " -**If CODEX_AVAILABLE:** +**If `CODEX_MODE: ready` — run Codex:** ```bash TMPERR_PV=$(mktemp /tmp/codex-planreview-XXXXXXXX) @@ -325,15 +333,15 @@ CODEX SAYS (plan review — outside voice): ``` **Error handling:** All errors are non-blocking — the outside voice is informational. -- Auth failure (stderr contains "auth", "login", "unauthorized"): "Codex auth failed. Run \`codex login\` to authenticate." -- Timeout: "Codex timed out after 5 minutes." -- Empty response: "Codex returned no response." +- Auth failure (stderr contains "auth", "login", "unauthorized"): "Codex auth failed. Run \`codex login\` to authenticate." Fall back to the Claude subagent below. +- Timeout: "Codex timed out after 5 minutes." Fall back to the Claude subagent below. +- Empty response: "Codex returned no response." Fall back to the Claude subagent below. -On any Codex error, fall back to the Claude adversarial subagent. - -**If CODEX_NOT_AVAILABLE (or Codex errored):** +**If `CODEX_MODE: not_installed` or `not_authed` (or Codex errored at runtime):** Dispatch via the Agent tool. The subagent has fresh context — genuine independence. +Bound it the same way as Codex: cap the dispatch at a 5-minute timeout so "never blocking" +is also "never hanging." Subagent prompt: same plan review prompt as above. @@ -341,6 +349,8 @@ Present findings under an `OUTSIDE VOICE (Claude subagent):` header. If the subagent fails or times out: "Outside voice unavailable. Continuing to outputs." +(On `CODEX_MODE: disabled` you already skipped this section per the preflight — do not reach here.) + **Cross-model tension:** After presenting the outside voice findings, note any points where the outside voice diff --git a/plan-devex-review/sections/review-sections.md b/plan-devex-review/sections/review-sections.md index db1be2a96..e4ce30a95 100644 --- a/plan-devex-review/sections/review-sections.md +++ b/plan-devex-review/sections/review-sections.md @@ -239,38 +239,46 @@ Check each item. 
For any unchecked item, explain what's missing and suggest the **STOP.** AskUserQuestion for any item that requires a design decision. -## Outside Voice — Independent Plan Challenge (optional, recommended) +## Outside Voice — Independent Plan Challenge (default-on) -After all review sections are complete, offer an independent second opinion from a -different AI system. Two models agreeing on a plan is stronger signal than one model's -thorough review. +After all review sections are complete, run an independent second opinion from a +different AI system automatically — it is a standard part of plan review, not an +opt-in. Two models agreeing on a plan is stronger signal than one model's thorough +review. The user turns this off only by asking explicitly +(`gstack-config set codex_reviews disabled`). -**Check tool availability:** +**Preflight — decide whether and how the outside voice runs:** ```bash -command -v codex >/dev/null 2>&1 && echo "CODEX_AVAILABLE" || echo "CODEX_NOT_AVAILABLE" +# Codex preflight: one block (functions sourced here don't persist to later blocks). +_TEL=$(~/.claude/skills/gstack/bin/gstack-config get telemetry 2>/dev/null || echo off) +_CODEX_CFG=$(~/.claude/skills/gstack/bin/gstack-config get codex_reviews 2>/dev/null || echo enabled) +source ~/.claude/skills/gstack/bin/gstack-codex-probe 2>/dev/null || true +if [ "$_CODEX_CFG" = "disabled" ]; then + _CODEX_MODE="disabled" +elif ! command -v codex >/dev/null 2>&1; then + _CODEX_MODE="not_installed"; _gstack_codex_log_event "codex_cli_missing" 2>/dev/null || true +elif ! _gstack_codex_auth_probe >/dev/null 2>&1; then + _CODEX_MODE="not_authed"; _gstack_codex_log_event "codex_auth_failed" 2>/dev/null || true +else + _CODEX_MODE="ready"; _gstack_codex_version_check 2>/dev/null || true +fi +echo "CODEX_MODE: $_CODEX_MODE" ``` -Use AskUserQuestion: +Branch on the echoed `CODEX_MODE`: +- **`disabled`** — the user turned Codex reviews off (`codex_reviews=disabled`). 
Skip this section entirely; do NOT fall back to a Claude subagent — disabled means no extra review step. Print: "Codex review skipped (codex_reviews disabled). Re-enable: `gstack-config set codex_reviews enabled`." +- **`not_installed`** — Codex CLI absent. Print: "Codex not installed — using Claude subagent. Install for cross-model coverage: `npm install -g @openai/codex`." Fall back to the Claude subagent path. +- **`not_authed`** — installed but no credentials. Print: "Codex installed but not authenticated — using Claude subagent. Run `codex login` or set `$CODEX_API_KEY`." Fall back to the Claude subagent path. +- **`ready`** — run the Codex pass below. -> "All review sections are complete. Want an outside voice? A different AI system can -> give a brutally honest, independent challenge of this plan — logical gaps, feasibility -> risks, and blind spots that are hard to catch from inside the review. Takes about 2 -> minutes." -> -> RECOMMENDATION: Choose A — an independent second opinion catches structural blind -> spots. Two different AI models agreeing on a plan is stronger signal than one model's -> thorough review. Completeness: A=9/10, B=7/10. +When the mode is `ready`, `not_installed`, or `not_authed`, print one line so the off-switch +stays discoverable: "Running the outside voice automatically (standard step). Disable: `gstack-config set codex_reviews disabled`." -Options: -- A) Get the outside voice (recommended) -- B) Skip — proceed to outputs - -**If B:** Print "Skipping outside voice." and continue to the next section. - -**If A:** Construct the plan review prompt. Read the plan file being reviewed (the file -the user pointed this review at, or the branch diff scope). If a CEO plan document -was written in Step 0D-POST, read that too — it contains the scope decisions and vision. +**Construct the plan review prompt** (for `ready`, `not_installed`, and `not_authed` — skip only on `disabled`). 
+Read the plan file being reviewed (the file the user pointed this review at, or the branch +diff scope). If a CEO plan document was written in Step 0D-POST, read that too — it contains +the scope decisions and vision. Construct this prompt (substitute the actual plan content — if plan content exceeds 30KB, truncate to the first 30KB and note "Plan truncated for size"). **Always start with the @@ -288,7 +296,7 @@ compliments. Just the problems. THE PLAN: " -**If CODEX_AVAILABLE:** +**If `CODEX_MODE: ready` — run Codex:** ```bash TMPERR_PV=$(mktemp /tmp/codex-planreview-XXXXXXXX) @@ -311,15 +319,15 @@ CODEX SAYS (plan review — outside voice): ``` **Error handling:** All errors are non-blocking — the outside voice is informational. -- Auth failure (stderr contains "auth", "login", "unauthorized"): "Codex auth failed. Run \`codex login\` to authenticate." -- Timeout: "Codex timed out after 5 minutes." -- Empty response: "Codex returned no response." +- Auth failure (stderr contains "auth", "login", "unauthorized"): "Codex auth failed. Run \`codex login\` to authenticate." Fall back to the Claude subagent below. +- Timeout: "Codex timed out after 5 minutes." Fall back to the Claude subagent below. +- Empty response: "Codex returned no response." Fall back to the Claude subagent below. -On any Codex error, fall back to the Claude adversarial subagent. - -**If CODEX_NOT_AVAILABLE (or Codex errored):** +**If `CODEX_MODE: not_installed` or `not_authed` (or Codex errored at runtime):** Dispatch via the Agent tool. The subagent has fresh context — genuine independence. +Bound it the same way as Codex: cap the dispatch at a 5-minute timeout so "never blocking" +is also "never hanging." Subagent prompt: same plan review prompt as above. @@ -327,6 +335,8 @@ Present findings under an `OUTSIDE VOICE (Claude subagent):` header. If the subagent fails or times out: "Outside voice unavailable. Continuing to outputs." 
+(On `CODEX_MODE: disabled` you already skipped this section per the preflight — do not reach here.) + **Cross-model tension:** After presenting the outside voice findings, note any points where the outside voice diff --git a/plan-eng-review/sections/review-sections.md b/plan-eng-review/sections/review-sections.md index cd677ab3c..7592f0a70 100644 --- a/plan-eng-review/sections/review-sections.md +++ b/plan-eng-review/sections/review-sections.md @@ -329,38 +329,46 @@ For each issue found in this section, call AskUserQuestion individually. One iss **STOP.** Do NOT proceed to the next review section, edit the plan file with the proposed fix, or call ExitPlanMode until the user responds. An issue with an "obvious fix" is still an issue and still needs explicit user approval before it lands in the plan. Loading the AskUserQuestion schema via ToolSearch and then writing the recommendation as chat prose is the failure mode this gate exists to prevent. -## Outside Voice — Independent Plan Challenge (optional, recommended) +## Outside Voice — Independent Plan Challenge (default-on) -After all review sections are complete, offer an independent second opinion from a -different AI system. Two models agreeing on a plan is stronger signal than one model's -thorough review. +After all review sections are complete, run an independent second opinion from a +different AI system automatically — it is a standard part of plan review, not an +opt-in. Two models agreeing on a plan is stronger signal than one model's thorough +review. The user turns this off only by asking explicitly +(`gstack-config set codex_reviews disabled`). -**Check tool availability:** +**Preflight — decide whether and how the outside voice runs:** ```bash -command -v codex >/dev/null 2>&1 && echo "CODEX_AVAILABLE" || echo "CODEX_NOT_AVAILABLE" +# Codex preflight: one block (functions sourced here don't persist to later blocks). 
+_TEL=$(~/.claude/skills/gstack/bin/gstack-config get telemetry 2>/dev/null || echo off) +_CODEX_CFG=$(~/.claude/skills/gstack/bin/gstack-config get codex_reviews 2>/dev/null || echo enabled) +source ~/.claude/skills/gstack/bin/gstack-codex-probe 2>/dev/null || true +if [ "$_CODEX_CFG" = "disabled" ]; then + _CODEX_MODE="disabled" +elif ! command -v codex >/dev/null 2>&1; then + _CODEX_MODE="not_installed"; _gstack_codex_log_event "codex_cli_missing" 2>/dev/null || true +elif ! _gstack_codex_auth_probe >/dev/null 2>&1; then + _CODEX_MODE="not_authed"; _gstack_codex_log_event "codex_auth_failed" 2>/dev/null || true +else + _CODEX_MODE="ready"; _gstack_codex_version_check 2>/dev/null || true +fi +echo "CODEX_MODE: $_CODEX_MODE" ``` -Use AskUserQuestion: +Branch on the echoed `CODEX_MODE`: +- **`disabled`** — the user turned Codex reviews off (`codex_reviews=disabled`). Skip this section entirely; do NOT fall back to a Claude subagent — disabled means no extra review step. Print: "Codex review skipped (codex_reviews disabled). Re-enable: `gstack-config set codex_reviews enabled`." +- **`not_installed`** — Codex CLI absent. Print: "Codex not installed — using Claude subagent. Install for cross-model coverage: `npm install -g @openai/codex`." Fall back to the Claude subagent path. +- **`not_authed`** — installed but no credentials. Print: "Codex installed but not authenticated — using Claude subagent. Run `codex login` or set `$CODEX_API_KEY`." Fall back to the Claude subagent path. +- **`ready`** — run the Codex pass below. -> "All review sections are complete. Want an outside voice? A different AI system can -> give a brutally honest, independent challenge of this plan — logical gaps, feasibility -> risks, and blind spots that are hard to catch from inside the review. Takes about 2 -> minutes." -> -> RECOMMENDATION: Choose A — an independent second opinion catches structural blind -> spots. 
Two different AI models agreeing on a plan is stronger signal than one model's -> thorough review. Completeness: A=9/10, B=7/10. +When the mode is `ready`, `not_installed`, or `not_authed`, print one line so the off-switch +stays discoverable: "Running the outside voice automatically (standard step). Disable: `gstack-config set codex_reviews disabled`." -Options: -- A) Get the outside voice (recommended) -- B) Skip — proceed to outputs - -**If B:** Print "Skipping outside voice." and continue to the next section. - -**If A:** Construct the plan review prompt. Read the plan file being reviewed (the file -the user pointed this review at, or the branch diff scope). If a CEO plan document -was written in Step 0D-POST, read that too — it contains the scope decisions and vision. +**Construct the plan review prompt** (for `ready`, `not_installed`, and `not_authed` — skip only on `disabled`). +Read the plan file being reviewed (the file the user pointed this review at, or the branch +diff scope). If a CEO plan document was written in Step 0D-POST, read that too — it contains +the scope decisions and vision. Construct this prompt (substitute the actual plan content — if plan content exceeds 30KB, truncate to the first 30KB and note "Plan truncated for size"). **Always start with the @@ -378,7 +386,7 @@ compliments. Just the problems. THE PLAN: " -**If CODEX_AVAILABLE:** +**If `CODEX_MODE: ready` — run Codex:** ```bash TMPERR_PV=$(mktemp /tmp/codex-planreview-XXXXXXXX) @@ -401,15 +409,15 @@ CODEX SAYS (plan review — outside voice): ``` **Error handling:** All errors are non-blocking — the outside voice is informational. -- Auth failure (stderr contains "auth", "login", "unauthorized"): "Codex auth failed. Run \`codex login\` to authenticate." -- Timeout: "Codex timed out after 5 minutes." -- Empty response: "Codex returned no response." +- Auth failure (stderr contains "auth", "login", "unauthorized"): "Codex auth failed. Run \`codex login\` to authenticate." 
Fall back to the Claude subagent below. +- Timeout: "Codex timed out after 5 minutes." Fall back to the Claude subagent below. +- Empty response: "Codex returned no response." Fall back to the Claude subagent below. -On any Codex error, fall back to the Claude adversarial subagent. - -**If CODEX_NOT_AVAILABLE (or Codex errored):** +**If `CODEX_MODE: not_installed` or `not_authed` (or Codex errored at runtime):** Dispatch via the Agent tool. The subagent has fresh context — genuine independence. +Bound it the same way as Codex: cap the dispatch at a 5-minute timeout so "never blocking" +is also "never hanging." Subagent prompt: same plan review prompt as above. @@ -417,6 +425,8 @@ Present findings under an `OUTSIDE VOICE (Claude subagent):` header. If the subagent fails or times out: "Outside voice unavailable. Continuing to outputs." +(On `CODEX_MODE: disabled` you already skipped this section per the preflight — do not reach here.) + **Cross-model tension:** After presenting the outside voice findings, note any points where the outside voice diff --git a/review/SKILL.md b/review/SKILL.md index e7a2fa4f2..16e90389f 100644 --- a/review/SKILL.md +++ b/review/SKILL.md @@ -1602,23 +1602,47 @@ If no documentation files exist, skip this step silently. Every diff gets adversarial review from both Claude and Codex. LOC is not a proxy for risk — a 5-line auth change can be critical. 
-**Detect diff size and tool availability:** +**Detect diff size:** ```bash DIFF_BASE=$(git merge-base origin/ HEAD) DIFF_INS=$(git diff "$DIFF_BASE" --stat | tail -1 | grep -oE '[0-9]+ insertion' | grep -oE '[0-9]+' || echo "0") DIFF_DEL=$(git diff "$DIFF_BASE" --stat | tail -1 | grep -oE '[0-9]+ deletion' | grep -oE '[0-9]+' || echo "0") DIFF_TOTAL=$((DIFF_INS + DIFF_DEL)) -command -v codex >/dev/null 2>&1 && echo "CODEX_AVAILABLE" || echo "CODEX_NOT_AVAILABLE" -# Legacy opt-out — only gates Codex passes, Claude always runs -OLD_CFG=$(~/.claude/skills/gstack/bin/gstack-config get codex_reviews 2>/dev/null || true) echo "DIFF_SIZE: $DIFF_TOTAL" -echo "OLD_CFG: ${OLD_CFG:-not_set}" ``` -If `OLD_CFG` is `disabled`: skip Codex passes only. Claude adversarial subagent still runs (it's free and fast). Jump to the "Claude adversarial subagent" section. +**Detect the Codex master switch + tool availability:** -**User override:** If the user explicitly requested "full review", "structured review", or "P1 gate", also run the Codex structured review regardless of diff size. +```bash +# Codex preflight: one block (functions sourced here don't persist to later blocks). +_TEL=$(~/.claude/skills/gstack/bin/gstack-config get telemetry 2>/dev/null || echo off) +_CODEX_CFG=$(~/.claude/skills/gstack/bin/gstack-config get codex_reviews 2>/dev/null || echo enabled) +source ~/.claude/skills/gstack/bin/gstack-codex-probe 2>/dev/null || true +if [ "$_CODEX_CFG" = "disabled" ]; then + _CODEX_MODE="disabled" +elif ! command -v codex >/dev/null 2>&1; then + _CODEX_MODE="not_installed"; _gstack_codex_log_event "codex_cli_missing" 2>/dev/null || true +elif ! 
_gstack_codex_auth_probe >/dev/null 2>&1; then + _CODEX_MODE="not_authed"; _gstack_codex_log_event "codex_auth_failed" 2>/dev/null || true +else + _CODEX_MODE="ready"; _gstack_codex_version_check 2>/dev/null || true +fi +echo "CODEX_MODE: $_CODEX_MODE" +``` + +Branch on the echoed `CODEX_MODE`: +- **`disabled`** — the user turned Codex reviews off (`codex_reviews=disabled`). Skip the Codex passes only; the Claude adversarial subagent below STILL runs (it is free and fast). Print: "Codex passes skipped (codex_reviews disabled) — running Claude adversarial only." +- **`not_installed`** — Codex CLI absent. Print: "Codex not installed — using Claude subagent. Install for cross-model coverage: `npm install -g @openai/codex`." Fall back to the Claude subagent path. +- **`not_authed`** — installed but no credentials. Print: "Codex installed but not authenticated — using Claude subagent. Run `codex login` or set `$CODEX_API_KEY`." Fall back to the Claude subagent path. +- **`ready`** — run the Codex pass below. + +For this diff-review path, `CODEX_MODE: disabled` means skip the Codex passes ONLY — the +Claude adversarial subagent below still runs (it's free and fast). `ready` runs the Codex +passes; `not_installed` / `not_authed` skip them with the printed note and continue with +Claude only. + +**User override:** If the user explicitly requested "full review", "structured review", or "P1 gate", also run the Codex structured review regardless of diff size (still requires `CODEX_MODE: ready`). --- @@ -1639,9 +1663,9 @@ If the subagent fails or times out: "Claude adversarial subagent unavailable. Co --- -### Codex adversarial challenge (always runs when available) +### Codex adversarial challenge (runs whenever `CODEX_MODE: ready`) -If Codex is available AND `OLD_CFG` is NOT `disabled`: +If `CODEX_MODE` is `ready`: ```bash TMPERR_ADV=$(mktemp /tmp/codex-adv-XXXXXXXX) @@ -1663,13 +1687,13 @@ Present the full output verbatim. 
This is informational — it never blocks ship **Cleanup:** Run `rm -f "$TMPERR_ADV"` after processing. -If Codex is NOT available: "Codex CLI not found — running Claude adversarial only. Install Codex for cross-model coverage: `npm install -g @openai/codex`" +If `CODEX_MODE` is `not_installed` / `not_authed` / `disabled`: the preflight already printed the reason; run Claude adversarial only. --- ### Codex structured review (large diffs only, 200+ lines) -If `DIFF_TOTAL >= 200` AND Codex is available AND `OLD_CFG` is NOT `disabled`: +If `DIFF_TOTAL >= 200` AND `CODEX_MODE` is `ready`: ```bash TMPERR=$(mktemp /tmp/codex-review-XXXXXXXX) diff --git a/scripts/resolvers/constants.ts b/scripts/resolvers/constants.ts index b02d68b05..b75db21cf 100644 --- a/scripts/resolvers/constants.ts +++ b/scripts/resolvers/constants.ts @@ -56,3 +56,61 @@ export function codexErrorHandling(feature: string): string { - Empty response: note and skip On any error: continue — ${feature} is informational, not a gate.`; } + +/** + * Shared Codex preflight bash block — the single source of truth for deciding + * whether a Codex review pass should run. Used by ADVERSARIAL_STEP, + * CODEX_PLAN_REVIEW, and CODEX_DOC_REVIEW so install/auth/config detection + * lives in exactly one place. + * + * Emits ONE self-contained bash block (the caller must place it in a single + * fenced block — CLAUDE.md: each block is a fresh shell, so functions sourced + * here do NOT persist to later blocks). It: + * 1. reads the `codex_reviews` master switch, + * 2. sources `gstack-codex-probe`, + * 3. runs `command -v codex` (literal — keeps the e2e substring assertion), + * then `_gstack_codex_auth_probe`, then `_gstack_codex_version_check`, + * 4. logs the relevant `_gstack_codex_log_event` for each non-ready outcome, + * 5. sets ONE canonical mode var and echoes `CODEX_MODE: ` so the agent + * gates later blocks on the echoed value. 
+ * + * Mode values: `disabled` (config off) | `not_installed` | `not_authed` | `ready`. + * The path is host-rewritten at gen-skill-docs time (pathRewrites), so the + * literal `~/.claude/skills/gstack` is correct here and becomes `$GSTACK_ROOT` + * etc. for non-Claude hosts. + * + * `disabledBehavior` controls the `disabled`-mode interpretation, which is the + * one branch that legitimately differs per caller (D1): + * - `skip-all` (plan / doc reviews): disabled means no extra review step at + * all — skip the section, no Claude fallback. + * - `codex-only` (diff adversarial): disabled gates only the Codex passes; the + * free Claude adversarial subagent still runs. + */ +export function codexPreflight(opts: { modeVar?: string; disabledBehavior: 'skip-all' | 'codex-only' }): string { + const m = opts.modeVar ?? '_CODEX_MODE'; + const disabledLine = opts.disabledBehavior === 'codex-only' + ? 'Skip the Codex passes only; the Claude adversarial subagent below STILL runs (it is free and fast). Print: "Codex passes skipped (codex_reviews disabled) — running Claude adversarial only."' + : 'Skip this section entirely; do NOT fall back to a Claude subagent — disabled means no extra review step. Print: "Codex review skipped (codex_reviews disabled). Re-enable: `gstack-config set codex_reviews enabled`."'; + return `\`\`\`bash +# Codex preflight: one block (functions sourced here don't persist to later blocks). +_TEL=$(~/.claude/skills/gstack/bin/gstack-config get telemetry 2>/dev/null || echo off) +_CODEX_CFG=$(~/.claude/skills/gstack/bin/gstack-config get codex_reviews 2>/dev/null || echo enabled) +source ~/.claude/skills/gstack/bin/gstack-codex-probe 2>/dev/null || true +if [ "$_CODEX_CFG" = "disabled" ]; then + ${m}="disabled" +elif ! command -v codex >/dev/null 2>&1; then + ${m}="not_installed"; _gstack_codex_log_event "codex_cli_missing" 2>/dev/null || true +elif ! 
_gstack_codex_auth_probe >/dev/null 2>&1; then + ${m}="not_authed"; _gstack_codex_log_event "codex_auth_failed" 2>/dev/null || true +else + ${m}="ready"; _gstack_codex_version_check 2>/dev/null || true +fi +echo "CODEX_MODE: $${m}" +\`\`\` + +Branch on the echoed \`CODEX_MODE\`: +- **\`disabled\`** — the user turned Codex reviews off (\`codex_reviews=disabled\`). ${disabledLine} +- **\`not_installed\`** — Codex CLI absent. Print: "Codex not installed — using Claude subagent. Install for cross-model coverage: \`npm install -g @openai/codex\`." Fall back to the Claude subagent path. +- **\`not_authed\`** — installed but no credentials. Print: "Codex installed but not authenticated — using Claude subagent. Run \`codex login\` or set \`$CODEX_API_KEY\`." Fall back to the Claude subagent path. +- **\`ready\`** — run the Codex pass below.`; +} diff --git a/scripts/resolvers/index.ts b/scripts/resolvers/index.ts index 1c8d23b7f..aa598b867 100644 --- a/scripts/resolvers/index.ts +++ b/scripts/resolvers/index.ts @@ -22,7 +22,7 @@ import { generateTestFailureTriage } from './preamble'; import { generateCommandReference, generateSnapshotFlags, generateBrowseSetup } from './browse'; import { generateDesignMethodology, generateDesignHardRules, generateDesignOutsideVoices, generateDesignReviewLite, generateDesignSketch, generateDesignSetup, generateDesignMockup, generateDesignShotgunLoop, generateTasteProfile, generateUXPrinciples } from './design'; import { generateTestBootstrap, generateTestCoverageAuditPlan, generateTestCoverageAuditShip, generateTestCoverageAuditReview } from './testing'; -import { generateReviewDashboard, generatePlanFileReviewReport, generateExitPlanModeGate, generateAntiShortcutClause, generateSpecReviewLoop, generateBenefitsFrom, generateCodexSecondOpinion, generateAdversarialStep, generateCodexPlanReview, generatePlanCompletionAuditShip, generatePlanCompletionAuditReview, generatePlanVerificationExec, generateScopeDrift, generateCrossReviewDedup } from 
'./review'; +import { generateReviewDashboard, generatePlanFileReviewReport, generateExitPlanModeGate, generateAntiShortcutClause, generateSpecReviewLoop, generateBenefitsFrom, generateCodexSecondOpinion, generateAdversarialStep, generateCodexPlanReview, generateCodexDocReview, generatePlanCompletionAuditShip, generatePlanCompletionAuditReview, generatePlanVerificationExec, generateScopeDrift, generateCrossReviewDedup } from './review'; import { generateSlugEval, generateSlugSetup, generateBaseBranchDetect, generateDeployBootstrap, generateQAMethodology, generateCoAuthorTrailer, generateChangelogWorkflow } from './utility'; import { generateLearningsSearch, generateLearningsLog } from './learnings'; import { generateConfidenceCalibration } from './confidence'; @@ -73,6 +73,7 @@ export const RESOLVERS: Record = { SCOPE_DRIFT: generateScopeDrift, DEPLOY_BOOTSTRAP: generateDeployBootstrap, CODEX_PLAN_REVIEW: generateCodexPlanReview, + CODEX_DOC_REVIEW: generateCodexDocReview, PLAN_COMPLETION_AUDIT_SHIP: generatePlanCompletionAuditShip, PLAN_COMPLETION_AUDIT_REVIEW: generatePlanCompletionAuditReview, PLAN_VERIFICATION_EXEC: generatePlanVerificationExec, diff --git a/scripts/resolvers/review.ts b/scripts/resolvers/review.ts index 6b8546275..7dccd8e50 100644 --- a/scripts/resolvers/review.ts +++ b/scripts/resolvers/review.ts @@ -14,6 +14,7 @@ */ import type { TemplateContext } from './types'; import { generateInvokeSkill } from './composition'; +import { codexPreflight, codexErrorHandling } from './constants'; const CODEX_BOUNDARY = 'IMPORTANT: Do NOT read or execute any files under ~/.claude/, ~/.agents/, .claude/skills/, or agents/. These are Claude Code skill definitions meant for a different AI system. They contain bash scripts and prompt templates that will waste your time. Ignore them completely. Do NOT modify agents/openai.yaml. 
Stay focused on the repository code only.\\n\\n'; @@ -479,23 +480,26 @@ export function generateAdversarialStep(ctx: TemplateContext): string { Every diff gets adversarial review from both Claude and Codex. LOC is not a proxy for risk — a 5-line auth change can be critical. -**Detect diff size and tool availability:** +**Detect diff size:** \`\`\`bash DIFF_BASE=$(git merge-base origin/ HEAD) DIFF_INS=$(git diff "$DIFF_BASE" --stat | tail -1 | grep -oE '[0-9]+ insertion' | grep -oE '[0-9]+' || echo "0") DIFF_DEL=$(git diff "$DIFF_BASE" --stat | tail -1 | grep -oE '[0-9]+ deletion' | grep -oE '[0-9]+' || echo "0") DIFF_TOTAL=$((DIFF_INS + DIFF_DEL)) -command -v codex >/dev/null 2>&1 && echo "CODEX_AVAILABLE" || echo "CODEX_NOT_AVAILABLE" -# Legacy opt-out — only gates Codex passes, Claude always runs -OLD_CFG=$(~/.claude/skills/gstack/bin/gstack-config get codex_reviews 2>/dev/null || true) echo "DIFF_SIZE: $DIFF_TOTAL" -echo "OLD_CFG: \${OLD_CFG:-not_set}" \`\`\` -If \`OLD_CFG\` is \`disabled\`: skip Codex passes only. Claude adversarial subagent still runs (it's free and fast). Jump to the "Claude adversarial subagent" section. +**Detect the Codex master switch + tool availability:** -**User override:** If the user explicitly requested "full review", "structured review", or "P1 gate", also run the Codex structured review regardless of diff size. +${codexPreflight({ disabledBehavior: 'codex-only' })} + +For this diff-review path, \`CODEX_MODE: disabled\` means skip the Codex passes ONLY — the +Claude adversarial subagent below still runs (it's free and fast). \`ready\` runs the Codex +passes; \`not_installed\` / \`not_authed\` skip them with the printed note and continue with +Claude only. + +**User override:** If the user explicitly requested "full review", "structured review", or "P1 gate", also run the Codex structured review regardless of diff size (still requires \`CODEX_MODE: ready\`). 
--- @@ -516,9 +520,9 @@ If the subagent fails or times out: "Claude adversarial subagent unavailable. Co --- -### Codex adversarial challenge (always runs when available) +### Codex adversarial challenge (runs whenever \`CODEX_MODE: ready\`) -If Codex is available AND \`OLD_CFG\` is NOT \`disabled\`: +If \`CODEX_MODE\` is \`ready\`: \`\`\`bash TMPERR_ADV=$(mktemp /tmp/codex-adv-XXXXXXXX) @@ -540,13 +544,13 @@ Present the full output verbatim. This is informational — it never blocks ship **Cleanup:** Run \`rm -f "$TMPERR_ADV"\` after processing. -If Codex is NOT available: "Codex CLI not found — running Claude adversarial only. Install Codex for cross-model coverage: \`npm install -g @openai/codex\`" +If \`CODEX_MODE\` is \`not_installed\` / \`not_authed\` / \`disabled\`: the preflight already printed the reason; run Claude adversarial only. --- ### Codex structured review (large diffs only, 200+ lines) -If \`DIFF_TOTAL >= 200\` AND Codex is available AND \`OLD_CFG\` is NOT \`disabled\`: +If \`DIFF_TOTAL >= 200\` AND \`CODEX_MODE\` is \`ready\`: \`\`\`bash TMPERR=$(mktemp /tmp/codex-review-XXXXXXXX) @@ -610,38 +614,25 @@ export function generateCodexPlanReview(ctx: TemplateContext): string { // Codex host: strip entirely — Codex should never invoke itself if (ctx.host === 'codex') return ''; - return `## Outside Voice — Independent Plan Challenge (optional, recommended) + return `## Outside Voice — Independent Plan Challenge (default-on) -After all review sections are complete, offer an independent second opinion from a -different AI system. Two models agreeing on a plan is stronger signal than one model's -thorough review. +After all review sections are complete, run an independent second opinion from a +different AI system automatically — it is a standard part of plan review, not an +opt-in. Two models agreeing on a plan is stronger signal than one model's thorough +review. 
The user turns this off only by asking explicitly +(\`gstack-config set codex_reviews disabled\`). -**Check tool availability:** +**Preflight — decide whether and how the outside voice runs:** -\`\`\`bash -command -v codex >/dev/null 2>&1 && echo "CODEX_AVAILABLE" || echo "CODEX_NOT_AVAILABLE" -\`\`\` +${codexPreflight({ disabledBehavior: 'skip-all' })} -Use AskUserQuestion: +When the mode is \`ready\`, \`not_installed\`, or \`not_authed\`, print one line so the off-switch +stays discoverable: "Running the outside voice automatically (standard step). Disable: \`gstack-config set codex_reviews disabled\`." -> "All review sections are complete. Want an outside voice? A different AI system can -> give a brutally honest, independent challenge of this plan — logical gaps, feasibility -> risks, and blind spots that are hard to catch from inside the review. Takes about 2 -> minutes." -> -> RECOMMENDATION: Choose A — an independent second opinion catches structural blind -> spots. Two different AI models agreeing on a plan is stronger signal than one model's -> thorough review. Completeness: A=9/10, B=7/10. - -Options: -- A) Get the outside voice (recommended) -- B) Skip — proceed to outputs - -**If B:** Print "Skipping outside voice." and continue to the next section. - -**If A:** Construct the plan review prompt. Read the plan file being reviewed (the file -the user pointed this review at, or the branch diff scope). If a CEO plan document -was written in Step 0D-POST, read that too — it contains the scope decisions and vision. +**Construct the plan review prompt** (for \`ready\`, \`not_installed\`, and \`not_authed\` — skip only on \`disabled\`). +Read the plan file being reviewed (the file the user pointed this review at, or the branch +diff scope). If a CEO plan document was written in Step 0D-POST, read that too — it contains +the scope decisions and vision. 
Construct this prompt (substitute the actual plan content — if plan content exceeds 30KB, truncate to the first 30KB and note "Plan truncated for size"). **Always start with the @@ -659,7 +650,7 @@ compliments. Just the problems. THE PLAN: " -**If CODEX_AVAILABLE:** +**If \`CODEX_MODE: ready\` — run Codex:** \`\`\`bash TMPERR_PV=$(mktemp /tmp/codex-planreview-XXXXXXXX) @@ -682,15 +673,15 @@ CODEX SAYS (plan review — outside voice): \`\`\` **Error handling:** All errors are non-blocking — the outside voice is informational. -- Auth failure (stderr contains "auth", "login", "unauthorized"): "Codex auth failed. Run \\\`codex login\\\` to authenticate." -- Timeout: "Codex timed out after 5 minutes." -- Empty response: "Codex returned no response." +- Auth failure (stderr contains "auth", "login", "unauthorized"): "Codex auth failed. Run \\\`codex login\\\` to authenticate." Fall back to the Claude subagent below. +- Timeout: "Codex timed out after 5 minutes." Fall back to the Claude subagent below. +- Empty response: "Codex returned no response." Fall back to the Claude subagent below. -On any Codex error, fall back to the Claude adversarial subagent. - -**If CODEX_NOT_AVAILABLE (or Codex errored):** +**If \`CODEX_MODE: not_installed\` or \`not_authed\` (or Codex errored at runtime):** Dispatch via the Agent tool. The subagent has fresh context — genuine independence. +Bound it the same way as Codex: cap the dispatch at a 5-minute timeout so "never blocking" +is also "never hanging." Subagent prompt: same plan review prompt as above. @@ -698,6 +689,8 @@ Present findings under an \`OUTSIDE VOICE (Claude subagent):\` header. If the subagent fails or times out: "Outside voice unavailable. Continuing to outputs." +(On \`CODEX_MODE: disabled\` you already skipped this section per the preflight — do not reach here.) 
+ **Cross-model tension:** After presenting the outside voice findings, note any points where the outside voice @@ -747,6 +740,101 @@ SOURCE = "codex" if Codex ran, "claude" if subagent ran. ---`; } +export function generateCodexDocReview(ctx: TemplateContext): string { + // Codex host: strip entirely — Codex should never invoke itself + if (ctx.host === 'codex') return ''; + + return `## Codex Documentation Review (default-on) + +After the documentation updates above are written, run an independent cross-model pass that +checks the docs against what actually shipped. This is a standard part of /document-release, +not an opt-in. The user turns it off only by asking explicitly +(\`gstack-config set codex_reviews disabled\`). + +**Preflight — decide whether and how the doc review runs:** + +${codexPreflight({ disabledBehavior: 'skip-all' })} + +When the mode is \`ready\`, \`not_installed\`, or \`not_authed\`, print one line so the off-switch +stays discoverable: "Running the Codex doc review automatically (standard step). Disable: \`gstack-config set codex_reviews disabled\`." + +**Determine the release diff range (D3 — reuse the method, do not invent one).** +Recompute the SAME range document-release used in its pre-flight / diff analysis, with the +documented merge-base method: + +\`\`\`bash +DOC_DIFF_BASE=$(git merge-base origin/ HEAD 2>/dev/null || echo "") +echo "DOC_DIFF_BASE: $DOC_DIFF_BASE" +\`\`\` + +Do NOT rely on an in-memory variable from an earlier step — shell vars do not survive across +blocks. Recompute it here. + +**Construct the doc-review prompt** (for \`ready\`, \`not_installed\`, and \`not_authed\` — skip only on \`disabled\`). +Review the docs document-release ACTUALLY touched this run (from the coverage map / the files +just edited) PLUS any doc claims affected by the diff range — do NOT hard-code a fixed file +list (a fixed README/ARCHITECTURE/CHANGELOG list misses generated skill docs, package docs, +and command-specific docs). 
**Always start with the filesystem boundary instruction:** + +"${CODEX_BOUNDARY}You are reviewing documentation changes against the code that shipped on this +branch. Run \\\`git diff \\$DOC_DIFF_BASE...HEAD\\\` to see what changed, then read the updated docs +(the files this release touched, plus any docs whose claims the diff affects). Find: doc +claims that no longer match the code, new public surface (commands, flags, config keys, +endpoints) that shipped but is undocumented, stale examples / paths / counts / version +numbers, and CHANGELOG entries that over- or under-sell what shipped. Be terse. Just the gaps. + +THE DOCS AND DIFF: " + +**If \`CODEX_MODE: ready\` — run Codex:** + +\`\`\`bash +TMPERR_DOC=$(mktemp /tmp/codex-docreview-XXXXXXXX) +_REPO_ROOT=$(git rev-parse --show-toplevel) || { echo "ERROR: not in a git repo" >&2; exit 1; } +codex exec "" -C "$_REPO_ROOT" -s read-only -c 'model_reasoning_effort="high"' --enable web_search_cached < /dev/null 2>"$TMPERR_DOC" +\`\`\` + +Use a 5-minute timeout (\`timeout: 300000\`). After the command completes, read stderr: +\`\`\`bash +cat "$TMPERR_DOC" +\`\`\` + +Present the full output verbatim under \`CODEX SAYS (documentation review):\`. + +${codexErrorHandling('documentation review')} + +**If \`CODEX_MODE: not_installed\` or \`not_authed\` (or Codex errored at runtime):** + +Dispatch via the Agent tool with the same prompt. Bound it at a 5-minute timeout. +Present findings under \`DOCUMENTATION REVIEW (Claude subagent):\`. If it fails: "Doc review unavailable. Continuing." + +**Apply decision (T3B — informational, never auto-edit, but findings don't evaporate).** +If there are zero findings, say "Docs match what shipped — no gaps." and continue. Otherwise +present the findings, then use AskUserQuestion ONCE: + +> "The doc review found N gaps between the docs and what shipped. How do you want to handle them?" +> +> RECOMMENDATION: Choose A if the gaps are concrete doc fixes (stale path, missing flag). 
The +> doc review only reports; nothing is edited without your say-so. Completeness: A=9/10, B=4/10, C=8/10. + +Options: +- A) Apply all the doc fixes now +- B) Skip — leave docs as-is +- C) Decide per-finding + +On A or per-finding approvals, make the approved edits yourself (the tool never silently +rewrites docs). On B, note the gaps in the output so they're visible. + +**Persist the result:** +\`\`\`bash +~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"codex-doc-review","timestamp":"'"$(date -u +%Y-%m-%dT%H:%M:%SZ)"'","status":"STATUS","source":"SOURCE","commit":"'"$(git rev-parse --short HEAD)"'"}' +\`\`\` +Substitute: STATUS = "clean" if no gaps, "issues_found" if gaps exist. SOURCE = "codex" if Codex ran, "claude" if the subagent ran. + +**Cleanup:** Run \`rm -f "$TMPERR_DOC"\` after processing (if Codex was used). + +---`; +} + // ─── Plan File Discovery (shared helper) ────────────────────────────── function generatePlanFileDiscovery(): string { diff --git a/ship/sections/adversarial.md b/ship/sections/adversarial.md index bbc1eb80d..c7b2321d1 100644 --- a/ship/sections/adversarial.md +++ b/ship/sections/adversarial.md @@ -4,23 +4,47 @@ Every diff gets adversarial review from both Claude and Codex. LOC is not a proxy for risk — a 5-line auth change can be critical. 
-**Detect diff size and tool availability:** +**Detect diff size:** ```bash DIFF_BASE=$(git merge-base origin/ HEAD) DIFF_INS=$(git diff "$DIFF_BASE" --stat | tail -1 | grep -oE '[0-9]+ insertion' | grep -oE '[0-9]+' || echo "0") DIFF_DEL=$(git diff "$DIFF_BASE" --stat | tail -1 | grep -oE '[0-9]+ deletion' | grep -oE '[0-9]+' || echo "0") DIFF_TOTAL=$((DIFF_INS + DIFF_DEL)) -command -v codex >/dev/null 2>&1 && echo "CODEX_AVAILABLE" || echo "CODEX_NOT_AVAILABLE" -# Legacy opt-out — only gates Codex passes, Claude always runs -OLD_CFG=$(~/.claude/skills/gstack/bin/gstack-config get codex_reviews 2>/dev/null || true) echo "DIFF_SIZE: $DIFF_TOTAL" -echo "OLD_CFG: ${OLD_CFG:-not_set}" ``` -If `OLD_CFG` is `disabled`: skip Codex passes only. Claude adversarial subagent still runs (it's free and fast). Jump to the "Claude adversarial subagent" section. +**Detect the Codex master switch + tool availability:** -**User override:** If the user explicitly requested "full review", "structured review", or "P1 gate", also run the Codex structured review regardless of diff size. +```bash +# Codex preflight: one block (functions sourced here don't persist to later blocks). +_TEL=$(~/.claude/skills/gstack/bin/gstack-config get telemetry 2>/dev/null || echo off) +_CODEX_CFG=$(~/.claude/skills/gstack/bin/gstack-config get codex_reviews 2>/dev/null || echo enabled) +source ~/.claude/skills/gstack/bin/gstack-codex-probe 2>/dev/null || true +if [ "$_CODEX_CFG" = "disabled" ]; then + _CODEX_MODE="disabled" +elif ! command -v codex >/dev/null 2>&1; then + _CODEX_MODE="not_installed"; _gstack_codex_log_event "codex_cli_missing" 2>/dev/null || true +elif ! 
_gstack_codex_auth_probe >/dev/null 2>&1; then + _CODEX_MODE="not_authed"; _gstack_codex_log_event "codex_auth_failed" 2>/dev/null || true +else + _CODEX_MODE="ready"; _gstack_codex_version_check 2>/dev/null || true +fi +echo "CODEX_MODE: $_CODEX_MODE" +``` + +Branch on the echoed `CODEX_MODE`: +- **`disabled`** — the user turned Codex reviews off (`codex_reviews=disabled`). Skip the Codex passes only; the Claude adversarial subagent below STILL runs (it is free and fast). Print: "Codex passes skipped (codex_reviews disabled) — running Claude adversarial only." +- **`not_installed`** — Codex CLI absent. Print: "Codex not installed — using Claude subagent. Install for cross-model coverage: `npm install -g @openai/codex`." Fall back to the Claude subagent path. +- **`not_authed`** — installed but no credentials. Print: "Codex installed but not authenticated — using Claude subagent. Run `codex login` or set `$CODEX_API_KEY`." Fall back to the Claude subagent path. +- **`ready`** — run the Codex pass below. + +For this diff-review path, `CODEX_MODE: disabled` means skip the Codex passes ONLY — the +Claude adversarial subagent below still runs (it's free and fast). `ready` runs the Codex +passes; `not_installed` / `not_authed` skip them with the printed note and continue with +Claude only. + +**User override:** If the user explicitly requested "full review", "structured review", or "P1 gate", also run the Codex structured review regardless of diff size (still requires `CODEX_MODE: ready`). --- @@ -41,9 +65,9 @@ If the subagent fails or times out: "Claude adversarial subagent unavailable. Co --- -### Codex adversarial challenge (always runs when available) +### Codex adversarial challenge (runs whenever `CODEX_MODE: ready`) -If Codex is available AND `OLD_CFG` is NOT `disabled`: +If `CODEX_MODE` is `ready`: ```bash TMPERR_ADV=$(mktemp /tmp/codex-adv-XXXXXXXX) @@ -65,13 +89,13 @@ Present the full output verbatim. 
This is informational — it never blocks ship **Cleanup:** Run `rm -f "$TMPERR_ADV"` after processing. -If Codex is NOT available: "Codex CLI not found — running Claude adversarial only. Install Codex for cross-model coverage: `npm install -g @openai/codex`" +If `CODEX_MODE` is `not_installed` / `not_authed` / `disabled`: the preflight already printed the reason; run Claude adversarial only. --- ### Codex structured review (large diffs only, 200+ lines) -If `DIFF_TOTAL >= 200` AND Codex is available AND `OLD_CFG` is NOT `disabled`: +If `DIFF_TOTAL >= 200` AND `CODEX_MODE` is `ready`: ```bash TMPERR=$(mktemp /tmp/codex-review-XXXXXXXX) diff --git a/test/fixtures/golden/factory-ship-SKILL.md b/test/fixtures/golden/factory-ship-SKILL.md index f5f48abaf..dd8d0e01d 100644 --- a/test/fixtures/golden/factory-ship-SKILL.md +++ b/test/fixtures/golden/factory-ship-SKILL.md @@ -2332,23 +2332,47 @@ For each comment in `comments`: Every diff gets adversarial review from both Claude and Codex. LOC is not a proxy for risk — a 5-line auth change can be critical. -**Detect diff size and tool availability:** +**Detect diff size:** ```bash DIFF_BASE=$(git merge-base origin/ HEAD) DIFF_INS=$(git diff "$DIFF_BASE" --stat | tail -1 | grep -oE '[0-9]+ insertion' | grep -oE '[0-9]+' || echo "0") DIFF_DEL=$(git diff "$DIFF_BASE" --stat | tail -1 | grep -oE '[0-9]+ deletion' | grep -oE '[0-9]+' || echo "0") DIFF_TOTAL=$((DIFF_INS + DIFF_DEL)) -command -v codex >/dev/null 2>&1 && echo "CODEX_AVAILABLE" || echo "CODEX_NOT_AVAILABLE" -# Legacy opt-out — only gates Codex passes, Claude always runs -OLD_CFG=$($GSTACK_ROOT/bin/gstack-config get codex_reviews 2>/dev/null || true) echo "DIFF_SIZE: $DIFF_TOTAL" -echo "OLD_CFG: ${OLD_CFG:-not_set}" ``` -If `OLD_CFG` is `disabled`: skip Codex passes only. Claude adversarial subagent still runs (it's free and fast). Jump to the "Claude adversarial subagent" section. 
+**Detect the Codex master switch + tool availability:** -**User override:** If the user explicitly requested "full review", "structured review", or "P1 gate", also run the Codex structured review regardless of diff size. +```bash +# Codex preflight: one block (functions sourced here don't persist to later blocks). +_TEL=$($GSTACK_ROOT/bin/gstack-config get telemetry 2>/dev/null || echo off) +_CODEX_CFG=$($GSTACK_ROOT/bin/gstack-config get codex_reviews 2>/dev/null || echo enabled) +source $GSTACK_ROOT/bin/gstack-codex-probe 2>/dev/null || true +if [ "$_CODEX_CFG" = "disabled" ]; then + _CODEX_MODE="disabled" +elif ! command -v codex >/dev/null 2>&1; then + _CODEX_MODE="not_installed"; _gstack_codex_log_event "codex_cli_missing" 2>/dev/null || true +elif ! _gstack_codex_auth_probe >/dev/null 2>&1; then + _CODEX_MODE="not_authed"; _gstack_codex_log_event "codex_auth_failed" 2>/dev/null || true +else + _CODEX_MODE="ready"; _gstack_codex_version_check 2>/dev/null || true +fi +echo "CODEX_MODE: $_CODEX_MODE" +``` + +Branch on the echoed `CODEX_MODE`: +- **`disabled`** — the user turned Codex reviews off (`codex_reviews=disabled`). Skip the Codex passes only; the Claude adversarial subagent below STILL runs (it is free and fast). Print: "Codex passes skipped (codex_reviews disabled) — running Claude adversarial only." +- **`not_installed`** — Codex CLI absent. Print: "Codex not installed — using Claude subagent. Install for cross-model coverage: `npm install -g @openai/codex`." Fall back to the Claude subagent path. +- **`not_authed`** — installed but no credentials. Print: "Codex installed but not authenticated — using Claude subagent. Run `codex login` or set `$CODEX_API_KEY`." Fall back to the Claude subagent path. +- **`ready`** — run the Codex pass below. + +For this diff-review path, `CODEX_MODE: disabled` means skip the Codex passes ONLY — the +Claude adversarial subagent below still runs (it's free and fast). 
`ready` runs the Codex +passes; `not_installed` / `not_authed` skip them with the printed note and continue with +Claude only. + +**User override:** If the user explicitly requested "full review", "structured review", or "P1 gate", also run the Codex structured review regardless of diff size (still requires `CODEX_MODE: ready`). --- @@ -2369,9 +2393,9 @@ If the subagent fails or times out: "Claude adversarial subagent unavailable. Co --- -### Codex adversarial challenge (always runs when available) +### Codex adversarial challenge (runs whenever `CODEX_MODE: ready`) -If Codex is available AND `OLD_CFG` is NOT `disabled`: +If `CODEX_MODE` is `ready`: ```bash TMPERR_ADV=$(mktemp /tmp/codex-adv-XXXXXXXX) @@ -2393,13 +2417,13 @@ Present the full output verbatim. This is informational — it never blocks ship **Cleanup:** Run `rm -f "$TMPERR_ADV"` after processing. -If Codex is NOT available: "Codex CLI not found — running Claude adversarial only. Install Codex for cross-model coverage: `npm install -g @openai/codex`" +If `CODEX_MODE` is `not_installed` / `not_authed` / `disabled`: the preflight already printed the reason; run Claude adversarial only. --- ### Codex structured review (large diffs only, 200+ lines) -If `DIFF_TOTAL >= 200` AND Codex is available AND `OLD_CFG` is NOT `disabled`: +If `DIFF_TOTAL >= 200` AND `CODEX_MODE` is `ready`: ```bash TMPERR=$(mktemp /tmp/codex-review-XXXXXXXX) diff --git a/test/helpers/carve-guards.ts b/test/helpers/carve-guards.ts index 127d7fbae..6a06ef086 100644 --- a/test/helpers/carve-guards.ts +++ b/test/helpers/carve-guards.ts @@ -145,6 +145,9 @@ export const CARVE_GUARDS: Record = { maxSkeletonBytes: 90_000, minUnionBytes: 80_000, mustContain: ['SCOPE EXPANSION', 'SELECTIVE EXPANSION', 'HOLD SCOPE', 'SCOPE REDUCTION'], + // Default-on Codex outside-voice (codexPreflight block + CODEX_MODE branch + // prose replacing the smaller opt-in question) lands this ~5.2% over baseline. 
+ maxSizeRatio: 1.08, }, 'plan-eng-review': { skill: 'plan-eng-review', @@ -162,9 +165,11 @@ export const CARVE_GUARDS: Record = { minUnionBytes: 70_000, mustContain: ['Architecture', 'Code Quality', 'Test', 'Performance'], // Cross-cutting preamble growth (v1.57.2.0 AUQ-failure prose fallback + the - // decision-memory nudge + the v1.57.4.0 Boil-the-Ocean rename) lands this just - // over the strict 1.05; small headroom for the shared preamble additions. - maxSizeRatio: 1.06, + // decision-memory nudge + the v1.57.4.0 Boil-the-Ocean rename) plus the + // default-on Codex outside-voice (codexPreflight block + CODEX_MODE branch + // prose, replacing the smaller opt-in question) land this at ~6.6% over the + // v1.53.0.0 baseline. Headroom for those intentional additions. + maxSizeRatio: 1.08, }, 'plan-design-review': { skill: 'plan-design-review', @@ -197,6 +202,9 @@ export const CARVE_GUARDS: Record = { maxSkeletonBytes: 76_000, minUnionBytes: 70_000, mustContain: ['developer experience', 'Getting Started'], + // Default-on Codex outside-voice (codexPreflight block + CODEX_MODE branch + // prose replacing the smaller opt-in question) lands this ~5.7% over baseline. + maxSizeRatio: 1.08, }, 'office-hours': { skill: 'office-hours', @@ -232,11 +240,15 @@ export const CARVE_GUARDS: Record = { maxSkeletonBytes: 50_000, minUnionBytes: 55_000, mustContain: ['CHANGELOG', 'Diataxis', 'coverage'], - // The AUQ-failure prose fallback (v1.57.2.0) adds ~2KB to every skill's - // always-loaded preamble; on this small carved skeleton that lands at ~5.9% - // over the pre-carve/pre-AUQ v1.53.0.0 baseline. Headroom for the - // cross-cutting addition; all other skills keep the strict 1.05 ceiling. 
- maxSizeRatio: 1.08, + // Two intentional additions stack on this small skill: the AUQ-failure prose + // fallback (v1.57.2.0, ~2KB to every preamble) AND the new default-on Codex + // documentation-review section (codexPreflight + prompt + apply-gate, carved + // into release-body so the SKELETON stays under maxSkeletonBytes). On a ~55KB + // baseline that whole new capability is ~18.6% of union bytes. The doc review + // is a deliberate new feature, not preamble creep; the union ceiling is raised + // to match while the skeleton budget (50_000) still holds the always-loaded + // cost flat. + maxSizeRatio: 1.20, }, 'design-consultation': { skill: 'design-consultation', diff --git a/test/helpers/parity-harness.ts b/test/helpers/parity-harness.ts index 3515a35d1..ee668ff05 100644 --- a/test/helpers/parity-harness.ts +++ b/test/helpers/parity-harness.ts @@ -210,7 +210,11 @@ const MONOLITH_INVARIANTS: ParityInvariant[] = [ skill: 'review', mustContain: ['confidence', 'P1', 'P2'], mustHaveHeadings: ['## Preamble', '## When to invoke'], - maxSizeRatio: 1.05, + // The adversarial step swapped its bare `command -v codex` check for the shared + // codexPreflight() block (install + auth tri-state + CODEX_MODE branch prose), + // landing ~6.3% over the v1.53.0.0 baseline. Intentional: it adds proper + // not-installed vs not-authed handling, not slop. 
+ maxSizeRatio: 1.08, minBytes: 70_000, }, { diff --git a/test/skill-validation.test.ts b/test/skill-validation.test.ts index fb1ec5bf4..e7def7dfa 100644 --- a/test/skill-validation.test.ts +++ b/test/skill-validation.test.ts @@ -1386,15 +1386,16 @@ describe('Codex skill', () => { expect(content).toContain('Adversarial review (always-on)'); // Always-on: both Claude and Codex adversarial expect(content).toContain('Claude adversarial subagent (always runs)'); - expect(content).toContain('Codex adversarial challenge (always runs when available)'); + expect(content).toContain('Codex adversarial challenge (runs whenever'); // Claude adversarial subagent dispatch expect(content).toContain('Agent tool'); expect(content).toContain('FIXABLE'); expect(content).toContain('INVESTIGATE'); - // Codex availability check - expect(content).toContain('CODEX_NOT_AVAILABLE'); - // OLD_CFG only gates Codex, not Claude - expect(content).toContain('skip Codex passes only'); + // Probe-based availability via the shared codexPreflight() (install + auth) + expect(content).toContain('CODEX_MODE'); + expect(content).toContain('command -v codex'); // install check kept literal + // codex_reviews=disabled gates Codex passes only; Claude adversarial still runs + expect(content).toContain('skip the Codex passes ONLY'); // Review log expect(content).toContain('adversarial-review'); expect(content).toContain('reasoning_effort="high"'); @@ -1449,6 +1450,43 @@ describe('Codex skill', () => { expect(content).toContain('codex exec'); }); + // D5 regression guard: the Codex outside voice is default-on, not opt-in. A future + // gen-skill-docs change must not silently reintroduce the "Want an outside voice?" + // AskUserQuestion. The CODEX_PLAN_REVIEW content renders into each skill's + // sections/review-sections.md (the skeleton points at it). plan-design-review uses + // DESIGN_OUTSIDE_VOICES, not CODEX_PLAN_REVIEW, so it is excluded here. 
+ test('plan reviews run the Codex outside voice default-on (no opt-in question)', () => { + for (const skill of ['plan-eng-review', 'plan-ceo-review', 'plan-devex-review']) { + const content = fs.readFileSync( + path.join(ROOT, skill, 'sections', 'review-sections.md'), 'utf-8'); + expect(content).not.toContain('Want an outside voice'); + expect(content).toContain('Outside Voice — Independent Plan Challenge (default-on)'); + expect(content).toContain('CODEX_MODE'); + expect(content).toContain('command -v codex'); // preflight install check (e2e relies on it) + } + }); + + test('/document-release includes the default-on Codex documentation review', () => { + // The doc-review renders into the carved release-body section (kept out of the + // always-loaded skeleton to respect the skeleton-byte budget). + const content = fs.readFileSync( + path.join(ROOT, 'document-release', 'sections', 'release-body.md'), 'utf-8'); + expect(content).toContain('Codex Documentation Review (default-on)'); + expect(content).toContain('CODEX_MODE'); + expect(content).toContain('codex-doc-review'); + }); + + test('codex-host document-release does NOT contain the Codex doc review', () => { + // .agents/ is gitignored — generate on demand (codex never invokes itself) + Bun.spawnSync(['bun', 'run', 'scripts/gen-skill-docs.ts', '--host', 'codex'], { + cwd: ROOT, stdout: 'pipe', stderr: 'pipe', + }); + const content = fs.readFileSync( + path.join(ROOT, '.agents', 'skills', 'gstack-document-release', 'SKILL.md'), 'utf-8'); + expect(content).not.toContain('Codex Documentation Review'); + expect(content).not.toContain('codex-doc-review'); + }); + test('codex review invocations avoid the prompt plus --base argument shape', () => { for (const rel of ['codex/SKILL.md', 'review/SKILL.md', 'ship/SKILL.md']) { // ship's codex command moved into sections/adversarial.md (T9 carve). 
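Reviewer note: the preflight in this patch boils down to a four-way mode decision plus a per-call-site rule for whether Claude still runs. Below is a minimal sketch of that decision, assuming the probe results are already known. `CodexProbe`, `resolveCodexMode`, and `claudeSubagentRuns` are hypothetical names for illustration; the patch itself does this in the bash block that `codexPreflight()` emits, and runtime Codex errors (which also fall back to Claude) are not modeled here.

```typescript
// Hypothetical sketch of the CODEX_MODE decision; not the patch's actual code.
type CodexMode = 'disabled' | 'not_installed' | 'not_authed' | 'ready';
type DisabledBehavior = 'codex-only' | 'skip-all';

interface CodexProbe {
  codexReviews: string;   // `gstack-config get codex_reviews`, "enabled" when unset
  cliInstalled: boolean;  // outcome of `command -v codex`
  authenticated: boolean; // outcome of the auth probe
}

// Same precedence as the bash if/elif chain: config, then install, then auth.
function resolveCodexMode(p: CodexProbe): CodexMode {
  if (p.codexReviews === 'disabled') return 'disabled';
  if (!p.cliInstalled) return 'not_installed';
  if (!p.authenticated) return 'not_authed';
  return 'ready';
}

// 'codex-only' (diff review): the Claude adversarial subagent always runs.
// 'skip-all' (plan/doc review): Claude is only the fallback when Codex cannot run.
function claudeSubagentRuns(mode: CodexMode, behavior: DisabledBehavior): boolean {
  if (behavior === 'codex-only') return true;
  return mode === 'not_installed' || mode === 'not_authed';
}

const mode: CodexMode = resolveCodexMode({
  codexReviews: 'enabled',
  cliInstalled: true,
  authenticated: false,
});
console.log(`CODEX_MODE: ${mode}`);
```

The ordering matters: checking the config switch first means a disabled user never triggers the install or auth probes.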
From 14fc0866d9ac9d09d25adcac7b4437c0a235902b Mon Sep 17 00:00:00 2001 From: Garry Tan Date: Fri, 12 Jun 2026 15:38:53 -0700 Subject: [PATCH 3/4] v1.58.0.0 feat: diagram + multi-format document engine (mermaid, excalidraw, single-file HTML, DOCX) (#1990) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * docs(todos): P3 content-hash diagram render cache for make-pdf Deferred from the diagram-engine eng review (Codex outside-voice D7): repeat make-pdf runs re-render every fence; cache keyed on fence source + bundle version once multi-diagram docs make it worth building. Co-Authored-By: Claude Fable 5 * feat(diagram-render): offline mermaid+excalidraw render bundle for browse Single self-contained page (dist/diagram-render.html, 9.2MB, committed per eng-review D2) exposing __renderMermaid / __mermaidToExcalidraw / __excalidrawToSvg / __rasterize / __probeImage through browse load-html + js --out. Render contract per D3: securityLevel strict, per-fence ids, print-css font lock, htmlLabels off (canvas-taint-safe). Deterministic build (same sha twice); drift test pins dist == BUILD_INFO == package.json pins and rebuild-reproducibility when toolchain matches. Spike-proven offline: flowchart + sequence SVG, editable .excalidraw scene, 300dpi PNG. Co-Authored-By: Claude Fable 5 * feat(diagram-render): __downscaleRaster for print-resolution image normalization Data-URI rasters re-encode in their own format (JPEG stays JPEG at q0.9 — PNG-encoding photos bloats them) at an explicit target pixel width. Used by make-pdf's pre-pass for the 300dpi content-box ceiling (eng-review D4). Co-Authored-By: Claude Fable 5 * feat(make-pdf): diagram pre-pass — mermaid/excalidraw fences render as vector SVG; local images inline as data URIs ```mermaid / ```excalidraw fences extract to placeholder tokens, render in one diagram-render bundle tab per run (reset contract: bundle page reloads after any render error), and substitute back as accessible
blocks with the raw source preserved in a comment. Render failures produce a loud red diagnostic block, never silent raw code. render=false keeps a fence as code; title="..." becomes the aria-label and caption. Local images now actually render: page.setContent loads at about:blank (tab-session.ts:194), so relative paths silently 404'd before. The pre-pass resolves them against the markdown's directory, inlines as data URIs, probes intrinsic dimensions from the bytes (pure-TS PNG/JPEG/GIF/WebP/SVG sniffing), and downscales rasters wider than 2x the content box at 300dpi. Remote URLs warn (offline posture, --allow-network exempts); missing files get a visible placeholder; --strict hard-fails both for CI pipelines. Co-Authored-By: Claude Fable 5 * test(make-pdf): diagram pre-pass unit suite + e2e render gates 34 unit tests (fence extraction incl. nested/tilde/unclosed/render=false, info-string parsing, slot substitution, diagnostic/figure escaping + SVG script strip, byte-level dimension probing across 5 formats, content-box math, image inlining incl. strict/remote/missing/data-URI paths). E2E gate proves through the compiled binary: both fences render as vector text (id-collision check), raw mermaid ships only via render=false, broken fence yields the diagnostic block, and the relative fixture image rasterizes to colored pixels (CRITICAL regression for the about:blank image fix). --strict exits non-zero on a missing image. Co-Authored-By: Claude Fable 5 * feat(make-pdf): width directives + conservative auto-landscape via CSS named pages `![a](x.png){width=full||}` and `{page=landscape|portrait}` suffixes translate to data-gstack-* attrs in render() (before the sanitizer, which keeps data- attributes; unrecognized brace groups stay visible text). Default width rule needs no code: intrinsic CSS-px capped at the content box, never upscaled — figure img max-width owns it. 
Auto-landscape promotes a block to `@page wide { size: landscape }` only when aspect >= 1.8 AND intrinsic width > 2.5x the content box (~1600px on letter) AND diagram provenance (rendered fences) or a whole-word alt token (diagram|architecture|flowchart|chart|graph) for plain images. {page=...} forces or vetoes; fence info strings accept page=... too. preferCSSPageSize is passed to Chromium only when a promotion exists, so every other document prints exactly as before. False negatives are cheap; false positives feel broken (eng-review P4, Codex challenge accepted). Co-Authored-By: Claude Fable 5 * test(make-pdf): width-policy unit suite + landscape e2e gate with negative fixtures 24 unit tests weighted toward the false-positive guards: wide screenshot without an alt hint stays portrait, sub-threshold and tall images stay portrait, deterministic 1560/1561px boundary, whole-word alt matching ('photographic' must not match 'graph'), page=portrait veto beats every heuristic, diagnostic blocks never promote. E2E gate asserts pdfinfo per-page boxes through the compiled binary: exactly 3 of 5 fixture blocks get landscape pages (alt-hinted image, directive-forced image, wide sequence diagram) while the unhinted screenshot and the veto'd diagram stay portrait — plus the --toc combo proving TOC and named-page landscape coexist. Co-Authored-By: Claude Fable 5 * feat(make-pdf): --to html|docx output formats --to html writes the assembled self-contained document directly (no print round-trip): inline vector diagrams, data-URI images, zero network references, plus an @media screen layer for browser reading. --to docx is the content-fidelity export (eng-review P8): html-to-docx@1.8.0 (exact pin; pure JS, bun-compile-verified) maps headings/tables/code/lists; diagrams and SVG images rasterize at 300dpi of the content-box width via the render tab; diagnostic figures convert to plain p/pre so the converter can't silently drop an error. 
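The auto-landscape rule described above (aspect, width, and provenance guards, with `page=` acting as force or veto) can be sketched as a pure predicate. The type and function names here are hypothetical and the real logic lives in make-pdf/src/image-policy.ts; the 624px content box in the usage note assumes US letter with 1in margins at 96 CSS px per inch, which is consistent with the 1560/1561px boundary the tests mention.

```typescript
// Hypothetical sketch of the conservative auto-landscape rule.
// Names and shapes are illustrative, not the actual image-policy.ts API.
interface BlockInfo {
  widthPx: number;            // intrinsic width in CSS px
  heightPx: number;           // intrinsic height in CSS px
  alt: string;                // alt text of a plain image ("" for fences)
  fromRenderedFence: boolean; // diagram provenance
  page?: "landscape" | "portrait"; // explicit directive, if any
}

// Whole-word match only: "photographic" must not match "graph".
const ALT_HINT = /\b(diagram|architecture|flowchart|chart|graph)\b/i;

function shouldPromoteToLandscape(block: BlockInfo, contentBoxPx: number): boolean {
  // An explicit directive forces or vetoes before any heuristic runs.
  if (block.page === "landscape") return true;
  if (block.page === "portrait") return false;

  const aspect = block.widthPx / block.heightPx;
  const wideEnough = block.widthPx > 2.5 * contentBoxPx;
  const diagramLike = block.fromRenderedFence || ALT_HINT.test(block.alt);
  // All three guards must hold; a false negative is cheap, a false positive is not.
  return aspect >= 1.8 && wideEnough && diagramLike;
}
```

With a 624px content box, a 1561px-wide image with a diagram-ish alt word promotes and a 1560px one does not, because the width test is a strict greater-than.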
--format keeps its page-size-alias meaning; --to is the output format, and the CLI says so when confused. Co-Authored-By: Claude Fable 5 * test(make-pdf): format gate — html no-network-refs + docx zip content checks HTML: zero src/href network refs, no script/link tags, inline SVG diagrams, data-URI images, screen layer, diagnostic survives. DOCX: valid OOXML zip (document.xml + Content_Types), >=2 PNG media (diagram raster + fixture image), headings + render=false source + diagnostic text in document.xml, no leaked mermaid source from rendered fences. Plus --to validation UX. Co-Authored-By: Claude Fable 5 * feat(diagram): /diagram skill — English in, editable diagram triplet out New skill: agent authors mermaid from the user's description and renders the triplet through the offline diagram-render bundle in the browse daemon — .mmd source (the single source of truth), editable .excalidraw (opens at excalidraw.com, round-trips back through re-render), and SVG + PNG. Flowcharts convert to fully editable scenes; other mermaid types render with an explicit upstream-converter limitation note. Never ships an unrendered source file; offline is the contract (no CDN fallback). Inventory rows in AGENTS.md + docs/skills.md; generated SKILL.md + llms.txt via gen:skill-docs. Co-Authored-By: Claude Fable 5 * test(diagram): paid E2E pair — gate triplet contract + periodic authoring judge diagram-triplet (gate, deterministic functional): a fresh claude -p agent following the skill extract must emit a parseable triplet — graph LR/TD in .mmd, excalidraw scene with >3 elements, SVG markup, PNG magic bytes. Verified live: pass, $0.17, 58s. diagram-authoring-quality (periodic, LLM-judged): faithfulness/labels/size rubric with a diagnostic-path cap, floor 6/10. Verified live: pass at exactly 6 with substantive critique. Touchfiles select both on diagram/** and lib/diagram-render/** changes; tier split per E2E_TIERS rules (eng-review D5). 
Co-Authored-By: Claude Fable 5 * test(diagram): register /diagram in the skill coverage matrix Gate: triplet contract + structural floor; periodic: authoring-quality judge. Co-Authored-By: Claude Fable 5 * feat(make-pdf): typography scale-up, zero image truncation, landscape vertical centering Dogfooding round on the repo README surfaced four output-quality bugs: - Type was too small everywhere: body 11→12pt, h1 22→26pt, h2 15→18pt, cover title 32→56pt with poster spacing, cover meta 10→13pt, TOC 11→12pt with tighter leading, code 9.5→10.5pt, tables 10→11pt. - Zero image truncation, ever: the max-width cap was figure-scoped, but markdown images render as

— a 1850px GitHub screenshot ran off the page edge. Global img { max-width: 100%; height: auto; } cap. - hyphens: auto put real 'dif-\nferent' breaks into the PDF text layer the moment 12pt made lines wrap (combined-gate caught it). Clean copy-paste is the product contract; left-aligned rag doesn't need hyphenation → hyphens: manual. - Promoted landscape blocks now vertically center. CSS flex/min-height centering fragments into phantom empty landscape pages in Chromium (bisected: min-height at ANY value; 3 promotions printed 5 pages), so image-policy computes an inline margin-top from each block's known aspect ratio against the landscape content box instead — fragmentation handles margins fine. .page-wide also drops its explicit break-before/ after (the page-name change already breaks on both sides). Co-Authored-By: Claude Fable 5 * test(make-pdf): pin zero-truncation invariant, typography floor, centering math Global img cap pinned as a regex invariant (the figure-scoped-cap regression class); typography floor (12pt body, 56pt cover, 12pt TOC); .page-wide must NOT carry min-height/flex (the phantom-landscape-page regression class); centering margin math verified both ways (2400×1000 image → 1.38in, 2050×600 viewBox diagram → 1.93in, page-filling directive block → no margin). Co-Authored-By: Claude Fable 5 * docs: diagram + multi-format documentation across README, make-pdf skill, and how-to guide README gains /make-pdf (Publisher) and /diagram (Diagram Maker) rows in the sprint table. make-pdf's skill doc — the agent-facing contract — gains Core patterns for mermaid/excalidraw fences (title/render=false/page= options), the image policy ({width=}/{page=} directives, zero-truncation, conservative auto-landscape), --to html|docx, and --strict, plus the --to vs --format disambiguation in Common flags. New docs/howto-diagrams-and-formats.md is the user-facing walkthrough: fences, directives, formats, /diagram triplet, the mermaid racetrack trick, troubleshooting. 
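The centering margins quoted in the test commit above (2400×1000 image → 1.38in, 2050×600 diagram → 1.93in) are consistent with a simple calculation, sketched here under stated assumptions: the block is scaled to the full width of the landscape content box, and that box is 9in × 6.5in (US letter landscape with 1in margins). Both the assumptions and the function name are illustrative, not read from image-policy.ts.

```typescript
// Hypothetical sketch: inline margin-top (in inches) that vertically centers a
// block scaled to the full width of a landscape content box.
function centeringMarginIn(
  widthPx: number,
  heightPx: number,
  boxWidthIn: number,
  boxHeightIn: number,
): number {
  // Height the block occupies once its width fills the box.
  const renderedHeightIn = (boxWidthIn * heightPx) / widthPx;
  const slackIn = boxHeightIn - renderedHeightIn;
  // A block that already fills the page gets no margin.
  return slackIn > 0 ? slackIn / 2 : 0;
}

// Assumed landscape content box: 11in x 8.5in page minus 1in margins.
const BOX_W = 9;
const BOX_H = 6.5;

const wideImage = centeringMarginIn(2400, 1000, BOX_W, BOX_H);  // 1.375, reported as 1.38in
const wideDiagram = centeringMarginIn(2050, 600, BOX_W, BOX_H); // about 1.93
const pageFilling = centeringMarginIn(900, 650, BOX_W, BOX_H);  // 0
```

Computing a margin from the known aspect ratio avoids the flex/min-height centering that the commit message reports as fragmenting into empty pages in Chromium.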
Co-Authored-By: Claude Fable 5 * test(make-pdf): fill ship-audit coverage gaps — downscale, reset contract, excalidraw fence, WebP Ship coverage audit found 9 gaps (85%); this fills the 2 HIGH + 3 MEDIUM and most LOW. diagram-gate fixture gains a 4200px incompressible photo (the only live coverage of __downscaleRaster AND the 64KB chunked jsViaBuffer eval transport — asserted via the downscale stderr warning), an ```excalidraw scene fence rendered through exportToSvg (vector labels + caption in pdftotext, no leaked scene JSON), and the broken fence MOVED BETWEEN the two mermaid fences so the second diagram rendering proves the D6.2 reset contract end-to-end. New coverage-gaps.test.ts (16 tests): mock-tab reset contract (exactly one reload, post-failure fence renders), excalidraw fail-fast diagnostic without a bundle call, rasterize error fallbacks (figure/tag kept, never silent), WebP VP8/VP8L/VP8X byte parsers, landscapeContentBox a4/asymmetric margins, bare-token slot fallback, resolveBundlePath env override + error shape, screenCss media scoping. Co-Authored-By: Claude Fable 5 * fix(make-pdf): pre-landing review wave — fence fidelity, injection hardening, Windows paths, transport rework Review army (6 specialists + red team) findings, all fixed: - Indented fences replay byte-for-byte and indented diagram fences are NOT extracted (red-team conf-9: the pre-pass reconstructed fences at column 0, splitting any list containing fenced code — every ordinary document). - String.replace $-pattern injection killed at every seam: substituteSlots, mergeStyle, img/src rewrites all use function replacements (a diagram label containing $' duplicated the document tail). - Big-expression transport reworked: browse `eval ` (one spawn, any size, Windows-safe) replaces the 64KB chunked window-buffer eval — fixes the per-chunk spawn cost, the char-vs-byte argv units, AND the Windows 32,767-char command-line ceiling in one move. 
- Staged-bundle trust: content verified by hash even when the file exists, and the rename-failure path re-hashes the survivor (sticky-bit /tmp EPERM would otherwise ride a pre-planted file past the check). - Windows drive-letter img srcs (C:/x.png) reach the local-path branch instead of being swallowed as unknown URL schemes. - DOCX rasterize-failure now embeds the decoded source as visible text — returning the figure made diagrams vanish silently (converter drops svg). - Fence source preserved as base64 data-gstack-source attribute (the comment encoding corrupted every '-->' arrow); decodeFigureSource() round-trips. - inlineLocalImages memoizes per path; file:// uses fileURLToPath; preview prints a divergence note for fences/local images; --to docx strips the watermark div and warns about print-only flags; TOC links resolve in html/docx (heading ids assigned); waitForExpression sleeps instead of busy-spinning; escapeHtml/svg-dims deduped to single definitions; typography stragglers (blockquote 12pt, footnotes 10pt, 42em screen measure); bundle BUILD_INFO gains srcSha256 for no-node_modules drift detection; MAX_TARGET_PX shared guard. Co-Authored-By: Claude Fable 5 * ci: make-pdf gate covers the diagram-render bundle; bundle pinned to LF make-pdf-gate.yml paths gain lib/diagram-render/** and the drift test (a bundle-only PR previously skipped every render gate AND no CI lane ran the drift check at all). .gitattributes pins dist html/json to LF so Windows autocrlf can't break the hash-pinned bundle. 
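The `$`-pattern injection named in the review-wave findings above comes from `String.prototype.replace` interpreting sequences such as `$'` and `$&` inside a string replacement. A minimal sketch of the hazard and the function-replacement fix follows; `substituteUnsafe` and `substituteSafe` are hypothetical names, not the real substituteSlots implementation.

```typescript
// Hypothetical sketch of the $-pattern hazard in String.prototype.replace.

// Unsafe: a string replacement is scanned for special patterns.
// "$'" expands to the text that FOLLOWS the match, so a label containing it
// duplicates the tail of the document.
function substituteUnsafe(doc: string, token: string, html: string): string {
  return doc.replace(token, html);
}

// Safe: the return value of a function replacement is inserted verbatim.
function substituteSafe(doc: string, token: string, html: string): string {
  return doc.replace(token, () => html);
}

const doc = "a SLOT b";
const label = "x$'y"; // e.g. text taken from a diagram label

const broken = substituteUnsafe(doc, "SLOT", label); // "a x by b"
const intact = substituteSafe(doc, "SLOT", label);   // "a x$'y b"
```

The function form costs nothing extra and removes the need to escape `$` in every value that might reach a replacement position.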
Co-Authored-By: Claude Fable 5 * test(make-pdf)+feat(diagram): review-wave test pins + skill transport hardening Tests: indented-fence byte-for-byte replay + no-extraction-in-lists, drive-letter local-path routing, $-pattern slot immunity, base64 source round-trip ('A --> B' exact), existing-style merge preservation, DOCX rasterize-failure surfaces source, srcSha256 + font-stack drift guards, landscape veto asserted as some-portrait/no-landscape (layout-order-proof), judge rubric cap lowered to 5 so it actually fails, vacuous error-shape test removed honestly, tmpdir cleanup. /diagram skill: base64 transport (template literals corrupted backticks/${ in sources), content-addressed staging with hash verification, and --tab-id pinned on every browse call so a concurrent /qa session can't be clobbered. Co-Authored-By: Claude Fable 5 * feat(make-pdf): out-of-tree image reads warn; --strict makes them fatal (D8.1) Local CLI semantics stay (absolute paths and ../ still inline, like pandoc), but never silently: an agent PDF-ing untrusted markdown can't quietly embed a file from outside the input directory into a shareable document without a visible warning, and --strict pipelines hard-fail. Two unit tests. Also: TODOS.md gains the deferred e2e-harness dedup entry (D8.2). Co-Authored-By: Claude Fable 5 * fix: pre-existing test failure in skill-e2e-bws operational-learning Root cause was the fixture, not model behavior: gstack-learnings-log gained an import of lib/jsonl-store.ts in the v1.57.5.0 injection-sanitization wave, but the test copies only bin/ scripts into its sandbox — the inline bun import failed and the script exited 1 before writing, on every run, on main too (reproduced at a5833c41). Fixture now stages lib/jsonl-store.ts beside bin/; verified deterministically (script exits 0, learning written) and via the paid test (1 pass). 
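The base64 source round-trip pinned by the tests above ('A --> B' exact) can be sketched as follows. The attribute name `data-gstack-source` comes from the commit message; the helper names and the exact encoding calls are assumptions, and the real `decodeFigureSource()` may differ.

```typescript
import { Buffer } from "node:buffer";

// Hypothetical sketch: carry diagram source in an attribute instead of an HTML
// comment. A comment cannot safely contain "-->", which mermaid arrows use.
function encodeSource(source: string): string {
  return Buffer.from(source, "utf8").toString("base64");
}

function decodeSource(encoded: string): string {
  return Buffer.from(encoded, "base64").toString("utf8");
}

// Wrap rendered SVG in a figure that keeps the original fence source.
// The standard base64 alphabet (A-Z, a-z, 0-9, +, /, =) contains no quotes,
// angle brackets, or hyphens, so the value needs no HTML escaping.
function figureWithSource(svg: string, source: string): string {
  return `<figure data-gstack-source="${encodeSource(source)}">${svg}</figure>`;
}

const source = "graph LR\n  A --> B";
const figure = figureWithSource("<svg></svg>", source);
const match = /data-gstack-source="([^"]*)"/.exec(figure);
const recovered = match && match[1] !== undefined ? decodeSource(match[1]) : null;
```

The same property explains the /diagram skill's switch to a base64 transport: backticks and `${` in a source string cannot be misread once they are encoded.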
Co-Authored-By: Claude Fable 5 * fix(make-pdf): adversarial-review wave — offline posture enforced, symlink-aware confinement, bounded reads Codex adversarial + structured review findings: - Remote images are now BLOCKED with a visible placeholder instead of warn-and-keep — leaving the tag meant Chromium fetched the URL at print time anyway, so the offline posture was a lie (tracking pixels and internal-URL probes ran without --allow-network). - The out-of-tree read check compares REAL paths: a symlink inside the input dir pointing at ~/.ssh/... passed the string-prefix check, including under --strict. Ordered after the existence check (realpath of a missing file false-positives on macOS /var → /private/var). - Image reads are bounded BEFORE reading: statSync first, non-regular files (fifo/device/dir) and >64MB files degrade to placeholders instead of hanging or exhausting memory; malformed percent-encoding (foo%zz.png) degrades to missing-image instead of crashing decodeURIComponent. - browse shell-outs get a 120s timeout — a wedged daemon or hostile mermaid source fails the run instead of hanging it. - TOC entries link to the heading's ACTUAL id (pre-id'd raw-HTML headings previously got dead #toc-N links); per-side margins compose into the CSS @page shorthand so a landscape promotion flipping preferCSSPageSize no longer silently reverts --margin-left/right to defaults (Codex P2). - The image memo is a typed object — literal NUL-byte separators had made diagram-prepass.ts register as binary to text tooling. Codex structured review GATE: PASS (no P1). 
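The symlink-aware confinement check described above (compare real paths, and only after the existence check) might look like the sketch below. Function and type names are hypothetical; the only library calls assumed are Node's `existsSync`, `realpathSync`, and the `path` helpers.

```typescript
import { existsSync, realpathSync } from "node:fs";
import * as path from "node:path";

// Hypothetical sketch of an out-of-tree read check that a symlink cannot fool.

// Pure containment test on two already-canonical paths. Using path.relative
// avoids the string-prefix trap where "/docs-evil" looks like it is under "/docs".
function isWithin(rootReal: string, targetReal: string): boolean {
  const rel = path.relative(rootReal, targetReal);
  if (rel === "") return true;            // same directory
  if (path.isAbsolute(rel)) return false; // e.g. different drive on Windows
  return rel !== ".." && !rel.startsWith(".." + path.sep);
}

type ReadVerdict = "missing" | "in-tree" | "out-of-tree";

function classifyImagePath(inputDir: string, imagePath: string): ReadVerdict {
  const resolved = path.resolve(inputDir, imagePath);
  // Existence first: realpath on a missing file throws, and comparing a
  // non-canonical path against a canonical root can false-positive.
  if (!existsSync(resolved)) return "missing";
  // Canonicalize BOTH sides so a symlink inside inputDir that points
  // elsewhere is judged by where it really lands.
  const rootReal = realpathSync(inputDir);
  const targetReal = realpathSync(resolved);
  return isWithin(rootReal, targetReal) ? "in-tree" : "out-of-tree";
}
```

A caller would then warn on "out-of-tree" and treat it as fatal under `--strict`, matching the behavior the commit message describes.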
Co-Authored-By: Claude Fable 5 * chore: bump version and changelog (v1.58.0.0) Co-Authored-By: Claude Fable 5 * docs: sync make-pdf image-policy docs with final shipped behavior (v1.58.0.0) The docs wave (87594420) predated the final review-wave commits, so two docs drifted from shipped behavior: - make-pdf/SKILL.md.tmpl + generated SKILL.md: remote images are BLOCKED with a visible placeholder (not warned-and-kept); out-of-tree reads (including via symlink) warn and --strict makes them fatal; --strict also covers oversized (>64MB) and non-regular files; troubleshooting entry now names the actual "[remote image blocked]" symptom. - docs/howto-diagrams-and-formats.md: same corrections in the image section, CI section, and troubleshooting. - README.md: docs/howto-diagrams-and-formats.md added to the Docs table (was unreachable from any entry-point doc). Co-Authored-By: Claude Fable 5 * docs: apply Codex doc-review findings for v1.58.0.0 Cross-model doc review (Codex, read-only) checked the v1.58.0.0 docs against the shipped code. Fixes: - howto + make-pdf SKILL: diagram source is preserved base64 in a data-gstack-source attribute, not an HTML comment (-- in mermaid arrows would corrupt a comment); fences must start at column 0; fence options example gains page=portrait; --to html "zero network refs" qualified (--allow-network deliberately keeps remote tags). - /diagram description, README + docs/skills.md rows: the hand-drawn aesthetic belongs to the .excalidraw artifact; rendered SVG/PNG use mermaid's clean neutral theme (lib/diagram-render entry.ts pins theme: "neutral"). - CHANGELOG v1.58.0.0 wording: --strict coverage lists all five fatal classes (missing/remote/out-of-tree/oversized/non-regular); fences are vector SVG in pdf+html, 300dpi PNG in docx; hand-drawn claim scoped to the .excalidraw file. - lib/diagram-render/README: Page API table gains __downscaleRaster. 
Co-Authored-By: Claude Fable 5 --------- Co-authored-by: Claude Fable 5 --- .gitattributes | 6 + .github/workflows/make-pdf-gate.yml | 4 +- .gitignore | 3 + AGENTS.md | 1 + CHANGELOG.md | 106 + README.md | 3 + TODOS.md | 38 + VERSION | 2 +- bun.lock | 153 +- diagram/SKILL.md | 881 +++ diagram/SKILL.md.tmpl | 150 + docs/howto-diagrams-and-formats.md | 146 + docs/skills.md | 3 +- gstack/llms.txt | 1 + lib/diagram-render/README.md | 42 + lib/diagram-render/THIRD-PARTY-LICENSES.md | 19 + lib/diagram-render/bun.lock | 625 ++ lib/diagram-render/dist/BUILD_INFO.json | 14 + lib/diagram-render/dist/diagram-render.html | 5135 +++++++++++++++++ lib/diagram-render/package.json | 16 + lib/diagram-render/scripts/build.ts | 99 + lib/diagram-render/src/entry.ts | 215 + make-pdf/SKILL.md | 87 +- make-pdf/SKILL.md.tmpl | 87 +- make-pdf/src/browseClient.ts | 35 +- make-pdf/src/cli.ts | 21 +- make-pdf/src/diagram-prepass.ts | 846 +++ make-pdf/src/image-policy.ts | 236 + make-pdf/src/image-size.ts | 117 + make-pdf/src/orchestrator.ts | 173 +- make-pdf/src/print-css.ts | 147 +- make-pdf/src/render.ts | 79 +- make-pdf/src/types.ts | 14 +- make-pdf/test/coverage-gaps.test.ts | 220 + make-pdf/test/diagram-prepass.test.ts | 403 ++ make-pdf/test/e2e/diagram-gate.test.ts | 173 + make-pdf/test/e2e/format-gate.test.ts | 131 + make-pdf/test/e2e/landscape-gate.test.ts | 136 + .../fixtures/diagram-assets/huge-noise.png | Bin 0 -> 302582 bytes .../test/fixtures/diagram-assets/red-box.png | Bin 0 -> 131 bytes .../fixtures/diagram-assets/wide-arch.png | Bin 0 -> 9302 bytes .../diagram-assets/wide-screenshot.png | Bin 0 -> 10062 bytes make-pdf/test/fixtures/diagram-gate.md | 48 + make-pdf/test/fixtures/landscape-gate.md | 52 + make-pdf/test/image-policy.test.ts | 215 + make-pdf/test/render.test.ts | 34 + package.json | 4 +- scripts/proactive-suggestions.json | 5 + test/diagram-render-drift.test.ts | 96 + test/helpers/touchfiles.ts | 9 + test/skill-coverage-matrix.ts | 5 + test/skill-e2e-bws.test.ts | 
10 +- test/skill-e2e-diagram.test.ts | 153 + 53 files changed, 11129 insertions(+), 69 deletions(-) create mode 100644 diagram/SKILL.md create mode 100644 diagram/SKILL.md.tmpl create mode 100644 docs/howto-diagrams-and-formats.md create mode 100644 lib/diagram-render/README.md create mode 100644 lib/diagram-render/THIRD-PARTY-LICENSES.md create mode 100644 lib/diagram-render/bun.lock create mode 100644 lib/diagram-render/dist/BUILD_INFO.json create mode 100644 lib/diagram-render/dist/diagram-render.html create mode 100644 lib/diagram-render/package.json create mode 100644 lib/diagram-render/scripts/build.ts create mode 100644 lib/diagram-render/src/entry.ts create mode 100644 make-pdf/src/diagram-prepass.ts create mode 100644 make-pdf/src/image-policy.ts create mode 100644 make-pdf/src/image-size.ts create mode 100644 make-pdf/test/coverage-gaps.test.ts create mode 100644 make-pdf/test/diagram-prepass.test.ts create mode 100644 make-pdf/test/e2e/diagram-gate.test.ts create mode 100644 make-pdf/test/e2e/format-gate.test.ts create mode 100644 make-pdf/test/e2e/landscape-gate.test.ts create mode 100644 make-pdf/test/fixtures/diagram-assets/huge-noise.png create mode 100644 make-pdf/test/fixtures/diagram-assets/red-box.png create mode 100644 make-pdf/test/fixtures/diagram-assets/wide-arch.png create mode 100644 make-pdf/test/fixtures/diagram-assets/wide-screenshot.png create mode 100644 make-pdf/test/fixtures/diagram-gate.md create mode 100644 make-pdf/test/fixtures/landscape-gate.md create mode 100644 make-pdf/test/image-policy.test.ts create mode 100644 test/diagram-render-drift.test.ts create mode 100644 test/skill-e2e-diagram.test.ts diff --git a/.gitattributes b/.gitattributes index 713416057..e67042f0a 100644 --- a/.gitattributes +++ b/.gitattributes @@ -37,3 +37,9 @@ bin/* text eol=lf *.gif binary *.ico binary *.pdf binary + +# The committed diagram-render bundle is hash-pinned (BUILD_INFO sha256); +# a CRLF rewrite on Windows checkout would break the drift 
test and change +# the content-addressed staged filename. +lib/diagram-render/dist/*.html text eol=lf +lib/diagram-render/dist/*.json text eol=lf diff --git a/.github/workflows/make-pdf-gate.yml b/.github/workflows/make-pdf-gate.yml index 769fccd2b..cd07e26bc 100644 --- a/.github/workflows/make-pdf-gate.yml +++ b/.github/workflows/make-pdf-gate.yml @@ -4,6 +4,8 @@ on: branches: [main] paths: - 'make-pdf/**' + - 'lib/diagram-render/**' + - 'test/diagram-render-drift.test.ts' - 'browse/src/meta-commands.ts' - 'browse/src/write-commands.ts' - 'browse/src/commands.ts' @@ -81,7 +83,7 @@ jobs: which pdftotext && pdftotext -v 2>&1 | head -1 || true - name: Run make-pdf unit tests - run: bun test make-pdf/test/*.test.ts + run: bun test make-pdf/test/*.test.ts test/diagram-render-drift.test.ts - name: Run E2E gates (combined-features copy-paste + emoji render) env: diff --git a/.gitignore b/.gitignore index 6eac08f36..5196c0d05 100644 --- a/.gitignore +++ b/.gitignore @@ -4,6 +4,9 @@ dist/ browse/dist/ design/dist/ make-pdf/dist/ +# diagram-render ships its built bundle (offline-at-install premise, eng-review D2) +!lib/diagram-render/dist/ +!lib/diagram-render/dist/** bin/gstack-global-discover* .gstack/ .claude/skills/ diff --git a/AGENTS.md b/AGENTS.md index a3d1fdb48..69651022d 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -104,6 +104,7 @@ End-to-end walkthrough: [docs/howto-ios-testing-with-gstack.md](docs/howto-ios-t | `/guard` | Activate both careful + freeze at once. | | `/unfreeze` | Remove directory edit restrictions. | | `/make-pdf` | Turn any markdown file into a publication-quality PDF. | +| `/diagram` | English in, diagram out: mermaid source + editable .excalidraw + SVG/PNG, offline. | ## Build commands diff --git a/CHANGELOG.md b/CHANGELOG.md index 503433f11..2dd4b64a8 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,5 +1,111 @@ # Changelog +## [1.58.0.0] - 2026-06-12 + +## **Your documents grow diagrams. 
Mermaid and excalidraw fences render as real pictures,** +## **and make-pdf now ships single-file HTML and Word output from the same markdown.** + +Put a ` ```mermaid ` fence in your markdown and `make-pdf` renders it as a crisp +vector diagram, fully offline, with the source preserved for round-trips. A broken +fence prints a loud red diagnostic block with the parse error, never silent raw +code. The new `/diagram` skill goes the other way: describe a flow in English and +get a triplet back, the mermaid source, an editable `.excalidraw` file you can open +at excalidraw.com in the hand-drawn style, and rendered SVG + PNG. Images got the +same care: local paths inline automatically and never truncate, phone photos +downscale to print resolution instead of blowing up the file, and a wide small-text +diagram promotes itself onto a vertically centered landscape page inside an +otherwise portrait document. One markdown file now exports three ways: +`--to pdf | html | docx`, where html is one self-contained file with zero network +references. Type is bigger across the board (12pt body, 56pt cover titles), TOC +links actually jump, and `--strict` turns missing, remote, out-of-tree, or +oversized images into hard CI failures. + +### The numbers that matter + +Measured on this repo's README (5,940 words, lists, code, screenshots, one +diagram fence) and the free gate suite. Reproduce: `make-pdf generate README.md +--cover --toc` and `bun test make-pdf/test/`. 
+ +| Metric | Before | After | Δ | +|--------|--------|-------|---| +| A mermaid fence in your PDF | raw code block | vector diagram | rendered | +| Output formats from one markdown | 1 (pdf) | 3 (pdf, html, docx) | +2 | +| Network requests at render time | up to 1 per remote image | 0 by default | sealed | +| Wide-diagram handling | shrunk into portrait | own centered landscape page | rotated | +| Free make-pdf gate tests | 121 | 189 | +68 | +| README → 29-page PDF with diagram | n/a | 4.4s | one command | + +The sealed-network number is the one to notice: the mermaid and excalidraw +runtimes are vendored into a 9.2MB sha-pinned bundle, so rendering works on a +plane and a tracking pixel in pasted markdown fetches nothing. + +### What this means for your documents + +The diagram you describe in English stays editable forever: `/diagram` writes the +source, you embed the source in markdown, and every export renders it fresh. Stop +pasting screenshots of diagrams into documents. Run `/diagram` for the picture, +` ```mermaid ` for the document, and `--to html` when the reader doesn't want a PDF. + +### Itemized changes + +#### Added +- ` ```mermaid ` and ` ```excalidraw ` fences render as inline vector SVG in pdf + and html output (docx embeds them as 300dpi PNGs). Fence options: `title="..."` (caption + aria-label), + `render=false` (keep as code), `page=landscape|portrait` (orientation override). + Render failures produce a visible diagnostic block with the parse error. +- `/diagram` skill: English in, editable triplet out (`.mmd` source, + `.excalidraw` scene, SVG + PNG). Flowcharts convert to fully editable + excalidraw scenes; other mermaid types render with an explicit limitation note. +- `lib/diagram-render/`: vendored offline bundle (mermaid 11.12.2, excalidraw + 0.18.0, exact pins), deterministic build, committed dist with sha256 + source + fingerprint, drift tests, THIRD-PARTY-LICENSES. +- `--to pdf|html|docx` output formats. 
HTML is one self-contained file (inline + SVG diagrams, data-URI images, zero network refs, screen-readable). DOCX is a + content-fidelity export with diagrams embedded as 300dpi PNGs and alt text. +- Per-image directives: `![x](a.png){width=full|50%|3in}` and + `{page=landscape|portrait}`. +- Conservative auto-landscape: wide, small-text, diagram-like images get their + own vertically centered landscape page (aspect ≥ 1.8, width over ~2.5x the + content box, diagram-ish alt word). Directives override in both directions. +- `--strict` for CI: missing images, remote images, out-of-tree image reads, + oversized files, and non-regular files fail the run instead of degrading to + placeholders. +- `docs/howto-diagrams-and-formats.md`: the full walkthrough, fences to formats. + +#### Changed +- Typography scale: 12pt body, 26pt h1, 56pt poster cover with 13pt meta, 12pt + TOC entries, larger code and tables. Auto-hyphenation is off so copy-paste + yields clean words. +- Local images inline as data URIs with byte-probed dimensions and never + truncate; oversized photos downscale to print resolution at inline time; + repeated images are read once. +- TOC links resolve in every format (headings get real anchor ids); the screen + layer hides print-only page-number dots in HTML output. +- Remote images are blocked with a visible placeholder unless `--allow-network` + is passed; out-of-tree image reads (including via symlink) warn loudly. +- `make-pdf preview` prints a note when the document contains fences or local + images that only `generate` renders fully. + +#### Fixed +- Relative image paths render correctly in PDFs (previously resolved against the + wrong base and could show as broken boxes). +- Fenced code inside lists survives the render byte-for-byte; indented fences + keep their list placement. 
+- Documents containing `$&`-style sequences in diagram labels render exactly; + Windows drive-letter image paths resolve as local files; malformed + percent-encoded image URLs degrade gracefully instead of failing the run. +- Per-side margins (`--margin-left` etc.) are honored on documents containing + landscape pages. + +#### For contributors +- 68 new free-tier gates (fence extraction, image policy, landscape promotion + with negative fixtures, format contracts, bundle drift) plus a paid gate-tier + /diagram triplet test and a periodic authoring-quality judge. +- make-pdf-gate CI now covers `lib/diagram-render/**` and the drift test; the + committed bundle is pinned to LF in .gitattributes. +- Fixed the `operational-learning` E2E fixture (bin scripts now ship with the + lib module they import). + ## [1.57.10.0] - 2026-06-10 ## **Codex review now runs by default everywhere it matters.** diff --git a/README.md b/README.md index c8b20b308..4bb177c3a 100644 --- a/README.md +++ b/README.md @@ -206,6 +206,8 @@ Each skill feeds into the next. `/office-hours` writes a design doc that `/plan- | `/autoplan` | **Review Pipeline** | One command, fully reviewed plan. Runs CEO → design → eng review automatically with encoded decision principles. Surfaces only taste decisions for your approval. | | `/spec` | **Spec Author** | Turn vague intent into a precise, executable spec in five phases (why, scope, technical with mandatory code-reading, draft, file). Codex quality gate before file (blocks below 7/10), fail-closed secret redaction, dedupe against existing issues, archive to `$GSTACK_STATE_ROOT/projects/$SLUG/specs/` for team-corpus recall. `--execute` spawns `claude -p` in a fresh worktree; `/ship` auto-closes the source issue on merge. Plan-mode aware. | | `/learn` | **Memory** | Manage what gstack learned across sessions. Review, search, prune, and export project-specific patterns, pitfalls, and preferences. 
Learnings compound across sessions so gstack gets smarter on your codebase over time. | +| `/make-pdf` | **Publisher** | Markdown in, publication-quality document out. Mermaid and excalidraw fences render as vector diagrams, fully offline. Images scale to the page and never truncate; wide diagrams get their own landscape page. `--to html` emits one self-contained file, `--to docx` a Word doc. | +| `/diagram` | **Diagram Maker** | English in, editable diagram out. Emits a triplet: mermaid source, `.excalidraw` you can open and edit on excalidraw.com (hand-drawn style), and rendered SVG/PNG. Zero network. Embed the source in markdown and `/make-pdf` renders it. | ### Which review should I use? @@ -429,6 +431,7 @@ Other references: [docs/gbrain-sync.md](docs/gbrain-sync.md) (sync-specific guid | Doc | What it covers | |-----|---------------| | [Skill Deep Dives](docs/skills.md) | Philosophy, examples, and workflow for every skill (includes Greptile integration) | +| [Diagrams & Document Formats](docs/howto-diagrams-and-formats.md) | Mermaid/excalidraw fences in PDFs, image sizing and safety defaults, `--to html\|docx`, `/diagram` triplets | | [Builder Ethos](ETHOS.md) | Builder philosophy: Boil the Ocean, Search Before Building, three layers of knowledge | | [Using GBrain with GStack](USING_GBRAIN_WITH_GSTACK.md) | Every path, flag, bin helper, and troubleshooting step for `/setup-gbrain` | | [GBrain Sync](docs/gbrain-sync.md) | Cross-machine memory setup, privacy modes, troubleshooting | diff --git a/TODOS.md b/TODOS.md index df510e032..52e806af3 100644 --- a/TODOS.md +++ b/TODOS.md @@ -2377,3 +2377,41 @@ Pre-existing in `auq-sdk-capture.ts` — affects `skill-e2e-ship-section-loading path to the fixture during the run. **Effort:** S (human ~3h, CC ~30min). **Depends on:** None. 
+ +### P3: Content-hash diagram render cache for make-pdf + +**What:** Cache rendered diagram SVG/PNG in `~/.gstack/cache/diagram-render/`, +keyed on `sha256(fence source + bundle version + render options)`, so repeat +`make-pdf` runs skip the browse render tab for unchanged diagrams. + +**Why:** Every run currently re-renders every fence (~150-300ms each). Docs with +10+ diagrams pay seconds per iteration during write-preview loops. Codex +outside-voice flagged the missing cache story during the eng review of the +diagram engine plan (2026-06-11, D7). + +**Context:** The diagram-render bundle ships a `BUILD_INFO.json` with a content +hash (see `lib/diagram-render/`) — use that as the bundle-version cache key +component so bundle bumps invalidate cleanly. Invalidation surface is the main +risk: stale renders after a mermaid theme change must not survive. Only worth +building once users hit multi-diagram docs; wedge perf is fine without it. + +**Effort:** S (human ~1d, CC ~30min). **Depends on:** diagram engine wedge +shipping (lib/diagram-render bundle versioning). + +### P3: Dedupe the make-pdf e2e gate-test harness + +**What:** Five e2e files (`combined-gate`, `emoji-gate`, `diagram-gate`, +`landscape-gate`, `format-gate`) each hand-roll the same prerequisite probe +(binary/browse/poppler checks with CI hard-fail vs local skip), mkdtemp/rm +lifecycle, and child-timeout constants. Extract a shared +`make-pdf/test/e2e/helpers.ts` (prerequisites(), withWorkDir(), runGenerate()). + +**Why:** Review-army maintainability finding on v1.58.0.0 — the boilerplate +diverges a little more with each new gate (diagram-gate now captures stderr +via Bun.spawnSync while the others use execFileSync), and a future fix to the +CI-hard-fail contract has to land five times. + +**Context:** Deferred at ship time (D8.2) because it's test-only churn across +five green files at the tail of a release. Zero user-facing value; pure DRY. + +**Effort:** S (human ~3h, CC ~20min). 
**Depends on:** None. diff --git a/VERSION b/VERSION index e535f0937..3a62339b5 100644 --- a/VERSION +++ b/VERSION @@ -1 +1 @@ -1.57.10.0 +1.58.0.0 diff --git a/bun.lock b/bun.lock index 96fda00aa..7a1833e94 100644 --- a/bun.lock +++ b/bun.lock @@ -8,6 +8,7 @@ "@huggingface/transformers": "^4.1.0", "@ngrok/ngrok": "^1.7.0", "diff": "^7.0.0", + "html-to-docx": "1.8.0", "marked": "^18.0.2", "playwright": "^1.58.2", "puppeteer-core": "^24.40.0", @@ -134,6 +135,14 @@ "@ngrok/ngrok-win32-x64-msvc": ["@ngrok/ngrok-win32-x64-msvc@1.7.0", "", { "os": "win32", "cpu": "x64" }, "sha512-UFJg/duEWzZlLkEs61Gz6/5nYhGaKI62I8dvUGdBR3NCtIMagehnFaFxmnXZldyHmCM8U0aCIFNpWRaKcrQkoA=="], + "@oozcitak/dom": ["@oozcitak/dom@1.15.6", "", { "dependencies": { "@oozcitak/infra": "1.0.5", "@oozcitak/url": "1.0.0", "@oozcitak/util": "8.3.4" } }, "sha512-k4uEIa6DI3FCrFJMGq/05U/59WnS9DjME0kaPqBRCJAqBTkmopbYV1Xs4qFKbDJ/9wOg8W97p+1E0heng/LH7g=="], + + "@oozcitak/infra": ["@oozcitak/infra@1.0.5", "", { "dependencies": { "@oozcitak/util": "8.0.0" } }, "sha512-o+zZH7M6l5e3FaAWy3ojaPIVN5eusaYPrKm6MZQt0DKNdgXa2wDYExjpP0t/zx+GoQgQKzLu7cfD8rHCLt8JrQ=="], + + "@oozcitak/url": ["@oozcitak/url@1.0.0", "", { "dependencies": { "@oozcitak/infra": "1.0.3", "@oozcitak/util": "1.0.2" } }, "sha512-LGrMeSxeLzsdaitxq3ZmBRVOrlRRQIgNNci6L0VRnOKlJFuRIkNm4B+BObXPCJA6JT5bEJtrrwjn30jueHJYZQ=="], + + "@oozcitak/util": ["@oozcitak/util@8.3.4", "", {}, "sha512-6gH/bLQJSJEg7OEpkH4wGQdA8KXHRbzL1YkGyUO12YNAgV3jxKy4K9kvfXj4+9T0OLug5k58cnPCKSSIKzp7pg=="], + "@protobufjs/aspromise": ["@protobufjs/aspromise@1.1.2", "", {}, "sha512-j+gKExEuLmKwvz3OgROXtrJ2UG2x8Ch2YZUxahh+s1F2HZ+wAceUNLkvy6zKCPVRkU++ZWQrdxsUeQXmcg4uoQ=="], "@protobufjs/base64": ["@protobufjs/base64@1.1.2", "", {}, "sha512-AZkcAA5vnN/v4PDqKyMR5lx7hZttPDgClv83E//FMNhR2TMcLUhfRUBHCmSl0oi9zMgDDqRUJkSxO3wm85+XLg=="], @@ -198,6 +207,8 @@ "boolean": ["boolean@3.2.0", "", {}, "sha512-d0II/GO9uf9lfUHH2BQsjxzRJZBdsjgsBiW4BvhWk/3qoKwQFjIDVN19PfX8F2D/r9PCMTtLWjYVCFrpeYUzsw=="], + 
"browser-split": ["browser-split@0.0.1", "", {}, "sha512-JhvgRb2ihQhsljNda3BI8/UcRHVzrVwo3Q+P8vDtSiyobXuFpuZ9mq+MbRGMnC22CjW3RrfXdg6j6ITX8M+7Ow=="], + "buffer-crc32": ["buffer-crc32@0.2.13", "", {}, "sha512-VO9Ht/+p3SN7SKWqcrgEzjGbRSJYTx+Q1pTQC0wrWqHx0vpJraQ6GtHx8tvcg1rlK1byhU5gccxgOgj7B0TDkQ=="], "bytes": ["bytes@3.1.2", "", {}, "sha512-/Nf7TyzTx6S3yRJObOAV7956r8cr2+Oj8AC5dt8wSP3BQAoeX58NoHyCU8P8zGkNXStjTSi6fzO6F0pBdcYbEg=="], @@ -206,6 +217,8 @@ "call-bound": ["call-bound@1.0.4", "", { "dependencies": { "call-bind-apply-helpers": "^1.0.2", "get-intrinsic": "^1.3.0" } }, "sha512-+ys997U96po4Kx/ABpBCqhA9EuxJaQWDQg7295H4hBphv3IZg0boBKuwYpt4YXp6MZ5AmZQnU/tyMTlRpaSejg=="], + "camelize": ["camelize@1.0.1", "", {}, "sha512-dU+Tx2fsypxTgtLoE36npi3UqcjSSMNYfkqgmoEhtZrraP5VWq0K7FkWVTYa8eMPtnU/G2txVsfdCJTn9uzpuQ=="], + "chromium-bidi": ["chromium-bidi@14.0.0", "", { "dependencies": { "mitt": "^3.0.1", "zod": "^3.24.1" }, "peerDependencies": { "devtools-protocol": "*" } }, "sha512-9gYlLtS6tStdRWzrtXaTMnqcM4dudNegMXJxkR0I/CXObHalYeYcAMPrL19eroNZHtJ8DQmu1E+ZNOYu/IXMXw=="], "cliui": ["cliui@8.0.1", "", { "dependencies": { "string-width": "^4.2.0", "strip-ansi": "^6.0.1", "wrap-ansi": "^7.0.0" } }, "sha512-BSeNnyus75C4//NQ9gQt1/csTXyo/8Sb+afLAkzAptFuMsod9HFokGNudZpi/oQV73hnVK+sR+5PVRMd+Dr7YQ=="], @@ -222,6 +235,8 @@ "cookie-signature": ["cookie-signature@1.2.2", "", {}, "sha512-D76uU73ulSXrD1UXF4KE2TMxVVwhsnCgfAyTg9k8P6KGZjlXKrOLe4dJQKI3Bxi5wjesZoFXJWElNWBjPZMbhg=="], + "core-util-is": ["core-util-is@1.0.3", "", {}, "sha512-ZQBvi1DcpJ4GDqanjucZ2Hj3wEO5pZDS89BWbkcrvdxksJorwUDDZamX9ldFkp9aw2lmBDLgkObEA4DWNJ9FYQ=="], + "cors": ["cors@2.8.6", "", { "dependencies": { "object-assign": "^4", "vary": "^1" } }, "sha512-tJtZBBHA6vjIAaF6EnIaq6laBBP9aq/Y3ouVJjEfoHbRBcHBAHYcMh/w8LDrk2PvIMMq8gmopa5D4V8RmbrxGw=="], "cross-spawn": ["cross-spawn@7.0.6", "", { "dependencies": { "path-key": "^3.1.0", "shebang-command": "^2.0.0", "which": "^2.0.1" } }, 
"sha512-uV2QOWP2nWzsy2aMp8aRibhi9dlzF5Hgh5SHaB9OiTGEyDTiJJyx0uy51QXdyWbtAHNua4XJzUKca3OzKUd3vA=="], @@ -246,6 +261,16 @@ "diff": ["diff@7.0.0", "", {}, "sha512-PJWHUb1RFevKCwaFA9RlG5tCd+FO5iRh9A8HEtkmBH2Li03iJriB6m6JIN4rGz3K3JLawI7/veA1xzRKP6ISBw=="], + "dom-serializer": ["dom-serializer@0.2.2", "", { "dependencies": { "domelementtype": "^2.0.1", "entities": "^2.0.0" } }, "sha512-2/xPb3ORsQ42nHYiSunXkDjPLBaEj/xTwUO4B7XCZQTRk7EBtTOPaygh10YAAh2OI1Qrp6NWfpAhzswj0ydt9g=="], + + "dom-walk": ["dom-walk@0.1.2", "", {}, "sha512-6QvTW9mrGeIegrFXdtQi9pk7O/nSK6lSdXW2eqUspN5LWD7UTji2Fqw5V2YLjBpHEoU9Xl/eUWNpDeZvoyOv2w=="], + + "domelementtype": ["domelementtype@1.3.1", "", {}, "sha512-BSKB+TSpMpFI/HOxCNr1O8aMOTZ8hT3pM3GQ0w/mWRmkhEDSFJkkyzz4XQsBV44BChwGkrDfMyjVD0eA2aFV3w=="], + + "domhandler": ["domhandler@2.4.2", "", { "dependencies": { "domelementtype": "1" } }, "sha512-JiK04h0Ht5u/80fdLMCEmV4zkNh2BcoMFBmZ/91WtYZ8qVXSKjiw7fXMgFPnHcSZgOo3XdinHvmnDUeMf5R4wA=="], + + "domutils": ["domutils@1.7.0", "", { "dependencies": { "dom-serializer": "0", "domelementtype": "1" } }, "sha512-Lgd2XcJ/NjEw+7tFvfKxOzCYKZsdct5lczQ2ZaQY8Djz7pfAD3Gbp8ySJWtreII/vDlMVmxwa6pHmdxIYgttDg=="], + "dunder-proto": ["dunder-proto@1.0.1", "", { "dependencies": { "call-bind-apply-helpers": "^1.0.1", "es-errors": "^1.3.0", "gopd": "^1.2.0" } }, "sha512-KIN/nDJBQRcXw0MLVhZE9iQHmG68qAVIBg9CqmUYjmQIhgij9U5MFvrqkUL5FbtyyzZuOeOt0zdeRe4UY7ct+A=="], "ee-first": ["ee-first@1.1.1", "", {}, "sha512-WMwm9LhRUo+WUaRN+vRuETqG89IgZphVSNkdFgeb6sS/E4OrDIN7t48CAewSHXc6C8lefD8KKfr5vY61brQlow=="], @@ -256,6 +281,12 @@ "end-of-stream": ["end-of-stream@1.4.5", "", { "dependencies": { "once": "^1.4.0" } }, "sha512-ooEGc6HP26xXq/N+GCGOT0JKCLDGrq2bQUZrQ7gyrJiZANJ/8YDTxTpQBXGMn+WbIQXNVpyWymm7KYVICQnyOg=="], + "ent": ["ent@2.2.2", "", { "dependencies": { "call-bound": "^1.0.3", "es-errors": "^1.3.0", "punycode": "^1.4.1", "safe-regex-test": "^1.1.0" } }, 
"sha512-kKvD1tO6BM+oK9HzCPpUdRb4vKFQY/FPTFmurMvh6LlN68VMrdj77w8yp51/kDbpkFOS9J8w5W6zIzgM2H8/hw=="], + + "entities": ["entities@1.1.2", "", {}, "sha512-f2LZMYl1Fzu7YSBKg+RoROelpOaNrcGmE9AZubeDfrCEia483oW4MI4VyFd5VNHIgQ/7qm1I0wUHK1eJnn2y2w=="], + + "error": ["error@4.4.0", "", { "dependencies": { "camelize": "^1.0.0", "string-template": "~0.2.0", "xtend": "~4.0.0" } }, "sha512-SNDKualLUtT4StGFP7xNfuFybL2f6iJujFtrWuvJqGbVQGaN+adE23veqzPz1hjUjTunLi2EnJ+0SJxtbJreKw=="], + "es-define-property": ["es-define-property@1.0.1", "", {}, "sha512-e3nRfgfUZ4rNGL232gUgX06QNyyez04KdjFrF+LTRoOXmrOgFKDg4BCdsjW8EnT69eqdYGmRpJwiPVYNrCaW3g=="], "es-errors": ["es-errors@1.3.0", "", {}, "sha512-Zf5H2Kxt2xjTvbJvP2ZWLEICxA6j+hAmMzIlypy4xcBg1vKVnx89Wy0GbS+kf5cwCVFFzdCFh2XSCFNULS6csw=="], @@ -280,6 +311,8 @@ "etag": ["etag@1.8.1", "", {}, "sha512-aIL5Fx7mawVa300al2BnEE4iNvo1qETxLrPI/o05L7z6go7fCw1J6EQmbK4FmJ2AS7kgVF/KEZWufBfdClMcPg=="], + "ev-store": ["ev-store@7.0.0", "", { "dependencies": { "individual": "^3.0.0" } }, "sha512-otazchNRnGzp2YarBJ+GXKVGvhxVATB1zmaStxJBYet0Dyq7A9VhH8IUEB/gRcL6Ch52lfpgPTRJ2m49epyMsQ=="], + "events-universal": ["events-universal@1.0.1", "", { "dependencies": { "bare-events": "^2.7.0" } }, "sha512-LUd5euvbMLpwOF8m6ivPCbhQeSiYVNb8Vs0fQ8QjXo0JTkEHpz8pxdQf0gStltaPpw0Cca8b39KxvK9cfKRiAw=="], "eventsource": ["eventsource@3.0.7", "", { "dependencies": { "eventsource-parser": "^3.0.1" } }, "sha512-CRT1WTyuQoD771GW56XEZFQ/ZoSfWid1alKGDYMmkt2yl8UXrVR4pspqWNEcqKvVIzg6PAltWjxcSSPrboA4iA=="], @@ -322,6 +355,8 @@ "get-uri": ["get-uri@6.0.5", "", { "dependencies": { "basic-ftp": "^5.0.2", "data-uri-to-buffer": "^6.0.2", "debug": "^4.3.4" } }, "sha512-b1O07XYq8eRuVzBNgJLstU6FYc1tS6wnMtF1I1D9lE8LxZSOGZ7LhxN54yPP6mGw5f2CkXY2BQUL9Fx41qvcIg=="], + "global": ["global@4.4.0", "", { "dependencies": { "min-document": "^2.19.0", "process": "^0.11.10" } }, "sha512-wv/LAoHdRE3BeTGz53FAamhGlPLhlssK45usmGFThIi4XqnBmjKQ16u+RNbP7WvigRZDxUsM0J3gcQ5yicaL0w=="], + "global-agent": 
["global-agent@3.0.0", "", { "dependencies": { "boolean": "^3.0.1", "es6-error": "^4.1.1", "matcher": "^3.0.0", "roarr": "^2.15.3", "semver": "^7.3.2", "serialize-error": "^7.0.1" } }, "sha512-PT6XReJ+D07JvGoxQMkT6qji/jVNfX/h364XHZOWeRzy64sSFr+xJ5OX7LI3b4MPQzdL4H8Y8M0xzPpsVMwA8Q=="], "globalthis": ["globalthis@1.0.4", "", { "dependencies": { "define-properties": "^1.2.1", "gopd": "^1.0.1" } }, "sha512-DpLKbNU4WylpxJykQujfCcwYWiV/Jhm50Goo0wrVILAv5jOr9d+H+UR3PhSCD2rCCEIg0uc+G+muBTwD54JhDQ=="], @@ -334,10 +369,20 @@ "has-symbols": ["has-symbols@1.1.0", "", {}, "sha512-1cDNdwJ2Jaohmb3sg4OmKaMBwuC48sYni5HUw2DvsC8LjGTLK9h+eb1X6RyuOHe4hT0ULCW68iomhjUoKUqlPQ=="], + "has-tostringtag": ["has-tostringtag@1.0.2", "", { "dependencies": { "has-symbols": "^1.0.3" } }, "sha512-NqADB8VjPFLM2V0VvHUewwwsw0ZWBaIdgo+ieHtK3hasLz4qeCRjYcqfB6AQrBggRKppKF8L52/VqdVsO47Dlw=="], + "hasown": ["hasown@2.0.3", "", { "dependencies": { "function-bind": "^1.1.2" } }, "sha512-ej4AhfhfL2Q2zpMmLo7U1Uv9+PyhIZpgQLGT1F9miIGmiCJIoCgSmczFdrc97mWT4kVY72KA+WnnhJ5pghSvSg=="], "hono": ["hono@4.12.14", "", {}, "sha512-am5zfg3yu6sqn5yjKBNqhnTX7Cv+m00ox+7jbaKkrLMRJ4rAdldd1xPd/JzbBWspqaQv6RSTrgFN95EsfhC+7w=="], + "html-entities": ["html-entities@2.6.0", "", {}, "sha512-kig+rMn/QOVRvr7c86gQ8lWXq+Hkv6CbAH1hLu+RG338StTpE8Z0b44SDVaqVu7HGKf27frdmUYEs9hTUX/cLQ=="], + + "html-to-docx": ["html-to-docx@1.8.0", "", { "dependencies": { "@oozcitak/dom": "1.15.6", "@oozcitak/util": "8.3.4", "color-name": "^1.1.4", "html-entities": "^2.3.3", "html-to-vdom": "^0.7.0", "image-size": "^1.0.0", "image-to-base64": "^2.2.0", "jszip": "^3.7.1", "lodash": "^4.17.21", "mime-types": "^2.1.35", "nanoid": "^3.1.25", "virtual-dom": "^2.1.1", "xmlbuilder2": "2.1.2" } }, "sha512-IiMBWIqXM4+cEsW//RKoonWV7DlXAJBmmKI73XJSVWTIXjGUaxSr2ck1jqzVRZknpvO8xsFnVicldKVAWrBYBA=="], + + "html-to-vdom": ["html-to-vdom@0.7.0", "", { "dependencies": { "ent": "^2.0.0", "htmlparser2": "^3.8.2" } }, 
"sha512-k+d2qNkbx0JO00KezQsNcn6k2I/xSBP4yXYFLvXbcasTTDh+RDLUJS3puxqyNnpdyXWRHFGoKU7cRmby8/APcQ=="], + + "htmlparser2": ["htmlparser2@3.10.1", "", { "dependencies": { "domelementtype": "^1.3.1", "domhandler": "^2.3.0", "domutils": "^1.5.1", "entities": "^1.1.1", "inherits": "^2.0.1", "readable-stream": "^3.1.1" } }, "sha512-IgieNijUMbkDovyoKObU1DUhm1iwNYE/fuifEoEHfd1oZKZDaONBSkal7Y01shxsM49R4XaMdGez3WnF9UfiCQ=="], + "http-errors": ["http-errors@2.0.1", "", { "dependencies": { "depd": "~2.0.0", "inherits": "~2.0.4", "setprototypeof": "~1.2.0", "statuses": "~2.0.2", "toidentifier": "~1.0.1" } }, "sha512-4FbRdAX+bSdmo4AUFuS0WNiPz8NgFt+r8ThgNWmlrjQjt1Q7ZR9+zTlce2859x4KSXrwIsaeTqDoKQmtP8pLmQ=="], "http-proxy-agent": ["http-proxy-agent@7.0.2", "", { "dependencies": { "agent-base": "^7.1.0", "debug": "^4.3.4" } }, "sha512-T1gkAiYYDWYx3V5Bmyu7HcfcvL7mUrTWiM6yOfa3PIphViJ/gFPbvidQ+veqSOHci/PxBcDabeUNCzpOODJZig=="], @@ -346,6 +391,14 @@ "iconv-lite": ["iconv-lite@0.7.2", "", { "dependencies": { "safer-buffer": ">= 2.1.2 < 3.0.0" } }, "sha512-im9DjEDQ55s9fL4EYzOAv0yMqmMBSZp6G0VvFyTMPKWxiSBHUj9NW/qqLmXUwXrrM7AvqSlTCfvqRb0cM8yYqw=="], + "image-size": ["image-size@1.2.1", "", { "dependencies": { "queue": "6.0.2" }, "bin": { "image-size": "bin/image-size.js" } }, "sha512-rH+46sQJ2dlwfjfhCyNx5thzrv+dtmBIhPHk0zgRUukHzZ/kRueTJXoYYsclBaKcSMBWuGbOFXtioLpzTb5euw=="], + + "image-to-base64": ["image-to-base64@2.2.0", "", { "dependencies": { "node-fetch": "^2.6.0" } }, "sha512-Z+aMwm/91UOQqHhrz7Upre2ytKhWejZlWV/JxUTD1sT7GWWKFDJUEV5scVQKnkzSgPHFuQBUEWcanO+ma0PSVw=="], + + "immediate": ["immediate@3.0.6", "", {}, "sha512-XXOFtyqDjNDAQxVfYxuF7g9Il/IbWmmlQg2MYKOH8ExIT1qg6xc4zyS3HaEEATgs1btfzxq15ciUiY7gjSXRGQ=="], + + "individual": ["individual@3.0.0", "", {}, "sha512-rUY5vtT748NMRbEMrTNiFfy29BgGZwGXUi2NFUVMWQrogSLzlJvQV9eeMWi+g1aVaQ53tpyLAQtd5x/JH0Nh1g=="], + "inherits": ["inherits@2.0.4", "", {}, 
"sha512-k/vGaX4/Yla3WzyMCvTQOXYeIHvqOKtnqBduzTHpzpQZzAskKMhZ2K+EnBiSM9zGSoIFeMpXKxa4dYeZIQqewQ=="], "ip-address": ["ip-address@10.2.0", "", {}, "sha512-/+S6j4E9AHvW9SWMSEY9Xfy66O5PWvVEJ08O0y5JGyEKQpojb0K0GKpz/v5HJ/G0vi3D2sjGK78119oXZeE0qA=="], @@ -354,8 +407,14 @@ "is-fullwidth-code-point": ["is-fullwidth-code-point@3.0.0", "", {}, "sha512-zymm5+u+sCsSWyD9qNaejV3DFvhCKclKdizYaJUuHA83RLjb7nSuGnddCHGv0hk+KY7BMAlsWeK4Ueg6EV6XQg=="], + "is-object": ["is-object@1.0.2", "", {}, "sha512-2rRIahhZr2UWb45fIOuvZGpFtz0TyOZLf32KxBbSoUCeZR495zCKlWUKKUByk3geS2eAs7ZAABt0Y/Rx0GiQGA=="], + "is-promise": ["is-promise@4.0.0", "", {}, "sha512-hvpoI6korhJMnej285dSg6nu1+e6uxs7zG3BYAm5byqDsgJNWwxzM6z6iZiAgQR4TJ30JmBTOwqZUw3WlyH3AQ=="], + "is-regex": ["is-regex@1.2.1", "", { "dependencies": { "call-bound": "^1.0.2", "gopd": "^1.2.0", "has-tostringtag": "^1.0.2", "hasown": "^2.0.2" } }, "sha512-MjYsKHO5O7mCsmRGxWcLWheFqN9DJ/2TmngvjKXihe6efViPqc274+Fx/4fYj/r03+ESvBdTXK0V6tA3rgez1g=="], + + "isarray": ["isarray@1.0.0", "", {}, "sha512-VLghIWNM6ELQzo7zwmcg0NmTVyWKYjvIeM83yjp0wRDTmUnrM678fQbcKBo6n2CJEF0szoG//ytg+TKla89ALQ=="], + "isexe": ["isexe@2.0.0", "", {}, "sha512-RHxMLp9lnKHGHRng9QFhRCMbYAcVpn69smSGcq3f36xjgVVWThj4qqLbTLlq7Ssj8B+fIQ1EuCEGI2lKsyQeIw=="], "jose": ["jose@6.2.2", "", {}, "sha512-d7kPDd34KO/YnzaDOlikGpOurfF0ByC2sEV4cANCtdqLlTfBlw2p14O/5d/zv40gJPbIQxfES3nSx1/oYNyuZQ=="], @@ -368,6 +427,12 @@ "json-stringify-safe": ["json-stringify-safe@5.0.1", "", {}, "sha512-ZClg6AaYvamvYEE82d3Iyd3vSSIjQ+odgjaTzRuO3s7toCdFKczob2i0zCh7JE8kWn17yvAWhUVxvqGwUalsRA=="], + "jszip": ["jszip@3.10.1", "", { "dependencies": { "lie": "~3.3.0", "pako": "~1.0.2", "readable-stream": "~2.3.6", "setimmediate": "^1.0.5" } }, "sha512-xXDvecyTpGLrqFrvkrUSoxxfJI5AH7U8zxxtVclpsUtMCq4JQ290LY8AW5c7Ggnr/Y/oK+bQMbqK2qmtk3pN4g=="], + + "lie": ["lie@3.3.0", "", { "dependencies": { "immediate": "~3.0.5" } }, "sha512-UaiMJzeWRlEujzAuw5LokY1L5ecNQYZKfmyZ9L7wDHb/p5etKaxXhohBcrw0EYby+G/NA52vRSN4N39dxHAIwQ=="], + + "lodash": 
["lodash@4.18.1", "", {}, "sha512-dMInicTPVE8d1e5otfwmmjlxkZoUpiVLwyeTdUsi/Caj/gfzzblBcCE5sRHV/AsjuCmxWrte2TNGSYuCeCq+0Q=="], + "long": ["long@5.3.2", "", {}, "sha512-mNAgZ1GmyNhD7AuqnTG3/VQ26o760+ZYBPKjPvugO8+nLbYfX6TVpJPseBvopbdY+qpZ/lKUnmEc1LeZYS3QAA=="], "lru-cache": ["lru-cache@7.18.3", "", {}, "sha512-jumlc0BIUrS3qJGgIkWZsyfAM7NCWiBcCDhnd+3NNM5KbBmLTgHVfWBcg6W+rLUsIpzpERPsvwUP7CckAQSOoA=="], @@ -382,18 +447,26 @@ "merge-descriptors": ["merge-descriptors@2.0.0", "", {}, "sha512-Snk314V5ayFLhp3fkUREub6WtjBfPdCPY1Ln8/8munuLuiYhsABgBVWsozAG+MWMbVEvcdcpbi9R7ww22l9Q3g=="], - "mime-db": ["mime-db@1.54.0", "", {}, "sha512-aU5EJuIN2WDemCcAp2vFBfp/m4EAhWJnUNSSw0ixs7/kXbd6Pg64EmwJkNdFhB8aWt1sH2CTXrLxo/iAGV3oPQ=="], + "mime-db": ["mime-db@1.52.0", "", {}, "sha512-sPU4uV7dYlvtWJxwwxHD0PuihVNiE7TyAbQ5SWxDCB9mUYvOgroQOwYQQOKPJ8CIbE+1ETVlOoK1UC2nU3gYvg=="], - "mime-types": ["mime-types@3.0.2", "", { "dependencies": { "mime-db": "^1.54.0" } }, "sha512-Lbgzdk0h4juoQ9fCKXW4by0UJqj+nOOrI9MJ1sSj4nI8aI2eo1qmvQEie4VD1glsS250n15LsWsYtCugiStS5A=="], + "mime-types": ["mime-types@2.1.35", "", { "dependencies": { "mime-db": "1.52.0" } }, "sha512-ZDY+bPm5zTTF+YpCrAU9nK0UgICYPT0QtT1NZWFv4s++TNkcgVaT0g6+4R2uI4MjQjzysHB1zxuWL50hzaeXiw=="], + + "min-document": ["min-document@2.19.2", "", { "dependencies": { "dom-walk": "^0.1.0" } }, "sha512-8S5I8db/uZN8r9HSLFVWPdJCvYOejMcEC82VIzNUc6Zkklf/d1gg2psfE79/vyhWOj4+J8MtwmoOz3TmvaGu5A=="], "mitt": ["mitt@3.0.1", "", {}, "sha512-vKivATfr97l2/QBCYAkXYDbrIWPM2IIKEl7YPhjCvKlG3kE2gm+uBo6nEXK3M5/Ffh/FLpKExzOQ3JJoJGFKBw=="], "ms": ["ms@2.1.3", "", {}, "sha512-6FlzubTLZG3J2a/NVCAleEhjzq5oxgHyaCU9yYXvcLsvoVaHJq/s5xXI6/XXP6tz7R9xAOtHnSO/tXtF3WRTlA=="], + "nanoid": ["nanoid@3.3.12", "", { "bin": { "nanoid": "bin/nanoid.cjs" } }, "sha512-ZB9RH/39qpq5Vu6Y+NmUaFhQR6pp+M2Xt76XBnEwDaGcVAqhlvxrl3B2bKS5D3NH3QR76v3aSrKaF/Kiy7lEtQ=="], + "negotiator": ["negotiator@1.0.0", "", {}, 
"sha512-8Ofs/AUQh8MaEcrlq5xOX0CQ9ypTF5dl78mjlMNfOK08fzpgTHQRQPBxcPlEtIw0yRpws+Zo/3r+5WRby7u3Gg=="], "netmask": ["netmask@2.0.2", "", {}, "sha512-dBpDMdxv9Irdq66304OLfEmQ9tbNRFnFTuZiLo+bD+r332bBmMJ8GBLXklIXXgxd3+v9+KUnZaUR5PJMa75Gsg=="], + "next-tick": ["next-tick@0.2.2", "", {}, "sha512-f7h4svPtl+QidoBv4taKXUjJ70G2asaZ8G28nS0OkqaalX8dwwrtWtyxEDPK62AC00ur/+/E0pUwBwY5EPn15Q=="], + + "node-fetch": ["node-fetch@2.7.0", "", { "dependencies": { "whatwg-url": "^5.0.0" }, "peerDependencies": { "encoding": "^0.1.0" }, "optionalPeers": ["encoding"] }, "sha512-c4FRfUm/dbcWZ7U+1Wq0AwCyFL+3nt2bEw05wfxSz+DWpWsitgmSgYmy2dQdWyKC1694ELPqMs/YzUSNozLt8A=="], + "object-assign": ["object-assign@4.1.1", "", {}, "sha512-rJgTQnkUnH1sFw8yT6VSU3zD3sWmu6sZhIseY8VX+GRu3P6F7Fu+JNDoXfklElbLJSnc3FUQHVe4cU5hj+BcUg=="], "object-inspect": ["object-inspect@1.13.4", "", {}, "sha512-W67iLl4J2EXEGTbfeHCffrjDfitvLANg0UlX3wFUUSTx92KXRFegMHUVgSqE+wvhAbi4WqjGg9czysTV2Epbew=="], @@ -414,6 +487,8 @@ "pac-resolver": ["pac-resolver@7.0.1", "", { "dependencies": { "degenerator": "^5.0.0", "netmask": "^2.0.2" } }, "sha512-5NPgf87AT2STgwa2ntRMr45jTKrYBGkVU36yT0ig/n/GMAa3oPqhZfIQ2kMEimReg0+t9kZViDVZ83qfVUlckg=="], + "pako": ["pako@1.0.11", "", {}, "sha512-4hLB8Py4zZce5s4yd9XzopqwVv/yGNhV1Bl8NTmCq1763HeK2+EwVTv+leGeL13Dnh2wfbqowVPXCIO0z4taYw=="], + "parseurl": ["parseurl@1.3.3", "", {}, "sha512-CiyeOxFT/JZyN5m0z9PfXw4SCBJ6Sygz1Dpl0wqjlhDEGGBP1GnsUVEL0p63hoG1fcj3fHynXi9NYO4nWOL+qQ=="], "path-key": ["path-key@3.1.1", "", {}, "sha512-ojmeN0qd+y0jszEtoY48r0Peq5dwMEkIlCOu6Q5f41lfkswXuKtYrhgoTpLnyIcHm24Uhqx+5Tqm2InSwLhE6Q=="], @@ -430,6 +505,10 @@ "playwright-core": ["playwright-core@1.58.2", "", { "bin": { "playwright-core": "cli.js" } }, "sha512-yZkEtftgwS8CsfYo7nm0KE8jsvm6i/PTgVtB8DL726wNf6H2IMsDuxCpJj59KDaxCtSnrWan2AeDqM7JBaultg=="], + "process": ["process@0.11.10", "", {}, "sha512-cdGef/drWFoydD1JsMzuFf8100nZl+GT+yacc2bEced5f9Rjk4z+WtFUTBu9PhOi9j/jfmBPu0mMEY4wIdAF8A=="], + + "process-nextick-args": 
["process-nextick-args@2.0.1", "", {}, "sha512-3ouUOpQhtgrbOa17J7+uxOTpITYWaGP7/AhoR3+A+/1e9skrzelGi/dXzEYyvbxubEF6Wn2ypscTKiKJFFn1ag=="], + "progress": ["progress@2.0.3", "", {}, "sha512-7PiHtLll5LdnKIMw100I+8xJXR5gW2QwWYkT6iJva0bXitZKa/XMrSbdmg3r2Xnaidz9Qumd0VPaMrZlF9V9sA=="], "protobufjs": ["protobufjs@7.5.5", "", { "dependencies": { "@protobufjs/aspromise": "^1.1.2", "@protobufjs/base64": "^1.1.2", "@protobufjs/codegen": "^2.0.4", "@protobufjs/eventemitter": "^1.1.0", "@protobufjs/fetch": "^1.1.0", "@protobufjs/float": "^1.0.2", "@protobufjs/inquire": "^1.1.0", "@protobufjs/path": "^1.1.2", "@protobufjs/pool": "^1.1.0", "@protobufjs/utf8": "^1.1.0", "@types/node": ">=13.7.0", "long": "^5.0.0" } }, "sha512-3wY1AxV+VBNW8Yypfd1yQY9pXnqTAN+KwQxL8iYm3/BjKYMNg4i0owhEe26PWDOMaIrzeeF98Lqd5NGz4omiIg=="], @@ -442,14 +521,20 @@ "pump": ["pump@3.0.4", "", { "dependencies": { "end-of-stream": "^1.1.0", "once": "^1.3.1" } }, "sha512-VS7sjc6KR7e1ukRFhQSY5LM2uBWAUPiOPa/A3mkKmiMwSmRFUITt0xuj+/lesgnCv+dPIEYlkzrcyXgquIHMcA=="], + "punycode": ["punycode@1.4.1", "", {}, "sha512-jmYNElW7yvO7TV33CjSmvSiE2yco3bV2czu/OzDKdMNVZQWfxCblURLhf+47syQRBntjfLdd/H0egrzIG+oaFQ=="], + "puppeteer-core": ["puppeteer-core@24.40.0", "", { "dependencies": { "@puppeteer/browsers": "2.13.0", "chromium-bidi": "14.0.0", "debug": "^4.4.3", "devtools-protocol": "0.0.1581282", "typed-query-selector": "^2.12.1", "webdriver-bidi-protocol": "0.4.1", "ws": "^8.19.0" } }, "sha512-MWL3XbUCfVgGR0gRsidzT6oKJT2QydPLhMITU6HoVWiiv4gkb6gJi3pcdAa8q4HwjBTbqISOWVP4aJiiyUJvag=="], "qs": ["qs@6.15.1", "", { "dependencies": { "side-channel": "^1.1.0" } }, "sha512-6YHEFRL9mfgcAvql/XhwTvf5jKcOiiupt2FiJxHkiX1z4j7WL8J/jRHYLluORvc1XxB5rV20KoeK00gVJamspg=="], + "queue": ["queue@6.0.2", "", { "dependencies": { "inherits": "~2.0.3" } }, "sha512-iHZWu+q3IdFZFX36ro/lKBkSvfkztY5Y7HMiPlOUjhupPcG2JMfst2KKEpu5XndviX/3UhFbRngUPNKtgvtZiA=="], + "range-parser": ["range-parser@1.2.1", "", {}, 
"sha512-Hrgsx+orqoygnmhFbKaHE6c296J+HTAQXoxEF6gNupROmmGJRoyzfG3ccAveqCBrwr/2yxQ5BVd/GTl5agOwSg=="], "raw-body": ["raw-body@3.0.2", "", { "dependencies": { "bytes": "~3.1.2", "http-errors": "~2.0.1", "iconv-lite": "~0.7.0", "unpipe": "~1.0.0" } }, "sha512-K5zQjDllxWkf7Z5xJdV0/B0WTNqx6vxG70zJE4N0kBs4LovmEYWJzQGxC9bS9RAKu3bgM40lrd5zoLJ12MQ5BA=="], + "readable-stream": ["readable-stream@2.3.8", "", { "dependencies": { "core-util-is": "~1.0.0", "inherits": "~2.0.3", "isarray": "~1.0.0", "process-nextick-args": "~2.0.0", "safe-buffer": "~5.1.1", "string_decoder": "~1.1.1", "util-deprecate": "~1.0.1" } }, "sha512-8p0AUk4XODgIewSi0l8Epjs+EVnWiK7NoDIEGU0HhE7+ZyY8D1IMY7odu5lRrFXGg71L15KG8QrPmum45RTtdA=="], + "require-directory": ["require-directory@2.1.1", "", {}, "sha512-fGxEI7+wsG9xrvdjsrlmL22OMTTiHRwAMroiEeMgq8gzoLC/PQr7RsRDSTLUg/bZAZtF+TVIkHc6/4RIKrui+Q=="], "require-from-string": ["require-from-string@2.0.2", "", {}, "sha512-Xf0nWe6RseziFMu+Ap9biiUbmplq6S9/p+7w7YXP/JBHhrUDDUhwa+vANyubuqfZWTveU//DYVGsDG7RKL/vEw=="], @@ -458,6 +543,10 @@ "router": ["router@2.2.0", "", { "dependencies": { "debug": "^4.4.0", "depd": "^2.0.0", "is-promise": "^4.0.0", "parseurl": "^1.3.3", "path-to-regexp": "^8.0.0" } }, "sha512-nLTrUKm2UyiL7rlhapu/Zl45FwNgkZGaCpZbIHajDYgwlJCOzLSk+cIPAnsEqV955GjILJnKbdQC1nVPz+gAYQ=="], + "safe-buffer": ["safe-buffer@5.1.2", "", {}, "sha512-Gd2UZBJDkXlY7GbJxfsE8/nvKkUEU1G38c1siN6QP6a9PT9MmHB8GnpscSmMJSoF8LOIrt8ud/wPtojys4G6+g=="], + + "safe-regex-test": ["safe-regex-test@1.1.0", "", { "dependencies": { "call-bound": "^1.0.2", "es-errors": "^1.3.0", "is-regex": "^1.2.1" } }, "sha512-x/+Cz4YrimQxQccJf5mKEbIa1NzeCRNI5Ecl/ekmlYaampdNLPalVyIcCZNNH3MvmqBugV5TMYZXv0ljslUlaw=="], + "safer-buffer": ["safer-buffer@2.1.2", "", {}, "sha512-YZo3K82SD7Riyi0E1EQPojLz7kpepnSQI9IyPbHHg1XXXevb5dJI7tpyN2ADxGcQbHG7vcyRHk0cbwqcQriUtg=="], "semver": ["semver@7.7.4", "", { "bin": { "semver": "bin/semver.js" } }, 
"sha512-vFKC2IEtQnVhpT78h1Yp8wzwrf8CM+MzKMHGJZfBtzhZNycRFnXsHk6E5TxIkkMsgNS7mdX3AGB7x2QM2di4lA=="], @@ -470,6 +559,8 @@ "serve-static": ["serve-static@2.2.1", "", { "dependencies": { "encodeurl": "^2.0.0", "escape-html": "^1.0.3", "parseurl": "^1.3.3", "send": "^1.2.0" } }, "sha512-xRXBn0pPqQTVQiC8wyQrKs2MOlX24zQ0POGaj0kultvoOCstBQM5yvOhAVSUwOMjQtTvsPWoNCHfPGwaaQJhTw=="], + "setimmediate": ["setimmediate@1.0.5", "", {}, "sha512-MATJdZp8sLqDl/68LfQmbP8zKPLQNV6BIZoIgrscFDQ+RsvK/BxeDQOgyxKKoh0y/8h3BqVFnCqQ/gd+reiIXA=="], + "setprototypeof": ["setprototypeof@1.2.0", "", {}, "sha512-E5LDX7Wrp85Kil5bhZv46j8jOeboKq5JMmYM3gVGdGH8xFpPWXUMsNrlODCrkoxMEeNi/XZIwuRvY4XNwYMJpw=="], "sharp": ["sharp@0.34.5", "", { "dependencies": { "@img/colour": "^1.0.0", "detect-libc": "^2.1.2", "semver": "^7.7.3" }, "optionalDependencies": { "@img/sharp-darwin-arm64": "0.34.5", "@img/sharp-darwin-x64": "0.34.5", "@img/sharp-libvips-darwin-arm64": "1.2.4", "@img/sharp-libvips-darwin-x64": "1.2.4", "@img/sharp-libvips-linux-arm": "1.2.4", "@img/sharp-libvips-linux-arm64": "1.2.4", "@img/sharp-libvips-linux-ppc64": "1.2.4", "@img/sharp-libvips-linux-riscv64": "1.2.4", "@img/sharp-libvips-linux-s390x": "1.2.4", "@img/sharp-libvips-linux-x64": "1.2.4", "@img/sharp-libvips-linuxmusl-arm64": "1.2.4", "@img/sharp-libvips-linuxmusl-x64": "1.2.4", "@img/sharp-linux-arm": "0.34.5", "@img/sharp-linux-arm64": "0.34.5", "@img/sharp-linux-ppc64": "0.34.5", "@img/sharp-linux-riscv64": "0.34.5", "@img/sharp-linux-s390x": "0.34.5", "@img/sharp-linux-x64": "0.34.5", "@img/sharp-linuxmusl-arm64": "0.34.5", "@img/sharp-linuxmusl-x64": "0.34.5", "@img/sharp-wasm32": "0.34.5", "@img/sharp-win32-arm64": "0.34.5", "@img/sharp-win32-ia32": "0.34.5", "@img/sharp-win32-x64": "0.34.5" } }, "sha512-Ou9I5Ft9WNcCbXrU9cMgPBcCK8LiwLqcbywW3t4oDV37n1pzpuNLsYiAV8eODnjbtQlSDwZ2cUEeQz4E54Hltg=="], @@ -500,8 +591,12 @@ "streamx": ["streamx@2.25.0", "", { "dependencies": { "events-universal": "^1.0.0", "fast-fifo": "^1.3.2", 
"text-decoder": "^1.1.0" } }, "sha512-0nQuG6jf1w+wddNEEXCF4nTg3LtufWINB5eFEN+5TNZW7KWJp6x87+JFL43vaAUPyCfH1wID+mNVyW6OHtFamg=="], + "string-template": ["string-template@0.2.1", "", {}, "sha512-Yptehjogou2xm4UJbxJ4CxgZx12HBfeystp0y3x7s4Dj32ltVVG1Gg8YhKjHZkHicuKpZX/ffilA8505VbUbpw=="], + "string-width": ["string-width@4.2.3", "", { "dependencies": { "emoji-regex": "^8.0.0", "is-fullwidth-code-point": "^3.0.0", "strip-ansi": "^6.0.1" } }, "sha512-wKyQRQpjJ0sIp62ErSZdGsjMJWsap5oRNihHhu6G7JVO/9jIB6UyevL+tXuOqrng8j/cxKTWyWUwvSTriiZz/g=="], + "string_decoder": ["string_decoder@1.1.1", "", { "dependencies": { "safe-buffer": "~5.1.0" } }, "sha512-n/ShnvDi6FHbbVfviro+WojiFzv+s8MPMHBczVePfUpDJLwoLT0ht1l4YwBCbi8pJAveEEdnkHyPyTP/mzRfwg=="], + "strip-ansi": ["strip-ansi@6.0.1", "", { "dependencies": { "ansi-regex": "^5.0.1" } }, "sha512-Y38VPSHcqkFrCpFnQ9vuSXmquuv5oXOKpGeT6aGrr3o3Gc9AlVa6JBfUSOCnbxGGZF+/0ooI7KrPuUSztUdU5A=="], "tar-fs": ["tar-fs@3.1.2", "", { "dependencies": { "pump": "^3.0.0", "tar-stream": "^3.1.5" }, "optionalDependencies": { "bare-fs": "^4.0.1", "bare-path": "^3.0.0" } }, "sha512-QGxxTxxyleAdyM3kpFs14ymbYmNFrfY+pHj7Z8FgtbZ7w2//VAgLMac7sT6nRpIHjppXO2AwwEOg0bPFVRcmXw=="], @@ -514,6 +609,8 @@ "toidentifier": ["toidentifier@1.0.1", "", {}, "sha512-o5sSPKEkg/DIQNmH43V0/uerLrpzVedkUh8tGNvaeXpfpuwjKenlSox/2O/BTlZUtEe+JG7s5YhEz608PlAHRA=="], + "tr46": ["tr46@0.0.3", "", {}, "sha512-N3WMsuqV66lT30CrXNbEjx4GEwlow3v6rr4mCcv6prnfwhS01rkgyFdjPNBYd9br7LpXV1+Emh01fHnq2Gdgrw=="], + "ts-algebra": ["ts-algebra@2.0.0", "", {}, "sha512-FPAhNPFMrkwz76P7cdjdmiShwMynZYN6SgOujD1urY4oNm80Ou9oMdmbR45LotcKOXoy7wSmHkRFE6Mxbrhefw=="], "tslib": ["tslib@2.8.1", "", {}, "sha512-oJFu94HQb+KVduSUQL7wnpmqnfmLsOA/nAh6b6EH0wCEoK0/mPeXU6c3wKDV83MkOuHPRHtSXKKU99IBazS/2w=="], @@ -528,10 +625,18 @@ "unpipe": ["unpipe@1.0.0", "", {}, "sha512-pjy2bYhSsufwWlKwPc+l3cN7+wuJlK6uz0YdJEOlQDbl6jo/YlPi4mb8agUkVC8BF7V8NuzeyPNqRksA3hztKQ=="], + "util-deprecate": ["util-deprecate@1.0.2", "", {}, 
"sha512-EPD5q1uXyFxJpCrLnCc1nHnq3gOa6DZBocAIiI2TaSCA7VCJ1UJDMagCzIkXNsUYfD1daK//LTEQ8xiIbrHtcw=="], + "vary": ["vary@1.1.2", "", {}, "sha512-BNGbWLfd0eUPabhkXUVm0j8uuvREyTh5ovRa/dyow/BqAbZJyC+5fU+IzQOzmAKzYqYRAISoRhdQr3eIZ/PXqg=="], + "virtual-dom": ["virtual-dom@2.1.1", "", { "dependencies": { "browser-split": "0.0.1", "error": "^4.3.0", "ev-store": "^7.0.0", "global": "^4.3.0", "is-object": "^1.0.1", "next-tick": "^0.2.2", "x-is-array": "0.1.0", "x-is-string": "0.1.0" } }, "sha512-wb6Qc9Lbqug0kRqo/iuApfBpJJAq14Sk1faAnSmtqXiwahg7PVTvWMs9L02Z8nNIMqbwsxzBAA90bbtRLbw0zg=="], + "webdriver-bidi-protocol": ["webdriver-bidi-protocol@0.4.1", "", {}, "sha512-ARrjNjtWRRs2w4Tk7nqrf2gBI0QXWuOmMCx2hU+1jUt6d00MjMxURrhxhGbrsoiZKJrhTSTzbIrc554iKI10qw=="], + "webidl-conversions": ["webidl-conversions@3.0.1", "", {}, "sha512-2JAn3z8AR6rjK8Sm8orRC0h/bcl/DqL7tRPdGZ4I1CjdF+EaMLmYxBHyXuKL849eucPFhvBoxMsflfOb8kxaeQ=="], + + "whatwg-url": ["whatwg-url@5.0.0", "", { "dependencies": { "tr46": "~0.0.3", "webidl-conversions": "^3.0.0" } }, "sha512-saE57nupxk6v3HY35+jzBwYa0rKSy0XR8JSxZPwgLr7ys0IBzhGviA1/TUGJLmSVqs8pb9AnvICXEuOHLprYTw=="], + "which": ["which@2.0.2", "", { "dependencies": { "isexe": "^2.0.0" }, "bin": { "node-which": "./bin/node-which" } }, "sha512-BLI3Tl1TW3Pvl70l3yq3Y64i+awpwXqsGBYWkkqMtnbXgrMD+yj7rhW0kuEDxzJaYXGjEW5ogapKNMEKNMjibA=="], "wrap-ansi": ["wrap-ansi@7.0.0", "", { "dependencies": { "ansi-styles": "^4.0.0", "string-width": "^4.1.0", "strip-ansi": "^6.0.0" } }, "sha512-YVGIj2kamLSTxw6NsZjoBxfSwsn0ycdesmc4p+Q21c5zPuZ1pl+NfxVdxPtdHvmNVOQ6XSYG4AUtyt/Fi7D16Q=="], @@ -540,6 +645,14 @@ "ws": ["ws@8.20.0", "", { "peerDependencies": { "bufferutil": "^4.0.1", "utf-8-validate": ">=5.0.2" }, "optionalPeers": ["bufferutil", "utf-8-validate"] }, "sha512-sAt8BhgNbzCtgGbt2OxmpuryO63ZoDk/sqaB/znQm94T4fCEsy/yV+7CdC1kJhOU9lboAEU7R3kquuycDoibVA=="], + "x-is-array": ["x-is-array@0.1.0", "", {}, 
"sha512-goHPif61oNrr0jJgsXRfc8oqtYzvfiMJpTqwE7Z4y9uH+T3UozkGqQ4d2nX9mB9khvA8U2o/UbPOFjgC7hLWIA=="], + + "x-is-string": ["x-is-string@0.1.0", "", {}, "sha512-GojqklwG8gpzOVEVki5KudKNoq7MbbjYZCbyWzEz7tyPA7eleiE0+ePwOWQQRb5fm86rD3S8Tc0tSFf3AOv50w=="], + + "xmlbuilder2": ["xmlbuilder2@2.1.2", "", { "dependencies": { "@oozcitak/dom": "1.15.5", "@oozcitak/infra": "1.0.5", "@oozcitak/util": "8.3.3" } }, "sha512-PI710tmtVlQ5VmwzbRTuhmVhKnj9pM8Si+iOZCV2g2SNo3gCrpzR2Ka9wNzZtqfD+mnP+xkrqoNy0sjKZqP4Dg=="], + + "xtend": ["xtend@4.0.2", "", {}, "sha512-LKYU1iAXJXUgAXn9URjiu+MWhyUXHsvfp7mcuYm9dSUKK0/CjtrUwFAxD82/mCWbtLsGjFIad0wIsod4zrTAEQ=="], + "xterm": ["xterm@5.3.0", "", {}, "sha512-8QqjlekLUFTrU6x7xck1MsPzPA571K5zNqWm0M0oroYEWVOptZ0+ubQSkQ3uxIEhcIHRujJy6emDWX4A7qyFzg=="], "xterm-addon-fit": ["xterm-addon-fit@0.8.0", "", { "peerDependencies": { "xterm": "^5.0.0" } }, "sha512-yj3Np7XlvxxhYF/EJ7p3KHaMt6OdwQ+HDu573Vx1lRXsVxOcnVJs51RgjZOouIZOczTsskaS+CpXspK81/DLqw=="], @@ -558,12 +671,48 @@ "@anthropic-ai/claude-agent-sdk/@anthropic-ai/sdk": ["@anthropic-ai/sdk@0.81.0", "", { "dependencies": { "json-schema-to-ts": "^3.1.1" }, "peerDependencies": { "zod": "^3.25.0 || ^4.0.0" }, "optionalPeers": ["zod"], "bin": { "anthropic-ai-sdk": "bin/cli" } }, "sha512-D4K5PvEV6wPiRtVlVsJHIUhHAmOZ6IT/I9rKlTf84gR7GyyAurPJK7z9BOf/AZqC5d1DhYQGJNKRmV+q8dGhgw=="], + "@oozcitak/infra/@oozcitak/util": ["@oozcitak/util@8.0.0", "", {}, "sha512-+9Hq6yuoq/3TRV/n/xcpydGBq2qN2/DEDMqNTG7rm95K6ZE2/YY/sPyx62+1n8QsE9O26e5M1URlXsk+AnN9Jw=="], + + "@oozcitak/url/@oozcitak/infra": ["@oozcitak/infra@1.0.3", "", { "dependencies": { "@oozcitak/util": "1.0.1" } }, "sha512-9O2wxXGnRzy76O1XUxESxDGsXT5kzETJPvYbreO4mv6bqe1+YSuux2cZTagjJ/T4UfEwFJz5ixanOqB0QgYAag=="], + + "@oozcitak/url/@oozcitak/util": ["@oozcitak/util@1.0.2", "", {}, "sha512-4n8B1cWlJleSOSba5gxsMcN4tO8KkkcvXhNWW+ADqvq9Xj+Lrl9uCa90GRpjekqQJyt84aUX015DG81LFpZYXA=="], + + "accepts/mime-types": ["mime-types@3.0.2", "", { "dependencies": { "mime-db": "^1.54.0" } 
}, "sha512-Lbgzdk0h4juoQ9fCKXW4by0UJqj+nOOrI9MJ1sSj4nI8aI2eo1qmvQEie4VD1glsS250n15LsWsYtCugiStS5A=="], + + "dom-serializer/domelementtype": ["domelementtype@2.3.0", "", {}, "sha512-OLETBj6w0OsagBwdXnPdN0cnMfF9opN69co+7ZrbfPGrdpPVNBUj02spi6B1N7wChLQiPn4CSH/zJvXw56gmHw=="], + + "dom-serializer/entities": ["entities@2.2.0", "", {}, "sha512-p92if5Nz619I0w+akJrLZH0MX0Pb5DX39XOwQTtXSdQQOaYH03S1uIQp4mhOZtAXrxq4ViO67YTiLBo2638o9A=="], + + "express/mime-types": ["mime-types@3.0.2", "", { "dependencies": { "mime-db": "^1.54.0" } }, "sha512-Lbgzdk0h4juoQ9fCKXW4by0UJqj+nOOrI9MJ1sSj4nI8aI2eo1qmvQEie4VD1glsS250n15LsWsYtCugiStS5A=="], + "express-rate-limit/ip-address": ["ip-address@10.1.0", "", {}, "sha512-XXADHxXmvT9+CRxhXg56LJovE+bmWnEWB78LB83VZTprKTmaC5QfruXocxzTZ2Kl0DNwKuBdlIhjL8LeY8Sf8Q=="], + "htmlparser2/readable-stream": ["readable-stream@3.6.2", "", { "dependencies": { "inherits": "^2.0.3", "string_decoder": "^1.1.1", "util-deprecate": "^1.0.1" } }, "sha512-9u/sniCrY3D5WdsERHzHE4G2YCXqoG5FTHUiCC4SIbr6XcLZBY05ya9EKjYek9O5xOAwjGq+1JdGBAS7Q9ScoA=="], + "onnxruntime-web/onnxruntime-common": ["onnxruntime-common@1.24.0-dev.20251116-b39e144322", "", {}, "sha512-BOoomdHYmNRL5r4iQ4bMvsl2t0/hzVQ3OM3PHD0gxeXu1PmggqBv3puZicEUVOA3AtHHYmqZtjMj9FOfGrATTw=="], + "send/mime-types": ["mime-types@3.0.2", "", { "dependencies": { "mime-db": "^1.54.0" } }, "sha512-Lbgzdk0h4juoQ9fCKXW4by0UJqj+nOOrI9MJ1sSj4nI8aI2eo1qmvQEie4VD1glsS250n15LsWsYtCugiStS5A=="], + "socks-proxy-agent/socks": ["socks@2.8.7", "", { "dependencies": { "ip-address": "^10.0.1", "smart-buffer": "^4.2.0" } }, "sha512-HLpt+uLy/pxB+bum/9DzAgiKS8CX1EvbWxI4zlmgGCExImLdiad2iCwXT5Z4c9c3Eq8rP2318mPW2c+QbtjK8A=="], + "type-is/mime-types": ["mime-types@3.0.2", "", { "dependencies": { "mime-db": "^1.54.0" } }, "sha512-Lbgzdk0h4juoQ9fCKXW4by0UJqj+nOOrI9MJ1sSj4nI8aI2eo1qmvQEie4VD1glsS250n15LsWsYtCugiStS5A=="], + + "xmlbuilder2/@oozcitak/dom": ["@oozcitak/dom@1.15.5", "", { "dependencies": { "@oozcitak/infra": "1.0.5", "@oozcitak/url": 
"1.0.0", "@oozcitak/util": "8.0.0" } }, "sha512-L6v3Mwb0TaYBYgeYlIeBaHnc+2ZEaDSbFiRm5KmqZQSoBlbPlf+l6aIH/sD5GUf2MYwULw00LT7+dOnEuAEC0A=="], + + "xmlbuilder2/@oozcitak/util": ["@oozcitak/util@8.3.3", "", {}, "sha512-Ufpab7G5PfnEhQyy5kDg9C8ltWJjsVT1P/IYqacjstaqydG4Q21HAT2HUZQYBrC/a1ZLKCz87pfydlDvv8y97w=="], + + "@oozcitak/url/@oozcitak/infra/@oozcitak/util": ["@oozcitak/util@1.0.1", "", {}, "sha512-dFwFqcKrQnJ2SapOmRD1nQWEZUtbtIy9Y6TyJquzsalWNJsKIPxmTI0KG6Ypyl8j7v89L2wixH9fQDNrF78hKg=="], + + "accepts/mime-types/mime-db": ["mime-db@1.54.0", "", {}, "sha512-aU5EJuIN2WDemCcAp2vFBfp/m4EAhWJnUNSSw0ixs7/kXbd6Pg64EmwJkNdFhB8aWt1sH2CTXrLxo/iAGV3oPQ=="], + + "express/mime-types/mime-db": ["mime-db@1.54.0", "", {}, "sha512-aU5EJuIN2WDemCcAp2vFBfp/m4EAhWJnUNSSw0ixs7/kXbd6Pg64EmwJkNdFhB8aWt1sH2CTXrLxo/iAGV3oPQ=="], + + "send/mime-types/mime-db": ["mime-db@1.54.0", "", {}, "sha512-aU5EJuIN2WDemCcAp2vFBfp/m4EAhWJnUNSSw0ixs7/kXbd6Pg64EmwJkNdFhB8aWt1sH2CTXrLxo/iAGV3oPQ=="], + "socks-proxy-agent/socks/ip-address": ["ip-address@10.1.0", "", {}, "sha512-XXADHxXmvT9+CRxhXg56LJovE+bmWnEWB78LB83VZTprKTmaC5QfruXocxzTZ2Kl0DNwKuBdlIhjL8LeY8Sf8Q=="], + + "type-is/mime-types/mime-db": ["mime-db@1.54.0", "", {}, "sha512-aU5EJuIN2WDemCcAp2vFBfp/m4EAhWJnUNSSw0ixs7/kXbd6Pg64EmwJkNdFhB8aWt1sH2CTXrLxo/iAGV3oPQ=="], + + "xmlbuilder2/@oozcitak/dom/@oozcitak/util": ["@oozcitak/util@8.0.0", "", {}, "sha512-+9Hq6yuoq/3TRV/n/xcpydGBq2qN2/DEDMqNTG7rm95K6ZE2/YY/sPyx62+1n8QsE9O26e5M1URlXsk+AnN9Jw=="], } } diff --git a/diagram/SKILL.md b/diagram/SKILL.md new file mode 100644 index 000000000..1ea8fd1a2 --- /dev/null +++ b/diagram/SKILL.md @@ -0,0 +1,881 @@ +--- +name: diagram +version: 1.0.0 +description: "Turn an English description (or mermaid source) into a diagram triplet: the source, an editable .excalidraw file you can open (gstack)" +allowed-tools: + - Bash + - Read + - Write + - AskUserQuestion +triggers: + - make a diagram + - draw a diagram + - create a flowchart + - diagram this + - visualize this 
flow + - architecture diagram +--- + + + + +## When to invoke this skill + +on excalidraw.com, +and rendered SVG + PNG (clean mermaid style; the .excalidraw carries the +hand-drawn aesthetic). Fully offline. +Use when asked to "make a diagram", "draw the architecture", "create a +flowchart", "diagram this", or "visualize this flow". + +## Preamble (run first) + +```bash +_UPD=$(~/.claude/skills/gstack/bin/gstack-update-check 2>/dev/null || .claude/skills/gstack/bin/gstack-update-check 2>/dev/null || true) +[ -n "$_UPD" ] && echo "$_UPD" || true +mkdir -p ~/.gstack/sessions +touch ~/.gstack/sessions/"$PPID" +_SESSIONS=$(find ~/.gstack/sessions -mmin -120 -type f 2>/dev/null | wc -l | tr -d ' ') +find ~/.gstack/sessions -mmin +120 -type f -exec rm {} + 2>/dev/null || true +_PROACTIVE=$(~/.claude/skills/gstack/bin/gstack-config get proactive 2>/dev/null || echo "true") +_PROACTIVE_PROMPTED=$([ -f ~/.gstack/.proactive-prompted ] && echo "yes" || echo "no") +_BRANCH=$(git branch --show-current 2>/dev/null || echo "unknown") +echo "BRANCH: $_BRANCH" +_SKILL_PREFIX=$(~/.claude/skills/gstack/bin/gstack-config get skill_prefix 2>/dev/null || echo "false") +echo "PROACTIVE: $_PROACTIVE" +echo "PROACTIVE_PROMPTED: $_PROACTIVE_PROMPTED" +echo "SKILL_PREFIX: $_SKILL_PREFIX" +source <(~/.claude/skills/gstack/bin/gstack-repo-mode 2>/dev/null) || true +REPO_MODE=${REPO_MODE:-unknown} +echo "REPO_MODE: $REPO_MODE" +_SESSION_KIND=$(~/.claude/skills/gstack/bin/gstack-session-kind 2>/dev/null || echo "interactive") +case "$_SESSION_KIND" in spawned|headless|interactive) ;; *) _SESSION_KIND="interactive" ;; esac +echo "SESSION_KIND: $_SESSION_KIND" +_LAKE_SEEN=$([ -f ~/.gstack/.completeness-intro-seen ] && echo "yes" || echo "no") +echo "LAKE_INTRO: $_LAKE_SEEN" +_TEL=$(~/.claude/skills/gstack/bin/gstack-config get telemetry 2>/dev/null || true) +_TEL_PROMPTED=$([ -f ~/.gstack/.telemetry-prompted ] && echo "yes" || echo "no") +_TEL_START=$(date +%s) +_SESSION_ID="$$-$(date +%s)" +echo 
"TELEMETRY: ${_TEL:-off}" +echo "TEL_PROMPTED: $_TEL_PROMPTED" +_EXPLAIN_LEVEL=$(~/.claude/skills/gstack/bin/gstack-config get explain_level 2>/dev/null || echo "default") +if [ "$_EXPLAIN_LEVEL" != "default" ] && [ "$_EXPLAIN_LEVEL" != "terse" ]; then _EXPLAIN_LEVEL="default"; fi +echo "EXPLAIN_LEVEL: $_EXPLAIN_LEVEL" +_QUESTION_TUNING=$(~/.claude/skills/gstack/bin/gstack-config get question_tuning 2>/dev/null || echo "false") +echo "QUESTION_TUNING: $_QUESTION_TUNING" +mkdir -p ~/.gstack/analytics +if [ "$_TEL" != "off" ]; then +echo '{"skill":"diagram","ts":"'$(date -u +%Y-%m-%dT%H:%M:%SZ)'","repo":"'$(_repo=$(basename "$(git rev-parse --show-toplevel 2>/dev/null)" 2>/dev/null | tr -cd 'a-zA-Z0-9._-'); echo "${_repo:-unknown}")'"}' >> ~/.gstack/analytics/skill-usage.jsonl 2>/dev/null || true +fi +for _PF in $(find ~/.gstack/analytics -maxdepth 1 -name '.pending-*' 2>/dev/null); do + if [ -f "$_PF" ]; then + if [ "$_TEL" != "off" ] && [ -x "~/.claude/skills/gstack/bin/gstack-telemetry-log" ]; then + ~/.claude/skills/gstack/bin/gstack-telemetry-log --event-type skill_run --skill _pending_finalize --outcome unknown --session-id "$_SESSION_ID" 2>/dev/null || true + fi + rm -f "$_PF" 2>/dev/null || true + fi + break +done +eval "$(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)" 2>/dev/null || true +_LEARN_FILE="${GSTACK_HOME:-$HOME/.gstack}/projects/${SLUG:-unknown}/learnings.jsonl" +if [ -f "$_LEARN_FILE" ]; then + _LEARN_COUNT=$(wc -l < "$_LEARN_FILE" 2>/dev/null | tr -d ' ') + echo "LEARNINGS: $_LEARN_COUNT entries loaded" + if [ "$_LEARN_COUNT" -gt 5 ] 2>/dev/null; then + ~/.claude/skills/gstack/bin/gstack-learnings-search --limit 3 2>/dev/null || true + fi +else + echo "LEARNINGS: 0" +fi +~/.claude/skills/gstack/bin/gstack-timeline-log '{"skill":"diagram","event":"started","branch":"'"$_BRANCH"'","session":"'"$_SESSION_ID"'"}' 2>/dev/null & +_HAS_ROUTING="no" +if [ -f CLAUDE.md ] && grep -q "## Skill routing" CLAUDE.md 2>/dev/null; then + 
_HAS_ROUTING="yes" +fi +_ROUTING_DECLINED=$(~/.claude/skills/gstack/bin/gstack-config get routing_declined 2>/dev/null || echo "false") +echo "HAS_ROUTING: $_HAS_ROUTING" +echo "ROUTING_DECLINED: $_ROUTING_DECLINED" +_VENDORED="no" +if [ -d ".claude/skills/gstack" ] && [ ! -L ".claude/skills/gstack" ]; then + if [ -f ".claude/skills/gstack/VERSION" ] || [ -d ".claude/skills/gstack/.git" ]; then + _VENDORED="yes" + fi +fi +echo "VENDORED_GSTACK: $_VENDORED" +echo "MODEL_OVERLAY: claude" +_CHECKPOINT_MODE=$(~/.claude/skills/gstack/bin/gstack-config get checkpoint_mode 2>/dev/null || echo "explicit") +_CHECKPOINT_PUSH=$(~/.claude/skills/gstack/bin/gstack-config get checkpoint_push 2>/dev/null || echo "false") +echo "CHECKPOINT_MODE: $_CHECKPOINT_MODE" +echo "CHECKPOINT_PUSH: $_CHECKPOINT_PUSH" +# Plan-mode hint for skills like /spec that branch behavior on plan-mode state. +# Claude Code exposes plan mode via system reminders; we detect best-effort +# from CLAUDE_PLAN_FILE (set by the harness when plan mode is active) and +# fall back to "inactive". Codex hosts and Claude execution mode both end up +# inactive, which is the safe default (defaults to file+execute pipeline). +if [ -n "${CLAUDE_PLAN_FILE:-}${GSTACK_PLAN_MODE_FORCE:-}" ]; then + export GSTACK_PLAN_MODE="active" +elif [ "${GSTACK_PLAN_MODE:-}" = "active" ]; then + export GSTACK_PLAN_MODE="active" +else + export GSTACK_PLAN_MODE="inactive" +fi +echo "GSTACK_PLAN_MODE: $GSTACK_PLAN_MODE" +[ -n "$OPENCLAW_SESSION" ] && echo "SPAWNED_SESSION: true" || true +``` + +## Plan Mode Safe Operations + +In plan mode, allowed because they inform the plan: `$B`, `$D`, `codex exec`/`codex review`, writes to `~/.gstack/`, writes to the plan file, and `open` for generated artifacts. + +## Skill Invocation During Plan Mode + +If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. 
**Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If AskUserQuestion is unavailable or a call fails, follow the AskUserQuestion Format failure fallback: `headless` → BLOCKED; `interactive` → the prose fallback (also satisfies end-of-turn). At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. + +If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?" + +If `SKILL_PREFIX` is `"true"`, suggest/invoke `/gstack-*` names. Disk paths stay `~/.claude/skills/gstack/[skill-name]/SKILL.md`. + +If output shows `UPGRADE_AVAILABLE `: read `~/.claude/skills/gstack/gstack-upgrade/SKILL.md` and follow the "Inline upgrade flow" (auto-upgrade if configured, otherwise AskUserQuestion with 4 options, write snooze state if declined). + +If output shows `JUST_UPGRADED `: print "Running gstack v{to} (just updated!)". If `SPAWNED_SESSION` is true, skip feature discovery. + +Feature discovery, max one prompt per session: +- Missing `~/.claude/skills/gstack/.feature-prompted-continuous-checkpoint`: AskUserQuestion for Continuous checkpoint auto-commits. If accepted, run `~/.claude/skills/gstack/bin/gstack-config set checkpoint_mode continuous`. Always touch marker. +- Missing `~/.claude/skills/gstack/.feature-prompted-model-overlay`: inform "Model overlays are active. MODEL_OVERLAY shows the patch." Always touch marker. 
+ +After upgrade prompts, continue workflow. + +If `WRITING_STYLE_PENDING` is `yes`: ask once about writing style: + +> v1 prompts are simpler: first-use jargon glosses, outcome-framed questions, shorter prose. Keep default or restore terse? + +Options: +- A) Keep the new default (recommended — good writing helps everyone) +- B) Restore V0 prose — set `explain_level: terse` + +If A: leave `explain_level` unset (defaults to `default`). +If B: run `~/.claude/skills/gstack/bin/gstack-config set explain_level terse`. + +Always run (regardless of choice): +```bash +rm -f ~/.gstack/.writing-style-prompt-pending +touch ~/.gstack/.writing-style-prompted +``` + +Skip if `WRITING_STYLE_PENDING` is `no`. + +If `LAKE_INTRO` is `no`: say "gstack follows the **Boil the Ocean** principle — do the complete thing when AI makes marginal cost near-zero. Read more: https://garryslist.org/posts/boil-the-ocean" Offer to open: + +```bash +open https://garryslist.org/posts/boil-the-ocean +touch ~/.gstack/.completeness-intro-seen +``` + +Only run `open` if yes. Always run `touch`. + +If `TEL_PROMPTED` is `no` AND `LAKE_INTRO` is `yes`: ask telemetry once via AskUserQuestion: + +> Help gstack get better. Share usage data only: skill, duration, crashes, stable device ID. No code or file paths. Your repo name is recorded locally only and stripped before any upload. + +Options: +- A) Help gstack get better! (recommended) +- B) No thanks + +If A: run `~/.claude/skills/gstack/bin/gstack-config set telemetry community` + +If B: ask follow-up: + +> Anonymous mode sends only aggregate usage, no unique ID. + +Options: +- A) Sure, anonymous is fine +- B) No thanks, fully off + +If B→A: run `~/.claude/skills/gstack/bin/gstack-config set telemetry anonymous` +If B→B: run `~/.claude/skills/gstack/bin/gstack-config set telemetry off` + +Always run: +```bash +touch ~/.gstack/.telemetry-prompted +``` + +Skip if `TEL_PROMPTED` is `yes`. 
+ +If `PROACTIVE_PROMPTED` is `no` AND `TEL_PROMPTED` is `yes`: ask once: + +> Let gstack proactively suggest skills, like /qa for "does this work?" or /investigate for bugs? + +Options: +- A) Keep it on (recommended) +- B) Turn it off — I'll type /commands myself + +If A: run `~/.claude/skills/gstack/bin/gstack-config set proactive true` +If B: run `~/.claude/skills/gstack/bin/gstack-config set proactive false` + +Always run: +```bash +touch ~/.gstack/.proactive-prompted +``` + +Skip if `PROACTIVE_PROMPTED` is `yes`. + +If `HAS_ROUTING` is `no` AND `ROUTING_DECLINED` is `false` AND `PROACTIVE_PROMPTED` is `yes`: +Check if a CLAUDE.md file exists in the project root. If it does not exist, create it. + +Use AskUserQuestion: + +> gstack works best when your project's CLAUDE.md includes skill routing rules. + +Options: +- A) Add routing rules to CLAUDE.md (recommended) +- B) No thanks, I'll invoke skills manually + +If A: Append this section to the end of CLAUDE.md: + +```markdown + +## Skill routing + +When the user's request matches an available skill, invoke it via the Skill tool. When in doubt, invoke the skill. 
+ +Key routing rules: +- Product ideas/brainstorming → invoke /office-hours +- Strategy/scope → invoke /plan-ceo-review +- Architecture → invoke /plan-eng-review +- Design system/plan review → invoke /design-consultation or /plan-design-review +- Full review pipeline → invoke /autoplan +- Bugs/errors → invoke /investigate +- QA/testing site behavior → invoke /qa or /qa-only +- Code review/diff check → invoke /review +- Visual polish → invoke /design-review +- Ship/deploy/PR → invoke /ship or /land-and-deploy +- Save progress → invoke /context-save +- Resume context → invoke /context-restore +- Author a backlog-ready spec/issue → invoke /spec +``` + +Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"` + +If B: run `~/.claude/skills/gstack/bin/gstack-config set routing_declined true` and say they can re-enable with `gstack-config set routing_declined false`. + +This only happens once per project. Skip if `HAS_ROUTING` is `yes` or `ROUTING_DECLINED` is `true`. + +If `VENDORED_GSTACK` is `yes`, warn once via AskUserQuestion unless `~/.gstack/.vendoring-warned-$SLUG` exists: + +> This project has gstack vendored in `.claude/skills/gstack/`. Vendoring is deprecated. +> Migrate to team mode? + +Options: +- A) Yes, migrate to team mode now +- B) No, I'll handle it myself + +If A: +1. Run `git rm -r .claude/skills/gstack/` +2. Run `echo '.claude/skills/gstack/' >> .gitignore` +3. Run `~/.claude/skills/gstack/bin/gstack-team-init required` (or `optional`) +4. Run `git add .claude/ .gitignore CLAUDE.md && git commit -m "chore: migrate gstack from vendored to team mode"` +5. Tell the user: "Done. Each developer now runs: `cd ~/.claude/skills/gstack && ./setup --team`" + +If B: say "OK, you're on your own to keep the vendored copy up to date." 
+ +Always run (regardless of choice): +```bash +eval "$(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)" 2>/dev/null || true +touch ~/.gstack/.vendoring-warned-${SLUG:-unknown} +``` + +If marker exists, skip. + +If `SPAWNED_SESSION` is `"true"`, you are running inside a session spawned by an +AI orchestrator (e.g., OpenClaw). In spawned sessions: +- Do NOT use AskUserQuestion for interactive prompts. Auto-choose the recommended option. +- Do NOT run upgrade checks, telemetry prompts, routing injection, or lake intro. +- Focus on completing the task and reporting results via prose output. +- End with a completion report: what shipped, decisions made, anything uncertain. + +## AskUserQuestion Format + +### Tool resolution (read first) + +"AskUserQuestion" can resolve to two tools at runtime: the **host MCP variant** (e.g. `mcp__conductor__AskUserQuestion` — appears in your tool list when the host registers it) or the **native** Claude Code tool. + +**Rule:** if any `mcp__*__AskUserQuestion` variant is in your tool list, prefer it. Hosts may disable native AUQ via `--disallowedTools AskUserQuestion` (Conductor does, by default) and route through their MCP variant; calling native there silently fails. Same questions/options shape; same decision-brief format applies. + +If AskUserQuestion is unavailable (no variant in your tool list) OR a call to it fails, do NOT silently auto-decide or write the decision to the plan file as a substitute. Follow the **failure fallback** below. + +### When AskUserQuestion is unavailable or a call fails + +Tell three outcomes apart: + +1. **Auto-decide denial (NOT a failure).** The result contains `[plan-tune auto-decide]

loading
+ + + diff --git a/lib/diagram-render/package.json b/lib/diagram-render/package.json new file mode 100644 index 000000000..f164ba704 --- /dev/null +++ b/lib/diagram-render/package.json @@ -0,0 +1,16 @@ +{ + "name": "@gstack/diagram-render", + "private": true, + "type": "module", + "description": "Offline diagram-render bundle: mermaid + excalidraw export + mermaid-to-excalidraw, built into a single self-contained HTML page loaded by the browse daemon. Versions are exact-pinned; bump them only via scripts/build.ts (see README.md).", + "scripts": { + "build": "bun run scripts/build.ts" + }, + "dependencies": { + "@excalidraw/excalidraw": "0.18.0", + "@excalidraw/mermaid-to-excalidraw": "1.1.2", + "mermaid": "11.12.2", + "react": "18.3.1", + "react-dom": "18.3.1" + } +} diff --git a/lib/diagram-render/scripts/build.ts b/lib/diagram-render/scripts/build.ts new file mode 100644 index 000000000..2bd2caf99 --- /dev/null +++ b/lib/diagram-render/scripts/build.ts @@ -0,0 +1,99 @@ +/** + * Build dist/diagram-render.html — the single-file offline render page. + * + * One command updates everything: `bun run build` (in this directory) or + * `bun run build:diagram-render` (repo root). To bump a dependency: edit the + * exact pin in package.json, `bun install`, rebuild, commit src + dist + + * BUILD_INFO.json together. The drift test (test/diagram-render-drift.test.ts) + * fails CI when dist and BUILD_INFO disagree. + * + * Page assembly notes (learned in the spike, do not "simplify" away): + * - The script MUST be `type="module"` — mermaid's bundle contains + * import.meta, which throws in a classic script. + * - ` terminates early ("Unexpected end of input"). + * - A with an absolute URL is required: the page lives at + * about:blank (page.setContent), where relative URL construction throws. 
+ */ +import { createHash } from "node:crypto"; +import path from "node:path"; + +const ROOT = path.resolve(import.meta.dir, ".."); +const ENTRY = path.join(ROOT, "src", "entry.ts"); +const DIST_DIR = path.join(ROOT, "dist"); +const DIST_HTML = path.join(DIST_DIR, "diagram-render.html"); +const BUILD_INFO = path.join(DIST_DIR, "BUILD_INFO.json"); + +const pkg = await Bun.file(path.join(ROOT, "package.json")).json(); +const deps: Record = pkg.dependencies; + +const result = await Bun.build({ + entrypoints: [ENTRY], + target: "browser", + minify: true, + define: { + __BUNDLE_INFO_DEPS__: JSON.stringify(deps), + "process.env.NODE_ENV": '"production"', + }, +}); +if (!result.success) { + for (const log of result.logs) console.error(log); + process.exit(1); +} +const js = await result.outputs[0].text(); + +// Escape inline-script terminators (see header note). +const inlineJs = js.replaceAll(" + + + + +gstack diagram-render + + + + +
loading
+ + + +`; + +const html = head + inlineJs + tail; +await Bun.write(DIST_HTML, html); + +const sha256 = createHash("sha256").update(html).digest("hex"); +// Source fingerprint: lets the drift test catch "edited src, forgot to +// rebuild dist" WITHOUT needing node_modules for a full rebuild (the deep +// rebuild check only runs where deps are installed). +const srcSha256 = createHash("sha256") + .update(await Bun.file(ENTRY).text()) + .update(await Bun.file(import.meta.path).text()) + .digest("hex"); +const info = { + name: "gstack-diagram-render", + sha256, + srcSha256, + bytes: Buffer.byteLength(html), + bunVersion: Bun.version, + deps, +}; +await Bun.write(BUILD_INFO, JSON.stringify(info, null, 2) + "\n"); + +console.log(`built ${path.relative(process.cwd(), DIST_HTML)} (${(info.bytes / 1024 / 1024).toFixed(2)} MB)`); +console.log(`sha256 ${sha256}`); diff --git a/lib/diagram-render/src/entry.ts b/lib/diagram-render/src/entry.ts new file mode 100644 index 000000000..503c23ed3 --- /dev/null +++ b/lib/diagram-render/src/entry.ts @@ -0,0 +1,215 @@ +/** + * diagram-render bundle entry. + * + * Built into a single self-contained HTML page (dist/diagram-render.html) that + * make-pdf and /diagram load into a browse daemon tab via `load-html`. Every + * capability is exposed as a window.__* function and driven through `browse js`; + * binary results return as data URLs that `js --out` decodes to bytes on disk. + * + * page lifecycle (one tab per make-pdf run, reused across fences): + * load-html dist copy ─▶ poll #status == "ready" ─▶ N × __renderMermaid/ + * __excalidrawToSvg/__rasterize ─▶ close tab (orchestrator finally) + * render error ─▶ caller reloads the page before the next fence + * (reset contract: no poisoned mermaid global survives, eng-review D6.2) + * + * Render contract (eng-review D3): + * - securityLevel "strict": no click callbacks, no HTML label injection in + * this tab. The make-pdf sanitizer is the second defense layer downstream. 
+ * - Callers pass a unique id per fence (mermaid-fence-); mermaid bakes it + * into every internal SVG id, so two diagrams inlined into one document + * can't collide on gradients/markers. + * - Font stacks mirror make-pdf/src/print-css.ts so text measured here lays + * out identically in the printed document. + * - htmlLabels false: foreignObject labels taint canvases (blocks toDataURL + * rasterization) and break when the SVG is inlined into another document. + */ +import mermaid from "mermaid"; +import { parseMermaidToExcalidraw } from "@excalidraw/mermaid-to-excalidraw"; +import { convertToExcalidrawElements, exportToSvg } from "@excalidraw/excalidraw"; + +declare global { + interface Window { + __bundleInfo: { name: string; deps: Record }; + __renderMermaid: (id: string, text: string) => Promise; + __mermaidToExcalidraw: (text: string) => Promise; + __excalidrawToSvg: (sceneJson: string) => Promise; + __rasterize: (svgText: string, targetWidthPx: number) => Promise; + __downscaleRaster: (dataUri: string, targetWidthPx: number, mime: string) => Promise; + __mountForScreenshot: (svgText: string, targetWidthPx: number) => string; + __probeImage: (src: string) => Promise; + EXCALIDRAW_ASSET_PATH?: string; + __errors: string[]; + } +} + +// Excalidraw's font registry builds URLs from this against the document base. +// The host must be absolute and never resolves — the page is offline by design; +// exportToSvg embeds the bundled Excalifont glyphs without fetching. +window.EXCALIDRAW_ASSET_PATH = "https://gstack-render.localhost/excalidraw-assets/"; + +// Font stacks must match make-pdf/src/print-css.ts (sans + CJK + emoji) so +// mermaid's text measurement in this tab matches the print document's layout. 
+const PRINT_SANS = + 'Helvetica, "Liberation Sans", Arial, "Hiragino Kaku Gothic ProN", ' + + '"Noto Sans CJK JP", "Microsoft YaHei", "Apple Color Emoji", ' + + '"Segoe UI Emoji", "Noto Color Emoji", sans-serif'; + +mermaid.initialize({ + startOnLoad: false, + securityLevel: "strict", + theme: "neutral", + fontFamily: PRINT_SANS, + htmlLabels: false, + flowchart: { htmlLabels: false }, +}); + +window.__renderMermaid = async (id: string, text: string): Promise => { + if (!/^[A-Za-z][\w-]*$/.test(id)) throw new Error(`invalid mermaid render id: ${id}`); + const { svg } = await mermaid.render(id, text); + return svg; +}; + +window.__mermaidToExcalidraw = async (text: string): Promise => { + const { elements, files } = await parseMermaidToExcalidraw(text); + const converted = convertToExcalidrawElements(elements); + const scene = { + type: "excalidraw", + version: 2, + source: "gstack-diagram-render", + elements: converted, + appState: { viewBackgroundColor: "#ffffff" }, + files: files ?? {}, + }; + return JSON.stringify(scene); +}; + +window.__excalidrawToSvg = async (sceneJson: string): Promise => { + const scene = JSON.parse(sceneJson); + if (!Array.isArray(scene.elements)) throw new Error("excalidraw scene has no elements array"); + const svg = await exportToSvg({ + elements: scene.elements, + appState: { ...(scene.appState ?? {}), exportBackground: true }, + files: scene.files ?? null, + exportPadding: 16, + }); + return new XMLSerializer().serializeToString(svg); +}; + +/** + * SVG → PNG data URL at an explicit pixel width. Callers own the DPI math: + * targetWidthPx = placed physical width (in) × 300dpi (eng-review D6.5) — + * the bundle never guesses a viewport. + */ +/** Shared ceiling for rasterization targets (both window functions). 
*/ +const MAX_TARGET_PX = 10_000; +function assertTargetWidth(px: number): void { + if (!(px > 0 && px <= MAX_TARGET_PX)) { + throw new Error(`targetWidthPx out of range: ${px}`); + } +} + +window.__rasterize = async (svgText: string, targetWidthPx: number): Promise => { + assertTargetWidth(targetWidthPx); + const blob = new Blob([svgText], { type: "image/svg+xml;charset=utf-8" }); + const url = URL.createObjectURL(blob); + try { + const img = new Image(); + await new Promise((resolve, reject) => { + img.onload = () => resolve(); + img.onerror = () => reject(new Error("SVG image decode failed (malformed SVG or foreignObject content)")); + img.src = url; + }); + const naturalW = img.naturalWidth || 800; + const naturalH = img.naturalHeight || 600; + const scale = targetWidthPx / naturalW; + const canvas = document.createElement("canvas"); + canvas.width = Math.round(naturalW * scale); + canvas.height = Math.round(naturalH * scale); + const ctx = canvas.getContext("2d"); + if (!ctx) throw new Error("2d canvas context unavailable"); + ctx.fillStyle = "#ffffff"; + ctx.fillRect(0, 0, canvas.width, canvas.height); + ctx.drawImage(img, 0, 0, canvas.width, canvas.height); + // Throws on tainted canvas — callers fall back to __mountForScreenshot + + // `browse screenshot --selector "#raster-stage"`. + return canvas.toDataURL("image/png"); + } finally { + URL.revokeObjectURL(url); + } +}; + +/** + * Fallback rasterization stage: mount the SVG in the DOM so the caller can + * take an element screenshot (no canvas, no taint rules). Returns a marker + * string; the artifact is the screenshot, not the return value. 
+ */ +window.__mountForScreenshot = (svgText: string, targetWidthPx: number): string => { + document.getElementById("raster-stage")?.remove(); + const stage = document.createElement("div"); + stage.id = "raster-stage"; + stage.style.cssText = `display:inline-block;background:#fff;width:${targetWidthPx}px`; + stage.innerHTML = svgText; + const svg = stage.querySelector("svg"); + if (svg) { + svg.setAttribute("width", String(targetWidthPx)); + svg.removeAttribute("height"); + svg.style.height = "auto"; + } + document.body.appendChild(stage); + return `mounted:${targetWidthPx}`; +}; + +/** + * Downscale a raster image (data URI) to targetWidthPx, preserving aspect. + * Re-encodes in the requested mime — JPEG photos stay JPEG (q0.9); PNG-encoding + * a photo would bloat it past the original. Data URIs are same-origin, so the + * canvas never taints. + */ +window.__downscaleRaster = async ( + dataUri: string, + targetWidthPx: number, + mime: string, +): Promise => { + assertTargetWidth(targetWidthPx); + const img = new Image(); + await new Promise((resolve, reject) => { + img.onload = () => resolve(); + img.onerror = () => reject(new Error("image decode failed")); + img.src = dataUri; + }); + const scale = targetWidthPx / (img.naturalWidth || targetWidthPx); + const canvas = document.createElement("canvas"); + canvas.width = Math.round(img.naturalWidth * scale); + canvas.height = Math.round(img.naturalHeight * scale); + const ctx = canvas.getContext("2d"); + if (!ctx) throw new Error("2d canvas context unavailable"); + ctx.drawImage(img, 0, 0, canvas.width, canvas.height); + const outMime = mime === "image/jpeg" ? "image/jpeg" : "image/png"; + return outMime === "image/jpeg" ? canvas.toDataURL(outMime, 0.9) : canvas.toDataURL(outMime); +}; + +/** Probe intrinsic dimensions of an image (data URI or URL). Returns JSON. 
*/ +window.__probeImage = async (src: string): Promise => { + const img = new Image(); + await new Promise((resolve, reject) => { + img.onload = () => resolve(); + img.onerror = () => reject(new Error("image decode failed")); + img.src = src; + }); + return JSON.stringify({ width: img.naturalWidth, height: img.naturalHeight }); +}; + +// __BUNDLE_INFO__ is replaced at build time with the pinned dependency map. +window.__bundleInfo = { name: "gstack-diagram-render", deps: __BUNDLE_INFO_DEPS__ }; + +// Readiness signal: pollable text beats a bare invisible div (Playwright's +// visibility-based `wait` never fires on an empty element). +const status = document.getElementById("status"); +if (status) status.textContent = "ready"; +const done = document.createElement("div"); +done.id = "done"; +done.textContent = "ready"; +done.style.cssText = "position:absolute;left:-9999px"; +document.body.appendChild(done); + +declare const __BUNDLE_INFO_DEPS__: Record; diff --git a/make-pdf/SKILL.md b/make-pdf/SKILL.md index 9205cda58..252336d7e 100644 --- a/make-pdf/SKILL.md +++ b/make-pdf/SKILL.md @@ -598,6 +598,79 @@ as you edit the markdown. Skip the PDF round trip until you're ready. $P generate --no-confidential memo.md memo.pdf ``` +### Diagrams — mermaid and excalidraw fences render as pictures + +A column-0 ` ```mermaid ` or ` ```excalidraw ` fence in the markdown renders +as a crisp vector diagram, fully offline (vendored bundle, no CDN). Indented +fences (inside lists) stay plain code blocks by design. A broken fence +produces a visible red diagnostic block with the parse error — never silent +raw code. 
+ +Fence info-string options: + +``` +```mermaid title="Auth flow" ← caption + aria-label +```mermaid render=false ← keep it as a code block (today's behavior) +```mermaid page=landscape ← force this diagram onto a landscape page +```mermaid page=portrait ← veto auto-landscape for this diagram +``` + +A ` ```excalidraw ` fence contains a full .excalidraw scene file (what +excalidraw.com saves). Authoring NEW diagrams from English is `/diagram`'s +job — it emits an editable triplet (source, .excalidraw, SVG/PNG) and pairs +with this skill: embed the `.mmd` source in your markdown, not the PNG. + +### Images — scaled right, never truncated + +Local images inline automatically (relative paths resolve against the +markdown file). Every image caps at the content box — zero truncation, ever. +Oversized photos downscale to print resolution (300dpi) so payloads stay +small with no visible quality loss. + +Remote (http/https) images are **blocked with a visible placeholder** by +default — offline posture; pass `--allow-network` to fetch them. An image +that resolves outside the markdown's directory (even via symlink) still +inlines, but warns loudly; `--strict` makes it fatal. Files over 64MB or +non-regular files (fifos, devices) degrade to a placeholder instead of +hanging the run. + +Per-image directives, written immediately after the image: + +``` +![chart](data.png){width=full} ← stretch to content-box width +![chart](data.png){width=50%} ← percentage or 3in/8cm/200px +![wide](arch.png){page=landscape} ← give it its own landscape page +![wide](shot.png){page=portrait} ← veto auto-landscape +``` + +Wide, small-text diagram images auto-promote to their own landscape page +(conservative: aspect ≥ 1.8, width over ~2.5x the content box, AND a +diagram-ish alt word — diagram/architecture/flowchart/chart/graph). The +promoted page is vertically centered. When the heuristic guesses wrong, +`{page=portrait}` vetoes it; false negatives just need `{page=landscape}`. 
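The auto-landscape rule above can be read as three conditions that must all hold, with the per-image `{page=...}` directive taking priority. The sketch below is illustrative only: `ImageCandidate`, `shouldPromoteToLandscape`, the constant names, and the use of pixel units are assumptions made for this example, not make-pdf's actual code. The "~2.5x" threshold is approximate in the text and is treated here as a strict comparison.

```typescript
// Hypothetical sketch of the auto-landscape heuristic described above.
// Every name here is an illustrative assumption, not make-pdf's real API.

interface ImageCandidate {
  widthPx: number; // intrinsic image width (assumed unit: px)
  heightPx: number; // intrinsic image height (assumed unit: px)
  alt: string; // markdown alt text
  pageDirective?: "landscape" | "portrait"; // explicit {page=...} override
}

const MIN_ASPECT = 1.8; // "aspect >= 1.8"
const MIN_WIDTH_RATIO = 2.5; // "width over ~2.5x the content box"
const DIAGRAM_WORDS = /\b(diagram|architecture|flowchart|chart|graph)\b/i;

function shouldPromoteToLandscape(
  img: ImageCandidate,
  contentBoxWidthPx: number,
): boolean {
  // Explicit directives win over the heuristic in both directions.
  if (img.pageDirective === "portrait") return false;
  if (img.pageDirective === "landscape") return true;

  // Guard against degenerate dimensions before dividing.
  if (img.heightPx <= 0 || contentBoxWidthPx <= 0) return false;

  const aspect = img.widthPx / img.heightPx;
  const wideEnough = img.widthPx > MIN_WIDTH_RATIO * contentBoxWidthPx;
  const looksLikeDiagram = DIAGRAM_WORDS.test(img.alt);

  // Conservative: all three conditions must hold.
  return aspect >= MIN_ASPECT && wideEnough && looksLikeDiagram;
}

// Usage: a wide architecture image against an assumed 624px content box.
const promote = shouldPromoteToLandscape(
  { widthPx: 2400, heightPx: 900, alt: "System architecture" },
  624,
);
console.log(promote);
```

Requiring all three conditions is what keeps the rule conservative: a wide photo with no diagram-like alt word stays on a portrait page unless `{page=landscape}` is written explicitly.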
+ +### Other formats — single-file HTML and Word + +```bash +$P generate readme.md out.html --to html # ONE self-contained file: inline + # SVG diagrams, data-URI images, + # zero network refs, screen-readable +$P generate readme.md out.docx --to docx # Word: content fidelity (headings, + # tables, code, diagrams as PNG) — + # layout is Word's, not ours +``` + +`--to` is the output format. `--format` is something else entirely (a +`--page-size` alias) — don't confuse them. + +### CI mode — fail loud on missing assets + +```bash +$P generate docs.md --strict # missing, remote, out-of-tree, oversized, + # and non-regular-file images exit non-zero + # instead of warn + placeholder +``` + ## Common flags ``` @@ -617,6 +690,10 @@ Branding: --no-confidential Suppress the CONFIDENTIAL right-footer Output: + --to pdf|html|docx Output format (default: pdf). html = single + self-contained file; docx = content fidelity. + --strict Missing, remote, out-of-tree, oversized, or + non-regular-file images fail the run (CI mode). --page-numbers "N of M" footer (default on) --tagged Accessible PDF (default on) --outline PDF bookmarks from headings (default on) @@ -624,8 +701,9 @@ Output: --verbose Per-stage timings Network: - --allow-network Fetch external images. Off by default - (blocks tracking pixels). + --allow-network Fetch external images. Off by default: remote + images render as a visible blocked placeholder + (no tracking pixels fetch at print time). Metadata: --title "..." Document title (defaults to first H1) @@ -653,8 +731,9 @@ If the user has a `.md` file open and says "make it look nice", propose `--no-syntax` once that flag exists. For now, remove fenced code blocks and regenerate. - Paged.js timeout → probably no headings in the markdown. Drop `--toc`. -- External image missing → add `--allow-network` (understand you're giving - the markdown file permission to fetch from its image URLs). 
+- "[remote image blocked]" placeholder in the output → add `--allow-network` + (understand you're giving the markdown file permission to fetch from its + image URLs). - Generated PDF too tall/wide → `--page-size a4` or `--margins 0.75in`. ## Output contract diff --git a/make-pdf/SKILL.md.tmpl b/make-pdf/SKILL.md.tmpl index bfd90441b..9133a711d 100644 --- a/make-pdf/SKILL.md.tmpl +++ b/make-pdf/SKILL.md.tmpl @@ -94,6 +94,79 @@ as you edit the markdown. Skip the PDF round trip until you're ready. $P generate --no-confidential memo.md memo.pdf ``` +### Diagrams — mermaid and excalidraw fences render as pictures + +A column-0 ` ```mermaid ` or ` ```excalidraw ` fence in the markdown renders +as a crisp vector diagram, fully offline (vendored bundle, no CDN). Indented +fences (inside lists) stay plain code blocks by design. A broken fence +produces a visible red diagnostic block with the parse error — never silent +raw code. + +Fence info-string options: + +``` +```mermaid title="Auth flow" ← caption + aria-label +```mermaid render=false ← keep it as a code block (today's behavior) +```mermaid page=landscape ← force this diagram onto a landscape page +```mermaid page=portrait ← veto auto-landscape for this diagram +``` + +A ` ```excalidraw ` fence contains a full .excalidraw scene file (what +excalidraw.com saves). Authoring NEW diagrams from English is `/diagram`'s +job — it emits an editable triplet (source, .excalidraw, SVG/PNG) and pairs +with this skill: embed the `.mmd` source in your markdown, not the PNG. + +### Images — scaled right, never truncated + +Local images inline automatically (relative paths resolve against the +markdown file). Every image caps at the content box — zero truncation, ever. +Oversized photos downscale to print resolution (300dpi) so payloads stay +small with no visible quality loss. + +Remote (http/https) images are **blocked with a visible placeholder** by +default — offline posture; pass `--allow-network` to fetch them. 
An image +that resolves outside the markdown's directory (even via symlink) still +inlines, but warns loudly; `--strict` makes it fatal. Files over 64MB or +non-regular files (fifos, devices) degrade to a placeholder instead of +hanging the run. + +Per-image directives, written immediately after the image: + +``` +![chart](data.png){width=full} ← stretch to content-box width +![chart](data.png){width=50%} ← percentage or 3in/8cm/200px +![wide](arch.png){page=landscape} ← give it its own landscape page +![wide](shot.png){page=portrait} ← veto auto-landscape +``` + +Wide, small-text diagram images auto-promote to their own landscape page +(conservative: aspect ≥ 1.8, width over ~2.5x the content box, AND a +diagram-ish alt word — diagram/architecture/flowchart/chart/graph). The +promoted page is vertically centered. When the heuristic guesses wrong, +`{page=portrait}` vetoes it; false negatives just need `{page=landscape}`. + +### Other formats — single-file HTML and Word + +```bash +$P generate readme.md out.html --to html # ONE self-contained file: inline + # SVG diagrams, data-URI images, + # zero network refs, screen-readable +$P generate readme.md out.docx --to docx # Word: content fidelity (headings, + # tables, code, diagrams as PNG) — + # layout is Word's, not ours +``` + +`--to` is the output format. `--format` is something else entirely (a +`--page-size` alias) — don't confuse them. + +### CI mode — fail loud on missing assets + +```bash +$P generate docs.md --strict # missing, remote, out-of-tree, oversized, + # and non-regular-file images exit non-zero + # instead of warn + placeholder +``` + ## Common flags ``` @@ -113,6 +186,10 @@ Branding: --no-confidential Suppress the CONFIDENTIAL right-footer Output: + --to pdf|html|docx Output format (default: pdf). html = single + self-contained file; docx = content fidelity. + --strict Missing, remote, out-of-tree, oversized, or + non-regular-file images fail the run (CI mode). 
--page-numbers "N of M" footer (default on) --tagged Accessible PDF (default on) --outline PDF bookmarks from headings (default on) @@ -120,8 +197,9 @@ Output: --verbose Per-stage timings Network: - --allow-network Fetch external images. Off by default - (blocks tracking pixels). + --allow-network Fetch external images. Off by default: remote + images render as a visible blocked placeholder + (no tracking pixels fetch at print time). Metadata: --title "..." Document title (defaults to first H1) @@ -149,8 +227,9 @@ If the user has a `.md` file open and says "make it look nice", propose `--no-syntax` once that flag exists. For now, remove fenced code blocks and regenerate. - Paged.js timeout → probably no headings in the markdown. Drop `--toc`. -- External image missing → add `--allow-network` (understand you're giving - the markdown file permission to fetch from its image URLs). +- "[remote image blocked]" placeholder in the output → add `--allow-network` + (understand you're giving the markdown file permission to fetch from its + image URLs). - Generated PDF too tall/wide → `--page-size a4` or `--margins 0.75in`. ## Output contract diff --git a/make-pdf/src/browseClient.ts b/make-pdf/src/browseClient.ts index 63cec7755..da25677e5 100644 --- a/make-pdf/src/browseClient.ts +++ b/make-pdf/src/browseClient.ts @@ -176,6 +176,9 @@ function runBrowse(args: string[]): string { encoding: "utf8", maxBuffer: 16 * 1024 * 1024, // 16MB; tab content can be large stdio: ["ignore", "pipe", "pipe"], + // A wedged daemon (or a hostile mermaid source spinning the renderer) + // must fail the run, not hang it forever. + timeout: 120_000, }); } catch (err: any) { const exitCode = typeof err.status === "number" ? err.status : 1; @@ -268,6 +271,17 @@ export function loadHtml(opts: LoadHtmlOptions): void { } } +/** + * Load an HTML file (already under browse's safe dirs, e.g. /tmp) into a tab + * by path. 
Cheaper than loadHtml for large pages — no JSON payload round-trip; + * browse reads the file directly (diagram-render bundle is ~9MB). + */ +export function loadHtmlFile(opts: { file: string; tabId: number; waitUntil?: "load" | "domcontentloaded" | "networkidle" }): void { + const args = ["load-html", opts.file, "--tab-id", String(opts.tabId)]; + if (opts.waitUntil) args.push("--wait-until", opts.waitUntil); + runBrowse(args); +} + /** * Evaluate a JS expression in a tab. Returns the serialized result as string. */ @@ -279,6 +293,19 @@ export function js(opts: JsOptions): string { ]).trim(); } +/** + * Evaluate a JS file in a tab (`browse eval `): the argv-safe transport + * for expressions too large for a command-line element. The file must live + * under browse's safe dirs (/tmp or cwd). + */ +export function evalFile(opts: { file: string; tabId: number }): string { + return runBrowse([ + "eval", + opts.file, + "--tab-id", String(opts.tabId), + ]).trim(); +} + /** * Poll a boolean JS expression until it evaluates to true, or timeout. * Returns true if it succeeded, false if timed out. @@ -300,9 +327,11 @@ export function waitForExpression(opts: { } const wait = Math.min(poll, Math.max(0, deadline - Date.now())); if (wait <= 0) break; - // Synchronous sleep is fine — this only runs once per PDF render - const end = Date.now() + wait; - while (Date.now() < end) { /* busy wait */ } + // Real sleep, not a busy-wait: this poll now runs on every diagram-render + // bundle load (and after every fence render error), exactly while Chromium + // is parsing a 9MB page on the same machine — spinning a core competes + // with the work being awaited. 
+ Bun.sleepSync(wait); } return false; } diff --git a/make-pdf/src/cli.ts b/make-pdf/src/cli.ts index 62a3b948e..988e20444 100644 --- a/make-pdf/src/cli.ts +++ b/make-pdf/src/cli.ts @@ -64,9 +64,14 @@ function printUsage(): void { lines.push(` ${info.description}`); } lines.push(""); + lines.push("Output format:"); + lines.push(" --to pdf|html|docx What to produce (default: pdf)."); + lines.push(" html = single self-contained file, no network refs."); + lines.push(" docx = content fidelity, diagrams as PNG."); + lines.push(""); lines.push("Page layout:"); lines.push(" --margins All four margins (default: 1in). in, pt, cm, mm."); - lines.push(" --page-size letter|a4|legal (aliases: --format)"); + lines.push(" --page-size letter|a4|legal (aliases: --format — page SIZE, not output format)"); lines.push(""); lines.push("Document structure:"); lines.push(" --cover Add a cover page."); @@ -86,6 +91,12 @@ function printUsage(): void { lines.push(" --quiet Suppress progress on stderr."); lines.push(" --verbose Per-stage timings on stderr."); lines.push(""); + lines.push("Diagrams & images:"); + lines.push(" ```mermaid / ```excalidraw fences render as vector diagrams."); + lines.push(" Add render=false to a fence info string to keep it as a code block."); + lines.push(" Local images are inlined; oversized rasters downscale to print resolution."); + lines.push(" --strict Missing/remote images fail the run (CI mode)."); + lines.push(""); lines.push("Network:"); lines.push(" --allow-network Load external images (off by default)."); lines.push(""); @@ -112,9 +123,16 @@ function generateOptionsFromFlags(parsed: ParsedArgs): GenerateOptions { if (f[`no-${key}`] === true) return false; return def; }; + const to = typeof f.to === "string" ? f.to.toLowerCase() : "pdf"; + if (to !== "pdf" && to !== "html" && to !== "docx") { + console.error(`$P generate: invalid --to '${f.to}'. 
Expected pdf, html, or docx.`); + console.error("(--format is a --page-size alias, not the output format.)"); + process.exit(ExitCode.BadArgs); + } return { input: p[0], output: p[1], + to: to as GenerateOptions["to"], margins: f.margins as string | undefined, marginTop: f["margin-top"] as string | undefined, marginRight: f["margin-right"] as string | undefined, @@ -136,6 +154,7 @@ function generateOptionsFromFlags(parsed: ParsedArgs): GenerateOptions { quiet: f.quiet === true, verbose: f.verbose === true, allowNetwork: f["allow-network"] === true, + strict: f.strict === true, title: typeof f.title === "string" ? f.title : undefined, author: typeof f.author === "string" ? f.author : undefined, date: typeof f.date === "string" ? f.date : undefined, diff --git a/make-pdf/src/diagram-prepass.ts b/make-pdf/src/diagram-prepass.ts new file mode 100644 index 000000000..bf12249bb --- /dev/null +++ b/make-pdf/src/diagram-prepass.ts @@ -0,0 +1,846 @@ +/** + * Diagram + image pre-pass. Runs between "read markdown" and render() in the + * orchestrator, and owns everything that needs the diagram-render bundle. + * + * markdown ─▶ extractDiagramFences() ──▶ render() (marked+sanitize+smarty) + * │ fences → placeholder tokens │ + * │ ▼ + * └─▶ renderFenceSlots() ───────────▶ substituteSlots(html, slots) + * one browse render tab/run │ + * error ⇒ diagnostic block + page reload ▼ + * inlineLocalImages(html) + * data URIs, probe dims from bytes, + * downscale >2x content box @300dpi, + * remote warn / missing placeholder / + * --strict hard-fail + * + * Placeholders survive marked, the sanitizer, and smartypants because they are + * plain hyphenated lowercase tokens with no quotes or HTML. Slot HTML is run + * through the same sanitizer as user content before substitution (the bundle + * renders with securityLevel strict — the sanitizer is the second layer). 
+ * + * Reset contract (eng-review D6.2): each fence renders with a fresh + * mermaid.render id; after ANY render error the bundle page is reloaded before + * the next fence so a poisoned global can't corrupt diagram N+1. + */ + +import * as fs from "node:fs"; +import * as os from "node:os"; +import * as path from "node:path"; +import * as crypto from "node:crypto"; +import { fileURLToPath } from "node:url"; + +import * as browseClient from "./browseClient"; +import { escapeHtml, sanitizeUntrustedHtml } from "./render"; +import { imageDims } from "./image-size"; + +// ─── Types ──────────────────────────────────────────────────────────── + +export interface DiagramFence { + /** "mermaid" | "excalidraw" */ + lang: string; + /** Fence body (the diagram source). */ + source: string; + /** Optional title="..." from the fence info string (a11y label, D6.4). */ + title?: string; + /** Optional page=landscape|portrait fence directive (image-policy override). */ + page?: "landscape" | "portrait"; + /** render=false → leave as a plain code block (escape hatch, D6.3). */ + render: boolean; + /** Placeholder token substituted into the markdown. */ + token: string; + /** 1-based ordinal among rendered fences (unique ids, aria fallback). */ + ordinal: number; +} + +export interface FenceExtraction { + markdown: string; + fences: DiagramFence[]; +} + +export interface PrepassWarnings { + warn: (msg: string) => void; +} + +export interface PrepassImageOptions { + /** Directory of the source markdown — relative image paths resolve here. */ + inputDir: string; + /** Hard-fail on missing/remote images instead of warn (D6.1). */ + strict: boolean; + /** Remote images are left untouched when network is explicitly allowed. */ + allowNetwork: boolean; + /** Physical content-box width in inches (page width minus margins). */ + contentWidthIn: number; + warn: (msg: string) => void; + /** Lazily provides a ready bundle tab (only opened when needed). 
*/ + getTab: () => RenderTab | null; +} + +/** Print-resolution policy (eng-review D4): downscale rasters wider than + * 2 × contentWidth × 300dpi down to contentWidth × 300dpi. */ +const PRINT_DPI = 300; +const DOWNSCALE_FACTOR = 2; +/** Per-image read ceiling — bounds memory before any policy runs. */ +const MAX_IMAGE_BYTES = 64 * 1024 * 1024; + +export class StrictModeError extends Error { + constructor(msg: string) { + super(msg); + this.name = "StrictModeError"; + } +} + +// ─── Fence extraction (pure) ────────────────────────────────────────── + +const DIAGRAM_LANGS = new Set(["mermaid", "excalidraw"]); + +/** + * Extract column-0 ```mermaid / ```excalidraw fences, replacing each with a + * unique placeholder token paragraph. Backtick and tilde fences, any length + * >= 3; closers must be at least as long as the opener (CommonMark). Fences + * with `render=false` are left untouched. + * + * Two deliberate conservatisms (red-team finding — the original version + * reconstructed fences at column 0 and restructured lists): + * - Non-diagram fences replay as their ORIGINAL raw lines, byte-for-byte + * (only a render=false flag is removed, in place, preserving indent). + * - INDENTED diagram fences (inside lists/quotes) are NOT extracted — a + * column-0 placeholder would split the list. They replay verbatim as code. 
+ */ +export function extractDiagramFences(markdown: string): FenceExtraction { + const lines = markdown.split("\n"); + const out: string[] = []; + const fences: DiagramFence[] = []; + const runId = crypto.randomBytes(4).toString("hex"); + + let i = 0; + let openFence: { + char: string; len: number; indent: number; info: string; + rawOpener: string; body: string[]; + } | null = null; + let ordinal = 0; + + while (i < lines.length) { + const line = lines[i]; + + if (openFence) { + const close = matchFenceLine(line); + if (close && close.char === openFence.char && close.len >= openFence.len && close.info === "") { + const info = parseInfoString(openFence.info); + if (DIAGRAM_LANGS.has(info.lang) && info.render && openFence.indent === 0) { + ordinal++; + const token = `gstack-diagram-slot-${runId}-${ordinal}`; + fences.push({ + lang: info.lang, + source: openFence.body.join("\n"), + title: info.title, + page: info.page, + render: true, + token, + ordinal, + }); + out.push("", token, ""); + } else { + // Not extracted (other language, render=false, or indented): replay + // the ORIGINAL lines verbatim; only strip a render=false flag. + out.push(stripRenderFalse(openFence.rawOpener)); + out.push(...openFence.body); + out.push(line); + } + openFence = null; + i++; + continue; + } + openFence.body.push(line); + i++; + continue; + } + + const open = matchFenceLine(line); + if (open && open.info !== "") { + openFence = { ...open, rawOpener: line, body: [] }; + i++; + continue; + } + if (open) { + // Anonymous fence (plain code block) — copy through to its closer so a + // ```mermaid example INSIDE a plain fence is never extracted. 
+ out.push(line); + i++; + while (i < lines.length) { + const l = lines[i]; + const close = matchFenceLine(l); + out.push(l); + i++; + if (close && close.char === open.char && close.len >= open.len && close.info === "") break; + } + continue; + } + + out.push(line); + i++; + } + + // Unclosed fence at EOF: replay verbatim (CommonMark treats it as code to EOF). + if (openFence) { + out.push(openFence.rawOpener); + out.push(...openFence.body); + } + + return { markdown: out.join("\n"), fences }; +} + +function matchFenceLine(line: string): { char: string; len: number; indent: number; info: string } | null { + const m = line.match(/^( {0,3})(`{3,}|~{3,})\s*(.*)$/); + if (!m) return null; + return { indent: m[1].length, char: m[2][0], len: m[2].length, info: m[3].trim() }; +} + +/** Remove a render=false flag from a raw opener line, preserving everything else. */ +function stripRenderFalse(rawOpener: string): string { + return rawOpener.replace(/\s*\brender\s*=\s*false\b/i, ""); +} + +/** Parse a fence info string: `mermaid`, `mermaid render=false`, + * `mermaid title="Auth flow"`, `mermaid page=landscape`. */ +export function parseInfoString(info: string): { + lang: string; render: boolean; title?: string; page?: "landscape" | "portrait"; +} { + const lang = (info.match(/^\S+/)?.[0] ?? "").toLowerCase(); + const render = !/\brender\s*=\s*false\b/i.test(info); + const title = info.match(/\btitle\s*=\s*"([^"]*)"/i)?.[1] + ?? info.match(/\btitle\s*=\s*'([^']*)'/i)?.[1]; + const pageRaw = info.match(/\bpage\s*=\s*(landscape|portrait)\b/i)?.[1]?.toLowerCase(); + const page = pageRaw === "landscape" || pageRaw === "portrait" ? pageRaw : undefined; + return { lang, render, title, page }; +} + +// ─── Slot substitution (pure) ───────────────────────────────────────── + +/** + * Replace placeholder tokens in rendered HTML with their final slot HTML. + * marked wraps the bare token line in

; replace the wrapper too so + * the figure isn't nested inside a paragraph. + */ +export function substituteSlots(html: string, slots: Map): string { + let s = html; + for (const [token, slotHtml] of slots) { + // Function replacement is load-bearing: slot HTML carries user/LLM-authored + // diagram label text, and string-form replace() expands $&, $', $` patterns + // inside it — a label containing "$'" would duplicate the document tail. + const wrapped = new RegExp(`

<p>\\s*${token}\\s*</p>

`, "g"); + const replaced = s.replace(wrapped, () => slotHtml); + s = replaced !== s ? replaced : s.split(token).join(slotHtml); + } + return s; +} + +/** + * Visible diagnostic block for a failed fence render — never silent raw code + * (eng-review: explicit error blocks). Sanitizer-safe: all dynamic content is + * HTML-escaped. + */ +export function buildDiagnosticBlock(fence: DiagramFence, errorMessage: string): string { + const excerpt = fence.source.split("\n").slice(0, 8).join("\n"); + const truncated = fence.source.split("\n").length > 8 ? "\n…" : ""; + return [ + ``, + ].join("\n"); +} + +/** + * Wrap a rendered SVG in an accessible figure (D6.4). The raw fence source is + * preserved base64-encoded in a data attribute — an HTML comment would need + * `--` escaping, which corrupts every mermaid arrow (`-->`) and breaks + * round-trip recovery. + */ +export function buildDiagramFigure(fence: DiagramFence, svg: string): string { + const label = diagramLabel(fence); + const cleanSvg = sanitizeUntrustedHtml(svg); + const captioned = fence.title + ? `\n
<figcaption>${escapeHtml(fence.title)}</figcaption>
` + : ""; + const pageAttr = fence.page ? ` data-gstack-page="${fence.page}"` : ""; + const sourceB64 = Buffer.from(fence.source, "utf8").toString("base64"); + return [ + ``, + ].join("\n"); +} + +/** Recover the original fence source from a rendered figure (round-trip). */ +export function decodeFigureSource(figureHtml: string): string | null { + const m = figureHtml.match(/\bdata-gstack-source="([A-Za-z0-9+/=]*)"/); + if (!m) return null; + try { + return Buffer.from(m[1], "base64").toString("utf8"); + } catch { + return null; + } +} + +function diagramLabel(fence: DiagramFence): string { + return fence.title ?? `diagram ${fence.ordinal}`; +} + +// ─── Render tab (bundle page lifecycle) ─────────────────────────────── + +const PAYLOAD_TMP_DIR = process.platform === "win32" ? os.tmpdir() : "/tmp"; +const READY_TIMEOUT_MS = 20_000; +// Expressions bigger than this ship via `browse eval ` instead of argv. +// 8KB is safe on every platform (Windows CreateProcess caps the WHOLE command +// line at 32,767 chars; Linux MAX_ARG_STRLEN is ~128KiB) and the tmp-file +// round-trip costs microseconds — one spawn regardless of payload size. +const MAX_ARGV_EXPR_BYTES = 8_000; + +export class RenderTab { + private constructor( + public readonly tabId: number, + private readonly stagedBundlePath: string, + ) {} + + /** + * Open a tab and load the diagram-render bundle. The bundle HTML is staged + * under /tmp (content-addressed, reused across runs — load-html only reads + * inside its safe dirs) and loaded by PATH, not --from-file: a 9MB JSON + * round-trip per run would be pure waste. 
+ */ + static open(): RenderTab { + const bundleSrc = resolveBundlePath(); + const html = fs.readFileSync(bundleSrc); + const sha = crypto.createHash("sha256").update(html).digest("hex").slice(0, 16); + const staged = path.join(PAYLOAD_TMP_DIR, `gstack-diagram-render-${sha}.html`); + // Never trust an existing file at the predictable shared-/tmp name: verify + // its content hash and re-stage on mismatch (a pre-planted file would + // otherwise be loaded into the render tab as the bundle). + let needsWrite = true; + if (fs.existsSync(staged)) { + try { + const existing = crypto.createHash("sha256").update(fs.readFileSync(staged)).digest("hex").slice(0, 16); + needsWrite = existing !== sha; + } catch { + needsWrite = true; + } + } + if (needsWrite) { + // Concurrent-safe: write to a unique temp name, then atomic rename. + const tmp = `${staged}.${process.pid}.${crypto.randomBytes(4).toString("hex")}`; + fs.writeFileSync(tmp, html); + try { + fs.renameSync(tmp, staged); + } catch (renameErr) { + try { fs.unlinkSync(tmp); } catch { /* best-effort tmp cleanup */ } + // Only swallow the rename failure when the surviving file HASHES to + // the expected bundle (a concurrent writer won an OS-level race). + // Sticky-bit /tmp makes rename-over-foreign-file fail EPERM — if the + // survivor were trusted on existence alone, a pre-planted file would + // ride through the exact check added to stop it. + let survivorOk = false; + try { + const survivor = crypto.createHash("sha256").update(fs.readFileSync(staged)).digest("hex").slice(0, 16); + survivorOk = survivor === sha; + } catch { /* unreadable survivor = not ok */ } + if (!survivorOk) throw renameErr; + } + } + const tabId = browseClient.newtab(); + const tab = new RenderTab(tabId, staged); + tab.loadBundle(); + return tab; + } + + /** (Re)load the bundle page — also the reset path after a render error. 
*/ + loadBundle(): void { + browseClient.loadHtmlFile({ file: this.stagedBundlePath, tabId: this.tabId }); + const ready = browseClient.waitForExpression({ + expression: "document.getElementById('status') !== null && document.getElementById('status').textContent === 'ready'", + tabId: this.tabId, + timeoutMs: READY_TIMEOUT_MS, + }); + if (!ready) { + throw new Error( + "diagram-render bundle did not become ready in the browse tab " + + `(${READY_TIMEOUT_MS}ms). Check \`browse js "window.__errors"\` on tab ${this.tabId}.`, + ); + } + } + + /** + * Call one of the bundle's async window functions with JSON-safe string + * args. Errors come back as a recognizable ERR: prefix so a render failure + * is data, not a thrown browse exit. + */ + call(fn: string, ...args: Array): string { + const argList = args.map((a) => JSON.stringify(a)).join(","); + const expression = + `window.${fn}(${argList})` + + `.then(r => "OK:" + r)` + + `.catch(e => "ERR:" + String((e && e.message) || e))`; + const result = this.js(expression); + if (result.startsWith("OK:")) return result.slice(3); + if (result.startsWith("ERR:")) throw new RenderCallError(result.slice(4)); + throw new RenderCallError(`unexpected bundle result: ${result.slice(0, 200)}`); + } + + private js(expression: string): string { + // Large payloads (scene JSON, SVG text, data URIs) blow past argv limits — + // browseClient.js shells out with the expression as an argv element. The + // limit is BYTES, not chars (CJK content is 3x its char count in UTF-8), + // and Windows caps the whole command line at 32,767 chars — so anything + // big ships via `browse eval ` instead: one spawn, any size. + if (Buffer.byteLength(expression, "utf8") <= MAX_ARGV_EXPR_BYTES) { + return browseClient.js({ expression, tabId: this.tabId }); + } + return this.jsViaFile(expression); + } + + /** argv-safe path for big expressions: stage to a tmp file under browse's + * safe dirs and run `browse eval ` (one spawn regardless of size). 
*/ + private jsViaFile(expression: string): string { + const file = path.join( + PAYLOAD_TMP_DIR, + `gstack-diagram-expr-${process.pid}-${crypto.randomBytes(4).toString("hex")}.js`, + ); + fs.writeFileSync(file, expression, "utf8"); + try { + return browseClient.evalFile({ file, tabId: this.tabId }); + } finally { + try { fs.unlinkSync(file); } catch { /* best-effort tmp cleanup */ } + } + } + + close(): void { + try { + browseClient.closetab(this.tabId); + } catch { + // best-effort: orchestrator finally path + } + } +} + +export class RenderCallError extends Error { + constructor(msg: string) { + super(msg); + this.name = "RenderCallError"; + } +} + +/** Resolve dist/diagram-render.html: env override → repo-relative (dev) → global install. */ +export function resolveBundlePath(env: NodeJS.ProcessEnv = process.env): string { + const candidates = [ + env.GSTACK_DIAGRAM_BUNDLE, + // dev: make-pdf/src/* → repo root lib/. (In a compiled binary this is the + // virtual /$bunfs/root and simply never exists — harmless.) + path.resolve(import.meta.dir, "../../lib/diagram-render/dist/diagram-render.html"), + // compiled binary at /make-pdf/dist/pdf → /lib/… — same shape + // in the repo and in the ~/.claude/skills/gstack global install. argv[0] + // is the literal string "bun" in compiled binaries; execPath is real. + path.resolve(path.dirname(process.execPath), "../../lib/diagram-render/dist/diagram-render.html"), + path.join(os.homedir(), ".claude/skills/gstack/lib/diagram-render/dist/diagram-render.html"), + ].filter((p): p is string => !!p); + for (const p of candidates) { + if (fs.existsSync(p)) return p; + } + throw new Error( + "diagram-render bundle not found. Tried:\n" + + candidates.map((c) => ` - ${c}`).join("\n") + + "\nRun `bun run build:diagram-render` (repo) or re-run ./setup (install).", + ); +} + +// ─── Fence rendering ────────────────────────────────────────────────── + +/** + * Render every extracted fence to its slot HTML. 
One bundle tab serves all + * fences; a failed fence yields a diagnostic block and a bundle reload + * (reset contract) before the next fence renders. + */ +export function renderFenceSlots( + fences: DiagramFence[], + tab: RenderTab, + warn: (msg: string) => void, +): Map { + const slots = new Map(); + for (const fence of fences) { + try { + let svg: string; + if (fence.lang === "mermaid") { + svg = tab.call("__renderMermaid", `mermaid-fence-${fence.ordinal}`, fence.source); + } else { + JSON.parse(fence.source); // fail fast with a JSON diagnostic, not a bundle stack + svg = tab.call("__excalidrawToSvg", fence.source); + } + slots.set(fence.token, buildDiagramFigure(fence, svg)); + } catch (err: any) { + const msg = err?.message ?? String(err); + warn(`diagram ${fence.ordinal} (${fence.lang}) failed to render: ${firstLine(msg)}`); + slots.set(fence.token, buildDiagnosticBlock(fence, msg)); + // Reset contract: a poisoned page must not corrupt the next fence. + try { + tab.loadBundle(); + } catch (reloadErr: any) { + warn(`bundle reload after render error failed: ${firstLine(reloadErr?.message ?? String(reloadErr))}`); + } + } + } + return slots; +} + +// ─── DOCX rasterization (eng-review D6.5, P8) ───────────────────────── + +/** + * Replace inline diagram SVGs (and svg data-URI images) with PNG tags + * for the DOCX export — Word's SVG support is unreliable, so the content- + * fidelity contract embeds rasters at 300dpi of the placed width (the + * content box). Diagnostic blocks keep their text form. + */ +export function rasterizeDiagramFigures( + html: string, + tab: RenderTab, + contentWidthIn: number, + warn: (msg: string) => void, +): string { + const targetPx = Math.round(contentWidthIn * PRINT_DPI); + + // 1. Rendered diagram figures → with the figure's aria-label as alt. + let out = html.replace( + /
]*>[\s\S]*?<\/figure>/gi, + (figure) => { + const svgMatch = figure.match(//i); + if (!svgMatch) return figure; + const label = figure.match(/\baria-label\s*=\s*"([^"]*)"/i)?.[1] ?? "diagram"; + try { + const png = tab.call("__rasterize", svgMatch[0], targetPx); + return `

<p><img src="${png}" alt="${label}"></p>

`; + } catch (err: any) { + const reason = firstLine(err?.message ?? String(err)); + warn(`docx: diagram rasterization failed (${reason}); embedding source text instead`); + // The converter drops
/ entirely, so returning the figure + // would make the diagram vanish without a trace — the exact invisible + // failure the diagnostic contract forbids. Surface the source. + const source = decodeFigureSource(figure) ?? "(source unavailable)"; + return [ + `

<p>Diagram could not be rasterized for DOCX (${escapeHtml(reason)}) — source:</p>`, + `<pre>${escapeHtml(source)}</pre>
`, + ].join("\n"); + } + }, + ); + + // 2. SVG data-URI images (inlined .svg files) → PNG. + out = out.replace(/]*>/gi, (tag) => { + const m = tag.match(SRC_RE); + const src = m?.[2] ?? m?.[3] ?? ""; + if (!src.startsWith("data:image/svg+xml")) return tag; + try { + const b64 = src.slice(src.indexOf(",") + 1); + const svgText = Buffer.from(b64, "base64").toString("utf8"); + const png = tab.call("__rasterize", svgText, targetPx); + // Function replacement: data URIs can contain $-patterns. + return tag.replace(SRC_RE, () => `src="${png}"`); + } catch (err: any) { + warn(`docx: svg image rasterization failed (${firstLine(err?.message ?? String(err))})`); + return tag; + } + }); + + return out; +} + +/** + * Diagnostic figures → plain

<p>/<pre>

 for the DOCX converter, which drops
+ * 
elements it can't map. An invisible error is the one thing the + * diagnostic contract forbids. Pure — no render tab needed. + */ +export function convertDiagnosticsForDocx(html: string): string { + return html.replace( + /
]*>([\s\S]*?)<\/figure>/gi, + (_full, body: string) => { + const title = body.match(/]*>([\s\S]*?)<\/figcaption>/i)?.[1] ?? "Diagram failed to render"; + const detail = body.match(/]*>([\s\S]*?)<\/pre>/i)?.[1] ?? ""; + return `

<p>${title}</p>\n<pre>${detail}</pre>
`; + }, + ); +} + +// ─── Image inlining (eng-review D1 + D4 + D6.1) ─────────────────────── + +const IMG_TAG_RE = /]*>/gi; +const SRC_RE = /\bsrc\s*=\s*("([^"]*)"|'([^']*)')/i; + +/** + * Inline every local as a data URI, probe intrinsic dimensions from the + * bytes, and annotate the tag with data-gstack-px-width/-height for the width + * policy. Oversized rasters are downscaled to print resolution via the bundle + * tab. Missing files become visible placeholders (or throw under --strict); + * remote URLs warn (offline posture) unless --allow-network. + */ +export function inlineLocalImages(html: string, opts: PrepassImageOptions): string { + const maxPx = Math.round(opts.contentWidthIn * PRINT_DPI * DOWNSCALE_FACTOR); + const targetPx = Math.round(opts.contentWidthIn * PRINT_DPI); + // An image referenced N times is read/probed/downscaled once; the same data + // URI string is reused (also dedupes memory until the final join). + const memo = new Map(); + + return html.replace(IMG_TAG_RE, (tag) => { + const srcMatch = tag.match(SRC_RE); + if (!srcMatch) return tag; + const src = srcMatch[2] ?? srcMatch[3] ?? ""; + + if (src.startsWith("data:")) return annotateFromDataUri(tag, src); + + // Windows drive-letter paths (C:/x.png, C:\x.png) look like single-letter + // URL schemes — they are local paths, not URLs. + const isDrivePath = /^[a-zA-Z]:[\\/]/.test(src); + + if (!isDrivePath && /^[a-z][a-z0-9+.-]*:/i.test(src)) { + // Absolute URL with a scheme (http, https, file, …) + if (opts.allowNetwork && /^https?:/i.test(src)) return tag; + if (/^https?:/i.test(src)) { + const msg = `remote image blocked (offline posture): ${src}`; + if (opts.strict) throw new StrictModeError(msg + " — re-run without --strict or pass --allow-network"); + opts.warn(msg); + // Leaving the tag would make Chromium fetch it at print time anyway — + // the warn would be a lie. Replace with a visible placeholder. 
+ return buildBlockedRemotePlaceholder(src); + } + // file:// and friends fall through to the local path branch + if (!src.startsWith("file:")) return tag; + } + + // decodeURIComponent throws on malformed escapes (foo%zz.png) — a broken + // URL must degrade to the missing-image path, not crash the run. + let decodedSrc = src; + try { + decodedSrc = decodeURIComponent(src); + } catch { /* keep raw src */ } + + const filePath = src.startsWith("file:") + ? fileURLToPath(src) + : isDrivePath + ? path.resolve(src) + : path.resolve(opts.inputDir, decodedSrc); + + const cached = memo.get(filePath); + if (cached !== undefined) return rewriteImgTag(tag, cached); + + if (!fs.existsSync(filePath)) { + const msg = `image not found: ${src} (resolved to ${filePath})`; + if (opts.strict) throw new StrictModeError(msg); + opts.warn(msg); + return buildMissingImagePlaceholder(src); + } + + // Out-of-tree reads are legal (local CLI semantics — like pandoc) but + // never silent: an agent PDF-ing untrusted markdown should not quietly + // embed ~/.ssh/config into a shareable document. --strict makes it fatal. + // Compare REAL paths — a symlink inside the input dir pointing outside + // would otherwise pass a string-prefix check (Codex adversarial finding). + // Runs after the existence check: realpath of a missing file can't + // resolve, and on macOS /var vs /private/var would false-positive. + const inputRoot = safeRealpath(path.resolve(opts.inputDir)) + path.sep; + const realFilePath = safeRealpath(filePath); + if (!realFilePath.startsWith(inputRoot)) { + const msg = `image resolves OUTSIDE the input directory: ${src} → ${realFilePath}`; + if (opts.strict) throw new StrictModeError(msg + " — move it under the markdown's directory or drop --strict"); + opts.warn(msg); + } + + // Bound the read BEFORE reading: a markdown image pointing at a special + // file (fifo, device) would hang readFileSync, and a multi-GB file would + // exhaust memory before any policy ran. 
+ let stat: fs.Stats; + try { + stat = fs.statSync(filePath); + } catch { + opts.warn(`image unreadable: ${src}`); + return buildMissingImagePlaceholder(src); + } + if (!stat.isFile()) { + const msg = `image is not a regular file: ${src}`; + if (opts.strict) throw new StrictModeError(msg); + opts.warn(msg); + return buildMissingImagePlaceholder(src); + } + if (stat.size > MAX_IMAGE_BYTES) { + const msg = `image exceeds ${Math.round(MAX_IMAGE_BYTES / 1024 / 1024)}MB cap: ${src} (${Math.round(stat.size / 1024 / 1024)}MB)`; + if (opts.strict) throw new StrictModeError(msg); + opts.warn(msg); + return buildMissingImagePlaceholder(src); + } + + let buf = fs.readFileSync(filePath); + let dims = imageDims(buf); + let mime = dims?.mime ?? mimeFromExtension(filePath); + + // Print-resolution normalization (D4): rasters only — SVG scales free. + if (dims && mime !== "image/svg+xml" && dims.width > maxPx) { + const tab = opts.getTab(); + if (tab) { + try { + const dataUri = `data:${mime};base64,${buf.toString("base64")}`; + const scaled = tab.call("__downscaleRaster", dataUri, targetPx, mime); + const scaledB64 = scaled.replace(/^data:[^,]*,/, ""); + opts.warn( + `downscaled ${path.basename(filePath)} ${dims.width}px → ${targetPx}px ` + + `(print is ${PRINT_DPI}dpi; original exceeds ${maxPx}px content-box ceiling)`, + ); + buf = Buffer.from(scaledB64, "base64"); + mime = scaled.slice(5, scaled.indexOf(";")); + dims = { ...dims, height: Math.round((dims.height * targetPx) / dims.width), width: targetPx }; + } catch (err: any) { + opts.warn(`downscale failed for ${src}, inlining at full size: ${firstLine(err?.message ?? String(err))}`); + } + } + } + + const dataUri = `data:${mime};base64,${buf.toString("base64")}`; + const attrs = dims + ? 
` data-gstack-px-width="${Math.round(dims.width)}" data-gstack-px-height="${Math.round(dims.height)}"` + : ""; + memo.set(filePath, { dataUri, attrs }); + return rewriteImgTag(tag, memo.get(filePath)!); + }); +} + +/** Apply a memoized inline result to an img tag. */ +function rewriteImgTag(tag: string, entry: { dataUri: string; attrs: string }): string { + // Function replacement: data URIs are user-content-derived; string-form + // replace() would expand $-patterns inside them. + let out = tag.replace(SRC_RE, () => `src="${entry.dataUri}"`); + if (entry.attrs) out = out.replace(/^ `` + + `[missing image: ${escapeHtml(src)}]` + ); +} + +function buildBlockedRemotePlaceholder(src: string): string { + return ( + `` + + `[remote image blocked (use --allow-network): ${escapeHtml(src)}]` + ); +} + +/** realpath that degrades to the input path when resolution fails. */ +function safeRealpath(p: string): string { + try { + return fs.realpathSync(p); + } catch { + return p; + } +} + +function mimeFromExtension(p: string): string { + switch (path.extname(p).toLowerCase()) { + case ".png": return "image/png"; + case ".jpg": + case ".jpeg": return "image/jpeg"; + case ".gif": return "image/gif"; + case ".webp": return "image/webp"; + case ".svg": return "image/svg+xml"; + default: return "application/octet-stream"; + } +} + +// ─── Content-box math ───────────────────────────────────────────────── + +const PAGE_WIDTHS_IN: Record = { + letter: 8.5, + a4: 8.27, + legal: 8.5, + tabloid: 11, +}; + +/** Parse a CSS dimension ("1in" | "72pt" | "25mm" | "2.54cm") to inches. */ +export function dimToInches(dim: string | undefined, fallbackIn: number): number { + if (!dim) return fallbackIn; + const m = dim.trim().match(/^([0-9.]+)\s*(in|pt|cm|mm|px)?$/i); + if (!m) return fallbackIn; + const v = parseFloat(m[1]); + switch ((m[2] ?? 
"in").toLowerCase()) { + case "in": return v; + case "pt": return v / 72; + case "cm": return v / 2.54; + case "mm": return v / 25.4; + case "px": return v / 96; + default: return fallbackIn; + } +} + +export function contentWidthInches(opts: { + pageSize?: string; + margins?: string; + marginLeft?: string; + marginRight?: string; +}): number { + const pageW = PAGE_WIDTHS_IN[opts.pageSize ?? "letter"] ?? 8.5; + const left = dimToInches(opts.marginLeft ?? opts.margins, 1); + const right = dimToInches(opts.marginRight ?? opts.margins, 1); + return Math.max(1, pageW - left - right); +} + +const PAGE_HEIGHTS_IN: Record = { + letter: 11, + a4: 11.69, + legal: 14, + tabloid: 17, +}; + +/** + * Content box of the rotated (landscape) named page: portrait page HEIGHT + * becomes the landscape width; portrait WIDTH becomes the landscape height. + * Used by image-policy to vertically center promoted blocks. + */ +export function landscapeContentBox(opts: { + pageSize?: string; + margins?: string; + marginLeft?: string; + marginRight?: string; + marginTop?: string; + marginBottom?: string; +}): { contentWIn: number; contentHIn: number } { + const size = opts.pageSize ?? "letter"; + const pageH = PAGE_HEIGHTS_IN[size] ?? 11; + const pageW = PAGE_WIDTHS_IN[size] ?? 8.5; + const left = dimToInches(opts.marginLeft ?? opts.margins, 1); + const right = dimToInches(opts.marginRight ?? opts.margins, 1); + const top = dimToInches(opts.marginTop ?? opts.margins, 1); + const bottom = dimToInches(opts.marginBottom ?? opts.margins, 1); + return { + contentWIn: Math.max(1, pageH - left - right), + contentHIn: Math.max(1, pageW - top - bottom), + }; +} + +// ─── tiny helpers ───────────────────────────────────────────────────── +// escapeHtml is imported from ./render — single definition, no drift. 
+ +function firstLine(s: string): string { + return s.split("\n")[0].slice(0, 200); +} diff --git a/make-pdf/src/image-policy.ts b/make-pdf/src/image-policy.ts new file mode 100644 index 000000000..2700ae011 --- /dev/null +++ b/make-pdf/src/image-policy.ts @@ -0,0 +1,236 @@ +/** + * Image width policy + conservative auto-landscape (eng-review P4, D4 spec). + * + * Two pure passes over rendered HTML: + * + * 1. applyImageDirectives — runs inside render() right after marked, before + * the sanitizer. Translates the markdown-adjacent directive suffix + * `![alt](x.png){width=50%}` / `{page=landscape}` into data-gstack-* + * attributes (the sanitizer keeps data- attributes; the brace text is + * consumed so it never reaches smartypants or the page). + * + * 2. applyImagePolicy — runs in the orchestrator after image inlining (which + * annotates data-gstack-px-width/-height from real bytes). Applies the + * width rule and decides landscape promotion: + * + * WIDTH RULE: render at intrinsic CSS-px width, capped at the content box, + * never upscaled — that is exactly `figure img { max-width: 100% }` doing + * its job, so the default needs no inline style. Directives opt into more: + * width=full stretches to the content box; / set explicit width. + * + * LANDSCAPE (conservative, false negatives are cheap): + * promote only when ALL hold — + * aspect ratio ≥ 1.8 + * AND intrinsic CSS-px width > SHRINK_LIMIT × content box + * (content shrunk below ~40% of natural size = unreadable) + * AND diagram provenance (rendered fence) or an alt-text token from + * ALT_HINT_TOKENS (plain images) + * `{page=landscape}` forces, `{page=portrait}` vetoes — both skip the + * heuristics entirely. + * + * Promotion wraps the block in
<div class="page-wide">
whose CSS named + * page (`@page wide { size: landscape }`, print-css.ts) rotates + * just that page. Chromium only honors CSS page sizes when the print call + * passes preferCSSPageSize — the orchestrator sets it when hasLandscape. + */ + +import { svgTagDims } from "./image-size"; + +export interface ImagePolicyOptions { + /** Physical content-box width in inches (page width minus margins). */ + contentWidthIn: number; + /** + * Landscape named-page content box (inches). Used to vertically center a + * promoted block via a computed inline margin-top — CSS flex/min-height + * centering fragments into phantom landscape pages in Chromium, so the + * margin is computed here from the block's known aspect ratio instead. + */ + landscape: { contentWIn: number; contentHIn: number }; + warn: (msg: string) => void; +} + +export interface ImagePolicyResult { + html: string; + /** True when at least one block was promoted to the landscape named page. */ + hasLandscape: boolean; +} + +/** Aspect ratio floor for auto-promotion. */ +const MIN_ASPECT = 1.8; +/** + * Auto-promote only when the intrinsic CSS-px width exceeds this multiple of + * the content box (in CSS px @96dpi). 2.5 ≈ the plan's ~1600px threshold on a + * 6.5in letter box; calibrated against fixtures (design doc Open Question 4). + */ +const SHRINK_LIMIT = 2.5; +/** Alt-text tokens that mark a plain image as diagram-like (case-insensitive). */ +const ALT_HINT_TOKENS = ["diagram", "architecture", "flowchart", "chart", "graph"]; + +// ─── Pass 1: directive suffixes ─────────────────────────────────────── + +const IMG_WITH_SUFFIX_RE = /(]*>)\s*\{([^{}<>\n]{1,120})\}/gi; + +/** + * Consume `{...}` directive suffixes adjacent to tags. Unrecognized + * brace groups are left untouched (someone's literal prose). 
+ */ +export function applyImageDirectives(html: string): string { + return html.replace(IMG_WITH_SUFFIX_RE, (full, imgTag: string, body: string) => { + const parsed = parseDirectives(body); + if (!parsed) return full; + let tag = imgTag; + if (parsed.width) tag = addAttr(tag, "data-gstack-width", parsed.width); + if (parsed.page) tag = addAttr(tag, "data-gstack-page", parsed.page); + return tag; + }); +} + +export function parseDirectives(body: string): { width?: string; page?: string } | null { + let width: string | undefined; + let page: string | undefined; + let recognized = false; + for (const part of body.trim().split(/\s+/)) { + const m = part.match(/^(width|page)=(.+)$/i); + if (!m) return null; // any unknown token ⇒ not a directive group + const key = m[1].toLowerCase(); + const value = m[2].toLowerCase(); + if (key === "width" && /^(full|\d{1,3}%|[0-9.]+(in|cm|mm|pt|px))$/.test(value)) { + width = value; + recognized = true; + } else if (key === "page" && /^(landscape|portrait)$/.test(value)) { + page = value; + recognized = true; + } else { + return null; // recognized key, malformed value ⇒ leave visible, not silent + } + } + return recognized ? { width, page } : null; +} + +function addAttr(imgTag: string, name: string, value: string): string { + return imgTag.replace(/^]*>/gi, (tag) => { + const width = attrValue(tag, "data-gstack-width"); + if (!width) return tag; + const css = width === "full" ? "100%" : width; + return mergeStyle(tag, `width: ${css}; height: auto;`); + }); + + // 2b. landscape promotion — standalone images (markdown images render as + //
<p><img></p>
; promote by swapping the paragraph for the wide wrapper). + out = out.replace(/
<p>
\s*(]*>)\s*<\/p>/gi, (full, tag: string) => { + const decision = decideImagePromotion(tag, widthThresholdPx); + if (!decision.promote) return full; + hasLandscape = true; + opts.warn(`promoting image to a landscape page (${decision.reason})`); + const w = num(attrValue(tag, "data-gstack-px-width")); + const h = num(attrValue(tag, "data-gstack-px-height")); + return wrapPageWide(tag, w && h ? h / w : null, opts.landscape); + }); + + // 2c. landscape promotion — rendered diagram figures (provenance is + // automatic; dims come from the SVG's width/height or viewBox). + out = out.replace( + /

]*>[\s\S]*?<\/figure>/gi, + (figure) => { + if (figure.includes("diagram-error")) return figure; + const decision = decideDiagramPromotion(figure, widthThresholdPx); + if (!decision.promote) return figure; + hasLandscape = true; + opts.warn(`promoting diagram to a landscape page (${decision.reason})`); + const dims = svgCssDims(figure); + return wrapPageWide(figure, dims ? dims.height / dims.width : null, opts.landscape); + }, + ); + + return { html: out, hasLandscape }; +} + +/** + * Wrap a promoted block in the wide-page div, vertically centered via a + * computed margin-top: placed height = landscape content width × aspect, + * centered in the landscape content height. Unknown aspect → no margin + * (top placement beats a wrong guess). + */ +function wrapPageWide( + inner: string, + aspectHoverW: number | null, + landscape: { contentWIn: number; contentHIn: number }, +): string { + if (!aspectHoverW) return `
<div class="page-wide">${inner}</div>
`; + const placedHIn = landscape.contentWIn * aspectHoverW; + const marginIn = Math.max(0, (landscape.contentHIn - placedHIn) / 2); + if (marginIn < 0.1) return `
<div class="page-wide">${inner}</div>
`; + return `
${inner}
`; +} + +interface PromotionDecision { + promote: boolean; + reason: string; +} + +function decideImagePromotion(tag: string, widthThresholdPx: number): PromotionDecision { + const page = attrValue(tag, "data-gstack-page"); + if (page === "portrait") return { promote: false, reason: "page=portrait veto" }; + if (page === "landscape") return { promote: true, reason: "page=landscape directive" }; + + const w = num(attrValue(tag, "data-gstack-px-width")); + const h = num(attrValue(tag, "data-gstack-px-height")); + if (!w || !h) return { promote: false, reason: "no intrinsic dimensions" }; + if (w / h < MIN_ASPECT) return { promote: false, reason: "aspect below floor" }; + if (w <= widthThresholdPx) return { promote: false, reason: "fits portrait readably" }; + + const alt = (attrValue(tag, "alt") ?? "").toLowerCase(); + const hinted = ALT_HINT_TOKENS.some((t) => new RegExp(`\\b${t}\\b`).test(alt)); + if (!hinted) return { promote: false, reason: "no diagram hint in alt text" }; + + return { promote: true, reason: `wide diagram-like image (${Math.round(w)}px, alt hint)` }; +} + +function decideDiagramPromotion(figure: string, widthThresholdPx: number): PromotionDecision { + const page = attrValue(figure, "data-gstack-page"); + if (page === "portrait") return { promote: false, reason: "page=portrait veto" }; + if (page === "landscape") return { promote: true, reason: "page=landscape fence directive" }; + + const dims = svgCssDims(figure); + if (!dims) return { promote: false, reason: "no measurable SVG dimensions" }; + if (dims.width / dims.height < MIN_ASPECT) return { promote: false, reason: "aspect below floor" }; + if (dims.width <= widthThresholdPx) return { promote: false, reason: "fits portrait readably" }; + return { promote: true, reason: `wide diagram (${Math.round(dims.width)}px)` }; +} + +/** SVG dimension probing is shared with the byte prober — see image-size.ts. 
*/ +const svgCssDims = svgTagDims; + +function attrValue(tag: string, name: string): string | null { + const m = tag.match(new RegExp(`\\b${name}\\s*=\\s*"([^"]*)"`, "i")) + ?? tag.match(new RegExp(`\\b${name}\\s*=\\s*'([^']*)'`, "i")); + return m ? m[1] : null; +} + +function num(s: string | null): number | null { + if (s === null) return null; + const n = parseFloat(s); + return Number.isFinite(n) && n > 0 ? n : null; +} + +function mergeStyle(tag: string, css: string): string { + const existing = attrValue(tag, "style"); + if (existing !== null) { + // Function replacement (no $-pattern expansion from user-controlled style + // values) and the existing declarations are preserved verbatim — attrValue + // already returned the unquoted inner value. + return tag.replace(/\bstyle\s*=\s*(".*?"|'.*?')/i, () => `style="${existing}; ${css}"`); + } + return tag.replace(/^ `= 0xd0 && marker <= 0xd9)) { i += 2; continue; } + const len = b.readUInt16BE(i + 2); + if (len < 2) return null; + // SOF0-SOF15 except DHT(C4)/JPGA(C8)/DAC(CC) carry dimensions + if (marker >= 0xc0 && marker <= 0xcf && marker !== 0xc4 && marker !== 0xc8 && marker !== 0xcc) { + if (i + 9 >= b.length) return null; + return { height: b.readUInt16BE(i + 5), width: b.readUInt16BE(i + 7), mime: "image/jpeg" }; + } + i += 2 + len; + } + return null; +} + +function gifDims(b: Buffer): ImageDims | null { + const sig = b.toString("ascii", 0, 6); + if (sig !== "GIF87a" && sig !== "GIF89a") return null; + return { width: b.readUInt16LE(6), height: b.readUInt16LE(8), mime: "image/gif" }; +} + +function webpDims(b: Buffer): ImageDims | null { + if (b.toString("ascii", 0, 4) !== "RIFF" || b.toString("ascii", 8, 12) !== "WEBP") return null; + const fmt = b.toString("ascii", 12, 16); + if (fmt === "VP8X" && b.length >= 30) { + // 24-bit little-endian width-1 / height-1 at offsets 24 / 27 + const w = 1 + (b[24] | (b[25] << 8) | (b[26] << 16)); + const h = 1 + (b[27] | (b[28] << 8) | (b[29] << 16)); + return { width: 
w, height: h, mime: "image/webp" }; + } + if (fmt === "VP8 " && b.length >= 30) { + // Lossy: dimensions at offset 26, 14 bits each, little-endian + return { + width: b.readUInt16LE(26) & 0x3fff, + height: b.readUInt16LE(28) & 0x3fff, + mime: "image/webp", + }; + } + if (fmt === "VP8L" && b.length >= 25) { + if (b[20] !== 0x2f) return null; + const bits = b.readUInt32LE(21); + return { + width: (bits & 0x3fff) + 1, + height: ((bits >> 14) & 0x3fff) + 1, + mime: "image/webp", + }; + } + return null; +} + +/** + * SVG: parse width/height attributes (px or unitless) off the root element, + * falling back to viewBox. CSS-unit widths (em, %, pt) are ignored — the + * width policy treats them as "no intrinsic size". + */ +function svgDims(b: Buffer): ImageDims | null { + const head = b.toString("utf8", 0, Math.min(b.length, 4096)); + const dims = svgTagDims(head); + return dims ? { ...dims, mime: "image/svg+xml" } : null; +} + +/** + * CSS-px dimensions of the first element in a markup string: explicit + * width/height attributes (px or unitless) first, else viewBox. Shared by the + * byte prober above and image-policy's diagram-figure measurements — one + * regex, no drift. + */ +export function svgTagDims(markup: string): { width: number; height: number } | null { + const tag = markup.match(/]*>/i)?.[0]; + if (!tag) return null; + const attr = (name: string): number | null => { + const m = tag.match(new RegExp(`\\b${name}\\s*=\\s*["']\\s*([0-9.]+)(px)?\\s*["']`, "i")); + return m ? 
parseFloat(m[1]) : null; + }; + const w = attr("width"); + const h = attr("height"); + if (w && h) return { width: w, height: h }; + const vb = tag.match(/\bviewBox\s*=\s*["']\s*[-0-9.]+[\s,]+[-0-9.]+[\s,]+([0-9.]+)[\s,]+([0-9.]+)\s*["']/i); + if (vb) return { width: parseFloat(vb[1]), height: parseFloat(vb[2]) }; + return null; +} diff --git a/make-pdf/src/orchestrator.ts b/make-pdf/src/orchestrator.ts index cf8dffae6..12a21570d 100644 --- a/make-pdf/src/orchestrator.ts +++ b/make-pdf/src/orchestrator.ts @@ -21,9 +21,22 @@ import * as crypto from "node:crypto"; import { spawn } from "node:child_process"; import { render } from "./render"; +import { screenCss } from "./print-css"; import type { GenerateOptions, PreviewOptions } from "./types"; import { ExitCode } from "./types"; import * as browseClient from "./browseClient"; +import { + RenderTab, + contentWidthInches, + convertDiagnosticsForDocx, + extractDiagramFences, + inlineLocalImages, + landscapeContentBox, + rasterizeDiagramFigures, + renderFenceSlots, + substituteSlots, +} from "./diagram-prepass"; +import { applyImagePolicy } from "./image-policy"; class ProgressReporter { private readonly quiet: boolean; @@ -71,8 +84,9 @@ export async function generate(opts: GenerateOptions): Promise { throw new Error(`input file not found: ${input}`); } + const to = opts.to ?? "pdf"; const outputPath = path.resolve( - opts.output ?? path.join(os.tmpdir(), `${deriveSlug(input)}.pdf`), + opts.output ?? path.join(os.tmpdir(), `${deriveSlug(input)}.${to}`), ); // Stage 1: read markdown @@ -80,10 +94,14 @@ export async function generate(opts: GenerateOptions): Promise { const markdown = fs.readFileSync(input, "utf8"); progress.end("Reading markdown"); + // Stage 1.5: diagram pre-pass — extract ```mermaid/```excalidraw fences and + // swap in placeholder tokens. Rendering happens after the tab opens below. 
+ const extraction = extractDiagramFences(markdown); + // Stage 2: render HTML progress.begin("Rendering HTML"); const rendered = render({ - markdown, + markdown: extraction.markdown, title: opts.title, author: opts.author, date: opts.date, @@ -94,16 +112,144 @@ export async function generate(opts: GenerateOptions): Promise { confidential: opts.confidential, pageSize: opts.pageSize, margins: opts.margins, + marginTop: opts.marginTop, + marginRight: opts.marginRight, + marginBottom: opts.marginBottom, + marginLeft: opts.marginLeft, pageNumbers: opts.pageNumbers, footerTemplate: opts.footerTemplate, }); progress.end("Rendering HTML", `${rendered.meta.wordCount} words`); + // Stage 2.5: render diagram fences in a dedicated bundle tab, substitute + // slots, then inline + probe + (if oversized) downscale local images. + // The bundle tab is lazy: image-only documents open it only when a raster + // actually needs print-resolution downscaling (eng-review D4). + const warn = (msg: string) => { + if (!opts.quiet) process.stderr.write(`\r\x1b[K[make-pdf] warning: ${msg}\n`); + }; + let renderTab: RenderTab | null = null; + let hasLandscape = false; + const getRenderTab = (): RenderTab | null => { + if (renderTab) return renderTab; + try { + renderTab = RenderTab.open(); + } catch (err: any) { + warn(`diagram-render tab unavailable: ${String(err?.message ?? err).split("\n")[0]}`); + return null; + } + return renderTab; + }; + + let finalHtml = rendered.html; + try { + if (extraction.fences.length > 0) { + progress.begin(`Rendering ${extraction.fences.length} diagram(s)`); + const tab = getRenderTab(); + if (tab) { + const slots = renderFenceSlots(extraction.fences, tab, warn); + finalHtml = substituteSlots(finalHtml, slots); + } else { + // No bundle/tab: visible diagnostic beats silent raw tokens. 
+ const slots = new Map( + extraction.fences.map((f) => [ + f.token, + ``, + ]), + ); + finalHtml = substituteSlots(finalHtml, slots); + } + progress.end(`Rendering ${extraction.fences.length} diagram(s)`); + } + + progress.begin("Inlining images"); + const contentWidthIn = contentWidthInches(opts); + finalHtml = inlineLocalImages(finalHtml, { + inputDir: path.dirname(input), + strict: opts.strict === true, + allowNetwork: opts.allowNetwork === true, + contentWidthIn, + warn, + getTab: getRenderTab, + }); + progress.end("Inlining images"); + + // Width directives + conservative auto-landscape (image-policy). + const policy = applyImagePolicy(finalHtml, { + contentWidthIn, + landscape: landscapeContentBox(opts), + warn, + }); + finalHtml = policy.html; + hasLandscape = policy.hasLandscape; + + // DOCX needs rasters, not inline SVG (Word's SVG support is unreliable) — + // do it while the render tab is still open. + if (to === "docx") { + const needsRaster = /
", + `\n`, + ); + fs.writeFileSync(outputPath, withScreenLayer, "utf8"); + const kb = Math.round(fs.statSync(outputPath).size / 1024); + progress.done(`${rendered.meta.wordCount} words · ${kb}KB · ${outputPath}`); + return outputPath; + } + + // ─── --to docx: content-fidelity conversion (eng-review P8) ──────────── + if (to === "docx") { + // Print-only surfaces don't survive the conversion. The watermark div + // would degrade to a literal body paragraph reading "DRAFT" (worse than + // absent) — strip it. Warn once about print-only flags that were set. + finalHtml = finalHtml.replace(/
[\s\S]*?<\/div>/, ""); + const printOnly: string[] = []; + if (opts.watermark) printOnly.push("--watermark"); + if (opts.headerTemplate) printOnly.push("--header-template"); + if (opts.footerTemplate) printOnly.push("--footer-template"); + if (opts.pageSize) printOnly.push("--page-size"); + if (opts.margins || opts.marginTop || opts.marginRight || opts.marginBottom || opts.marginLeft) printOnly.push("--margins"); + if (printOnly.length > 0) { + warn(`docx is content-fidelity: ${printOnly.join(", ")} do not apply to Word output`); + } + progress.begin("Converting to DOCX"); + const { default: HTMLtoDOCX } = await import("html-to-docx"); + const buf = await HTMLtoDOCX(finalHtml, null, { + title: rendered.meta.title, + creator: rendered.meta.author || undefined, + }); + const bytes: Uint8Array = buf instanceof Uint8Array ? buf : new Uint8Array(await (buf as Blob).arrayBuffer()); + fs.writeFileSync(outputPath, bytes); + progress.end("Converting to DOCX"); + const kb = Math.round(fs.statSync(outputPath).size / 1024); + progress.done(`${rendered.meta.wordCount} words · ${kb}KB · ${outputPath} (content fidelity — layout is Word's)`); + return outputPath; + } + // Stage 3: write HTML to a tmp file browse can read // (We don't actually write it; we pass inline via --from-file JSON.) // But for preview mode and debugging, we still write to tmp. const htmlTmp = tmpFile("html"); - fs.writeFileSync(htmlTmp, rendered.html, "utf8"); + fs.writeFileSync(htmlTmp, finalHtml, "utf8"); // Stage 4: spin up a dedicated tab, load HTML, (wait for Paged.js if TOC), // then emit PDF. Always close the tab. 
@@ -114,7 +260,7 @@ export async function generate(opts: GenerateOptions): Promise { try { progress.begin("Loading HTML into Chromium"); browseClient.loadHtml({ - html: rendered.html, + html: finalHtml, waitUntil: "domcontentloaded", tabId, }); @@ -145,6 +291,10 @@ export async function generate(opts: GenerateOptions): Promise { tagged: opts.tagged !== false, outline: opts.outline !== false, printBackground: !!opts.watermark, + // Named landscape pages only take effect when Chromium honors CSS page + // sizes. Flip it ONLY when a promotion exists — minimal behavior change + // for every other document. + preferCSSPageSize: hasLandscape ? true : undefined, toc: opts.toc, }); progress.end("Generating PDF"); @@ -178,6 +328,21 @@ export async function preview(opts: PreviewOptions): Promise { progress.begin("Rendering HTML"); const markdown = fs.readFileSync(input, "utf8"); + // Preview deliberately skips the diagram/image pre-pass (no browse daemon + // round-trip — preview is the fast loop). Be loud about the divergence so + // nobody signs off on a preview that lacks what the PDF will have. + if (!opts.quiet) { + const fenceCount = extractDiagramFences(markdown).fences.length; + const hasLocalImages = /!\[[^\]]*\]\((?!https?:|data:)[^)]+\)/.test(markdown); + if (fenceCount > 0 || hasLocalImages) { + process.stderr.write( + `[make-pdf] preview note: ${fenceCount > 0 ? `${fenceCount} diagram fence(s) shown as code` : ""}` + + `${fenceCount > 0 && hasLocalImages ? "; " : ""}` + + `${hasLocalImages ? "local images may not resolve from the preview location" : ""}` + + ` — \`generate\` renders them fully.\n`, + ); + } + } const rendered = render({ markdown, title: opts.title, diff --git a/make-pdf/src/print-css.ts b/make-pdf/src/print-css.ts index 2366f42b9..bf6f862bd 100644 --- a/make-pdf/src/print-css.ts +++ b/make-pdf/src/print-css.ts @@ -12,9 +12,11 @@ * breaks copy-paste extraction. * - All paragraphs flush-left. No first-line indent, no justify, no * p+p indent. 
text-align: left everywhere. 12pt margin-bottom. - * - Cover page has the same 1in margins as every other page. No flexbox - * center, no inset padding, no vertical centering. Distinction comes - * from eyebrow + larger title + hairline rule, not from centering. + * - Cover page (v1.58.0.0 poster revision, user-directed): 56pt title, + * 13pt meta, padding-top 1.4in for poster placement. Still no flexbox + * and no vertical centering; the inset is a deliberate top-third drop. + * (Supersedes the original "no inset padding" lock from the first + * /plan-design-review — the 32pt cover read as too small in print.) * - `@page :first` suppresses running header/footer but does NOT override * the 1in margin. * - No , no external CSS/fonts — everything inlined. @@ -118,19 +120,76 @@ function pageRules(size: string, margin: string, opts: PrintCssOptions): string ` @bottom-center { content: none; }`, ` @bottom-right { content: none; }`, `}`, + ``, + // Landscape named page for promoted wide diagrams/images (image-policy). + // Chromium-only — exactly the engine this pipeline always prints with. + // Honored only when the print call passes preferCSSPageSize (orchestrator + // sets it when a promotion exists). Vertical centering is NOT done here — + // image-policy emits a computed inline margin-top instead (see the + // .page-wide comment below for why). + `@page wide {`, + ` size: ${size} landscape;`, + ` margin: ${margin};`, + `}`, + // No explicit break-before/after (the page-name CHANGE already forces a + // break on both sides) and NO height/flex centering: a flex .page-wide + // with min-height fragments into a phantom empty landscape page in + // Chromium (landscape-gate counted 5 pages for 3 promotions; bisected to + // min-height at any value). Vertical centering is done by image-policy + // instead — it knows each promoted block's aspect ratio and emits an + // inline margin-top, which fragmentation handles fine. 
+ `.page-wide {`, + ` page: wide;`, + ` text-align: center;`, + `}`, + // width: 100% stretch is intentional for promoted content: auto-promoted + // rasters are >=~1600px (≈190dpi at the 9in landscape box — prints fine), + // and a directive-forced small image is the user's explicit call. + `.page-wide img, .page-wide svg { width: 100%; height: auto; max-width: none; }`, + `.page-wide figure.diagram > svg { max-width: none; }`, ].filter(line => line !== "").join("\n"); } +/** + * Screen layer appended for `--to html` exports. The print CSS stays the + * source of truth; this only makes the same document readable in a browser + * (centered measure, padding, no print-only chapter breaks forcing scroll + * gaps). Print output is unaffected — media-scoped. + */ +export function screenCss(): string { + return [ + `@media screen {`, + // ~42em at 12pt ≈ 70-75 characters per line — the readable ceiling. + ` body { max-width: 42em; margin: 0 auto; padding: 2.5em 1.5em; }`, + ` .chapter { break-before: auto; }`, + ` .watermark { display: none; }`, + ` figure.diagram { overflow-x: auto; }`, + // Page numbers only exist in print; hide the empty spans + dot leaders. + ` .toc li .toc-page, .toc li .toc-dots { display: none; }`, + `}`, + ].join("\n"); +} + function rootTypography(): string { return [ `html { lang: en; }`, + // Zero image truncation, ever: every image caps at the content box, + // whatever element it lives in. Markdown images render as
<p><img></p>
(no + // figure), so a figure-scoped cap alone lets a 1900px screenshot run off + // the page edge. .page-wide deliberately overrides to fill its landscape + // box — still bounded, never clipped. + `img { max-width: 100%; height: auto; }`, `body {`, ` font-family: ${SANS_STACK}, ${CJK_STACK}, ${EMOJI_FAMILIES}, sans-serif;`, - ` font-size: 11pt;`, + ` font-size: 12pt;`, ` line-height: 1.5;`, ` color: #111;`, ` background: white;`, - ` hyphens: auto;`, + // No auto-hyphenation: it puts real "dif-\nferent" breaks into the PDF + // text layer, and clean copy-paste is the product contract (the + // combined-gate caught this the moment 12pt body made lines wrap). + // Left-aligned rag doesn't need hyphenation. + ` hyphens: manual;`, ` font-variant-ligatures: common-ligatures;`, ` font-kerning: normal;`, ` text-rendering: geometricPrecision;`, @@ -143,45 +202,47 @@ function rootTypography(): string { function coverRules(enabled: boolean): string { if (!enabled) return ""; return [ + // Poster scale: the cover is the one page where type should feel huge. 
`.cover {`, ` page: first;`, ` page-break-after: always;`, ` break-after: page;`, ` text-align: left;`, + ` padding-top: 1.4in;`, `}`, `.cover .eyebrow {`, - ` font-size: 9pt;`, + ` font-size: 11pt;`, ` letter-spacing: 0.2em;`, ` text-transform: uppercase;`, ` color: #666;`, ` margin: 0 0 36pt;`, `}`, `.cover h1.cover-title {`, - ` font-size: 32pt;`, - ` line-height: 1.15;`, + ` font-size: 56pt;`, + ` line-height: 1.08;`, ` font-weight: 700;`, - ` letter-spacing: -0.01em;`, - ` margin: 0 0 18pt;`, - ` max-width: 5.5in;`, + ` letter-spacing: -0.02em;`, + ` margin: 0 0 24pt;`, + ` max-width: 6in;`, ` text-align: left;`, `}`, `.cover .cover-subtitle {`, - ` font-size: 14pt;`, - ` line-height: 1.4;`, + ` font-size: 18pt;`, + ` line-height: 1.35;`, ` font-weight: 400;`, ` color: #333;`, ` margin: 0 0 36pt;`, - ` max-width: 5in;`, + ` max-width: 5.5in;`, ` text-align: left;`, `}`, `.cover hr.rule {`, ` width: 2.5in;`, ` height: 0;`, ` border: 0;`, - ` border-top: 1px solid #111;`, - ` margin: 0 0 18pt 0;`, + ` border-top: 1.5px solid #111;`, + ` margin: 0 0 24pt 0;`, `}`, - `.cover .cover-meta { font-size: 10pt; line-height: 1.6; color: #333; text-align: left; }`, + `.cover .cover-meta { font-size: 13pt; line-height: 1.6; color: #333; text-align: left; }`, `.cover .cover-meta strong { font-weight: 700; }`, ].join("\n"); } @@ -191,12 +252,12 @@ function tocRules(enabled: boolean): string { return [ `.toc { page-break-after: always; break-after: page; }`, `.toc h2 {`, - ` font-size: 13pt;`, + ` font-size: 16pt;`, ` text-transform: uppercase;`, ` letter-spacing: 0.15em;`, - ` color: #666;`, - ` font-weight: 600;`, - ` margin: 0 0 0.5in;`, + ` color: #444;`, + ` font-weight: 700;`, + ` margin: 0 0 0.4in;`, `}`, `.toc ol {`, ` list-style: none;`, @@ -207,14 +268,14 @@ function tocRules(enabled: boolean): string { ` display: flex;`, ` align-items: baseline;`, ` gap: 0.25in;`, - ` font-size: 11pt;`, - ` line-height: 2;`, - ` padding: 4pt 0;`, + ` font-size: 12pt;`, + ` 
line-height: 1.7;`, + ` padding: 3pt 0;`, `}`, `.toc li .toc-title { flex: 0 0 auto; }`, `.toc li .toc-dots { flex: 1 1 auto; border-bottom: 1px dotted #aaa; margin: 0 6pt; transform: translateY(-4pt); }`, `.toc li .toc-page { flex: 0 0 auto; color: #666; font-variant-numeric: tabular-nums; }`, - `.toc li.level-2 { padding-left: 0.35in; font-size: 10pt; }`, + `.toc li.level-2 { padding-left: 0.35in; font-size: 11pt; }`, `.toc li a { color: inherit; text-decoration: none; }`, ].join("\n"); } @@ -229,7 +290,7 @@ function chapterRules(noChapterBreaks: boolean): string { return [ breakRule, `h1 {`, - ` font-size: 22pt;`, + ` font-size: 26pt;`, ` line-height: 1.2;`, ` font-weight: 700;`, ` letter-spacing: -0.01em;`, @@ -237,9 +298,9 @@ function chapterRules(noChapterBreaks: boolean): string { ` break-after: avoid;`, ` page-break-after: avoid;`, `}`, - `h2 { font-size: 15pt; line-height: 1.3; font-weight: 700; margin: 24pt 0 6pt; break-after: avoid; page-break-after: avoid; }`, - `h3 { font-size: 12pt; line-height: 1.4; font-weight: 700; text-transform: uppercase; letter-spacing: 0.08em; color: #333; margin: 18pt 0 4pt; break-after: avoid; page-break-after: avoid; }`, - `h4 { font-size: 11pt; font-weight: 700; margin: 12pt 0 4pt; break-after: avoid; page-break-after: avoid; }`, + `h2 { font-size: 18pt; line-height: 1.3; font-weight: 700; margin: 26pt 0 8pt; break-after: avoid; page-break-after: avoid; }`, + `h3 { font-size: 13.5pt; line-height: 1.4; font-weight: 700; text-transform: uppercase; letter-spacing: 0.08em; color: #333; margin: 20pt 0 5pt; break-after: avoid; page-break-after: avoid; }`, + `h4 { font-size: 12pt; font-weight: 700; margin: 14pt 0 5pt; break-after: avoid; page-break-after: avoid; }`, ].join("\n"); } @@ -254,7 +315,7 @@ function blockRules(): string { ` orphans: 3;`, `}`, `p:first-child { margin-top: 0; }`, - `p.lead { font-size: 13pt; line-height: 1.45; color: #222; margin: 0 0 18pt; }`, + `p.lead { font-size: 14pt; line-height: 1.45; color: #222; 
margin: 0 0 18pt; }`, ].join("\n"); } @@ -275,7 +336,7 @@ function codeRules(): string { return [ `code {`, ` font-family: "SF Mono", Menlo, Consolas, monospace;`, - ` font-size: 9.5pt;`, + ` font-size: 10.5pt;`, ` background: #f4f4f4;`, ` padding: 1pt 3pt;`, ` border-radius: 2pt;`, @@ -283,7 +344,7 @@ function codeRules(): string { `}`, `pre {`, ` font-family: "SF Mono", Menlo, Consolas, monospace;`, - ` font-size: 9pt;`, + ` font-size: 10pt;`, ` line-height: 1.4;`, ` background: #f7f7f5;`, ` padding: 10pt 12pt;`, @@ -310,11 +371,11 @@ function quoteRules(): string { ` padding: 0 0 0 18pt;`, ` border-left: 2pt solid #111;`, ` color: #333;`, - ` font-size: 11pt;`, + ` font-size: 12pt;`, ` line-height: 1.5;`, `}`, `blockquote p { margin-bottom: 6pt; text-align: left; }`, - `blockquote cite { display: block; margin-top: 6pt; font-style: normal; font-size: 9.5pt; color: #666; letter-spacing: 0.02em; }`, + `blockquote cite { display: block; margin-top: 6pt; font-style: normal; font-size: 10pt; color: #666; letter-spacing: 0.02em; }`, `blockquote cite::before { content: "— "; }`, ].join("\n"); } @@ -323,13 +384,25 @@ function figureRules(): string { return [ `figure { margin: 12pt 0; }`, `figure img { display: block; max-width: 100%; height: auto; }`, - `figcaption { font-size: 9pt; color: #666; margin-top: 6pt; font-style: italic; }`, + `figcaption { font-size: 10pt; color: #666; margin-top: 6pt; font-style: italic; }`, + // Diagram figures (diagram-prepass): rendered mermaid/excalidraw SVG. + // SVGs scale to the content box and never split across pages. + `figure.diagram { break-inside: avoid; text-align: center; }`, + `figure.diagram > svg { max-width: 100%; height: auto; }`, + `figure.diagram .diagram-caption { text-align: center; }`, + // Diagnostic block for a fence that failed to render — loud, boxed, + // unmistakably an error (never silent raw code). 
+ `figure.diagram-error { border: 1.5pt solid #b00020; padding: 8pt 10pt; text-align: left; }`, + `figure.diagram-error .diagram-error-title { font-weight: 700; color: #b00020; font-style: normal; margin: 0 0 6pt; }`, + `figure.diagram-error .diagram-error-detail { font-size: 8.5pt; white-space: pre-wrap; margin: 0; }`, + // Missing local image placeholder (non-strict mode). + `.image-missing { display: inline-block; border: 1pt dashed #b00020; color: #b00020; padding: 4pt 8pt; font-size: 9pt; }`, ].join("\n"); } function tableRules(): string { return [ - `table { width: 100%; border-collapse: collapse; margin: 12pt 0; font-size: 10pt; }`, + `table { width: 100%; border-collapse: collapse; margin: 12pt 0; font-size: 11pt; }`, `th, td { border-bottom: 0.5pt solid #ccc; padding: 5pt 8pt; text-align: left; vertical-align: top; }`, `th { font-weight: 700; border-bottom: 1pt solid #111; background: transparent; }`, ].join("\n"); @@ -346,7 +419,7 @@ function listRules(): string { function footnoteRules(): string { return [ `.footnote-ref { font-size: 0.75em; vertical-align: super; line-height: 0; text-decoration: none; color: #0055cc; }`, - `.footnotes { margin-top: 24pt; padding-top: 12pt; border-top: 0.5pt solid #ccc; font-size: 9.5pt; line-height: 1.4; }`, + `.footnotes { margin-top: 24pt; padding-top: 12pt; border-top: 0.5pt solid #ccc; font-size: 10pt; line-height: 1.4; }`, `.footnotes ol { padding-left: 18pt; }`, ].join("\n"); } diff --git a/make-pdf/src/render.ts b/make-pdf/src/render.ts index ae5228f42..514fbbc89 100644 --- a/make-pdf/src/render.ts +++ b/make-pdf/src/render.ts @@ -14,6 +14,7 @@ import { marked } from "marked"; import { smartypants } from "./smartypants"; import { printCss, type PrintCssOptions } from "./print-css"; +import { applyImageDirectives } from "./image-policy"; export interface RenderOptions { markdown: string; @@ -34,6 +35,14 @@ export interface RenderOptions { // Page layout pageSize?: "letter" | "a4" | "legal" | "tabloid"; margins?: 
string; + // Per-side margins (override `margins`). Must reach the CSS @page rule: + // when a landscape promotion flips preferCSSPageSize on, the CSS margins + // are the ones Chromium honors — dropping per-side flags there would + // silently change the whole document's layout (Codex P2). + marginTop?: string; + marginRight?: string; + marginBottom?: string; + marginLeft?: string; // Footer behavior. pageNumbers defaults to true. When footerTemplate is set, // CSS page numbers are suppressed so the custom Chromium footer wins cleanly. @@ -60,8 +69,13 @@ export function render(opts: RenderOptions): RenderResult { // 1. Markdown → HTML const rawHtml = marked.parse(opts.markdown, { async: false }) as string; + // 1.5. Image directive suffixes: `![a](x.png){width=50%}` → data-gstack-* + // attributes. Before the sanitizer (which keeps data- attrs) so the brace + // text never reaches smartypants or the final page. + const directedHtml = applyImageDirectives(rawHtml); + // 2. Sanitize - const cleanHtml = sanitizeUntrustedHtml(rawHtml); + const cleanHtml = sanitizeUntrustedHtml(directedHtml); // 3. Decode common entities so smartypants can match raw " and '. // marked HTML-encodes quotes in text ("hello" → "&quot;hello&quot;"); @@ -91,7 +105,9 @@ export function render(opts: RenderOptions): RenderResult { confidential: opts.confidential !== false, runningHeader: derivedTitle, pageSize: opts.pageSize, - margins: opts.margins, + // Compose per-side margins into the CSS shorthand so @page stays the + // single source of truth even under preferCSSPageSize. + margins: composeMargins(opts), pageNumbers: showPageNumbers, }; const css = printCss(cssOptions); @@ -106,14 +122,22 @@ export function render(opts: RenderOptions): RenderResult { }) : ""; + // TOC anchors must resolve: assign id="toc-N" to each H1-H3 in the same + // order buildTocBlock scans them, or every TOC link is a dead href (masked + // in PDFs by Chromium outline bookmarks, glaring in --to html). 
Headings + // that already carry an id keep it — the ids array records the ACTUAL id + // per heading so TOC entries always link to something real. + const anchored = opts.toc ? addHeadingIds(typographicHtml) : { html: typographicHtml, ids: [] }; + const anchoredHtml = anchored.html; + const tocBlock = opts.toc - ? buildTocBlock(typographicHtml) + ? buildTocBlock(anchoredHtml, anchored.ids) : ""; // Wrap body in .chapter sections at H1 boundaries if chapter breaks are on. const chapterHtml = opts.noChapterBreaks - ? `

${typographicHtml}
` - : wrapChaptersByH1(typographicHtml); + ? `
${anchoredHtml}
` + : wrapChaptersByH1(anchoredHtml); const watermarkBlock = opts.watermark ? `
${escapeHtml(opts.watermark)}
` @@ -256,13 +280,13 @@ function buildCoverBlock(opts: { * Page numbers are filled in by Paged.js (when --toc is passed and Paged.js * polyfill is injected). */ -function buildTocBlock(html: string): string { +function buildTocBlock(html: string, ids: string[] = []): string { const headings = extractHeadings(html); if (headings.length === 0) return ""; const items = headings.map((h, i) => { const level = h.level >= 2 ? "level-2" : "level-1"; - const id = `toc-${i}`; + const id = ids[i] ?? `toc-${i}`; return [ `
  • `, ` ${escapeHtml(h.text)}`, @@ -282,6 +306,28 @@ function buildTocBlock(html: string): string { ].join("\n"); } +/** + * Assign id="toc-N" to every H1-H3 in document order — the same order + * extractHeadings/buildTocBlock use, so anchors and entries line up by index. + * A heading that already carries an id keeps it, and the returned ids array + * records the actual id for that slot so the TOC links to the real anchor + * instead of a nonexistent toc-N. + */ +function addHeadingIds(html: string): { html: string; ids: string[] } { + const ids: string[] = []; + const out = html.replace(/<(h[1-3])([^>]*)>/gi, (full, tag: string, attrs: string) => { + const existing = attrs.match(/\bid\s*=\s*["']([^"']*)["']/i)?.[1]; + if (existing) { + ids.push(existing); + return full; + } + const id = `toc-${ids.length}`; + ids.push(id); + return `<${tag}${attrs} id="${id}">`; + }); + return { html: out, ids }; +} + function extractHeadings(html: string): Array<{ level: number; text: string }> { const re = /<(h[1-3])[^>]*>([\s\S]*?)<\/\1>/gi; const headings: Array<{ level: number; text: string }> = []; @@ -352,11 +398,28 @@ function decodeTextEntities(s: string): string { .replace(/&amp;/g, "&"); } +/** Compose `margin: top right bottom left` from per-side overrides + base. */ +function composeMargins(opts: { + margins?: string; marginTop?: string; marginRight?: string; + marginBottom?: string; marginLeft?: string; +}): string | undefined { + const base = opts.margins ?? "1in"; + if (!opts.marginTop && !opts.marginRight && !opts.marginBottom && !opts.marginLeft) { + return opts.margins; + } + return [ + opts.marginTop ?? base, + opts.marginRight ?? base, + opts.marginBottom ?? base, + opts.marginLeft ?? 
base, + ].join(" "); +} + function stripTags(html: string): string { return html.replace(/<[^>]+>/g, ""); } -function escapeHtml(s: string): string { +export function escapeHtml(s: string): string { return s .replace(/&/g, "&") .replace(/.pdf) + output?: string; // output path (default: /tmp/.) + + // Output format (NOT --format, which is a --page-size alias): + // pdf — print-quality PDF via Chromium (default) + // html — single self-contained file, zero network references + // docx — content-fidelity Word document (diagrams embedded as PNG) + to?: OutputFormat; // Page layout margins?: string; // "1in" | "72pt" | "25mm" | "2.54cm" @@ -44,6 +52,10 @@ export interface GenerateOptions { // Network allowNetwork?: boolean; // default: false + // Strict mode (eng-review D6.1): missing/remote images hard-fail instead of + // warn + placeholder. For CI docs pipelines that need determinism. + strict?: boolean; // default: false + // Metadata title?: string; author?: string; diff --git a/make-pdf/test/coverage-gaps.test.ts b/make-pdf/test/coverage-gaps.test.ts new file mode 100644 index 000000000..26586e8e5 --- /dev/null +++ b/make-pdf/test/coverage-gaps.test.ts @@ -0,0 +1,220 @@ +/** + * Coverage-gap fills from the v1.58.0.0 ship audit — the branches the main + * suites couldn't reach without a live browse tab (mock-tab here), plus the + * pure-function stragglers (WebP probing, landscape geometry, bundle path + * resolution, screen CSS). + */ +import { describe, expect, test } from "bun:test"; +import * as fs from "node:fs"; +import * as os from "node:os"; +import * as path from "node:path"; + +import { + RenderCallError, + type RenderTab, + landscapeContentBox, + rasterizeDiagramFigures, + renderFenceSlots, + resolveBundlePath, + substituteSlots, +} from "../src/diagram-prepass"; +import { imageDims } from "../src/image-size"; +import { screenCss } from "../src/print-css"; + +/** Duck-typed RenderTab: scripted call results + a loadBundle counter. 
*/ +function mockTab(script: (fn: string, ...args: Array) => string) { + const calls: string[] = []; + let reloads = 0; + const tab = { + call: (fn: string, ...args: Array) => { + calls.push(fn); + return script(fn, ...args); + }, + loadBundle: () => { reloads++; }, + close: () => {}, + } as unknown as RenderTab; + return { tab, calls, reloadCount: () => reloads }; +} + +const fence = (over: Partial<{ lang: string; source: string; ordinal: number }>) => ({ + lang: "mermaid", + source: "graph LR\n A --> B", + render: true as const, + token: `tok-${over.ordinal ?? 1}`, + ordinal: over.ordinal ?? 1, + title: undefined, + page: undefined, + ...over, +}); + +// ─── renderFenceSlots: reset contract + excalidraw branches ─────────── + +describe("renderFenceSlots (mock tab)", () => { + test("reset contract: a failure reloads the bundle and the NEXT fence still renders", () => { + const { tab, reloadCount } = mockTab((fn, ...args) => { + if (String(args[1] ?? "").includes("BROKEN")) throw new RenderCallError("Parse error on line 1"); + return ""; + }); + const warnings: string[] = []; + const slots = renderFenceSlots( + [ + fence({ ordinal: 1 }), + fence({ ordinal: 2, source: "BROKEN" }), + fence({ ordinal: 3 }), + ], + tab, + (m) => warnings.push(m), + ); + expect(slots.get("tok-1")).toContain(""); + expect(slots.get("tok-2")).toContain("diagram-error"); + expect(slots.get("tok-3")).toContain(""); // post-failure fence rendered + expect(reloadCount()).toBe(1); // exactly one reset reload + expect(warnings[0]).toContain("failed to render"); + }); + + test("excalidraw fence renders via __excalidrawToSvg", () => { + const { tab, calls } = mockTab(() => ""); + const slots = renderFenceSlots( + [fence({ lang: "excalidraw", source: '{"type":"excalidraw","elements":[]}' })], + tab, + () => {}, + ); + expect(calls).toEqual(["__excalidrawToSvg"]); + expect(slots.get("tok-1")).toContain(" { + const { tab, calls, reloadCount } = mockTab(() => ""); + const warnings: string[] = []; + 
const slots = renderFenceSlots( + [fence({ lang: "excalidraw", source: "{not json" })], + tab, + (m) => warnings.push(m), + ); + expect(calls).toEqual([]); // JSON.parse threw before any bundle call + expect(slots.get("tok-1")).toContain("diagram-error"); + expect(reloadCount()).toBe(1); + expect(warnings).toHaveLength(1); + }); +}); + +// ─── rasterizeDiagramFigures: svg-data-URI + error fallbacks ────────── + +describe("rasterizeDiagramFigures (mock tab)", () => { + const figure = ``; + + test("svg data-URI images rasterize to PNG", () => { + const svgUri = `data:image/svg+xml;base64,${Buffer.from("").toString("base64")}`; + const { tab } = mockTab(() => "data:image/png;base64,AAAA"); + const out = rasterizeDiagramFigures(`v`, tab, 6.5, () => {}); + expect(out).toContain('src="data:image/png;base64,AAAA"'); + }); + + test("figure rasterization failure surfaces the SOURCE as text (never silent loss)", () => { + // Returning the figure unchanged would make the diagram vanish in DOCX + // (the converter drops
    /) — the failure must be visible. + const { tab } = mockTab(() => { throw new RenderCallError("tainted"); }); + const warnings: string[] = []; + const srcFigure = figure.replace( + '
    B").toString("base64")}"`, + ); + const out = rasterizeDiagramFigures(srcFigure, tab, 6.5, (m) => warnings.push(m)); + expect(out).toContain("could not be rasterized"); + expect(out).toContain("A --> B"); // source visible (escaped), not dropped + expect(out).not.toContain(" { + const svgUri = `data:image/svg+xml;base64,${Buffer.from("").toString("base64")}`; + const { tab } = mockTab(() => { throw new RenderCallError("decode failed"); }); + const tagIn = `
    `; + const out = rasterizeDiagramFigures(tagIn, tab, 6.5, () => {}); + expect(out).toBe(tagIn); + }); +}); + +// ─── image-size: WebP variants ──────────────────────────────────────── + +describe("imageDims WebP", () => { + function riff(fmt: string, body: Buffer): Buffer { + const b = Buffer.alloc(12 + 4 + body.length); + b.write("RIFF", 0, "ascii"); + b.writeUInt32LE(4 + body.length + 4, 4); + b.write("WEBP", 8, "ascii"); + b.write(fmt, 12, "ascii"); + body.copy(b, 16); + return b; + } + + test("VP8 (lossy)", () => { + const body = Buffer.alloc(16); + body.writeUInt16LE(800 & 0x3fff, 10); // width at chunk offset 26 = body offset 10 + body.writeUInt16LE(600 & 0x3fff, 12); + expect(imageDims(riff("VP8 ", body))).toEqual({ width: 800, height: 600, mime: "image/webp" }); + }); + + test("VP8L (lossless)", () => { + const body = Buffer.alloc(10); + body[4] = 0x2f; // signature at chunk offset 20 = body offset 4 + const w = 1023, h = 511; + const bits = (w - 1) | ((h - 1) << 14); + body.writeUInt32LE(bits >>> 0, 5); + expect(imageDims(riff("VP8L", body))).toEqual({ width: 1023, height: 511, mime: "image/webp" }); + }); + + test("VP8X (extended)", () => { + const body = Buffer.alloc(14); + const w = 4000 - 1, h = 250 - 1; // 24-bit minus-one at offsets 24/27 = body 8/11 + body[8] = w & 0xff; body[9] = (w >> 8) & 0xff; body[10] = (w >> 16) & 0xff; + body[11] = h & 0xff; body[12] = (h >> 8) & 0xff; body[13] = (h >> 16) & 0xff; + expect(imageDims(riff("VP8X", body))).toEqual({ width: 4000, height: 250, mime: "image/webp" }); + }); + + test("unknown RIFF subtype → null", () => { + expect(imageDims(riff("XXXX", Buffer.alloc(14)))).toBeNull(); + }); +}); + +// ─── landscape geometry + slot fallback + bundle path + screen css ──── + +describe("pure-function stragglers", () => { + test("landscapeContentBox letter defaults: 9in × 6.5in", () => { + expect(landscapeContentBox({})).toEqual({ contentWIn: 9, contentHIn: 6.5 }); + }); + test("landscapeContentBox a4 + asymmetric 
margins", () => { + const box = landscapeContentBox({ pageSize: "a4", marginLeft: "0.5in", marginRight: "0.5in", marginTop: "25mm", marginBottom: "1in" }); + expect(box.contentWIn).toBeCloseTo(11.69 - 1, 2); + expect(box.contentHIn).toBeCloseTo(8.27 - 25 / 25.4 - 1, 2); + }); + + test("substituteSlots bare-token fallback (token not

<p>-wrapped)", () => { + const slots = new Map([["gstack-diagram-slot-x-1", "

    D
    "]]); + const out = substituteSlots("
  • gstack-diagram-slot-x-1
  • ", slots); + expect(out).toBe("
  • D
  • "); + }); + + test("resolveBundlePath honors the env override", () => { + const tmp = path.join(os.tmpdir(), `bundle-override-${process.pid}.html`); + fs.writeFileSync(tmp, ""); + try { + expect(resolveBundlePath({ GSTACK_DIAGRAM_BUNDLE: tmp } as NodeJS.ProcessEnv)).toBe(tmp); + } finally { + fs.unlinkSync(tmp); + } + }); + // NOTE: resolveBundlePath's not-found error shape is untestable from inside + // this checkout (the repo-relative candidate always exists), and a vacuous + // if-guarded assertion was worse than none. The env-override test above is + // the honest coverage; the error path is exercised manually via + // GSTACK_DIAGRAM_BUNDLE pointing at a missing file outside a repo. + + test("screenCss is media-scoped and readable-width", () => { + const css = screenCss(); + expect(css).toContain("@media screen"); + // 42em at 12pt ≈ 70-75 chars/line — the readable ceiling (design review). + expect(css).toContain("max-width: 42em"); + expect(css).toContain(".watermark { display: none; }"); + }); +}); diff --git a/make-pdf/test/diagram-prepass.test.ts b/make-pdf/test/diagram-prepass.test.ts new file mode 100644 index 000000000..eac3e645d --- /dev/null +++ b/make-pdf/test/diagram-prepass.test.ts @@ -0,0 +1,403 @@ +/** + * Unit tests for the diagram pre-pass: fence extraction, info-string parsing, + * slot substitution, diagnostic blocks, image inlining policy, and the + * byte-level image dimension prober. No browse daemon required — the tab + * factory returns null so downscale paths are exercised as no-ops. 
+ */ +import { afterAll, describe, expect, test } from "bun:test"; +import * as fs from "node:fs"; +import * as os from "node:os"; +import * as path from "node:path"; +import zlib from "node:zlib"; + +import { + StrictModeError, + buildDiagnosticBlock, + buildDiagramFigure, + contentWidthInches, + dimToInches, + extractDiagramFences, + inlineLocalImages, + parseInfoString, + substituteSlots, + decodeFigureSource, +} from "../src/diagram-prepass"; +import { imageDims } from "../src/image-size"; + +// ─── fence extraction ───────────────────────────────────────────────── + +describe("extractDiagramFences", () => { + test("extracts a mermaid fence and replaces it with a token paragraph", () => { + const md = "# T\n\n```mermaid\ngraph LR\n A --> B\n```\n\ntail"; + const { markdown, fences } = extractDiagramFences(md); + expect(fences).toHaveLength(1); + expect(fences[0].lang).toBe("mermaid"); + expect(fences[0].source).toBe("graph LR\n A --> B"); + expect(markdown).toContain(fences[0].token); + expect(markdown).not.toContain("```mermaid"); + }); + + test("extracts excalidraw fences", () => { + const md = '```excalidraw\n{"type":"excalidraw","elements":[]}\n```'; + const { fences } = extractDiagramFences(md); + expect(fences).toHaveLength(1); + expect(fences[0].lang).toBe("excalidraw"); + }); + + test("render=false keeps the fence as code and strips the flag", () => { + const md = "```mermaid render=false\ngraph LR\n X --> Y\n```"; + const { markdown, fences } = extractDiagramFences(md); + expect(fences).toHaveLength(0); + expect(markdown).toContain("```mermaid\ngraph LR"); + expect(markdown).not.toContain("render=false"); + }); + + test("title is captured from the info string", () => { + const md = '```mermaid title="Auth flow"\ngraph LR\n A --> B\n```'; + const { fences } = extractDiagramFences(md); + expect(fences[0].title).toBe("Auth flow"); + }); + + test("non-diagram fences pass through untouched", () => { + const md = "```js\nconst a = 1;\n```"; + const { 
markdown, fences } = extractDiagramFences(md); + expect(fences).toHaveLength(0); + expect(markdown).toBe(md); + }); + + test("a mermaid example inside a plain fence is never extracted", () => { + const md = "````\n```mermaid\ngraph LR\n```\n````"; + const { markdown, fences } = extractDiagramFences(md); + expect(fences).toHaveLength(0); + expect(markdown).toBe(md); + }); + + test("tilde fences work", () => { + const md = "~~~mermaid\ngraph TD\n A --> B\n~~~"; + const { fences } = extractDiagramFences(md); + expect(fences).toHaveLength(1); + }); + + test("unclosed fence at EOF replays verbatim", () => { + const md = "```mermaid\ngraph LR\n A --> B"; + const { markdown, fences } = extractDiagramFences(md); + expect(fences).toHaveLength(0); + expect(markdown).toBe(md); + }); + + test("multiple fences get distinct ordinals and tokens", () => { + const md = "```mermaid\nA\n```\n\nmiddle\n\n```mermaid\nB\n```"; + const { fences } = extractDiagramFences(md); + expect(fences).toHaveLength(2); + expect(fences[0].ordinal).toBe(1); + expect(fences[1].ordinal).toBe(2); + expect(fences[0].token).not.toBe(fences[1].token); + }); +}); + +describe("parseInfoString", () => { + test("plain language", () => { + expect(parseInfoString("mermaid")).toEqual({ lang: "mermaid", render: true, title: undefined }); + }); + test("render=false", () => { + expect(parseInfoString("mermaid render=false").render).toBe(false); + }); + test("single-quoted title", () => { + expect(parseInfoString("mermaid title='Hi there'").title).toBe("Hi there"); + }); +}); + +// ─── slots ──────────────────────────────────────────────────────────── + +describe("substituteSlots", () => { + test("replaces the

<p>-wrapped token with slot HTML", () => { + const slots = new Map([["gstack-diagram-slot-ab-1", "

    X
    "]]); + const html = "

    T

    \n

    gstack-diagram-slot-ab-1

    \n

    tail

    "; + const out = substituteSlots(html, slots); + expect(out).toContain("
    X
    "); + expect(out).not.toContain("gstack-diagram-slot"); + expect(out).not.toContain("

    "); + }); +}); + +describe("diagnostic + figure blocks", () => { + const fence = { + lang: "mermaid", source: "graph LR\n A --> B", render: true, + token: "t", ordinal: 3, title: undefined, + }; + test("diagnostic block escapes error content and names the lang", () => { + const block = buildDiagnosticBlock(fence, 'Parse "quoted"'); + expect(block).toContain("diagram-error"); + expect(block).toContain("Diagram failed to render (mermaid)"); + expect(block).toContain("Parse <error>"); + expect(block).not.toContain(""); + }); + test("figure carries role=img and ordinal-based aria-label fallback", () => { + const fig = buildDiagramFigure(fence, ""); + expect(fig).toContain('role="img"'); + expect(fig).toContain('aria-label="diagram 3"'); + expect(fig).toContain(""); + }); + test("figure strips scripts from SVG (sanitizer second layer)", () => { + const fig = buildDiagramFigure(fence, ""); + expect(fig).not.toContain(" other than the page's + // own closers: head error-trap closer + module closer. + const closers = html.match(/<\/script>/g) ?? 
[]; + expect(closers.length).toBe(2); + }); + + const nodeModules = path.join(ROOT, "node_modules"); + let builtWithSameBun = false; + try { + const info = require(BUILD_INFO); + builtWithSameBun = info.bunVersion === Bun.version; + } catch {} + const canDeepCheck = existsSync(nodeModules) && builtWithSameBun; + + test.skipIf(!canDeepCheck)( + "deep: fresh build reproduces committed dist", + async () => { + const before = await Bun.file(BUILD_INFO).json(); + const proc = Bun.spawnSync(["bun", "run", "scripts/build.ts"], { cwd: ROOT }); + expect(proc.exitCode).toBe(0); + const after = await Bun.file(BUILD_INFO).json(); + expect(after.sha256).toBe(before.sha256); + }, + 60000, + ); +}); diff --git a/test/helpers/touchfiles.ts b/test/helpers/touchfiles.ts index ca9957c0e..68bc2062e 100644 --- a/test/helpers/touchfiles.ts +++ b/test/helpers/touchfiles.ts @@ -291,6 +291,11 @@ export const E2E_TOUCHFILES: Record = { 'design-shotgun-session': ['design-shotgun/**', 'scripts/resolvers/design.ts'], 'design-shotgun-full': ['design-shotgun/**', 'design/src/**', 'browse/src/**'], + // /diagram (diagram-render bundle consumers). Triplet = deterministic + // functional (gate); authoring quality = LLM-judged benchmark (periodic). 
+ 'diagram-triplet': ['diagram/**', 'lib/diagram-render/**', 'browse/src/write-commands.ts', 'browse/src/read-commands.ts'], + 'diagram-authoring-quality': ['diagram/**', 'lib/diagram-render/**', 'test/helpers/llm-judge.ts'], + // gstack-upgrade 'gstack-upgrade-happy-path': ['gstack-upgrade/**'], @@ -656,6 +661,10 @@ export const E2E_TIERS: Record = { 'design-shotgun-session': 'gate', 'design-shotgun-full': 'periodic', + // /diagram — triplet is deterministic functional, judge is a quality benchmark + 'diagram-triplet': 'gate', + 'diagram-authoring-quality': 'periodic', + // gstack-upgrade 'gstack-upgrade-happy-path': 'gate', diff --git a/test/skill-coverage-matrix.ts b/test/skill-coverage-matrix.ts index 101918bda..7359afbce 100644 --- a/test/skill-coverage-matrix.ts +++ b/test/skill-coverage-matrix.ts @@ -131,6 +131,11 @@ export const SKILL_COVERAGE: Record = { 'design-consultation': { gate: ['test/skill-coverage-floor.test.ts'], periodic: [] }, 'design-shotgun': { gate: ['test/skill-coverage-floor.test.ts'], periodic: [] }, 'design-html': { gate: ['test/skill-coverage-floor.test.ts'], periodic: [] }, + diagram: { + gate: ['test/skill-e2e-diagram.test.ts', 'test/skill-coverage-floor.test.ts'], + periodic: ['test/skill-e2e-diagram.test.ts'], + rationale: 'Triplet contract is gate-tier deterministic; authoring-quality judge is periodic (E2E_TIERS: diagram-triplet/diagram-authoring-quality).', + }, cso: { gate: ['test/skill-e2e-cso.test.ts', 'test/cso-preserved.test.ts', 'test/skill-coverage-floor.test.ts'], periodic: [], diff --git a/test/skill-e2e-bws.test.ts b/test/skill-e2e-bws.test.ts index 956174117..cf812e1fc 100644 --- a/test/skill-e2e-bws.test.ts +++ b/test/skill-e2e-bws.test.ts @@ -192,13 +192,21 @@ Report the exact output — either "READY: " or "NEEDS_SETUP".`, run('git', ['add', '.']); run('git', ['commit', '-m', 'initial']); - // Copy bin scripts + // Copy bin scripts + the lib module they import. gstack-learnings-log + // does `import ... 
from '$SCRIPT_DIR/../lib/jsonl-store.ts'` (v1.57.5.0 + // injection sanitization) — without lib/ alongside bin/, the script exits + // 1 before writing anything, failing this test for a fixture reason, not + // a model-behavior reason (root-caused during the v1.58.0.0 ship; fails + // identically on main). const binDir = path.join(opDir, 'bin'); fs.mkdirSync(binDir, { recursive: true }); for (const script of ['gstack-learnings-log', 'gstack-slug']) { fs.copyFileSync(path.join(ROOT, 'bin', script), path.join(binDir, script)); fs.chmodSync(path.join(binDir, script), 0o755); } + const libDir = path.join(opDir, 'lib'); + fs.mkdirSync(libDir, { recursive: true }); + fs.copyFileSync(path.join(ROOT, 'lib', 'jsonl-store.ts'), path.join(libDir, 'jsonl-store.ts')); // gstack-learnings-log will create the project dir automatically via gstack-slug diff --git a/test/skill-e2e-diagram.test.ts b/test/skill-e2e-diagram.test.ts new file mode 100644 index 000000000..43f3dddfc --- /dev/null +++ b/test/skill-e2e-diagram.test.ts @@ -0,0 +1,153 @@ +/** + * /diagram skill E2E (paid, claude -p). + * + * Two tests with deliberately different tiers (eng-review D5): + * + * diagram-triplet (gate) — deterministic functional contract: from an + * English ask, the agent following the skill emits a parseable triplet — + * .mmd source, .excalidraw scene with elements, SVG markup, PNG bytes. + * No quality judgment; either the artifacts exist and parse or they don't. + * + * diagram-authoring-quality (periodic) — LLM-judged benchmark of the + * authored mermaid itself (faithfulness to the ask, label quality, + * readable size). Non-deterministic by nature → never blocks merge. + * + * Per the extract-don't-copy fixture rule, the prompt embeds only the skill's + * working section (from "# /diagram" onward), not the full generated SKILL.md + * with its preamble. 
+ */ +import { describe, expect } from 'bun:test'; +import * as fs from 'node:fs'; +import * as path from 'node:path'; +import * as os from 'node:os'; + +import { runSkillTest } from './helpers/session-runner'; +import { + ROOT, browseBin, runId, + describeIfSelected, testConcurrentIfSelected, + logCost, +} from './helpers/e2e-helpers'; +import { callJudge } from './helpers/llm-judge'; + +const BUNDLE = path.join(ROOT, 'lib', 'diagram-render', 'dist', 'diagram-render.html'); + +/** Extract the working section of the generated skill doc (post-preamble). */ +function skillExtract(): string { + const full = fs.readFileSync(path.join(ROOT, 'diagram', 'SKILL.md'), 'utf-8'); + const start = full.indexOf('# /diagram'); + if (start < 0) throw new Error('diagram/SKILL.md missing "# /diagram" section — regenerate skill docs'); + return full.slice(start); +} + +function setupDir(prefix: string): string { + const dir = fs.mkdtempSync(path.join(os.tmpdir(), prefix)); + fs.writeFileSync(path.join(dir, 'diagram-skill.md'), skillExtract()); + // Pre-stage the bundle so the test is hermetic (no global install needed in + // CI); the prompt tells the agent discovery is already done. + fs.copyFileSync(BUNDLE, path.join(dir, 'diagram-render.html')); + fs.mkdirSync(path.join(dir, 'out')); + return dir; +} + +function basePrompt(dir: string, ask: string): string { + return `You have the /diagram skill instructions at ./diagram-skill.md — read them and follow Steps 1-4. + +Environment notes (already set up — skip Step 2's bundle discovery): +- The browse binary is at ${browseBin} — use it wherever the skill says $B. +- The render bundle is ALREADY staged at ./diagram-render.html in this directory; load it with: ${browseBin} load-html ./diagram-render.html +- Write all four artifacts into ./out/ with the slug "flow" (out/flow.mmd, out/flow.excalidraw, out/flow.svg, out/flow.png). +- Do not open any other applications. Do not use the Read tool on the PNG (no inline display needed here). 
+ +The diagram to create: ${ask}`; +} + +describeIfSelected('/diagram skill E2E', ['diagram-triplet', 'diagram-authoring-quality'], () => { + testConcurrentIfSelected('diagram-triplet', async () => { + const dir = setupDir('diagram-triplet-'); + try { + const result = await runSkillTest({ + prompt: basePrompt( + dir, + 'a flowchart (graph LR) of a 4-stage pipeline: markdown → prepass → Chromium → PDF.', + ), + workingDirectory: dir, + maxTurns: 25, + allowedTools: ['Bash', 'Read', 'Write'], + timeout: 240_000, + testName: 'diagram-triplet', + runId, + }); + logCost('diagram triplet', result); + expect(result.exitReason).toBe('success'); + + // The deterministic contract: all four artifacts exist and parse. + const mmd = fs.readFileSync(path.join(dir, 'out', 'flow.mmd'), 'utf-8'); + expect(mmd).toMatch(/graph\s+(LR|TD)/); + + const scene = JSON.parse(fs.readFileSync(path.join(dir, 'out', 'flow.excalidraw'), 'utf-8')); + expect(scene.type).toBe('excalidraw'); + expect(Array.isArray(scene.elements)).toBe(true); + expect(scene.elements.length).toBeGreaterThan(3); + + const svg = fs.readFileSync(path.join(dir, 'out', 'flow.svg'), 'utf-8'); + expect(svg).toMatch(/ { + const dir = setupDir('diagram-quality-'); + try { + const result = await runSkillTest({ + prompt: basePrompt( + dir, + 'how gstack renders diagrams in PDFs: markdown containing mermaid fences goes through a pre-pass that extracts the fences, renders them in a browse daemon tab using an offline bundle, substitutes the SVG back in, inlines local images, and prints via Chromium. 
Failures become visible diagnostic blocks.', + ), + workingDirectory: dir, + maxTurns: 25, + allowedTools: ['Bash', 'Read', 'Write'], + timeout: 240_000, + testName: 'diagram-authoring-quality', + runId, + }); + logCost('diagram authoring quality', result); + expect(result.exitReason).toBe('success'); + + const mmd = fs.readFileSync(path.join(dir, 'out', 'flow.mmd'), 'utf-8'); + const svg = fs.readFileSync(path.join(dir, 'out', 'flow.svg'), 'utf-8'); + expect(svg).toMatch(/( + `You are judging the quality of an agent-authored mermaid diagram. + +THE ASK: a diagram of gstack's PDF diagram-rendering flow — mermaid fences are +extracted by a pre-pass, rendered in a browse tab via an offline bundle, +substituted back as SVG, images inlined, printed by Chromium, with render +failures becoming visible diagnostic blocks. + +THE AUTHORED MERMAID: +\`\`\`mermaid +${mmd} +\`\`\` + +Score 1-10 on: faithfulness to the ask (are the named stages present and +correctly ordered?), label quality (short node labels, detail on edges), +and readable size (5-15 nodes, not a wall). A diagram that misses the +failure/diagnostic path entirely caps at 5 — that path is an explicitly +named requirement, so omitting it must fail the run. 
+ +Respond with JSON: {"score": N, "reasoning": "..."}`, + ); + // eslint-disable-next-line no-console + console.log(`[diagram-quality] score=${verdict.score} — ${verdict.reasoning}`); + expect(verdict.score).toBeGreaterThanOrEqual(6); + } finally { + try { fs.rmSync(dir, { recursive: true, force: true }); } catch { /* ignore */ } + } + }, 300_000); +}); From c7ae63201ab193a7dc7fb7e0d81238645111ffac Mon Sep 17 00:00:00 2001 From: Garry Tan Date: Sun, 14 Jun 2026 11:40:57 -0700 Subject: [PATCH 4/4] v1.58.1.0 feat: hermetic local E2E + Conductor prose AskUserQuestion (#2004) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * feat: add shared call-time isConductor() helper Single source of truth for Conductor host detection in TS consumers (CONDUCTOR_WORKSPACE_PATH / CONDUCTOR_PORT). Reads the passed env at call time, not a module-load snapshot, so unit tests can pin the env inline without Bun --preload (esm-hoist-breaks-env-pin-bootstrap). Co-Authored-By: Claude Fable 5 * test: harden question-preference-hook harness against ambient Conductor env runHook copied all of process.env into the hook subprocess, so running the suite inside Conductor (CONDUCTOR_WORKSPACE_PATH/PORT set) would leak those markers. Strip them so the existing cases deterministically characterize NON-Conductor behavior before the Conductor branch lands. Baseline: 15 pass. Co-Authored-By: Claude Fable 5 * feat: PreToolUse hook denies AskUserQuestion in Conductor, redirects to prose Conductor disables native AskUserQuestion and routes through a flaky MCP variant that returns '[Tool result missing due to internal error]'. The hook now denies any AUQ call in a Conductor session and instructs the model to render a prose decision brief instead (transport avoidance, not preference enforcement) — firing for one-way doors too, with a typed-confirmation requirement for destructive paths. 
Precedence: never-ask auto-decide still wins (user already settled those); Conductor prose is the fallback for everything else; non-Conductor behavior is byte-for-byte unchanged. Restructured the per-question loop to compute eligibility without early-returning so the Conductor branch can run as the fallback while preserving memoryContext on every exit. Co-Authored-By: Claude Fable 5 * feat: Conductor renders AskUserQuestion decisions as prose by default In Conductor, native AskUserQuestion is disabled and the MCP variant is flaky, so skills now render every decision as a plain-text prose brief the user answers by typing a letter — proactively, not as a failure reaction. - Preamble emits CONDUCTOR_SESSION, gated on != headless so eval/CI inside Conductor still BLOCKs instead of rendering prose to nobody. - AskUserQuestion Format gains a Conductor-default-prose rule (auto-decide preferences still apply first; prose decisions log via gstack-question-log since PostToolUse never fires), a one-way/destructive typed-confirmation rule, and a typed-reply continuation protocol for split chains. - Regenerated all SKILL.md + ship golden fixtures; bumped affected carve skeleton caps to absorb the always-loaded additions. Co-Authored-By: Claude Fable 5 * feat: deploy the Conductor AskUserQuestion hook (setup + upgrade migration) The PreToolUse hook only delivers its Conductor-prose guarantee if it's installed, but setup skips hook registration in non-interactive (conductor/CI) setups. Two fixes so layer 3 actually deploys: - setup: treat a Conductor workspace as an implicit opt-in for the PreToolUse hook on the silent fall-through (never overriding an explicit opt-out). - migration v1.58.0.0: re-register the hook for existing Conductor installs on /gstack-upgrade, idempotent and respecting plan_tune_hooks=no. 
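
The Conductor detection these commits rely on (the CONDUCTOR_WORKSPACE_PATH / CONDUCTOR_PORT markers, read from the passed env at call time) can be sketched roughly as below. This is a hedged illustration, not the contents of lib/is-conductor.ts: the name `isConductorSketch`, the `EnvLike` type, and the "any non-empty value counts" rule are assumptions.

```typescript
// Hypothetical sketch; the real helper lives in lib/is-conductor.ts and may differ.
type EnvLike = Record<string, string | undefined>;

// Reads the env object passed in at call time, so a unit test can pin the
// env inline instead of depending on a module-load snapshot of process.env.
function isConductorSketch(env: EnvLike): boolean {
  // Assumption: either marker set to a non-empty value means "inside Conductor".
  return Boolean(env.CONDUCTOR_WORKSPACE_PATH) || Boolean(env.CONDUCTOR_PORT);
}
```

Taking the env as a parameter is what lets a hook harness strip the ambient markers and still exercise both branches deterministically.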
Co-Authored-By: Claude Fable 5 * test: E2E for Conductor prose + fix auto-decide-preserved GSTACK_HOME bug - New skill-e2e-conductor-prose (periodic): Conductor env + plan-eng-review surfaces a prose decision brief, not a silent skip. Header documents this is end-to-end behavior coverage; the deterministic Conductor guard is the question-preference-hook unit test (the PTY harness can't register the MCP variant — Codex #10). - Fix the pre-existing bug in auto-decide-preserved: it seeded the never-ask preference under GSTACK_HOME=tmpHome but never passed GSTACK_HOME into the PTY run, so the spawned claude read the real ~/.gstack and the preference was inert (Codex #9). Now passes GSTACK_HOME + CONDUCTOR_WORKSPACE_PATH to prove auto-decide still wins over the Conductor prose redirect. - Register both in touchfiles (periodic tier). Co-Authored-By: Claude Fable 5 * v1.58.0.0 feat: Conductor renders AskUserQuestion decisions as prose Co-Authored-By: Claude Fable 5 * test: strip ambient Conductor env in memory-cache-injection hook harness Same dev-in-Conductor leak fixed for question-preference-hook: this suite's runHook copies process.env, so running it inside Conductor flipped the defer-path memoryContext assertions into the [conductor] prose deny. Strip CONDUCTOR_* so the cases characterize non-Conductor behavior. (CI is headless, so this only bit local Conductor runs.) Co-Authored-By: Claude Fable 5 * feat: gstack-detach — run agent eval/bench jobs in their own session Long agent-run jobs (30-60 min evals, benchmarks) die when the harness sends SIGTERM to a background task's process group on turn boundaries / monitor stops / interruptions (observed: 'script test:gate terminated by signal SIGTERM'). gstack-detach runs the command in a fresh session (python3 os.setsid, or setsid on Linux, nohup fallback) so a group SIGTERM can't reach it, and wraps it in caffeinate -i on macOS so idle-sleep can't kill it either. Returns immediately; caller polls the logfile. 
Secrets stay in env, never argv. The guard test pins the contract: the command runs in a different process group than the caller and outlives the launching shell. Co-Authored-By: Claude Fable 5 * feat: eval:bg* scripts — detached eval runs for agents Agent-facing convenience scripts that launch the eval suites through gstack-detach so a harness SIGTERM can't kill a long run. eval:bg (diff-based), eval:bg:all, eval:bg:gate, eval:bg:periodic — each returns immediately and streams to /tmp/gstack-evals.log for polling. The plain test:evals / test:e2e scripts stay foreground for humans. Co-Authored-By: Claude Fable 5 * docs: CLAUDE.md — agents must run long evals via gstack-detach Codifies the detached-execution default: agent-launched eval/benchmark runs go through bin/gstack-detach (or the eval:bg* scripts) so a harness SIGTERM or macOS idle-sleep can't kill a 30-60 min run, then poll the log with a death-aware watcher. Humans keep foreground scripts. Co-Authored-By: Claude Fable 5 * feat: harden gstack-detach against all four eval-infra killers The basic bash detach fixed SIGTERM but a real run on a shared dev box hit three more killers: cross-worktree API saturation (15-way concurrency x a sibling worktree mass-timed-out the suite), a silent hang (periodic bun died with no exit marker), and shared-/tmp log contamination (a concurrent worktree's agent output bled into the log). Rewrite as a portable python3 tool that bakes in all four fixes: - fork + setsid: SIGTERM-proof (own session, survives harness polite-quit) - caffeinate -i on macOS: no idle-sleep death - --lock NAME (fcntl, machine-wide): concurrent worktrees SERIALIZE instead of saturating the shared model API - run-scoped default log (~/.gstack-dev/eval-runs/ ###' sentinel on every terminal path: no silent hang, finished-vs-died always detectable Guard test pins all four: detached pgid differs + outlives launcher, run-scoped log path, watchdog EXIT=timeout, and lock serialization (second run WAITS). 
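
The finished-vs-died check that a poller runs against the log can be sketched as below. It is a sketch under stated assumptions: the sentinel line is taken to have the shape `### gstack-detach EXIT=<value> ###` with a single whitespace-free value (for example `timeout` from the watchdog), the authoritative format is whatever bin/gstack-detach writes, and `readExitSentinel` is a hypothetical name.

```typescript
// Hypothetical helper; the sentinel shape is an assumption (see the lead-in).
const SENTINEL = /^### gstack-detach EXIT=(\S+) ###\s*$/m;

type RunState =
  | { done: false }
  | { done: true; exit: string };

// Returns done=false while the log has no sentinel yet, so a poller keeps
// waiting instead of treating a quiet log as success.
function readExitSentinel(logText: string): RunState {
  const match = SENTINEL.exec(logText);
  return match ? { done: true, exit: String(match[1]) } : { done: false };
}
```

A watcher would call this on each poll of the logfile and stop only once `done` is true, whatever the exit value turns out to be.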
Co-Authored-By: Claude Fable 5 * feat: eval:bg* use run-scoped logs + machine lock + watchdog Drop the shared /tmp/gstack-evals.log path (the cross-worktree collision that contaminated a live run) for gstack-detach's run-scoped default, and add the machine-wide gstack-evals lock (concurrent worktrees serialize, no API saturation) plus per-tier watchdog timeouts (60/90/120 min). Each eval:bg* prints its run-scoped log path to poll. Co-Authored-By: Claude Fable 5 * docs: wire detached-eval guidance into /ship + correct CLAUDE.md flags - /ship eval step (sections/tests.md): long eval suites launch via gstack-detach (own session, machine lock, EXIT sentinel) so a turn boundary can't kill a 30+ min run mid-ship — the exact failure observed during this branch's ship. - CLAUDE.md: correct the now-stale /tmp reference; document the --lock (serialize worktrees, no API saturation), --timeout watchdog, run-scoped log, and the guaranteed EXIT sentinel the poller breaks on. Co-Authored-By: Claude Fable 5 * refactor: extract pure promotedEnv() from conductor-env-shim Single source of truth for GSTACK_* key promotion semantics. The ambient promoteConductorEnv() becomes a wrapper; behavior-preserving. Needed by the hermetic env builder which must not mutate process.env. Co-Authored-By: Claude Fable 5 * feat: hermetic child-env builder for E2E runners Allowlist scrub (basics/network/named-auth kept; CONDUCTOR_*, CLAUDE_*, GSTACK_*, MCP_*, GBRAIN_*, operator credentials dropped), per-runner extraAllow, overrides merge last, EVALS_HERMETIC=0 byte-identical escape hatch read at call time (ESM-hoist safe). Sync memoized singleton temp dirs (/.claude keeps the extractPlanFilePath contract), seeded .claude.json for non-interactive first run, pid-aware GC of crashed runs. 19 free unit tests. 
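
The allowlist scrub described above can be illustrated with a minimal sketch. Everything named here is hypothetical (`sketchChildEnv`, `BASE_ALLOW`, `SketchOptions`, and the trailing-`*` prefix syntax for `extraAllow`); the real builder in test/helpers/hermetic-env.ts may use a different allowlist and signature, and also manages temp dirs, which this sketch leaves out.

```typescript
// Hypothetical sketch; not the real test/helpers/hermetic-env.ts API.
type Env = Record<string, string | undefined>;

// Assumed allowlist (process basics, proxy vars, one named auth key).
const BASE_ALLOW: readonly string[] = [
  'PATH', 'HOME', 'LANG', 'TERM', 'TMPDIR',
  'HTTP_PROXY', 'HTTPS_PROXY', 'NO_PROXY',
  'ANTHROPIC_API_KEY',
];

interface SketchOptions {
  extraAllow?: readonly string[];      // exact names, or prefixes written as 'CODEX_*'
  overrides?: Record<string, string>;  // merged last, so a test can re-add a variable
}

function matches(key: string, pattern: string): boolean {
  return pattern.endsWith('*')
    ? key.startsWith(pattern.slice(0, -1))
    : key === pattern;
}

function sketchChildEnv(source: Env, opts: SketchOptions = {}): Record<string, string> {
  const allow = [...BASE_ALLOW, ...(opts.extraAllow ?? [])];
  // The escape hatch is read from the env passed at call time, not at module load.
  const hermetic = source.EVALS_HERMETIC !== '0';
  const out: Record<string, string> = {};
  for (const [key, value] of Object.entries(source)) {
    if (value === undefined) continue;
    // Hermetic mode keeps only allowlisted keys; legacy mode keeps everything.
    if (!hermetic || allow.some((p) => matches(key, p))) out[key] = value;
  }
  return { ...out, ...(opts.overrides ?? {}) };
}
```

Because anything not on the allowlist is dropped, a new operator variable is sealed out by default instead of needing a new deny rule, and a runner passes its own `source` so the builder never mutates `process.env`.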
Co-Authored-By: Claude Fable 5 * feat: session-runner spawns hermetic children + isolation canaries claude -p children now get the allowlist-scrubbed env and a gated --strict-mcp-config (EVALS_HERMETIC=0 restores operator env AND args). Two gate-tier canaries make the clean room falsifiable: hermetic-canary asserts env redirect + scrub + zero MCP servers + nonzero API-key cost from the Bash tool_result (never model prose); hermetic-sentinel plants a poisoned operator config (user CLAUDE.md + MCP server) and proves the child cannot see it. Empirically verified on claude 2.1.175: print mode needs no seed config (the seed serves the PTY path); the child CLI sets CLAUDECODE for its own tools, so that scrub is pinned in unit tests, not E2E. hermetic-env.ts joins GLOBAL_TOUCHFILES. Co-Authored-By: Claude Fable 5 * feat: PTY runner spawns hermetic claude sessions launchClaudePty children get the allowlist-scrubbed env, a gated --strict-mcp-config, and the session exposes hermeticConfigDir for forensics (hermetic plan files live under /plans/ and still match extractPlanFilePath via the /.claude dir-name contract). Seeded trust state covers repo-cwd sessions; the 15s trust-watcher stays as fallback. Verified foreground via the plan-mode-no-op gate test. Co-Authored-By: Claude Fable 5 * feat: codex/gemini runners spawn hermetic children Same allowlist scrub as the claude runners, with each provider's auth surface re-admitted via extraAllow (codex: OPENAI_API_KEY/CODEX_* plus its tempHome .codex copy; gemini: GEMINI_*/GOOGLE_* with real HOME for ~/.gemini auth). The gemini spawn previously inherited the full operator env with no env property at all. Co-Authored-By: Claude Fable 5 * feat: agent-sdk-runner spawns hermetic children via complete Options.env The historical 'env: breaks SDK auth' failure was partial-env replacement: Options.env replaces the child's entire environment, so objects lacking ANTHROPIC_API_KEY killed auth. 
Passing the complete hermetic env (key + PATH + redirected CLAUDE_CONFIG_DIR/GSTACK_HOME) works — validated live via query() with a Bash tool call (success, real cost, Conductor vars scrubbed). Per-test opts.env merges last; ambient key mutation still works because the builder reads process.env at call time. Co-Authored-By: Claude Fable 5 * test: static tripwire pins hermetic wiring in all five runners Free-tier invariants: every runner builds child env via hermeticChildEnv, no raw ...process.env spread at any spawn site, --strict-mcp-config gated on isHermeticEnabled in both claude runners, and no test callsite passes the operator env into a runner's override parameter (scoped to runner calls — unit tests spawning gstack bin scripts directly are exempt). Mirrors the terminal-agent-pid-identity / server-embedder-terminal-port tripwire idiom. Co-Authored-By: Claude Fable 5 * test: refresh codex/factory ship goldens with detached-eval block a38089aa added the gstack-detach guidance to the ship template and updated the claude golden; the codex and factory goldens missed the same 16-line block. Regenerated via bun run gen:skill-docs. Co-Authored-By: Claude Fable 5 * docs: hermetic local E2E is the default; retire stale SDK env warning CLAUDE.md now documents the hermetic clean room (allowlist scrub, fresh seeded CLAUDE_CONFIG_DIR, temp GSTACK_HOME, --strict-mcp-config), EVALS_HERMETIC=0 as the debug escape hatch, and replaces the 'never pass env: to runAgentSdkTest' rule with the verified mechanism (partial-env replacement was the failure; complete env is safe). Co-Authored-By: Claude Fable 5 * fix: operational-learning fixture copies lib/jsonl-store.ts with the bin gstack-learnings-log imports $SCRIPT_DIR/../lib/jsonl-store.ts (hasInjection, v1.57.5.0) — copying only the bin scripts into the temp fixture broke the script with exit 1 since then. 
Latent because diff-based selection rarely runs this test; surfaced when hermetic-env.ts joined GLOBAL_TOUCHFILES and selected everything. Reproduced outside the hermetic env to confirm blame. Co-Authored-By: Claude Fable 5 * fix: ios-qa daemon scenarios use unique pidfiles under --concurrent All scenarios shared join(workDir, 'daemon.pid') through a module-scope workDir binding that beforeEach reassigns mid-flight under bun --concurrent. First daemon claims; siblings get already_running against the test process's own always-alive pid and fail in milliseconds — the failure mode seen at 15-way gate concurrency. Per-claim unique pidfiles keep the single-instance semantics under test. Co-Authored-By: Claude Fable 5 * fix: workflow judge re-appends body-carved sections after the marker slice runWorkflowJudge appended sections/*.md before slicing startMarker..endMarker. That handles skills that moved their MARKERS into sections (plan-eng, plan-design) but not document-release, which keeps its markers in the skeleton and carved the workflow BODY (Steps 2-9 -> sections/release-body.md) AFTER the endMarker — so the slice dropped it and the judge scored completeness 2 ('Steps 2-9 are in an external file'). Now any carved section the marker window excluded is re-appended, so the judge sees the full workflow the agent executes. document-release: completeness 2->5, clarity 3->4. ship/plan-ceo/plan-eng/plan-design judges unchanged (their section content is already inside the slice, so the head-dedup skips re-append). Pre-existing since the v1.57.0.0 carve (#1907); surfaced now because hermetic-env.ts is a global touchfile that selects every llm-judge test. Co-Authored-By: Claude Fable 5 * harden: hermetic temp-dir GC grace window + half-seed cleanup Codex adversarial review (ship) flagged two temp-dir lifecycle edges: - GC deleted any dead-pid dir; PID reuse could delete a freshly-created dir whose original pid exited and was recycled to a live process. 
Now requires BOTH a dead pid AND mtime older than a 1h floor. - A seed-write failure after mkdir left an unseeded dir named with our live pid that this process's GC skips, leaking until exit. Now the partial dir is torn down before the (still loud) rethrow. Two findings left as-is by design: HOME stays allowlisted (CLAUDE_CONFIG_DIR wins for claude; codex/gemini need ~/.codex|~/.gemini auth; FS sandbox is TODOS.md:454 scope; the hermetic-sentinel canary proves config isolation), and PTY extraArgs --mcp-config is a deliberate caller opt-in like env overrides. Co-Authored-By: Claude Opus 4.8 (1M context) * docs: document hermetic-by-default E2E + eval:bg detached runs in CONTRIBUTING The Testing & evals section now tells contributors that local E2E runners spawn children through a sealed clean room (allowlist-scrubbed env, seeded CLAUDE_CONFIG_DIR, temp GSTACK_HOME, --strict-mcp-config) so local signal matches CI, with EVALS_HERMETIC=0 as the escape hatch. The eval-tools list gains the eval:bg* detached-run scripts (gstack-detach: SIGTERM-proof, caffeinate-wrapped, machine-locked, run-scoped logs, EXIT= sentinel). Co-Authored-By: Claude Opus 4.8 (1M context) * chore: sync package.json to 1.58.1.0 The merge took main's package.json (1.58.0.0); gstack-version-bump repair fixed the working tree but the change was left uncommitted. Without this the committed tree disagrees with VERSION and CI's version-match test fails. Co-Authored-By: Claude Opus 4.8 (1M context) * docs: regenerate diagram SKILL.md with Conductor prose preamble The diagram skill (new from main) was missing the Conductor-session prose AskUserQuestion blocks that gen-skill-docs propagates to every SKILL.md. Pure generated output; reproduced by bun run gen:skill-docs. 
Co-Authored-By: Claude Opus 4.8 (1M context) --------- Co-authored-by: Claude Fable 5 --- CHANGELOG.md | 102 +++++++ CLAUDE.md | 53 +++- CONTRIBUTING.md | 31 ++ SKILL.md | 7 + VERSION | 2 +- autoplan/SKILL.md | 19 +- benchmark-models/SKILL.md | 7 + benchmark/SKILL.md | 7 + bin/gstack-detach | 167 +++++++++++ browse/SKILL.md | 7 + canary/SKILL.md | 19 +- codex/SKILL.md | 19 +- context-restore/SKILL.md | 19 +- context-save/SKILL.md | 19 +- cso/SKILL.md | 19 +- design-consultation/SKILL.md | 19 +- design-html/SKILL.md | 19 +- design-review/SKILL.md | 19 +- design-shotgun/SKILL.md | 19 +- devex-review/SKILL.md | 19 +- diagram/SKILL.md | 19 +- document-generate/SKILL.md | 19 +- document-release/SKILL.md | 19 +- gstack-upgrade/migrations/v1.58.0.0.sh | 63 ++++ health/SKILL.md | 19 +- .../claude/hooks/question-preference-hook.ts | 87 +++--- investigate/SKILL.md | 19 +- ios-clean/SKILL.md | 19 +- ios-design-review/SKILL.md | 19 +- ios-fix/SKILL.md | 19 +- ios-qa/SKILL.md | 19 +- ios-sync/SKILL.md | 19 +- land-and-deploy/SKILL.md | 19 +- landing-report/SKILL.md | 19 +- learn/SKILL.md | 19 +- lib/conductor-env-shim.ts | 26 +- lib/is-conductor.ts | 19 ++ make-pdf/SKILL.md | 7 + office-hours/SKILL.md | 19 +- open-gstack-browser/SKILL.md | 19 +- package.json | 6 +- pair-agent/SKILL.md | 19 +- plan-ceo-review/SKILL.md | 19 +- plan-design-review/SKILL.md | 19 +- plan-devex-review/SKILL.md | 19 +- plan-eng-review/SKILL.md | 19 +- plan-tune/SKILL.md | 19 +- qa-only/SKILL.md | 19 +- qa/SKILL.md | 19 +- retro/SKILL.md | 19 +- review/SKILL.md | 19 +- scrape/SKILL.md | 19 +- .../preamble/generate-ask-user-format.ts | 12 +- .../preamble/generate-preamble-bash.ts | 7 + setup | 20 +- setup-browser-cookies/SKILL.md | 7 + setup-deploy/SKILL.md | 19 +- setup-gbrain/SKILL.md | 19 +- ship/SKILL.md | 19 +- ship/sections/tests.md | 16 + ship/sections/tests.md.tmpl | 16 + skillify/SKILL.md | 19 +- spec/SKILL.md | 38 ++- sync-gbrain/SKILL.md | 19 +- test/agent-sdk-runner.test.ts | 8 +- 
test/fixtures/golden/claude-ship-SKILL.md | 19 +- test/fixtures/golden/codex-ship-SKILL.md | 35 ++- test/fixtures/golden/factory-ship-SKILL.md | 35 ++- test/gstack-detach.test.ts | 96 ++++++ test/helpers/agent-sdk-runner.ts | 20 +- test/helpers/carve-guards.ts | 20 +- test/helpers/claude-pty-runner.ts | 20 +- test/helpers/codex-session-runner.ts | 14 +- test/helpers/gemini-session-runner.ts | 8 +- test/helpers/hermetic-env.test.ts | 269 +++++++++++++++++ test/helpers/hermetic-env.ts | 276 ++++++++++++++++++ test/helpers/session-runner.ts | 10 +- test/helpers/touchfiles.ts | 19 +- test/hermetic-wiring.test.ts | 113 +++++++ test/is-conductor.test.ts | 50 ++++ test/memory-cache-injection.test.ts | 5 + test/preamble-compose.test.ts | 10 + test/question-preference-hook.test.ts | 112 ++++++- test/resolver-ask-user-format.test.ts | 21 +- test/skill-e2e-auto-decide-preserved.test.ts | 8 + test/skill-e2e-conductor-prose.test.ts | 69 +++++ test/skill-e2e-hermetic-canary.test.ts | 190 ++++++++++++ test/skill-e2e-ios.test.ts | 20 +- test/skill-llm-eval.test.ts | 16 +- 89 files changed, 2747 insertions(+), 221 deletions(-) create mode 100755 bin/gstack-detach create mode 100755 gstack-upgrade/migrations/v1.58.0.0.sh create mode 100644 lib/is-conductor.ts create mode 100644 test/gstack-detach.test.ts create mode 100644 test/helpers/hermetic-env.test.ts create mode 100644 test/helpers/hermetic-env.ts create mode 100644 test/hermetic-wiring.test.ts create mode 100644 test/is-conductor.test.ts create mode 100644 test/skill-e2e-conductor-prose.test.ts create mode 100644 test/skill-e2e-hermetic-canary.test.ts diff --git a/CHANGELOG.md b/CHANGELOG.md index 2dd4b64a8..034ab7bcf 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,5 +1,107 @@ # Changelog +## [1.58.1.0] - 2026-06-14 + +## **Local evals stop lying. Spawned `claude` test children run in a sealed clean room,** +## **and in Conductor every decision is a plain-text brief you answer with a letter.** + +Two things shipped here. 
First, the local E2E harness is now hermetic by default: +every spawned agent (claude -p, the real-PTY plan-mode runner, the Agent SDK +runner, plus the codex and gemini runners) gets an allowlist-scrubbed environment, +a fresh seeded `CLAUDE_CONFIG_DIR`, a temp `GSTACK_HOME`, and `--strict-mcp-config`. +Before this, a dev machine leaked the operator's `~/.claude` config, MCP servers +(gbrain, Conductor), skills, `~/.gstack` decision logs, and `CONDUCTOR_*`/`CLAUDECODE` +env into every child, so local eval results disagreed with CI for reasons that had +nothing to do with the code under test. Now local signal matches CI. Set +`EVALS_HERMETIC=0` to debug against real operator state. + +Second, in a Conductor session gstack no longer fights Conductor's flaky +AskUserQuestion tool. It detects the session and renders every decision as a prose +brief, a labeled question with a recommendation, per-option completeness scores, and +"reply with a letter," enforced by a PreToolUse hook that denies the tool and +redirects to prose. Destructive confirmations demand an explicit typed answer. + +Agents that launch long eval runs get `gstack-detach`: a SIGTERM-proof, idle-sleep-proof +wrapper (fresh session + `caffeinate`) with a machine-wide lock so concurrent +worktrees serialize instead of saturating the model API, run-scoped logs, and a +guaranteed `EXIT=` sentinel so a poller never mistakes silence for success. + +### The numbers that matter + +Measured against the gate eval suite on a contaminated dev box (gbrain MCP up, live +Conductor session, sibling worktrees). Reproduce: `bun test` (free unit + wiring +tripwire) and `EVALS=1 EVALS_TIER=gate bun test test/skill-e2e-hermetic-canary.test.ts`. 
+
+| Metric | Before | After | Δ |
+|--------|--------|-------|---|
+| Spawned-child env | full operator `process.env` | allowlist-scrubbed | sealed |
+| Runners hermeticized | 0 of 5 | 5 of 5 | +5 |
+| Operator MCP servers visible to child | all (gbrain, Conductor) | 0 (`--strict-mcp-config`) | isolated |
+| Config isolation proof | none | poisoned-operator sentinel canary | falsifiable |
+| Long eval runs surviving a turn-boundary SIGTERM | no | yes (`gstack-detach`) | survives |
+
+The clean room is falsifiable, not asserted: a `hermetic-sentinel` gate canary
+plants a poisoned operator config (a user `CLAUDE.md` + an MCP server) and fails if
+the child can see any of it, and a free static tripwire fails CI if any runner
+reverts to a raw `process.env` spread.
+
+### What this means for contributors
+
+Run evals locally and trust the result. You no longer have to push to CI to find
+out whether a failure was real or just your machine bleeding context into the agent.
+Three latent bugs the old harness hid surfaced the moment the suite ran clean and
+are fixed: a coverage-judge that scored carved skills against half a document, an
+ios-qa daemon test that collided on a shared pidfile under concurrency, and an
+operational-learning fixture missing a lib it imports. Start a run with
+`bun run eval:bg:gate`; flip `EVALS_HERMETIC=0` only when you deliberately want your
+real `~/.claude` in the loop.
+
+### Itemized changes
+
+#### Added
+- **Hermetic E2E environment** (`test/helpers/hermetic-env.ts`): allowlist env
+  builder (process basics, network/proxy vars, named `ANTHROPIC_*` auth, per-runner
+  `extraAllow`), pure `promotedEnv()` shared with `lib/conductor-env-shim.ts`, a
+  sync-memoized singleton temp dir (`/.claude` keeps the plan-file path
+  contract), a seeded `.claude.json` for non-interactive first run, and pid-aware GC
+  of crashed runs. Default-on; `EVALS_HERMETIC=0` restores the legacy env AND drops
+  `--strict-mcp-config`.
+- **Two gate-tier isolation canaries** (`test/skill-e2e-hermetic-canary.test.ts`):
+  `hermetic-canary` asserts env redirect + scrub + zero MCP servers + nonzero
+  API-key cost from the Bash tool_result (not model prose); `hermetic-sentinel`
+  proves the child cannot see a planted poisoned operator config.
+- **Static wiring tripwire** (`test/hermetic-wiring.test.ts`): free-tier invariants
+  that fail CI if any of the five runners drops `hermeticChildEnv()`, the gated
+  `--strict-mcp-config`, or leaks `process.env` through a callsite override.
+- **`gstack-detach`** + `eval:bg` / `eval:bg:all` / `eval:bg:gate` / `eval:bg:periodic`
+  scripts: detached, SIGTERM-proof, `caffeinate`-wrapped eval runs with a machine-wide
+  lock, per-run logs under `~/.gstack-dev/eval-runs/`, a watchdog, and an `EXIT=`
+  sentinel.
+- **Conductor prose AskUserQuestion**: when a Conductor session is detected, every
+  decision renders as a prose brief (labeled question, recommendation, per-option
+  completeness, reply-with-a-letter), enforced by a PreToolUse hook that denies the
+  tool and redirects. Auto-decide preferences still apply first; destructive
+  confirmations require an explicit typed answer. Installed for Conductor even in
+  non-interactive setup, with an upgrade migration for existing installs.
+
+#### Changed
+- All five E2E runners (`session-runner`, `claude-pty-runner`, `agent-sdk-runner`,
+  `codex-session-runner`, `gemini-session-runner`) spawn children through
+  `hermeticChildEnv()`. The Agent SDK runner now receives a COMPLETE hermetic env
+  via `Options.env` (the failure behind the old "never pass env: to the SDK" rule
+  was partial-env replacement; a complete env is safe).
+- `hermetic-env.ts` is a global touchfile, so any change to it selects every E2E +
+  judge test.
+- CLAUDE.md documents hermetic-by-default local evals and retires the stale SDK env
+  warning.
+ +#### Fixed +- The workflow LLM-judge now re-appends body-carved `sections/*.md` after the marker + slice, so carved skills (document-release) are judged on the full workflow the + agent executes instead of a half-document. +- ios-qa daemon scenarios use unique pidfiles, fixing `already_running` collisions + under `bun test --concurrent`. + ## [1.58.0.0] - 2026-06-12 ## **Your documents grow diagrams. Mermaid and excalidraw fences render as real pictures,** diff --git a/CLAUDE.md b/CLAUDE.md index 03384ae79..984844902 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -31,11 +31,26 @@ use Codex's own auth from `~/.codex/` config — no `OPENAI_API_KEY` env var nee `lib/conductor-env-shim.ts`) promotes `GSTACK_ANTHROPIC_API_KEY` / `GSTACK_OPENAI_API_KEY` to their canonical names inside gstack's TS binaries. Tests run through gstack entrypoints inherit this promotion automatically. -Don't echo the key value to stdout, logs, or shell history. When passing to a -test's Agent SDK, do NOT pass `env: {...}` to `runAgentSdkTest` — the SDK's -auth pipeline doesn't pick up the key the same way when env is supplied as an -object (confirmed failure mode). Mutate `process.env.ANTHROPIC_API_KEY` -ambiently before the call and restore in `finally`. +Don't echo the key value to stdout, logs, or shell history. The historical +"never pass `env:` to `runAgentSdkTest`" rule is retired: the failure was +partial-env replacement (the SDK's `Options.env` REPLACES the child's entire +environment, so an object without the key broke auth). The runner now always +passes a COMPLETE hermetic env with per-test `env:` merged last, so per-test +overrides are safe; ambient `process.env.ANTHROPIC_API_KEY` mutation also +still works (the env builder reads process.env at call time). 
+ +**Hermetic local E2E (default).** Every E2E runner (claude -p, PTY, Agent +SDK, codex, gemini) spawns children through `test/helpers/hermetic-env.ts`: +allowlist-scrubbed env (operator `CONDUCTOR_*`, `CLAUDE_*`, `GSTACK_*`, +`MCP_*`, `GBRAIN_*`, and credentials like `GH_TOKEN` never reach children), +a fresh seeded `CLAUDE_CONFIG_DIR` (no operator `~/.claude` CLAUDE.md / +MCP servers / skills), a temp `GSTACK_HOME`, and `--strict-mcp-config`. +Local eval signal matches CI. Debug against real operator state with +`EVALS_HERMETIC=0` (restores the legacy env AND drops the strict-MCP flag). +Per-test `env:` overrides merge last, so deliberate contamination +(`CONDUCTOR_WORKSPACE_PATH`, per-test `GSTACK_HOME`) keeps working. Wiring +is pinned by `test/hermetic-wiring.test.ts` (static tripwire) and two +gate-tier canaries in `test/skill-e2e-hermetic-canary.test.ts`. E2E tests stream progress in real-time (tool-by-tool via `--output-format stream-json --verbose`). Results are persisted to `~/.gstack-dev/evals/` with auto-comparison @@ -828,6 +843,34 @@ them. Report progress at each check (which tests passed, which are running, any failures so far). The user wants to see the run complete, not a promise that you'll check later. +## Running evals as an agent: always detach (SIGTERM-proof) + +When **you (an agent/harness)** launch a long eval/benchmark run, run it through +`bin/gstack-detach` — NEVER as a plain backgrounded Bash task. A plain background +task lives in the harness's process group, so a SIGTERM ("polite quit") on a turn +boundary, a stopped Monitor, or an interruption kills the run mid-flight (observed: +`script "test:gate" was terminated by signal SIGTERM` ~40 min into a run). On macOS +the run can also die to idle-sleep. `gstack-detach` fixes both: a fresh session +(escapes the group SIGTERM) wrapped in `caffeinate -i` (blocks idle-sleep). 
+
+- Use the `eval:bg*` scripts (`eval:bg`, `eval:bg:all`, `eval:bg:gate`,
+  `eval:bg:periodic`) — they wrap the eval command in `gstack-detach` with the
+  machine-wide `gstack-evals` lock (concurrent worktrees serialize instead of
+  saturating the shared model API), a per-tier watchdog, and a **run-scoped** log
+  under `~/.gstack-dev/eval-runs/` (no shared-`/tmp` collision). Each prints its
+  log path. Or call `gstack-detach [--lock NAME] [--timeout SECS] [--label LBL] --
+  ` directly for any long agent job. Export `ANTHROPIC_API_KEY` first (never
+  pass keys in argv).
+- Then **poll the printed logfile** with a death-aware watcher: break on the
+  guaranteed `### gstack-detach EXIT= ###` sentinel (success AND failure are
+  both marked, so silence is never mistaken for success). The detached run survives
+  even if your watcher gets reaped, so re-checking the log always works.
+- Why the lock: a shared dev box with several Conductor worktrees will rate-limit
+  the model API if two eval suites run at once (15-way concurrency each), which
+  mass-times-out E2E tests. The lock makes the second run WAIT, not collide.
+- Humans running `bun run test:evals` foreground in their own terminal don't need
+  this — Ctrl-C is intended there. Detachment is for agent-launched runs only.
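
The polling step in the list above could be sketched as follows. This is a minimal illustration, not part of `gstack-detach`: `wait_for_sentinel` and its arguments are hypothetical. The sentinel text after `EXIT=` is not shown in the source, so the sketch matches only the `### gstack-detach EXIT=` prefix and prints whatever line it finds.

```shell
#!/usr/bin/env bash
# Hypothetical helper: poll a logfile until the gstack-detach sentinel appears.
# Only the sentinel prefix is matched; the rest of the line is printed as-is
# for the caller to interpret.

wait_for_sentinel() {
  local log="$1" max_polls="${2:-720}" interval="${3:-5}"
  local i line
  for ((i = 0; i < max_polls; i++)); do
    if [ -f "$log" ]; then
      # -F: fixed string, -m1: stop at the first matching line.
      line=$(grep -m1 -F '### gstack-detach EXIT=' "$log" || true)
      if [ -n "$line" ]; then
        printf '%s\n' "$line"
        return 0
      fi
    fi
    sleep "$interval"
  done
  # Bounded wait: running out of polls is reported as its own failure,
  # so a silent log is never treated as a successful run.
  echo "no sentinel in $log after $max_polls polls" >&2
  return 2
}
```

A caller would pass the log path printed by the `eval:bg*` script, for example `wait_for_sentinel "$LOG" 720 5`. Because the helper only reads the file, it can be re-run safely if an earlier watcher was reaped.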
+ ## E2E test fixtures: extract, don't copy **NEVER copy a full SKILL.md file into an E2E test fixture.** SKILL.md files are diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 5a56ef5d3..b75d4a898 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -176,6 +176,18 @@ EVALS=1 bun test test/skill-e2e-*.test.ts - Saves full NDJSON transcripts and failure JSON for debugging - Tests live in `test/skill-e2e-*.test.ts` (split by category), runner logic in `test/helpers/session-runner.ts` +**Hermetic by default.** Every E2E runner (claude -p, the real-PTY plan-mode +runner, the Agent SDK runner, plus the codex and gemini runners) spawns its child +through `test/helpers/hermetic-env.ts`: an allowlist-scrubbed environment, a fresh +seeded `CLAUDE_CONFIG_DIR`, a temp `GSTACK_HOME`, and `--strict-mcp-config`. Your +operator `~/.claude` config, MCP servers (gbrain, Conductor), skills, `~/.gstack` +decision logs, and `CONDUCTOR_*` env never leak into the child, so local eval +signal matches CI instead of disagreeing for reasons unrelated to the code under +test. Set `EVALS_HERMETIC=0` to debug against your real operator state (this also +drops `--strict-mcp-config`). The wiring is pinned by `test/hermetic-wiring.test.ts` +(a free static tripwire) and two gate-tier isolation canaries in +`test/skill-e2e-hermetic-canary.test.ts`. + ### E2E observability When E2E tests run, they produce machine-readable artifacts in `~/.gstack-dev/`: @@ -198,6 +210,25 @@ bun run eval:compare # compare two runs — shows per-test deltas + Take bun run eval:summary # aggregate stats + per-test efficiency averages across runs ``` +**Detached runs for agents and long suites.** When an agent (or you, for a run +you don't want to babysit) launches a long eval, use the `eval:bg*` scripts. 
They +wrap the eval command in `bin/gstack-detach`: a fresh session that escapes a +turn-boundary SIGTERM, a `caffeinate` wrapper that blocks idle-sleep, a machine-wide +`gstack-evals` lock so concurrent worktrees serialize instead of saturating the +model API, a run-scoped log under `~/.gstack-dev/eval-runs/`, a per-tier watchdog, +and a guaranteed `### gstack-detach EXIT= ###` sentinel so a poller never +mistakes silence for success. + +```bash +bun run eval:bg # detached test:evals (diff-based) +bun run eval:bg:all # detached test:evals:all +bun run eval:bg:gate # detached gate-tier suite +bun run eval:bg:periodic # detached periodic-tier suite +``` + +Each prints its log path. Humans running `bun run test:evals` foreground in their +own terminal don't need this — Ctrl-C is intended there. + **Eval comparison commentary:** `eval:compare` generates natural-language Takeaway sections interpreting what changed between runs — flagging regressions, noting improvements, calling out efficiency gains (fewer turns, faster, cheaper), and producing an overall summary. This is driven by `generateCommentary()` in `eval-store.ts`. Artifacts are never cleaned up — they accumulate in `~/.gstack-dev/` for post-mortem debugging and trend analysis. diff --git a/SKILL.md b/SKILL.md index 8711ae7f3..90774950e 100644 --- a/SKILL.md +++ b/SKILL.md @@ -48,6 +48,13 @@ echo "REPO_MODE: $REPO_MODE" _SESSION_KIND=$(~/.claude/skills/gstack/bin/gstack-session-kind 2>/dev/null || echo "interactive") case "$_SESSION_KIND" in spawned|headless|interactive) ;; *) _SESSION_KIND="interactive" ;; esac echo "SESSION_KIND: $_SESSION_KIND" +# Conductor host: AskUserQuestion is unreliable here (native disabled, MCP +# variant flaky), so skills render decisions as prose instead of calling the +# tool. Gated on !headless so an eval/CI run INSIDE Conductor (GSTACK_HEADLESS) +# still BLOCKs rather than rendering prose to nobody. 
+if [ "$_SESSION_KIND" != "headless" ] && { [ -n "${CONDUCTOR_WORKSPACE_PATH:-}" ] || [ -n "${CONDUCTOR_PORT:-}" ]; }; then + echo "CONDUCTOR_SESSION: true" +fi _LAKE_SEEN=$([ -f ~/.gstack/.completeness-intro-seen ] && echo "yes" || echo "no") echo "LAKE_INTRO: $_LAKE_SEEN" _TEL=$(~/.claude/skills/gstack/bin/gstack-config get telemetry 2>/dev/null || true) diff --git a/VERSION b/VERSION index 3a62339b5..eb4d8b4b5 100644 --- a/VERSION +++ b/VERSION @@ -1 +1 @@ -1.58.0.0 +1.58.1.0 \ No newline at end of file diff --git a/autoplan/SKILL.md b/autoplan/SKILL.md index de7174b03..49db38ff9 100644 --- a/autoplan/SKILL.md +++ b/autoplan/SKILL.md @@ -57,6 +57,13 @@ echo "REPO_MODE: $REPO_MODE" _SESSION_KIND=$(~/.claude/skills/gstack/bin/gstack-session-kind 2>/dev/null || echo "interactive") case "$_SESSION_KIND" in spawned|headless|interactive) ;; *) _SESSION_KIND="interactive" ;; esac echo "SESSION_KIND: $_SESSION_KIND" +# Conductor host: AskUserQuestion is unreliable here (native disabled, MCP +# variant flaky), so skills render decisions as prose instead of calling the +# tool. Gated on !headless so an eval/CI run INSIDE Conductor (GSTACK_HEADLESS) +# still BLOCKs rather than rendering prose to nobody. +if [ "$_SESSION_KIND" != "headless" ] && { [ -n "${CONDUCTOR_WORKSPACE_PATH:-}" ] || [ -n "${CONDUCTOR_PORT:-}" ]; }; then + echo "CONDUCTOR_SESSION: true" +fi _LAKE_SEEN=$([ -f ~/.gstack/.completeness-intro-seen ] && echo "yes" || echo "no") echo "LAKE_INTRO: $_LAKE_SEEN" _TEL=$(~/.claude/skills/gstack/bin/gstack-config get telemetry 2>/dev/null || true) @@ -306,7 +313,9 @@ AI orchestrator (e.g., OpenClaw). In spawned sessions: "AskUserQuestion" can resolve to two tools at runtime: the **host MCP variant** (e.g. `mcp__conductor__AskUserQuestion` — appears in your tool list when the host registers it) or the **native** Claude Code tool. -**Rule:** if any `mcp__*__AskUserQuestion` variant is in your tool list, prefer it. 
Hosts may disable native AUQ via `--disallowedTools AskUserQuestion` (Conductor does, by default) and route through their MCP variant; calling native there silently fails. Same questions/options shape; same decision-brief format applies. +**Conductor rule (read before the MCP rule):** if `CONDUCTOR_SESSION: true` was echoed by the preamble, do NOT call AskUserQuestion at all — neither native nor any `mcp__*__AskUserQuestion` variant. Render EVERY decision brief as the **prose form** below and STOP. This is proactive, not a reaction to a failure: Conductor disables native AUQ and its MCP variant is flaky (it returns `[Tool result missing due to internal error]`), so prose is the reliable path. **Auto-decide preferences still apply first:** if a `[plan-tune auto-decide]