Files
gstack/scripts/discover-skills.ts
T
Garry Tan 46c1fae7f1 v1.54.0.0 feat: carve /ship into skeleton + on-demand sections (-59% always-loaded) (#1806)
* feat(test): transcript-section-logger + ship-action fingerprint (T10)

Pure-analysis module over a SkillTestResult/NDJSON transcript:
- extractSectionReads(): which sections/*.md a run opened (post-carve check)
- extractShipActions(): observable action fingerprint (merge/test/bump/
  changelog/commit/push/pr) that works on the MONOLITH too, so a baseline
  captured before the carve can detect a sectioned-ship regression
- baseline read/write + compareShipActions() for baseline-first dogf(T10)

Baseline-first answers the Codex outside-voice critique that a logger in the
same PR as the carve is post-failure telemetry without a pre-carve reference.

11 unit tests, all green. Paid monolith baseline capture runs separately.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(pipeline): section discovery + generation machinery (T9)

- discover-skills.ts: discoverSectionTemplates() scans <skill>/sections/*.md.tmpl
- gen-skill-docs.ts: extract resolvePlaceholders + applyHostRewrites + buildContext
  as shared helpers (processTemplate and the new processSectionTemplate both call
  them, so a sanitization/rewrite fix can't miss sections) [C1]
- processSectionTemplate: body-fragment generation (no frontmatter/catalog/voice),
  parent-skill TemplateContext (skillName pinned to parent, not 'sections', so
  appliesTo gating + tier behave identically), per-host output routing
- --host all now fails the build on ANY host failure, not just claude, so a stale
  external-host output can't slip the freshness gate [Codex outside-voice #9]

Inert until a skill is carved (no sections/ dirs exist yet). Refactor is
output-neutral: gen:skill-docs --dry-run --host all reports 0 STALE.

5 discovery unit tests + 389 gen-skill-docs tests green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(setup): install sections/ for cherry-pick targets (claude + kiro) (T9)

Two install targets cherry-pick SKILL.md and would leave a carved skill's
sections/ behind, 404ing a runtime 'Read sections/<name>.md':
- link_claude_skill_dirs: link the sections/ subdir via _link_or_copy (windows
  gets a fresh copy on every ./setup)
- kiro per-skill loop: sed-rewrite + copy each sections/* so paths resolve under
  ~/.kiro, not ~/.codex/~/.claude

codex/factory/opencode link the whole generated dir, so sections ride free.
Addresses Codex outside-voice #4/#6 (runtime pathing landmine). Inert until a
skill is carved. Static-tripwire test + windows-fallback invariant green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(ship): gstack-version-bump CLI — tested idempotency classify + write (T9)

Hybrid CLI extraction (CM1): the deterministic core of ship Step 12 becomes a
tested CLI instead of bash prose the agent re-derives each run.
- classify: FRESH/ALREADY_BUMPED/DRIFT_STALE_PKG/DRIFT_UNEXPECTED from VERSION
  vs origin/<base>:VERSION vs package.json.version (pure reader)
- write: validated dual-write to VERSION + package.json (FRESH bump)
- repair: DRIFT_STALE_PKG sync, no re-bump
Bump-LEVEL choice + queue collision stay agent judgment; slot pick stays
bin/gstack-next-version. This removes the re-bump-a-shipped-branch footgun from
skippable prose into code that can't be skipped or misread.

15 tests (exhaustive state matrix + write/repair fs + real-git classify).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test(parity): sectioned-skill parity capability — guards the carve (T9)

Carved skills (skeleton + sections/*.md) need parity checks that see relocated
content, or moving a phrase into a section reads as 'lost':
- readSkillForParity(): union skeleton + all sections/*.md
- checkSkillParity sectioned mode: content checks against the union; minBytes/
  maxSizeRatio against union bytes (total behavior preserved); maxSkeletonBytes
  asserts the always-loaded skeleton actually shrank. Lowering minBytes to fit a
  small skeleton would otherwise make the size floor toothless [Codex #12].

Built + tested BEFORE the carve so ship's invariant can flip to sectioned in the
same commit it lands. Monolith path byte-identical (verified: pre-existing
investigate 1.053 ratio drift fails the same with this change stashed).

7 sectioned-parity tests + existing parity tests green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* refactor(ship): carve into skeleton + on-demand sections (Claude) (T9)

ship/SKILL.md drops 167KB → 68.7KB (~59% of the always-loaded skill) by moving
8 prose-heavy steps into ship/sections/*.md, read on demand:
tests, test-coverage, plan-completion, review-army, greptile, adversarial,
changelog, pr-body. Step 12's version logic now calls the tested
gstack-version-bump CLI instead of inline bash.

Claude-first (S2): {{SECTION:id}} emits a STOP-Read pointer on Claude (skeleton +
generated section files) and INLINES the content on every other host, so external
hosts keep the full monolith — verified factory at 162KB with no sections dir.
{{SECTION_INDEX:ship}} renders the situation→section table from the PASSIVE
manifest (CM2 / v2_PLAN.md:663); required-reads live only in test fixtures.
Multi-pass resolve expands inlined sections' own resolvers.

Parity: ship invariant flipped to sectioned (union content checks + maxSkeletonBytes
asserts the shrink). Carve-fallout fixed across gen-skill-docs/skill-validation/
golden/plan-completion/#1539/size-budget tests via skeleton+sections union reads.
Free suite green except the pre-existing investigate parity drift.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test(ship): manifest-consistency + context-parity + requiredReads helper (T9)

Free deterministic guards for the carve:
- required-reads.ts + unit test: assertRequiredReads(run, requiredFiles) — the
  mechanical layer-5 check that the agent Read the sections its situation needs
  (required set comes from the fixture, not the passive manifest)
- section-manifest-consistency: 3-tier orphan classification (generated orphan +
  hand-edited generated file → FAIL; manifest orphan → WARN per v2_PLAN.md) and
  pins the PASSIVE-manifest contract (no applies_when/required_for)
- template-context-parity: generated sections have zero unresolved placeholders
  and gated resolvers (ADVERSARIAL_STEP/CONFIDENCE_CALIBRATION/CHANGELOG_WORKFLOW)
  rendered — proving sections resolve with the parent skillName, not 'sections'

16 tests, all green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test(ship): section-loading E2E + idempotency CLI detection (T9)

- skill-e2e-ship-section-loading.test.ts (new, periodic): runs real /ship in plan
  mode against a fresh version-changing fixture and asserts the agent Read the
  required sections (review-army + changelog). Runs against the INSTALLED skill
  (~/.claude/skills/gstack/ship), not repo paths, so install-layout 404s surface
  [Codex outside-voice #5]. Layer-5 mechanical guard against silent section-skip.
- skill-e2e-ship-idempotency.test.ts: detection updated for the carve — Step 12
  now runs gstack-version-bump classify (JSON "state":"ALREADY_BUMPED") instead
  of the inline bash echo (STATE: ALREADY_BUMPED). Accept both; add a
  gstack-version-bump-write re-bump regression signal.
- touchfiles: register ship-section-loading (periodic) + extend idempotency deps
  with bin/gstack-version-bump + scripts/resolvers/sections.ts.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test(ship): union-read redaction wiring test for the carve (T9)

main's PR-body redaction-at-sink lives in sections/pr-body.md.tmpl after the
carve, not the skeleton template. Read skeleton + section templates union so the
redaction-wiring assertions follow the relocated content. 9/9 green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* v1.54.0.0 feat: carve /ship into skeleton + on-demand sections (-59% always-loaded)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 12:09:10 -07:00

68 lines
2.5 KiB
TypeScript

/**
* Shared discovery for SKILL.md and .tmpl files.
* Scans root + one level of subdirs, skipping node_modules/.git/dist.
*/
import * as fs from 'fs';
import * as path from 'path';
const SKIP = new Set(['node_modules', '.git', 'dist']);
function subdirs(root: string): string[] {
return fs.readdirSync(root, { withFileTypes: true })
.filter(d => d.isDirectory() && !d.name.startsWith('.') && !SKIP.has(d.name))
.map(d => d.name);
}
export function discoverTemplates(root: string): Array<{ tmpl: string; output: string }> {
const dirs = ['', ...subdirs(root)];
const results: Array<{ tmpl: string; output: string }> = [];
for (const dir of dirs) {
const rel = dir ? `${dir}/SKILL.md.tmpl` : 'SKILL.md.tmpl';
if (fs.existsSync(path.join(root, rel))) {
results.push({ tmpl: rel, output: rel.replace(/\.tmpl$/, '') });
}
}
return results;
}
/**
* Discover on-demand section templates: `<skill>/sections/*.md.tmpl`.
*
* Returns the relative tmpl path, its generated output path (`.tmpl` stripped),
* and the owning skill directory so the generator can build a TemplateContext
* with the PARENT skill's name (not "sections") — see processSectionTemplate.
*
* Scans one level of subdirs (same depth as discoverTemplates), looking only
* inside a `sections/` child. Skills without a sections/ dir contribute nothing,
* so this is a no-op for every skill that hasn't been carved.
*/
export function discoverSectionTemplates(
root: string,
): Array<{ tmpl: string; output: string; skillDir: string }> {
const results: Array<{ tmpl: string; output: string; skillDir: string }> = [];
for (const dir of subdirs(root)) {
const sectionsDir = path.join(root, dir, 'sections');
if (!fs.existsSync(sectionsDir) || !fs.statSync(sectionsDir).isDirectory()) continue;
for (const entry of fs.readdirSync(sectionsDir, { withFileTypes: true })) {
if (!entry.isFile() || !entry.name.endsWith('.md.tmpl')) continue;
const rel = `${dir}/sections/${entry.name}`;
results.push({ tmpl: rel, output: rel.replace(/\.tmpl$/, ''), skillDir: dir });
}
}
// Deterministic order so CI freshness checks don't flap on FS iteration order.
return results.sort((a, b) => a.tmpl.localeCompare(b.tmpl));
}
export function discoverSkillFiles(root: string): string[] {
const dirs = ['', ...subdirs(root)];
const results: string[] = [];
for (const dir of dirs) {
const rel = dir ? `${dir}/SKILL.md` : 'SKILL.md';
if (fs.existsSync(path.join(root, rel))) {
results.push(rel);
}
}
return results;
}