Files
gstack/test/skill-ceo-section-ordering.test.ts
T
Garry Tan ab66193e2e refactor(plan-ceo-review): carve review body into on-demand section
Carve the largest skill (138,838 B) into a skeleton + one on-demand
section, the documented next Phase B target after /ship (v2_PLAN.md:216).

- sections/review-sections.md(.tmpl): the 11-section deep review, codex/
  outside-voice rules, how-to-ask, Required Outputs, registries, Completion
  Summary, Review Log, REVIEW_DASHBOARD, PLAN_FILE_REVIEW_REPORT, Next Steps,
  docs/designs promotion, Formatting Rules, and the Mode Quick Reference.
- sections/manifest.json: passive registry (CM2), one entry.
- SKILL.md.tmpl: {{SECTION_INDEX}} after the system audit, a single
  {{SECTION:review-sections}} STOP-Read after Step 0 mode selection, and a
  Section self-check. All of Step 0 (the scope/mode conversation) stays in
  the always-loaded skeleton; only EXIT_PLAN_MODE_GATE follows the section.

Measured: always-loaded skeleton 138,838 -> 80,731 B (-42%, ~14.4K tokens
off every invocation). Union (skeleton + section) 139,110 B, behavior held.

Boundary honors Codex P1: nothing review-governing (formatting rules, mode
reference, how-to-ask, required outputs) sits in the skeleton below the
STOP. Housekeeping resolvers ride in the section, matching the ship
precedent (adversarial.md carries LEARNINGS_LOG + GBRAIN_SAVE_RESULTS).

Tests (atomic with the carve — skill-docs.yml gates gen:skill-docs
freshness on every push, so source + regen + tests must land together):
- parity-harness: plan-ceo flipped to sectioned, maxSkeletonBytes 90_000
  (measured 80,731 + headroom); content/minBytes run against the union.
- skill-size-budget: plan-ceo-review added to SECTIONS_EXTRACTED.
- section-manifest-consistency: generalized to discover every carved skill,
  vars computed per-skill-case (Codex P2).
- skill-ceo-section-ordering (new, gate): per-PR static guard — STOP after
  Step 0, review body absent from skeleton, report writer in the section,
  nothing review-governing below the STOP.
- skill-e2e-plan-ceo-review-section-loading (new, periodic): refreshes the
  installed skill first (Codex P1), drives full Step 0, asserts the section
  is Read before the report.
- gen-skill-docs + skill-validation: read the skeleton+sections union for
  carved skills so relocated prose still counts.
- touchfiles: plan-ceo-section-loading registered (periodic).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 08:54:56 -07:00

83 lines
4.0 KiB
TypeScript

/**
* plan-ceo-review carve — static ordering guard (GATE tier, free, deterministic).
*
* This is the per-PR mechanical backstop for the v2-plan Phase B carve of
* plan-ceo-review (Codex outside-voice P2). The periodic real-PTY E2E
* (skill-e2e-plan-ceo-review-section-loading.test.ts) is the behavioral proof,
* but it runs weekly and costs money. This file runs on every `bun test` and
* fails CI the moment the carve's structural invariants break:
*
* 1. The skeleton points at the section with a STOP-Read directive, and that
* directive sits AFTER Step 0 (scope + mode) — so the conversational Step 0
* stays in the always-loaded skeleton, never stranded in the on-demand file.
* 2. The heavy review body (Sections 1-11) is NOT in the skeleton — it moved to
* the section. A regression that inlines it back would re-bloat the skeleton.
* 3. The review report writer ("GSTACK REVIEW REPORT") lives in the section, and
* the blocking EXIT PLAN MODE GATE that verifies it lives in the skeleton
* AFTER the STOP — so the gate fires once the section work returns.
* 4. Nothing review-governing sits in the skeleton below the STOP (Codex P1):
* no "Section N", no "## Mode Quick Reference", no "## Formatting Rules".
*/
import { describe, test, expect } from 'bun:test';
import * as fs from 'fs';
import * as path from 'path';
const ROOT = path.resolve(import.meta.dir, '..');
const SKELETON = path.join(ROOT, 'plan-ceo-review', 'SKILL.md');
const SECTION = path.join(ROOT, 'plan-ceo-review', 'sections', 'review-sections.md');
describe('plan-ceo-review carve — static ordering', () => {
const skeleton = fs.readFileSync(SKELETON, 'utf-8');
const section = fs.readFileSync(SECTION, 'utf-8');
// Index into the skeleton, -1 if absent.
const at = (needle: string): number => skeleton.indexOf(needle);
const STEP0 = '## Step 0: Nuclear Scope Challenge + Mode Selection';
const STOP = 'sections/review-sections.md'; // appears in the index row + STOP directive
const GATE = 'GSTACK REVIEW REPORT';
test('skeleton emits a STOP-Read directive pointing at the section', () => {
expect(skeleton).toContain('> **STOP.**');
expect(skeleton).toContain('plan-ceo-review/sections/review-sections.md');
expect(skeleton).toContain('## Section index — Read each section when its situation applies');
});
test('Step 0 (scope + mode) stays in the skeleton, BEFORE the STOP', () => {
const step0 = at(STEP0);
const stop = skeleton.indexOf('> **STOP.**');
expect(step0).toBeGreaterThan(-1);
expect(stop).toBeGreaterThan(step0); // STOP fires only after Step 0
});
test('the heavy review body (Sections 1-11) is NOT in the skeleton', () => {
expect(skeleton).not.toContain('### Section 1: Architecture Review');
expect(skeleton).not.toContain('### Section 11:');
// ...it lives in the section instead.
expect(section).toContain('### Section 1: Architecture Review');
expect(section).toContain('### Section 11:');
});
test('nothing review-governing sits in the skeleton below the STOP (Codex P1)', () => {
// Mode Quick Reference + Formatting Rules govern review-time behavior and must
// travel with the section, not be stranded below the STOP in the skeleton.
expect(skeleton).not.toContain('## Mode Quick Reference');
expect(skeleton).not.toContain('## Formatting Rules');
expect(section).toContain('## Mode Quick Reference');
});
test('review report writer lives in the section; the EXIT PLAN MODE GATE stays in the skeleton AFTER the STOP', () => {
// The report itself is produced inside the section work...
expect(section).toContain(GATE);
// ...and the blocking gate that verifies it is the last thing the skeleton runs.
const stop = skeleton.indexOf('> **STOP.**');
const gate = skeleton.lastIndexOf(GATE);
expect(gate).toBeGreaterThan(stop);
});
test('the section is generated, not hand-edited', () => {
expect(section.slice(0, 120)).toContain('AUTO-GENERATED');
});
});