test: Codex generation tests + CI + docs for multi-agent support

Tests (28 new): - Codex output path routing, frontmatter validation (name+description only) - No .claude/skills/ path leaks in Codex output (regression guard) - /codex skill exclusion, hook→prose conversion, multiline YAML - --host agents alias, dynamic template discovery - Codex skill validation + $B command validation - find-browse priority chain verification - Replace static ALL_SKILLS list with dynamic filesystem scan CI: - Add Codex freshness check to skill-docs workflow Docs: - AGENTS.md: Codex-facing project instructions - README: multi-agent installation section - CONTRIBUTING: dual-host development workflow - CHANGELOG: v0.9.0 multi-agent support entry Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-02 03:35:09 +02:00 · 2026-03-19 01:04:21 -07:00
parent 10e6d39f27
commit 6fc696dfb8
9 changed files with 447 additions and 29 deletions
@@ -7,5 +7,9 @@ jobs:
      - uses: actions/checkout@v4
      - uses: oven-sh/setup-bun@v2
      - run: bun install
-      - run: bun run gen:skill-docs
+      - name: Check Claude host freshness
+        run: bun run gen:skill-docs
      - run: git diff --exit-code || (echo "Generated SKILL.md files are stale. Run: bun run gen:skill-docs" && exit 1)
+      - name: Check Codex host freshness
+        run: bun run gen:skill-docs --host codex
+      - run: git diff --exit-code -- .agents/ || (echo "Generated Codex SKILL.md files are stale. Run: bun run gen:skill-docs --host codex" && exit 1)
@@ -0,0 +1,49 @@
+# gstack — AI Engineering Workflow
+
+gstack is a collection of SKILL.md files that give AI agents structured roles for
+software development. Each skill is a specialist: CEO reviewer, eng manager,
+designer, QA lead, release engineer, debugger, and more.
+
+## Available skills
+
+Skills live in `.agents/skills/`. Invoke them by name (e.g., `/office-hours`).
+
+| Skill | What it does |
+|-------|-------------|
+| `/office-hours` | Start here. Reframes your product idea before you write code. |
+| `/plan-ceo-review` | CEO-level review: find the 10-star product in the request. |
+| `/plan-eng-review` | Lock architecture, data flow, edge cases, and tests. |
+| `/plan-design-review` | Rate each design dimension 0-10, explain what a 10 looks like. |
+| `/design-consultation` | Build a complete design system from scratch. |
+| `/review` | Pre-landing PR review. Finds bugs that pass CI but break in prod. |
+| `/debug` | Systematic root-cause debugging. No fixes without investigation. |
+| `/design-review` | Design audit + fix loop with atomic commits. |
+| `/qa` | Open a real browser, find bugs, fix them, re-verify. |
+| `/qa-only` | Same as /qa but report only — no code changes. |
+| `/ship` | Run tests, review, push, open PR. One command. |
+| `/document-release` | Update all docs to match what you just shipped. |
+| `/retro` | Weekly retro with per-person breakdowns and shipping streaks. |
+| `/browse` | Headless browser — real Chromium, real clicks, ~100ms/command. |
+| `/setup-browser-cookies` | Import cookies from your real browser for authenticated testing. |
+| `/careful` | Warn before destructive commands (rm -rf, DROP TABLE, force-push). |
+| `/freeze` | Lock edits to one directory. Hard block, not just a warning. |
+| `/guard` | Activate both careful + freeze at once. |
+| `/unfreeze` | Remove directory edit restrictions. |
+| `/gstack-upgrade` | Update gstack to the latest version. |
+
+## Build commands
+
+```bash
+bun install              # install dependencies
+bun test                 # run tests (free, <5s)
+bun run build            # generate docs + compile binaries
+bun run gen:skill-docs   # regenerate SKILL.md files from templates
+bun run skill:check      # health dashboard for all skills
+```
+
+## Key conventions
+
+- SKILL.md files are **generated** from `.tmpl` templates. Edit the template, not the output.
+- Run `bun run gen:skill-docs --host codex` to regenerate Codex-specific output.
+- The browse binary provides headless browser access. Use `$B <command>` in skills.
+- Safety skills (careful, freeze, guard) use inline advisory prose — always confirm before destructive operations.
@@ -1,10 +1,19 @@
 # Changelog

+## [0.9.0] - 2026-03-19 — Works on Codex, Gemini CLI, and Cursor
+
+**gstack now works on any AI agent that supports the open SKILL.md standard.** Install once, use from Claude Code, OpenAI Codex CLI, Google Gemini CLI, or Cursor. All 21 skills are available in `.agents/skills/` -- just run `./setup --host codex` or `./setup --host auto` and your agent discovers them automatically.
+
+- **One install, four agents.** Claude Code reads from `.claude/skills/`, everything else reads from `.agents/skills/`. Same skills, same prompts, adapted for each host. Hook-based safety skills (careful, freeze, guard) get inline safety advisory prose instead of hooks -- they work everywhere.
+- **Auto-detection.** `./setup --host auto` detects which agents you have installed and sets up both. Already have Claude Code? It still works exactly the same.
+- **Codex-adapted output.** Frontmatter is stripped to just name + description (Codex doesn't need allowed-tools or hooks). Paths are rewritten from `~/.claude/` to `~/.codex/`. The `/codex` skill itself is excluded from Codex output -- it's a Claude wrapper around `codex exec`, which would be self-referential.
+- **CI checks both hosts.** The freshness check now validates Claude and Codex output independently. Stale Codex docs break the build just like stale Claude docs.
+
 ## [0.8.1] - 2026-03-19

 ### Fixed

- **`/qa` no longer refuses to use the browser on backend-only changes.** Previously, if your branch only changed prompt templates, config files, or service logic, `/qa` would analyze the diff, conclude "no UI to test," and suggest running evals instead. Now it always opens the browser — falling back to a Quick mode smoke test (homepage + top 5 navigation targets) when no specific pages are identified from the diff.
+- **`/qa` no longer refuses to use the browser on backend-only changes.** Previously, if your branch only changed prompt templates, config files, or service logic, `/qa` would analyze the diff, conclude "no UI to test," and suggest running evals instead. Now it always opens the browser -- falling back to a Quick mode smoke test (homepage + top 5 navigation targets) when no specific pages are identified from the diff.

 ## [0.8.0] - 2026-03-19 — Multi-AI Second Opinion

@@ -213,10 +213,11 @@ SKILL.md files are **generated** from `.tmpl` templates. Don't edit the `.md` di
 # 1. Edit the template
 vim SKILL.md.tmpl              # or browse/SKILL.md.tmpl

-# 2. Regenerate
+# 2. Regenerate for both hosts
 bun run gen:skill-docs
+bun run gen:skill-docs --host codex

-# 3. Check health
+# 3. Check health (reports both Claude and Codex)
 bun run skill:check

 # Or use watch mode — auto-regenerates on save
@@ -227,6 +228,60 @@ For template authoring best practices (natural language over bash-isms, dynamic

 To add a browse command, add it to `browse/src/commands.ts`. To add a snapshot flag, add it to `SNAPSHOT_FLAGS` in `browse/src/snapshot.ts`. Then rebuild.

+## Dual-host development (Claude + Codex)
+
+gstack generates SKILL.md files for two hosts: **Claude** (`.claude/skills/`) and **Codex** (`.agents/skills/`). Every template change needs to be generated for both.
+
+### Generating for both hosts
+
+```bash
+# Generate Claude output (default)
+bun run gen:skill-docs
+
+# Generate Codex output
+bun run gen:skill-docs --host codex
+# --host agents is an alias for --host codex
+
+# Or use build, which does both + compiles binaries
+bun run build
+```
+
+### What changes between hosts
+
+| Aspect | Claude | Codex |
+|--------|--------|-------|
+| Output directory | `{skill}/SKILL.md` | `.agents/skills/gstack-{skill}/SKILL.md` |
+| Frontmatter | Full (name, description, allowed-tools, hooks, version) | Minimal (name + description only) |
+| Paths | `~/.claude/skills/gstack` | `~/.codex/skills/gstack` |
+| Hook skills | `hooks:` frontmatter (enforced by Claude) | Inline safety advisory prose (advisory only) |
+| `/codex` skill | Included (Claude wraps codex exec) | Excluded (self-referential) |
+
+### Testing Codex output
+
+```bash
+# Run all static tests (includes Codex validation)
+bun test
+
+# Check freshness for both hosts
+bun run gen:skill-docs --dry-run
+bun run gen:skill-docs --host codex --dry-run
+
+# Health dashboard covers both hosts
+bun run skill:check
+```
+
+### Dev setup for .agents/
+
+When you run `bin/dev-setup`, it creates symlinks in both `.claude/skills/` and `.agents/skills/` (if applicable), so Codex-compatible agents can discover your dev skills too.
+
+### Adding a new skill
+
+When you add a new skill template, both hosts get it automatically:
+1. Create `{skill}/SKILL.md.tmpl`
+2. Run `bun run gen:skill-docs` (Claude output) and `bun run gen:skill-docs --host codex` (Codex output)
+3. The dynamic template discovery picks it up — no static list to update
+4. Commit both `{skill}/SKILL.md` and `.agents/skills/gstack-{skill}/SKILL.md`
+
 ## Conductor workspaces

 If you're using [Conductor](https://conductor.build) to run multiple Claude Code sessions in parallel, `conductor.json` wires up workspace lifecycle automatically:
@@ -56,6 +56,24 @@ Open Claude Code and paste this. Claude does the rest.

 Real files get committed to your repo (not a submodule), so `git clone` just works. Everything lives inside `.claude/`. Nothing touches your PATH or runs in the background.

+### Codex, Gemini CLI, or Cursor
+
+gstack works on any agent that supports the [SKILL.md standard](https://github.com/anthropics/claude-code). Skills live in `.agents/skills/` and are discovered automatically.
+
+```bash
+git clone https://github.com/garrytan/gstack.git ~/.codex/skills/gstack
+cd ~/.codex/skills/gstack && ./setup --host codex
+```
+
+Or let setup auto-detect which agents you have installed:
+
+```bash
+git clone https://github.com/garrytan/gstack.git ~/gstack
+cd ~/gstack && ./setup --host auto
+```
+
+This installs to `~/.claude/skills/gstack` and/or `~/.codex/skills/gstack` depending on what's available. All 21 skills work across all supported agents. Hook-based safety skills (careful, freeze, guard) use inline safety advisory prose on non-Claude hosts.
+
 ## See it work

 ```
@@ -21,4 +21,30 @@ describe('locateBinary', () => {
      expect(existsSync(result)).toBe(true);
    }
  });
+
+  test('priority chain checks .codex, .agents, .claude markers', () => {
+    // Verify the source code implements the correct priority order.
+    // We read the function source to confirm the markers array order.
+    const src = require('fs').readFileSync(require('path').join(__dirname, '../src/find-browse.ts'), 'utf-8');
+    // The markers array should list .codex first, then .agents, then .claude
+    const markersMatch = src.match(/const markers = \[([^\]]+)\]/);
+    expect(markersMatch).not.toBeNull();
+    const markers = markersMatch![1];
+    const codexIdx = markers.indexOf('.codex');
+    const agentsIdx = markers.indexOf('.agents');
+    const claudeIdx = markers.indexOf('.claude');
+    // All three must be present
+    expect(codexIdx).toBeGreaterThanOrEqual(0);
+    expect(agentsIdx).toBeGreaterThanOrEqual(0);
+    expect(claudeIdx).toBeGreaterThanOrEqual(0);
+    // .codex before .agents before .claude
+    expect(codexIdx).toBeLessThan(agentsIdx);
+    expect(agentsIdx).toBeLessThan(claudeIdx);
+  });
+
+  test('function signature accepts no arguments', () => {
+    // locateBinary should be callable with no arguments
+    expect(typeof locateBinary).toBe('function');
+    expect(locateBinary.length).toBe(0);
+  });
 });
@@ -96,21 +96,67 @@ for (const file of SKILL_FILES) {
  }
 }

+// ─── Codex Skills ───────────────────────────────────────────
+
+const AGENTS_DIR = path.join(ROOT, '.agents', 'skills');
+if (fs.existsSync(AGENTS_DIR)) {
+  console.log('\n  Codex Skills (.agents/skills/):');
+  const codexDirs = fs.readdirSync(AGENTS_DIR).sort();
+  let codexCount = 0;
+  let codexMissing = 0;
+  for (const dir of codexDirs) {
+    const skillMd = path.join(AGENTS_DIR, dir, 'SKILL.md');
+    if (fs.existsSync(skillMd)) {
+      codexCount++;
+      const content = fs.readFileSync(skillMd, 'utf-8');
+      // Quick validation: must have frontmatter with name + description only
+      const hasClaude = content.includes('.claude/skills');
+      if (hasClaude) {
+        hasErrors = true;
+        console.log(`  \u274c ${dir.padEnd(30)} — contains .claude/skills reference`);
+      } else {
+        console.log(`  \u2705 ${dir.padEnd(30)} — OK`);
+      }
+    } else {
+      codexMissing++;
+      hasErrors = true;
+      console.log(`  \u274c ${dir.padEnd(30)} — SKILL.md missing`);
+    }
+  }
+  console.log(`  Total: ${codexCount} skills, ${codexMissing} missing`);
+} else {
+  console.log('\n  Codex Skills: .agents/skills/ not found (run: bun run gen:skill-docs --host codex)');
+}
+
 // ─── Freshness ──────────────────────────────────────────────

-console.log('\n  Freshness:');
+console.log('\n  Freshness (Claude):');
 try {
  execSync('bun run scripts/gen-skill-docs.ts --dry-run', { cwd: ROOT, stdio: 'pipe' });
-  console.log('  \u2705 All generated files are fresh');
+  console.log('  \u2705 All Claude generated files are fresh');
 } catch (err: any) {
  hasErrors = true;
  const output = err.stdout?.toString() || '';
-  console.log('  \u274c Generated files are stale:');
+  console.log('  \u274c Claude generated files are stale:');
  for (const line of output.split('\n').filter((l: string) => l.startsWith('STALE'))) {
    console.log(`      ${line}`);
  }
  console.log('      Run: bun run gen:skill-docs');
 }

+console.log('\n  Freshness (Codex):');
+try {
+  execSync('bun run scripts/gen-skill-docs.ts --host codex --dry-run', { cwd: ROOT, stdio: 'pipe' });
+  console.log('  \u2705 All Codex generated files are fresh');
+} catch (err: any) {
+  hasErrors = true;
+  const output = err.stdout?.toString() || '';
+  console.log('  \u274c Codex generated files are stale:');
+  for (const line of output.split('\n').filter((l: string) => l.startsWith('STALE'))) {
+    console.log(`      ${line}`);
+  }
+  console.log('      Run: bun run gen:skill-docs --host codex');
+}
+
 console.log('');
 process.exit(hasErrors ? 1 : 0);
@@ -56,28 +56,23 @@ describe('gen-skill-docs', () => {
    }
  });

-  // All skills that must have templates — single source of truth
-  const ALL_SKILLS = [
-    { dir: '.', name: 'root gstack' },
-    { dir: 'browse', name: 'browse' },
-    { dir: 'qa', name: 'qa' },
-    { dir: 'qa-only', name: 'qa-only' },
-    { dir: 'review', name: 'review' },
-    { dir: 'ship', name: 'ship' },
-    { dir: 'plan-ceo-review', name: 'plan-ceo-review' },
-    { dir: 'plan-eng-review', name: 'plan-eng-review' },
-    { dir: 'retro', name: 'retro' },
-    { dir: 'setup-browser-cookies', name: 'setup-browser-cookies' },
-    { dir: 'gstack-upgrade', name: 'gstack-upgrade' },
-    { dir: 'plan-design-review', name: 'plan-design-review' },
-    { dir: 'design-review', name: 'design-review' },
-    { dir: 'design-consultation', name: 'design-consultation' },
-    { dir: 'document-release', name: 'document-release' },
-    { dir: 'careful', name: 'careful' },
-    { dir: 'freeze', name: 'freeze' },
-    { dir: 'guard', name: 'guard' },
-    { dir: 'unfreeze', name: 'unfreeze' },
-  ];
+  // Dynamic template discovery — matches the generator's findTemplates() behavior.
+  // New skills automatically get test coverage without updating a static list.
+  const ALL_SKILLS = (() => {
+    const skills: Array<{ dir: string; name: string }> = [];
+    // Root template
+    if (fs.existsSync(path.join(ROOT, 'SKILL.md.tmpl'))) {
+      skills.push({ dir: '.', name: 'root gstack' });
+    }
+    // Subdirectory templates
+    for (const entry of fs.readdirSync(ROOT, { withFileTypes: true })) {
+      if (!entry.isDirectory() || entry.name.startsWith('.') || entry.name === 'node_modules') continue;
+      if (fs.existsSync(path.join(ROOT, entry.name, 'SKILL.md.tmpl'))) {
+        skills.push({ dir: entry.name, name: entry.name });
+      }
+    }
+    return skills;
+  })();

  test('every skill has a SKILL.md.tmpl template', () => {
    for (const skill of ALL_SKILLS) {
@@ -375,3 +370,152 @@ describe('REVIEW_DASHBOARD resolver', () => {
    expect(content).toContain('skip_eng_review');
  });
 });
+
+// ─── Codex Generation Tests ─────────────────────────────────
+
+describe('Codex generation (--host codex)', () => {
+  const AGENTS_DIR = path.join(ROOT, '.agents', 'skills');
+
+  // Dynamic discovery of expected Codex skills: all templates except /codex
+  const CODEX_SKILLS = (() => {
+    const skills: Array<{ dir: string; codexName: string }> = [];
+    if (fs.existsSync(path.join(ROOT, 'SKILL.md.tmpl'))) {
+      skills.push({ dir: '.', codexName: 'gstack' });
+    }
+    for (const entry of fs.readdirSync(ROOT, { withFileTypes: true })) {
+      if (!entry.isDirectory() || entry.name.startsWith('.') || entry.name === 'node_modules') continue;
+      if (entry.name === 'codex') continue; // /codex is excluded from Codex output
+      if (!fs.existsSync(path.join(ROOT, entry.name, 'SKILL.md.tmpl'))) continue;
+      const codexName = entry.name.startsWith('gstack-') ? entry.name : `gstack-${entry.name}`;
+      skills.push({ dir: entry.name, codexName });
+    }
+    return skills;
+  })();
+
+  test('--host codex generates correct output paths', () => {
+    for (const skill of CODEX_SKILLS) {
+      const skillMd = path.join(AGENTS_DIR, skill.codexName, 'SKILL.md');
+      expect(fs.existsSync(skillMd)).toBe(true);
+    }
+  });
+
+  test('codexSkillName mapping: root is gstack, others are gstack-{dir}', () => {
+    // Root → gstack
+    expect(fs.existsSync(path.join(AGENTS_DIR, 'gstack', 'SKILL.md'))).toBe(true);
+    // Subdirectories → gstack-{dir}
+    expect(fs.existsSync(path.join(AGENTS_DIR, 'gstack-review', 'SKILL.md'))).toBe(true);
+    expect(fs.existsSync(path.join(AGENTS_DIR, 'gstack-ship', 'SKILL.md'))).toBe(true);
+    // gstack-upgrade doesn't double-prefix
+    expect(fs.existsSync(path.join(AGENTS_DIR, 'gstack-upgrade', 'SKILL.md'))).toBe(true);
+    // No double-prefix: gstack-gstack-upgrade must NOT exist
+    expect(fs.existsSync(path.join(AGENTS_DIR, 'gstack-gstack-upgrade', 'SKILL.md'))).toBe(false);
+  });
+
+  test('Codex frontmatter has ONLY name + description', () => {
+    for (const skill of CODEX_SKILLS) {
+      const content = fs.readFileSync(path.join(AGENTS_DIR, skill.codexName, 'SKILL.md'), 'utf-8');
+      expect(content.startsWith('---\n')).toBe(true);
+      const fmEnd = content.indexOf('\n---', 4);
+      expect(fmEnd).toBeGreaterThan(0);
+      const frontmatter = content.slice(4, fmEnd);
+      // Must have name and description
+      expect(frontmatter).toContain('name:');
+      expect(frontmatter).toContain('description:');
+      // Must NOT have allowed-tools, version, or hooks
+      expect(frontmatter).not.toContain('allowed-tools:');
+      expect(frontmatter).not.toContain('version:');
+      expect(frontmatter).not.toContain('hooks:');
+    }
+  });
+
+  test('no .claude/skills/ in Codex output', () => {
+    for (const skill of CODEX_SKILLS) {
+      const content = fs.readFileSync(path.join(AGENTS_DIR, skill.codexName, 'SKILL.md'), 'utf-8');
+      expect(content).not.toContain('.claude/skills');
+    }
+  });
+
+  test('no ~/.claude/ paths in Codex output', () => {
+    for (const skill of CODEX_SKILLS) {
+      const content = fs.readFileSync(path.join(AGENTS_DIR, skill.codexName, 'SKILL.md'), 'utf-8');
+      expect(content).not.toContain('~/.claude/');
+    }
+  });
+
+  test('/codex skill excluded from Codex output', () => {
+    expect(fs.existsSync(path.join(AGENTS_DIR, 'gstack-codex', 'SKILL.md'))).toBe(false);
+    expect(fs.existsSync(path.join(AGENTS_DIR, 'gstack-codex'))).toBe(false);
+  });
+
+  test('--host codex --dry-run freshness', () => {
+    const result = Bun.spawnSync(['bun', 'run', 'scripts/gen-skill-docs.ts', '--host', 'codex', '--dry-run'], {
+      cwd: ROOT,
+      stdout: 'pipe',
+      stderr: 'pipe',
+    });
+    expect(result.exitCode).toBe(0);
+    const output = result.stdout.toString();
+    // Every Codex skill should be FRESH
+    for (const skill of CODEX_SKILLS) {
+      expect(output).toContain(`FRESH: .agents/skills/${skill.codexName}/SKILL.md`);
+    }
+    expect(output).not.toContain('STALE');
+  });
+
+  test('--host agents alias produces same output as --host codex', () => {
+    const codexResult = Bun.spawnSync(['bun', 'run', 'scripts/gen-skill-docs.ts', '--host', 'codex', '--dry-run'], {
+      cwd: ROOT,
+      stdout: 'pipe',
+      stderr: 'pipe',
+    });
+    const agentsResult = Bun.spawnSync(['bun', 'run', 'scripts/gen-skill-docs.ts', '--host', 'agents', '--dry-run'], {
+      cwd: ROOT,
+      stdout: 'pipe',
+      stderr: 'pipe',
+    });
+    expect(codexResult.exitCode).toBe(0);
+    expect(agentsResult.exitCode).toBe(0);
+    // Both should produce the same output (same FRESH lines)
+    expect(codexResult.stdout.toString()).toBe(agentsResult.stdout.toString());
+  });
+
+  test('multiline descriptions preserved in Codex output', () => {
+    // office-hours has a multiline description — verify it survives the frontmatter transform
+    const content = fs.readFileSync(path.join(AGENTS_DIR, 'gstack-office-hours', 'SKILL.md'), 'utf-8');
+    const fmEnd = content.indexOf('\n---', 4);
+    const frontmatter = content.slice(4, fmEnd);
+    // Description should span multiple lines (block scalar)
+    const descLines = frontmatter.split('\n').filter(l => l.startsWith('  '));
+    expect(descLines.length).toBeGreaterThan(1);
+    // Verify key phrases survived
+    expect(frontmatter).toContain('YC Office Hours');
+  });
+
+  test('hook skills have safety prose and no hooks: in frontmatter', () => {
+    const HOOK_SKILLS = ['gstack-careful', 'gstack-freeze', 'gstack-guard'];
+    for (const skillName of HOOK_SKILLS) {
+      const content = fs.readFileSync(path.join(AGENTS_DIR, skillName, 'SKILL.md'), 'utf-8');
+      // Must have safety advisory prose
+      expect(content).toContain('Safety Advisory');
+      // Must NOT have hooks: in frontmatter
+      const fmEnd = content.indexOf('\n---', 4);
+      const frontmatter = content.slice(4, fmEnd);
+      expect(frontmatter).not.toContain('hooks:');
+    }
+  });
+
+  test('all Codex SKILL.md files have auto-generated header', () => {
+    for (const skill of CODEX_SKILLS) {
+      const content = fs.readFileSync(path.join(AGENTS_DIR, skill.codexName, 'SKILL.md'), 'utf-8');
+      expect(content).toContain('AUTO-GENERATED from SKILL.md.tmpl');
+      expect(content).toContain('Regenerate: bun run gen:skill-docs');
+    }
+  });
+
+  test('Codex preamble uses codex paths', () => {
+    // Check a skill that has a preamble (review is a good candidate)
+    const content = fs.readFileSync(path.join(AGENTS_DIR, 'gstack-review', 'SKILL.md'), 'utf-8');
+    expect(content).toContain('~/.codex/skills/gstack');
+    expect(content).toContain('.agents/skills/gstack');
+  });
+});
@@ -1257,3 +1257,70 @@ describe('Skill trigger phrases', () => {
    });
  }
 });
+
+// ─── Codex Skill Validation ──────────────────────────────────
+
+describe('Codex skill validation', () => {
+  const AGENTS_DIR = path.join(ROOT, '.agents', 'skills');
+
+  // Discover all Claude skills with templates (except /codex which is Claude-only)
+  const CLAUDE_SKILLS_WITH_TEMPLATES = (() => {
+    const skills: string[] = [];
+    for (const entry of fs.readdirSync(ROOT, { withFileTypes: true })) {
+      if (!entry.isDirectory() || entry.name.startsWith('.') || entry.name === 'node_modules') continue;
+      if (entry.name === 'codex') continue; // Claude-only skill
+      if (fs.existsSync(path.join(ROOT, entry.name, 'SKILL.md.tmpl'))) {
+        skills.push(entry.name);
+      }
+    }
+    return skills;
+  })();
+
+  test('all skills (except /codex) have both Claude and Codex variants', () => {
+    for (const skillDir of CLAUDE_SKILLS_WITH_TEMPLATES) {
+      // Claude variant
+      const claudeMd = path.join(ROOT, skillDir, 'SKILL.md');
+      expect(fs.existsSync(claudeMd)).toBe(true);
+
+      // Codex variant
+      const codexName = skillDir.startsWith('gstack-') ? skillDir : `gstack-${skillDir}`;
+      const codexMd = path.join(AGENTS_DIR, codexName, 'SKILL.md');
+      expect(fs.existsSync(codexMd)).toBe(true);
+    }
+    // Root template has both too
+    expect(fs.existsSync(path.join(ROOT, 'SKILL.md'))).toBe(true);
+    expect(fs.existsSync(path.join(AGENTS_DIR, 'gstack', 'SKILL.md'))).toBe(true);
+  });
+
+  test('/codex skill is Claude-only — no Codex variant', () => {
+    // Claude variant should exist
+    expect(fs.existsSync(path.join(ROOT, 'codex', 'SKILL.md'))).toBe(true);
+    // Codex variant must NOT exist
+    expect(fs.existsSync(path.join(AGENTS_DIR, 'gstack-codex', 'SKILL.md'))).toBe(false);
+  });
+
+  test('Codex skill names follow gstack-{name} convention', () => {
+    const codexDirs = fs.readdirSync(AGENTS_DIR);
+    for (const dir of codexDirs) {
+      // Every directory should start with gstack
+      expect(dir.startsWith('gstack')).toBe(true);
+      // Root is just 'gstack', others are 'gstack-{name}'
+      if (dir !== 'gstack') {
+        expect(dir.startsWith('gstack-')).toBe(true);
+      }
+    }
+  });
+
+  test('$B commands in Codex SKILL.md files are valid browse commands', () => {
+    const codexDirs = fs.readdirSync(AGENTS_DIR);
+    for (const dir of codexDirs) {
+      const skillMd = path.join(AGENTS_DIR, dir, 'SKILL.md');
+      if (!fs.existsSync(skillMd)) continue;
+      const content = fs.readFileSync(skillMd, 'utf-8');
+      // Only validate if the skill contains $B commands
+      if (!content.includes('$B ')) continue;
+      const result = validateSkill(skillMd);
+      expect(result.invalid).toHaveLength(0);
+    }
+  });
+});