feat: voice-friendly skill triggers for speech-to-text input

Add voice-triggers YAML field to 10 SKILL.md.tmpl files with natural-language aliases (e.g. "see-so" for /cso, "tech review" for /plan-eng-review). gen-skill-docs preprocesses voice triggers before transformFrontmatter, folding them into the description and stripping the field from output. Includes unit tests, README voice input section, and CONTRIBUTING.md update.
2026-07-07 00:27:56 +02:00 · 2026-03-31 22:15:11 -07:00
parent db35b8e5bf
commit d10953b388
24 changed files with 175 additions and 4 deletions
@@ -257,7 +257,7 @@ bun run build
 | Aspect | Claude | Codex |
 |--------|--------|-------|
 | Output directory | `{skill}/SKILL.md` | `.agents/skills/gstack-{skill}/SKILL.md` (generated at setup, gitignored) |
-| Frontmatter | Full (name, description, allowed-tools, hooks, version) | Minimal (name + description only) |
+| Frontmatter | Full (name, description, voice-triggers, allowed-tools, hooks, version) | Minimal (name + description only) |
 | Paths | `~/.claude/skills/gstack` | `$GSTACK_ROOT` (`.agents/skills/gstack` in a repo, otherwise `~/.codex/skills/gstack`) |
 | Hook skills | `hooks:` frontmatter (enforced by Claude) | Inline safety advisory prose (advisory only) |
 | `/codex` skill | Included (Claude wraps codex exec) | Excluded (self-referential) |
@@ -103,6 +103,12 @@ cd ~/gstack && ./setup --host factory

 Skills install to `~/.factory/skills/gstack-*/`. Restart `droid` to rescan skills, then type `/qa` to get started.

+### Voice input (AquaVoice, Whisper, etc.)
+
+gstack skills have voice-friendly trigger phrases. Say what you want naturally —
+"run a security check", "test the website", "do an engineering review" — and the
+right skill activates. You don't need to remember slash command names or acronyms.
+
 ## See it work

 ```
@@ -11,6 +11,7 @@ description: |
  automatically", or "make the decisions for me".
  Proactively suggest when the user has a plan file and wants to run the full review
  gauntlet without answering 15-30 intermediate questions. (gstack)
+  Voice triggers (speech-to-text aliases): "auto plan", "automatic review".
 benefits-from: [office-hours]
 allowed-tools:
  - Bash
@@ -11,6 +11,9 @@ description: |
  automatically", or "make the decisions for me".
  Proactively suggest when the user has a plan file and wants to run the full review
  gauntlet without answering 15-30 intermediate questions. (gstack)
+voice-triggers:
+  - "auto plan"
+  - "automatic review"
 benefits-from: [office-hours]
 allowed-tools:
  - Bash
@@ -8,6 +8,7 @@ description: |
  Compares before/after on every PR. Tracks performance trends over time.
  Use when: "performance", "benchmark", "page speed", "lighthouse", "web vitals",
  "bundle size", "load time". (gstack)
+  Voice triggers (speech-to-text aliases): "speed test", "check performance".
 allowed-tools:
  - Bash
  - Read
@@ -8,6 +8,9 @@ description: |
  Compares before/after on every PR. Tracks performance trends over time.
  Use when: "performance", "benchmark", "page speed", "lighthouse", "web vitals",
  "bundle size", "load time". (gstack)
+voice-triggers:
+  - "speed test"
+  - "check performance"
 allowed-tools:
  - Bash
  - Read
@@ -8,6 +8,7 @@ description: |
  your code. Consult: ask codex anything with session continuity for follow-ups.
  The "200 IQ autistic developer" second opinion. Use when asked to "codex review",
  "codex challenge", "ask codex", "second opinion", or "consult codex". (gstack)
+  Voice triggers (speech-to-text aliases): "code x", "code ex", "get another opinion".
 allowed-tools:
  - Bash
  - Read
@@ -8,6 +8,10 @@ description: |
  your code. Consult: ask codex anything with session continuity for follow-ups.
  The "200 IQ autistic developer" second opinion. Use when asked to "codex review",
  "codex challenge", "ask codex", "second opinion", or "consult codex". (gstack)
+voice-triggers:
+  - "code x"
+  - "code ex"
+  - "get another opinion"
 allowed-tools:
  - Bash
  - Read
@@ -7,6 +7,7 @@ description: |
  action in real time. The extension shows a live activity feed in the Side Panel.
  Use when asked to "connect chrome", "open chrome", "real browser", "launch chrome",
  "side panel", or "control my browser".
+  Voice triggers (speech-to-text aliases): "show me the browser".
 allowed-tools:
  - Bash
  - Read
@@ -7,6 +7,8 @@ description: |
  action in real time. The extension shows a live activity feed in the Side Panel.
  Use when asked to "connect chrome", "open chrome", "real browser", "launch chrome",
  "side panel", or "control my browser".
+voice-triggers:
+  - "show me the browser"
 allowed-tools:
  - Bash
  - Read
@@ -9,6 +9,7 @@ description: |
  Two modes: daily (zero-noise, 8/10 confidence gate) and comprehensive (monthly deep
  scan, 2/10 bar). Trend tracking across audit runs.
  Use when: "security audit", "threat model", "pentest review", "OWASP", "CSO review". (gstack)
+  Voice triggers (speech-to-text aliases): "see-so", "see so", "security review", "security check", "vulnerability scan", "run security".
 allowed-tools:
  - Bash
  - Read
@@ -9,6 +9,13 @@ description: |
  Two modes: daily (zero-noise, 8/10 confidence gate) and comprehensive (monthly deep
  scan, 2/10 bar). Trend tracking across audit runs.
  Use when: "security audit", "threat model", "pentest review", "OWASP", "CSO review". (gstack)
+voice-triggers:
+  - "see-so"
+  - "see so"
+  - "security review"
+  - "security check"
+  - "vulnerability scan"
+  - "run security"
 allowed-tools:
  - Bash
  - Read
@@ -10,6 +10,7 @@ description: |
  Use when: "finalize this design", "turn this mockup into HTML", "implement
  this design", or after /design-shotgun approves a direction.
  Proactively suggest when user has approved a design in /design-shotgun. (gstack)
+  Voice triggers (speech-to-text aliases): "build the design", "code the mockup", "make it real".
 allowed-tools:
  - Bash
  - Read
@@ -10,6 +10,10 @@ description: |
  Use when: "finalize this design", "turn this mockup into HTML", "implement
  this design", or after /design-shotgun approves a direction.
  Proactively suggest when user has approved a design in /design-shotgun. (gstack)
+voice-triggers:
+  - "build the design"
+  - "code the mockup"
+  - "make it real"
 allowed-tools:
  - Bash
  - Read
@@ -5,6 +5,7 @@ description: |
  Upgrade gstack to the latest version. Detects global vs vendored install,
  runs the upgrade, and shows what's new. Use when asked to "upgrade gstack",
  "update gstack", or "get latest version".
+  Voice triggers (speech-to-text aliases): "upgrade the tools", "update the tools", "gee stack upgrade", "g stack upgrade".
 allowed-tools:
  - Bash
  - Read
@@ -5,6 +5,11 @@ description: |
  Upgrade gstack to the latest version. Detects global vs vendored install,
  runs the upgrade, and shows what's new. Use when asked to "upgrade gstack",
  "update gstack", or "get latest version".
+voice-triggers:
+  - "upgrade the tools"
+  - "update the tools"
+  - "gee stack upgrade"
+  - "g stack upgrade"
 allowed-tools:
  - Bash
  - Read
@@ -9,6 +9,7 @@ description: |
  "review the architecture", "engineering review", or "lock in the plan".
  Proactively suggest when the user has a plan or design doc and is about to
  start coding — to catch architecture issues before implementation. (gstack)
+  Voice triggers (speech-to-text aliases): "tech review", "technical review", "plan engineering review".
 benefits-from: [office-hours]
 allowed-tools:
  - Read
@@ -9,6 +9,10 @@ description: |
  "review the architecture", "engineering review", or "lock in the plan".
  Proactively suggest when the user has a plan or design doc and is about to
  start coding — to catch architecture issues before implementation. (gstack)
+voice-triggers:
+  - "tech review"
+  - "technical review"
+  - "plan engineering review"
 benefits-from: [office-hours]
 allowed-tools:
  - Read
@@ -8,6 +8,7 @@ description: |
  fixes anything. Use when asked to "just report bugs", "qa report only", or
  "test but don't fix". For the full test-fix-verify loop, use /qa instead.
  Proactively suggest when the user wants a bug report without any code changes. (gstack)
+  Voice triggers (speech-to-text aliases): "bug report", "just check for bugs".
 allowed-tools:
  - Bash
  - Read
@@ -8,6 +8,9 @@ description: |
  fixes anything. Use when asked to "just report bugs", "qa report only", or
  "test but don't fix". For the full test-fix-verify loop, use /qa instead.
  Proactively suggest when the user wants a bug report without any code changes. (gstack)
+voice-triggers:
+  - "bug report"
+  - "just check for bugs"
 allowed-tools:
  - Bash
  - Read
@@ -11,6 +11,7 @@ description: |
  or asks "does this work?". Three tiers: Quick (critical/high only),
  Standard (+ medium), Exhaustive (+ cosmetic). Produces before/after health scores,
  fix evidence, and a ship-readiness summary. For report-only mode, use /qa-only. (gstack)
+  Voice triggers (speech-to-text aliases): "quality check", "test the app", "run QA".
 allowed-tools:
  - Bash
  - Read
@@ -11,6 +11,10 @@ description: |
  or asks "does this work?". Three tiers: Quick (critical/high only),
  Standard (+ medium), Exhaustive (+ cosmetic). Produces before/after health scores,
  fix evidence, and a ship-readiness summary. For report-only mode, use /qa-only. (gstack)
+voice-triggers:
+  - "quality check"
+  - "test the app"
+  - "run QA"
 allowed-tools:
  - Bash
  - Read
@@ -132,6 +132,63 @@ function extractNameAndDescription(content: string): { name: string; description
  return { name, description };
 }

+// ─── Voice Trigger Processing ────────────────────────────────
+
+/**
+ * Extract voice-triggers YAML list from frontmatter.
+ * Returns an array of trigger strings, or [] if no voice-triggers field.
+ */
+function extractVoiceTriggers(content: string): string[] {
+  const fmStart = content.indexOf('---\n');
+  if (fmStart !== 0) return [];
+  const fmEnd = content.indexOf('\n---', fmStart + 4);
+  if (fmEnd === -1) return [];
+  const frontmatter = content.slice(fmStart + 4, fmEnd);
+
+  const triggers: string[] = [];
+  let inVoice = false;
+  for (const line of frontmatter.split('\n')) {
+    if (/^voice-triggers:/.test(line)) { inVoice = true; continue; }
+    if (inVoice) {
+      const m = line.match(/^\s+-\s+"(.+)"$/);
+      if (m) triggers.push(m[1]);
+      else if (!/^\s/.test(line)) break;
+    }
+  }
+  return triggers;
+}
+
+/**
+ * Preprocess voice triggers: fold voice-triggers YAML field into description,
+ * then strip the field from frontmatter. Must run BEFORE transformFrontmatter
+ * and extractNameAndDescription so all hosts see the updated description.
+ */
+function processVoiceTriggers(content: string): string {
+  const triggers = extractVoiceTriggers(content);
+  if (triggers.length === 0) return content;
+
+  // Strip voice-triggers block from frontmatter
+  content = content.replace(/^voice-triggers:\n(?:\s+-\s+"[^"]*"\n?)*/m, '');
+
+  // Get current description (after stripping voice-triggers, so it's clean)
+  const { description } = extractNameAndDescription(content);
+  if (!description) return content;
+
+  // Build new description with voice triggers appended
+  const voiceLine = `Voice triggers (speech-to-text aliases): ${triggers.map(t => `"${t}"`).join(', ')}.`;
+  const newDescription = description + '\n' + voiceLine;
+
+  // Replace old indented description with new in frontmatter
+  const oldIndented = description.split('\n').map(l => `  ${l}`).join('\n');
+  const newIndented = newDescription.split('\n').map(l => `  ${l}`).join('\n');
+  content = content.replace(oldIndented, newIndented);
+
+  return content;
+}
+
+// Export for testing
+export { extractVoiceTriggers, processVoiceTriggers };
+
 const OPENAI_SHORT_DESCRIPTION_LIMIT = 120;

 function condenseOpenAIShortDescription(description: string): string {
@@ -163,8 +220,10 @@ policy:
 */
 function transformFrontmatter(content: string, host: Host): string {
  if (host === 'claude') {
-    // Strip sensitive: field from Claude output (only Factory uses it)
-    return content.replace(/^sensitive:\s*true\n/m, '');
+    // Strip fields not used by Claude: sensitive (Factory-only), voice-triggers (folded into description by preprocessing)
+    content = content.replace(/^sensitive:\s*true\n/m, '');
+    content = content.replace(/^voice-triggers:\n(?:\s+-\s+"[^"]*"\n?)*/m, '');
+    return content;
  }

  const fmStart = content.indexOf('---\n');
@@ -364,13 +423,22 @@ function processTemplate(tmplPath: string, host: Host = 'claude'): { outputPath:
    throw new Error(`Unresolved placeholders in ${relTmplPath}: ${remaining.join(', ')}`);
  }

+  // Preprocess voice triggers: fold into description, strip field from frontmatter.
+  // Must run BEFORE transformFrontmatter so all hosts see the updated description,
+  // and BEFORE extractedDescription is used by external host metadata.
+  content = processVoiceTriggers(content);
+
+  // Re-extract description AFTER voice trigger preprocessing so Codex openai.yaml
+  // metadata gets the updated description with voice triggers included.
+  const postProcessDescription = extractNameAndDescription(content).description;
+
  // For Claude: strip sensitive: field (only Factory uses it)
  // For external hosts: route output, transform frontmatter, rewrite paths
  let symlinkLoop = false;
  if (host === 'claude') {
    content = transformFrontmatter(content, host);
  } else {
-    const result = processExternalHost(content, tmplContent, host, skillDir, extractedDescription, ctx, extractedName || undefined);
+    const result = processExternalHost(content, tmplContent, host, skillDir, postProcessDescription, ctx, extractedName || undefined);
    content = result.content;
    outputPath = result.outputPath;
    symlinkLoop = result.symlinkLoop;
@@ -2542,3 +2542,51 @@ describe('gen-skill-docs prefix warning (#620/#578)', () => {
    }
  });
 });
+
+describe('voice-triggers processing', () => {
+  const { extractVoiceTriggers, processVoiceTriggers } = require('../scripts/gen-skill-docs') as {
+    extractVoiceTriggers: (content: string) => string[];
+    processVoiceTriggers: (content: string) => string;
+  };
+
+  test('extractVoiceTriggers parses valid YAML list', () => {
+    const content = `---\nname: cso\ndescription: |\n  Security audit.\nvoice-triggers:\n  - "see-so"\n  - "security review"\n---\nBody`;
+    const triggers = extractVoiceTriggers(content);
+    expect(triggers).toEqual(['see-so', 'security review']);
+  });
+
+  test('extractVoiceTriggers returns [] when no field present', () => {
+    const content = `---\nname: qa\ndescription: |\n  QA testing.\n---\nBody`;
+    expect(extractVoiceTriggers(content)).toEqual([]);
+  });
+
+  test('processVoiceTriggers appends voice triggers to description', () => {
+    const content = `---\nname: cso\ndescription: |\n  Security audit. (gstack)\nvoice-triggers:\n  - "see-so"\n  - "security review"\n---\nBody`;
+    const result = processVoiceTriggers(content);
+    expect(result).toContain('Voice triggers (speech-to-text aliases): "see-so", "security review".');
+  });
+
+  test('processVoiceTriggers strips voice-triggers field from output', () => {
+    const content = `---\nname: cso\ndescription: |\n  Security audit. (gstack)\nvoice-triggers:\n  - "see-so"\n---\nBody`;
+    const result = processVoiceTriggers(content);
+    expect(result).not.toContain('voice-triggers:');
+  });
+
+  test('processVoiceTriggers returns content unchanged when no voice-triggers', () => {
+    const content = `---\nname: qa\ndescription: |\n  QA testing.\n---\nBody`;
+    expect(processVoiceTriggers(content)).toBe(content);
+  });
+
+  test('generated CSO SKILL.md contains voice triggers in description', () => {
+    const content = fs.readFileSync(path.join(ROOT, 'cso', 'SKILL.md'), 'utf-8');
+    expect(content).toContain('"see-so"');
+    expect(content).toContain('Voice triggers (speech-to-text aliases):');
+  });
+
+  test('generated CSO SKILL.md does NOT contain raw voice-triggers field', () => {
+    const content = fs.readFileSync(path.join(ROOT, 'cso', 'SKILL.md'), 'utf-8');
+    const fmEnd = content.indexOf('\n---', 4);
+    const frontmatter = content.slice(0, fmEnd);
+    expect(frontmatter).not.toContain('voice-triggers:');
+  });
+});