Merge origin/main (v1.52.1.0) into spec-pii-redaction-guard

Resolve bin/gstack-config (keep both redact_* and brain_* config keys). Regenerate all SKILL.md from merged templates + resolvers (redact-doc resolver now coexists with main's brain-aware-planning resolvers). Refresh ship goldens. Move the redaction taxonomy reference in /cso and /spec to a pointer at lib/redact-patterns.ts (single source of truth) so neither skill inlines the full catalog, which keeps both under the size budget after the merge.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-18 15:50:11 +02:00 · 2026-05-29 18:08:38 -07:00
parent 02f848dde4 7f7e9d8652
commit 887210bfd1
145 changed files with 14101 additions and 412 deletions
@@ -6,76 +6,265 @@
 *
 * These resolvers are suppressed on hosts that don't support brain features
 * (via suppressedResolvers in each host config). For those hosts,
- * {{GBRAIN_CONTEXT_LOAD}} and {{GBRAIN_SAVE_RESULTS}} resolve to empty string.
+ * {{GBRAIN_CONTEXT_LOAD}}, {{GBRAIN_SAVE_RESULTS}}, {{BRAIN_PREFLIGHT}},
+ * {{BRAIN_CACHE_REFRESH}}, and {{BRAIN_WRITE_BACK}} all resolve to empty string.
 *
 * Compatible with GBrain >= v0.10.0 (search CLI, doctor --fast --json, entity enrichment).
+ *
+ * Brain-aware planning (T4 / v1.48 plan): adds three new resolvers powered by
+ * the bin/gstack-brain-cache CLI and scripts/brain-cache-spec.ts. The new
+ * resolvers fire only for the 5 planning skills registered in
+ * SKILL_DIGEST_SUBSETS (office-hours, plan-ceo-review, plan-eng-review,
+ * plan-design-review, plan-devex-review).
 */
 import type { TemplateContext } from './types';
+import {
+  SKILL_DIGEST_SUBSETS,
+  SKILL_CALIBRATION_WEIGHTS,
+  BRAIN_CACHE_ENTITIES,
+  getSkillSubset,
+  getInvalidationTargets,
+} from '../brain-cache-spec';
+
+// Per-skill slug + title + tag metadata for SAVE_RESULTS. The full save
+// template (heredoc body, entity-stub instructions, throttle handling,
+// backlinks) lives in docs/gbrain-write-surfaces.md §Save Template and is
+// read on-demand by the agent. Compressing the inline prose keeps the
+// token footprint at ~150 tokens per skill (down from ~500), so users with
+// gbrain installed pay a small overhead and users without it (whose hosts
+// have GBRAIN_SAVE_RESULTS suppressed at gen-time) pay nothing.
+interface SkillSaveMeta {
+  slugPrefix: string;
+  title: string;
+  tag: string;
+}
+
+const skillSaveMap: Record<string, SkillSaveMeta> = {
+  'office-hours':         { slugPrefix: 'office-hours',    title: 'Office Hours',    tag: 'design-doc' },
+  'investigate':          { slugPrefix: 'investigations',  title: 'Investigation',   tag: 'investigation' },
+  'plan-ceo-review':      { slugPrefix: 'ceo-plans',       title: 'CEO Plan',        tag: 'ceo-plan' },
+  'plan-eng-review':      { slugPrefix: 'eng-reviews',     title: 'Eng Review',      tag: 'eng-review' },
+  'plan-design-review':   { slugPrefix: 'design-reviews',  title: 'Design Review',   tag: 'design-review' },
+  'plan-devex-review':    { slugPrefix: 'devex-reviews',   title: 'Devex Review',    tag: 'devex-review' },
+  'retro':                { slugPrefix: 'retros',          title: 'Retro',           tag: 'retro' },
+  'ship':                 { slugPrefix: 'releases',        title: 'Release',         tag: 'release' },
+  'cso':                  { slugPrefix: 'security-audits', title: 'Security Audit',  tag: 'security-audit' },
+  'design-consultation':  { slugPrefix: 'design-systems',  title: 'Design System',   tag: 'design-system' },
+};

 export function generateGBrainContextLoad(ctx: TemplateContext): string {
  let base = `## Brain Context Load

-Before starting this skill, search your brain for relevant context:
+**Skip this entire section if \`gbrain\` is not on PATH.**

-1. Extract 2-4 keywords from the user's request (nouns, error names, file paths, technical terms).
-   Search GBrain: \`gbrain search "keyword1 keyword2"\`
-   Example: for "the login page is broken after deploy", search \`gbrain search "login broken deploy"\`
-   Search returns lines like: \`[slug] Title (score: 0.85) - first line of content...\`
-2. If few results, broaden to the single most specific keyword and search again.
-3. For each result page, read it: \`gbrain get_page "<page_slug>"\`
-   Read the top 3 pages for context.
-4. Use this brain context to inform your analysis.
+Extract 2-4 keywords from the user's request. Search the brain:
+\`gbrain search "<keywords>"\`. Read the top 3 results with
+\`gbrain get_page "<slug>"\`. Use that context to inform your analysis.

-If GBrain is not available or returns no results, proceed without brain context.
-Any non-zero exit code from gbrain commands should be treated as a transient failure.`;
+If \`gbrain search\` returns no results or any non-zero exit, proceed
+without brain context. Full search/read protocol + examples:
+see \`docs/gbrain-write-surfaces.md\` §Context Load.`;

  if (ctx.skillName === 'investigate') {
-    base += `\n\nIf the user's request is about tracking, extracting, or researching structured data (e.g., "track this data", "extract from emails", "build a tracker"), route to GBrain's data-research skill instead: \`gbrain call data-research\`. This skill has a 7-phase pipeline optimized for structured data extraction.`;
+    base += `\n\nFor structured-data extraction requests ("track this", "extract from emails", "build a tracker"), route to GBrain's data-research skill instead: \`gbrain call data-research\`.`;
  }

  return base;
 }

 export function generateGBrainSaveResults(ctx: TemplateContext): string {
-  // gbrain v0.18+ renamed `put_page` → `put <slug>` and moved --title/--tags
-  // into YAML frontmatter inside --content. These templates render into
-  // SKILL.md files as user-facing instructions; using the old subcommand
-  // ships broken copy-paste to every gstack user.
-  const skillSaveMap: Record<string, string> = {
-    'office-hours': 'Save the design document as a brain page:\n```bash\ngbrain put "office-hours/<project-slug>" --content "$(cat <<\'EOF\'\n---\ntitle: "Office Hours: <project name>"\ntags: [design-doc, <project-slug>]\n---\n<design doc content in markdown>\nEOF\n)"\n```',
-    'investigate': 'Save the root cause analysis as a brain page:\n```bash\ngbrain put "investigations/<issue-slug>" --content "$(cat <<\'EOF\'\n---\ntitle: "Investigation: <issue summary>"\ntags: [investigation, <affected-files>]\n---\n<investigation findings in markdown>\nEOF\n)"\n```',
-    'plan-ceo-review': 'Save the CEO plan as a brain page:\n```bash\ngbrain put "ceo-plans/<feature-slug>" --content "$(cat <<\'EOF\'\n---\ntitle: "CEO Plan: <feature name>"\ntags: [ceo-plan, <feature-slug>]\n---\n<scope decisions and vision in markdown>\nEOF\n)"\n```',
-    'retro': 'Save the retrospective as a brain page:\n```bash\ngbrain put "retros/<date>" --content "$(cat <<\'EOF\'\n---\ntitle: "Retro: <date range>"\ntags: [retro, <date>]\n---\n<retro output in markdown>\nEOF\n)"\n```',
-    'plan-eng-review': 'Save the architecture decisions as a brain page:\n```bash\ngbrain put "eng-reviews/<feature-slug>" --content "$(cat <<\'EOF\'\n---\ntitle: "Eng Review: <feature name>"\ntags: [eng-review, <feature-slug>]\n---\n<review findings and decisions in markdown>\nEOF\n)"\n```',
-    'ship': 'Save the release notes as a brain page:\n```bash\ngbrain put "releases/<version>" --content "$(cat <<\'EOF\'\n---\ntitle: "Release: <version>"\ntags: [release, <version>]\n---\n<changelog entry and deploy details in markdown>\nEOF\n)"\n```',
-    'cso': 'Save the security audit as a brain page:\n```bash\ngbrain put "security-audits/<date>" --content "$(cat <<\'EOF\'\n---\ntitle: "Security Audit: <date>"\ntags: [security-audit, <date>]\n---\n<findings and remediation status in markdown>\nEOF\n)"\n```',
-    'design-consultation': 'Save the design system as a brain page:\n```bash\ngbrain put "design-systems/<project-slug>" --content "$(cat <<\'EOF\'\n---\ntitle: "Design System: <project name>"\ntags: [design-system, <project-slug>]\n---\n<design decisions in markdown>\nEOF\n)"\n```',
-  };
+  // gbrain v0.18+ uses `gbrain put <slug>` (NOT the deprecated `put_page`
+  // MCP op). Compressed in v1.50.0.0: the inline heredoc + entity-stub +
+  // throttle + backlink prose moved to docs/gbrain-write-surfaces.md
+  // §Save Template, which the agent reads on demand when it actually
+  // saves. The compact pointer keeps non-gbrain users' token overhead
+  // near zero when their host's static suppression is overridden by
+  // detection.
+  const meta = skillSaveMap[ctx.skillName];

-  const saveInstruction = skillSaveMap[ctx.skillName] || 'Save the skill output as a brain page if the results are worth preserving:\n```bash\ngbrain put "<slug>" --content "$(cat <<\'EOF\'\n---\ntitle: "<descriptive title>"\ntags: [<relevant>, <tags>]\n---\n<content in markdown>\nEOF\n)"\n```';
+  if (!meta) {
+    return `## Save Results to Brain
+
+**Skip this entire section if \`gbrain\` is not on PATH.**
+
+If the skill output is worth preserving, save it via
+\`gbrain put "<slug>" --content "<frontmatter + markdown>"\`. Full template
+(heredoc body, frontmatter shape, entity-stub instructions, throttle
+handling): see \`docs/gbrain-write-surfaces.md\` §Save Template.`;
+  }

  return `## Save Results to Brain

-After completing this skill, persist the results to your brain for future reference:
+**Skip this entire section if \`gbrain\` is not on PATH.**

-${saveInstruction}
+After completing this skill, save the output:

-After saving the page, extract and enrich mentioned entities: for each actual person name or company/organization name found in the output, \`gbrain search "<entity name>"\` to check if a page exists. If not, create a stub page:
 \`\`\`bash
-gbrain put "entities/<entity-slug>" --content "$(cat <<'EOF'
+gbrain put "${meta.slugPrefix}/<feature-slug>" --content "$(cat <<'EOF'
 ---
-title: "<Person or Company Name>"
-tags: [entity, person]
+title: "${meta.title}: <feature name>"
+tags: [${meta.tag}, <feature-slug>]
 ---
-Stub page. Mentioned in <skill name> output.
+<skill output in markdown>
 EOF
 )"
 \`\`\`
-Only extract actual person names and company/organization names. Skip product names, section headings, technical terms, and file paths.

-Throttle errors appear as: exit code 1 with stderr containing "throttle", "rate limit", "capacity", or "busy". If GBrain returns a throttle or rate-limit error on any save operation, defer the save and move on. The brain is busy — the content is not lost, just not persisted this run. Any other non-zero exit code should also be treated as a transient failure.
-
-Add backlinks to related brain pages if they exist. If GBrain is not available, skip this step.
-
-After brain operations complete, note in your completion output: how many pages were found in the initial search, how many entities were enriched, and whether any operations were throttled. This helps the user see brain utilization over time.`;
+Then extract person/org entities and create stub pages for each one.
+Throttle errors (exit 1 with "throttle"/"rate limit"/"busy") and any
+other non-zero exit are transient — don't retry inline. Full entity-stub
+template, throttle handling, and backlink protocol:
+see \`docs/gbrain-write-surfaces.md\` §Save Template.`;
+}
+
+// ────────────────────────────────────────────────────────────────────
+// Brain-aware planning resolvers (T4 / v1.48 plan)
+// ────────────────────────────────────────────────────────────────────
+
+/**
+ * Returns true when this skill is registered for brain preflight. Skills not
+ * in SKILL_DIGEST_SUBSETS get an empty BRAIN_PREFLIGHT block (no behavior).
+ */
+function isPreflightSkill(skillName: string): boolean {
+  return Object.prototype.hasOwnProperty.call(SKILL_DIGEST_SUBSETS, skillName);
+}
+
+/**
+ * Renders the per-skill BRAIN_PREFLIGHT block. The rendered output is a single
+ * bash script that:
+ *   1. Reads each digest file from gstack-brain-cache get (one call per digest)
+ *   2. Falls back to a "_(no <entity> digest available yet)_" placeholder when a digest is missing
+ *   3. Concatenates outputs into a single ## Brain Context block injected
+ *      into the skill's prompt context
+ *   4. Tells the agent: "use this context to skip already-known questions"
+ *
+ * The cache CLI handles cold-refresh + lock dedup + stale-but-usable
+ * fallback internally. From the resolver's perspective the call is one
+ * shell command per digest.
+ */
+export function generateBrainPreflight(ctx: TemplateContext): string {
+  if (!isPreflightSkill(ctx.skillName)) return '';
+  const subset = getSkillSubset(ctx.skillName);
+  const binDir = ctx.paths.binDir;
+  // Build the bash that loads each digest. Per-skill subset is small (2-5 entries).
+  const loadLines = subset.map((entityName) => {
+    const entity = BRAIN_CACHE_ENTITIES[entityName];
+    if (!entity) return '';
+    const projectFlag = entity.scope === 'per-project' ? '--project "$SLUG"' : '';
+    return `  printf '\\n### %s\\n\\n' "${entityName}"\n  ${binDir}/gstack-brain-cache get ${entityName} ${projectFlag} 2>/dev/null || printf '_(no ${entityName} digest available yet)_\\n'`;
+  }).join('\n');
+
+  return `## Brain Context (preflight)
+
+Before asking any clarifying questions, load the brain's structured context
+for this project. The cache layer handles staleness, refresh, and
+stale-but-usable fallback automatically. Skip questions whose answers are
+already present in the loaded context; ground recommendations in what the
+brain already knows about the user, the product, the goals, and recent decisions.
+
+\`\`\`bash
+eval "$(${binDir}/gstack-slug 2>/dev/null)" 2>/dev/null || true
+{
+  printf '## Brain Context\\n\\n'
+${loadLines}
+} > /tmp/.gstack-brain-context-$$.md 2>/dev/null
+[ -s /tmp/.gstack-brain-context-$$.md ] && cat /tmp/.gstack-brain-context-$$.md
+rm -f /tmp/.gstack-brain-context-$$.md 2>/dev/null || true
+\`\`\`
+
+**How to use this context:**
+- If \`product\` digest names the value prop, target user, or stage — don't re-ask.
+- If \`goals\` digest lists active goals — frame recommendations against them.
+- If \`recent-decisions\` digest names a prior scope/architecture choice — flag if this plan contradicts.
+- If \`user-profile\` digest carries calibration pattern statements ("tends to over-engineer security") — surface them when relevant.
+- If a digest is \`(no X digest available yet)\`, treat that section as cold; ask the user.
+
+**Privacy:** Salience digest is filtered by allowlist (D9 default: \`projects/\`,
+\`gstack/\`, \`concepts/\` only). Personal/family/therapy content never leaks here.
+`;
+}
+
+/**
+ * Renders the at-skill-end background refresh hook. Fires after the skill's
+ * own work completes (telemetry has already logged); kicks any digest whose
+ * age exceeds half its TTL but hasn't yet expired, so the NEXT invocation
+ * gets a fresh cache without paying the cold-miss tax.
+ *
+ * Subordinate to {{TELEMETRY}} — runs after. Doesn't block the user.
+ */
+export function generateBrainCacheRefresh(ctx: TemplateContext): string {
+  if (!isPreflightSkill(ctx.skillName)) return '';
+  const binDir = ctx.paths.binDir;
+  return `## Brain Cache Background Refresh
+
+After the skill's work completes (and telemetry has logged), kick a
+background refresh of any cache digest that's getting close to its TTL.
+This is non-blocking — the user doesn't wait. Next invocation benefits
+from the warm cache.
+
+\`\`\`bash
+eval "$(${binDir}/gstack-slug 2>/dev/null)" 2>/dev/null || true
+(${binDir}/gstack-brain-cache refresh --project "$SLUG" 2>/dev/null &) || true
+\`\`\`
+`;
+}
+
+/**
+ * Renders the calibration write-back block. ONLY emits when the skill makes
+ * typed decisions worth a kind=bet take AND the brain trust policy is
+ * personal. Phase 2 / E5 cross-skill calibration.
+ *
+ * Gated behind BRAIN_CALIBRATION_WRITEBACK feature flag in the resolver
+ * output — the flag stays false until upstream gbrain ships takes_add MCP
+ * op (T8). When the flag flips, the existing skill templates pick up the
+ * write-back behavior without any template changes.
+ */
+export function generateBrainWriteBack(ctx: TemplateContext): string {
+  if (!isPreflightSkill(ctx.skillName)) return '';
+  const weight = SKILL_CALIBRATION_WEIGHTS[ctx.skillName];
+  if (weight == null) return '';
+  // List the cache digests this skill's writes should invalidate. Multiple
+  // skills write to multiple entities; the invalidation map captures this.
+  const invalidatesEntities = getInvalidationTargets(`/${ctx.skillName}`);
+  const invalidateBash = invalidatesEntities
+    .map((e) => `  ${ctx.paths.binDir}/gstack-brain-cache invalidate ${e} --project "$SLUG" 2>/dev/null || true`)
+    .join('\n');
+
+  return `## Brain Calibration Write-Back (Phase 2 / gated)
+
+When the skill makes a typed prediction worth tracking (scope decision,
+TTHW target, architectural bet, wedge commitment), it MAY write a
+\`kind=bet\` take to the brain so a calibration profile builds over time.
+
+**Gated on two things:**
+1. Brain trust policy for the active endpoint is \`personal\` (check via
+   \`${ctx.paths.binDir}/gstack-config get brain_trust_policy@<endpoint-hash>\`).
+   Shared brains skip write-back to avoid polluting team calibration.
+2. Feature flag \`BRAIN_CALIBRATION_WRITEBACK\` is set (today: false; flips
+   to true when upstream gbrain v0.42+ ships \`takes_add\` MCP op).
+
+When both gates pass, the write-back path uses \`mcp__gbrain__takes_add\`
+to record a take with weight ${weight} (per SKILL_CALIBRATION_WEIGHTS).
+If the MCP op is unavailable, fall back to \`mcp__gbrain__put_page\` with
+a gstack:takes fence block (documented but uglier path).
+
+Mandatory take frontmatter shape:
+\`\`\`yaml
+kind: bet
+holder: <user identity from whoami>
+claim: <one-line prediction the skill is making>
+weight: ${weight}
+since_date: <today's date>
+expected_resolution: <date in 1-3 months depending on skill>
+source_skill: ${ctx.skillName}
+\`\`\`
+
+After write, invalidate the affected digests so the next preflight reflects
+the new state:
+
+\`\`\`bash
+eval "$(${ctx.paths.binDir}/gstack-slug 2>/dev/null)" 2>/dev/null || true
+${invalidateBash || '  # (no per-skill invalidation targets configured)'}
+\`\`\`
+`;
 }
@@ -30,7 +30,7 @@ import { generateInvokeSkill } from './composition';
 import { generateReviewArmy } from './review-army';
 import { generateDxFramework } from './dx';
 import { generateModelOverlay } from './model-overlay';
-import { generateGBrainContextLoad, generateGBrainSaveResults } from './gbrain';
+import { generateGBrainContextLoad, generateGBrainSaveResults, generateBrainPreflight, generateBrainCacheRefresh, generateBrainWriteBack } from './gbrain';
 import { generateQuestionPreferenceCheck, generateQuestionLog, generateInlineTuneFeedback } from './question-tuning';
 import { generateMakePdfSetup } from './make-pdf';
 import { generateTasksSectionEmit, generateTasksSectionAggregate } from './tasks-section';
@@ -89,6 +89,9 @@ export const RESOLVERS: Record<string, ResolverValue> = {
  BIN_DIR: (ctx) => ctx.paths.binDir,
  GBRAIN_CONTEXT_LOAD: generateGBrainContextLoad,
  GBRAIN_SAVE_RESULTS: generateGBrainSaveResults,
+  BRAIN_PREFLIGHT: generateBrainPreflight,
+  BRAIN_CACHE_REFRESH: generateBrainCacheRefresh,
+  BRAIN_WRITE_BACK: generateBrainWriteBack,
  QUESTION_PREFERENCE_CHECK: generateQuestionPreferenceCheck,
  QUESTION_LOG: generateQuestionLog,
  INLINE_TUNE_FEEDBACK: generateInlineTuneFeedback,
@@ -25,7 +25,11 @@ export function generateQuestionTuning(ctx: TemplateContext): string {

 Before each AskUserQuestion, choose \`question_id\` from \`scripts/question-registry.ts\` or \`{skill}-{slug}\`, then run \`${bin}/gstack-question-preference --check "<id>"\`. \`AUTO_DECIDE\` means choose the recommended option and say "Auto-decided [summary] → [option] (your preference). Change with /plan-tune." \`ASK_NORMALLY\` means ask.

-After answer, log best-effort:
+**Embed the question_id as a marker in the question text** so hooks can identify it deterministically (plan-tune cathedral T14 / D18 progressive markers). Append \`<gstack-qid:{question_id}>\` somewhere in the rendered question (the leading line or trailing line is fine; the marker doesn't render visibly to the user when wrapped in HTML-style angle brackets, but the hook strips it). Without the marker the PreToolUse enforcement hook treats the AUQ as observed-only and never auto-decides — so always include it when the question matches a registered \`question_id\`.
+
+**Embed the option recommendation via the \`(recommended)\` label suffix** on exactly one option per AUQ. The PreToolUse hook parses \`(recommended)\` first, falls back to "Recommendation: X" prose, and refuses to auto-decide if ambiguous. Two \`(recommended)\` labels = refuse.
+
+After answer, log best-effort (PostToolUse hook also captures deterministically when installed; dedup on (source, tool_use_id) handles double-writes):
 \`\`\`bash
 ${bin}/gstack-question-log '{"skill":"${ctx.skillName}","question_id":"<id>","question_summary":"<short>","category":"<approval|clarification|routing|cherry-pick|feedback-loop>","door_type":"<one-way|two-way>","options_count":N,"user_choice":"<key>","recommended":"<key>","session_id":"'"$_SESSION_ID"'"}' 2>/dev/null || true
 \`\`\`