Files
gstack/lib/gbrain-sources.ts
T
Garry Tan 45cc95d5f4 v1.57.5.0 feat: cross-session decision memory + gbrain dream-stage call graph (#1910)
* feat(gbrain-sync): add cycleCompleted() cycle-state probe

Reads `gbrain doctor` cycle_freshness to classify whether a source has
completed a full cycle (completed/never/unknown). A fail naming this source
-> never; a fail naming only other sources -> completed; an absent or
unparseable check -> unknown, so an unrelated doctor failure never masks a
real state. Gates the automatic call-graph build on --full.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(gbrain-sync): --dream call-graph stage with lock-free gate + honest outcome guard

Adds a source-scoped `gbrain dream --source <id>` stage that builds this
worktree's call graph (code-callers/code-callees). Runs lock-free after the
sync lock releases so it never blocks sibling worktrees; a .dream-in-progress
marker dedupes concurrent dreams. --full auto-runs it only when the cycle was
never built; explicit --dream always forces; --no-dream opts out.

The stage parses the cycle's own output and reports the truth, not a flat
"built": a WARN when the schema pack can't extract code symbols, when the
embed phase failed for a missing key, or when 0 edges resolved; OK with the
resolved-edge count otherwise. gbrain exits 0 even when it skips on a held
cycle lock (e.g. autopilot), so that case reports SKIP, not success.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* chore: ignore gbrain .sources/ local staging dir

gbrain writes per-source staging and capability-check artifacts under
.sources/ in the repo root. It's machine-local runtime state, not source.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs(gbrain): honest call-graph guidance in /sync-gbrain + pin works on gbrain>=0.41.38

sync-gbrain frames the --dream offer honestly: building a call graph requires a
code-aware schema pack, and the dream stage reports a WARN when it can't. The
verdict's Call graph row mirrors the dream stage's real outcome instead of
assuming a completed cycle means edges exist. The ## GBrain Search Guidance
block written into CLAUDE.md drops the old code-callers --source caveat:
gbrain >=0.41.38.0 honors the .gbrain-source pin for code-callers/code-callees.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(jsonl-store): shared audited JSONL plumbing (injection-reject + atomic append + tolerant read)

Single source of truth extracted for D2A: gstack-learnings-* and the upcoming
gstack-decision-* bins share one injection-pattern list, one atomic single-line
appender, and one tolerant reader. No more drift between stores.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* refactor(learnings-log): use shared hasInjection from lib/jsonl-store (D2A)

Replace the inline injection-pattern copy with the shared list. One audited
write-path rejection across learnings + the upcoming decision store. Behavior
unchanged (35/35 learnings tests green); learnings-search keeps its inline copy
because a structural test pins its bash/bun shape.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(decision): event-sourced decision-memory model (lib/gstack-decision)

decide/supersede/redact events on lib/jsonl-store; active set is computed (no
mutable status), dangling refs tolerated. Free-text is injection-checked and
redact-scanned on write (HIGH secret -> reject). Scope filter (repo/branch/issue)
for relevant resurfacing. File-only + reliable; gbrain not required.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(decision): bounded active snapshot + compaction (redact expunges, supersede archives)

writeSnapshot/readSnapshot/rebuildSnapshot give an O(active) bounded read for the
session-start hot path (D1A). compact() rewrites the log to active, archives
superseded decisions for history, and EXPUNGES redacted ones (dropped, never
archived) so an accidentally-captured secret leaves the store for good.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(decision): gstack-decision-log + gstack-decision-search bins (non-interactive)

Two bins mirroring gstack-learnings-* (D3A). log writes decide/--supersede/--redact/
--compact events + refreshes the bounded snapshot + enqueues for cross-machine sync;
search reads the O(active) snapshot, scope-filtered to current branch, newest-first,
--all to include superseded, --json for machines. Empty store returns silently
(no snapshot write on an empty read).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(memory): surface active decisions at session start + capture nudge (Context Recovery)

Context Recovery now shows recent scope-relevant active decisions (bounded read of
decisions.active.json via gstack-decision-search) and instructs the agent to treat
them as settled calls and to log durable decisions/reversals. Closes the Phase-1
capture->curate->resurface loop, reliable + file-only. Regen across all hosts folded
in (squash-with-regen); parity 10/10, freshness green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test: refresh ship golden baselines for the memory-loop preamble change

Context Recovery now emits the cross-session-decisions block, so ship's preamble
(all hosts) changed. Golden baselines are hand-maintained copies (gen does not
write them); refresh them from the fresh gen so golden-file regression passes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs(memory): document the cross-session decision-memory loop in CLAUDE.md

Adds a '## Cross-session decision memory' section: how to resurface
(gstack-decision-search) and capture (gstack-decision-log) durable decisions,
the supersede/redact/compact verbs, and a crisp durable-vs-trivial definition
so the store stays signal. Reliable file-only path; gbrain not required.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(memory): emit durable decisions from ship/ceo/eng/spec at structured points

Wires the four skills that finalize real decisions to capture them in the
cross-session decision store, from their STRUCTURED outputs (never free-text
scraping):
- ship: the version bump (level + why) at write time
- plan-ceo-review: accepted scope + verdict (branch-scoped)
- plan-eng-review: the architecture verdict + key call (branch-scoped)
- spec: the filed issue's core approach (issue-scoped)

All emits are non-interactive, schema-correct (content in decision/rationale,
source=skill, confidence 1-10), and best-effort (|| true) so a decision-log
failure never blocks the workflow. Includes regen across hosts + refreshed ship
golden baselines.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(memory): optional gbrain --semantic recall for decision search

Adds gstack-decision-search --semantic (with --query): appends a 'Related from
memory' block from gbrain semantic search, scoped to the curated-memory source.
Pure enhancement, reliability-first: a new lib/gstack-decision-semantic.ts is the
ONLY decision module that touches gbrain and is imported lazily only on --semantic,
so the reliable file path never loads gbrain code. Every path degrades to the
reliable file results when gbrain is off, unconfigured, empty, or errors (never
throws, 10s timeout).

Built against the verified gbrain 0.42.x surface (text output [score] slug --
snippet, NOT JSON; curated-memory source resolved by worktree path, not a
gstack-brain-<user> id). Deterministic-contract tests only: parser units,
degrade-to-null when gbrain absent, and a fake-gbrain shim proving scope+search
end-to-end. find-contradictions deferred (no verifiable CLI surface yet + curated
memory not indexed).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(gbrain-sync): self-heal stale autopilot lock (dead-pid)

detectAutopilot treated a lock FILE as proof of life, so a crashed gbrain daemon
left a stale lock that wedged every sync forever (observed: a dead pid refused
--full indefinitely). Now read the holder pid (bare or JSON body) and check
liveness via signal-0: ESRCH=dead → ignore the stale signal and keep checking;
EPERM=alive (other user) → active. A stale lock never masks a live autopilot
process. Pure decision function — does not delete the file; the caller may clean it.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs(review): drop stray trailing code fence in TODOS-format

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(test): align section-loading E2E testNames with their TOUCHFILES keys

Pre-existing on main (v1.56.x): the two section-loading E2E tests used
human-label testNames ('/ship section-loading') that don't match their slug
keys ('ship-section-loading') in E2E_TOUCHFILES/E2E_TIERS. Every other E2E test
uses the slug as its testName, and the TOUCHFILES completeness gate requires
testName to be a registered key — so the gate was red. Align both testNames to
their slug keys (also fixes tier lookup for these two periodic tests).

Verified failing on a clean origin/main checkout before the fix.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix: pre-landing review fixes (datamark, DRY, compact, coverage)

Addresses the pre-landing review findings (all INFORMATIONAL, no criticals):
- security: datamark resurfaced decision text at the render boundary
  (lib/gstack-decision.ts datamark() — neutralizes code fences, --- banners,
  <|role|>/</system> markers, control chars, newlines). Applied in
  gstack-decision-search human output so stored text can't masquerade as
  instructions in Context Recovery (codex hardening #3 / AC #7). --json stays raw.
- DRY: extract resolveSlug/gitBranch/flagValue to lib/bin-context.ts; both
  decision bins use it instead of duplicating the helpers.
- compact(): batch the archive append (one write, not N) and shrink the
  mid-compact crash window; simplify the opaque branch/issue ternary.
- coverage: learnings-log injection rejection (D2A wiring), search --recent/
  --scope + NaN-safe --recent, datamark-applied, unparseable lock body,
  compact-empty, corrupt-snapshot degrade.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(security): close adversarial-review findings in decision memory

Adversarial review (Claude subagent) found a CRITICAL the specialist pass missed:
- F1 (CRITICAL): 'Human:'/'Assistant:' turn-prefixes bypassed BOTH the write-time
  denylist AND datamark(), landing verbatim in agent context inside the trusted
  ACTIVE DECISIONS fence. Add 'human:' (+ 'disregard previous', 'from now on') to
  the shared denylist, and have datamark() neutralize Human:/Assistant:/System:/User:
  turn-prefixes (ZWSP) at the render boundary.
- F2: datamark() only stripped ASCII C0; extend to Unicode line terminators
  (U+0085/2028/2029) and U+007F so 'strip newlines' actually holds.
- F3: validateDecide blocked only HIGH secrets; MEDIUM-tier PII (e.g. SSN) persisted
  silently and synced cross-machine. The store is non-interactive (no confirm path),
  so fail closed on MEDIUM too.
- F4: compact() was a lock-free read-modify-rewrite that could clobber a concurrent
  append (lost decision). Add an O_EXCL compact lock + a pre-rename size recheck that
  aborts untouched (skipped=true) if an append landed; caller re-runs.
- F7: filterByScope unknown/garbage scope fell through to 'return true' (leaked into
  every context); fail conservative (false).

F5 (pid reuse) and F6 (pgrep over-match) are intentionally left as-is: both fail SAFE
(over-refuse sync); making them precise would introduce a fail-DANGEROUS path
(allowing sync during a real autopilot). True disambiguation needs gbrain to stamp the
lock with a start-time, which gstack doesn't own. F8 (compact moves history to archive)
is by design.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(security): close cross-model (Codex) adversarial findings

Codex adversarial review found a HIGH the Claude pass missed plus 3 mediums:
- C1 (HIGH): gstack-decision-search --all returned every decide and IGNORED redact
  events, so a redacted secret still resurfaced via --all until compact ran. --all
  now excludes redacted (redact = expunge from every read path), still showing
  superseded history.
- C-med: semantic (external gbrain) slug/snippet were printed raw — datamark them too
  so a gbrain hit can't spoof role markers / fences into agent context.
- C4: semanticRecall fell back to an UNSCOPED gbrain search when no curated-memory
  source resolved, pulling code/doc corpora mislabeled as 'related decisions'. Now
  returns null (degrade) when there's no worktree-backed memory source.
- C5: validateDecide scanned only decision/rationale/alternatives; branch and issue
  are stored + surfaced (raw via --json), so include them in the injection+secret scan.

C2 (snapshot staleness) / C3 (compact TOCTOU residual): accepted for a single-user
store — atomic appends never lose the event, rebuilds self-heal, and the compact
size-recheck leaves only a sub-ms window; full append-locking would break the
lock-free append design.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v1.57.5.0)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 06:20:58 -07:00

277 lines
9.9 KiB
TypeScript

/**
* gbrain-sources — TypeScript helper for idempotent gbrain federated source registration.
*
* Mirrors the bash logic in bin/gstack-gbrain-source-wireup:204-310 but in a form
* importable by other TS callers (currently bin/gstack-gbrain-sync.ts; future
* callers welcome). gbrain has no `sources update` — drift recovery is
* `sources remove` followed by `sources add`.
*
* Per /plan-eng-review D3 (DRY extraction).
*/
import { execFileSync, spawnSync } from "child_process";
import { withErrorContext } from "./gstack-memory-helpers";
import { execGbrainJson, NEEDS_SHELL_ON_WINDOWS } from "./gbrain-exec";
export interface SourceState {
/** "absent" — id not registered. "match" — id at expected path. "drift" — id at different path. */
status: "absent" | "match" | "drift";
/** Path gbrain has registered for this id. Only set when status !== "absent". */
registered_path?: string;
}
export interface EnsureResult {
/** True if registration state changed (added or re-registered). False on no-op. */
changed: boolean;
/** Final source state after the call. */
state: SourceState;
}
/**
* One row of `gbrain sources list --json`. `config.remote_url` distinguishes
* URL-managed sources (gbrain owns the clone, may auto-reclone) from
* path-managed ones (user owns the working tree) — load-bearing for the #1734
* destructive-op guards.
*/
export interface GbrainSourceRow {
id?: string;
local_path?: string;
page_count?: number;
config?: { remote_url?: string | null } | null;
}
/**
* Normalize `gbrain sources list --json` output to an array of source rows.
*
* gbrain has shipped two shapes: a wrapped `{ sources: [...] }` object (v0.20+)
* and, in older/other variants, a bare top-level array. #1576 was a crash when a
* reader assumed one shape; the parse is centralized here so every reader
* (probeSource, sourcePageCount, sourceLocalPath, the #1734 remote_url audit)
* agrees on the shape in ONE place. Returns [] for null/garbage rather than
* throwing — callers treat "no rows" as absent.
*/
export function parseSourcesList(raw: unknown): GbrainSourceRow[] {
if (Array.isArray(raw)) return raw as GbrainSourceRow[];
if (raw && typeof raw === "object" && Array.isArray((raw as { sources?: unknown }).sources)) {
return (raw as { sources: GbrainSourceRow[] }).sources;
}
return [];
}
export interface EnsureOptions {
/** Pass --federated to `gbrain sources add`. Default false. */
federated?: boolean;
/** When status=drift, force a remove+add to update the registered path. Default true. */
reregister_on_drift?: boolean;
/**
* Optional env override for the spawned `gbrain` calls. Production callers
* leave this unset (inherit process.env). Tests pass a custom env to point
* at a fake `gbrain` on PATH (Bun's execFileSync does not respect runtime
* mutations of process.env.PATH unless env is passed explicitly).
*/
env?: NodeJS.ProcessEnv;
}
/**
* Probe the registration state of a source by id.
*
* Errors:
* - "gbrain CLI not on PATH" (exit 127) — caller should treat as absent + skip stage.
* - "gbrain DB connection failed" — caller should treat as absent + skip stage.
* - JSON parse error — propagate via withErrorContext caller.
*/
export function probeSource(id: string, env?: NodeJS.ProcessEnv): SourceState {
let stdout: string;
try {
stdout = execFileSync("gbrain", ["sources", "list", "--json"], {
encoding: "utf-8",
timeout: 30_000,
stdio: ["ignore", "pipe", "pipe"],
env,
shell: NEEDS_SHELL_ON_WINDOWS, // #1731: gbrain is a .cmd shim on Windows
});
} catch (err) {
const e = err as NodeJS.ErrnoException & { stderr?: Buffer };
const stderr = e.stderr?.toString() || "";
if (e.code === "ENOENT" || stderr.includes("command not found")) {
throw new Error("gbrain CLI not on PATH");
}
if (stderr.includes("Cannot connect to database") || stderr.includes("config.json")) {
throw new Error("gbrain not configured (run /setup-gbrain)");
}
throw err;
}
let parsed: unknown;
try {
parsed = JSON.parse(stdout);
} catch (err) {
throw new Error(`gbrain sources list returned non-JSON output: ${(err as Error).message}`);
}
const sources = parseSourcesList(parsed);
const match = sources.find((s) => s.id === id);
if (!match) return { status: "absent" };
return {
status: "match",
registered_path: match.local_path,
};
}
/**
* Ensure source <id> is registered at <path>. Idempotent.
*
* Behavior:
* - status=absent → `gbrain sources add <id> --path <path> [--federated]`, returns changed=true.
* - status=match + same path → no-op, returns changed=false.
* - status=match + different path → `sources remove` + `sources add`, returns changed=true.
* (Skip when reregister_on_drift=false; returns changed=false.)
*
* Caller is responsible for catching errors. The function uses withErrorContext for
* forensic logging to ~/.gstack/.gbrain-errors.jsonl.
*/
export async function ensureSourceRegistered(
id: string,
path: string,
options: EnsureOptions = {}
): Promise<EnsureResult> {
const federated = options.federated ?? false;
const reregister_on_drift = options.reregister_on_drift ?? true;
const env = options.env;
return withErrorContext(`ensureSourceRegistered:${id}`, () => {
const probed = probeSource(id, env);
// Disambiguate match-but-different-path
let state: SourceState = probed;
if (probed.status === "match" && probed.registered_path !== path) {
state = { status: "drift", registered_path: probed.registered_path };
}
if (state.status === "match") {
return { changed: false, state };
}
if (state.status === "drift" && !reregister_on_drift) {
return { changed: false, state };
}
// For drift, remove first.
if (state.status === "drift") {
const rm = spawnSync("gbrain", ["sources", "remove", id, "--yes"], {
encoding: "utf-8",
timeout: 30_000,
env,
shell: NEEDS_SHELL_ON_WINDOWS, // #1731: gbrain is a .cmd shim on Windows
});
if (rm.status !== 0) {
throw new Error(`gbrain sources remove ${id} failed: ${rm.stderr || rm.stdout || `exit ${rm.status}`}`);
}
}
// Add.
const addArgs = ["sources", "add", id, "--path", path];
if (federated) addArgs.push("--federated");
const add = spawnSync("gbrain", addArgs, {
encoding: "utf-8",
timeout: 30_000,
env,
shell: NEEDS_SHELL_ON_WINDOWS, // #1731: gbrain is a .cmd shim on Windows
});
if (add.status !== 0) {
throw new Error(`gbrain sources add ${id} failed: ${add.stderr || add.stdout || `exit ${add.status}`}`);
}
return {
changed: true,
state: { status: "match", registered_path: path },
};
}, "gbrain-sources");
}
/**
* Get page_count for a registered source. Returns null if source is absent or if
* page_count is missing/invalid in the JSON. Used by the verdict block + preamble
* variant selection.
*/
export function sourcePageCount(id: string, env?: NodeJS.ProcessEnv): number | null {
let stdout: string;
try {
stdout = execFileSync("gbrain", ["sources", "list", "--json"], {
encoding: "utf-8",
timeout: 30_000,
stdio: ["ignore", "pipe", "pipe"],
env,
shell: NEEDS_SHELL_ON_WINDOWS, // #1731: gbrain is a .cmd shim on Windows
});
} catch {
return null;
}
try {
const match = parseSourcesList(JSON.parse(stdout)).find((s) => s.id === id);
if (!match) return null;
if (typeof match.page_count !== "number") return null;
return match.page_count;
} catch {
return null;
}
}
/**
* Whether a source's call graph has been built.
*
* "completed" — `gbrain dream` has run a full maintenance cycle, so the
* brain-global `resolve_symbol_edges` phase populated this
* source's call graph (`gbrain code-callers`/`code-callees`
* return edges).
* "never" — a cycle has provably NOT completed for this source.
* "unknown" — doctor is unavailable, unparseable, or reports a failure
* that doesn't name this source. Callers MUST treat unknown
* conservatively (the orchestrator skips auto-dream and WARNs
* rather than launch a ~35-min cycle on a flaky-doctor signal —
* see the `gbrain-doctor-overstrict` learning).
*/
export type CycleStatus = "completed" | "never" | "unknown";
interface DoctorCheck {
name?: string;
status?: string;
message?: string;
}
interface DoctorReport {
checks?: DoctorCheck[];
}
/**
* Read `gbrain doctor --json --fast` and decide whether <sourceId>'s call
* graph is built, by inspecting the `cycle_freshness` check.
*
* Decision table (cycle_freshness.status / message):
* - ok → "completed"
* - fail|warn AND message names <sourceId> → "never"
* - fail|warn AND message omits <sourceId> → "unknown" (a real failure
* about OTHER sources must not be silently read as completed for us)
* - check absent / doctor null / other status → "unknown"
*
* `sourceId` is matched as a LITERAL substring (not a regex) so an id with
* regex metacharacters can never misfire. Routes through `execGbrainJson` so
* DATABASE_URL is seeded from gbrain's config (consistent with every other
* gstack-side gbrain call). `env` is the caller's base env (tests inject a
* shim on PATH).
*/
export function cycleCompleted(sourceId: string, env?: NodeJS.ProcessEnv): CycleStatus {
const report = execGbrainJson<DoctorReport>(["doctor", "--json", "--fast"], { baseEnv: env });
if (!report || !Array.isArray(report.checks)) return "unknown";
const check = report.checks.find((c) => c.name === "cycle_freshness");
if (!check) return "unknown";
if (check.status === "ok") return "completed";
if (check.status === "fail" || check.status === "warn") {
const msg = check.message || "";
return msg.includes(sourceId) ? "never" : "unknown";
}
return "unknown";
}