Merge remote-tracking branch 'origin/main' into garrytan/monterrey-v3

Garry Tan
2026-05-13 07:16:14 -07:00
25 changed files with 2144 additions and 231 deletions
+155
@@ -1,5 +1,160 @@
# Changelog
## [1.33.2.0] - 2026-05-11
## **`./setup` no longer pollutes the global install when run from a Conductor worktree.**
## **Six-line bash guard catches the BSD `ln -snf` footgun that was leaking per-worktree symlinks into `~/.claude/skills/gstack/`.**
When you ran `./setup` from a Conductor worktree of the gstack repo itself (e.g. `~/conductor/workspaces/gstack/dublin-v1`), it would silently corrupt your global install. The "register this checkout as the active gstack" branch did `ln -snf "$SOURCE_GSTACK_DIR" "$HOME/.claude/skills/gstack"`. On macOS and BSD, when the destination is an existing real directory (your global git clone), `ln -snf` does NOT replace it. It creates a child symlink INSIDE: `~/.claude/skills/gstack/dublin-v1 → ~/conductor/workspaces/gstack/dublin-v1`. Claude Code reads every directory in `~/.claude/skills/` that contains a `SKILL.md`, so each leaked worktree showed up as its own top-level skill: `/dublin-v1`, `/wellington`, `/santiago-v1`, etc. The skill picker filled with noise.
The fix in `setup` checks whether `~/.claude/skills/gstack` is already a real (non-symlink) directory whose resolved `pwd -P` differs from `$SOURCE_GSTACK_DIR`. If so, refuse the `ln -snf`, print a four-line remediation hint, and exit the Claude registration branch cleanly. Binaries (`browse`, `design`, `make-pdf`, `find-browse`) still build locally for dev. The four other code paths through the same branch (fresh install, retarget existing symlink, self-rerun pointing to the same dir, `--local`) are unchanged.
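The guard's decision can be sketched as follows. This is a minimal illustration in TypeScript, not the bash that `setup` actually runs; `isRealDirectory`, `shouldSkipRegister`, and the temp-directory layout are hypothetical names introduced only for this example.

```typescript
import { lstatSync, mkdirSync, mkdtempSync, realpathSync, symlinkSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// True only for a real directory. lstat does not follow symlinks, so a
// symlink that points at a directory reports false here.
function isRealDirectory(p: string): boolean {
  try {
    return lstatSync(p).isDirectory();
  } catch {
    return false; // path does not exist
  }
}

// Hypothetical mirror of the guard: skip registration when the link path is
// a real directory whose physical path differs from the source checkout
// (realpathSync plays the role of `pwd -P`).
function shouldSkipRegister(linkPath: string, sourceDir: string): boolean {
  return isRealDirectory(linkPath) && realpathSync(linkPath) !== realpathSync(sourceDir);
}

// Demo against a throwaway temp directory.
const root = mkdtempSync(join(tmpdir(), "guard-demo-"));
const globalInstall = join(root, "global");
const worktree = join(root, "worktree");
const linked = join(root, "linked");
mkdirSync(globalInstall);
mkdirSync(worktree);
symlinkSync(worktree, linked);

console.log(shouldSkipRegister(globalInstall, worktree));         // true: real dir elsewhere
console.log(shouldSkipRegister(globalInstall, globalInstall));    // false: self-rerun
console.log(shouldSkipRegister(linked, worktree));                // false: existing symlink
console.log(shouldSkipRegister(join(root, "missing"), worktree)); // false: fresh install
```

The four calls correspond to the skip case and the three unchanged paths (self-rerun, retarget existing symlink, fresh install).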
### The numbers that matter
Source: `bun test test/setup-conductor-worktree.test.ts` — 8 tests covering every branch of the new guard plus a behavioral reproduction of the BSD `ln -snf` bug itself.
| Scenario | Before | After |
|---|---|---|
| `./setup` from worktree A with global install present | Leaks `~/.claude/skills/gstack/A → workspaces/gstack/A` | Skipped with remediation hint |
| `./setup` from N sibling worktrees over a week | N child symlinks accumulate inside global install | 0 leaks |
| Claude Code skill picker shows extra entries | Yes: `dublin-v1`, `wellington`, `santiago-v1`, etc. | No |
| Fresh install (no existing global) | Worked | Worked (unchanged path) |
| Re-running `./setup` from inside the global install | Worked | Worked (unchanged path) |
| Test coverage of the guard | 0 tests | 8 tests, all branches |
The behavioral test in `test/setup-conductor-worktree.test.ts` actually invokes `ln -snf SRC DST` against a real tmpdir to prove the macOS/BSD child-symlink behavior happens, then re-runs with the new guard to prove the leak doesn't. The bug is now documented in the test suite, not just the patch.
### What this means for builders
If you've been seeing extra top-level skills (`/dublin-v1`, `/wellington`, etc.) in Claude Code, that's the leak. Run `/gstack-upgrade` to pick up this fix, then manually remove the existing child symlinks: `cd ~/.claude/skills/gstack && find . -maxdepth 1 -type l -delete`. The guard prevents new leaks from `./setup` runs in any Conductor worktree of the gstack repo. If you actually want to register a worktree as the active gstack (rare, usually only when dogfooding a big in-progress change), remove the global install first: `rm -rf ~/.claude/skills/gstack && cd <your-worktree> && ./setup`.
### Itemized changes
#### Fixed
- **`setup`** — added Conductor worktree guard before `ln -snf "$SOURCE_GSTACK_DIR" "$CLAUDE_GSTACK_LINK"`. Checks `[ -d "$CLAUDE_GSTACK_LINK" ] && [ ! -L "$CLAUDE_GSTACK_LINK" ]` for a real directory, then `cd ... && pwd -P` to compare against the source. If they differ, sets `_SKIP_CLAUDE_REGISTER=1`, prints a remediation message naming both paths, and exits the Claude registration branch without touching the global install.
#### Added
- **`test/setup-conductor-worktree.test.ts`** — 8 tests (27 expect calls) covering: guard placement in `setup` before `ln -snf`, `pwd -P` resolution against `$SOURCE_GSTACK_DIR`, the skip-branch's remediation message, BSD `ln -snf` reproducer (proves the bug shape exists), guard skips when dest is real-dir-elsewhere, guard allows ln when dest doesn't exist, guard allows ln when dest is an existing symlink (upgrade-in-place), guard allows ln when dest already resolves to source (self-rerun).
#### For contributors
- The guard intentionally does NOT clean up pre-existing pollution inside `~/.claude/skills/gstack/`. Users must remove leaked symlinks manually (see "What this means for builders" above). Retroactive cleanup would require a separate migration script, filed for a future release if the manual remediation friction becomes noticeable.
## [1.33.1.0] - 2026-05-11
## **Long skills stop drifting away from their starting context.**
## **`/investigate`, `/qa`, and `/ship` now pull learnings keyed to what they're actually about, and refresh that pull mid-flow as work shifts to new sub-tasks.**
For the last 30+ versions, every gstack skill loaded learnings the same way: `gstack-learnings-search --limit 10` at the top, generic top-10 by confidence, no query, no refresh. Short skills were fine because they finish before the loaded learnings go stale. Long skills (`/investigate` walks 4 phases, `/qa` runs a multi-bug fix loop, `/ship` covers ~20 steps from test to bump to PR) drifted away from whatever was loaded at minute zero. By the time `/ship` reaches Step 12 (VERSION bump), the learnings it pulled at Step 1 reflect whichever entries had the highest confidence in your project, not the headline feature you're shipping.
Two changes ship in this release: per-skill task-shaped queries at the top of the three long skills, and a mid-flow refresh checkpoint inside each one that re-pulls learnings keyed to the sub-task that's about to begin. Both rely on a fix to `bin/gstack-learnings-search` itself. The binary's `--query` flag previously used whole-string substring match against key/insight/files, so a query like `"debug investigation"` would only match a learning whose insight contained that exact contiguous phrase. The flag is now token-OR: split on whitespace, match if ANY token appears in any haystack field. This is what most users expect from a search flag.
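The difference between the two matching modes can be illustrated with a small sketch. The `Learning` shape follows the key/insight/files fields named above; the helper names and the sample entry are made up for illustration and are not taken from the binary.

```typescript
interface Learning {
  key?: string;
  insight?: string;
  files?: string[];
}

// Lowercased fields that a query is matched against.
function haystacks(e: Learning): string[] {
  return [e.key ?? "", e.insight ?? "", ...(e.files ?? [])].map((s) => s.toLowerCase());
}

// Old behavior: the whole query must appear contiguously in one field.
function matchesSubstring(e: Learning, query: string): boolean {
  const q = query.toLowerCase();
  return haystacks(e).some((h) => h.includes(q));
}

// New behavior: any whitespace-separated token may appear in any field.
// Note: an empty query yields no tokens and returns false here; a caller
// is assumed to skip filtering entirely when no query is given.
function matchesTokenOr(e: Learning, query: string): boolean {
  const tokens = query.toLowerCase().split(/\s+/).filter(Boolean);
  const fields = haystacks(e);
  return tokens.some((t) => fields.some((h) => h.includes(t)));
}

// Made-up sample entry.
const sample: Learning = {
  key: "flaky-login",
  insight: "Root cause was a stale session cookie",
  files: ["src/auth.ts"],
};

console.log(matchesSubstring(sample, "debug investigation root cause")); // false
console.log(matchesTokenOr(sample, "debug investigation root cause"));   // true ("root" and "cause" match)
console.log(matchesSubstring(sample, "cookie"), matchesTokenOr(sample, "cookie")); // true true
```

The last line shows why single-word queries behave the same in both modes: one token is one substring.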
### The numbers that matter
Source: this project's local `learnings.jsonl` (35 entries as of this release). Same query, same flag, before vs after the binary fix:
| Query | Before (substring) | After (token-OR) | Δ |
|-------|-------------------|------------------|---|
| `"debug investigation root cause"` | 0 entries matched | 5 entries matched | +5 |
| `"qa testing bug regression"` | 0 entries matched | 2 entries matched | +2 |
| `"release ship version changelog"` | 0 entries matched | 8 entries matched | +8 |
| `"skill resolver"` | 0 entries matched | 12 entries matched | +12 |
Recall on the static skill-shaped queries went from zero matches to between 2 and 12 matches per query. Without this fix, the rest of the change would have failed silently: the bash would run, the binary would exit 0 with no output, and the skill would render the same empty section it always rendered.
### What this means for builders
If you run `/investigate` on a bug, the top-of-skill learnings pull now surfaces prior investigation patterns instead of unrelated top-10 confidence entries. When you finish Phase 1 (naming a root-cause hypothesis), a mid-flow refresh fires and re-pulls learnings keyed to your hypothesis keyword, so prior fixes for the same problem-shape land in the agent's context right when they're relevant. Same pattern for `/qa` (refresh before the fix loop, keyed to the buggy component) and `/ship` (refresh before the VERSION/CHANGELOG step, keyed to the headline feature). The other 13 short-lived skills are unchanged: their existing top-10 generic pull is still right for their attention span.
### Itemized changes
#### Changed
- **`bin/gstack-learnings-search`** now uses token-OR `--query` semantics. Multi-word queries split on whitespace and match if ANY token appears as a substring in ANY of key/insight/files. Single-word queries behave exactly as before. No flag changes; same CLI surface. The old whole-string substring behavior was a silent footgun that returned nothing on real-world learnings stores. New test file `test/gstack-learnings-search.test.ts` covers the three branches (multi-token, single-token, no-query backwards compat).
- **`scripts/resolvers/learnings.ts`** `{{LEARNINGS_SEARCH}}` macro now accepts a `query=KEYWORD` argument. Empty value falls through to no-query (principle of least surprise: a stray `{{LEARNINGS_SEARCH:query=}}` placeholder gets today's behavior, not a build failure). Pattern reuses the parameterized-macro infrastructure from `composition.ts`. The 13 templates that don't pass a query stay byte-identical in their generated SKILL.md output. Shell-injection guard: the query value is whitelisted to `^[A-Za-z0-9 _-]+$` at gen-skill-docs time, so any `$()`, backticks, semicolons, or quotes in a future template throw a loud build error instead of emitting executable bash.
- **`investigate/SKILL.md.tmpl`** top-of-skill learnings pull keyed to `debug investigation root cause hypothesis bug fix`. New mid-flow refresh block between Phase 1 (hypothesis) and Phase 2 (analysis) instructs the agent to pick one alphanumeric-only keyword from the hypothesis and re-pull. Worked examples included (good: `auth-cookie`, `session-expiry`; bad: `auth.ts:47`, `<hypothesis-keyword>`).
- **`qa/SKILL.md.tmpl`** top-of-skill pull keyed to `qa testing bug regression flake fixture`. Mid-flow refresh inserted between Phase 7 (triage) and Phase 8 (fix loop), keyed to the buggy component name.
- **`ship/SKILL.md.tmpl`** top-of-skill pull keyed to `release ship version changelog merge pr`. Mid-flow refresh inserted just before Step 12 (VERSION bump), keyed to the headline feature on this branch.
- **`test/gen-skill-docs.test.ts`** 5 new resolver assertions: no-args has no `--query`, claude+query=foo bar appears in BOTH cross-project and project-scoped branches, codex host gets `--query` in the codex bash variant, empty value `query=` falls through to no-query, AND shell-injection payloads (`$(whoami)`, backticks, `;`, `&`, `"`, `\`, `$x`) throw a build error.
- **All generated `SKILL.md` files for the 3 long skills + 4 host outputs** regenerated. The other 13 skills' generated output is byte-identical (backwards-compat verified via diff).
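The build-time whitelist described for the `{{LEARNINGS_SEARCH}}` macro can be sketched like this. The regular expression is the one quoted above; `queryFlag` and the exact shape of the emitted flag fragment are assumptions made for the example, not the resolver's real code.

```typescript
// Character class quoted in the changelog entry above.
const SAFE_QUERY = /^[A-Za-z0-9 _-]+$/;

// Hypothetical helper: returns the flag fragment to emit into generated
// bash, or throws so the build fails loudly on an unsafe value.
function queryFlag(raw: string): string {
  if (raw === "") return ""; // empty value falls through to no-query
  if (!SAFE_QUERY.test(raw)) {
    throw new Error(`unsafe query value: ${JSON.stringify(raw)}`);
  }
  return ` --query "${raw}"`;
}

console.log(queryFlag("release ship version changelog"));
console.log(JSON.stringify(queryFlag("")));

for (const payload of ["$(whoami)", "`id`", "a;b", "a&b", 'say "hi"', "back\\slash", "$x"]) {
  try {
    queryFlag(payload);
    console.log("accepted (unexpected):", payload);
  } catch (err) {
    console.log("rejected:", payload);
  }
}
```

Rejecting at generation time, rather than escaping at run time, means a bad template never produces a `SKILL.md` that contains executable shell.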
#### For contributors
- Contributed by @Fergtic ([chronicle-write-up](https://github.com/Fergtic/chronicle-write-up)) who flagged the load-once + no-refresh pattern and the spend-per-success data point that motivated this work. The static-skill query expansion was also informed by a key fact-check from Codex outside-voice review: the binary's `--query` was single-substring match, not token-OR, which silently invalidated any multi-word query in the wild.
## [1.33.0.0] - 2026-05-11
## **`/sync-gbrain` memory stage no longer infinite-loops or silently throws away progress.**
## **Per-file gitleaks scanning is opt-in, signal handling actually kills the gbrain child, and state writes are atomic.**
`/sync-gbrain` memory ingest used to spawn `gitleaks detect` plus `gbrain put` once per file across 1,841+ transcripts and artifacts, then the orchestrator SIGTERM'd the whole pipeline at 35 minutes with no state flush. Every cold run started from zero and burned 35 minutes for nothing. v1.33 rewrites the memory stage around `gbrain import <dir>` (batch path that's been in gbrain since v0.20). The prepare phase walks sources, parses transcripts and artifacts, writes prepared markdown into a hierarchical staging directory mirroring slug structure, then invokes `gbrain import` once. Per-file failures get read back from `~/.gbrain/sync-failures.jsonl` via a byte-offset snapshot so the state file only records files that actually landed in PGLite. `--scan-secrets` is now an opt-in flag because `gstack-brain-sync` already runs a regex-based secret scanner at the actual cross-machine boundary (git push), making per-file ingest scans redundant defense-in-depth that cost ~470 seconds on every cold run.
The signal handler now propagates `SIGTERM` and `SIGINT` to the gbrain child and synchronously cleans up the staging directory before `process.exit`, fixing the orphan-process bug that left gbrain holding the PGLite write lock and burning CPU for hours after the orchestrator gave up. State file writes use `tmp+rename` for atomicity so a crash mid-write can't truncate the ingest state. The full-file `sha256` change detection (was capped at 1MB) catches tail edits to long partial transcripts that the old algorithm silently missed.
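A minimal sketch of the forward-then-clean-up-then-exit ordering follows. It assumes a handler factory with injectable dependencies so the demo can run without sending real signals; `makeSignalHandler`, the fake child, and the exit codes (the conventional 128 + signal number) are illustrative assumptions, not the code in `bin/gstack-memory-ingest.ts`.

```typescript
import type { ChildProcess } from "node:child_process";
import { existsSync, mkdtempSync, rmSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

type Killable = Pick<ChildProcess, "kill">;

// Hypothetical factory: the returned handler forwards the signal to the
// live child, removes the staging directory synchronously, then exits.
// Cleanup is synchronous because nothing async runs after process.exit.
function makeSignalHandler(
  getChild: () => Killable | null,
  getStagingDir: () => string | null,
  exit: (code: number) => void = (code) => process.exit(code),
): (signal: "SIGTERM" | "SIGINT") => void {
  return (signal) => {
    const child = getChild();
    if (child) child.kill(signal); // forward so the child is not orphaned
    const dir = getStagingDir();
    if (dir) rmSync(dir, { recursive: true, force: true });
    exit(signal === "SIGINT" ? 130 : 143); // assumed exit codes
  };
}

// Demo with a fake child and a fake exit, so no process is actually killed.
const events: string[] = [];
const staging = mkdtempSync(join(tmpdir(), "staging-demo-"));
const fakeChild: Killable = {
  kill: (sig) => {
    events.push(`kill:${String(sig)}`);
    return true;
  },
};
const handler = makeSignalHandler(
  () => fakeChild,
  () => staging,
  (code) => {
    events.push(`exit:${code}`);
  },
);
handler("SIGTERM");
console.log(events, existsSync(staging)); // [ 'kill:SIGTERM', 'exit:143' ] false
```

In real use the handler would be registered with `process.on("SIGTERM", () => handler("SIGTERM"))` and the same for `SIGINT`, with `getChild` returning the process started by `spawn`.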
### The numbers that matter
Source: live run on `~/.gstack/projects/` corpus (5,135 transcripts + artifacts), `bin/gstack-memory-ingest.ts --bulk` on a fresh PGLite at gbrain v0.31.2.
| Metric | Before (v1.31.x) | After (v1.33) | Δ |
|---|---|---|---|
| Cold run completes | no, 35-min loop + null exit | yes | works |
| Prepare phase time (5,135 files) | ~10-12 min | <10 sec | ~60x |
| Per-file gitleaks scans | 1,841 mandatory | 0 by default, opt-in via `--scan-secrets` | gated |
| State file flushed on SIGTERM | no, loss-on-kill | yes, sync cleanup before exit | fixed |
| Orphan gbrain child after timeout | yes, observed 15hr CPU drain | no, signal forwarded | fixed |
| FILE_TOO_LARGE blocks all advancement | yes | no, failed paths excluded via D7 | fixed |
| Tests in `test/gstack-memory-ingest.test.ts` | 17 | 21 | +4 |
| Decision | What landed |
|---|---|
| D1 hierarchical staging | `writeStaged` does `mkdir -p` per slug segment |
| D2 cut over | `gbrainPutPage` deleted, no `--legacy-ingest` flag |
| D3 source-first secret scan | Scan opt-in via `--scan-secrets`, default off |
| D4 OK/ERR verdict | Per-file failures show in summary but only system errors mark ERR |
| D5 unified state schema | No separate skip-list file |
| D6 trust idempotency | gbrain's content_hash dedup makes reruns cheap |
| D7 sync-failures byte-offset | `readNewFailures` reads only appended bytes since pre-import snapshot |
| F6 atomic state writes | `tmp+rename` instead of direct overwrite |
| F9 full-file sha256 | Removes 1MB cap that silently swallowed tail edits |
Prepare phase dropped from ~10 minutes to <10 seconds because the dominant cost was `gitleaks detect` cold start (~256ms per file, 5,135 files = 22 minutes of subprocess startup). The cross-machine secret boundary is `git push`, and `gstack-brain-sync` already runs its own regex scanner there. Local PGLite ingest of files that already live on disk in plaintext doesn't change exposure. The opt-in flag survives for users who want per-file ingest scanning, but it's no longer the default tax on every cold run.
### What this means for builders
If you've been hitting the 35-minute hang on `/sync-gbrain`, it's gone: run `/sync-gbrain` after upgrading. The gstack side of the architecture is now correct. A separate `gbrain import` performance issue surfaced during testing: the gbrain CLI itself takes >10 minutes on 5,131-file staging dirs (10 seconds on 501 files). It is filed as a P2 TODO for gbrain proper. That's the next bottleneck to chase, but it lives in gbrain's import path, not in the gstack orchestrator.
### Itemized changes
#### Added
- `bin/gstack-memory-ingest.ts:1093` `preparePages` pure function: walk sources, mtime-skip via state, optional gitleaks scan (`--scan-secrets`), parse transcripts and artifacts, render frontmatter with `title`/`type`/`tags` injected.
- `bin/gstack-memory-ingest.ts:920` `writeStaged` writes prepared markdown into a hierarchical staging directory mirroring slug structure. `mkdir -p` per slug segment. Slugs containing `/` (like `transcripts/claude-code/foo`) get the matching subdirectory tree so gbrain's path-authoritative `slugifyPath` round-trips exactly.
- `bin/gstack-memory-ingest.ts:961` `parseImportJson` reads gbrain's `--json` last-line payload. Returns `null` (treated as `system_error` by caller) instead of silently returning zeroed counts when the line doesn't parse.
- `bin/gstack-memory-ingest.ts:993` `readNewFailures` snapshots `~/.gbrain/sync-failures.jsonl` byte offset before import, reads only appended bytes after, maps gbrain's staging-relative paths back to source paths via the `stagedPathToSource` map.
- `bin/gstack-memory-ingest.ts:1009` `runGbrainImport` async wrapper around `child_process.spawn` so the signal forwarder has a child reference to kill on parent `SIGTERM`/`SIGINT`. Pre-2026-05-11 `spawnSync` made signal forwarding impossible and gbrain orphaned every time the orchestrator timed out.
- `bin/gstack-memory-ingest.ts:1218` `installSignalForwarder` registers `SIGTERM`/`SIGINT` handlers that forward to the live child, synchronously clean up the active staging directory, then exit. Async `finally` blocks don't run after `process.exit` from inside a signal handler, so cleanup has to happen in the handler itself.
- `bin/gstack-memory-ingest.ts:194` `--scan-secrets` CLI flag and `GSTACK_MEMORY_INGEST_SCAN_SECRETS=1` env var to opt back into per-file gitleaks scanning during the prepare phase. Off by default.
- `test/gstack-memory-ingest.test.ts:457` — 5 new tests covering hierarchical staging slug round-trip, frontmatter injection, D7 sync-failures exclusion, missing-`import`-subcommand error path, and `--scan-secrets` dirty-source skipping with a fake gitleaks shim.
- `docs/designs/SYNC_GBRAIN_BATCH_INGEST.md` — full design doc with D1-D8 decisions, source-verified gbrain behaviors, performance measurements, F9 hash migration notes.
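The byte-offset technique behind `readNewFailures` can be sketched as below. This is an illustration under assumptions: the record field `path`, the helper names, and the sample paths are hypothetical, and the real failure-log schema in gbrain may differ.

```typescript
import { appendFileSync, closeSync, existsSync, mkdtempSync, openSync, readSync, statSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Size of the log before the import starts (0 if it does not exist yet).
function snapshotOffset(path: string): number {
  return existsSync(path) ? statSync(path).size : 0;
}

// Read only the bytes appended after `offset`, split into JSONL lines.
function readAppendedLines(path: string, offset: number): string[] {
  if (!existsSync(path)) return [];
  const size = statSync(path).size;
  if (size <= offset) return [];
  const buf = Buffer.alloc(size - offset);
  const fd = openSync(path, "r");
  try {
    const n = readSync(fd, buf, 0, buf.length, offset);
    return buf.subarray(0, n).toString("utf-8").split("\n").filter(Boolean);
  } finally {
    closeSync(fd);
  }
}

// Map staged paths from new failure records back to source paths.
// The `path` field name is an assumption for this sketch.
function failedSources(lines: string[], stagedToSource: Map<string, string>): Set<string> {
  const failed = new Set<string>();
  for (const line of lines) {
    try {
      const rec = JSON.parse(line) as { path?: string };
      const src = rec.path ? stagedToSource.get(rec.path) : undefined;
      if (src) failed.add(src);
    } catch {
      // Ignore malformed lines in this sketch.
    }
  }
  return failed;
}

// Demo: one stale record from an earlier run, one appended during "import".
const log = join(mkdtempSync(join(tmpdir(), "failures-demo-")), "sync-failures.jsonl");
writeFileSync(log, JSON.stringify({ path: "old/run.md" }) + "\n");
const before = snapshotOffset(log);
appendFileSync(log, JSON.stringify({ path: "transcripts/a.md" }) + "\n");

const stagedToSource = new Map([
  ["old/run.md", "/src/old.jsonl"],
  ["transcripts/a.md", "/src/a.jsonl"],
]);
const failed = failedSources(readAppendedLines(log, before), stagedToSource);
console.log(Array.from(failed)); // [ '/src/a.jsonl' ]
```

Snapshotting the offset first means failures logged by earlier runs are never attributed to the current import.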
#### Changed
- `bin/gstack-memory-ingest.ts:288` `saveState` now uses `tmp+rename` for atomicity (F6) so a crash mid-write can't truncate the state file. Matches the orchestrator's existing pattern at `gstack-gbrain-sync.ts:508`.
- `bin/gstack-memory-ingest.ts:307` `fileSha256` hashes the full file (F9). Pre-2026-05-11 it stopped at 1MB, so tail edits to long partial transcripts looked unchanged and never re-imported. One-time cliff on upgrade: files whose mtime hasn't moved keep their old 1MB-capped hash, files whose mtime moves get recomputed correctly. No data loss.
- `bin/gstack-memory-ingest.ts:798` `gbrainAvailable` probes for the `import` subcommand in `--help` output (was: `put` subcommand). Without `import`, the memory stage exits non-zero with a `system_error` instead of silently degrading.
- `bin/gstack-gbrain-sync.ts:442` memory-stage parser preferentially picks `[memory-ingest] ERR` lines over the latest `[memory-ingest]` line for the summary, strips the prefix, and surfaces `(killed by signal / timeout)` when the child exits with `status=null`.
#### Fixed
- Per-file gitleaks scan was running on every transcript and artifact during memory ingest as redundant defense-in-depth. The cross-machine secret boundary is `gstack-brain-sync` (git push), which already runs a Python regex scanner. Local PGLite ingest doesn't change exposure surface for content that already lives on disk in plaintext.
- Signal handlers now kill the gbrain child and clean up the staging directory before exit. Pre-fix, every orchestrator timeout left a gbrain process holding the PGLite write lock and burning CPU until the user noticed and `kill -9`'d it manually (observed: a 15-hour-CPU-time orphan from yesterday's run was still alive today).
- `parseImportJson` no longer silently returns `{imported: 0, errors: 0}` when gbrain's `--json` output doesn't parse. Returns `null`, caller surfaces as `system_error` so the orchestrator's verdict block shows ERR instead of misleading OK/0/0.
- `bin/gstack-memory-ingest.ts` `require("fs")` calls replaced with top-level ESM `import`s for runtime portability.
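The `parseImportJson` fix (return `null` rather than zeroed counts) can be sketched as follows. The `imported` and `errors` fields come from the entry above; the function name `parseLastLineJson` and everything else about the payload shape are assumptions for illustration.

```typescript
interface ImportSummary {
  imported: number;
  errors: number;
}

// Hypothetical parser: take the last non-empty stdout line and parse it as
// JSON. Anything that does not look like a summary yields null, so the
// caller can report a system error instead of a misleading 0/0 success.
function parseLastLineJson(stdout: string): ImportSummary | null {
  const last = stdout.trim().split("\n").filter(Boolean).pop();
  if (!last) return null;
  try {
    const obj = JSON.parse(last);
    if (typeof obj?.imported !== "number" || typeof obj?.errors !== "number") return null;
    return { imported: obj.imported, errors: obj.errors };
  } catch {
    return null;
  }
}

console.log(parseLastLineJson('progress...\n{"imported": 12, "errors": 1}\n')); // { imported: 12, errors: 1 }
console.log(parseLastLineJson("Segmentation fault")); // null
console.log(parseLastLineJson("")); // null
```

Returning `null` keeps "nothing was imported" and "the output could not be read" distinguishable at the call site.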
#### For contributors
- Plan file at `/Users/garrytan/.claude/plans/purrfect-tumbling-quiche.md` captures the full review chain: `/investigate` → `/plan-eng-review` (5 architecture decisions D1-D5) → `/codex review` outside-voice plan challenge (9 findings, 3 reshaped the architecture into D6-D8). Plan also records the post-Codex user perf review that flipped D3 to opt-in.
- `TODOS.md` filed P2: investigate `gbrain import` perf on large staging dirs (5,131 files takes >10 minutes when 501 takes 10 seconds — gbrain-side N+1 SQL or auto-link reconciliation suspected). P3: cache "no changes since last import" at the prepare-batch level for true no-op fast paths.
- `Plan completion audit` ran via subagent on this branch: 17/21 DONE, 1 CHANGED (D3 made opt-in), 2 deferred (F8 benchmark harness as separate work, 24-path unit coverage went integration-only).
## [1.32.0.0] - 2026-05-10
## **Seven contributor PRs land. Three are security or hardening.**
+37
@@ -778,3 +778,40 @@ Key routing rules:
- Ship/deploy/PR → invoke /ship or /land-and-deploy
- Save progress → invoke /context-save
- Resume context → invoke /context-restore
## GBrain Search Guidance (configured by /sync-gbrain)
<!-- gstack-gbrain-search-guidance:start -->
GBrain is set up and synced on this machine. The agent should prefer gbrain
over Grep when the question is semantic or when you don't know the exact
identifier yet.
**This worktree is pinned to a worktree-scoped code source** via the
`.gbrain-source` file in the repo root (kubectl-style context). Any
`gbrain code-def`, `code-refs`, `code-callers`, `code-callees`, or `query`
call from anywhere under this worktree routes to that source by default —
no `--source` flag needed. Conductor sibling worktrees of the same repo
each have their own pin and their own indexed pages, so semantic results
match the actual code on disk in this worktree.
Two indexed corpora available via the `gbrain` CLI:
- This worktree's code (auto-pinned via `.gbrain-source`).
- `~/.gstack/` curated memory (registered as `gstack-brain-<user>` source via
the existing federation pipeline).
Prefer gbrain when:
- "Where is X handled?" / semantic intent, no exact string yet:
`gbrain search "<terms>"` or `gbrain query "<question>"`
- "Where is symbol Y defined?" / symbol-based code questions:
`gbrain code-def <symbol>` or `gbrain code-refs <symbol>`
- "What calls Y?" / "What does Y depend on?":
`gbrain code-callers <symbol>` / `gbrain code-callees <symbol>`
- "What did we decide last time?" / past plans, retros, learnings:
`gbrain search "<terms>" --source gstack-brain-<user>`
Grep is still right for known exact strings, regex, multiline patterns, and
file globs. Run `/sync-gbrain` after meaningful code changes; for ongoing
auto-sync across all worktrees, run `gbrain autopilot --install` once per
machine — gbrain's daemon handles incremental refresh on a schedule.
<!-- gstack-gbrain-search-guidance:end -->
+61
@@ -1,5 +1,66 @@
# TODOS
## /sync-gbrain memory stage perf follow-up
### P2: Investigate `gbrain import` perf on large staging dirs
**What:** Cold-run time on a 5131-file staging dir is >10 min in `gbrain import`
alone (after gstack's prepare phase, which is now <10s after dropping per-file
gitleaks). On 501 files it took 10s. The scaling is worse than linear and the
bottleneck is inside gbrain, not the gstack orchestrator.
**Why:** With memory-ingest's prepare phase now fast, the remaining cold-run cost
is entirely on the gbrain side. Users with large corpora (5K+ files) currently pay
~15-30 min on first ingest. Likely culprits in `~/git/gbrain/src/core/import-file.ts`:
- N+1 SQL queries: `engine.getPage(slug)` for each file's content_hash check
(line 242 + 478) — should be batched into a single query
- Per-page auto-link reconciliation that fires even for unchanged content
- FTS / vector index updates without batching transactions
**Pros:** Lives in gbrain (cleaner separation). Fix in gbrain benefits other
gbrain callers too (`gbrain sync`, MCP `put_page` workflows). Likely 10-50x
speedup from batched queries alone.
**Cons:** Cross-repo change, requires gbrain test coverage for the new batched
path. Not on the gstack critical path; gstack's architecture is already correct.
**Context:** Verified on real corpus 2026-05-10. gstack-side prepare with
`--scan-secrets` off runs in <10s. The full gbrain import on the same staged
dir consumes 100% CPU for >10 min. Both observations from
`bin/gstack-memory-ingest.ts:ingestPass` reaching the `runGbrainImport` call
quickly, then the child process taking the bulk of the wall time.
**Depends on:** None — gstack's batch-ingest architecture (D1-D8 in
`docs/designs/SYNC_GBRAIN_BATCH_INGEST.md`) is already shipped and correct.
---
### P3: Cache "no changes since last import" at the prepare-batch level
**What:** Even with the prepare phase fast (<10s for 5135 files), walking and
mtime-stat'ing every file on a true no-op run adds a few seconds and creates
spurious staging dirs. Cache the most-recent-source-mtime per-source in the
state file; if no source dir has a newer mtime, skip the walk + stage + import
entirely.
**Why:** Most `/sync-gbrain` invocations have nothing new to ingest. The
fastest path is "do nothing, fast." `gbrain doctor` should still report state,
but the actual ingest pipeline can short-circuit when last_full_walk is recent
and no source-tree mtime has moved.
**Pros:** Trivial implementation (~20 lines in `ingestPass`). Makes the
incremental fast-path actually live up to "<30s" in the original plan.
**Cons:** Adds a cache invalidation surface. If a user edits a file but its
parent dir's mtime doesn't update (rare on macOS APFS), changes get missed.
Mitigation: only short-circuit when last_full_walk is recent (e.g. <1 min ago).
**Context:** Filed during 2026-05-10 perf testing after `--scan-secrets` was
made opt-in. Lower priority than the gbrain-side perf issue above.
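One possible shape for the P3 short-circuit is sketched below. It is not implemented anywhere yet; `WalkCache`, `newest_source_mtime`, and `canSkipIngest` are hypothetical names, and the 60-second window reflects the "<1 min ago" mitigation mentioned in the Cons.

```typescript
// Hypothetical cache record stored alongside the ingest state.
// All times are epoch milliseconds.
interface WalkCache {
  last_full_walk: number;
  newest_source_mtime: number;
}

// Skip the walk + stage + import only when the last full walk is recent
// AND no source directory has an mtime newer than what was seen then.
function canSkipIngest(
  cache: WalkCache | null,
  sourceDirMtimes: number[],
  now: number,
  maxAgeMs = 60_000,
): boolean {
  if (!cache) return false; // never walked: must do the full pass
  if (now - cache.last_full_walk > maxAgeMs) return false; // cache too old to trust
  return sourceDirMtimes.every((m) => m <= cache.newest_source_mtime);
}

const now = 1_000_000;
const cache: WalkCache = { last_full_walk: now - 10_000, newest_source_mtime: 900_000 };

console.log(canSkipIngest(cache, [800_000, 900_000], now)); // true: nothing moved
console.log(canSkipIngest(cache, [800_000, 950_000], now)); // false: a source dir changed
console.log(canSkipIngest({ ...cache, last_full_walk: now - 120_000 }, [800_000], now)); // false: walk too old
console.log(canSkipIngest(null, [], now)); // false: no cache yet
```

Keeping the age limit short bounds the window in which a file edit that did not bump its parent directory's mtime could be missed.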
---
## Browser-skills follow-on (Phases 2-4)
### P1: Browser-skills Phase 2 — `/scrape` and `/skillify` skill templates
+1 -1
@@ -1 +1 @@
-1.32.0.0
+1.33.2.0
+19 -3
@@ -442,14 +442,30 @@ function runMemoryIngest(args: CliArgs): StageResult {
     timeout: 35 * 60 * 1000,
   });
-  const summary = (result.stderr || "").split("\n").filter((l) => l.includes("[memory-ingest]")).slice(-1)[0] || "ingest pass complete";
+  // D6: parse [memory-ingest] lines from the child's stderr. ERR-prefixed
+  // lines indicate a system-level failure (gbrain crashed or CLI missing)
+  // and the child exits non-zero. Per-file failures are summarized in the
+  // last non-ERR [memory-ingest] line but do NOT make the verdict ERR.
+  const stderrLines = (result.stderr || "").split("\n");
+  const memLines = stderrLines.filter((l) => l.includes("[memory-ingest]"));
+  const errLine = memLines.find((l) => l.includes("[memory-ingest] ERR"));
+  const lastMemLine = memLines.slice(-1)[0];
+  const rawSummary = errLine || lastMemLine || "ingest pass complete";
+  // Strip the "[memory-ingest] " prefix and any leading "ERR: " for cleaner
+  // verdict output. The orchestrator's own formatStage will prefix with OK/ERR.
+  const summary = rawSummary
+    .replace(/^.*\[memory-ingest\]\s*/, "")
+    .replace(/^ERR:\s*/, "");
+  const ok = result.status === 0;
   return {
     name: "memory",
     ran: true,
-    ok: result.status === 0,
+    ok,
     duration_ms: Date.now() - t0,
-    summary: result.status === 0 ? summary : `memory ingest exited ${result.status}`,
+    summary: ok
+      ? summary
+      : `${summary}${result.status === null ? " (killed by signal / timeout)" : ` (exit ${result.status})`}`,
   };
 }
+7 -7
@@ -48,7 +48,8 @@ cat "${FILES[@]}" 2>/dev/null | GSTACK_SEARCH_TYPE="$TYPE" GSTACK_SEARCH_QUERY="
 const lines = (await Bun.stdin.text()).trim().split('\n').filter(Boolean);
 const now = Date.now();
 const type = process.env.GSTACK_SEARCH_TYPE || '';
-const query = (process.env.GSTACK_SEARCH_QUERY || '').toLowerCase();
+const queryRaw = (process.env.GSTACK_SEARCH_QUERY || '').toLowerCase();
+const queryTokens = queryRaw.split(/\s+/).filter(Boolean);
 const limit = parseInt(process.env.GSTACK_SEARCH_LIMIT || '10', 10);
 const slug = process.env.GSTACK_SEARCH_SLUG || '';
@@ -94,12 +95,11 @@ let results = Array.from(seen.values());
 // Filter by type
 if (type) results = results.filter(e => e.type === type);
-// Filter by query
-if (query) results = results.filter(e =>
-  (e.key || '').toLowerCase().includes(query) ||
-  (e.insight || '').toLowerCase().includes(query) ||
-  (e.files || []).some(f => f.toLowerCase().includes(query))
-);
+// Filter by query (token-OR: match if ANY whitespace-split token appears in ANY haystack)
+if (queryTokens.length > 0) results = results.filter(e => {
+  const haystacks = [(e.key || '').toLowerCase(), (e.insight || '').toLowerCase(), ...(e.files || []).map(f => f.toLowerCase())];
+  return queryTokens.some(tok => haystacks.some(h => h.includes(tok)));
+});
 // Sort by effective confidence desc, then recency
 results.sort((a, b) => {
+650 -98
@@ -47,9 +47,14 @@ import {
   statSync,
   mkdirSync,
   appendFileSync,
+  renameSync,
+  openSync,
+  readSync,
+  closeSync,
+  rmSync,
 } from "fs";
 import { join, basename, dirname } from "path";
-import { execSync, execFileSync } from "child_process";
+import { execSync, execFileSync, spawnSync, spawn, type ChildProcess } from "child_process";
 import { homedir } from "os";
 import { createHash } from "crypto";
@@ -73,6 +78,12 @@ interface CliArgs {
sources: Set<MemoryType>;
limit: number | null;
noWrite: boolean;
/**
* Opt-in per-file gitleaks scan during the prepare phase. Off by
* default — the cross-machine boundary (gstack-brain-sync, git push)
* has its own scanner. Setting this adds ~4-8 min to cold runs.
*/
scanSecrets: boolean;
}
type MemoryType =
@@ -137,6 +148,14 @@ interface BulkResult {
failed: number;
duration_ms: number;
partial_pages: number;
/**
* D6: when set, indicates a process-level failure (gbrain CLI missing
* or `gbrain import` crashed). Per-file errors (FILE_TOO_LARGE etc.)
* land in `failed` but do NOT set this flag — the orchestrator should
* still treat the run as OK with summary mentioning the failure count.
* Only when this is set does the verdict become ERR.
*/
system_error?: string;
}
// ── Constants ──────────────────────────────────────────────────────────────
@@ -176,6 +195,9 @@ Options:
--limit <N> Stop after N pages written (smoke testing).
--no-write Skip gbrain put_page calls (still updates state file).
Used by tests + dry runs without actual ingest.
--scan-secrets Opt-in per-file gitleaks scan during prepare. Off by
default; gstack-brain-sync already gates the git-push
boundary. Adds ~4-8 min to cold runs.
--help This text.
`);
}
@@ -190,6 +212,7 @@ function parseArgs(): CliArgs {
let limit: number | null = null;
let sources: Set<MemoryType> = new Set(ALL_TYPES);
let noWrite = process.env.GSTACK_MEMORY_INGEST_NO_WRITE === "1";
let scanSecrets = process.env.GSTACK_MEMORY_INGEST_SCAN_SECRETS === "1";
for (let i = 0; i < args.length; i++) {
const a = args[i];
@@ -202,6 +225,7 @@ function parseArgs(): CliArgs {
case "--include-unattributed": includeUnattributed = true; break;
case "--all-history": allHistory = true; break;
case "--no-write": noWrite = true; break;
case "--scan-secrets": scanSecrets = true; break;
case "--limit":
limit = parseInt(args[++i] || "0", 10);
if (!Number.isFinite(limit) || limit <= 0) {
@@ -229,7 +253,7 @@ function parseArgs(): CliArgs {
     }
   }
-  return { mode, quiet, benchmark, includeUnattributed, allHistory, sources, limit, noWrite };
+  return { mode, quiet, benchmark, includeUnattributed, allHistory, sources, limit, noWrite, scanSecrets };
 }
// ── State file ─────────────────────────────────────────────────────────────
@@ -268,9 +292,14 @@ function loadState(): IngestState {
 }
 function saveState(state: IngestState): void {
+  // F6 (Codex finding 6): tmp+rename atomic write so a crash mid-write
+  // never leaves a truncated/corrupt state file. Matches the pattern
+  // in gstack-gbrain-sync.ts:saveSyncState.
   try {
     mkdirSync(dirname(STATE_PATH), { recursive: true });
-    writeFileSync(STATE_PATH, JSON.stringify(state, null, 2), "utf-8");
+    const tmp = `${STATE_PATH}.tmp.${process.pid}`;
+    writeFileSync(tmp, JSON.stringify(state, null, 2), "utf-8");
+    renameSync(tmp, STATE_PATH);
   } catch (err) {
     console.error(`[state] write failed: ${(err as Error).message}`);
   }
@@ -278,12 +307,15 @@ function saveState(state: IngestState): void {
// ── File hash + change detection ───────────────────────────────────────────
function fileSha256(path: string, maxBytes = 1024 * 1024): string {
// Hash the first 1MB only; sufficient for change detection on big JSONL.
function fileSha256(path: string): string {
// F9 (Codex finding 9): full-file hash. The prior 1MB cap silently
// missed tail edits to long partial transcripts — exactly the
// recovery case this pipeline needs to handle correctly. Realistic
// max for an ingest source is ~50MB (long JSONL); fine to load in
// memory for hashing.
try {
const fd = readFileSync(path);
const slice = fd.length > maxBytes ? fd.subarray(0, maxBytes) : fd;
return createHash("sha256").update(slice).digest("hex");
const buf = readFileSync(path);
return createHash("sha256").update(buf).digest("hex");
} catch {
return "";
}
@@ -753,51 +785,66 @@ function buildArtifactPage(path: string, type: MemoryType): PageRecord {
};
}
// ── Writer (calls `gbrain put`) ────────────────────────────────────────────
// ── Writer (batch via `gbrain import <dir>`) ───────────────────────────────
//
// Architecture (post plan-eng-review + Codex outside-voice):
//
// walkAllSources(ctx)
// → for each path: mtime-skip / opt-in source-file gitleaks (D3, --scan-secrets) / parse / buildPage
// → renderPageBody injects title/type/tags into YAML frontmatter
// → writeStaged: mkdir -p slug subdirs (D1), write ${slug}.md
// → snapshot ~/.gbrain/sync-failures.jsonl byte-offset (D7)
// → runGbrainImport: async spawn of `gbrain import <stagingDir> --no-embed --json` (D6)
// → parseImportJson(stdout) → { imported, skipped, errors, ... } (D6 OK/ERR)
// → readNewFailures(preImportOffset, slugMap) → Set<sourcePath> (D7)
// → state.sessions[path] = { ... } for prepared files NOT in failed set
// → saveState (F6 tmp+rename atomic write) + cleanupStagingDir
//
// We trust gbrain's content_hash idempotency (verified in
// ~/git/gbrain/src/core/import-file.ts:242-243, :478) — repeated imports
// of identical content are cheap. So we do NOT track per-file skip_reasons,
// do NOT keep a SIGTERM checkpoint, and do NOT advance a three-state verdict.
let _gbrainAvailability: boolean | null = null;
function gbrainAvailable(): boolean {
if (_gbrainAvailability !== null) return _gbrainAvailability;
try {
execSync("command -v gbrain", { stdio: "ignore" });
// gbrain v0.27 retired the legacy `put_page` flag-form for `put <slug>`
// (content via stdin, metadata as YAML frontmatter). Probe `--help` for
// the `put` subcommand so we surface a single clean error here rather
// than failing every page with "Unknown command: put_page". The regex
// anchors on the indented subcommand format gbrain's help actually uses
// (" put ..."), not any whitespace-bordered "put" word in prose.
// Probe `--help` for the `import` subcommand. gbrain v0.20.0+ ships
// `import <dir>` (batch markdown import via path-authoritative slugs).
// If absent, we surface a single clean error here rather than failing
// the whole stage with a confusing usage message from gbrain itself.
const help = execFileSync("gbrain", ["--help"], {
encoding: "utf-8",
timeout: 5000,
stdio: ["ignore", "pipe", "pipe"],
});
_gbrainAvailability = /^\s+put\s/m.test(help);
_gbrainAvailability = /^\s+import\s/m.test(help);
} catch {
_gbrainAvailability = false;
}
return _gbrainAvailability;
}
function gbrainPutPage(page: PageRecord): { ok: boolean; error?: string } {
if (!gbrainAvailable()) {
return { ok: false, error: "gbrain CLI not in PATH or missing `put` subcommand" };
}
// gbrain v0.27+ uses `put <slug>` (positional, content via stdin) instead
// of the legacy `put_page` flag form. Metadata rides as YAML frontmatter:
// - When the page body already starts with frontmatter (transcripts), inject
// title/type/tags into the existing block so gbrain's frontmatter parser
// picks them up.
// - When the page body has no frontmatter (raw artifacts: design-docs,
// learnings, builder-profile-entries), wrap with a fresh frontmatter
// carrying the same fields. Without this branch, artifact pages would
// land in gbrain with empty title/type/tags.
/**
* Build the markdown body with YAML frontmatter (title/type/tags) injected.
*
* Two cases:
* - Page body already starts with `---\n` (transcripts) — inject into the
* existing frontmatter block before its close fence so gbrain's frontmatter
* parser picks up the fields alongside any session-level metadata the
* transcript builder already wrote (session_id, cwd, git_remote, etc.).
* - No leading frontmatter (raw artifacts: design-docs, learnings, etc.) —
* wrap with a fresh frontmatter block carrying title/type/tags. Without
* this branch, artifact pages would land in gbrain with empty metadata.
*
* gbrain enforces slug = path-derived (slugifyPath in gbrain's sync.ts).
* We do NOT set `slug:` in frontmatter — the staging-dir filename is the
* source of truth and gbrain rejects mismatches.
*/
function renderPageBody(page: PageRecord): string {
let body = page.body;
if (body.startsWith("---\n")) {
// Locate the closing --- delimiter. buildTranscriptPage joins with "\n"
// and does not append a trailing newline, so the close fence looks like
// "...\n---" followed directly by body content (no "\n---\n" pattern).
// Match the close on "\n---" only — the inject lands BEFORE the close
// fence, inside the frontmatter block, regardless of what follows it.
const end = body.indexOf("\n---", 4);
if (end > 0) {
const inject = [
@@ -822,27 +869,155 @@ function gbrainPutPage(page: PageRecord): { ok: boolean; error?: string } {
// Strip NUL bytes — Postgres rejects 0x00 in UTF-8 text columns. Some Claude
// Code transcripts contain NUL inside user-pasted content or tool output, and
// surfacing those as `internal_error: invalid byte sequence` from the brain
// is unhelpful when we can sanitize at write time.
// is unhelpful when we can sanitize at write time. Originally landed in v1.32.0.0
// (PR #1411) on the per-file `gbrain put` path; moved here so all staged
// pages still get the same sanitization.
body = body.replace(/\x00/g, "");
try {
execFileSync("gbrain", ["put", page.slug], {
input: body,
encoding: "utf-8",
// Bumped from 30s: auto-link reconciliation on dense transcripts hits
// 30s once the brain has a few hundred existing pages.
timeout: 60000,
// Bumped from default 1MB: without this, gbrain's actual stderr gets
// truncated and callers see only "Command failed:" with no detail.
maxBuffer: 16 * 1024 * 1024,
stdio: ["pipe", "pipe", "pipe"],
});
return { ok: true };
} catch (err: any) {
const stderr = err?.stderr?.toString?.() ?? "";
const stdout = err?.stdout?.toString?.() ?? "";
const detail = stderr || stdout || (err instanceof Error ? err.message : String(err));
return { ok: false, error: detail.split("\n")[0].slice(0, 300) };
return body;
}
interface PreparedPage {
/** Page slug (path-shaped, e.g. "transcripts/claude-code/foo"). */
slug: string;
/** Original source file on disk (e.g. ~/.claude/projects/.../foo.jsonl). */
source_path: string;
/** Full markdown including frontmatter — ready to write. */
rendered_body: string;
/** Carry-through fields for state recording on success. */
page_slug: string;
partial: boolean;
}
interface StagingResult {
staging_dir: string;
written: number;
errors: Array<{ slug: string; error: string }>;
/** Map from staging-dir-relative path (e.g. "transcripts/foo.md") → source path. */
stagedPathToSource: Map<string, string>;
}
/**
* Write prepared pages to a staging dir, mirroring slug hierarchy.
*
* D1: gbrain's `slugifyPath` (sync.ts:260) derives the slug from the
* directory-aware relative path inside the import dir, so slugs containing
* slashes (e.g. "transcripts/claude-code/foo") must live in matching
* subdirectories of the staging dir. Otherwise the slug becomes flattened
* or rejected by gbrain's path-vs-frontmatter slug check (import-file.ts:429).
*
* Filename = `${slug}.md`. mkdir is recursive. Existing files overwrite.
* Errors per-file are collected; the whole batch is best-effort.
*/
function writeStaged(prepared: PreparedPage[], stagingDir: string): StagingResult {
mkdirSync(stagingDir, { recursive: true });
const stagedPathToSource = new Map<string, string>();
const errors: Array<{ slug: string; error: string }> = [];
let written = 0;
for (const p of prepared) {
const relPath = `${p.slug}.md`;
const absPath = join(stagingDir, relPath);
try {
mkdirSync(dirname(absPath), { recursive: true });
writeFileSync(absPath, p.rendered_body, "utf-8");
stagedPathToSource.set(relPath, p.source_path);
written++;
} catch (err) {
errors.push({ slug: p.slug, error: (err as Error).message });
}
}
return { staging_dir: stagingDir, written, errors, stagedPathToSource };
}
interface ImportJsonResult {
status?: string;
duration_s?: number;
imported?: number;
skipped?: number;
errors?: number;
chunks?: number;
total_files?: number;
}
/**
* Parse the `gbrain import --json` stdout payload (single JSON object on
* the last non-empty line per commands/import.ts:271-275).
*
* Returns parsed counts on success, or `null` to signal "unparseable" — the
* caller treats null as ERR (system_error) rather than silently passing
* through as zeros. Pre-2026-05-11 this returned zeros on parse failure,
* which silently masked gbrain crashes as "0 imported, 0 failed = OK".
*/
function parseImportJson(stdout: string): ImportJsonResult | null {
const lines = stdout.split("\n").map((s) => s.trim()).filter(Boolean);
for (let i = lines.length - 1; i >= 0; i--) {
const line = lines[i];
if (line.startsWith("{") && line.endsWith("}")) {
try {
const parsed = JSON.parse(line);
if (typeof parsed === "object" && parsed && "imported" in parsed) {
return parsed as ImportJsonResult;
}
} catch {
// try next line up
}
}
}
return null;
}
/**
* Read failures appended to ~/.gbrain/sync-failures.jsonl since the
* snapshotted byte offset, and map them back to source paths.
*
* D7: gbrain import writes per-file failures to sync-failures.jsonl
* (commands/import.ts:308-310) explicitly so "callers can gate state
* advances" (comment at :28). We snapshot the file size before import
* and read only the appended bytes after, so we never confuse new
* entries with prior-run leftovers.
*
* Each line is `{ path, error, code, commit, ts }`. The `path` is the
* staging-dir-relative filename gbrain saw (e.g. "transcripts/foo.md").
* stagedPathToSource maps that back to the original source file.
*/
function readNewFailures(
syncFailuresPath: string,
preImportOffset: number,
stagedPathToSource: Map<string, string>,
): Set<string> {
const failed = new Set<string>();
try {
if (!existsSync(syncFailuresPath)) return failed;
const stat = statSync(syncFailuresPath);
if (stat.size <= preImportOffset) return failed;
// Read appended bytes only. readSync with a positional offset works
// synchronously without slurping the whole file.
const fd = openSync(syncFailuresPath, "r");
try {
const buf = Buffer.alloc(stat.size - preImportOffset);
readSync(fd, buf, 0, buf.length, preImportOffset);
const text = buf.toString("utf-8");
for (const line of text.split("\n")) {
const trimmed = line.trim();
if (!trimmed) continue;
try {
const entry = JSON.parse(trimmed) as { path?: string };
if (entry.path) {
const source = stagedPathToSource.get(entry.path);
if (source) failed.add(source);
}
} catch {
// ignore malformed line
}
}
} finally {
closeSync(fd);
}
} catch {
// Best-effort. If we can't read failures, we conservatively assume
// none — caller will state-record all prepared files. Worst case:
// failed files get a retry-on-next-run shot anyway via content_hash.
}
return failed;
}
// ── Main ingest passes ─────────────────────────────────────────────────────
@@ -901,34 +1076,72 @@ async function probeMode(args: CliArgs): Promise<ProbeReport> {
};
}
async function ingestPass(args: CliArgs): Promise<BulkResult> {
const t0 = Date.now();
const state = loadState();
const ctx = makeWalkContext(args, state);
let written = 0;
/**
* Prepare phase: walk sources, apply incremental + optional-secret-scan filters,
* parse transcripts/artifacts into PageRecord, render bodies with
* frontmatter. Returns the PreparedPage[] to stage + counts of files
* filtered at each gate.
*
* Secret scanning policy (post 2026-05-10 perf review):
*
* The actual cross-machine exfiltration boundary is `gstack-brain-sync`,
* which runs a regex-based secret scanner on the staged diff before
* `git commit` (see bin/gstack-brain-sync:78-110: AWS keys, GitHub
* tokens, OpenAI keys, PEM blocks, JWTs, bearer-token-in-JSON). That's
* the right place — it gates content leaving the machine.
*
* memory-ingest, by contrast, moves data from one local file to a
* local PGLite database. Scanning every source file at ingest time
* doesn't change exposure (the secret already lives in plaintext
* where the user keeps their transcripts and artifacts) but costs
* ~470s on cold runs. We removed the per-file gitleaks gate as
* redundant defense-in-depth and made it opt-in via `--scan-secrets`
* for users who want belt-and-suspenders.
*/
function preparePages(
args: CliArgs,
ctx: WalkContext,
state: IngestState,
): {
prepared: PreparedPage[];
skippedSecret: number;
skippedDedup: number;
skippedUnattributed: number;
parseFailed: number;
partialPages: number;
} {
const prepared: PreparedPage[] = [];
let skippedSecret = 0;
let skippedDedup = 0;
let skippedUnattributed = 0;
let failed = 0;
let parseFailed = 0;
let partialPages = 0;
for (const { path, type } of walkAllSources(ctx)) {
if (args.limit !== null && written >= args.limit) break;
if (args.limit !== null && prepared.length >= args.limit) break;
if (args.mode === "incremental" && !fileChangedSinceState(path, state)) {
skippedDedup++;
continue;
}
// Secret scan first
const scan = secretScanFile(path);
if (scan.scanner === "gitleaks" && scan.findings.length > 0) {
skippedSecret++;
if (!args.quiet) {
console.error(`[secret-scan match] ${path} (${scan.findings.length} finding${scan.findings.length === 1 ? "" : "s"}); skipped`);
// Optional belt-and-suspenders: when --scan-secrets is set, scan the
// source file with gitleaks and skip dirty ones. Off by default
// because gstack-brain-sync already gates the cross-machine boundary
// and per-file gitleaks costs ~256ms/file (4-8 min on a real corpus).
if (args.scanSecrets) {
const scan = secretScanFile(path);
if (scan.scanner === "gitleaks" && scan.findings.length > 0) {
skippedSecret++;
if (!args.quiet) {
console.error(
`[secret-scan match] ${path} (${scan.findings.length} finding${
scan.findings.length === 1 ? "" : "s"
}); skipped`,
);
}
continue;
}
continue;
}
let page: PageRecord;
@@ -936,7 +1149,7 @@ async function ingestPass(args: CliArgs): Promise<BulkResult> {
if (type === "transcript") {
const session = parseTranscriptJsonl(path);
if (!session) {
failed++;
parseFailed++;
continue;
}
if (!args.includeUnattributed && !session.cwd) {
@@ -953,38 +1166,373 @@ async function ingestPass(args: CliArgs): Promise<BulkResult> {
page = buildArtifactPage(path, type);
}
} catch (err) {
failed++;
parseFailed++;
console.error(`[parse-error] ${path}: ${(err as Error).message}`);
continue;
}
const result = args.noWrite
? { ok: true }
: await withErrorContext(
`put_page:${page.slug}`,
async () => gbrainPutPage(page),
"gstack-memory-ingest"
);
if (!result.ok) {
failed++;
if (!args.quiet) {
console.error(`[put-error] ${page.slug}: ${result.error || "unknown"}`);
prepared.push({
slug: page.slug,
source_path: path,
rendered_body: renderPageBody(page),
page_slug: page.slug,
partial: page.partial ?? false,
});
}
return {
prepared,
skippedSecret,
skippedDedup,
skippedUnattributed,
parseFailed,
partialPages,
};
}
/**
* Make a per-run staging directory at ~/.gstack/.staging-ingest-<pid>-<ts>/
* The pid+ts namespace avoids collisions when two ingest passes run
* concurrently (the orchestrator's lock should prevent this, but
* defense-in-depth).
*/
function makeStagingDir(): string {
const dir = join(GSTACK_HOME, `.staging-ingest-${process.pid}-${Date.now()}`);
mkdirSync(dir, { recursive: true });
return dir;
}
/**
* Best-effort recursive cleanup. Failures swallowed — at worst we leak a
* staging dir to disk; the next run uses a new one and they age out via
* normal disk hygiene. We deliberately do NOT crash the pipeline on
* cleanup failure.
*/
function cleanupStagingDir(dir: string): void {
try {
rmSync(dir, { recursive: true, force: true });
} catch {
// best-effort
}
}
/**
* Track the currently-running gbrain import child + active staging dir so
* SIGTERM/SIGINT on the parent process can:
* 1. forward the signal to the child (otherwise gbrain orphans, holds the
* PGLite write lock, and burns CPU — observed during 2026-05-10 cold-run
* testing)
* 2. synchronously clean up the staging dir BEFORE process.exit (otherwise
* finally blocks in async callers don't run after process.exit from
* inside a signal handler, leaking the staging dir on every interrupt)
*/
let _activeImportChild: ChildProcess | null = null;
let _activeStagingDir: string | null = null;
let _signalHandlersInstalled = false;
function installSignalForwarder(): void {
if (_signalHandlersInstalled) return;
_signalHandlersInstalled = true;
const forward = (signal: NodeJS.Signals) => () => {
if (_activeImportChild && _activeImportChild.pid && !_activeImportChild.killed) {
try {
process.kill(_activeImportChild.pid, signal);
} catch {
// child may have already exited between the alive-check and the kill
}
}
// Synchronously clean up the active staging dir before exiting. The async
// `finally` blocks in ingestPass never run after process.exit fires from
// inside this handler, so cleanup has to happen here.
if (_activeStagingDir) {
cleanupStagingDir(_activeStagingDir);
_activeStagingDir = null;
}
// Exit explicitly with the conventional 128+signal code (130 for SIGINT,
// 143 for SIGTERM). Installing a handler replaces the default action, so
// a handler that doesn't exit would hold the process alive.
process.exit(signal === "SIGINT" ? 130 : 143);
};
process.on("SIGTERM", forward("SIGTERM"));
process.on("SIGINT", forward("SIGINT"));
}
/**
* Run gbrain import as an async child so we can install signal handlers
* that kill the child on parent SIGTERM/SIGINT. Returns the same shape as
* spawnSync's result so the caller doesn't care which mode was used.
*/
function runGbrainImport(
stagingDir: string,
timeoutMs: number,
): Promise<{ status: number | null; stdout: string; stderr: string }> {
installSignalForwarder();
return new Promise((resolve) => {
const child = spawn(
"gbrain",
["import", stagingDir, "--no-embed", "--json"],
{ stdio: ["ignore", "pipe", "pipe"] },
);
_activeImportChild = child;
let stdout = "";
let stderr = "";
let timedOut = false;
const timer = setTimeout(() => {
timedOut = true;
try {
if (child.pid) process.kill(child.pid, "SIGTERM");
} catch {
// already gone
}
}, timeoutMs);
child.stdout?.on("data", (chunk) => {
stdout += chunk.toString("utf-8");
});
child.stderr?.on("data", (chunk) => {
stderr += chunk.toString("utf-8");
});
child.on("close", (status) => {
clearTimeout(timer);
_activeImportChild = null;
resolve({
status: timedOut ? null : status,
stdout,
stderr,
});
});
child.on("error", (err) => {
clearTimeout(timer);
_activeImportChild = null;
resolve({
status: null,
stdout,
stderr: stderr + `\n[spawn-error] ${(err as Error).message}`,
});
});
});
}
async function ingestPass(args: CliArgs): Promise<BulkResult> {
const t0 = Date.now();
const state = loadState();
const ctx = makeWalkContext(args, state);
// Phase 1: prepare (parse + secret-scan + filter + render frontmatter).
const prep = preparePages(args, ctx, state);
let written = 0;
let failed = 0;
if (args.noWrite) {
// --no-write: skip the gbrain import call but still record state for
// prepared pages (treat them as ingested for dedup purposes). Matches
// the prior contract from --help: "Skip gbrain put_page calls (still
// updates state file)".
const nowIso = new Date().toISOString();
for (const p of prep.prepared) {
try {
state.sessions[p.source_path] = {
mtime_ns: Math.floor(statSync(p.source_path).mtimeMs * 1e6),
sha256: fileSha256(p.source_path),
ingested_at: nowIso,
page_slug: p.page_slug,
partial: p.partial,
};
written++;
} catch {
// best-effort state record
}
}
state.last_full_walk = new Date().toISOString();
state.last_writer = "gstack-memory-ingest";
saveState(state);
return {
written,
skipped_secret: prep.skippedSecret,
skipped_dedup: prep.skippedDedup,
skipped_unattributed: prep.skippedUnattributed,
failed: prep.parseFailed,
duration_ms: Date.now() - t0,
partial_pages: prep.partialPages,
};
}
if (prep.prepared.length === 0) {
// Nothing to import — still touch state.last_full_walk and exit.
state.last_full_walk = new Date().toISOString();
state.last_writer = "gstack-memory-ingest";
saveState(state);
return {
written: 0,
skipped_secret: prep.skippedSecret,
skipped_dedup: prep.skippedDedup,
skipped_unattributed: prep.skippedUnattributed,
failed: prep.parseFailed,
duration_ms: Date.now() - t0,
partial_pages: prep.partialPages,
};
}
if (!gbrainAvailable()) {
const msg =
"gbrain CLI not in PATH or missing `import` subcommand. Run /setup-gbrain.";
console.error(`[memory-ingest] ERR: ${msg}`);
return {
written: 0,
skipped_secret: prep.skippedSecret,
skipped_dedup: prep.skippedDedup,
skipped_unattributed: prep.skippedUnattributed,
failed: prep.parseFailed + prep.prepared.length,
duration_ms: Date.now() - t0,
partial_pages: prep.partialPages,
system_error: msg,
};
}
// Phase 2: stage to a per-run dir + invoke gbrain import.
const stagingDir = makeStagingDir();
// Register staging dir with the signal forwarder so SIGTERM/SIGINT can
// synchronously clean it up before process.exit (the async finally block
// below does NOT run after a signal-handler exit).
_activeStagingDir = stagingDir;
try {
const staging = writeStaged(prep.prepared, stagingDir);
failed += staging.errors.length;
if (!args.quiet && staging.errors.length > 0) {
for (const e of staging.errors.slice(0, 5)) {
console.error(`[stage-error] ${e.slug}: ${e.error}`);
}
continue;
}
state.sessions[path] = {
mtime_ns: Math.floor(statSync(path).mtimeMs * 1e6),
sha256: page.content_sha256,
ingested_at: new Date().toISOString(),
page_slug: page.slug,
partial: page.partial,
};
written++;
if (!args.quiet) {
const tag = page.partial ? " [partial]" : "";
console.log(`[${written}] ${page.slug}${tag}`);
// D7: snapshot sync-failures.jsonl byte-offset before import so we
// can read only newly-appended failure entries afterwards.
const syncFailuresPath = join(homedir(), ".gbrain", "sync-failures.jsonl");
let preImportOffset = 0;
try {
if (existsSync(syncFailuresPath)) {
preImportOffset = statSync(syncFailuresPath).size;
}
} catch {
// best-effort; absent file → 0 offset, all future entries are "new"
}
if (!args.quiet) {
console.error(
`[memory-ingest] staged ${staging.written} pages → ${stagingDir}; running gbrain import...`,
);
}
// D6: single batch import. `--no-embed` matches the prior per-file
// behavior (we never enabled embedding); embeddings happen on-demand
// via gbrain's own pipelines. `--json` gives us structured counts.
//
// Async spawn (not spawnSync) so the signal forwarder installed in
// runGbrainImport propagates SIGTERM/SIGINT to the child. With sync
// spawn, parent termination orphans the gbrain process (observed
// during 2026-05-10 cold-run testing — gbrain kept running 15 min
// after the orchestrator timed out).
const importResult = await runGbrainImport(stagingDir, 30 * 60 * 1000);
const stdout = importResult.stdout || "";
const stderr = importResult.stderr || "";
const importJson = parseImportJson(stdout);
if (importResult.status !== 0) {
const tail = (stderr.trim().split("\n").pop() || "").slice(0, 300);
const msg = `gbrain import exited ${importResult.status}: ${tail}`;
console.error(`[memory-ingest] ERR: ${msg}`);
// We conservatively state-record nothing on a non-zero exit — per-run
// partial progress is invisible to us when the importer crashed.
// sync-failures.jsonl entries may still hold per-file detail.
failed += prep.prepared.length;
return {
written: 0,
skipped_secret: prep.skippedSecret,
skipped_dedup: prep.skippedDedup,
skipped_unattributed: prep.skippedUnattributed,
failed,
duration_ms: Date.now() - t0,
partial_pages: prep.partialPages,
system_error: msg,
};
}
if (!args.quiet) {
// Forward gbrain's own progress lines so the user sees them when running
// interactively. With `stdio: pipe` the child's stderr is captured rather
// than inherited, so it has to be written to our stderr explicitly.
process.stderr.write(stderr);
}
if (importJson === null) {
// gbrain exited 0 but didn't emit a parseable --json line. Treat as
// ERR rather than silently passing zeros through — silent zeros let
// a future gbrain-output regression mask data loss.
const msg =
"gbrain import exited 0 but emitted no parseable --json payload. " +
"Refusing to advance state.";
console.error(`[memory-ingest] ERR: ${msg}`);
failed += prep.prepared.length;
return {
written: 0,
skipped_secret: prep.skippedSecret,
skipped_dedup: prep.skippedDedup,
skipped_unattributed: prep.skippedUnattributed,
failed,
duration_ms: Date.now() - t0,
partial_pages: prep.partialPages,
system_error: msg,
};
}
// D7: identify which staged files failed to import and exclude them
// from state recording. Source paths get a retry on the next run.
const failedSources = readNewFailures(
syncFailuresPath,
preImportOffset,
staging.stagedPathToSource,
);
failed += failedSources.size;
// Phase 3: state recording. Only files that landed in gbrain get
// their mtime+sha256 stamped. Failed source paths are deliberately
// left un-state'd so the next run re-prepares them and gbrain's
// content_hash dedup short-circuits the import.
const nowIso = new Date().toISOString();
for (const p of prep.prepared) {
if (failedSources.has(p.source_path)) continue;
try {
state.sessions[p.source_path] = {
mtime_ns: Math.floor(statSync(p.source_path).mtimeMs * 1e6),
sha256: fileSha256(p.source_path),
ingested_at: nowIso,
page_slug: p.page_slug,
partial: p.partial,
};
written++;
if (!args.quiet) {
const tag = p.partial ? " [partial]" : "";
console.log(`[${written}] ${p.page_slug}${tag}`);
}
} catch (err) {
// statSync can fail if the source file was removed mid-run; skip
// recording but don't fail the whole pass.
console.error(
`[state-record] ${p.source_path}: ${(err as Error).message}`,
);
}
}
if (!args.quiet) {
console.error(
`[memory-ingest] gbrain import: ${importJson.imported ?? 0} imported, ` +
`${importJson.skipped ?? 0} unchanged, ${importJson.errors ?? 0} failed` +
(failedSources.size > 0
? ` (see ~/.gbrain/sync-failures.jsonl for details)`
: ""),
);
}
} finally {
cleanupStagingDir(stagingDir);
_activeStagingDir = null;
}
state.last_full_walk = new Date().toISOString();
@@ -993,12 +1541,12 @@ async function ingestPass(args: CliArgs): Promise<BulkResult> {
return {
written,
skipped_secret: skippedSecret,
skipped_dedup: skippedDedup,
skipped_unattributed: skippedUnattributed,
failed,
skipped_secret: prep.skippedSecret,
skipped_dedup: prep.skippedDedup,
skipped_unattributed: prep.skippedUnattributed,
failed: failed + prep.parseFailed,
duration_ms: Date.now() - t0,
partial_pages: partialPages,
partial_pages: prep.partialPages,
};
}
@@ -1072,11 +1620,15 @@ async function main(): Promise<void> {
if (result.written > 0 || result.failed > 0) {
console.error(`[memory-ingest] ${result.written} written, ${result.failed} failed in ${dt}ms`);
}
// D6: system_error → process-level failure; orchestrator sees ERR.
// Per-file errors do NOT exit non-zero.
if (result.system_error) process.exit(1);
return;
}
const result = await ingestPass(args);
printBulkResult(result, args);
if (result.system_error) process.exit(1);
}
main().catch((err) => {
+332
@@ -0,0 +1,332 @@
# /sync-gbrain batch ingest migration
**Status:** Implemented on garrytan/dublin-v1 (D1-D8 decisions land in this PR)
**Branch:** garrytan/dublin-v1
**Owner:** Garry Tan
**Triggered by:** /investigate run, 2026-05-09
**Estimated effort:** human ~3 days / CC+gstack ~2 hr
**Files touched:** 4 source + 1 test = 5 total (under estimate)
## Decisions (post-review)
This doc captures the original architecture. Final architecture lands per
the 8 review decisions captured in
`/Users/garrytan/.claude/plans/purrfect-tumbling-quiche.md`:
- **D1** hierarchical staging dir (mkdir -p per slug segment) — kept
- **D2** cut over + delete legacy in same PR (no `--legacy-ingest` flag) — kept
- **D3** scan source-file first, stage only clean — kept
- **D4** ~~three-state OK/DEGRADED/ERR verdict~~ COLLAPSED to OK/ERR per
Codex finding 7 (gbrain content_hash idempotency makes the third state
redundant)
- **D5** ~~skip_reason field in state schema~~ DROPPED per Codex finding 7
(re-runs are cheap; no need for permanent skip-tracking)
- **D6** trust gbrain's content_hash idempotency; drop bookkeeping
scaffolding (skip_reason, three-state, SIGTERM checkpoint)
- **D7** per-file failure detection via `~/.gbrain/sync-failures.jsonl`
(byte-offset snapshot + appended-only read)
- **D8** bundle 3 in-scope pre-existing fixes: F6 atomic saveState
(tmp+rename), F8 isolated-stage benchmark, F9 full-file sha256 hash
(no more 1MB cap)
## Verified from gbrain source
Four properties verified by reading `~/git/gbrain/src/`:
- **Idempotency** at `core/import-file.ts:242-243, :478` — content_hash
check, skip if unchanged, overwrite if changed.
- **Frontmatter parity** at `core/import-file.ts:228, 297, 410-422`
title/type/tags honored; auto-inference only when frontmatter absent.
- **Path-authoritative slug** at `core/sync.ts:260` (`slugifyPath`),
enforced at `core/import-file.ts:429`.
- **Per-file failures surface** at `commands/import.ts:308-310`,
comment at `:28`: "callers can gate state advances" — the
intentional API for what D7 uses.
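The idempotency property is what lets D6 drop the bookkeeping. A minimal sketch of the idea, assuming a plain in-memory map keyed by slug: `PageStore` and `importPage` are hypothetical names for illustration and are not gbrain's implementation or schema.

```typescript
import { createHash } from "node:crypto";

// Hypothetical stand-in for a page store. It only illustrates
// "skip if content_hash unchanged, overwrite if changed".
type ImportOutcome = "imported" | "skipped";

class PageStore {
  private hashes = new Map<string, string>();

  importPage(slug: string, content: string): ImportOutcome {
    const hash = createHash("sha256").update(content).digest("hex");
    // Same slug + same hash: nothing to do, so a repeat import is cheap.
    if (this.hashes.get(slug) === hash) return "skipped";
    // New slug or changed content: record the new hash (overwrite).
    this.hashes.set(slug, hash);
    return "imported";
  }
}

const store = new PageStore();
const first = store.importPage("transcripts/foo", "hello");
const second = store.importPage("transcripts/foo", "hello");
const third = store.importPage("transcripts/foo", "hello, edited");
console.log(first, second, third);
```

Under this model a re-run over already-imported content only pays for hashing, which is why failed files can simply be left out of state and retried.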
## Performance: planned vs measured (post 2026-05-10 perf review)
| Metric | Plan target | Measured | Verdict |
|---|---|---|---|
| Prepare phase on 5135 files | — | <10s | FAST |
| `gbrain import` on 5135 files | — | >10 min | gbrain-side perf issue, filed |
| Loop / hang (original bug) | never | never | FIXED |
| Memory ingest exits null on SIGTERM | no | no — state writes succeed; child gbrain dies with parent | FIXED |
| FILE_TOO_LARGE blocks last_commit | no | no — failed paths excluded via D7 | FIXED |
**Initial perf miss + correction.** The first cold-run measurement
(~12 min) was dominated by 1841 sequential gitleaks subprocess spawns
at ~256ms each — a redundant security gate. The cross-machine
exfiltration boundary is `gstack-brain-sync` (bin/gstack-brain-sync:78-110,
regex-based secret scan on staged diff before `git commit`). Scanning
every source file before ingest into a LOCAL PGLite doesn't change
exposure — the secret already lives on disk in plaintext. We made
per-file gitleaks opt-in via `--scan-secrets`. Default is off. That
cut the prepare phase from ~12 min to under 10 seconds.
The remaining cold-run cost is `gbrain import` itself, which scales
worse than linearly with staging-dir size (10s for 501 files; >10 min
for 5031). That's a gbrain-side perf issue, not gstack architecture.
Filed as a TODO; the fix likely lives in gbrain's content_hash check
loop or auto-link reconciliation phase.
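The figures in this section can be checked with a few lines of arithmetic. This sketch uses only numbers quoted above (1841 spawns at ~256ms; 10s for 501 files; >10 min for 5031). The ">10 min" figure is a lower bound, so the ratios below are lower bounds too, and the power-law fit is an assumption made purely for illustration.

```typescript
// gitleaks gate: sequential subprocess spawns at ~256ms each.
const gitleaksSpawns = 1841;
const gitleaksMsPerSpawn = 256;
const gitleaksSeconds = (gitleaksSpawns * gitleaksMsPerSpawn) / 1000;

// gbrain import timings quoted in the text.
const small = { files: 501, seconds: 10 };
const large = { files: 5031, seconds: 10 * 60 }; // ">10 min", so a floor

const fileRatio = large.files / small.files;
const timeRatio = large.seconds / small.seconds;
const perFileMsSmall = (small.seconds * 1000) / small.files;
const perFileMsLarge = (large.seconds * 1000) / large.files;

// Assumption: if cost grows like files^k, then k >= ln(timeRatio) / ln(fileRatio).
const exponentLowerBound = Math.log(timeRatio) / Math.log(fileRatio);

console.log({
  gitleaksSeconds,
  fileRatio: fileRatio.toFixed(2),
  timeRatio,
  perFileMsSmall: perFileMsSmall.toFixed(1),
  perFileMsLarge: perFileMsLarge.toFixed(1),
  exponentLowerBound: exponentLowerBound.toFixed(2),
});
```

Roughly ten times the files cost at least sixty times the wall-clock time, so per-file cost itself grows with directory size. That points at work that scales with the number of existing pages rather than at fixed per-file overhead.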
## F9 hash migration (one-time cliff)
F9 switched `fileSha256` from a 1MB-capped hash to full-file. Existing state
entries from before this change carry the old 1MB-capped hash. For any file
whose mtime hasn't changed, `fileChangedSinceState` returns false at the
mtime check and the new hash is never computed — so unchanged files behave
identically. For any file whose mtime DOES change after upgrade, the
full-file hash is recomputed and (correctly) treated as changed, then
re-imported. The `gbrain doctor` probe report's `updated_count` may show
inflated numbers on the first run post-upgrade because every touched file
crosses the algorithm boundary. No data loss, but worth knowing.
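A minimal sketch of the cliff, assuming change detection checks mtime first and only then compares hashes. `changedSinceState` is a hypothetical stand-in for `fileChangedSinceState` (whose body is not shown here), and the 2MB buffer is synthetic.

```typescript
import { Buffer } from "node:buffer";
import { createHash } from "node:crypto";

const CAP = 1024 * 1024; // the pre-F9 1MB cap

const sha256 = (buf: Buffer): string =>
  createHash("sha256").update(buf).digest("hex");

interface StateEntry {
  mtime_ns: number;
  sha256: string;
}

// Hypothetical stand-in: mtime check first, hash comparison second.
function changedSinceState(
  mtimeNs: number,
  computeHash: () => string,
  prior: StateEntry | undefined,
): boolean {
  if (!prior) return true;
  if (prior.mtime_ns === mtimeNs) return false; // hash is never computed
  return computeHash() !== prior.sha256;
}

// Synthetic 2MB source whose state entry still carries the capped hash.
const content = Buffer.alloc(2 * CAP, "a");
const prior: StateEntry = {
  mtime_ns: 1000,
  sha256: sha256(content.subarray(0, CAP)),
};

let hashCalls = 0;
const fullHash = (): string => {
  hashCalls++;
  return sha256(content);
};

const untouched = changedSinceState(1000, fullHash, prior); // same mtime
const touched = changedSinceState(2000, fullHash, prior); // mtime bumped, same bytes
console.log({ untouched, touched, hashCalls });
```

In this sketch only files larger than the old cap cross the boundary, since for smaller files the capped and full hashes are the same value.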
## Follow-ups (filed as TODOs)
1. **gbrain import perf on large dirs** — investigate why 5031 files
take >10 min when 501 takes 10s. Likely culprits: N+1 SQL for
`getPage(slug)` content_hash check, per-page auto-link reconciliation,
FTS index updates without batching. Lives in gbrain, not gstack.
2. **Optional: source-file changed-detection cache** — even with the
prepare phase fast, walking 5031 files takes some time. Caching
the "no changes since last successful import" state at the
batch level (not per-file) would skip the prepare phase entirely
on a no-op incremental run.
## Problem
`/sync-gbrain` memory stage takes 35 minutes on a fresh PGLite and exits null,
losing all progress. Subsequent runs redo the same 35 minutes. Observed in
two consecutive runs (gbrain 0.30.0 broken-postgres run: 712s exit-null;
gbrain 0.31.2 PGLite run: 2100s exit-null with 501 pages actually persisted).
## Root cause (from /investigate)
Two compounding bugs in `bin/gstack-memory-ingest.ts`:
1. **Subprocess-per-file architecture.** The ingest loop at line 911 walks
1,841 files in `~/.gstack/projects/` and spawns two subprocesses per file:
- `gitleaks detect --no-git --source <path>` — 46ms cold start (`lib/gstack-memory-helpers.ts:157`)
- `gbrain put <slug>` — 329ms cold start (`bin/gstack-memory-ingest.ts:823`)
- Per-file floor: 375ms × 1841 = 690s (11.5 min) of pure subprocess startup
before any actual work happens.
2. **Kill-no-save timeout.** Orchestrator at `bin/gstack-gbrain-sync.ts:442`
enforces a 35-min timeout. When it fires, `spawnSync` returns
`result.status === null`, the child gets SIGTERM, and the in-memory
ingest state never flushes to `~/.gstack/.transcript-ingest-state.json`.
Next run starts from the same un-progressed state — explains the
redo-everything pattern.
## Numbers from the field
| Metric | Value | Source |
|---|---|---|
| Files in walkAllSources | 1,841 | `find ~/.gstack/projects -type f \( -name "*.md" -o -name "*.jsonl" \)` |
| `gbrain put` cold start | 329ms | `time (echo "test" \| gbrain put _bench)` |
| `gitleaks detect` cold start | 46ms | `time gitleaks detect --no-git --source <small-file>` |
| Theoretical floor (subprocess only) | 690s / 11.5 min | 375ms × 1841 |
| Observed run time | 2100s / 35 min | matches orchestrator timeout exactly |
| Pages actually persisted | 501 | gbrain sources list page_count |
| PGLite growth during run | 290 → 386 MB | `du -sh ~/.gbrain/brain.pglite` |
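The theoretical floor in the table can be reproduced with a few lines of arithmetic. The snippet below is only a worked check of the figures already listed; the variable names are made up for illustration.

```typescript
// Figures taken from the table above.
const fileCount = 1841;
const gitleaksColdStartMs = 46;
const gbrainPutColdStartMs = 329;
const observedSeconds = 2100; // equals the orchestrator timeout

const perFileFloorMs = gitleaksColdStartMs + gbrainPutColdStartMs;
const floorSeconds = Math.floor((perFileFloorMs * fileCount) / 1000);
const floorMinutes = floorSeconds / 60;
const unexplainedSeconds = observedSeconds - floorSeconds;

console.log({ perFileFloorMs, floorSeconds, floorMinutes, unexplainedSeconds });
```

Because the observed figure is the timeout ceiling rather than a completed run, the gap between the floor and the real total is at least the `unexplainedSeconds` value.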
## Proposed architecture
Replace the per-file subprocess loop with a **prepare-then-batch** pipeline:
```
walkAllSources(ctx)
  → prepareStage (in-process, fast):
      parse transcripts/artifacts
      build PageRecord with custom YAML frontmatter
      write prepared .md to staging dir
      gitleaks scan (single subprocess on staging dir)
  → gbrain import <staging-dir> --no-embed   (single subprocess)
  → flush state file with all successes
  → cleanup staging dir
```
### Why `gbrain import <dir>` is the right batch path
- Already shipped in gbrain CLI (verified: `gbrain --help` shows `import <dir> [--no-embed]`).
- Walks dir in-process inside gbrain's own runtime — no subprocess fan-out.
- Honors gbrain's batch-size and embedding-batch tuning.
- gbrain v0.31.2 import did 501 pages + 2906 chunks in 10 seconds during the
observed run; the slow part was OUR per-file `gbrain put` loop above it.
### What we keep that the current code does right
- **Custom YAML frontmatter injection** (title, type, tags) — preserved by
writing prepared .md files with frontmatter into the staging dir.
- **Secret scanning** — preserved, but moved to ONE `gitleaks detect --source <staging-dir>`
call after prepare, before import. Files with findings get redacted or
excluded; staging dir guarantees gitleaks sees only the prepared content,
not internal gbrain state.
- **Partial-transcript detection** — preserved in prepare stage; partial
files still get a `partial: true` field in frontmatter.
- **Unattributed-transcript filtering** — preserved in prepare stage.
- **Per-file mtime + sha256 state tracking** — preserved; the prepare stage
records what got staged, the import-success result records what landed.
- **Incremental mode** — `fileChangedSinceState` check stays at the top of
the prepare loop.
## Migration steps
### Step 1: extract `preparePages` from current ingest loop
Take everything in `ingestPass` (lines 899-988 of `bin/gstack-memory-ingest.ts`)
between the walk and the `gbrainPutPage` call. Move into a new function
`preparePages(args, ctx, state) → { staged: PreparedPage[], skipped, failed }`.
Output: list of `{ slug, body, source_path, mtime_ns, sha256, partial }`
where `body` is the full markdown including frontmatter.
### Step 2: add staging dir writer
Pure function: `writeStaged(prepared, stagingDir) → { written, errors }`.
Filename: `${slug}.md`. Idempotent overwrite.
Staging dir lifecycle:
- Created at `~/.gstack/.staging-ingest-${pid}-${ts}/`
- Cleaned in a `finally` block; on SIGTERM the Step 6 handler performs the same cleanup, since `finally` does not run on an unhandled signal
- One staging dir per ingest pass — never reused across runs
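A minimal sketch of the writer and the staging-dir lifecycle, assuming slugs contain no path separators. `PreparedPage` is trimmed to the two fields the writer needs, and `withStagingDir` is a hypothetical helper name, not something that exists in gstack today.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical, trimmed shape; the design lists more fields per page.
interface PreparedPage {
  slug: string;
  body: string; // full markdown including frontmatter
}

// Writes each page to `${slug}.md`; an existing file is overwritten.
function writeStaged(prepared: PreparedPage[], stagingDir: string) {
  const written: string[] = [];
  const errors: { slug: string; message: string }[] = [];
  fs.mkdirSync(stagingDir, { recursive: true });
  for (const page of prepared) {
    try {
      fs.writeFileSync(path.join(stagingDir, `${page.slug}.md`), page.body);
      written.push(page.slug);
    } catch (err) {
      errors.push({ slug: page.slug, message: String(err) });
    }
  }
  return { written, errors };
}

// One staging dir per pass, removed in `finally` even when `fn` throws.
function withStagingDir<T>(
  fn: (dir: string) => T,
  baseDir: string = path.join(os.homedir(), ".gstack"),
): T {
  const dir = path.join(baseDir, `.staging-ingest-${process.pid}-${Date.now()}`);
  try {
    return fn(dir);
  } finally {
    fs.rmSync(dir, { recursive: true, force: true });
  }
}
```

If real slugs can contain `/`, the writer would also need to create the intermediate directories before each write.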
### Step 3: single gitleaks pass
Replace per-file `secretScanFile(path)` calls with one call after prepare:
`gitleaks detect --no-git --source <staging-dir> --report-format json --report-path -`.
Parse JSON output, build `Map<slug, findings[]>`. Files with findings get
removed from staging dir before import (or sanitized in place per existing
redaction policy in `lib/gstack-memory-helpers.ts`).
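The grouping step could look roughly like this. It assumes the gitleaks JSON report is an array of objects carrying `RuleID`, `File`, and `StartLine`, and that every staged file is named `${slug}.md`; both are assumptions to verify against the installed gitleaks version.

```typescript
// Subset of the fields assumed to be present on each gitleaks report entry.
interface GitleaksFinding {
  RuleID: string;
  File: string;
  StartLine: number;
}

// Groups findings by slug, deriving the slug from the staged file's basename.
function findingsBySlug(reportJson: string): Map<string, GitleaksFinding[]> {
  const bySlug = new Map<string, GitleaksFinding[]>();
  const parsed: unknown = reportJson.trim() ? JSON.parse(reportJson) : [];
  const findings = Array.isArray(parsed) ? (parsed as GitleaksFinding[]) : [];
  for (const finding of findings) {
    const base = finding.File.split("/").pop() ?? finding.File;
    const slug = base.replace(/\.md$/, "");
    const list = bySlug.get(slug) ?? [];
    list.push(finding);
    bySlug.set(slug, list);
  }
  return bySlug;
}
```

Every key in the returned map is a slug whose staged file should be removed or sanitized before the import runs.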
### Step 4: replace `gbrainPutPage` loop with single import call
```typescript
const importResult = spawnSync("gbrain", ["import", stagingDir], {
  stdio: ["ignore", "pipe", "inherit"], // capture stdout so the summary line can be parsed
  encoding: "utf8",
  timeout: 30 * 60 * 1000, // generous; whole batch
});
```
Parse stdout for the `Import complete` line and the `failed` count.
### Step 5: persist state on partial success
If gbrain import reports `imported=N, failed=M`, save state for the N
successful slugs (not all of them). Failures stay un-state'd so they retry
next run, but successes don't redo.
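A sketch of the partial-success bookkeeping, assuming the import output lets us recover which slugs failed (the design does not confirm that it does). The `StagedPage` and `IngestState` shapes are hypothetical stand-ins for the real state file format.

```typescript
// Hypothetical shapes for illustration only.
interface StagedPage {
  slug: string;
  source_path: string;
  mtime_ns: string;
  sha256: string;
}
type IngestState = Record<string, { mtime_ns: string; sha256: string }>;

// Returns a new state containing entries for successfully imported pages only.
// Failed pages are left out so the next run retries them.
function recordSuccesses(
  state: IngestState,
  staged: StagedPage[],
  failedSlugs: Set<string>,
): IngestState {
  const next: IngestState = { ...state };
  for (const page of staged) {
    if (failedSlugs.has(page.slug)) continue;
    next[page.source_path] = { mtime_ns: page.mtime_ns, sha256: page.sha256 };
  }
  return next;
}
```

Returning a fresh object rather than mutating `state` keeps the previous state intact if the flush to disk fails partway through.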
### Step 6: SIGTERM handler in `gstack-memory-ingest.ts`
Wrap `main()` in:
```typescript
let interrupted = false;
const flush = (signal: NodeJS.Signals) => {
  if (interrupted) return; // ignore a second signal arriving mid-flush
  interrupted = true;
  saveState(state); // best-effort flush of whatever's accumulated
  cleanupStagingDir();
  process.exit(signal === "SIGINT" ? 130 : 143); // 128 + signal number
};
process.on("SIGTERM", flush);
process.on("SIGINT", flush);
```
This unblocks the kill-no-save bug independently — even if the batch import
runs over the orchestrator timeout, state from the prepare stage survives.
### Step 7: orchestrator update
In `bin/gstack-gbrain-sync.ts:444`:
- Change `result.status === 0` to `result.status === 0 || (parsedSummary.imported > 0 && parsedSummary.imported >= parsedSummary.skipped + parsedSummary.failed)`.
Treat partial success (most pages imported) as OK, not ERR.
- Surface `failed_count` and `partial_blockers` in the stage summary so the
user sees `Memory ... OK 487/501 imported (14 FILE_TOO_LARGE)` instead
of `ERR exited null`.
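The verdict rule and the summary line above can be expressed as two small pure functions. `ImportSummary`, `memoryStageOk`, and `stageLine` are illustrative names, and the counts are assumed to come from an already-parsed import summary.

```typescript
// Hypothetical parsed summary of the ingest run.
interface ImportSummary {
  imported: number;
  skipped: number;
  failed: number;
}

// OK on a clean exit, or when most pages landed despite a non-zero/null status.
function memoryStageOk(status: number | null, s: ImportSummary): boolean {
  if (status === 0) return true;
  return s.imported > 0 && s.imported >= s.skipped + s.failed;
}

// Builds the stage summary line shown to the user.
function stageLine(s: ImportSummary, failureReason: string): string {
  const total = s.imported + s.failed;
  const suffix = s.failed > 0 ? ` (${s.failed} ${failureReason})` : "";
  return `Memory ... OK ${s.imported}/${total} imported${suffix}`;
}
```

Note that `skipped` counts against the partial-success threshold, so an incremental run that skips most files and then dies would still be reported as an error under this rule.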
### Step 8: handle FILE_TOO_LARGE specifically
When gbrain reports FILE_TOO_LARGE, log to a new
`~/.gstack/.ingest-skip-list.json` so the next prepare stage skips that file
entirely. Avoids re-staging a file that will always fail. User can review
the skip list with a new `gstack-memory-ingest --skip-list` flag.
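One possible shape for the skip list, kept as pure functions so the JSON read and write stay at the edges. The `SkipEntry` fields are assumptions; the design only fixes the file location, not its schema.

```typescript
// Hypothetical schema for one entry in the skip list file.
interface SkipEntry {
  source_path: string;
  reason: string; // e.g. "FILE_TOO_LARGE"
}

// Adds a path once; repeated failures for the same file do not duplicate it.
function addSkip(list: SkipEntry[], source_path: string, reason: string): SkipEntry[] {
  if (list.some((entry) => entry.source_path === source_path)) return list;
  return [...list, { source_path, reason }];
}

// Drops skip-listed paths before the prepare stage stages anything.
function withoutSkipped(paths: string[], list: SkipEntry[]): string[] {
  const skipped = new Set(list.map((entry) => entry.source_path));
  return paths.filter((p) => !skipped.has(p));
}
```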
## Test plan
1. **Unit (free, runs in `bun test`):**
- `preparePages` against fixture corpus of 50 files: assert YAML correct,
partial detection works, unattributed filtered.
- `writeStaged` overwrite idempotency.
- SIGTERM handler flush behavior using a child-process test harness.
2. **Integration (free, runs in `bun test`):**
- End-to-end: prepare → gitleaks → gbrain import on a temp PGLite,
assert page_count matches imported count.
- Partial-success path: inject a deliberate FILE_TOO_LARGE; assert
successes still state'd, failure logged to skip list.
- State preservation across SIGTERM: spawn ingest, kill at midpoint,
restart, assert resumed state.
3. **Benchmark gate (periodic, paid):**
- Cold run on 1841-file fixture: assert under 8 min.
- Incremental run (no changes): assert under 60 sec.
- Test fixture: copy of `~/.gstack/projects/` snapshot for repeatable timing.
## Rollback strategy
- New `--legacy-ingest` flag on `gstack-memory-ingest` keeps the old
per-file path callable for one release cycle.
- If batch path regresses on a real corpus, set
`gstack-config set memory_ingest_path legacy` to revert without redeploy.
- Remove flag + legacy path one minor version after confirming batch is stable.
## Risks & open questions for plan-eng-review
1. **gbrain import idempotency on overlapping slugs.** If a previous run
wrote slug X to PGLite with old content, does `gbrain import` of
updated-X overwrite or duplicate? Need to test before relying on it.
2. **Frontmatter injection inside `gbrain import` parser.** Current code
knows how to inject title/type/tags into existing frontmatter blocks
(line 794-821). Does `gbrain import` honor those fields the same way
`gbrain put` does? Verify in unit test.
3. **Staging dir disk pressure.** 1841 files × avg ~50KB = ~92MB of
staging .md content. Acceptable on dev machines but worth knowing.
Alternative: stream prepared content to a tar piped to import (if gbrain
supports it) — likely not, ignore for V1.
4. **Cross-worktree concurrency.** `~/.gstack/.staging-ingest-${pid}-${ts}/`
is pid-namespaced so two concurrent /sync-gbrain runs don't collide.
But the orchestrator already holds a lock at `~/.gstack/.sync-gbrain.lock`
so this is belt-and-suspenders. Keep it.
5. **The "memory ingest exited null" message.** After this change, the
orchestrator might still see status=null on real OOM kills or SIGKILL.
Should the verdict block be more honest? E.g.,
`ERR memory: killed by signal SIGTERM at 35:00 (timeout)`.
6. **Should we deprecate `gbrain put` for memory entirely?** The legacy
path exists for V1.5's `put_file` migration plan. With batch import
working, do we still need single-page put as a fallback for ad-hoc
ingestion? Probably yes (for `~/.gstack/.transcript-ingest-state.json`
updates triggered outside the orchestrator), but worth confirming.
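For item 5, a more honest verdict line could be derived from the child's exit status and signal. This is one possible formatter, not a decided design: `memoryVerdict` and its parameters are hypothetical, with `signal` standing in for the `signal` field of a `spawnSync` result.

```typescript
// Formats the memory-stage verdict from exit status, signal, and timing.
function memoryVerdict(
  status: number | null,
  signal: string | null,
  elapsedMs: number,
  timeoutMs: number,
): string {
  if (status === 0) return "OK memory";
  if (status !== null) return `ERR memory: exited ${status}`;
  const totalSec = Math.floor(elapsedMs / 1000);
  const minutes = Math.floor(totalSec / 60);
  const seconds = totalSec % 60;
  const clock = `${minutes}:${seconds < 10 ? "0" : ""}${seconds}`;
  const cause = elapsedMs >= timeoutMs ? " (timeout)" : "";
  return `ERR memory: killed by signal ${signal ?? "unknown"} at ${clock}${cause}`;
}
```

A kill that arrives before the timeout, such as an OOM SIGKILL, is reported without the `(timeout)` suffix, which separates the two cases the item asks about.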
## What this isn't
- Not a gbrain CLI change. All work is in gstack.
- Not a CLAUDE.md voice/UX change.
- Not a new user-facing feature. CHANGELOG entry will read: "Memory ingest
is ~10× faster on cold runs and survives interruption."
## Acceptance criteria
- Cold `/sync-gbrain` on 1841 files completes in under 8 minutes.
- Incremental `/sync-gbrain` (no file changes) completes in under 60 seconds.
- SIGTERM mid-run flushes state; next run resumes without redoing
successfully-imported files.
- FILE_TOO_LARGE failures don't block sync.last_commit advancement.
- All existing test fixtures (transcripts, learnings, design-docs, ceo-plans)
ingest correctly with full frontmatter.
- No regression on partial-transcript or unattributed-transcript handling.
+16 -2
View File
@@ -823,9 +823,9 @@ Search for relevant learnings from previous sessions:
_CROSS_PROJ=$(~/.claude/skills/gstack/bin/gstack-config get cross_project_learnings 2>/dev/null || echo "unset")
echo "CROSS_PROJECT: $_CROSS_PROJ"
if [ "$_CROSS_PROJ" = "true" ]; then
~/.claude/skills/gstack/bin/gstack-learnings-search --limit 10 --cross-project 2>/dev/null || true
~/.claude/skills/gstack/bin/gstack-learnings-search --limit 10 --query "debug investigation root cause hypothesis bug fix" --cross-project 2>/dev/null || true
else
~/.claude/skills/gstack/bin/gstack-learnings-search --limit 10 2>/dev/null || true
~/.claude/skills/gstack/bin/gstack-learnings-search --limit 10 --query "debug investigation root cause hypothesis bug fix" 2>/dev/null || true
fi
```
@@ -855,6 +855,20 @@ smarter on their codebase over time.
Output: **"Root cause hypothesis: ..."** — a specific, testable claim about what is wrong and why.
### Refresh learnings for the hypothesis you just named
The top-of-skill learnings pull above is keyed to "debug investigation" broadly. Now that you have a specific hypothesis, re-pull learnings keyed to that hypothesis so prior fixes for the same problem-shape surface.
Pick ONE keyword from the hypothesis. The keyword should be a noun: the failing component name, the basename of the file you suspect (without extension), or the bug noun. The keyword MUST be alphanumeric or hyphen only — no quotes, slashes, dots, colons, or whitespace. If your candidate has any of those, simplify to just the alphanumeric stem.
Worked examples (investigate-specific): good keywords are `auth-cookie`, `session-expiry`, `redirect-loop`. Bad: `auth.ts:47`, `fix the auth bug`, `<hypothesis-keyword>`.
```bash
~/.claude/skills/gstack/bin/gstack-learnings-search --query "<your-keyword>" --limit 5 2>/dev/null || true
```
If any learnings come back, name which one applies to your investigation in one sentence. If none come back, continue without reference — the absence of a matching prior learning is itself useful information.
---
## Scope Lock
+15 -1
View File
@@ -93,10 +93,24 @@ Gather context before forming any hypothesis.
5. **Check investigation history:** Search prior learnings for investigations on the same files. Recurring bugs in the same area are an architectural smell. If prior investigations exist, note patterns and check if the root cause was structural.
{{LEARNINGS_SEARCH}}
{{LEARNINGS_SEARCH:query=debug investigation root cause hypothesis bug fix}}
Output: **"Root cause hypothesis: ..."** — a specific, testable claim about what is wrong and why.
### Refresh learnings for the hypothesis you just named
The top-of-skill learnings pull above is keyed to "debug investigation" broadly. Now that you have a specific hypothesis, re-pull learnings keyed to that hypothesis so prior fixes for the same problem-shape surface.
Pick ONE keyword from the hypothesis. The keyword should be a noun: the failing component name, the basename of the file you suspect (without extension), or the bug noun. The keyword MUST be alphanumeric or hyphen only — no quotes, slashes, dots, colons, or whitespace. If your candidate has any of those, simplify to just the alphanumeric stem.
Worked examples (investigate-specific): good keywords are `auth-cookie`, `session-expiry`, `redirect-loop`. Bad: `auth.ts:47`, `fix the auth bug`, `<hypothesis-keyword>`.
```bash
~/.claude/skills/gstack/bin/gstack-learnings-search --query "<your-keyword>" --limit 5 2>/dev/null || true
```
If any learnings come back, name which one applies to your investigation in one sentence. If none come back, continue without reference — the absence of a matching prior learning is itself useful information.
---
## Scope Lock
+1 -1
View File
@@ -1,6 +1,6 @@
{
"name": "gstack",
"version": "1.32.0.0",
"version": "1.33.2.0",
"description": "Garry's Stack — Claude Code skills + fast headless browser. One repo, one install, entire AI engineering workflow.",
"license": "MIT",
"type": "module",
+16 -2
View File
@@ -1068,9 +1068,9 @@ Search for relevant learnings from previous sessions:
_CROSS_PROJ=$(~/.claude/skills/gstack/bin/gstack-config get cross_project_learnings 2>/dev/null || echo "unset")
echo "CROSS_PROJECT: $_CROSS_PROJ"
if [ "$_CROSS_PROJ" = "true" ]; then
~/.claude/skills/gstack/bin/gstack-learnings-search --limit 10 --cross-project 2>/dev/null || true
~/.claude/skills/gstack/bin/gstack-learnings-search --limit 10 --query "qa testing bug regression flake fixture" --cross-project 2>/dev/null || true
else
~/.claude/skills/gstack/bin/gstack-learnings-search --limit 10 2>/dev/null || true
~/.claude/skills/gstack/bin/gstack-learnings-search --limit 10 --query "qa testing bug regression flake fixture" 2>/dev/null || true
fi
```
@@ -1426,6 +1426,20 @@ Sort all discovered issues by severity, then decide which to fix based on the se
Mark issues that cannot be fixed from source code (e.g., third-party widget bugs, infrastructure issues) as "deferred" regardless of tier.
### Refresh learnings for the component/page where the bug lives
The top-of-skill learnings pull was keyed to "qa testing" broadly. Before the fix loop, re-pull learnings keyed to the component or page where the bug you're about to fix lives so prior fixes for the same component-shape surface.
Pick ONE keyword that names the buggy component or page. The keyword should be a noun: the failing component name, the page route base, or the feature noun. The keyword MUST be alphanumeric or hyphen only — no quotes, slashes, dots, colons, or whitespace. If your candidate has any of those, simplify to just the alphanumeric stem.
Worked examples (qa-specific): good keywords are `checkout-button`, `signup-form`, `payment`. Bad: `tests are failing`, `<failing-test>`, `app/views/_checkout.html.erb`.
```bash
~/.claude/skills/gstack/bin/gstack-learnings-search --query "<your-keyword>" --limit 5 2>/dev/null || true
```
If any learnings come back, name which one applies to the fix you're about to make in one sentence. If none come back, continue without reference — the absence is itself useful information.
---
## Phase 8: Fix Loop
+15 -1
View File
@@ -100,7 +100,7 @@ mkdir -p .gstack/qa-reports/screenshots
---
{{LEARNINGS_SEARCH}}
{{LEARNINGS_SEARCH:query=qa testing bug regression flake fixture}}
## Test Plan Context
@@ -154,6 +154,20 @@ Sort all discovered issues by severity, then decide which to fix based on the se
Mark issues that cannot be fixed from source code (e.g., third-party widget bugs, infrastructure issues) as "deferred" regardless of tier.
### Refresh learnings for the component/page where the bug lives
The top-of-skill learnings pull was keyed to "qa testing" broadly. Before the fix loop, re-pull learnings keyed to the component or page where the bug you're about to fix lives so prior fixes for the same component-shape surface.
Pick ONE keyword that names the buggy component or page. The keyword should be a noun: the failing component name, the page route base, or the feature noun. The keyword MUST be alphanumeric or hyphen only — no quotes, slashes, dots, colons, or whitespace. If your candidate has any of those, simplify to just the alphanumeric stem.
Worked examples (qa-specific): good keywords are `checkout-button`, `signup-form`, `payment`. Bad: `tests are failing`, `<failing-test>`, `app/views/_checkout.html.erb`.
```bash
~/.claude/skills/gstack/bin/gstack-learnings-search --query "<your-keyword>" --limit 5 2>/dev/null || true
```
If any learnings come back, name which one applies to the fix you're about to make in one sentence. If none come back, continue without reference — the absence is itself useful information.
---
## Phase 8: Fix Loop
+24 -4
View File
@@ -13,7 +13,27 @@
*/
import type { TemplateContext } from './types';
export function generateLearningsSearch(ctx: TemplateContext): string {
// Whitelist for query= macro values. Allows alphanumeric, space, hyphen, underscore.
// Anything else (e.g. $, backticks, quotes, ;) is a shell-injection vector when the
// emitted bash interpolates the value into `--query "${queryArg}"`. Static template
// queries hand-written in gstack are safe, but the resolver API must defend against
// future contributors writing dangerous values.
const QUERY_SAFE_RE = /^[A-Za-z0-9 _-]+$/;
export function generateLearningsSearch(ctx: TemplateContext, args?: string[]): string {
// Parse query= arg. Empty value falls through to no-query (principle of least surprise:
// a stray {{LEARNINGS_SEARCH:query=}} placeholder gets today's behavior, not a build error).
const queryArg = (args || [])
.filter(a => a.startsWith('query='))
.map(a => a.slice(6))
.filter(Boolean)[0];
if (queryArg && !QUERY_SAFE_RE.test(queryArg)) {
throw new Error(
`{{LEARNINGS_SEARCH:query=...}} value must match ${QUERY_SAFE_RE} (alphanumeric, space, hyphen, underscore). Got: ${JSON.stringify(queryArg)}`
);
}
const queryFlag = queryArg ? ` --query "${queryArg}"` : '';
if (ctx.host === 'codex') {
// Codex: simpler version, no cross-project, uses $GSTACK_BIN
return `## Prior Learnings
@@ -21,7 +41,7 @@ export function generateLearningsSearch(ctx: TemplateContext): string {
Search for relevant learnings from previous sessions on this project:
\`\`\`bash
$GSTACK_BIN/gstack-learnings-search --limit 10 2>/dev/null || true
$GSTACK_BIN/gstack-learnings-search --limit 10${queryFlag} 2>/dev/null || true
\`\`\`
If learnings are found, incorporate them into your analysis. When a review finding
@@ -36,9 +56,9 @@ Search for relevant learnings from previous sessions:
_CROSS_PROJ=$(${ctx.paths.binDir}/gstack-config get cross_project_learnings 2>/dev/null || echo "unset")
echo "CROSS_PROJECT: $_CROSS_PROJ"
if [ "$_CROSS_PROJ" = "true" ]; then
${ctx.paths.binDir}/gstack-learnings-search --limit 10 --cross-project 2>/dev/null || true
${ctx.paths.binDir}/gstack-learnings-search --limit 10${queryFlag} --cross-project 2>/dev/null || true
else
${ctx.paths.binDir}/gstack-learnings-search --limit 10 2>/dev/null || true
${ctx.paths.binDir}/gstack-learnings-search --limit 10${queryFlag} 2>/dev/null || true
fi
\`\`\`
+58 -25
View File
@@ -806,35 +806,68 @@ if [ "$INSTALL_CLAUDE" -eq 1 ]; then
fi
log " browse: $BROWSE_BIN"
else
# Not inside a skills/ directory — symlink into ~/.claude/skills/ and retry
# Not inside a skills/ directory — would symlink the source into
# ~/.claude/skills/gstack/ and register from there.
CLAUDE_SKILLS_DIR="$HOME/.claude/skills"
CLAUDE_GSTACK_LINK="$CLAUDE_SKILLS_DIR/gstack"
mkdir -p "$CLAUDE_SKILLS_DIR"
ln -snf "$SOURCE_GSTACK_DIR" "$CLAUDE_GSTACK_LINK"
log " symlinked $CLAUDE_GSTACK_LINK -> $SOURCE_GSTACK_DIR"
INSTALL_SKILLS_DIR="$CLAUDE_SKILLS_DIR"
INSTALL_GSTACK_DIR="$CLAUDE_GSTACK_LINK"
# Clean up stale symlinks from the opposite prefix mode
if [ "$SKILL_PREFIX" -eq 1 ]; then
cleanup_old_claude_symlinks "$SOURCE_GSTACK_DIR" "$INSTALL_SKILLS_DIR"
# Conductor worktree guard: if ~/.claude/skills/gstack is already a real
# (non-symlink) directory pointing to a *different* install, refuse to plant
# a symlink there. On macOS/BSD, `ln -snf SRC DST` won't replace a real DST;
# it creates DST/$(basename SRC) → SRC inside it. The result is per-worktree
# symlinks leaking into the global install that Claude Code picks up as
# separate top-level skills (dublin-v1, lincoln-v2, ...). Typical trigger:
# running ./setup from a Conductor worktree of the gstack repo itself.
_SKIP_CLAUDE_REGISTER=0
if [ -d "$CLAUDE_GSTACK_LINK" ] && [ ! -L "$CLAUDE_GSTACK_LINK" ]; then
_EXISTING_REAL=$(cd "$CLAUDE_GSTACK_LINK" 2>/dev/null && pwd -P || echo "")
if [ -n "$_EXISTING_REAL" ] && [ "$_EXISTING_REAL" != "$SOURCE_GSTACK_DIR" ]; then
_SKIP_CLAUDE_REGISTER=1
fi
fi
if [ "$_SKIP_CLAUDE_REGISTER" -eq 1 ]; then
log ""
log " $CLAUDE_GSTACK_LINK already exists as a separate global install."
log " Skipping Claude skill registration to avoid polluting it with"
log " per-worktree symlinks. (Binaries still built locally for dev.)"
log ""
log " Global install: $CLAUDE_GSTACK_LINK"
log " This worktree: $SOURCE_GSTACK_DIR"
log ""
log " To register this worktree as the active gstack, remove the global"
log " install first: rm -rf $CLAUDE_GSTACK_LINK"
log ""
log "gstack built (claude registration skipped)."
log " browse: $BROWSE_BIN"
else
cleanup_prefixed_claude_symlinks "$SOURCE_GSTACK_DIR" "$INSTALL_SKILLS_DIR"
mkdir -p "$CLAUDE_SKILLS_DIR"
ln -snf "$SOURCE_GSTACK_DIR" "$CLAUDE_GSTACK_LINK"
log " symlinked $CLAUDE_GSTACK_LINK -> $SOURCE_GSTACK_DIR"
INSTALL_SKILLS_DIR="$CLAUDE_SKILLS_DIR"
INSTALL_GSTACK_DIR="$CLAUDE_GSTACK_LINK"
# Clean up stale symlinks from the opposite prefix mode
if [ "$SKILL_PREFIX" -eq 1 ]; then
cleanup_old_claude_symlinks "$SOURCE_GSTACK_DIR" "$INSTALL_SKILLS_DIR"
else
cleanup_prefixed_claude_symlinks "$SOURCE_GSTACK_DIR" "$INSTALL_SKILLS_DIR"
fi
"$SOURCE_GSTACK_DIR/bin/gstack-patch-names" "$SOURCE_GSTACK_DIR" "$SKILL_PREFIX"
link_claude_skill_dirs "$SOURCE_GSTACK_DIR" "$INSTALL_SKILLS_DIR"
GSTACK_RELINK="$SOURCE_GSTACK_DIR/bin/gstack-relink"
if [ -x "$GSTACK_RELINK" ]; then
GSTACK_SKILLS_DIR="$INSTALL_SKILLS_DIR" GSTACK_INSTALL_DIR="$SOURCE_GSTACK_DIR" "$GSTACK_RELINK" >/dev/null 2>&1 || true
fi
_OGB_LINK="$INSTALL_SKILLS_DIR/connect-chrome"
if [ "$SKILL_PREFIX" -eq 1 ]; then
_OGB_LINK="$INSTALL_SKILLS_DIR/gstack-connect-chrome"
fi
if [ -L "$_OGB_LINK" ] || [ ! -e "$_OGB_LINK" ]; then
ln -snf "gstack/open-gstack-browser" "$_OGB_LINK"
fi
log "gstack ready (claude)."
log " browse: $BROWSE_BIN"
fi
"$SOURCE_GSTACK_DIR/bin/gstack-patch-names" "$SOURCE_GSTACK_DIR" "$SKILL_PREFIX"
link_claude_skill_dirs "$SOURCE_GSTACK_DIR" "$INSTALL_SKILLS_DIR"
GSTACK_RELINK="$SOURCE_GSTACK_DIR/bin/gstack-relink"
if [ -x "$GSTACK_RELINK" ]; then
GSTACK_SKILLS_DIR="$INSTALL_SKILLS_DIR" GSTACK_INSTALL_DIR="$SOURCE_GSTACK_DIR" "$GSTACK_RELINK" >/dev/null 2>&1 || true
fi
_OGB_LINK="$INSTALL_SKILLS_DIR/connect-chrome"
if [ "$SKILL_PREFIX" -eq 1 ]; then
_OGB_LINK="$INSTALL_SKILLS_DIR/gstack-connect-chrome"
fi
if [ -L "$_OGB_LINK" ] || [ ! -e "$_OGB_LINK" ]; then
ln -snf "gstack/open-gstack-browser" "$_OGB_LINK"
fi
log "gstack ready (claude)."
log " browse: $BROWSE_BIN"
fi
fi
+23 -12
View File
@@ -37,9 +37,22 @@ happens after you say yes.
## What gets scanned for secrets
Every ingested page passes through **gitleaks** before write
(per D19 — replaces the regex scanner that previously ran only on
staged git diffs). Gitleaks is industry-standard, covers:
The cross-machine secret boundary is `gstack-brain-sync` (the git push
to your private artifacts repo), which runs its own scanner before any
content leaves this Mac. Local PGLite ingest doesn't change the exposure
surface for content that already lives on disk in plaintext.
Per-file **gitleaks** scanning during memory ingest is **opt-in** as of
v1.33.0.0 — off by default. To re-enable it (adds ~4-8 min to cold runs
on a large transcript corpus), use either:
```bash
gstack-memory-ingest --bulk --scan-secrets
# or
GSTACK_MEMORY_INGEST_SCAN_SECRETS=1 gstack-memory-ingest --bulk
```
When enabled, gitleaks covers:
- AWS / GCP / Azure access keys
- ANTHROPIC_API_KEY, OPENAI_API_KEY, GitHub tokens
@@ -50,13 +63,11 @@ A session with a positive finding is **skipped entirely** — not partially
redacted. The match line + rule ID are logged to stderr; you can see what
was skipped via `bun run bin/gstack-memory-ingest.ts --probe` (which
shows new vs. updated counts) or by reviewing the helper's output during
`/gbrain-sync --full`.
`/sync-gbrain --full`.
If gitleaks is not installed (run `brew install gitleaks` on macOS, or
`apt install gitleaks` on Linux), the helper warns once and disables
secret scanning. **In that mode, transcripts ingest unscanned. Don't run
ingest without gitleaks if you have any concern about secrets in your
sessions.**
`apt install gitleaks` on Linux) and you passed `--scan-secrets` anyway,
the helper warns once and disables secret scanning for that run.
## Where it goes
@@ -168,14 +179,14 @@ Common cases:
- Brain-sync git history shows every curated artifact push with the
user's git identity.
If you find a transcript page that contains a secret gitleaks missed,
the recovery path is:
If you find a transcript page that contains a secret (either because
per-file scanning was off, or gitleaks missed it), the recovery path is:
1. `gbrain delete_page <slug>` — removes from index immediately
2. Rotate the secret (rotate it anyway as a defensive measure)
3. If brain-sync is on: `git filter-repo --invert-paths --path <relative-path>`
on the brain remote for hard-delete from history
4. File a gitleaks issue with the pattern (or extend the gitleaks config
at `~/.gitleaks.toml`).
4. If the miss looks like a gitleaks rule gap, file a gitleaks issue
with the pattern (or extend the gitleaks config at `~/.gitleaks.toml`).
## Path 4: Remote MCP setup (v1.27.0.0+)
+16 -2
View File
@@ -1824,9 +1824,9 @@ Search for relevant learnings from previous sessions:
_CROSS_PROJ=$(~/.claude/skills/gstack/bin/gstack-config get cross_project_learnings 2>/dev/null || echo "unset")
echo "CROSS_PROJECT: $_CROSS_PROJ"
if [ "$_CROSS_PROJ" = "true" ]; then
~/.claude/skills/gstack/bin/gstack-learnings-search --limit 10 --cross-project 2>/dev/null || true
~/.claude/skills/gstack/bin/gstack-learnings-search --limit 10 --query "release ship version changelog merge pr" --cross-project 2>/dev/null || true
else
~/.claude/skills/gstack/bin/gstack-learnings-search --limit 10 2>/dev/null || true
~/.claude/skills/gstack/bin/gstack-learnings-search --limit 10 --query "release ship version changelog merge pr" 2>/dev/null || true
fi
```
@@ -2459,6 +2459,20 @@ already knows. A good test: would this insight save time in a future session? If
### Refresh learnings for the headline feature on this branch
The top-of-skill learnings pull was keyed to "release ship" broadly. Before the VERSION/CHANGELOG step, re-pull learnings keyed to THIS branch's headline feature so any prior version-bump or CHANGELOG pitfalls for similar features surface.
Pick ONE keyword that names the headline feature you're shipping. The keyword should be a noun: the primary skill or module name, the central feature noun, or the binary you changed. The keyword MUST be alphanumeric or hyphen only — no quotes, slashes, dots, colons, or whitespace. If your candidate has any of those, simplify to just the alphanumeric stem.
Worked examples (ship-specific): good keywords are `learnings-search`, `pacing`, `worktree-ship`. Bad: `the branch headline`, `v1.31.1.0`, `feat: token-or search`.
```bash
~/.claude/skills/gstack/bin/gstack-learnings-search --query "<your-keyword>" --limit 5 2>/dev/null || true
```
If any learnings come back, name which one applies to the version bump or CHANGELOG framing in one sentence. If none come back, continue without reference — the absence is itself useful information.
## Step 12: Version bump (auto-decide)
**Idempotency check:** Before bumping, classify the state by comparing `VERSION` against the base branch AND against `package.json`'s `version` field. Four states: FRESH (do bump), ALREADY_BUMPED (skip bump), DRIFT_STALE_PKG (sync pkg only, no re-bump), DRIFT_UNEXPECTED (stop and ask).
+15 -1
View File
@@ -283,7 +283,7 @@ If multiple suites need to run, run them sequentially (each needs a test lane).
{{PLAN_VERIFICATION_EXEC}}
{{LEARNINGS_SEARCH}}
{{LEARNINGS_SEARCH:query=release ship version changelog merge pr}}
{{SCOPE_DRIFT}}
@@ -401,6 +401,20 @@ For each comment in `comments`:
{{GBRAIN_SAVE_RESULTS}}
### Refresh learnings for the headline feature on this branch
The top-of-skill learnings pull was keyed to "release ship" broadly. Before the VERSION/CHANGELOG step, re-pull learnings keyed to THIS branch's headline feature so any prior version-bump or CHANGELOG pitfalls for similar features surface.
Pick ONE keyword that names the headline feature you're shipping. The keyword should be a noun: the primary skill or module name, the central feature noun, or the binary you changed. The keyword MUST be alphanumeric or hyphen only — no quotes, slashes, dots, colons, or whitespace. If your candidate has any of those, simplify to just the alphanumeric stem.
Worked examples (ship-specific): good keywords are `learnings-search`, `pacing`, `worktree-ship`. Bad: `the branch headline`, `v1.31.1.0`, `feat: token-or search`.
```bash
~/.claude/skills/gstack/bin/gstack-learnings-search --query "<your-keyword>" --limit 5 2>/dev/null || true
```
If any learnings come back, name which one applies to the version bump or CHANGELOG framing in one sentence. If none come back, continue without reference — the absence is itself useful information.
## Step 12: Version bump (auto-decide)
**Idempotency check:** Before bumping, classify the state by comparing `VERSION` against the base branch AND against `package.json`'s `version` field. Four states: FRESH (do bump), ALREADY_BUMPED (skip bump), DRIFT_STALE_PKG (sync pkg only, no re-bump), DRIFT_UNEXPECTED (stop and ask).
+16 -2
View File
@@ -1824,9 +1824,9 @@ Search for relevant learnings from previous sessions:
_CROSS_PROJ=$(~/.claude/skills/gstack/bin/gstack-config get cross_project_learnings 2>/dev/null || echo "unset")
echo "CROSS_PROJECT: $_CROSS_PROJ"
if [ "$_CROSS_PROJ" = "true" ]; then
~/.claude/skills/gstack/bin/gstack-learnings-search --limit 10 --cross-project 2>/dev/null || true
~/.claude/skills/gstack/bin/gstack-learnings-search --limit 10 --query "release ship version changelog merge pr" --cross-project 2>/dev/null || true
else
~/.claude/skills/gstack/bin/gstack-learnings-search --limit 10 2>/dev/null || true
~/.claude/skills/gstack/bin/gstack-learnings-search --limit 10 --query "release ship version changelog merge pr" 2>/dev/null || true
fi
```
@@ -2459,6 +2459,20 @@ already knows. A good test: would this insight save time in a future session? If
### Refresh learnings for the headline feature on this branch
The top-of-skill learnings pull was keyed to "release ship" broadly. Before the VERSION/CHANGELOG step, re-pull learnings keyed to THIS branch's headline feature so any prior version-bump or CHANGELOG pitfalls for similar features surface.
Pick ONE keyword that names the headline feature you're shipping. The keyword should be a noun: the primary skill or module name, the central feature noun, or the binary you changed. The keyword MUST be alphanumeric or hyphen only — no quotes, slashes, dots, colons, or whitespace. If your candidate has any of those, simplify to just the alphanumeric stem.
Worked examples (ship-specific): good keywords are `learnings-search`, `pacing`, `worktree-ship`. Bad: `the branch headline`, `v1.31.1.0`, `feat: token-or search`.
```bash
~/.claude/skills/gstack/bin/gstack-learnings-search --query "<your-keyword>" --limit 5 2>/dev/null || true
```
If any learnings come back, name which one applies to the version bump or CHANGELOG framing in one sentence. If none come back, continue without reference — the absence is itself useful information.
## Step 12: Version bump (auto-decide)
**Idempotency check:** Before bumping, classify the state by comparing `VERSION` against the base branch AND against `package.json`'s `version` field. Four states: FRESH (do bump), ALREADY_BUMPED (skip bump), DRIFT_STALE_PKG (sync pkg only, no re-bump), DRIFT_UNEXPECTED (stop and ask).
+15 -1
@@ -1810,7 +1810,7 @@ Add a `## Verification Results` section to the PR body (Step 19):
Search for relevant learnings from previous sessions on this project:
```bash
$GSTACK_BIN/gstack-learnings-search --limit 10 2>/dev/null || true
$GSTACK_BIN/gstack-learnings-search --limit 10 --query "release ship version changelog merge pr" 2>/dev/null || true
```
If learnings are found, incorporate them into your analysis. When a review finding
@@ -2074,6 +2074,20 @@ already knows. A good test: would this insight save time in a future session? If
### Refresh learnings for the headline feature on this branch
The top-of-skill learnings pull was keyed to "release ship" broadly. Before the VERSION/CHANGELOG step, re-pull learnings keyed to THIS branch's headline feature so any prior version-bump or CHANGELOG pitfalls for similar features surface.
Pick ONE keyword that names the headline feature you're shipping. The keyword should be a noun: the primary skill or module name, the central feature noun, or the binary you changed. The keyword MUST be alphanumeric or hyphen only — no quotes, slashes, dots, colons, or whitespace. If your candidate has any of those, simplify to just the alphanumeric stem.
Worked examples (ship-specific): good keywords are `learnings-search`, `pacing`, `worktree-ship`. Bad: `the branch headline`, `v1.31.1.0`, `feat: token-or search`.
```bash
$GSTACK_ROOT/bin/gstack-learnings-search --query "<your-keyword>" --limit 5 2>/dev/null || true
```
If any learnings come back, name which one applies to the version bump or CHANGELOG framing in one sentence. If none come back, continue without reference — the absence is itself useful information.
## Step 12: Version bump (auto-decide)
**Idempotency check:** Before bumping, classify the state by comparing `VERSION` against the base branch AND against `package.json`'s `version` field. Four states: FRESH (do bump), ALREADY_BUMPED (skip bump), DRIFT_STALE_PKG (sync pkg only, no re-bump), DRIFT_UNEXPECTED (stop and ask).
+16 -2
@@ -1815,9 +1815,9 @@ Search for relevant learnings from previous sessions:
_CROSS_PROJ=$($GSTACK_BIN/gstack-config get cross_project_learnings 2>/dev/null || echo "unset")
echo "CROSS_PROJECT: $_CROSS_PROJ"
if [ "$_CROSS_PROJ" = "true" ]; then
$GSTACK_BIN/gstack-learnings-search --limit 10 --cross-project 2>/dev/null || true
$GSTACK_BIN/gstack-learnings-search --limit 10 --query "release ship version changelog merge pr" --cross-project 2>/dev/null || true
else
$GSTACK_BIN/gstack-learnings-search --limit 10 2>/dev/null || true
$GSTACK_BIN/gstack-learnings-search --limit 10 --query "release ship version changelog merge pr" 2>/dev/null || true
fi
```
@@ -2450,6 +2450,20 @@ already knows. A good test: would this insight save time in a future session? If
### Refresh learnings for the headline feature on this branch
The top-of-skill learnings pull was keyed to "release ship" broadly. Before the VERSION/CHANGELOG step, re-pull learnings keyed to THIS branch's headline feature so any prior version-bump or CHANGELOG pitfalls for similar features surface.
Pick ONE keyword that names the headline feature you're shipping. The keyword should be a noun: the primary skill or module name, the central feature noun, or the binary you changed. The keyword MUST be alphanumeric or hyphen only — no quotes, slashes, dots, colons, or whitespace. If your candidate has any of those, simplify to just the alphanumeric stem.
Worked examples (ship-specific): good keywords are `learnings-search`, `pacing`, `worktree-ship`. Bad: `the branch headline`, `v1.31.1.0`, `feat: token-or search`.
```bash
$GSTACK_ROOT/bin/gstack-learnings-search --query "<your-keyword>" --limit 5 2>/dev/null || true
```
If any learnings come back, name which one applies to the version bump or CHANGELOG framing in one sentence. If none come back, continue without reference — the absence is itself useful information.
## Step 12: Version bump (auto-decide)
**Idempotency check:** Before bumping, classify the state by comparing `VERSION` against the base branch AND against `package.json`'s `version` field. Four states: FRESH (do bump), ALREADY_BUMPED (skip bump), DRIFT_STALE_PKG (sync pkg only, no re-bump), DRIFT_UNEXPECTED (stop and ask).
+48
@@ -3047,3 +3047,51 @@ describe('GSTACK REVIEW REPORT delete-then-append flow', () => {
expect(src).not.toContain('If it was found mid-file, move it');
});
});
describe('LEARNINGS_SEARCH resolver: query parameter', () => {
// Lazy-load the resolver and types inside the describe block to keep the test file self-contained.
const { generateLearningsSearch } = require('../scripts/resolvers/learnings');
const { HOST_PATHS } = require('../scripts/resolvers/types');
const claudeCtx = {
skillName: 'test',
tmplPath: 'test/SKILL.md.tmpl',
host: 'claude',
paths: HOST_PATHS.claude,
};
const codexCtx = { ...claudeCtx, host: 'codex', paths: HOST_PATHS.codex };
test('no args → bash does not contain --query (backwards-compat)', () => {
const out = generateLearningsSearch(claudeCtx);
expect(out).not.toContain('--query');
});
test('claude host + query=foo bar → both cross-project and project-scoped branches contain --query', () => {
const out = generateLearningsSearch(claudeCtx, ['query=foo bar']);
// Both branches of the if/else must carry the flag.
const lines = out.split('\n').filter(l => l.includes('gstack-learnings-search'));
expect(lines.length).toBeGreaterThanOrEqual(2);
for (const line of lines) {
expect(line).toContain('--query "foo bar"');
}
});
test('codex host + query=foo bar → codex bash variant contains --query', () => {
const out = generateLearningsSearch(codexCtx, ['query=foo bar']);
expect(out).toContain('--query "foo bar"');
expect(out).toContain('$GSTACK_BIN/gstack-learnings-search');
});
test('empty value query= → bash does not contain --query (locked semantics: falls through)', () => {
const claudeOut = generateLearningsSearch(claudeCtx, ['query=']);
expect(claudeOut).not.toContain('--query');
const codexOut = generateLearningsSearch(codexCtx, ['query=']);
expect(codexOut).not.toContain('--query');
});
test('shell-injection chars in query= → throws at gen-time (defense in depth)', () => {
for (const bad of ['$(whoami)', '`cmd`', 'a;b', 'a&b', 'a"b', 'a\\b', 'foo$x']) {
expect(() => generateLearningsSearch(claudeCtx, [`query=${bad}`])).toThrow(/alphanumeric/);
}
});
});
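The behavior these resolver tests pin down (no flag without a query, no flag for an empty value, a quoted flag for a safe value, a generation-time throw for shell metacharacters) can be sketched as below. This is not the real `generateLearningsSearch`; `buildQueryFlag` is a hypothetical helper, and the allowed character set (letters, digits, spaces, hyphens) is an assumption consistent with the test inputs.

```typescript
// Hypothetical sketch of the query-argument handling the tests describe.
// The real resolver lives in scripts/resolvers/learnings and may differ.
function buildQueryFlag(args: string[] = []): string {
  const raw = args.find((arg) => arg.startsWith('query='));
  if (raw === undefined) return ''; // no args: backwards-compatible output

  const value = raw.slice('query='.length);
  if (value === '') return ''; // empty value falls through, no flag emitted

  // ASSUMED allow-list: anything outside it is rejected before it can reach
  // the generated bash, so `$(...)`, backticks, `;`, `&`, `"`, and `\` all throw.
  if (!/^[A-Za-z0-9 -]+$/.test(value)) {
    throw new Error(`query must be alphanumeric, space, or hyphen only: ${JSON.stringify(value)}`);
  }
  return ` --query "${value}"`;
}

console.log(`gstack-learnings-search --limit 10${buildQueryFlag(['query=foo bar'])}`);
```

Rejecting at generation time rather than escaping keeps the emitted bash trivially auditable: the value between the quotes can never contain a character the shell interprets.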
+60
@@ -0,0 +1,60 @@
import { describe, test, expect, beforeAll, afterAll } from 'bun:test';
import * as fs from 'fs';
import * as path from 'path';
import * as os from 'os';
import { execFileSync } from 'child_process';
const ROOT = path.resolve(import.meta.dir, '..');
const BIN = path.join(ROOT, 'bin', 'gstack-learnings-search');
const tmpHome = fs.mkdtempSync(path.join(os.tmpdir(), 'gstack-search-test-'));
const tmpCwd = fs.mkdtempSync(path.join(os.tmpdir(), 'gstack-search-cwd-'));
// gstack-slug derives slug from git remote (none here) → falls back to basename of cwd.
const slug = path.basename(tmpCwd).replace(/[^a-zA-Z0-9._-]/g, '');
const projDir = path.join(tmpHome, 'projects', slug);
function run(args: string[]): string {
return execFileSync(BIN, args, {
env: { ...process.env, GSTACK_HOME: tmpHome },
cwd: tmpCwd,
encoding: 'utf-8',
});
}
beforeAll(() => {
fs.mkdirSync(projDir, { recursive: true });
const entries = [
{ ts: '2026-05-01T00:00:00Z', skill: 'test', type: 'pattern', key: 'foo-pattern', insight: 'A foo-related insight', confidence: 8, source: 'observed', files: [] },
{ ts: '2026-05-02T00:00:00Z', skill: 'test', type: 'pitfall', key: 'bar-pitfall', insight: 'A bar-related insight', confidence: 8, source: 'observed', files: [] },
{ ts: '2026-05-03T00:00:00Z', skill: 'test', type: 'pattern', key: 'baz-pattern', insight: 'A baz-related insight', confidence: 8, source: 'observed', files: [] },
];
fs.writeFileSync(path.join(projDir, 'learnings.jsonl'), entries.map(e => JSON.stringify(e)).join('\n') + '\n');
});
afterAll(() => {
fs.rmSync(tmpHome, { recursive: true, force: true });
fs.rmSync(tmpCwd, { recursive: true, force: true });
});
describe('gstack-learnings-search token-OR query semantics', () => {
test('multi-token query returns entries matching ANY token', () => {
const out = run(['--query', 'foo bar']);
expect(out).toContain('foo-pattern');
expect(out).toContain('bar-pitfall');
expect(out).not.toContain('baz-pattern');
});
test('single-token query returns only entries matching that token', () => {
const out = run(['--query', 'foo']);
expect(out).toContain('foo-pattern');
expect(out).not.toContain('bar-pitfall');
expect(out).not.toContain('baz-pattern');
});
test('no --query flag returns all entries (backwards-compat)', () => {
const out = run(['--limit', '10']);
expect(out).toContain('foo-pattern');
expect(out).toContain('bar-pitfall');
expect(out).toContain('baz-pattern');
});
});
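The token-OR semantics exercised above can be illustrated with a minimal matcher. This is a sketch, not the `gstack-learnings-search` implementation: which fields are searched (`key` and `insight` here) and the case-insensitive comparison are assumptions, and `matchesAnyToken` is a hypothetical name.

```typescript
interface Learning {
  key: string;
  insight: string;
}

// Returns true when ANY whitespace-separated token of the query appears in
// the entry. An empty query matches everything, mirroring the no-flag case.
function matchesAnyToken(entry: Learning, query: string): boolean {
  const tokens = query.toLowerCase().split(/\s+/).filter(Boolean);
  if (tokens.length === 0) return true;
  const haystack = `${entry.key} ${entry.insight}`.toLowerCase();
  return tokens.some((token) => haystack.includes(token));
}

// Same three entries as the test fixture, reduced to the assumed search fields.
const entries: Learning[] = [
  { key: 'foo-pattern', insight: 'A foo-related insight' },
  { key: 'bar-pitfall', insight: 'A bar-related insight' },
  { key: 'baz-pattern', insight: 'A baz-related insight' },
];

const hits = entries.filter((entry) => matchesAnyToken(entry, 'foo bar')).map((entry) => entry.key);
console.log(hits);
```

OR semantics suit a broad seed query such as "release ship version changelog merge pr": an AND match across six tokens would almost never return anything.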
+328 -66
@@ -312,54 +312,101 @@ describe("gstack-memory-ingest --limit", () => {
});
});
// ── Writer regression: gbrain v0.27+ uses `put`, not `put_page` ───────────
// ── Writer regression: batch-import via `gbrain import <dir>` ─────────────
/**
* Stand up a fake `gbrain` shim on PATH that:
* - advertises `put` in `--help` output (so gbrainAvailable() passes)
* - records `put <slug>` invocations + their stdin to a log
* - rejects `put_page` with a non-zero exit, mimicking real gbrain v0.27+
* - advertises `import` in `--help` output (gbrainAvailable() passes)
* - records `import <dir>` invocations, args, and a sample of staged files
* - emits a valid `--json` summary on stdout (status, imported, etc.)
* - optionally drops failures to a sync-failures.jsonl path (HOME/.gbrain/)
*
* If the writer ever regresses to the legacy flag-form, the bulk pass will
* report 0 writes and the assertion on `Wrote: 1` will fail loudly.
* Architecture being verified (post plan-eng-review + Codex outside-voice):
* - new code uses `gbrain import <stagingDir> --no-embed --json` ONE time,
* not `gbrain put <slug>` per file. The fixture would catch a regression
* to the legacy per-file loop because (a) `put` is no longer advertised,
* so gbrainAvailable() returns false; (b) we assert the recorded args
* include `import` and the dir argument.
*/
function installFakeGbrain(home: string): { binDir: string; logFile: string; stdinFile: string } {
function installFakeGbrain(
home: string,
opts: { failingPaths?: string[] } = {},
): { binDir: string; logFile: string; argsFile: string; stagingListFile: string } {
const binDir = join(home, "fake-bin");
mkdirSync(binDir, { recursive: true });
const logFile = join(home, "gbrain-calls.log");
const stdinFile = join(home, "gbrain-stdin.log");
const argsFile = join(home, "gbrain-args.log");
const stagingListFile = join(home, "gbrain-staging-list.log");
// Bash-side: when failingPaths is set, append matching JSONL entries to
// ~/.gbrain/sync-failures.jsonl so D7's readNewFailures can read them.
const failingList = (opts.failingPaths || []).join("|");
const script = `#!/usr/bin/env bash
set -euo pipefail
LOG="${logFile}"
STDIN_LOG="${stdinFile}"
ARGS_LOG="${argsFile}"
STAGING_LIST="${stagingListFile}"
FAILING_LIST="${failingList}"
case "\${1:-}" in
--help|-h)
cat <<EOF
Usage: gbrain <command> [options]
Commands:
put <slug> Write a page (content via stdin, YAML frontmatter for metadata)
import <dir> Import markdown directory (batch, content-addressed)
search <query> Keyword search across pages
ask <question> Hybrid semantic + keyword query
EOF
exit 0
;;
put)
if [ "\${2:-}" = "--help" ]; then
echo "Usage: gbrain put <slug>"
exit 0
fi
echo "put \${2:-}" >> "\$LOG"
import)
DIR="\${2:-}"
NO_EMBED=0
JSON=0
shift 2 || true
for arg in "\$@"; do
case "\$arg" in
--no-embed) NO_EMBED=1 ;;
--json) JSON=1 ;;
esac
done
echo "import \$DIR" >> "\$LOG"
{
echo "--- slug=\${2:-} ---"
cat
echo
} >> "\$STDIN_LOG"
echo "dir=\$DIR no_embed=\$NO_EMBED json=\$JSON"
} >> "\$ARGS_LOG"
# Capture file tree from staging dir for assertion-on-shape later.
if [ -d "\$DIR" ]; then
( cd "\$DIR" && find . -type f | sort ) > "\$STAGING_LIST" 2>/dev/null || true
fi
# If failingPaths configured, drop fake entries to sync-failures.jsonl
# (mtime byte-offset snapshot lets the ingest's readNewFailures pick them up).
if [ -n "\$FAILING_LIST" ]; then
mkdir -p "\${HOME}/.gbrain"
IFS='|' read -ra FAIL_PATHS <<< "\$FAILING_LIST"
for p in "\${FAIL_PATHS[@]}"; do
echo "{\\"path\\":\\"\$p\\",\\"error\\":\\"File too large\\",\\"code\\":\\"FILE_TOO_LARGE\\",\\"commit\\":\\"\\",\\"ts\\":\\"2026-05-09T22:00:00Z\\"}" >> "\${HOME}/.gbrain/sync-failures.jsonl"
done
fi
# Count files in staging dir for the imported count.
if [ -d "\$DIR" ]; then
TOTAL=\$(find "\$DIR" -name "*.md" -type f | wc -l | tr -d ' ')
else
TOTAL=0
fi
ERRORS=0
if [ -n "\$FAILING_LIST" ]; then
ERRORS=\$(echo "\$FAILING_LIST" | tr '|' '\\n' | wc -l | tr -d ' ')
fi
IMPORTED=\$((TOTAL - ERRORS))
if [ \$JSON -eq 1 ]; then
echo "{\\"status\\":\\"success\\",\\"duration_s\\":0.1,\\"imported\\":\$IMPORTED,\\"skipped\\":0,\\"errors\\":\$ERRORS,\\"chunks\\":\$IMPORTED,\\"total_files\\":\$TOTAL}"
fi
exit 0
;;
put_page|put-page)
echo "Unknown command: \$1" >&2
exit 2
put|put_page|put-page)
# If new ingest code ever regresses to per-file puts, fail loudly so the
# test signals a real architectural regression.
echo "Unexpected legacy command: \$1" >&2
exit 99
;;
*)
echo "Unknown command: \${1:-<empty>}" >&2
@@ -370,18 +417,18 @@ esac
const binPath = join(binDir, "gbrain");
writeFileSync(binPath, script, "utf-8");
chmodSync(binPath, 0o755);
return { binDir, logFile, stdinFile };
return { binDir, logFile, argsFile, stagingListFile };
}
describe("gstack-memory-ingest writer (gbrain v0.27+ `put` interface)", () => {
it("invokes `gbrain put <slug>` with stdin body, not legacy `put_page`", () => {
describe("gstack-memory-ingest writer (gbrain v0.20+ batch `import` interface)", () => {
it("invokes `gbrain import <dir> --no-embed --json` exactly once with hierarchical staging", () => {
const home = makeTestHome();
const gstackHome = join(home, ".gstack");
mkdirSync(gstackHome, { recursive: true });
const { binDir, logFile, stdinFile } = installFakeGbrain(home);
const { binDir, logFile, argsFile, stagingListFile } = installFakeGbrain(home);
// Single Claude Code session fixture. --include-unattributed lets it write
// even though there's no resolvable git remote in /tmp.
// Single Claude Code session fixture. --include-unattributed lets it
// write even though there's no resolvable git remote in /tmp.
const session =
`{"type":"user","message":{"role":"user","content":"hi"},"timestamp":"2026-05-01T00:00:00Z","cwd":"/tmp/foo"}\n` +
`{"type":"assistant","message":{"role":"assistant","content":"hello"},"timestamp":"2026-05-01T00:00:01Z"}\n`;
@@ -396,35 +443,55 @@ describe("gstack-memory-ingest writer (gbrain v0.27+ `put` interface)", () => {
expect(r.exitCode).toBe(0);
expect(existsSync(logFile)).toBe(true);
const calls = readFileSync(logFile, "utf-8");
expect(calls).toContain("put ");
expect(calls).not.toContain("put_page");
// Verify gbrain was called exactly ONCE with import, not per-file put.
const calls = readFileSync(logFile, "utf-8").trim().split("\n").filter(Boolean);
expect(calls.length).toBe(1);
expect(calls[0]).toMatch(/^import\s+\/.+\/\.staging-ingest-\d+-\d+$/);
// Body should ride stdin and carry frontmatter that gbrain can parse.
// The transcript builder prepends its own frontmatter (agent, session_id,
// etc.) but does NOT include title/type/tags — the writer injects those
// into the existing frontmatter so gbrain pages list/search/filter
// actually surface the page. Asserting all three guards against the
// exact regression that landed in v1.26.0.0 (writer ignored these fields
// entirely; pages landed empty-titled, un-typed, un-tagged).
const stdin = readFileSync(stdinFile, "utf-8");
expect(stdin).toContain("---");
expect(stdin).toMatch(/agent:\s+claude-code/);
expect(stdin).toMatch(/title:\s/);
expect(stdin).toMatch(/type:\s+transcript/);
expect(stdin).toMatch(/tags:/);
// Verify args: --no-embed and --json both present.
const argDump = readFileSync(argsFile, "utf-8");
expect(argDump).toMatch(/no_embed=1/);
expect(argDump).toMatch(/json=1/);
rmSync(home, { recursive: true, force: true });
// D1 regression: staged file lives in a slug-shaped subdirectory tree
// ("transcripts/claude-code/_unattributed/..."), not flat at the staging
// dir root. If writeStaged ever regresses to flat layout, this fails.
const stagedList = readFileSync(stagingListFile, "utf-8");
expect(stagedList).toMatch(/^\.\/transcripts\/claude-code\/.+\.md$/m);
});
// Postgres rejects 0x00 in UTF-8 text columns. Some Claude Code transcripts
// contain NUL inside user-pasted content or tool output. The writer strips
// them at submit time so the brain doesn't return `invalid byte sequence`.
it("strips NUL bytes from the body before piping to `gbrain put`", () => {
// Originally landed in v1.32.0.0 (PR #1411) on the per-file `gbrain put`
// path. Postgres rejects 0x00 in UTF-8 text columns. Some Claude Code
// transcripts contain NUL inside user-pasted content or tool output. The
// renderPageBody helper strips them so the staged .md never carries them
// into gbrain. Adapted for the batch architecture: we read the staged file
// contents instead of fake-gbrain stdin.
it("strips NUL bytes from the staged body before gbrain import", () => {
const home = makeTestHome();
const gstackHome = join(home, ".gstack");
mkdirSync(gstackHome, { recursive: true });
const { binDir, stdinFile } = installFakeGbrain(home);
// Shim that copies staging dir into stagingCopy so we can inspect the
// exact bytes that would have been fed to gbrain.
const binDir = join(home, "fake-bin");
mkdirSync(binDir, { recursive: true });
const stagingCopy = join(home, "staging-copy");
const script = `#!/usr/bin/env bash
case "\${1:-}" in
--help|-h) echo "Usage: gbrain <command>"; echo "Commands:"; echo " import <dir> Import"; exit 0 ;;
import)
DIR="\${2:-}"
cp -R "\$DIR" "${stagingCopy}" 2>/dev/null || true
if [[ " \$* " == *" --json "* ]]; then
echo '{"status":"success","duration_s":0.1,"imported":1,"skipped":0,"errors":0,"chunks":1,"total_files":1}'
fi
exit 0 ;;
*) echo "unknown"; exit 2 ;;
esac
`;
const binPath = join(binDir, "gbrain");
writeFileSync(binPath, script, "utf-8");
chmodSync(binPath, 0o755);
// Pasted content with embedded NUL bytes in a few shapes:
// - inline mid-token: abc\x00def
@@ -445,31 +512,166 @@ describe("gstack-memory-ingest writer (gbrain v0.27+ `put` interface)", () => {
});
expect(r.exitCode).toBe(0);
const stdin = readFileSync(stdinFile, "utf-8");
// The body that hit gbrain MUST NOT contain any 0x00 byte. Even one would
// make Postgres reject the insert with `invalid byte sequence`.
expect(stdin.includes("\x00")).toBe(false);
expect(existsSync(stagingCopy)).toBe(true);
const findMd = spawnSync("find", [stagingCopy, "-name", "*.md", "-type", "f"], {
encoding: "utf-8",
});
const mdPaths = (findMd.stdout || "").trim().split("\n").filter(Boolean);
expect(mdPaths.length).toBeGreaterThan(0);
const body = readFileSync(mdPaths[0], "utf-8");
// The body that gbrain will read MUST NOT contain any 0x00 byte.
expect(body.includes("\x00")).toBe(false);
// But the surrounding content should survive intact — we strip NUL only.
expect(stdin).toContain("abcdef");
expect(stdin).toContain("helloworld");
expect(stdin).toContain("leadingline");
expect(stdin).toContain("line-trailing");
expect(stdin).toContain("clean line");
expect(body).toContain("abcdef");
expect(body).toContain("helloworld");
expect(body).toContain("leadingline");
expect(body).toContain("line-trailing");
expect(body).toContain("clean line");
rmSync(home, { recursive: true, force: true });
});
it("fails fast when gbrain CLI is missing the `put` subcommand", () => {
it("injects title/type/tags into the staged page's YAML frontmatter", () => {
const home = makeTestHome();
const gstackHome = join(home, ".gstack");
mkdirSync(gstackHome, { recursive: true });
// Fake gbrain that ONLY advertises legacy `put_page` (no `put`).
// This shim copies the staging dir into stagingCopy before it exits, so the
// staged files can be inspected after the ingest run completes.
const binDir = join(home, "fake-bin");
mkdirSync(binDir, { recursive: true });
const stagingCopy = join(home, "staging-copy");
const script = `#!/usr/bin/env bash
case "\${1:-}" in
--help|-h) echo "Usage: gbrain <command>"; echo "Commands:"; echo " import <dir> Import"; exit 0 ;;
import)
DIR="\${2:-}"
cp -R "\$DIR" "${stagingCopy}" 2>/dev/null || true
# Emit valid --json output
if [[ " \$* " == *" --json "* ]]; then
echo '{"status":"success","duration_s":0.1,"imported":1,"skipped":0,"errors":0,"chunks":1,"total_files":1}'
fi
exit 0 ;;
*) echo "unknown"; exit 2 ;;
esac
`;
const binPath = join(binDir, "gbrain");
writeFileSync(binPath, script, "utf-8");
chmodSync(binPath, 0o755);
const session =
`{"type":"user","message":{"role":"user","content":"hi"},"timestamp":"2026-05-01T00:00:00Z","cwd":"/tmp/foo"}\n` +
`{"type":"assistant","message":{"role":"assistant","content":"hello"},"timestamp":"2026-05-01T00:00:01Z"}\n`;
writeClaudeCodeSession(home, "tmp-foo", "abc123", session);
const r = runScript(["--bulk", "--include-unattributed", "--quiet"], {
HOME: home,
GSTACK_HOME: gstackHome,
PATH: `${binDir}:${process.env.PATH || ""}`,
});
expect(r.exitCode).toBe(0);
expect(existsSync(stagingCopy)).toBe(true);
// Find the staged .md file; assert frontmatter has title/type/tags.
// (The exact slug path varies with the staging dir generation, so we
// walk to find a .md and read its head.)
const findMd = spawnSync("find", [stagingCopy, "-name", "*.md", "-type", "f"], {
encoding: "utf-8",
});
const mdPaths = (findMd.stdout || "").trim().split("\n").filter(Boolean);
expect(mdPaths.length).toBeGreaterThan(0);
const body = readFileSync(mdPaths[0], "utf-8");
expect(body).toContain("---");
expect(body).toMatch(/title:\s/);
expect(body).toMatch(/type:\s+transcript/);
expect(body).toMatch(/tags:/);
rmSync(home, { recursive: true, force: true });
});
it("D7: files listed in ~/.gbrain/sync-failures.jsonl are NOT recorded in state", () => {
const home = makeTestHome();
const gstackHome = join(home, ".gstack");
mkdirSync(gstackHome, { recursive: true });
// Write TWO sessions so we can verify one lands and the other doesn't.
const sessionA =
`{"type":"user","message":{"role":"user","content":"a"},"timestamp":"2026-05-01T00:00:00Z","cwd":"/tmp/foo"}\n` +
`{"type":"assistant","message":{"role":"assistant","content":"a"},"timestamp":"2026-05-01T00:00:01Z"}\n`;
const sessionB =
`{"type":"user","message":{"role":"user","content":"b"},"timestamp":"2026-05-02T00:00:00Z","cwd":"/tmp/bar"}\n` +
`{"type":"assistant","message":{"role":"assistant","content":"b"},"timestamp":"2026-05-02T00:00:01Z"}\n`;
writeClaudeCodeSession(home, "tmp-foo", "aaaa", sessionA);
writeClaudeCodeSession(home, "tmp-bar", "bbbb", sessionB);
// Mark one of the two staged sessions as failed. The staging-relative path
// (e.g. "transcripts/claude-code/...bbbb.md") is not known ahead of time,
// so a literal failing path cannot be configured up front. Instead, the
// passthrough fake gbrain below finds the staged files itself and appends a
// failure entry for one of them to ~/.gbrain/sync-failures.jsonl, using the
// staging-relative path that gbrain's failure records use. This keeps the
// test deterministic without relying on timing.
const binDir = join(home, "fake-bin");
mkdirSync(binDir, { recursive: true });
const script = `#!/usr/bin/env bash
case "\${1:-}" in
--help|-h) echo "Usage: gbrain"; echo "Commands:"; echo " import <dir> Import"; exit 0 ;;
import)
DIR="\${2:-}"
# Pick the SECOND .md found in the staging dir and mark it failed in
# ~/.gbrain/sync-failures.jsonl using the dir-relative path. The first
# one lands cleanly.
mkdir -p "\${HOME}/.gbrain"
REL=\$(cd "\$DIR" && find . -name "*.md" -type f | sed 's|^\\./||' | sort | tail -1)
if [ -n "\$REL" ]; then
echo "{\\"path\\":\\"\$REL\\",\\"error\\":\\"File too large\\",\\"code\\":\\"FILE_TOO_LARGE\\",\\"commit\\":\\"\\",\\"ts\\":\\"2026-05-09T22:00:00Z\\"}" >> "\${HOME}/.gbrain/sync-failures.jsonl"
fi
if [[ " \$* " == *" --json "* ]]; then
echo '{"status":"success","duration_s":0.1,"imported":1,"skipped":0,"errors":1,"chunks":1,"total_files":2}'
fi
exit 0 ;;
*) echo "unknown"; exit 2 ;;
esac
`;
const binPath = join(binDir, "gbrain");
writeFileSync(binPath, script, "utf-8");
chmodSync(binPath, 0o755);
const r = runScript(["--bulk", "--include-unattributed", "--quiet"], {
HOME: home,
GSTACK_HOME: gstackHome,
PATH: `${binDir}:${process.env.PATH || ""}`,
});
expect(r.exitCode).toBe(0);
// State file should have exactly 1 session entry (the non-failed one).
const statePath = join(gstackHome, ".transcript-ingest-state.json");
expect(existsSync(statePath)).toBe(true);
const state = JSON.parse(readFileSync(statePath, "utf-8"));
const sessionPaths = Object.keys(state.sessions || {});
expect(sessionPaths.length).toBe(1);
rmSync(home, { recursive: true, force: true });
});
it("emits ERR with system_error and exits non-zero when gbrain CLI is missing the `import` subcommand", () => {
const home = makeTestHome();
const gstackHome = join(home, ".gstack");
mkdirSync(gstackHome, { recursive: true });
// Fake gbrain that advertises ONLY `put` (legacy) — no `import`.
const binDir = join(home, "legacy-bin");
mkdirSync(binDir, { recursive: true });
const script = `#!/usr/bin/env bash
case "\${1:-}" in
--help|-h) echo "Commands:"; echo " put_page Write a page (legacy)"; exit 0 ;;
--help|-h) echo "Commands:"; echo " put <slug> Write a page (legacy)"; exit 0 ;;
*) echo "Unknown command: \$1" >&2; exit 2 ;;
esac
`;
@@ -487,9 +689,69 @@ esac
PATH: `${binDir}:${process.env.PATH || ""}`,
});
// Bulk completes (the script is per-page tolerant), but every page
// surfaces the missing-`put` error rather than the old "Unknown command".
expect(r.stderr + r.stdout).toMatch(/missing `put` subcommand|gbrain CLI not in PATH/);
// D6: system_error sets non-zero exit; orchestrator marks ERR.
expect(r.exitCode).toBe(1);
expect(r.stderr).toMatch(/\[memory-ingest\] ERR:.*missing `import` subcommand|gbrain CLI not in PATH/);
rmSync(home, { recursive: true, force: true });
});
it("--scan-secrets opt-in: skips files with gitleaks findings, lets clean files through", () => {
const home = makeTestHome();
const gstackHome = join(home, ".gstack");
mkdirSync(gstackHome, { recursive: true });
const { binDir } = installFakeGbrain(home);
// Fake gitleaks: prints a "finding" for any file whose path contains
// "dirty", clean for everything else. The fake-gbrain shim doesn't
// interfere — gitleaks is invoked from preparePages before staging.
const fakeGitleaksDir = join(home, "fake-gitleaks-bin");
mkdirSync(fakeGitleaksDir, { recursive: true });
const fakeGitleaks = `#!/usr/bin/env bash
# gitleaks detect --no-git --source <path> --report-format json --report-path /dev/stdout --exit-code 0
# We just need to emit a JSON findings array on stdout. Find the --source arg.
SRC=""
while [ "$#" -gt 0 ]; do
case "$1" in
--source) SRC="$2"; shift 2 ;;
*) shift ;;
esac
done
if echo "$SRC" | grep -q dirty; then
echo '[{"RuleID":"fake-rule","Description":"fake finding","StartLine":1,"Match":"REDACTED","Secret":"AKIAFAKEFAKEFAKE12345"}]'
else
echo '[]'
fi
exit 0
`;
const gitleaksBin = join(fakeGitleaksDir, "gitleaks");
writeFileSync(gitleaksBin, fakeGitleaks, "utf-8");
chmodSync(gitleaksBin, 0o755);
// Two sessions: one "clean" (filename has no "dirty"), one "dirty"
// (filename contains "dirty" so the fake gitleaks reports a finding).
const sessionA =
`{"type":"user","message":{"role":"user","content":"clean"},"timestamp":"2026-05-01T00:00:00Z","cwd":"/tmp/foo"}\n`;
const sessionB =
`{"type":"user","message":{"role":"user","content":"dirty"},"timestamp":"2026-05-02T00:00:00Z","cwd":"/tmp/bar"}\n`;
writeClaudeCodeSession(home, "tmp-foo", "cleansess123", sessionA);
// Force the path to contain the "dirty" marker.
writeClaudeCodeSession(home, "tmp-dirty-bar", "dirtysess456", sessionB);
// Run with --scan-secrets enabled. Combine the fake gitleaks bin
// before fake-gbrain in PATH so both shims resolve.
const r = runScript(["--bulk", "--include-unattributed", "--scan-secrets"], {
HOME: home,
GSTACK_HOME: gstackHome,
PATH: `${fakeGitleaksDir}:${binDir}:${process.env.PATH || ""}`,
});
expect(r.exitCode).toBe(0);
// Bulk report shows skipped (secret-scan) >= 1
expect(r.stdout).toMatch(/skipped \(secret-scan\):\s+1/);
// Stderr from the secret-scan match path (printed when !quiet) includes the dirty path's basename.
// Match generously: any occurrence of "secret-scan match" line.
expect(r.stderr + r.stdout).toMatch(/secret-scan match/);
rmSync(home, { recursive: true, force: true });
});
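The fixture comments mention a byte-offset snapshot that lets the ingest pick up only the failures written during the current import. A minimal sketch of that idea follows, assuming the ingest records the size of `sync-failures.jsonl` before calling gbrain and parses only the bytes appended afterwards. `snapshotOffset` and `readFailuresSince` are hypothetical names, and the real `readNewFailures` may work differently.

```typescript
import * as fs from 'fs';

interface SyncFailure {
  path: string;  // staging-relative path of the file that failed
  error: string;
  code: string;
}

// Record the current size of the failures log (0 if it does not exist yet).
function snapshotOffset(file: string): number {
  return fs.existsSync(file) ? fs.statSync(file).size : 0;
}

// Parse only the JSONL entries appended after the snapshot was taken.
function readFailuresSince(file: string, offset: number): SyncFailure[] {
  if (!fs.existsSync(file)) return [];
  const appended = fs.readFileSync(file).subarray(offset).toString('utf-8');
  return appended
    .split('\n')
    .filter(Boolean)
    .map((line) => JSON.parse(line) as SyncFailure);
}
```

Under this reading, any staged path returned by the second call is left out of the ingest state, so that session is retried on the next run, which is the outcome the D7 test asserts.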
+200
@@ -0,0 +1,200 @@
import { describe, test, expect } from 'bun:test';
import { spawnSync } from 'child_process';
import * as path from 'path';
import * as fs from 'fs';
import * as os from 'os';
const ROOT = path.resolve(import.meta.dir, '..');
const SETUP_SCRIPT = path.join(ROOT, 'setup');
describe('setup: Conductor worktree guard', () => {
test('setup contains the real-dir guard before the ln -snf into ~/.claude/skills/', () => {
const content = fs.readFileSync(SETUP_SCRIPT, 'utf-8');
const guardIdx = content.indexOf('_SKIP_CLAUDE_REGISTER=0');
const lnIdx = content.indexOf('ln -snf "$SOURCE_GSTACK_DIR" "$CLAUDE_GSTACK_LINK"');
expect(guardIdx).toBeGreaterThan(-1);
expect(lnIdx).toBeGreaterThan(-1);
expect(guardIdx).toBeLessThan(lnIdx);
});
test('guard resolves the existing real dir with `pwd -P` and compares against source', () => {
const content = fs.readFileSync(SETUP_SCRIPT, 'utf-8');
expect(content).toContain('[ -d "$CLAUDE_GSTACK_LINK" ] && [ ! -L "$CLAUDE_GSTACK_LINK" ]');
expect(content).toContain('cd "$CLAUDE_GSTACK_LINK" 2>/dev/null && pwd -P');
expect(content).toContain('"$_EXISTING_REAL" != "$SOURCE_GSTACK_DIR"');
});
test('skip branch prints "registration skipped" + remediation hint', () => {
const content = fs.readFileSync(SETUP_SCRIPT, 'utf-8');
expect(content).toContain('Skipping Claude skill registration');
expect(content).toContain('claude registration skipped');
expect(content).toContain('rm -rf $CLAUDE_GSTACK_LINK');
});
// Reproduce the BSD/macOS `ln -snf` behavior that caused the bug, then
// confirm the guard avoids it. This is a behavioral test of the guard logic
// running in an isolated tmpdir — not the full setup script.
test('BSD ln -snf into an existing real dir creates a child symlink (bug reproduces)', () => {
const tmp = fs.mkdtempSync(path.join(os.tmpdir(), 'gstack-setup-guard-'));
try {
const source = path.join(tmp, 'source-worktree');
const dest = path.join(tmp, 'dest-real-dir');
fs.mkdirSync(source);
fs.mkdirSync(dest);
// The buggy invocation: target dest is an existing real dir.
const result = spawnSync('ln', ['-snf', source, dest], { encoding: 'utf-8' });
expect(result.status).toBe(0);
// Child symlink leaked inside dest.
const leaked = path.join(dest, path.basename(source));
expect(fs.existsSync(leaked)).toBe(true);
expect(fs.lstatSync(leaked).isSymbolicLink()).toBe(true);
expect(fs.readlinkSync(leaked)).toBe(source);
// dest itself stayed a real directory (not replaced).
expect(fs.lstatSync(dest).isSymbolicLink()).toBe(false);
expect(fs.lstatSync(dest).isDirectory()).toBe(true);
} finally {
fs.rmSync(tmp, { recursive: true, force: true });
}
});
test('guard logic refuses to ln when dest is a real dir pointing elsewhere', () => {
const tmp = fs.mkdtempSync(path.join(os.tmpdir(), 'gstack-setup-guard-'));
try {
const source = path.join(tmp, 'source-worktree');
const dest = path.join(tmp, 'dest-real-dir');
fs.mkdirSync(source);
fs.mkdirSync(dest);
// Inline the guard logic from setup. If it triggers, "SKIP" is echoed and
// no ln is performed; otherwise ln runs, "LINKED" is echoed, and we'd see the leak.
const script = `
set -e
SOURCE_GSTACK_DIR='${source}'
CLAUDE_GSTACK_LINK='${dest}'
_SKIP_CLAUDE_REGISTER=0
if [ -d "$CLAUDE_GSTACK_LINK" ] && [ ! -L "$CLAUDE_GSTACK_LINK" ]; then
_EXISTING_REAL=$(cd "$CLAUDE_GSTACK_LINK" 2>/dev/null && pwd -P || echo "")
if [ -n "$_EXISTING_REAL" ] && [ "$_EXISTING_REAL" != "$SOURCE_GSTACK_DIR" ]; then
_SKIP_CLAUDE_REGISTER=1
fi
fi
if [ "$_SKIP_CLAUDE_REGISTER" -eq 1 ]; then
echo "SKIP"
else
ln -snf "$SOURCE_GSTACK_DIR" "$CLAUDE_GSTACK_LINK"
echo "LINKED"
fi
`;
const result = spawnSync('bash', ['-c', script], { encoding: 'utf-8' });
expect(result.status).toBe(0);
expect(result.stdout.trim()).toBe('SKIP');
// No child symlink leaked.
const leaked = path.join(dest, path.basename(source));
expect(fs.existsSync(leaked)).toBe(false);
} finally {
fs.rmSync(tmp, { recursive: true, force: true });
}
});
test('guard allows ln when dest does not exist (fresh install path)', () => {
const tmp = fs.mkdtempSync(path.join(os.tmpdir(), 'gstack-setup-guard-'));
try {
const source = path.join(tmp, 'source-worktree');
const dest = path.join(tmp, 'fresh-dest');
fs.mkdirSync(source);
const script = `
set -e
SOURCE_GSTACK_DIR='${source}'
CLAUDE_GSTACK_LINK='${dest}'
_SKIP_CLAUDE_REGISTER=0
if [ -d "$CLAUDE_GSTACK_LINK" ] && [ ! -L "$CLAUDE_GSTACK_LINK" ]; then
_EXISTING_REAL=$(cd "$CLAUDE_GSTACK_LINK" 2>/dev/null && pwd -P || echo "")
if [ -n "$_EXISTING_REAL" ] && [ "$_EXISTING_REAL" != "$SOURCE_GSTACK_DIR" ]; then
_SKIP_CLAUDE_REGISTER=1
fi
fi
if [ "$_SKIP_CLAUDE_REGISTER" -eq 1 ]; then
echo "SKIP"
else
ln -snf "$SOURCE_GSTACK_DIR" "$CLAUDE_GSTACK_LINK"
echo "LINKED"
fi
`;
const result = spawnSync('bash', ['-c', script], { encoding: 'utf-8' });
expect(result.status).toBe(0);
expect(result.stdout.trim()).toBe('LINKED');
expect(fs.lstatSync(dest).isSymbolicLink()).toBe(true);
expect(fs.readlinkSync(dest)).toBe(source);
} finally {
fs.rmSync(tmp, { recursive: true, force: true });
}
});
test('guard allows ln when dest is an existing symlink (upgrade-in-place path)', () => {
const tmp = fs.mkdtempSync(path.join(os.tmpdir(), 'gstack-setup-guard-'));
try {
const source = path.join(tmp, 'new-source');
const oldSource = path.join(tmp, 'old-source');
const dest = path.join(tmp, 'dest-symlink');
fs.mkdirSync(source);
fs.mkdirSync(oldSource);
fs.symlinkSync(oldSource, dest);
// Existing symlink: -L is true, so the guard does NOT trigger. ln -snf
// should retarget the symlink to the new source.
const script = `
set -e
SOURCE_GSTACK_DIR='${source}'
CLAUDE_GSTACK_LINK='${dest}'
_SKIP_CLAUDE_REGISTER=0
if [ -d "$CLAUDE_GSTACK_LINK" ] && [ ! -L "$CLAUDE_GSTACK_LINK" ]; then
_EXISTING_REAL=$(cd "$CLAUDE_GSTACK_LINK" 2>/dev/null && pwd -P || echo "")
if [ -n "$_EXISTING_REAL" ] && [ "$_EXISTING_REAL" != "$SOURCE_GSTACK_DIR" ]; then
_SKIP_CLAUDE_REGISTER=1
fi
fi
if [ "$_SKIP_CLAUDE_REGISTER" -eq 1 ]; then
echo "SKIP"
else
ln -snf "$SOURCE_GSTACK_DIR" "$CLAUDE_GSTACK_LINK"
echo "LINKED"
fi
`;
const result = spawnSync('bash', ['-c', script], { encoding: 'utf-8' });
expect(result.status).toBe(0);
expect(result.stdout.trim()).toBe('LINKED');
expect(fs.readlinkSync(dest)).toBe(source);
} finally {
fs.rmSync(tmp, { recursive: true, force: true });
}
});
test('guard allows ln when dest is a real dir already pointing to source (self-rerun)', () => {
const tmp = fs.mkdtempSync(path.join(os.tmpdir(), 'gstack-setup-guard-'));
try {
const source = path.join(tmp, 'source-worktree');
fs.mkdirSync(source);
// Mirror setup's SOURCE_GSTACK_DIR resolution (`pwd -P`) so the comparison
// is fair on macOS, where temp paths sit behind symlinks
// (/tmp → /private/tmp, /var → /private/var).
const resolvedSource = fs.realpathSync(source);
// Degenerate case: the existing real dir IS the source. Only the skip flag
// is checked here; the script below does not run ln.
const dest = source;
const script = `
set -e
SOURCE_GSTACK_DIR='${resolvedSource}'
CLAUDE_GSTACK_LINK='${dest}'
_SKIP_CLAUDE_REGISTER=0
if [ -d "$CLAUDE_GSTACK_LINK" ] && [ ! -L "$CLAUDE_GSTACK_LINK" ]; then
_EXISTING_REAL=$(cd "$CLAUDE_GSTACK_LINK" 2>/dev/null && pwd -P || echo "")
if [ -n "$_EXISTING_REAL" ] && [ "$_EXISTING_REAL" != "$SOURCE_GSTACK_DIR" ]; then
_SKIP_CLAUDE_REGISTER=1
fi
fi
echo "skip=$_SKIP_CLAUDE_REGISTER"
`;
const result = spawnSync('bash', ['-c', script], { encoding: 'utf-8' });
expect(result.status).toBe(0);
expect(result.stdout.trim()).toBe('skip=0');
} finally {
fs.rmSync(tmp, { recursive: true, force: true });
}
});
});
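The guard decision that the inline bash scripts above exercise can also be sketched directly in TypeScript. This is a rough port for illustration only, not how `setup` implements it: `shouldSkipRegistration` is a hypothetical helper name, and it assumes `sourceDir` exists so that `fs.realpathSync` can resolve it (standing in for the `pwd -P` comparison).

```typescript
import * as fs from 'fs';
import * as os from 'os';
import * as path from 'path';

// Hypothetical helper (not part of setup or the test file above). Returns true
// when `linkPath` is an existing real directory (not a symlink) whose resolved
// path differs from `sourceDir`: the case where `ln -snf` would nest a child
// symlink inside the directory instead of replacing it.
function shouldSkipRegistration(linkPath: string, sourceDir: string): boolean {
  let stat: fs.Stats;
  try {
    stat = fs.lstatSync(linkPath); // lstat does not follow symlinks
  } catch {
    return false; // destination missing: fresh install, safe to link
  }
  if (stat.isSymbolicLink() || !stat.isDirectory()) {
    return false; // an existing symlink is simply retargeted by ln -snf
  }
  // Resolve both sides, mirroring the `pwd -P` comparison in the shell guard.
  return fs.realpathSync(linkPath) !== fs.realpathSync(sourceDir);
}

// Usage sketch in a throwaway temp directory.
const tmp = fs.mkdtempSync(path.join(os.tmpdir(), 'guard-sketch-'));
const source = path.join(tmp, 'source-worktree');
const realDest = path.join(tmp, 'dest-real-dir');
const linkDest = path.join(tmp, 'dest-symlink');
fs.mkdirSync(source);
fs.mkdirSync(realDest);
fs.symlinkSync(source, linkDest);

console.log(shouldSkipRegistration(realDest, source)); // true: real dir elsewhere
console.log(shouldSkipRegistration(linkDest, source)); // false: existing symlink
console.log(shouldSkipRegistration(path.join(tmp, 'missing'), source)); // false: fresh install
console.log(shouldSkipRegistration(source, source)); // false: self-rerun

fs.rmSync(tmp, { recursive: true, force: true });
```

The four calls line up with the four guard scenarios tested above: refuse, retarget an existing symlink, fresh install, and self-rerun.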