mirror of https://github.com/garrytan/gstack.git synced 2026-05-08 06:26:45 +02:00

Files

T

Garry Tan f44de365c5 v1.27.0.0 feat: /setup-gbrain Path 4 (remote MCP) + brain → artifacts rename (#1351 )

* feat: gstack-gbrain-mcp-verify helper for remote MCP probe

Probes a remote gbrain MCP endpoint with bearer auth. POSTs initialize,
classifies failures into NETWORK / AUTH / MALFORMED with one-line
remediation hints, and runs a tools/list capability probe to detect
sources_add MCP support (forward-compat for when gbrain ships URL ingest).

Token consumed from GBRAIN_MCP_TOKEN env, never argv. Required to set
both 'application/json' AND 'text/event-stream' in Accept; that gotcha
costs 10 minutes of debugging when missed (regression-tested).

Live-verified against wintermute (gbrain v0.27.1).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat: gstack-artifacts-init + gstack-artifacts-url helpers

artifacts-init replaces brain-init with provider choice (gh / glab /
manual), per-user gstack-artifacts-$USER repo, HTTPS-canonical storage in
~/.gstack-artifacts-remote.txt, and a "send this to your brain admin"
hookup printout. Always prints the command, never auto-executes — gbrain
v0.26.x has no admin-scope MCP probe (codex Finding #3).

artifacts-url centralizes HTTPS↔SSH/host/owner-repo conversion so callers
don't each string-mangle (codex Finding #10). The remote-conflict check in
artifacts-init compares at the canonical level so re-running with HTTPS
input doesn't trip on a stored SSH URL for the same logical repo.

The "URL form not supported" branch prints a two-line clone-then-path
form for gbrain v0.26.x; the supported branch is a one-liner with --url
ready for when gbrain ships URL ingest.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat: extend gstack-gbrain-detect with mcp_mode + artifacts_remote

Adds two new fields to detect's JSON output:

- gbrain_mcp_mode: local-stdio | remote-http | none
  Resolved via 3-tier fallback (codex Finding D3): claude mcp get --json
  → claude mcp list text-grep → ~/.claude.json jq read. If Anthropic moves
  the file format, the first two tiers absorb it.

- gstack_artifacts_remote: HTTPS URL from ~/.gstack-artifacts-remote.txt
  Falls back to ~/.gstack-brain-remote.txt during the v1.27.0.0 migration
  window so detect doesn't return empty between upgrade and migration.

Existing detect tests still pass (15/15). New 19 tests cover every fallback
tier independently, plus a schema regression for /sync-gbrain compat.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat: setup-gbrain Path 4 (remote MCP) + artifacts rename

Path 4 lets users paste an HTTPS MCP URL + bearer token and registers it
as an HTTP-transport MCP without needing a local gbrain CLI install. The
flow:

- Step 2 gains a fourth option (Remote gbrain MCP)
- Step 4 adds Path 4 sub-flow: collect URL, secret-read bearer, verify
  via gstack-gbrain-mcp-verify (NETWORK / AUTH / MALFORMED classifier)
- Step 5 (local doctor), Step 7.5 (transcript ingest), Step 5a's stdio
  branch all skip on Path 4
- Step 5a adds an HTTP+bearer registration form: claude mcp add
  --transport http --header "Authorization: Bearer ..."
- Step 7 renamed "session memory sync" → "artifacts sync" and now calls
  gstack-artifacts-init (which always prints the brain-admin hookup
  command — no auto-execute, codex Finding #3)
- Step 8 CLAUDE.md block branches: remote-http includes URL + server
  version (never the token); local-stdio keeps engine + config-file
- Step 9 smoke test on Path 4 prints the curl-equivalent for
  post-restart verification (MCP tools aren't visible mid-session)
- Step 10 verdict block has separate templates per mode

Idempotency: re-running with gbrain_mcp_mode=remote-http already in
detect output skips Step 2 entirely and goes to verification.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor: rename gbrain_sync_mode → artifacts_sync_mode (v1.27.0.0 prep)

Hard rename, no dual-read alias (codex Finding D4). The on-disk migration
script (Phase C, separate commit) renames the config key in users'
~/.gstack/config.yaml and any CLAUDE.md blocks.

Touched call sites:
- bin/gstack-config defaults + validation + list/defaults output
- bin/gstack-gbrain-detect (gstack_brain_sync_mode field still emitted
  with the same name for downstream-tool compat; reads new key)
- bin/gstack-brain-sync, bin/gstack-brain-enqueue, bin/gstack-brain-uninstall
- bin/gstack-timeline-log (comment ref)
- scripts/resolvers/preamble/generate-brain-sync-block.ts: renames key,
  branches on gbrain_mcp_mode=remote-http to emit "ARTIFACTS_SYNC:
  remote-mode (managed by brain server <host>)" instead of the local
  mode/queue/last_push line (codex Finding #11)
- bin/gstack-brain-restore + bin/gstack-gbrain-source-wireup: read
  ~/.gstack-artifacts-remote.txt with ~/.gstack-brain-remote.txt fallback
  during the migration window
- bin/gstack-artifacts-init: tolerant of unrecognized URL forms (local
  paths, file://, self-hosted gitea) so test infrastructure and unusual
  remotes work without canonicalization
- test/brain-sync.test.ts: gstack-brain-init → gstack-artifacts-init
- test/skill-e2e-brain-privacy-gate.test.ts: artifacts_sync_mode keys
- test/gen-skill-docs.test.ts: budget 35K → 36.5K for the new MCP-mode
  probe in the preamble resolver
- health/SKILL.md.tmpl, sync-gbrain/SKILL.md.tmpl: comment + verdict line

Hard delete:
- bin/gstack-brain-init (replaced by bin/gstack-artifacts-init in v1.27.0.0)
- test/gstack-brain-init-gh-mock.test.ts (replaced by gstack-artifacts-init.test.ts)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: regenerate SKILL.md files after artifacts-sync rename

Mechanical regen via \`bun run gen:skill-docs --host all\`. All */SKILL.md
files reflect the renamed config key (gbrain_sync_mode →
artifacts_sync_mode), the renamed remote-helper file
(~/.gstack-artifacts-remote.txt with brain fallback), the renamed init
script (gstack-artifacts-init), and the new ARTIFACTS_SYNC: remote-mode
status line that fires when a remote-http MCP is registered.

Golden fixtures (test/fixtures/golden/*-ship-SKILL.md) refreshed to match
the regenerated default-ship output.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat: v1.27.0.0 migration — gstack-brain → gstack-artifacts rename

Journaled, interruption-safe migration. Six steps, each writes to
~/.gstack/.migrations/v1.27.0.0.journal on success; re-entry resumes
from the next un-done step. On final success, journal is replaced by
~/.gstack/.migrations/v1.27.0.0.done.

Steps:
1. gh_repo_renamed       gh/glab repo rename gstack-brain-$USER →
                         gstack-artifacts-$USER (idempotent: detects
                         already-renamed and skips)
2. remote_txt_renamed    mv ~/.gstack-brain-remote.txt → artifacts file,
                         rewriting URL path to match the new repo name
3. config_key_renamed    sed -i in ~/.gstack/config.yaml flips
                         gbrain_sync_mode → artifacts_sync_mode
4. claude_md_block       sed flips "- Memory sync:" → "- Artifacts sync:"
                         in cwd CLAUDE.md and ~/.gstack/CLAUDE.md
5. sources_swapped       gbrain sources add NEW (verify) → remove OLD
                         (codex Finding #6: add-before-remove ordering,
                         no downtime window). On remote-MCP mode, prints
                         commands for the brain admin instead of executing.
6. done                  touchfile + delete journal

User opt-out: any "n" or "skip-for-now" answer at the initial prompt
writes a marker file that prevents re-prompting; user can re-invoke
via /setup-gbrain --rerun-migration.

11 unit tests cover: nothing-to-migrate, GitHub happy path, idempotent
re-run, journal-resume mid-flight, remote-MCP print-only path,
add-before-remove ordering verification, add-fail → old source stays
registered, CLAUDE.md field rewrite.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test: regression suite + E2E for v1.27.0.0 rename

Three new regression tests guard the rename's blast radius (per codex
Findings #1, #8, #9, #12):

- test/no-stale-gstack-brain-refs.test.ts: greps bin/, scripts/, *.tmpl,
  test/ for forbidden identifiers (gstack-brain-init, gbrain_sync_mode);
  fails CI if any non-allowlisted file references them.
- test/post-rename-doc-regen.test.ts: confirms gen-skill-docs output has
  no stale references in any */SKILL.md (the cross-product blind spot).
- test/setup-gbrain-path4-structure.test.ts: structural lint over the
  Path 4 prose contract — STOP gates after verify failure, never-write-
  token rules, mode-aware CLAUDE.md block, bearer always via env-var.

Two new gate-tier E2E tests (deterministic stub HTTP server, fixed inputs):

- test/skill-e2e-setup-gbrain-remote.test.ts: Path 4 happy path. Stubs
  an HTTP MCP server, drives the skill via Agent SDK with a stubbed
  bearer, asserts claude.json gets the http MCP entry, CLAUDE.md gets
  the remote-http block, the secret token NEVER leaks to CLAUDE.md.
- test/skill-e2e-setup-gbrain-bad-token.test.ts: stub server returns 401;
  asserts the AUTH classifier hint surfaces, no MCP registration occurs,
  CLAUDE.md is unchanged. Regression guard for the "verify failed → STOP"
  rule.

touchfiles.ts: setup-gbrain-remote and setup-gbrain-bad-token added at
gate-tier so CI catches Path 4 regressions on every PR.

Plus a few comment refs flipped: bin/gstack-jsonl-merge, bin/gstack-timeline-log
(legacy gstack-brain-init mentions in headers).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* release: v1.27.0.0 — /setup-gbrain Path 4 + brain → artifacts rename

Bumps VERSION 1.26.4.0 → 1.27.0.0 (MINOR per CLAUDE.md scale-aware bump
guidance: ~1500 line net change including a new path in /setup-gbrain,
two new bin helpers, a journaled migration, 59 new tests, and a config
key rename across the codebase).

CHANGELOG entry covers: Path 4 (Remote MCP) end-to-end, the brain →
artifacts rename, the journaled migration, the verify-helper error
classifier, the artifacts-init multi-host provider choice. Includes
the canonical Garry-voice headline + numbers table + audience close
per the release-summary format.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test: demote setup-gbrain Path 4 E2E to periodic-tier

The Agent SDK E2E tests for Path 4 (skill-e2e-setup-gbrain-remote and
skill-e2e-setup-gbrain-bad-token) are inherently non-deterministic —
the model interprets "follow Path 4 only" prompts flexibly and can
skip Step 8 (CLAUDE.md write) or shortcut past the verify helper, which
makes the gate-tier assertions flaky.

The deterministic gate coverage for Path 4 is in
test/setup-gbrain-path4-structure.test.ts: a fast structural lint that
catches AUQ-pacing regressions and prose contract drift in <200ms with
zero token spend. That test is the right tool for catching the failure
mode the gate-tier was meant to guard against.

The Agent SDK E2E tests stay available on-demand for periodic-tier runs
(EVALS=1 EVALS_TIER=periodic bun test test/skill-e2e-setup-gbrain-*.test.ts).
Also tightened the verify-error assertion to the literal field shape
("error_class": "AUTH") instead of a substring match that false-matches
the parent claude session's "needs-auth" MCP discovery markers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: sync package.json version to 1.27.0.0

VERSION was bumped to 1.27.0.0 in f6ec11eb but package.json was not
updated in the same commit. The gen-skill-docs.test.ts assertion
"package.json version matches VERSION file" caught the drift.

This is the DRIFT_STALE_PKG case the /ship Step 12 idempotency check
is designed for; the fix is the documented sync-only repair (no
re-bump, package.json synced to existing VERSION).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-06 19:37:53 -07:00

11 KiB

Raw Blame History

gstack memory ingest — what it does, what stays local, what you can do with it

This is the user-facing reference for the V1 transcript + memory ingest feature in /setup-gbrain. If you ran /setup-gbrain and it asked "Ingest THIS repo's transcripts into gbrain?", this doc explains what happens after you say yes.

What gets ingested

Source	Type	Where	Sensitivity
Claude Code session JSONL	`transcript`	`~/.claude/projects/*/`	High — full conversations including tool I/O
Codex CLI session JSONL	`transcript`	`~/.codex/sessions/YYYY/MM/DD/`	High
Cursor session SQLite (V1.0.1)	`transcript`	`~/Library/Application Support/Cursor/`	Same — deferred V1.0.1
Eureka log	`eureka`	`~/.gstack/analytics/eureka.jsonl`	Medium — your insights, often non-secret
Project learnings	`learning`	`~/.gstack/projects/<slug>/learnings.jsonl`	Medium
Project timeline	`timeline`	`~/.gstack/projects/<slug>/timeline.jsonl`	Low
CEO plans	`ceo-plan`	`~/.gstack/projects/<slug>/ceo-plans/*.md`	Medium
Design docs	`design-doc`	`~/.gstack/projects/<slug>/-design-.md`	Medium
Retros	`retro`	`~/.gstack/projects/<slug>/retros/*.md`	Medium
Builder profile	`builder-profile-entry`	`~/.gstack/builder-profile.jsonl`	Low

What stays local

State files (~/.gstack/.gbrain-sync-state.json, ~/.gstack/.transcript-ingest-state.json, ~/.gstack/.gbrain-engine-cache.json, ~/.gstack/.gbrain-errors.jsonl) are local-only per ED1 (state file sync semantics decision). They are not synced via the brain remote.
Sessions with no resolvable git remote (running in /tmp/, scratch dirs, etc.) are skipped by default. Pass --include-unattributed to the ingest helper to opt them in.
Repos under a deny trust policy (set in /setup-gbrain Step 6) are skipped — neither code nor transcripts from those repos ingest.

What gets scanned for secrets

Every ingested page passes through gitleaks before write (per D19 — replaces the regex scanner that previously ran only on staged git diffs). Gitleaks is industry-standard, covers:

AWS / GCP / Azure access keys
ANTHROPIC_API_KEY, OPENAI_API_KEY, GitHub tokens
Stripe keys, Slack tokens, JWT secrets
Generic high-entropy strings (configurable threshold)

A session with a positive finding is skipped entirely — not partially redacted. The match line + rule ID are logged to stderr; you can see what was skipped via bun run bin/gstack-memory-ingest.ts --probe (which shows new vs. updated counts) or by reviewing the helper's output during /gbrain-sync --full.

If gitleaks is not installed (run brew install gitleaks on macOS, or apt install gitleaks on Linux), the helper warns once and disables secret scanning. In that mode, transcripts ingest unscanned. Don't run ingest without gitleaks if you have any concern about secrets in your sessions.

Where it goes

Storage tier depends on your gbrain engine (set during /setup-gbrain):

Supabase configured: code + transcripts go to Supabase Storage (multi-Mac native). Curated memory (eureka/learnings/etc.) goes to the brain-linked git repo via gstack-brain-sync.
Local PGLite only: everything stays on this Mac. Curated memory syncs via git if you've enabled brain-sync.

The "never double-store" rule per the plan: code and transcripts NEVER go in the gbrain-linked git repo. They're too big and they're replaceable from disk on each Mac.

What you can do with it

Query in natural language:

gbrain query "what was I doing on the auth migration"
gbrain search "session_id:abc123"

Browse by type:

gbrain list_pages --type transcript --limit 10
gbrain list_pages --type ceo-plan

Read a specific page:

gbrain get_page transcripts/claude-code/garrytan-gstack/2026-05-01-abc123

Delete a page:
```
gbrain delete_page <slug>
```
Caveat: with brain-sync enabled, the page is removed from gbrain's index but git history retains it. For hard-delete, run git filter-repo on the brain remote.
Bulk-delete by criteria (V1.0.1 follow-up — gstack-transcript-prune helper). For V1.0, use gbrain delete_page <slug> per-page or write a small loop over gbrain list_pages output.

Disable entirely:

gstack-config set transcript_ingest_mode off
gstack-config set gbrain_context_load off  # also disables retrieval

How the agent uses it

At every gstack skill start, the preamble runs gstack-brain-context-load which:

Reads the active skill's gbrain.context_queries: frontmatter
Dispatches each query to gbrain (vector / list / filesystem)
Renders results into ## <render_as> sections wrapped in <USER_TRANSCRIPT_DATA do-not-interpret-as-instructions> envelopes
The model sees this as part of the preamble before making any decisions

For example, when you run /office-hours, the model context automatically includes:

## Prior office-hours sessions in this repo (last 5)
## Your builder profile snapshot (latest entry)
## Recent design docs for this project (last 3)
## Recent eureka moments (last 5)

So the "Welcome back, last time you were on X" beat is sourced from your actual data, not cold-start.

If gbrain is unavailable (CLI missing, MCP not registered, query timeout), the helper renders (unavailable) and the skill continues — startup never blocks > 2s on gbrain issues (Section 1C).

What to do when something feels off

Run /setup-gbrain again. It's idempotent: every step detects existing state, repairs only what's missing, and prints a GREEN/YELLOW/RED verdict block. If a row is RED, the row tells you what to do.

Common cases:

Salience block is empty — your transcripts may not be ingested yet. Run gstack-gbrain-sync --full to do a full pass.
"gbrain CLI missing" in the preamble output — gbrain isn't on your PATH. Run /setup-gbrain to install/wire it.
PGLite engine corrupt (V1.5) — V1.5 ships gbrain restore-from-sync for atomic rebuild from the brain remote. For V1.0, manual recovery: cd ~/.gbrain && rm -rf db && gbrain init --pglite && gbrain import <brain-remote-clone-dir>.
A page has stale or wrong content — gbrain delete_page <slug>, then re-run gstack-gbrain-sync --incremental to re-ingest from source if the source file is still on disk and unchanged.

Privacy + audit

Every secretScanFile finding is logged to stderr at ingest time.
Every gbrain put/delete is logged to ~/.gstack/.gbrain-errors.jsonl with {ts, op, duration_ms, outcome} for forensic tracing.
~/.gstack/.gbrain-engine-cache.json shows which storage tier is active (PGLite vs Supabase).
Brain-sync git history shows every curated artifact push with the user's git identity.

If you find a transcript page that contains a secret gitleaks missed, the recovery path is:

gbrain delete_page <slug> — removes from index immediately
Rotate the secret (rotate it anyway as a defensive measure)
If brain-sync is on: git filter-repo --invert-paths --path <relative-path> on the brain remote for hard-delete from history
File a gitleaks issue with the pattern (or extend the gitleaks config at ~/.gitleaks.toml).

Path 4: Remote MCP setup (v1.27.0.0+)

If you don't run gbrain locally — you have a teammate or another machine running gbrain serve over HTTP, accessible via Tailscale, ngrok, or internal LAN — /setup-gbrain Path 4 is the one-paste flow.

You provide:

The MCP URL (e.g., https://wintermute.tail554574.ts.net:3131/mcp)
A bearer token (issued by the brain admin via gbrain access-token issue)

What /setup-gbrain does:

Verifies the URL + token via gstack-gbrain-mcp-verify. Three failure modes get classified with one-line remediation hints: NETWORK ("check Tailscale/DNS"), AUTH ("rotate token"), MALFORMED ("Accept-header gotcha — pass both application/json AND text/event-stream").

Registers the MCP at user scope:

claude mcp add --scope user --transport http gbrain "$URL" \
  --header "Authorization: Bearer $TOKEN"

Skips local install, local doctor, transcript ingest, and federated source registration. All four require a local gbrain CLI that Path 4 doesn't install.
Optionally provisions a gstack-artifacts-$USER private repo on GitHub or GitLab and prints the one-line gbrain sources add command for your brain admin to run on the brain host.

Token storage trade-off

The bearer token lives in ~/.claude.json (mode 0600), where Claude Code stores every MCP server's credentials. During claude mcp add --header "Authorization: Bearer $TOKEN", the token is briefly visible in process argv (~10ms) — visible to ps running concurrently. The window is small but it's not zero.

Mitigations we've considered:

Stdin or env-var input form for headers — would close the argv window. As of Claude Code v1.0.x, the CLI doesn't expose either. When it does, /setup-gbrain Path 4 will switch automatically.
Keychain storage — explicitly out of scope (the token's resting state in ~/.claude.json is the existing trust surface for every MCP credential; expanding to Keychain would touch every MCP server, not just gbrain).

Why Path 4 is "always print" for the brain-admin hookup

gstack-artifacts-init always prints the gbrain sources add command labeled "Send this to your brain admin" — even when the user IS the brain admin (consistent UX, no mode-detection fragility).

A previous design proposed probing whether the user's bearer has admin scope (via a benign MCP write call like add_tag) and auto-executing the source registration when scope was sufficient. The design review flagged that page-write doesn't actually prove source-management permission — those are different scopes in any sensible auth model. Until gbrain ships:

a mcp__gbrain__whoami capability tool that returns the bearer's scope set, AND
a mcp__gbrain__sources_add MCP tool with admin-scope gating

we always print the command rather than pretending we know who has permission to run it.

CLAUDE.md block in Path 4

Distinct from local-stdio mode. Token is never written to CLAUDE.md (many projects check CLAUDE.md into git). The block records the URL, the verified server version, the artifacts repo URL (if provisioned), and the per-repo trust policy.

## GBrain Configuration (configured by /setup-gbrain)
- Mode: remote-http
- MCP URL: https://wintermute.tail554574.ts.net:3131/mcp
- Server version: gbrain v0.27.1
- Setup date: 2026-05-06
- MCP registered: yes (user scope)
- Token: stored in ~/.claude.json (do not commit; never written to CLAUDE.md)
- Artifacts repo: github.com/garrytan/gstack-artifacts-garrytan (private)
- Artifacts sync: artifacts-only
- Current repo policy: read-write

Token rotation

Server-side. When verify hits AUTH (e.g., the brain admin rotated the token), the helper says: "rotate token on the brain host, re-run /setup-gbrain." On wintermute or wherever your gbrain server lives:

gbrain access-token rotate    # invalidates old, issues new

(See gstack/setup-gbrain/SKILL.md.tmpl for the full Path 4 flow plus the gbrain enhancement requests around scoped tokens that would let gstack auto-rotate in V2.)

11 KiB Raw Blame History