gstack/bin/gstack-brain-sync
Garry Tan 2014557e7f v1.12.0.0 feat: /setup-gbrain — coding-agent onboarding for gbrain (#1183)
* feat(setup-gbrain): add gstack-gbrain-repo-policy bin helper

Per-remote trust-tier store for the forthcoming /setup-gbrain skill.
Tiers are the D3 triad (read-write / read-only / deny), keyed by a
normalized remote URL so ssh-shorthand and https variants collapse to
the same entry. The file carries _schema_version: 2 (D2-eng); legacy
`allow` values from pre-D3 experiments auto-migrate to `read-write`
on first read (idempotently), with a one-shot log line.
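
A sketch of the normalization idea (illustrative only; the exact rules
live in the bin):

  # Collapse ssh / https / shorthand remote spellings to one policy key.
  normalize_remote() {
    local url="$1"
    url="${url#ssh://}"
    url="${url#git@}"        # shorthand: git@github.com:user/repo.git
    url="${url#https://}"
    url="${url#http://}"
    url="${url%.git}"
    printf '%s\n' "$url" | sed 's|:|/|' | tr '[:upper:]' '[:lower:]'
  }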

Pure bash + jq to match the existing gstack-brain-* family. Atomic
writes via tmpfile + rename. Policy file mode 0600. A corrupt file is
quarantined to .corrupt-<ts> and a fresh one started.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(setup-gbrain): unit tests for gstack-gbrain-repo-policy

24 tests covering normalize (ssh/https/shorthand/uppercase collapse to
one key), set/get round-trip, all three D3 tiers accepted, invalid
tiers rejected, file mode 0600, _schema_version field written on fresh
files, legacy allow migration (including idempotence and preservation
of non-allow entries), corrupt-JSON quarantine + fresh-file recovery,
list output sorting, and get-without-arg auto-detect against a git
repo with no origin.

All tests green against a per-test tmpdir GSTACK_HOME so nothing
leaks into the real ~/.gstack.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(setup-gbrain): add gstack-gbrain-detect state reporter

Pure-introspection JSON emitter for the /setup-gbrain skill's
start-up branching. Reports: gbrain presence + version on PATH,
~/.gbrain/config.json existence + engine, `gbrain doctor --json`
health (wrapped in timeout 5s to match the /health D6 pattern),
gstack-brain-sync mode via gstack-config, and ~/.gstack/.git
presence for the memory-sync feature.

Never modifies state. Always emits valid JSON even when every check
is false. Handles malformed ~/.gbrain/config.json without crashing
— gbrain_engine is null in that case, not an error.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(setup-gbrain): add gstack-gbrain-install with D5 detect-first + D19 PATH-shadow guard

Clones gbrain at a pinned commit (v0.18.2) and registers it via
`bun link`. Before any clone:

  D5 detect-first — probes ~/git/gbrain, ~/gbrain, and the install
  target for a valid pre-existing clone (package.json with name
  "gbrain" and bin.gbrain set). If one is found, `bun link` runs
  there instead of cloning a second copy. Prevents the day-one
  duplicate-install footgun on the skill author's own machine.
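
  Shape of the probe (sketch; install_target and the helper name are
  illustrative):

    is_valid_gbrain_clone() {
      [ -f "$1/package.json" ] &&
        jq -e '.name == "gbrain" and .bin.gbrain != null' \
          "$1/package.json" >/dev/null 2>&1
    }
    for dir in "$HOME/git/gbrain" "$HOME/gbrain" "$install_target"; do
      is_valid_gbrain_clone "$dir" && { (cd "$dir" && bun link); exit 0; }
    done
    # none valid: fall through to a fresh clone at the pinned commit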

After install:

  D19 PATH-shadow guard — reads the install-dir's package.json
  version, compares to `gbrain --version` on PATH. On mismatch:
  exits 3, prints every gbrain binary on PATH via `type -a`, and
  gives a remediation menu. Setup skills refuse broken environments
  instead of warning and continuing.
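
  The check reduces to (sketch; install_dir is illustrative):

    want=$(jq -r '.version' "$install_dir/package.json")
    have=$(gbrain --version 2>/dev/null)
    if [ "${want#v}" != "${have#v}" ]; then  # tolerate v-prefix either side
      echo "gbrain on PATH ($have) != installed ($want)" >&2
      type -a gbrain >&2                     # list every shadowing binary
      exit 3
    fi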

Prereq checks (bun, git, https://github.com reachability) fail fast
with install hints. --dry-run and --validate-only flags let the
skill probe the plan without touching state; tests use them to
cover D5 and D19 without exercising real bun link.

Pin is a load-bearing version: setup-gbrain v1 verified against
gbrain v0.18.2. Updating requires re-running Pre-Impl Gate 1 to
verify gbrain's CLI + config shapes haven't drifted.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(setup-gbrain): unit tests for gstack-gbrain-detect + install

15 tests covering: detect emits valid JSON when nothing configured,
reports gstack_brain_git on GSTACK_HOME/.git presence, reads
~/.gbrain/config.json engine, tolerates malformed config, detects
a mocked gbrain binary on PATH with version parsing.

For install: D5 detect-first uses ~/git/gbrain fixtures under a
sandboxed HOME, verifies fall-through to fresh clone when no valid
clone exists, rejects invalid package.json shapes. D19 PATH-shadow
validation uses a fake gbrain on a minimal SAFE_PATH to simulate
version mismatch, same-version-pass, v-prefix tolerance, missing
binary on PATH, and missing version field in package.json.

--validate-only mode in the install bin makes the D19 check unit-
testable without running real bun link (which touches ~/.bun/bin).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(setup-gbrain): add gstack-gbrain-lib.sh with read_secret_to_env (D3-eng)

Shared secret-read helper for PAT (D11) and pooler URL paste (D16).
One implementation of the hardest-to-get-right pattern: stty -echo +
SIGINT/TERM/EXIT trap that restores terminal mode, read into a named
env var, optional redacted preview.

Validates the target var name against [A-Z_][A-Z0-9_]* to prevent
bash name-injection via `read -r "$varname"`. When stdin is not a TTY
(CI, piped tests) the stty branches skip cleanly — piped input doesn't
echo anyway. Exports the var after read so subprocesses inherit it;
callers own the `unset` at handoff time.
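
Reduced sketch of the pattern (the shipped helper adds flag parsing and
the redacted preview):

  read_secret_to_env() {
    local var="$1"
    [[ "$var" =~ ^[A-Z_][A-Z0-9_]*$ ]] || return 2  # block name injection
    if [ -t 0 ]; then
      stty -echo
      trap 'stty echo 2>/dev/null || true' INT TERM EXIT
    fi
    IFS= read -r "$var"
    if [ -t 0 ]; then
      stty echo
      trap - INT TERM EXIT
      echo >&2   # re-emit the newline that -echo swallowed
    fi
    export "$var"
  }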

Sourced, not executed — no +x bit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(setup-gbrain): add gstack-gbrain-supabase-verify structural URL check

Zero-network validator for Supabase Session Pooler URLs before handing
them to `gbrain init`. Canonical shape verified per gbrain init.ts:266:
  postgresql://postgres.<ref>:<password>@aws-0-<region>.pooler.supabase.com:6543/postgres

Rejects direct-connection URLs (db.*.supabase.co:5432) with a distinct
exit code 3 and clear IPv6-failure remediation — that's the most common
paste mistake users make, so it earns its own UX path rather than a
generic "bad URL" error.

Never echoes the URL (contains a password) in error messages; tests
verify a distinct seed password never appears in stderr on any reject
path. Accepts URL from argv[1] or stdin ("-" or no arg).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(setup-gbrain): unit tests for supabase-verify + lib.sh secret helper

22 tests. verify: accepts canonical pooler URL (argv + stdin modes),
rejects direct-connection URL with exit 3, rejects wrong scheme, wrong
port, empty password, missing userinfo, plain 'postgres' user (catches
direct-URL paste errors), wrong host, empty URL. Case-insensitive host
match. Explicit negative: error messages never echo the URL password.

lib.sh read_secret_to_env: reads piped stdin into the named env var,
exports to subprocesses, redacted-preview emits masked form on stderr
with the seed password absent, rejects invalid var names (lowercase,
leading digit, hyphens), rejects missing/unknown flags, secret value
never appears on stdout.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(setup-gbrain): add gstack-gbrain-supabase-provision Management API wrapper

Four subcommands: list-orgs, create, wait, pooler-url. Built against
the verified Supabase Management API shape (Pre-Impl Gate 1):

  - POST /v1/projects with {name, db_pass, organization_slug, region}
    — not the original plan's /v1/organizations/{ref}/projects
  - No `plan` field; subscription tier is org-level per the OpenAPI
    description ("Subscription Plan is now set on organization level
    and is ignored in this request")
  - GET /v1/projects/{ref}/config/database/pooler for pooler config
    — not /config/database

Secrets discipline: SUPABASE_ACCESS_TOKEN (PAT) and DB_PASS read from
env only, never from argv (D8 grep test enforces this). `set +x` at
the top as a defensive default so debug tracing never leaks secrets.
Management API hostname is hardcoded, overridable only through the
SUPABASE_API_BASE env var — no user-controlled URL portion (SSRF guard).

HTTP error paths: 401/403 → exit 3 (auth), 402 → 4 (quota), 409 → 5
(conflict), 429 + 5xx → exponential-backoff retry up to 3 attempts,
then exit 8. Wait subcommand polls every 5s until ACTIVE_HEALTHY
with a configurable timeout; terminal states (INIT_FAILED, REMOVED,
etc.) exit 7 immediately with a clear message. Timeout emits the
--resume-provision hint so the skill can recover.
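
The mapping, inside the retry loop (sketch; variable names illustrative):

  case "$status" in
    401|403) exit 3 ;;   # auth — fix the PAT, retrying won't help
    402)     exit 4 ;;   # quota
    409)     exit 5 ;;   # conflict
    429|5??)             # exponential backoff, then give up
      [ "$attempt" -ge 3 ] && exit 8
      sleep $((2 ** attempt)); attempt=$((attempt + 1)); continue ;;
  esac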

Pooler-url constructs the URL locally from db_user/host/port/name +
DB_PASS rather than trusting the API response's connection_string
field, which is templated with [PASSWORD] rather than the real value.
Handles both object and array response shapes, preferring session
pool_mode when Supabase returns multiple pooler configs.
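
Roughly (sketch; the jq field paths are assumed from the commit text):

  entry=$(jq 'if type == "array"
              then (map(select(.pool_mode == "session")) + .)[0]
              else . end' <<<"$pooler_json")
  printf 'postgresql://%s:%s@%s:%s/%s\n' \
    "$(jq -r .db_user <<<"$entry")" "$DB_PASS" \
    "$(jq -r .db_host <<<"$entry")" \
    "$(jq -r .db_port <<<"$entry")" \
    "$(jq -r .db_name <<<"$entry")"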

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(setup-gbrain): unit tests for gstack-gbrain-supabase-provision via mock API

22 tests covering D21 HTTP error suite (401/403/402/409/429/5xx) and
happy paths for all four subcommands. Every test spins up a Bun.serve
mock server bound to SUPABASE_API_BASE so nothing hits the real API.

Uses Bun.spawn (async) rather than spawnSync because spawnSync blocks
the Bun event loop, which prevents Bun.serve mocks from responding —
calls would hit curl's own timeout instead of round-tripping.

Verifies: POST body contains organization_slug (not organization_id)
and no `plan` field, bearer-token auth header, retry-on-429 with
eventual success, exit-8 on persistent 5xx after max retries, wait
succeeds on ACTIVE_HEALTHY, exits 7 on INIT_FAILED, exits 6 with
--resume-provision hint on timeout, pooler-url builds URL locally
from db_user/host/port/name + DB_PASS (not response connection_string
template), handles array pooler responses.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(setup-gbrain): add SKILL.md.tmpl — user-facing skill prompt

Stitches together every slice built so far (repo-policy, detect,
install, lib.sh secret helper, supabase-verify, supabase-provision)
into a single interactive flow. Paths: Supabase existing-URL, Supabase
auto-provision (D7), Supabase manual, PGLite local, switch (PGLite ↔
Supabase via gbrain migrate wrapped in timeout 180s per D9).

Secrets discipline per D8/D10/D11: PAT + DB_PASS + pooler URL all
read via read_secret_to_env from lib.sh and handed to gbrain via
GBRAIN_DATABASE_URL env, never argv. PAT carries the full D11 scope
disclosure before collection and an explicit revocation reminder after
success. D12 SIGINT recovery prints the in-flight ref + resume command.

D18 MCP registration is scoped honestly to Claude Code — skips with
a manual-register hint when `claude` is not on PATH. D6 per-remote
trust-triad question (read-write/read-only/deny/skip-for-now) gates
repo import; the triad values compose with the D2-eng schema-version
policy file so future migrations stay deterministic.

The skill takes a concurrent-run lock via mkdir ~/.gstack/.setup-gbrain.lock.d
(atomic, same pattern as gstack-brain-sync). Telemetry (D4) payload
carries enumerated categorical values only — never URL, PAT, or any
postgresql:// substring.

--repo, --switch, --resume-provision, --cleanup-orphans shortcut modes
documented inline; the skill parses its own invocation args.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(health): integrate gbrain as D6 composite dimension

Adds a GBrain row to the /health dashboard rubric with weight 10%.
Three sub-signals rolled into one 0-10 score: doctor status (0.5),
sync queue depth (0.3), last-push age (0.2). Redistributes when
gbrain_sync_mode is off so the dimension stays fair.
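
One plausible shape of the rollup (sketch; the off-mode redistribution
shown here is an assumption):

  if [ "$sync_mode" = "off" ]; then
    score=$doctor                       # doctor carries the full weight
  else
    score=$(awk -v d="$doctor" -v q="$queue" -v a="$age" \
      'BEGIN { printf "%.1f", 0.5*d + 0.3*q + 0.2*a }')
  fi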

Weights rebalance: typecheck 25→22, lint 20→18, test 30→28,
deadcode 15→13, shell 10→9, gbrain +10 — sums to 100.

gbrain doctor --json wrapped in timeout 5s so a hung gbrain never
stalls the /health dashboard. Dimension is omitted (not red) when
gbrain is not installed — running /health on a non-gbrain machine
shouldn't penalize that choice.

History-JSONL adds a `gbrain` field. Pre-D6 entries read as null for
trend comparison; new tracking starts from first post-D6 run.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(test): add secret-sink-harness for negative-space leak testing (D21 #5)

Runs a subprocess with a seeded secret, captures every channel the
subprocess could leak through, and asserts the seed never appears.
Built per the D1-eng tightened contract: per-run tmp $HOME, four seed
match rules (exact + URL-decoded + first-12-char prefix + base64),
fd-level stdout/stderr capture via Bun.spawn, post-mortem walk of
every file written under $HOME, separate buckets for telemetry JSONL.
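
The harness is TS; a grep-level equivalent of the four match rules (the
URL-decode rule is approximated here by searching the encoded form):

  prefix="${seed:0:12}"
  b64=$(printf '%s' "$seed" | base64)
  enc=$(jq -rn --arg s "$seed" '$s | @uri')
  grep -qF -e "$seed" -e "$prefix" -e "$b64" -e "$enc" "$channel" \
    && echo "LEAK in $channel" >&2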

Reusable: any future skill that handles secrets can import
runWithSecretSink and run positive/negative controls against its own
bins. The harness itself is ~180 lines of TS with no external deps
beyond Bun + node:fs.

Out of scope for v1 (documented as follow-ups): subprocess env dump
(portable /proc reading), the user's real shell history (bins don't
modify it).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test: secret-sink harness positive controls + real-bin negative controls

11 tests. Positive controls deliberately leak a seed in every covered
channel (stdout, stderr, a file under $HOME, the telemetry JSONL path,
base64-encoded, first-12-char prefix) and assert the harness catches
each one. Without these, a harness that silently under-reports would
look identical to a harness that works.

Negative controls run real setup-gbrain bins with distinctive seeds:
  - supabase-verify rejects a mysql:// URL and a direct-connection URL,
    password never appears in any captured channel
  - lib.sh read_secret_to_env reads piped stdin, emits only the length,
    seed value stays invisible
  - supabase-provision on an auth-failure path fails fast without
    leaking the PAT to any channel

Covers D21 #5 leak harness + uses it to validate D3-eng, D10, D11
discipline end-to-end on the already-shipped bins.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(setup-gbrain): add list-orphans + delete-project subcommands (D20)

Powers /setup-gbrain --cleanup-orphans. list-orphans filters the
authenticated user's Supabase projects by name prefix (default
"gbrain") and excludes the project the local ~/.gbrain/config.json
currently points at, so only unclaimed gbrain-shaped projects come
back. Active-ref detection parses the pooler URL's user portion
(postgres.<ref>:<pw>@...).
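
Extraction shape (sketch; the config key name here is assumed):

  active_ref=$(jq -r '.database_url // empty' ~/.gbrain/config.json 2>/dev/null |
    sed -n 's|^postgresql://postgres\.\([^:]*\):.*|\1|p')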

delete-project is a thin DELETE /v1/projects/{ref} wrapper with no
confirmation of its own — the skill's UI layer owns the per-project
confirm AskUserQuestion loop. Keeps responsibilities clean: the bin
manages HTTP; the skill manages user intent.

Both subcommands reuse the existing api_call retry+backoff and the
same PAT discipline (env only, never argv).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(setup-gbrain): list-orphans active-ref filtering + delete-project 404

6 new tests bringing the supabase-provision suite to 28:

list-orphans:
  - Filters to gbrain-prefixed projects, excludes the active-ref derived
    from ~/.gbrain/config.json's pooler URL
  - Treats all gbrain-prefixed projects as orphans when no config exists
    (first run on a new machine)
  - Respects custom --name-prefix for users who named their brain
    something else

delete-project:
  - Happy path sends DELETE /v1/projects/<ref> and returns {deleted_ref}
  - 404 surfaces cleanly (exit 2, "404" in stderr)
  - Missing <ref> positional rejected with exit 2

Uses per-test tmpdir HOME with a stubbed ~/.gbrain/config.json so
active-ref extraction runs against deterministic fixtures.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: regenerate setup-gbrain SKILL.md after main merge

* chore: bump version and changelog (v1.12.0.0)

Ships /setup-gbrain and its supporting infrastructure end-to-end:
per-remote trust policy, installer with PATH-shadow guard, shared
secret-read helper, structural URL verifier, Supabase Management
API wrapper, /health GBrain dimension, secret-sink test harness.

100 new tests across 5 suites, all green. Three pre-existing test
failures noted as P0 in TODOS.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: add USING_GBRAIN_WITH_GSTACK.md + update README for /setup-gbrain

README changes:
- Rewrote the "Cross-machine memory with GBrain sync" section into
  "GBrain — persistent knowledge for your coding agent." Covers the
  three /setup-gbrain paths (Supabase existing URL, auto-provision,
  PGLite local), MCP registration, per-remote trust triad, and the
  (still-separate) memory sync feature.
- Added /setup-gbrain row to the skills table pointing at the full guide.
- Added /setup-gbrain to both skill-list install snippets.
- Added USING_GBRAIN_WITH_GSTACK.md to the Docs table.

New doc (USING_GBRAIN_WITH_GSTACK.md):
- All three setup paths with trust-surface caveats
- MCP registration details (and honest Claude-Code-v1 scoping)
- Per-remote trust triad semantics + how to change a policy
- Switching engines (PGLite ↔ Supabase) via --switch
- GStack memory sync + its relationship to the gbrain knowledge base
- /setup-gbrain --cleanup-orphans for orphan Supabase projects
- Full command + flag reference, every bin helper, every env var
- Security model: what's enforced in code, what's enforced by the leak
  harness, and the honest limits of v1
- Troubleshooting: PATH shadowing, direct-connection URL reject,
  auto-provision timeout, stale lock, policy file hand-edits,
  migrate hang
- Why-this-design section explaining the non-obvious choices

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(brain-sync): secret scanner now catches Bearer-prefixed auth tokens in JSON

The bearer-token-json regex value charset was [A-Za-z0-9_./+=-]{16,},
which does NOT permit spaces. Real HTTP auth headers embed the scheme
name with a literal space — "Bearer <token>" — so the value portion
actually starts with "Bearer " and the existing regex couldn't match.
Result: any JSON blob containing "authorization":"Bearer ..." would
slip past the scanner and sync to the user's private brain repo with
the bearer token inline.

Added optional (Bearer |Basic |Token )? prefix in front of the value
charset. Now matches the common auth-scheme forms without broadening
the matcher to tolerate arbitrary whitespace (which would false-positive
on lots of benign JSON).

Verified against 5 positive cases (bearer-in-json, clean bearer, apikey
no-prefix, token with Bearer, password no-prefix) + 3 negative cases
(too-short tokens, non-secret field names like username, random JSON).

This closes the P0 security regression first noticed during v1.12.0.0
/ship. brain-sync.test.ts now passes all 7 secret-scan fixtures.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test: mock-gh integration tests for gstack-brain-init auto-create path

8 tests covering the gh-repo-create happy path that had zero coverage
before. Existing brain-sync.test.ts always passes --remote <bare-url>
to bypass gh entirely, so the interactive default ("press Enter, we'll
run gh repo create for you") was shipping on trust.

Test strategy: write a bash stub for gh that records every call into
a file, then run gstack-brain-init with that stub on PATH. Assertions
verify: gh auth status is checked, gh repo create fires with the
computed gstack-brain-<user> default name + --private + --source
flags, fall-through to gh repo view when create reports already-exists,
user-provided URL bypasses gh entirely, gh-not-on-path and gh-not-authed
branches both prompt for URL, --remote flag short-circuits all gh
calls, conflicting-remote re-runs exit 1 with a clear message.

No real GitHub, no live auth. Gate tier — runs on every commit.
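
The stub is just (sketch; log path and canned output illustrative):

  printf '%s\n' \
    '#!/usr/bin/env bash' \
    'echo "$*" >> "$GH_CALLS"' \
    'case "$1 $2" in' \
    '  "auth status") exit 0 ;;' \
    '  "repo create") echo created ;;' \
    '  "repo view")   echo exists ;;' \
    'esac' > "$stub_dir/gh"
  chmod +x "$stub_dir/gh"
  GH_CALLS="$call_log" PATH="$stub_dir:$PATH" gstack-brain-init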

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(e2e): privacy-gate AskUserQuestion fires from preamble (periodic tier)

Two periodic-tier E2E tests exercising the preamble's privacy gate
end-to-end via the Agent SDK + canUseTool. Previously uncovered:

- Positive: stages a fake gbrain on PATH + gbrain_sync_mode_prompted=false
  in config, runs a real skill, intercepts tool-use. Asserts the
  preamble fires a 3-option AskUserQuestion matching the canonical
  prose ("publish session memory" / "artifact" / "decline") and does
  NOT fire a second time in the same run (idempotency within session).

- Negative: same staging but prompted=true. Asserts the gate stays
  silent even with gbrain detected on the host.

Registered in test/helpers/touchfiles.ts as `brain-privacy-gate`
(periodic) with dependency tracking on generate-brain-sync-block.ts,
the three gstack-brain-* bins, gstack-config, and the Agent SDK runner.
Diff-based selection re-runs the E2E when any of those change.

Cost: ~$0.30-$0.50 per run. Only fires under EVALS=1 EVALS_TIER=periodic;
gate tier stays free.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: update TODOS for bearer-json fix + new brain-sync test coverage

Moves the bearer-json secret-scan regression from the P0 "pre-existing
failures" block into the Completed section with full context on the
fix, the mock-gh tests, the E2E privacy-gate tests, and the touchfile
registration. Remaining P0s are the GSTACK_HOME config-isolation bug
and the stale Opus 4.7 overlay pacing assertion, both unrelated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(test): E2E privacy gate — ambient env + skill-file prompt

Two fixes to get the E2E actually running end-to-end (first attempt
failed at the SDK auth step, second at the assertion step):

1. Don't pass an explicit `env:` object to runAgentSdkTest. The SDK's
   auth pipeline misses ANTHROPIC_API_KEY when env is supplied as an
   object (verified against the plan-mode-no-op test, which passes no
   env and auths cleanly). Mutate process.env before the call instead,
   and restore the originals in finally so other tests don't inherit
   the ambient mutation.

2. The "Run /learn with no arguments" user prompt was too narrow — the
   model reduced it to a direct action and skipped the preamble
   privacy-gate directives entirely, so zero AskUserQuestions fired.
   Mirror the plan-mode-no-op pattern: point the model at the skill
   file on disk and ask it to follow every preamble directive. Bumped
   maxTurns from 6 to 10 to give the preamble room to execute.

Verified both tests pass under `EVALS=1 EVALS_TIER=periodic bun test
test/skill-e2e-brain-privacy-gate.test.ts` against a real ANTHROPIC_API_KEY.
Cost: ~$0.30-$0.50 per test.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(CLAUDE.md): source ANTHROPIC/OPENAI keys from ~/.zshrc for paid evals

Conductor workspaces don't inherit the interactive shell env, so
both API keys are absent from the default process env even though
they're set in ~/.zshrc. Documents the source-from-zshrc pattern
(grep + eval, never echo the value) plus the Agent SDK gotcha: do
NOT pass env as an object to runAgentSdkTest — mutate process.env
ambiently and restore in finally.
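
The pattern, roughly:

  # pull only the export lines; never echo a value
  eval "$(grep -E '^export (ANTHROPIC|OPENAI)_API_KEY=' ~/.zshrc)"
  : "${ANTHROPIC_API_KEY:?not found in ~/.zshrc}"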

Discovered this during the brain-privacy-gate E2E. First run failed
at SDK auth with 401; second failed because explicit env handoff
bypassed the SDK's own auth routing. Fix pattern now codified so
the next paid-eval session in a Conductor workspace doesn't hit the
same two dead ends.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 01:38:21 -07:00


#!/usr/bin/env bash
# gstack-brain-sync — drain queue, commit allowlisted paths, push to remote.
#
# Usage:
#   gstack-brain-sync --once              drain queue, commit, push (default)
#   gstack-brain-sync --status            print sync health as JSON
#   gstack-brain-sync --skip-file <p>     add <p> to ~/.gstack/.brain-skip.txt
#   gstack-brain-sync --drop-queue --yes  clear queue without committing
#   gstack-brain-sync --discover-new      scan allowlist dirs, enqueue changed files
#
# Invoked by the preamble at skill START and END boundaries. No persistent
# daemon. Typical run <1s when queue empty; ~200-800ms with network push.
#
# Singleton enforcement: atomic mkdir on ~/.gstack/.brain-sync.lock.d —
# flock(1) is missing on stock macOS. Concurrent invocations skip, not wait.
#
# Env:
#   GSTACK_HOME — override ~/.gstack (aligns with writers).
set -uo pipefail
GSTACK_HOME="${GSTACK_HOME:-$HOME/.gstack}"
QUEUE="$GSTACK_HOME/.brain-queue.jsonl"
ALLOWLIST="$GSTACK_HOME/.brain-allowlist"
PRIVACY_MAP="$GSTACK_HOME/.brain-privacy-map.json"
SKIP_FILE="$GSTACK_HOME/.brain-skip.txt"
STATUS_FILE="$GSTACK_HOME/.brain-sync-status.json"
LAST_PUSH_FILE="$GSTACK_HOME/.brain-last-push"
LOCK_FILE="$GSTACK_HOME/.brain-sync.lock"
DISCOVER_CURSOR="$GSTACK_HOME/.brain-discover-cursor"
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
CONFIG_BIN="$SCRIPT_DIR/gstack-config"
# Remote-specific hint for auth errors (branch on origin URL).
remote_auth_hint() {
  local url
  url=$(git -C "$GSTACK_HOME" remote get-url origin 2>/dev/null || echo "")
  case "$url" in
    *github.com*|*@github.*) echo "run: gh auth status (and gh auth refresh if needed)" ;;
    *gitlab*)                echo "run: glab auth status" ;;
    *)                       echo "check 'git remote -v' and your credentials" ;;
  esac
}
write_status() {
  # args: status_code message [extra_json_blob]
  local code="$1"
  local msg="$2"
  local extra="${3:-}"
  [ -n "$extra" ] || extra="{}"
  local ts
  ts=$(date -u +%Y-%m-%dT%H:%M:%SZ 2>/dev/null || echo "")
  python3 - "$STATUS_FILE" "$code" "$msg" "$ts" "$extra" <<'PYEOF' 2>/dev/null || true
import json, sys

path, code, msg, ts, extra = sys.argv[1:6]
try:
    extra_obj = json.loads(extra) if extra else {}
except Exception:
    extra_obj = {}
data = {"status": code, "message": msg, "ts": ts, **extra_obj}
with open(path, "w") as f:
    json.dump(data, f)
    f.write("\n")
PYEOF
}
# Read config; return 0 if sync active, 1 otherwise.
sync_active() {
  if [ ! -d "$GSTACK_HOME/.git" ]; then
    return 1
  fi
  local mode
  mode=$("$CONFIG_BIN" get gbrain_sync_mode 2>/dev/null || echo off)
  [ "$mode" = "off" ] && return 1
  return 0
}
# Secret regex families — stdin scan. Exits 0 clean, 1 if hit.
# Echoes the matching pattern family name on hit. Uses python3 -c (not
# heredoc) so sys.stdin stays available for the diff content.
secret_scan_stdin() {
  python3 -c "
import sys, re

patterns = [
    ('aws-access-key', re.compile(r'AKIA[0-9A-Z]{16}')),
    ('github-token', re.compile(r'\\b(gh[pousr]_[A-Za-z0-9]{20,}|github_pat_[A-Za-z0-9_]{20,})')),
    ('openai-key', re.compile(r'\\bsk-[A-Za-z0-9_-]{20,}')),
    ('pem-block', re.compile(r'-----BEGIN [A-Z ]{3,}-----')),
    ('jwt', re.compile(r'\\beyJ[A-Za-z0-9_-]{10,}\\.[A-Za-z0-9_-]{10,}\\.[A-Za-z0-9_-]{10,}\\b')),
    ('bearer-token-json',
     # JSON-embedded auth headers. The optional Bearer/Basic/Token prefix
     # matters: real auth values include a literal space after the scheme
     # name, but the value charset below does not include spaces, so
     # without the optional prefix every Bearer token in a JSON blob slips
     # past the scanner.
     re.compile(r'\"(authorization|api[_-]?key|apikey|token|secret|password)\"\\s*:\\s*\"(Bearer |Basic |Token )?[A-Za-z0-9_./+=-]{16,}\"',
                re.IGNORECASE)),
]
text = sys.stdin.read()
for name, rx in patterns:
    m = rx.search(text)
    if m:
        snippet = m.group(0)
        if len(snippet) > 30:
            snippet = snippet[:30] + '...'
        print(name + ':' + snippet)
        sys.exit(1)
sys.exit(0)
"
}
# Compute matched allowlisted, privacy-filtered path set from queue.
# Output: newline-delimited relative paths that should be staged.
compute_paths_to_stage() {
  local mode="$1"
  python3 - "$GSTACK_HOME" "$QUEUE" "$ALLOWLIST" "$PRIVACY_MAP" "$SKIP_FILE" "$mode" <<'PYEOF'
import sys, json, os, fnmatch

gstack_home, queue, allowlist_path, privacy_path, skip_path, mode = sys.argv[1:7]

def load_lines(path):
    try:
        with open(path) as f:
            return [l.strip() for l in f if l.strip() and not l.lstrip().startswith("#")]
    except FileNotFoundError:
        return []

def load_privacy_map(path):
    try:
        with open(path) as f:
            data = json.load(f)
        # Expected: [{"pattern": "glob", "class": "artifact" | "behavioral"}]
        return data if isinstance(data, list) else []
    except (FileNotFoundError, json.JSONDecodeError):
        return []

allowlist_globs = load_lines(allowlist_path)
privacy_map = load_privacy_map(privacy_path)
skip_lines = set(load_lines(skip_path))

# Read queue; collect unique file paths.
queue_paths = set()
try:
    with open(queue) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            try:
                obj = json.loads(line)
                p = obj.get("file")
                if isinstance(p, str):
                    queue_paths.add(p)
            except json.JSONDecodeError:
                continue
except FileNotFoundError:
    pass

def path_matches_any(path, globs):
    for pattern in globs:
        if fnmatch.fnmatchcase(path, pattern):
            return True
    return False

def privacy_class(path, mapping):
    for entry in mapping:
        pat = entry.get("pattern")
        if pat and fnmatch.fnmatchcase(path, pat):
            return entry.get("class", "artifact")
    # Default class when no pattern matches: artifact (safe default).
    return "artifact"

# mode filter: 'off' → nothing; 'artifacts-only' → only artifact class;
# 'full' → both classes.
def mode_allows(cls, mode):
    if mode == "off":
        return False
    if mode == "artifacts-only":
        return cls == "artifact"
    return True  # full

final = []
for p in sorted(queue_paths):
    if p in skip_lines:
        continue
    # Must be under GSTACK_HOME root. Reject absolute + reject ../ escape.
    if p.startswith("/") or ".." in p.split("/"):
        continue
    # Must match at least one allowlist glob.
    if not path_matches_any(p, allowlist_globs):
        continue
    # Must survive privacy mode filter.
    cls = privacy_class(p, privacy_map)
    if not mode_allows(cls, mode):
        continue
    # Must exist on disk — can't stage what isn't there.
    if not os.path.exists(os.path.join(gstack_home, p)):
        continue
    final.append(p)

for p in final:
    print(p)
PYEOF
}
subcmd_once() {
  if ! sync_active; then
    # Silent no-op when feature not initialized / disabled.
    exit 0
  fi
  # Singleton lock via atomic mkdir. `flock(1)` isn't on macOS by default;
  # `mkdir` is atomic on every POSIX filesystem. If another --once is already
  # running, skip (don't wait) — the next skill boundary will catch up.
  local lock_dir="${LOCK_FILE}.d"
  if ! mkdir "$lock_dir" 2>/dev/null; then
    # Is the lock stale? Check the pidfile inside. If process is dead, clear it.
    if [ -f "$lock_dir/pid" ]; then
      local lock_pid
      lock_pid=$(cat "$lock_dir/pid" 2>/dev/null || echo "")
      if [ -n "$lock_pid" ] && ! kill -0 "$lock_pid" 2>/dev/null; then
        # Stale lock — clear and retry once.
        rm -rf "$lock_dir" 2>/dev/null || true
        if ! mkdir "$lock_dir" 2>/dev/null; then
          exit 0
        fi
      else
        # Lock is held by a live process.
        exit 0
      fi
    else
      # Lock dir without pidfile — treat as held; don't touch.
      exit 0
    fi
  fi
  echo "$$" > "$lock_dir/pid" 2>/dev/null || true
  local mode
  mode=$("$CONFIG_BIN" get gbrain_sync_mode 2>/dev/null || echo off)
  local paths_file
  paths_file=$(mktemp /tmp/brain-sync-paths.XXXXXX) || { rm -rf "$lock_dir" 2>/dev/null; write_status "error" "mktemp failed"; exit 1; }
  # Single trap covers both: lock cleanup AND tempfile cleanup.
  trap 'rm -f "$paths_file" 2>/dev/null; rm -rf "$lock_dir" 2>/dev/null || true' EXIT INT TERM
  compute_paths_to_stage "$mode" > "$paths_file"
  if [ ! -s "$paths_file" ]; then
    # Nothing to stage. Clear any stale queue entries and exit.
    : > "$QUEUE"
    write_status "idle" "no allowlisted changes in queue"
    exit 0
  fi
  # Stage explicit paths only; git add -f forces past the .gitignore=* default.
  while IFS= read -r p; do
    [ -z "$p" ] && continue
    git -C "$GSTACK_HOME" add -f -- "$p" 2>/dev/null || true
  done < "$paths_file"
  # Secret-scan staged diff.
  local scan_out
  scan_out=$(git -C "$GSTACK_HOME" diff --cached 2>/dev/null | secret_scan_stdin || true)
  if [ -n "$scan_out" ]; then
    # Hit — unstage, preserve queue, write loud status.
    git -C "$GSTACK_HOME" reset HEAD -- . >/dev/null 2>&1 || true
    local hint
    hint="secret pattern detected ($scan_out). Remediation: review the staged file, then run: gstack-brain-sync --skip-file <path> OR edit the content."
    write_status "blocked" "$hint"
    echo "BRAIN_SYNC: blocked: $scan_out" >&2
    exit 0
  fi
  # Commit with template message.
  local n ts
  n=$(wc -l < "$paths_file" | tr -d ' ')
  ts=$(date -u +%Y-%m-%dT%H:%M:%SZ)
  local msg="sync: $n file(s) | $ts"
  git -C "$GSTACK_HOME" -c user.email="gstack@localhost" -c user.name="gstack-brain-sync" \
    commit -q -m "$msg" 2>/dev/null || {
    # Nothing to commit (e.g. all files already committed).
    : > "$QUEUE"
    write_status "idle" "queue drained but no new changes to commit"
    exit 0
  }
  # Push. On reject, fetch + merge (merge driver handles JSONL) + retry once.
  local push_err
  push_err=$(git -C "$GSTACK_HOME" push origin HEAD 2>&1 >/dev/null) || {
    # Check if this is an auth error first — no point retrying.
    if echo "$push_err" | grep -qiE "auth|permission|403|401|forbidden"; then
      local hint
      hint=$(remote_auth_hint)
      write_status "push_failed" "push failed: auth error. fix: $hint"
      echo "BRAIN_SYNC: push failed: auth. fix: $hint" >&2
      # Queue cleared because the commit exists locally; next push will send it.
      : > "$QUEUE"
      exit 0
    fi
    # Try a fetch-and-merge + retry.
    if git -C "$GSTACK_HOME" fetch origin 2>/dev/null; then
      local branch
      branch=$(git -C "$GSTACK_HOME" rev-parse --abbrev-ref HEAD 2>/dev/null || echo main)
      if git -C "$GSTACK_HOME" merge --no-edit "origin/$branch" >/dev/null 2>&1; then
        if git -C "$GSTACK_HOME" push origin HEAD 2>/dev/null; then
          : > "$QUEUE"
          date -u +%Y-%m-%dT%H:%M:%SZ > "$LAST_PUSH_FILE"
          write_status "ok" "pushed $n file(s) after merge"
          exit 0
        fi
      fi
    fi
    write_status "push_failed" "push failed: $(printf '%s' "$push_err" | head -1)"
    : > "$QUEUE"
    exit 0
  }
  # Success: clear queue, update last-push.
  : > "$QUEUE"
  date -u +%Y-%m-%dT%H:%M:%SZ > "$LAST_PUSH_FILE"
  write_status "ok" "pushed $n file(s)"
  exit 0
}
subcmd_status() {
  if [ -f "$STATUS_FILE" ]; then
    cat "$STATUS_FILE"
  else
    echo '{"status":"unknown","message":"no status file yet"}'
  fi
  # Supplemental info (not in status file).
  local queue_depth=0
  [ -f "$QUEUE" ] && queue_depth=$(wc -l < "$QUEUE" | tr -d ' ')
  local last_push="never"
  [ -f "$LAST_PUSH_FILE" ] && last_push=$(cat "$LAST_PUSH_FILE" 2>/dev/null || echo never)
  local mode
  mode=$("$CONFIG_BIN" get gbrain_sync_mode 2>/dev/null || echo off)
  printf '{"queue_depth":%s,"last_push":"%s","mode":"%s"}\n' "$queue_depth" "$last_push" "$mode"
}
subcmd_skip_file() {
  local path="${1:-}"
  if [ -z "$path" ]; then
    echo "Usage: gstack-brain-sync --skip-file <path>" >&2
    exit 1
  fi
  mkdir -p "$GSTACK_HOME"
  # Avoid duplicate entries.
  if [ -f "$SKIP_FILE" ] && grep -Fxq "$path" "$SKIP_FILE"; then
    echo "already in skip list: $path"
    exit 0
  fi
  echo "$path" >> "$SKIP_FILE"
  echo "added to skip list: $path"
  echo "(future writers will not enqueue this path; existing queue entries ignored on next --once)"
}
subcmd_drop_queue() {
  local force="${1:-}"
  if [ "$force" != "--yes" ]; then
    echo "Refusing: --drop-queue discards pending syncs. Pass --yes to confirm." >&2
    exit 1
  fi
  if [ ! -f "$QUEUE" ]; then
    echo "queue already empty"
    exit 0
  fi
  local n
  n=$(wc -l < "$QUEUE" | tr -d ' ')
  : > "$QUEUE"
  echo "dropped $n queue entries"
}
subcmd_discover_new() {
  if ! sync_active; then
    exit 0
  fi
  # Walk allowlist globs; enqueue any file where mtime+size differs from cursor.
  python3 - "$GSTACK_HOME" "$ALLOWLIST" "$DISCOVER_CURSOR" "$SCRIPT_DIR/gstack-brain-enqueue" <<'PYEOF' 2>/dev/null || true
import sys, os, json, fnmatch, subprocess

gstack_home, allowlist_path, cursor_path, enqueue_bin = sys.argv[1:5]

def load_lines(path):
    try:
        with open(path) as f:
            return [l.strip() for l in f if l.strip() and not l.lstrip().startswith("#")]
    except FileNotFoundError:
        return []

def load_cursor(path):
    try:
        with open(path) as f:
            return json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        return {}

def save_cursor(path, data):
    try:
        with open(path, "w") as f:
            json.dump(data, f)
    except OSError:
        pass

allowlist = load_lines(allowlist_path)
cursor = load_cursor(cursor_path)
new_cursor = dict(cursor)

# Walk all files under gstack_home, match against allowlist.
for root, dirs, files in os.walk(gstack_home):
    # Skip .git and .brain-* state files.
    if ".git" in root.split(os.sep):
        continue
    for name in files:
        full = os.path.join(root, name)
        rel = os.path.relpath(full, gstack_home)
        if rel.startswith(".brain-"):
            continue
        matched = any(fnmatch.fnmatchcase(rel, pat) for pat in allowlist)
        if not matched:
            continue
        try:
            st = os.stat(full)
            key = f"{int(st.st_mtime)}:{st.st_size}"
        except OSError:
            continue
        prev = cursor.get(rel)
        if prev != key:
            # Enqueue via the shim (respects sync mode + skip list).
            subprocess.run([enqueue_bin, rel], check=False)
            new_cursor[rel] = key

save_cursor(cursor_path, new_cursor)
PYEOF
}
# -------- dispatch --------
case "${1:-}" in
  --once|"")      subcmd_once ;;
  --status)       subcmd_status ;;
  --skip-file)    shift; subcmd_skip_file "${1:-}" ;;
  --drop-queue)   shift; subcmd_drop_queue "${1:-}" ;;
  --discover-new) subcmd_discover_new ;;
  --help|-h)
    sed -n '2,18p' "$0" | sed 's/^# \{0,1\}//'
    ;;
  *)
    echo "Unknown subcommand: $1" >&2
    echo "Run: gstack-brain-sync --help" >&2
    exit 1
    ;;
esac