Commit Graph

3 Commits

Author SHA1 Message Date
Garry Tan 9dbaf906cf feat(v1.9.0.0): gbrain-sync — cross-machine gstack memory (#1151)
* feat(gbrain-sync): queue primitives + writer shims

Adds bin/gstack-brain-enqueue (atomic append to sync queue) and
bin/gstack-jsonl-merge (git merge driver, ts-sort with SHA-256 fallback).
Wires one backgrounded enqueue call into learnings-log, timeline-log,
review-log, and developer-profile --migrate. question-log and
question-preferences stay local per Codex v2 decision.

gstack-config gains gbrain_sync_mode (off/artifacts-only/full) and
gbrain_sync_mode_prompted keys, plus GSTACK_HOME env alignment so
tests don't leak into real ~/.gstack/config.yaml.

* feat(gbrain-sync): --once drain + secret scan + push

bin/gstack-brain-sync is the core sync binary. Subcommands: --once
(drain queue, allowlist-filter, privacy-class-filter, secret-scan
staged diff, commit with template, push with fetch+merge retry),
--status, --skip-file <path>, --drop-queue --yes, --discover-new
(cursor-based detection of artifact writes that skip the shim).

Secret regex families: AWS keys, GitHub tokens (ghp_/gho_/ghu_/ghs_/
ghr_/github_pat_), OpenAI sk-, PEM blocks, JWTs, bearer-token-in-JSON.
On hit: unstage, preserve queue, print remediation hint (--skip-file
or edit), exit clean. No daemon — invoked by preamble at skill
boundaries.

* feat(gbrain-sync): init, restore, uninstall, consumer registry

bin/gstack-brain-init: idempotent first-run. git init ~/.gstack/,
.gitignore=*, canonical .brain-allowlist + .brain-privacy-map.json,
pre-commit secret-scan hook (defense-in-depth), merge driver registration
via git config, gh repo create --private OR arbitrary --remote <url>,
initial push, ~/.gstack-brain-remote.txt for new-machine discovery,
GBrain consumer registration via HTTP POST.

bin/gstack-brain-restore: safe new-machine bootstrap. Refuses clobber
of existing allowlisted files, clones to staging, rsync-copies tracked
files, re-registers merge drivers (required — not cloned from remote),
rehydrates consumers.json, prompts for per-consumer tokens.

bin/gstack-brain-uninstall: clean off-ramp. Removes .git + .brain-*
files + consumers.json + config keys. Preserves user data (learnings,
plans, retros, profile). Optional --delete-remote for GitHub repos.

bin/gstack-brain-consumer + bin/gstack-brain-reader (symlink alias):
registry management. Internal 'consumer' term; user-facing 'reader'
per DX review decision.

* feat(gbrain-sync): preamble block — privacy gate + boundary sync

scripts/resolvers/preamble/generate-brain-sync-block.ts emits bash that
runs at every skill invocation:
- Detects ~/.gstack-brain-remote.txt on machines without local .git
  and surfaces a restore-available hint (does NOT auto-run restore).
- Runs gstack-brain-sync --once at skill start to drain any pending
  writes (and at skill end via prose instruction).
- Once-per-day auto-pull (cached via .brain-last-pull) for append-only
  JSONL files.
- Emits BRAIN_SYNC: status line every skill run.

Also emits prose for the host LLM to fire the one-time privacy
stop-gate (full / artifacts-only / off) when gbrain is detected and
gbrain_sync_mode_prompted is false. Wired into preamble.ts composition.

* test(gbrain-sync): 27-test consolidated suite

test/brain-sync.test.ts covers:
- Config: validation, defaults, GSTACK_HOME env isolation
- Enqueue: no-op gates, skip list, concurrent atomicity, JSON escape
- JSONL merge driver: 3-way + ts-sort + SHA-256 fallback
- Init + sync: canonical file creation, merge driver registration,
  push-reject + fetch+merge retry path
- Init refuses different remote (idempotency)
- Cross-machine restore round-trip (machine A write → machine B sees)
- Secret scan across all 6 regex families (AWS, GH, OpenAI, PEM, JWT,
  bearer-JSON). --skip-file unblock remediation
- Uninstall removes sync config, preserves user data
- --discover-new idempotence via mtime+size cursor

Behaviors verified via integration smokes during implementation. Known
follow-up: bun-test 5s default timeout needs 30s wrapper for
spawnSync-heavy tests.

* docs(gbrain-sync): user guide + error lookup + README section

docs/gbrain-sync.md: setup walkthrough, privacy modes, cross-machine
workflow, secret protection, two-machine conflict handling, uninstall,
troubleshooting reference.

docs/gbrain-sync-errors.md: problem/cause/fix index for every
user-visible error. Patterned on Rust's error docs + Stripe's API
error reference.

README.md: 'Cross-machine memory with GBrain sync' section near the
top (discovery moment), plus docs-table entry.

* chore: bump version and changelog (v1.7.0.0)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* chore: regenerate SKILL.md files for gbrain-sync preamble block

Re-runs bun run gen:skill-docs after adding generateBrainSyncBlock
to scripts/resolvers/preamble.ts in a2aa8a07. CI check-freshness
caught the drift. All 36 SKILL.md files regenerated with the new
skill-start bash block + privacy-gate prose + skill-end sync
instructions baked in.

* fix(test): session-awareness reads AskUserQuestion Format from a Tier 2+ SKILL.md

The test was reading ROOT/SKILL.md (browse skill, Tier 1) which never
contained '## AskUserQuestion Format' — that section is only emitted
for Tier 2+ skills by scripts/resolvers/preamble.ts. As a result the
agent was prompted with an empty format guide and only emitted
'RECOMMENDATION' intermittently, making the test flaky.

Pre-existing on main (same ROOT/SKILL.md shape there) — surfaced now
because the agent run didn't hit the RECOMMENDATION/recommend/option a
fallback strings in this particular attempt.

Fix: read from office-hours/SKILL.md (Tier 3, always has the section)
with a fallback that scans for the first top-level skill dir whose
SKILL.md contains the header. Future template moves won't break this
test again.

* chore: bump to v1.9.0.0 for gbrain-sync landing

Changes just the VERSION + package.json + CHANGELOG header (1.7.0.0 → 1.9.0.0
and date 2026-04-22 → 2026-04-23). No code changes. User call: land gbrain-sync
as a bigger-signal release above main's 1.6.4.0, skipping 1.8.0.0.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-23 17:54:54 -07:00
Garry Tan 7e96fe299b fix: security wave 3 — 12 fixes, 7 contributors (v0.16.4.0) (#988)
* fix(security): validateOutputPath symlink bypass — check file-level symlinks

validateOutputPath() previously only resolved symlinks on the parent directory.
A symlink at /tmp/evil.png → /etc/crontab passed the parent check (parent is
/tmp, which is safe) but the write followed the symlink outside safe dirs.

Add lstatSync() check: if the target file exists and is a symlink, resolve
through it and verify the real target is within SAFE_DIRECTORIES. ENOENT
(file doesn't exist yet) falls through to the existing parent-dir check.

Closes #921

Co-Authored-By: Yunsu <Hybirdss@users.noreply.github.com>

* fix(security): shell injection in bin/ scripts — use env vars instead of interpolation

gstack-settings-hook interpolated $SETTINGS_FILE directly into bun -e
double-quoted blocks. A path containing quotes or backticks breaks the JS
string context, enabling arbitrary code execution.

Replace direct interpolation with environment variables (process.env).
Same fix applied to gstack-team-init which had the same pattern.

Systematic audit confirmed only these two scripts were vulnerable — all
other bin/ scripts already use stdin piping or env vars.

Closes #858

Co-Authored-By: Gus <garagon@users.noreply.github.com>

* fix(security): cookie-import path validation bypass + hardcoded /tmp

Two fixes:
1. cookie-import relative path bypass (#707): path.isAbsolute() gated the
   entire validation, so relative paths like "sensitive-file.json" bypassed
   the safe-directory check entirely. Now always resolves to absolute path
   with realpathSync for symlink resolution, matching validateOutputPath().

2. Hardcoded /tmp in cookie-import-browser (#708): openDbFromCopy used
   /tmp directly instead of os.tmpdir(), breaking Windows support.

Also adds explicit imports for SAFE_DIRECTORIES and isPathWithin in
write-commands.ts (previously resolved implicitly through bundler).

Closes #852

Co-Authored-By: Toby Morning <urbantech@users.noreply.github.com>

* fix(security): redact form fields with sensitive names, not just type=password

Form redaction only applied to type="password" fields. Hidden and text
fields named csrf_token, api_key, session_id, etc. were exposed unredacted
in LLM context, leaking secrets.

Extend redaction to check field name and id against sensitive patterns:
token, secret, key, password, credential, auth, jwt, session, csrf, sid,
api_key. Uses the same pattern style as SENSITIVE_COOKIE_NAME.

Closes #860

Co-Authored-By: Gus <garagon@users.noreply.github.com>

* fix(security): restrict session file permissions to owner-only

Design session files written to /tmp with default umask (0644) were
world-readable on shared systems. Sessions contain design prompts and
feedback history.

Set mode 0o600 (owner read/write only) on both create and update paths.

Closes #859

Co-Authored-By: Gus <garagon@users.noreply.github.com>

* fix(security): enforce frozen lockfile during setup

bun install without --frozen-lockfile resolves ^semver ranges from npm on
every run. If an attacker publishes a compromised compatible version of any
dependency, the next ./setup pulls it silently.

Add --frozen-lockfile with fallback to plain install (for fresh clones
where bun.lock may not exist yet). Matches the pattern already used in
the .agents/ generation block (line 237).

Closes #614

Co-Authored-By: Alberto Martinez <halbert04@users.noreply.github.com>

* fix: remove duplicate recursive chmod on /tmp in Dockerfile.ci

chmod -R 1777 /tmp recursively sets sticky bit on files (no defined
behavior), not just the directory. Deduplicate to single chmod 1777 /tmp.

Closes #747

Co-Authored-By: Maksim Soltan <Gonzih@users.noreply.github.com>

* fix(security): learnings input validation + cross-project trust gate

Three fixes to the learnings system:

1. Input validation in gstack-learnings-log: type must be from allowed list,
   key must be alphanumeric, confidence must be 1-10 integer, source must
   be from allowed list. Prevents injection via malformed fields.

2. Prompt injection defense: insight field checked against 10 instruction-like
   patterns (ignore previous, system:, override, etc.). Rejected with clear
   error message.

3. Cross-project trust gate in gstack-learnings-search: AI-generated learnings
   from other projects are filtered out. Only user-stated learnings cross
   project boundaries. Prevents silent prompt injection across codebases.

Also adds trusted field (true for user-stated source, false for AI-generated)
to enable the trust gate at read time.

Closes #841

Co-Authored-By: Ziad Al Sharif <Ziadstr@users.noreply.github.com>

* feat(security): track cookie-imported domains and scope cookie imports

Foundation for origin-pinned JS execution (#616). Tracks which domains
cookies were imported from so the JS/eval commands can verify execution
stays within imported origins.

Changes:
- BrowserManager: new cookieImportedDomains Set with track/get/has methods
- cookie-import: tracks imported cookie domains after addCookies
- cookie-import-browser: tracks domains on --domain direct import
- cookie-import-browser --all: new explicit opt-in for all-domain import
  (previously implicit behavior, now requires deliberate flag)

Closes #615

Co-Authored-By: Alberto Martinez <halbert04@users.noreply.github.com>

* feat(security): pin JS/eval execution to cookie-imported origins

When cookies have been imported for specific domains, block JS execution
on pages whose origin doesn't match. Prevents the attack chain:
1. Agent imports cookies for github.com
2. Prompt injection navigates to attacker.com
3. Agent runs js document.cookie → exfiltrates github cookies

assertJsOriginAllowed() checks the current page hostname against imported
cookie domains with subdomain matching (.github.com allows api.github.com).
When no cookies are imported, all origins allowed (nothing to protect).
about:blank and data: URIs are allowed (no cookies at risk).

Depends on #615 (cookie domain tracking).

Closes #616

Co-Authored-By: Alberto Martinez <halbert04@users.noreply.github.com>

* feat(security): add persistent command audit log

Append-only JSONL audit trail for all browse server commands. Unlike
in-memory ring buffers, the audit log persists across restarts and is
never truncated. Each entry records: timestamp, command, args (truncated
to 200 chars), page origin, duration, status, error (truncated to 300
chars), hasCookies flag, connection mode.

All writes are best-effort — audit failures never block command execution.
Log stored at ~/.gstack/.browse/browse-audit.jsonl.

Closes #617

Co-Authored-By: Alberto Martinez <halbert04@users.noreply.github.com>

* fix(security): block hex-encoded IPv4-mapped IPv6 metadata bypass

URL constructor normalizes ::ffff:169.254.169.254 to ::ffff:a9fe:a9fe
(hex form), which was not in the blocklist. Similarly, ::169.254.169.254
normalizes to ::a9fe:a9fe.

Add both hex-encoded forms to BLOCKED_METADATA_HOSTS so they're caught
by the direct hostname check in validateNavigationUrl.

Closes #739

Co-Authored-By: Osman Mehmood <mehmoodosman@users.noreply.github.com>

* chore: bump version and changelog (v0.16.4.0)

Security wave 3: 12 fixes, 7 contributors.
Cookie origin pinning, command audit log, domain tracking.
Symlink bypass, path validation, shell injection, form redaction,
learnings injection, IPv6 SSRF, session permissions, frozen lockfile.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Yunsu <Hybirdss@users.noreply.github.com>
Co-authored-by: Gus <garagon@users.noreply.github.com>
Co-authored-by: Toby Morning <urbantech@users.noreply.github.com>
Co-authored-by: Alberto Martinez <halbert04@users.noreply.github.com>
Co-authored-by: Maksim Soltan <Gonzih@users.noreply.github.com>
Co-authored-by: Ziad Al Sharif <Ziadstr@users.noreply.github.com>
Co-authored-by: Osman Mehmood <mehmoodosman@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 07:49:37 -10:00
Garry Tan ae0a9ad195 feat: GStack Learns — per-project self-learning infrastructure (v0.13.4.0) (#622)
* feat: learnings + confidence resolvers — cross-skill memory infrastructure

Three new resolvers for the self-learning system:
- LEARNINGS_SEARCH: tells skills to load prior learnings before analysis
- LEARNINGS_LOG: tells skills to capture discoveries after completing work
- CONFIDENCE_CALIBRATION: adds 1-10 confidence scoring to all review findings

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: learnings bin scripts — append-only JSONL read/write

gstack-learnings-log: validates JSON, auto-injects timestamp, appends to
~/.gstack/projects/$SLUG/learnings.jsonl. Append-only (no mutation).

gstack-learnings-search: reads/filters/dedupes learnings with confidence
decay (observed/inferred lose 1pt/30d), cross-project discovery, and
"latest winner" resolution per key+type.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: learnings count in preamble output

Every skill now prints "LEARNINGS: N entries loaded" during preamble,
making the compounding loop visible to the user.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: integrate learnings + confidence into 9 skill templates

Add {{LEARNINGS_SEARCH}}, {{LEARNINGS_LOG}}, and {{CONFIDENCE_CALIBRATION}}
placeholders to review, ship, plan-eng-review, plan-ceo-review, office-hours,
investigate, retro, and cso templates. Regenerated all SKILL.md files.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: /learn skill — manage project learnings

New skill for reviewing, searching, pruning, and exporting what gstack
has learned across sessions. Commands: /learn, /learn search, /learn prune,
/learn export, /learn stats, /learn add.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: self-learning roadmap — 5-release design doc

Covers: R1 GStack Learns (v0.14), R2 Review Army (v0.15), R3 Smart Ceremony
(v0.16), R4 /autoship (v0.17), R5 Studio (v0.18). Inspired by Compound
Engineering, adapted to GStack's architecture.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: learnings bin script unit tests — 13 tests, free

Tests gstack-learnings-log (valid/invalid JSON, timestamp injection,
append-only) and gstack-learnings-search (dedup, type/query/limit filters,
confidence decay, user-stated no-decay, malformed JSONL skip).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.13.4.0)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: learnings resolver + bin script edge case tests — 21 new tests, free

Adds gen-skill-docs coverage for LEARNINGS_SEARCH, LEARNINGS_LOG, and
CONFIDENCE_CALIBRATION resolvers. Adds bin script edge cases: timestamp
preservation, special characters, files array, sort order, type grouping,
combined filtering, missing fields, confidence floor at 0.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: sync package.json version with VERSION file (0.13.4.0)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: gitignore .factory/ — generated output, not source

Same pattern as .claude/skills/ and .agents/. These SKILL.md files are
generated from .tmpl templates by gen:skill-docs --host factory.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: /learn E2E — seed 3 learnings, verify agent surfaces them

Seeds N+1 query pattern, stale cache pitfall, and rubocop preference
into learnings.jsonl, then runs /learn and checks that at least 2/3
appear in the agent's output. Gate tier, ~$0.25/run.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 17:02:01 -06:00