mirror of
https://github.com/garrytan/gstack.git
synced 2026-05-02 03:35:09 +02:00
dbd7aee5b6b5bb41b17ae7747568e2fda8a91d77
3 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
562a67503a |
feat: Session Intelligence Layer — /checkpoint + /health + context recovery (v0.15.0.0) (#733)
* feat: session timeline binaries (gstack-timeline-log + gstack-timeline-read)
New binaries for the Session Intelligence Layer. gstack-timeline-log appends
JSONL events to ~/.gstack/projects/$SLUG/timeline.jsonl. gstack-timeline-read
reads, filters, and formats timeline data for /retro consumption.
Timeline is local-only project intelligence, never sent anywhere. Always-on
regardless of telemetry setting.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: preamble context recovery + timeline events + predictive suggestions
Layers 1-3 of the Session Intelligence Layer:
- Timeline start/complete events injected into every skill via preamble
- Context recovery (tier 2+): lists recent CEO plans, checkpoints, reviews
- Cross-session injection: LAST_SESSION and LATEST_CHECKPOINT for branch
- Predictive skill suggestion from recent timeline patterns
- Welcome back message synthesis
- Routing rules for /checkpoint and /health
Timeline writes are NOT gated by telemetry (local project intelligence).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: /checkpoint + /health skills (Layers 4-5)
/checkpoint: save/resume/list working state snapshots. Supports cross-branch
listing for Conductor workspace handoff. Session duration tracking.
/health: code quality scorekeeper. Wraps project tools (tsc, biome, knip,
shellcheck, tests), computes composite 0-10 score, tracks trends over time.
Auto-detects tools or reads from CLAUDE.md ## Health Stack.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: regenerate SKILL.md files + add timeline tests
9 timeline tests (all passing) mirroring learnings.test.ts pattern.
All 34 SKILL.md files regenerated with new preamble (context recovery,
timeline events, routing rules for /checkpoint and /health).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: bump version and changelog (v0.15.0.0)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: update self-learning roadmap post-Session Intelligence
R1-R3 marked shipped with actual versions. R4 becomes Adaptive Ceremony
(trust as separate policy engine, scope-aware, gradual degradation).
R5 becomes /autoship (resumable state machine, not linear chain).
R6-R7 unbundled from old R5. Added State Systems reference, Risk Register
(Codex-reviewed), and validation metrics for R4.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test: E2E tests for Session Intelligence (timeline, recovery, checkpoint)
3 gate-tier E2E tests:
- timeline-event-flow: binary data flow round-trip (no LLM)
- context-recovery-artifacts: seeded artifacts appear in preamble
- checkpoint-save-resume: checkpoint file created with YAML frontmatter
Also fixes package.json version sync (0.14.6.0 → 0.15.0.0).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
||
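The append-only JSONL timeline described in commit 562a67503a can be illustrated with a small sketch. This is not the code of gstack-timeline-log or gstack-timeline-read: the `TimelineEvent` fields, the event names `started`/`completed`, and the helper functions are all hypothetical, and the example works on an in-memory string rather than on `timeline.jsonl` so that it stays self-contained.

```typescript
// Hypothetical event shape: the commit message does not show the real schema.
interface TimelineEvent {
  ts: string;                      // ISO-8601 timestamp
  skill: string;                   // e.g. "review", "ship"
  event: "started" | "completed";  // assumed names for the start/complete events
  branch?: string;
}

// Serialize one event as a single JSONL line (one JSON object plus a newline).
function toJsonlLine(event: TimelineEvent): string {
  return JSON.stringify(event) + "\n";
}

// Append-only: a new line goes at the end, existing content is never rewritten.
function appendEvent(existing: string, event: TimelineEvent): string {
  return existing + toJsonlLine(event);
}

// Parse JSONL text and keep only events that match every field given in `filter`.
function readEvents(jsonl: string, filter: Partial<TimelineEvent> = {}): TimelineEvent[] {
  const keys = Object.keys(filter) as (keyof TimelineEvent)[];
  return jsonl
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as TimelineEvent)
    .filter((e) => keys.every((k) => e[k] === filter[k]));
}

// Usage: three events appended in order, then filtered by event type.
let timelineText = "";
timelineText = appendEvent(timelineText, { ts: "2026-01-01T10:00:00Z", skill: "review", event: "started", branch: "main" });
timelineText = appendEvent(timelineText, { ts: "2026-01-01T10:05:00Z", skill: "review", event: "completed", branch: "main" });
timelineText = appendEvent(timelineText, { ts: "2026-01-01T11:00:00Z", skill: "ship", event: "started", branch: "main" });

const completedEvents = readEvents(timelineText, { event: "completed" });
console.log(completedEvents.map((e) => `${e.ts} ${e.skill} ${e.event}`).join("\n"));
```

One line per event means a writer only ever appends to the file, which fits the commit's description of a local log that many skills write to from their preamble.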
|
|
a4a181ca92 |
feat: Review Army — parallel specialist reviewers for /review (v0.14.3.0) (#692)
* feat: extend gstack-diff-scope with SCOPE_MIGRATIONS, SCOPE_API, SCOPE_AUTH
Three new scope signals for Review Army specialist activation:
- SCOPE_MIGRATIONS: db/migrate/, prisma/migrations/, alembic/, *.sql
- SCOPE_API: *controller*, *route*, *endpoint*, *.graphql, openapi.*
- SCOPE_AUTH: *auth*, *session*, *jwt*, *oauth*, *permission*, *role*
* feat: add 7 specialist checklist files for Review Army
- testing.md (always-on): coverage gaps, flaky patterns, security enforcement
- maintainability.md (always-on): dead code, DRY, stale comments
- security.md (conditional): OWASP deep analysis, auth bypass, injection
- performance.md (conditional): N+1 queries, bundle impact, complexity
- data-migration.md (conditional): reversibility, lock duration, backfill
- api-contract.md (conditional): breaking changes, versioning, error format
- red-team.md (conditional): adversarial analysis, cross-cutting concerns
All use standard header with JSON output schema and NO FINDINGS fallback.
* feat: Review Army resolver — parallel specialist dispatch + merge
New resolver in review-army.ts generates template prose for:
- Stack detection and specialist selection
- Parallel Agent tool dispatch with learning-informed prompts
- JSON finding collection, fingerprint dedup, consensus highlighting
- PR quality score computation
- Red Team conditional dispatch
Registered as REVIEW_ARMY in resolvers/index.ts.
* refactor: restructure /review template for Review Army
- Replace Steps 4-4.75 with CRITICAL pass + {{REVIEW_ARMY}}
- Remove {{DESIGN_REVIEW_LITE}} and {{TEST_COVERAGE_AUDIT_REVIEW}}
(subsumed into Design and Testing specialists respectively)
- Extract specialist-covered categories from checklist.md
- Keep CRITICAL + uncovered INFORMATIONAL in main agent pass
* test: Review Army — 14 diff-scope tests + 7 E2E tests
- test/diff-scope.test.ts: 14 tests for all 9 scope signals
- test/skill-e2e-review-army.test.ts: 7 E2E tests
Gate: migration safety, N+1 detection, delivery audit,
quality score, JSON findings
Periodic: red team, consensus
- Updated gen-skill-docs tests for new review structure
- Added touchfile entries and tier classifications
* docs: update SELF_LEARNING_V0.md with Release 2 status + Release 2.5
Mark Release 2 (Review Army) as in-progress. Add Release 2.5 for
deferred expansions (E1 adaptive gating, E3 test stubs, E5 cross-review
dedup, E7 specialist tracking).
* chore: bump version and changelog (v0.14.3.0)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
|
||
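The three scope signals added to gstack-diff-scope in commit a4a181ca92 map changed file paths to flags that decide which conditional specialists run. A minimal sketch of that mapping follows, using the path patterns quoted in the commit message. The function name `detectScopes`, the regular-expression translation of the globs, and the case-insensitive matching are assumptions made for illustration; the real tool may be implemented quite differently.

```typescript
type ScopeSignal = "SCOPE_MIGRATIONS" | "SCOPE_API" | "SCOPE_AUTH";

// Glob patterns from the commit message, hand-translated to regular expressions.
// Case-insensitive matching is an assumption, not something the commit states.
const SCOPE_PATTERNS: Record<ScopeSignal, RegExp[]> = {
  SCOPE_MIGRATIONS: [
    /(^|\/)db\/migrate\//i,        // db/migrate/
    /(^|\/)prisma\/migrations\//i, // prisma/migrations/
    /(^|\/)alembic\//i,            // alembic/
    /\.sql$/i,                     // *.sql
  ],
  SCOPE_API: [
    /controller/i,                 // *controller*
    /route/i,                      // *route*
    /endpoint/i,                   // *endpoint*
    /\.graphql$/i,                 // *.graphql
    /(^|\/)openapi\.[^/]*$/i,      // openapi.*
  ],
  SCOPE_AUTH: [/auth/i, /session/i, /jwt/i, /oauth/i, /permission/i, /role/i],
};

// A signal is set when at least one changed file matches at least one of its patterns.
function detectScopes(changedFiles: string[]): Record<ScopeSignal, boolean> {
  const result: Record<ScopeSignal, boolean> = {
    SCOPE_MIGRATIONS: false,
    SCOPE_API: false,
    SCOPE_AUTH: false,
  };
  for (const signal of Object.keys(SCOPE_PATTERNS) as ScopeSignal[]) {
    result[signal] = changedFiles.some((file) =>
      SCOPE_PATTERNS[signal].some((pattern) => pattern.test(file)),
    );
  }
  return result;
}

// Usage: a Rails-style diff touching a migration and a sessions controller.
const scopes = detectScopes([
  "db/migrate/001_add_users.rb",
  "app/controllers/sessions_controller.rb",
]);
console.log(scopes);
```

Substring patterns such as `*role*` or `*route*` are deliberately broad, so a diff may activate a specialist it does not strictly need; presumably that is a cheaper failure mode than missing an auth or API change.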
|
|
ae0a9ad195 |
feat: GStack Learns — per-project self-learning infrastructure (v0.13.4.0) (#622)
* feat: learnings + confidence resolvers — cross-skill memory infrastructure
Three new resolvers for the self-learning system:
- LEARNINGS_SEARCH: tells skills to load prior learnings before analysis
- LEARNINGS_LOG: tells skills to capture discoveries after completing work
- CONFIDENCE_CALIBRATION: adds 1-10 confidence scoring to all review findings
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: learnings bin scripts — append-only JSONL read/write
gstack-learnings-log: validates JSON, auto-injects timestamp, appends to
~/.gstack/projects/$SLUG/learnings.jsonl. Append-only (no mutation).
gstack-learnings-search: reads/filters/dedupes learnings with confidence
decay (observed/inferred lose 1pt/30d), cross-project discovery, and
"latest winner" resolution per key+type.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: learnings count in preamble output
Every skill now prints "LEARNINGS: N entries loaded" during preamble,
making the compounding loop visible to the user.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: integrate learnings + confidence into 9 skill templates
Add {{LEARNINGS_SEARCH}}, {{LEARNINGS_LOG}}, and {{CONFIDENCE_CALIBRATION}}
placeholders to review, ship, plan-eng-review, plan-ceo-review, office-hours,
investigate, retro, and cso templates. Regenerated all SKILL.md files.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: /learn skill — manage project learnings
New skill for reviewing, searching, pruning, and exporting what gstack
has learned across sessions. Commands: /learn, /learn search, /learn prune,
/learn export, /learn stats, /learn add.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: self-learning roadmap — 5-release design doc
Covers: R1 GStack Learns (v0.14), R2 Review Army (v0.15), R3 Smart Ceremony
(v0.16), R4 /autoship (v0.17), R5 Studio (v0.18).
Inspired by Compound Engineering, adapted to GStack's architecture.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test: learnings bin script unit tests — 13 tests, free
Tests gstack-learnings-log (valid/invalid JSON, timestamp injection,
append-only) and gstack-learnings-search (dedup, type/query/limit filters,
confidence decay, user-stated no-decay, malformed JSONL skip).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: bump version and changelog (v0.13.4.0)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test: learnings resolver + bin script edge case tests — 21 new tests, free
Adds gen-skill-docs coverage for LEARNINGS_SEARCH, LEARNINGS_LOG, and
CONFIDENCE_CALIBRATION resolvers. Adds bin script edge cases: timestamp
preservation, special characters, files array, sort order, type grouping,
combined filtering, missing fields, confidence floor at 0.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: sync package.json version with VERSION file (0.13.4.0)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: gitignore .factory/ — generated output, not source
Same pattern as .claude/skills/ and .agents/. These SKILL.md files are
generated from .tmpl templates by gen:skill-docs --host factory.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test: /learn E2E — seed 3 learnings, verify agent surfaces them
Seeds N+1 query pattern, stale cache pitfall, and rubocop preference into
learnings.jsonl, then runs /learn and checks that at least 2/3 appear in
the agent's output. Gate tier, ~$0.25/run.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
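The search behaviour described in commit ae0a9ad195 (confidence decay of 1 point per 30 days for observed/inferred learnings, no decay for user-stated ones, a floor at 0, skipping malformed JSONL, and "latest winner" resolution per key+type) can be sketched as below. The `Learning` field names, the use of whole 30-day periods, and all function names are assumptions for illustration; gstack-learnings-search itself may compute these differently.

```typescript
// Hypothetical record shape: field names are not taken from the repository.
interface Learning {
  key: string;
  type: string;
  source: "observed" | "inferred" | "user-stated";
  confidence: number; // 1-10 at write time
  ts: string;         // ISO-8601 timestamp
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Observed/inferred entries lose 1 point per full 30 days (assumed rounding),
// never dropping below 0. User-stated entries keep their confidence.
function effectiveConfidence(entry: Learning, now: Date): number {
  if (entry.source === "user-stated") return entry.confidence;
  const ageDays = (now.getTime() - new Date(entry.ts).getTime()) / MS_PER_DAY;
  const decay = Math.max(0, Math.floor(ageDays / 30));
  return Math.max(0, entry.confidence - decay);
}

// Parse JSONL, silently skipping blank and malformed lines.
function parseLearnings(jsonl: string): Learning[] {
  const entries: Learning[] = [];
  for (const line of jsonl.split("\n")) {
    if (line.trim() === "") continue;
    try {
      entries.push(JSON.parse(line) as Learning);
    } catch {
      // Malformed line: ignore it and keep reading.
    }
  }
  return entries;
}

// "Latest winner": for each key+type pair, keep only the newest entry.
function latestWinners(entries: Learning[]): Learning[] {
  const newest = new Map<string, Learning>();
  for (const entry of entries) {
    const id = `${entry.key}|${entry.type}`;
    const current = newest.get(id);
    if (!current || Date.parse(entry.ts) >= Date.parse(current.ts)) {
      newest.set(id, entry);
    }
  }
  return [...newest.values()];
}

// Usage: two versions of one learning plus a malformed line.
const sampleJsonl = [
  '{"key":"n-plus-one","type":"pitfall","source":"observed","confidence":8,"ts":"2026-01-01T00:00:00Z"}',
  "not json at all",
  '{"key":"n-plus-one","type":"pitfall","source":"user-stated","confidence":9,"ts":"2026-02-01T00:00:00Z"}',
].join("\n");

const winners = latestWinners(parseLearnings(sampleJsonl));
const asOf = new Date("2026-04-01T00:00:00Z");
console.log(winners.map((w) => `${w.key} (${w.source}): ${effectiveConfidence(w, asOf)}`));
```

Because the log is append-only, corrections are written as new entries rather than edits, and resolution happens at read time; that is why the newest entry per key+type is the one that counts.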