- Add cross-reference comments between dashboard-queries.ts computeLeaderboard()
and dashboard/ui.ts renderLeaderboard() so maintainers know to update both
- Add security note in setup-team-dashboard about service-role-key visibility
in pg_cron job table
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Export getValidToken from sync.ts (was private)
- cli-team.ts now uses sync.ts version (supports auto-refresh, was missing)
- Remove unused isTokenExpired/getAuthTokens imports from cli-team
- Remove "Joined" column from formatMembersTable (team_members has no created_at)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
From CEO plan review:
- Edge functions: early guard on missing env vars instead of non-null assert crash
- cli-team: wire isTokenExpired check (was imported but unused)
- Migration 007: CHECK constraint on team slug (a-z0-9 hyphens, 2-50 chars)
- Dashboard: streak badges on leaderboard, repo slug in who's-online,
contextual empty states that teach, 60s refresh (was 30s)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New `gstack team` CLI with create, members, set subcommands.
Migration adds team_settings (admin-only), alert_cooldowns (edge-fn
dedup), and create_team() SECURITY DEFINER RPC for atomic team +
first member creation. 9 tests.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New `gstack eval leaderboard` subcommand pulls team data and renders
weekly stats per contributor. Refactored formatTeamSummary to use
computeVelocity from dashboard-queries (DRY). 4 new tests.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
6 functions: detectRegressions, computeVelocity, computeCostTrend,
computeLeaderboard, computeQATrend, computeEvalTrend. All pure,
no I/O, with division-by-zero guards. 28 tests.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- cli-sync.ts: push-transcript command, show sessions with formatSessionTable(),
upgrade cmdSetup() to interactively create .gstack-sync.json if missing
- bin/gstack-sync: add push-transcript case and help text
- test/lib-llm-summarize.test.ts: 10 tests with mocked fetch (429 retry,
5xx backoff, malformed response, no API key, cache)
- test/lib-transcript-sync.test.ts: 22 tests for parsing, grouping,
session file extraction, marker management, slug resolution
- test/lib-sync-show.test.ts: 4 tests for formatSessionTable
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add session intelligence pipeline for team transcript sync:
- lib/transcript-sync.ts: parse history.jsonl, enrich with Claude session
file data (tools_used, full turn count), sync marker management,
10-concurrent push with 5-concurrent Haiku summarization
- lib/llm-summarize.ts: raw fetch() to Anthropic Messages API (no SDK dep),
retry-after on 429, exponential backoff on 5xx, SHA-based eval-cache
- lib/sync.ts: pushTranscript() and pullTranscripts() following existing patterns
- 006_transcript_sync.sql: unique index on (team_id, session_id) for
idempotent upsert, RLS changed from admin-only to team-wide read
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- 005_sync_heartbeats.sql migration for connectivity testing
- eval:trend --team flag pulls team eval data (graceful fallback)
- docs/TEAM_SYNC_SETUP.md step-by-step setup guide
- Design doc status updated to Phase 2 complete
- 10 new tests for sync show formatting functions
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Extract pushWithSync() helper to eliminate boilerplate across 6 push
functions. Add pushHeartbeat() for connectivity testing. Add push-greptile
to CLI. New commands: gstack-sync test (validates full push/pull flow
via sync_heartbeats table), gstack-sync show (terminal team data
dashboard with summary/evals/ships/retros views). Guard main block
with import.meta.main.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
computeTrends() classifies tests as stable-pass/stable-fail/flaky/
improving/degrading based on pass rate, flip count, and recent streak.
gstack eval trend shows sparkline table with --limit, --tier, --test
filters. Guard CLI main block with import.meta.main to prevent
execution on import.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Extract per-model token usage from resultLine.modelUsage (including
cache tokens and exact API cost), flow CostEntry[] through EvalCollector,
aggregate in finalize(). Extend CostEntry with cache_read_input_tokens,
cache_creation_input_tokens, cost_usd. computeCosts() prefers exact
cost_usd over MODEL_PRICING when available (~4x more accurate with
prompt caching).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- lib/cli-eval.ts: routes to list/compare/summary/push/cost/cache/watch
subcommands. Ports logic from 4 separate scripts into unified entry.
Adds ANSI color for TTY (respects NO_COLOR), --limit flag for list.
- bin/gstack-eval: bash wrapper matching bin/gstack-sync pattern
- package.json: eval:* scripts now point to lib/cli-eval.ts
- supabase/migrations/004_eval_costs.sql: per-model cost tracking + RLS
- docs/eval-result-format.md: public format spec for any language
- test/lib-eval-cli.test.ts: integration tests (spawn CLI subprocess)
including 3 push failure modes (file-not-found, invalid schema,
sync unavailable)
215 tests passing across 13 files.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Cache at ~/.gstack/eval-cache/{suite}/{sha}.json. Compute cache keys
from source file contents + test input via Bun.CryptoHasher SHA256.
Supports read/write/stats/clear/verify operations. EVAL_CACHE=0
skips reads for force-rerun. 16 tests including corrupt JSON handling.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
DRY up eval I/O duplicated across scripts/eval-list.ts,
eval-compare.ts, and eval-summary.ts. Adds EVAL_DIR constant,
formatTimestamp(), listEvalFiles(), loadEvalResults() with
--limit support. 13 new tests.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- lib/sync-config.ts: reads .gstack-sync.json + ~/.gstack/auth.json
- lib/auth.ts: device auth flow (browser OAuth, local HTTP callback)
- lib/sync.ts: Supabase push/pull via raw fetch(), offline queue, cache
- lib/cli-sync.ts: CLI handler for gstack-sync commands
- bin/gstack-sync: bash wrapper (setup, status, push-*, pull, drain)
- .gstack-sync.json.example: template for team setup
Zero new dependencies — uses raw fetch() against PostgREST API.
All sync is non-fatal with 5s timeout and offline queue fallback.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
DRY up atomicWriteSync, readJSON, getGitInfo, getVersion, getRemoteSlug,
and sanitizeForFilename from eval-store.ts, session-runner.ts, and
eval-watch.ts into a shared module.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>