Mirror of https://github.com/garrytan/gstack.git (synced 2026-05-07 05:56:41 +02:00)
Merge remote-tracking branch 'origin/main' into garrytan/chrome-extension-ctrl
# Conflicts:
#	browse/src/browser-manager.ts
#	browse/src/cli.ts
@@ -179,14 +179,17 @@ Sidebar agent writes structured messages to `.context/sidebar-inbox/`. Workspace
 **Priority:** P4
 **Depends on:** Chrome extension proving value via sideloading

-### Linux/Windows cookie decryption
+### Linux cookie decryption — PARTIALLY SHIPPED

-**What:** GNOME Keyring / kwallet / DPAPI support for non-macOS cookie import.
+~~**What:** GNOME Keyring / kwallet / DPAPI support for non-macOS cookie import.~~

-**Why:** Cross-platform cookie import. Currently macOS-only (Keychain).
+Linux cookie import shipped in v0.11.11.0 (Wave 3). Supports Chrome, Chromium, Brave, Edge on Linux with GNOME Keyring (libsecret) and "peanuts" fallback. Windows DPAPI support remains deferred.

-**Effort:** L
+**Remaining:** Windows cookie decryption (DPAPI). Needs complete rewrite — PR #64 was 1346 lines and stale.
+
+**Effort:** L (Windows only)
 **Priority:** P4
+**Completed (Linux):** v0.11.11.0 (2026-03-23)

 ## Ship

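The "GNOME Keyring (libsecret) and "peanuts" fallback" wording in the hunk above can be illustrated with a small sketch. It assumes Chromium's Linux convention that encrypted cookie values carry a `v10` prefix when the hardcoded password `peanuts` was used and a `v11` prefix when the password came from the keyring. The function and type names are hypothetical and are not taken from gstack's code.

```typescript
// Sketch only: choosePassword and PasswordChoice are hypothetical names.
// Assumption: "v10" values use the hardcoded password, "v11" values use
// the secret stored in the keyring (Chromium's Linux convention).
type PasswordSource = "keyring" | "peanuts";

interface PasswordChoice {
  source: PasswordSource;
  password: string;
}

const HARDCODED_PASSWORD = "peanuts";

function choosePassword(
  versionPrefix: string,
  keyringSecret: string | null,
): PasswordChoice | null {
  if (versionPrefix === "v10") {
    // No keyring lookup needed for the hardcoded password.
    return { source: "peanuts", password: HARDCODED_PASSWORD };
  }
  if (versionPrefix === "v11" && keyringSecret !== null && keyringSecret !== "") {
    return { source: "keyring", password: keyringSecret };
  }
  // Unknown prefix, or a keyring-protected value with no secret available.
  return null;
}
```

Returning `null` rather than throwing lets a caller skip a single undecryptable cookie and continue with the rest of the import, which is one possible design choice among several.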
@@ -363,17 +366,18 @@ Sidebar agent writes structured messages to `.context/sidebar-inbox/`. Workspace
 **Depends on:** Video recording


-### GitHub Actions eval upload

-**What:** Run eval suite in CI, upload result JSON as artifact, post summary comment on PR.
+### Extend worktree isolation to Claude E2E tests

-**Why:** CI integration catches quality regressions before merge and provides persistent eval records per PR.
+**What:** Add `useWorktree?: boolean` option to `runSkillTest()` so any Claude E2E test can opt into worktree mode for full repo context instead of tmpdir fixtures.

-**Context:** Requires `ANTHROPIC_API_KEY` in CI secrets. Cost is ~$4/run. Eval persistence system (v0.3.6) writes JSON to `~/.gstack-dev/evals/` — CI would upload as GitHub Actions artifacts and use `eval:compare` to post delta comment.
+**Why:** Some Claude E2E tests (CSO audit, review-sql-injection) create minimal fake repos but would produce more realistic results with full repo context. The infrastructure exists (`describeWithWorktree()` in e2e-helpers.ts) — this extends it to the session-runner level.

-**Effort:** M
-**Priority:** P2
-**Depends on:** Eval persistence (shipped in v0.3.6)
+**Context:** WorktreeManager shipped in v0.11.12.0. Currently only Gemini/Codex tests use worktrees. Claude tests use planted-bug fixture repos which are correct for their purpose, but new tests that want real repo context can use `describeWithWorktree()` today. This TODO is about making it even easier via a flag on `runSkillTest()`.
+
+**Effort:** M (human: ~2 days / CC: ~20 min)
+**Priority:** P3
+**Depends on:** Worktree isolation (shipped v0.11.12.0)

 ### E2E model pinning — SHIPPED

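The proposed `useWorktree?: boolean` flag in the hunk above could be wired in roughly as follows. This is a minimal sketch under stated assumptions: the real `runSkillTest()` signature is not shown in this diff, so `SkillTestOptions`, `WorkdirProviders`, and `resolveWorkdir` are hypothetical names, and the two directory factories are injected so the example stays self-contained.

```typescript
// Sketch only: every name here is hypothetical except `useWorktree`,
// which is the option named in the TODO entry.
interface SkillTestOptions {
  name: string;
  // Opt into a full-repo worktree instead of a tmpdir fixture.
  useWorktree?: boolean;
}

interface WorkdirProviders {
  createWorktree: (testName: string) => string;
  createTmpFixture: (testName: string) => string;
}

function resolveWorkdir(
  opts: SkillTestOptions,
  providers: WorkdirProviders,
): string {
  // Default (flag omitted or false) keeps the existing tmpdir behaviour,
  // so current tests would not change unless they opt in.
  return opts.useWorktree === true
    ? providers.createWorktree(opts.name)
    : providers.createTmpFixture(opts.name);
}
```

Keeping the flag optional with a `false` default matches the entry's intent that planted-bug fixture repos remain the norm and worktree mode is an opt-in.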
@@ -514,6 +518,20 @@ Shipped in v0.8.3. Step 8.5 added to `/ship` — after creating the PR, `/ship`
 **Depends on:** gstack-diff-scope (shipped)


+## Codex
+
+### Codex→Claude reverse buddy check skill
+
+**What:** A Codex-native skill (`.agents/skills/gstack-claude/SKILL.md`) that runs `claude -p` to get an independent second opinion from Claude — the reverse of what `/codex` does today from Claude Code.
+
+**Why:** Codex users deserve the same cross-model challenge that Claude users get via `/codex`. Currently the flow is one-way (Claude→Codex). Codex users have no way to get a Claude second opinion.
+
+**Context:** The `/codex` skill template (`codex/SKILL.md.tmpl`) shows the pattern — it wraps `codex exec` with JSONL parsing, timeout handling, and structured output. The reverse skill would wrap `claude -p` with similar infrastructure. Would be generated into `.agents/skills/gstack-claude/` by `gen-skill-docs --host codex`.
+
+**Effort:** M (human: ~2 weeks / CC: ~30 min)
+**Priority:** P1
+**Depends on:** None
+
 ## Completeness

 ### Completeness metrics dashboard
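The "JSONL parsing" part of the wrapper pattern described in the hunk above might look like the sketch below. It is illustrative only: `parseJsonl` and `ParsedJsonl` are hypothetical names. The source states only that the `/codex` template parses JSONL from `codex exec`, so treating `claude -p` output as JSONL is an assumption here.

```typescript
// Sketch only: a tolerant JSONL reader for captured CLI output.
interface ParsedJsonl {
  events: unknown[]; // successfully parsed lines, in order
  skipped: number;   // non-empty lines that were not valid JSON
}

function parseJsonl(raw: string): ParsedJsonl {
  const events: unknown[] = [];
  let skipped = 0;
  for (const line of raw.split("\n")) {
    const trimmed = line.trim(); // also drops a trailing "\r"
    if (trimmed === "") continue;
    try {
      events.push(JSON.parse(trimmed));
    } catch {
      // Count and skip noise such as progress text mixed into the stream.
      skipped += 1;
    }
  }
  return { events, skipped };
}
```

Counting skipped lines instead of failing on the first bad one keeps a single stray log line from discarding an otherwise usable response. A wrapper could still treat a high `skipped` count as an error.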
@@ -564,6 +582,14 @@ Shipped in v0.6.5. TemplateContext in gen-skill-docs.ts bakes skill name into pr

 ## Completed

+### CI eval pipeline (v0.9.9.0)
+- GitHub Actions eval upload on Ubicloud runners ($0.006/run)
+- Within-file test concurrency (test() → testConcurrentIfSelected())
+- Eval artifact upload + PR comment with pass/fail + cost
+- Baseline comparison via artifact download from main
+- EVALS_CONCURRENCY=40 for ~6min wall clock (was ~18min)
+**Completed:** v0.9.9.0
+
 ### Deploy pipeline (v0.9.8.0)
 - /land-and-deploy — merge PR, wait for CI/deploy, canary verification
 - /canary — post-deploy monitoring loop with anomaly detection