mirror of
https://github.com/garrytan/gstack.git
synced 2026-06-17 15:20:11 +02:00
chore: bump version and changelog (v1.40.0.0)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This commit is contained in:
@@ -75,6 +75,16 @@ Invoke them by name (e.g., `/office-hours`).
|
||||
| `/setup-browser-cookies` | Import cookies from your real browser for authenticated testing. |
|
||||
| `/pair-agent` | Pair a remote AI agent (OpenClaw, Codex, etc.) with your browser. |
|
||||
|
||||
### iOS device-farm (v1.40.0.0+)
|
||||
|
||||
| Skill | What it does |
|
||||
|-------|-------------|
|
||||
| `/ios-qa` | Live-device iOS QA via USB CoreDevice tunnel + embedded StateServer. Optionally exposes the device over Tailscale so remote agents can drive it. |
|
||||
| `/ios-fix` | Autonomous iOS bug fixer with regression snapshot capture. |
|
||||
| `/ios-design-review` | Designer's-eye QA on a real iPhone — 10-dimension Apple HIG rubric. |
|
||||
| `/ios-clean` | Convenience: strip DebugBridge + #if DEBUG wiring before a Release build. |
|
||||
| `/ios-sync` | Regenerate the iOS debug bridge against the latest upstream templates. |
|
||||
|
||||
### Safety + scoping
|
||||
|
||||
| Skill | What it does |
|
||||
|
||||
@@ -1,5 +1,73 @@
|
||||
# Changelog
|
||||
|
||||
## [1.40.0.0] - 2026-05-17
|
||||
|
||||
## **iOS QA on a real iPhone — no XCTest, no WebDriverAgent, no simulators.**
|
||||
## **Bring your own Mac mini + Tailscale and you have a DIY device farm any agent can drive.**
|
||||
|
||||
Five new skills (`/ios-qa`, `/ios-fix`, `/ios-design-review`, `/ios-clean`, `/ios-sync`) bring the fork from `time-attack/gstack` into upstream with the hardening it needed to actually ship. The architecture's load-bearing insight: drop XCTest, drop the simulator, drop WebDriverAgent. Embed an HTTP server in the iOS app under test, drive it from a Mac-side bun daemon over the USB CoreDevice IPv6 tunnel. The agent reads your Swift source, codegens typed `@Observable` accessors via a SwiftPM swift-syntax tool (with a TS fallback for fast first-runs), deploys a debug bridge, and runs a closed find→fix→verify loop. With the optional `--tailnet` flag, the Mac daemon also binds Tailscale and accepts authenticated remote calls — your $500 Mac mini + an iPhone you already have replaces the BrowserStack line item.
|
||||
|
||||
### The numbers that matter
|
||||
|
||||
Source: 67 daemon unit/integration tests + 20 codegen tests + 8 high-level E2E tests, all running against a real bun process and a stubbed iOS StateServer.
|
||||
|
||||
| Surface | Fork as-is | Shipped |
|
||||
|---|---|---|
|
||||
| StateServer bind | `0.0.0.0:9999`, zero auth | `::1` + `127.0.0.1` only; bearer-token gate; boot token rotates within ~5s of daemon spawn so anything scraping `os_log` past then sees a dead credential |
|
||||
| Release-build safety | none (any `#if DEBUG` mistake ships the bridge) | structural `Package.swift` `.when(configuration: .debug)` + CI `swift build -c release` invariant test that fails if the `DebugBridge` symbol appears |
|
||||
| Codegen failure modes covered | regex breaks on computed properties, generics, multi-line types | swift-syntax AST (production), strict TS regex fallback for tests; 3 dedicated fixtures pin the known failure shapes |
|
||||
| Multi-agent device contention | none | per-device session lock with sliding timeout on mutations only; concurrent `/session/acquire` race test |
|
||||
| Remote control | not in scope | Tailscale identity-gated `/auth/mint`; capability tiers (observe/interact/mutate/restore); 1h default session TTL (24h cap); audit log of every authenticated mutating request; hashed-identity attempts log |
|
||||
| Hardcoded paths | 3 `/Users/sinmat/.gstack/...` paths | none — all paths use `$HOME` / `os.homedir()` |
|
||||
| Test coverage | none | 95 tests covering session-lock concurrency, snapshot/restore atomicity with schema-hash gate, identity canonicalization (user / tag / node-key), capability tier enforcement, rate limits, body-size limits, boot-token leak proofs, tailnet fail-closed probe, CoreDevice tunnel reconnect plumbing, cache-key composite (Swift version + tool git rev + source content + platform triple) |
|
||||
|
||||
### What this means for iOS developers
|
||||
|
||||
You can ship a SwiftUI app, add the `DebugBridge` SPM dep, run `/ios-qa`, and watch an agent drive your phone — taps, swipes, state writes, the whole loop. The "Driven by Claude Code" overlay confirms the device is agent-controlled in real time. Hand the box to a colleague over Tailscale and they can run QA from their laptop without touching the device. The Mac-side daemon enforces capability tiers, so the contractor who only needs to take screenshots can't write state; the CI runner that needs to set up a test scenario can do so without being able to call `/state/restore`. The audit log gives you per-request forensics. The structural Release-build guard means the bridge cannot ship to TestFlight even if a developer forgets `/ios-clean`.
|
||||
|
||||
### Itemized changes
|
||||
|
||||
#### Added
|
||||
|
||||
- **`/ios-qa`** (770-line SKILL.md.tmpl) — live-device QA flow with warm-start session cache, on-demand daemon spawn, Tailscale opt-in, demo + recording modes, full failure-mode + recovery matrix.
|
||||
- **`/ios-fix`** — autonomous bug fixer that captures a reproducing `/state/snapshot` BEFORE editing source, then rebuilds + redeploys + verifies. Snapshot becomes a regression test fixture.
|
||||
- **`/ios-design-review`** — 10-dimension Apple HIG audit on a real device. 0-10 scores per dimension with "what would make it a 10" framing, mirroring `/plan-design-review`'s rubric for browser.
|
||||
- **`/ios-clean`** — convenience wrapper that strips `DebugBridge` SPM + `#if DEBUG` wiring. Explicitly NOT the safety-critical path — the structural Release-build guard in `Package.swift` is.
|
||||
- **`/ios-sync`** — regenerates accessors against latest upstream gstack templates. Run after upgrading gstack or adding new `@Observable` classes.
|
||||
- `ios-qa/templates/StateServer.swift.template` — dual-stack loopback bind (`::1` + `127.0.0.1`), boot token rotation, per-device session lock with mutation-only sliding window, snapshot/restore with schema envelope (`_schema_version` + `_app_build_id` + `_accessor_hash`), validate-then-apply atomicity via a single canonical-state-struct assignment, 1MB body cap.
|
||||
- `ios-qa/templates/DebugOverlay.swift.template` — animated brand-colored border, agent attribution chip (`X-Agent-Identity` header, display-only, never trusted for auth), optional recording-mode watermark for screencasts.
|
||||
- `ios-qa/templates/Package.swift.template` — DebugBridge target gated `.when(configuration: .debug)`. SwiftPM refuses to link in Release config.
|
||||
- `ios-qa/daemon/` — Mac-side bun/TS daemon. Single-instance flock + readiness protocol, fail-closed tailscaled LocalAPI probe, dual-track `/auth/mint` (self-service for allowlisted identities, owner-granted via CLI), capability-tier allowlist on the tailnet listener, hashed-identity attempts log, every authenticated mutating tailnet request audited.
|
||||
- `ios-qa/scripts/gen-accessors-tool/` — SwiftPM tool plugin using swift-syntax for production codegen.
|
||||
- `ios-qa/scripts/gen-accessors.ts` — TS fallback for fast first-runs and CI. Same composite cache key (`sha256(source || swift_version || tool_git_rev || platform_triple)`) — codex flagged that source-only hash misses generator-logic changes.
|
||||
- `ios-qa/docs/tailscale-acl-example.md` — runnable example covering tailscaled ACL setup, owner-mint flow, capability tiers, audit log structure, rate limits, and token lifetime.
|
||||
- `test/skill-e2e-ios.test.ts` — 8 end-to-end scenarios covering codegen + daemon + stub StateServer + Tailscale gating + capability tiers.
|
||||
- 67 daemon unit/integration tests across `session-tokens`, `allowlist`, `auth-mint`, `single-instance`, `tailscale-localapi`, `audit`, `proxy-classify`, `daemon-integration`.
|
||||
- 20 codegen tests in `ios-qa/scripts/gen-accessors.test.ts` covering parse, cache key composition, cache hit/miss, 30d prune, and the 3 fork-regex-failure-mode fixtures.
|
||||
|
||||
#### Changed
|
||||
|
||||
- `test/helpers/touchfiles.ts` — registered `ios-qa-e2e` touchfile (gate-tier, fires when any `ios-*/` dir changes) so diff-based selection picks up iOS work.
|
||||
- `AGENTS.md`, `docs/skills.md` — added "iOS device-farm" sections covering the five new skills.
|
||||
|
||||
#### Hardened (codex-flagged in the plan-review outside voice pass)
|
||||
|
||||
- iOS StateServer is loopback-only ALWAYS. Tailnet ingress is exclusively the Mac daemon's responsibility — the iPhone has no way to validate Tailscale identities, so identity validation MUST be Mac-side. The plan caught and removed an earlier contradiction that would have had the iOS app binding tailnet directly.
|
||||
- Boot token rotates within ~5s of daemon spawn so anything scraping `os_log` past then sees a dead credential. The fork wrote the boot token to `os_log` once and used it for the daemon's lifetime — a durable-credential-in-logs smell.
|
||||
- `/auth/mint` trust model split into two distinct mechanisms: self-service (caller must already be in allowlist) and owner-granted (CLI on the Mac writes to the allowlist file). Self-service NEVER auto-allowlists. The fork ambiguously mixed both paths.
|
||||
- Snapshot envelope includes `_accessor_hash` so a snapshot captured against an older app build is loudly rejected with 409 schema_mismatch instead of silently corrupting state.
|
||||
- `GET /state/snapshot` returns ONLY fields marked `@Snapshotable`. Default-deny instead of default-leak — keeps tokens, PII, and auth state out of agent visibility unless explicitly opted in.
|
||||
- Tailnet listener fails closed if tailscaled LocalAPI is unreachable. Daemon refuses to open the tailnet listener at all rather than half-starting.
|
||||
- `X-Agent-Identity` header is display-only. Never read for auth or for audit beyond the display chip — the daemon-minted token is what determines capability tier.
|
||||
|
||||
#### For contributors
|
||||
|
||||
- New SwiftPM tool dependency: `swift-syntax`. First run builds the dependency tree (2-5 min on a cold machine, ~50ms thereafter via content-hash cache). Document the "first-time setup" UX in `/ios-qa` so users know what's happening.
|
||||
- The TS fallback in `ios-qa/scripts/gen-accessors.ts` is what tests + CI exercise. Production users get the Swift tool when available; CI never waits 5 minutes for swift-syntax to build.
|
||||
- All daemon HTTP egress goes through `JSON.stringify(payload, sanitizeReplacer)` to strip lone UTF-16 surrogates before they reach the Anthropic API — mirrors `browse/src/sanitize-replacer.ts`. Tunnel-denial logging mirrors `browse/src/tunnel-denial-log.ts`. No new auth/logging primitives.
|
||||
|
||||
Contributed by @sinacodedit (forked from time-attack/gstack).
|
||||
|
||||
## [1.39.2.0] - 2026-05-15
|
||||
|
||||
## **Conductor workspaces wire `GSTACK_*` keys straight into gbrain embeddings and paid evals.**
|
||||
|
||||
@@ -229,6 +229,8 @@ Each skill feeds into the next. `/office-hours` writes a design doc that `/plan-
|
||||
| `/setup-gbrain` | **GBrain Onboarding** — from zero to running gbrain in under 5 minutes. PGLite local, Supabase existing URL, or auto-provision a new Supabase project via Management API. MCP registration for Claude Code + per-repo trust triad (read-write/read-only/deny). [Full guide](USING_GBRAIN_WITH_GSTACK.md). |
|
||||
| `/sync-gbrain` | **Keep Brain Current** — re-index this repo's code into gbrain via `gbrain sources add` + `gbrain sync --strategy code`, refresh the `## GBrain Search Guidance` block in CLAUDE.md, and auto-remove guidance when the capability check fails. `--incremental` (default), `--full`, `--dry-run`. Idempotent; safe to re-run. |
|
||||
| `/gstack-upgrade` | **Self-Updater** — upgrade gstack to latest. Detects global vs vendored install, syncs both, shows what changed. |
|
||||
| `/ios-qa` | **iOS Live-Device QA (v1.40+)** — drive a real iPhone over USB CoreDevice via an embedded `StateServer` in the app. Read Swift source, codegen typed `@Observable` accessors, run the agent loop. Optional `--tailnet` flag turns your Mac mini into a DIY device farm reachable by OpenClaw or any HTTP-capable agent on your Tailscale tailnet. Capability-tier allowlist (observe/interact/mutate/restore), per-device session lock, audit log. |
|
||||
| `/ios-fix`, `/ios-design-review`, `/ios-clean`, `/ios-sync` | iOS bug-fix loop, designer's-eye HIG audit, debug-bridge cleanup, and accessor resync. See `docs/skills.md`. |
|
||||
|
||||
### New binaries (v0.19)
|
||||
|
||||
|
||||
@@ -54,6 +54,11 @@ Detailed guides for every gstack skill — philosophy, workflow, and examples.
|
||||
| [`/setup-deploy`](#setup-deploy) | **Deploy Configurator** | One-time setup for `/land-and-deploy`. Detects your platform, production URL, and deploy commands. |
|
||||
| [`/gstack-upgrade`](#gstack-upgrade) | **Self-Updater** | Upgrade gstack to the latest version. Detects global vs vendored install, syncs both, shows what changed. |
|
||||
| [`/make-pdf`](#make-pdf) | **PDF Generator** | Turn any markdown file into a publication-quality PDF. Proper margins, page numbers, cover pages, clickable TOC. |
|
||||
| [`/ios-qa`](#ios-qa) | **iOS QA Lead** | Live-device iOS QA via USB CoreDevice tunnel + embedded StateServer. Reads Swift source, codegens accessors, drives the real iPhone. Optionally exposes the device over Tailscale for remote agents. |
|
||||
| [`/ios-fix`](#ios-fix) | **iOS Autonomous Fixer** | Closes the find→fix→verify loop on a real iPhone. Captures a reproducing snapshot, fixes the source, rebuilds, redeploys, verifies. |
|
||||
| [`/ios-design-review`](#ios-design-review) | **iOS Designer's Eye** | 10-dimension Apple HIG audit on a real iPhone. Rates each screen, says what would make it a 10. |
|
||||
| [`/ios-clean`](#ios-clean) | **iOS Bridge Cleanup** | Convenience wrapper to strip DebugBridge SPM + `#if DEBUG` wiring. The structural Release-build guard is in Package.swift + CI; this skill is for guided manual removals. |
|
||||
| [`/ios-sync`](#ios-sync) | **iOS Bridge Resync** | Regenerate accessors and Swift templates against the latest upstream gstack. Run when you add new `@Observable` classes or upgrade gstack. |
|
||||
|
||||
---
|
||||
|
||||
@@ -1178,3 +1183,78 @@ Claude: Replied to Greptile. All tests pass.
|
||||
```
|
||||
|
||||
Three Greptile comments. One real fix. One auto-acknowledged. One false positive pushed back with a reply. Total extra time: about 30 seconds.
|
||||
|
||||
---
|
||||
|
||||
## `/ios-qa`
|
||||
|
||||
Live-device iOS QA. The fork's load-bearing insight was: don't simulate, don't run XCTest, don't bring up WebDriverAgent. Embed an HTTP server in the app under test, drive it from a Mac-side daemon over the USB CoreDevice IPv6 tunnel.
|
||||
|
||||
The agent reads your Swift source, finds `@Observable` classes with `@Snapshotable`-marked fields, codegens typed accessors, deploys a debug bridge, then runs a closed find→fix→verify loop.
|
||||
|
||||
### Architecture in one diagram
|
||||
|
||||
```
|
||||
┌──────────────────────┐ USB CoreDevice (IPv6) ┌──────────────────┐
|
||||
│ gstack-ios-qa daemon │ ────────────────────────▶ │ iOS app │
|
||||
│ (Mac, bun/TS) │ bearer + X-Session-Id │ StateServer │
|
||||
│ - rotates boot token │ │ (loopback only) │
|
||||
│ - mints session toks │ └──────────────────┘
|
||||
│ - capability tiers │
|
||||
│ - audit + redact │
|
||||
└──────────────────────┘
|
||||
▲
|
||||
│ Tailscale (optional, --tailnet)
|
||||
│
|
||||
┌──────────────────────┐
|
||||
│ Remote agent │
|
||||
│ (OpenClaw, etc.) │
|
||||
└──────────────────────┘
|
||||
```
|
||||
|
||||
The iOS app's `StateServer` binds loopback only (`::1` + `127.0.0.1`). The Mac daemon owns tailnet identity validation, capability tiers, and the audit trail. Remote agents NEVER see the boot token — only short-lived session tokens (1h default, 24h hard cap) minted via Tailscale identity gating.
|
||||
|
||||
### The unlock: USB-tethered + Tailscale = DIY device farm
|
||||
|
||||
A $500 Mac mini + an old iPhone + Tailscale free tier replaces what most teams pay BrowserStack/Sauce Labs for. Tailscale ACLs scope which identities can reach which devices at which capability tier.
|
||||
|
||||
See `ios-qa/docs/tailscale-acl-example.md` for the runnable setup.
|
||||
|
||||
### Capability tiers
|
||||
|
||||
| Tier | Endpoints |
|
||||
|------|-----------|
|
||||
| observe | `/screenshot`, `/elements`, `GET /state/*`, `/state/snapshot`, `/healthz` |
|
||||
| interact | observe + `/tap`, `/swipe`, `/type`, `/session/*` |
|
||||
| mutate | interact + `POST /state/<key>` |
|
||||
| restore | mutate + `POST /state/restore` |
|
||||
|
||||
Default minted tokens get `interact`. Higher tiers require explicit owner mint.
|
||||
|
||||
---
|
||||
|
||||
## `/ios-fix`
|
||||
|
||||
Iron Law: no fix without a reproducing snapshot. The agent captures pre-bug state via `GET /state/snapshot`, writes the fix, rebuilds, redeploys, restores the snapshot, and verifies the bug is gone. The snapshot becomes a regression test fixture so the bug can't recur silently.
|
||||
|
||||
Mirrors `/qa`'s find-bug → fix → re-verify loop for iOS.
|
||||
|
||||
---
|
||||
|
||||
## `/ios-design-review`
|
||||
|
||||
Designer's-eye QA on a real iPhone. Connects to the same `/ios-qa` daemon in observe-tier mode and screenshots every screen. Scores 10 dimensions 0-10: typography hierarchy, spacing rhythm, color hierarchy, touch targets, loading/empty/error states, accessibility, animation discipline, iOS idiom alignment, information density, AI-slop check.
|
||||
|
||||
For each score < 7, uses AskUserQuestion to present the issue with recommended fix.
|
||||
|
||||
---
|
||||
|
||||
## `/ios-clean`
|
||||
|
||||
Convenience wrapper. The structural Release-build guard against shipping DebugBridge is in `Package.swift` (`.when(configuration: .debug)`) plus a CI invariant test. `/ios-clean` is for developers who want a guided removal flow or who manually added the SPM dependency without going through `/ios-qa`.
|
||||
|
||||
---
|
||||
|
||||
## `/ios-sync`
|
||||
|
||||
Run after upgrading gstack or adding new `@Observable` classes. Detects what's installed, runs gen-accessors against the latest upstream templates, refreshes any changed Swift files, verifies the app rebuilds. Cache-key invalidation handles Swift version changes, generator git rev changes, and source changes.
|
||||
|
||||
+1
-1
@@ -1,6 +1,6 @@
|
||||
{
|
||||
"name": "gstack",
|
||||
"version": "1.39.2.0",
|
||||
"version": "1.40.0.0",
|
||||
"description": "Garry's Stack — Claude Code skills + fast headless browser. One repo, one install, entire AI engineering workflow.",
|
||||
"license": "MIT",
|
||||
"type": "module",
|
||||
|
||||
Reference in New Issue
Block a user