mirror of
https://github.com/garrytan/gstack.git
synced 2026-06-17 15:20:11 +02:00
docs: surface ios-qa CLIs + add end-to-end how-to walkthrough
The two CLIs that ship with the iOS device-farm capability (gstack-ios-qa-daemon and gstack-ios-qa-mint) were mentioned only inside ios-qa/SKILL.md. Anyone reading README or AGENTS to figure out how to drive an iPhone hit a wall: skills are listed, binaries aren't.

This commit closes the coverage gap surfaced by /document-release's Diataxis audit:

- README.md, AGENTS.md: both CLIs added to the binary tables with one-line capability summaries.
- docs/howto-ios-testing-with-gstack.md (new): end-to-end how-to covering prerequisites, architecture in one breath, install the templates, build + install + launch on device, spin up the daemon, drive the HTTP surface, optional Tailscale remote-agent mode via gstack-ios-qa-mint, /ios-clean before release, common failures. Pulled directly from the real iPhone 17 Pro Max / iOS 26.5 verification run.
- README + AGENTS link to the new how-to from the iOS skill row.

No CHANGELOG entry change: the consolidated 1.43.0.0 entry is /ship work.
No VERSION bump: already at 1.43.0.0 covering all branch work.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -85,6 +85,15 @@ Invoke them by name (e.g., `/office-hours`).
| `/ios-clean` | Convenience: strip DebugBridge + #if DEBUG wiring before a Release build. |
| `/ios-sync` | Regenerate the iOS debug bridge against the latest upstream templates. |

Companion CLIs (run on the Mac that's plugged into the device):

| Command | What it does |
|---------|-------------|
| `gstack-ios-qa-daemon` | Mac-side broker. Loopback by default; `--tailnet` adds a Tailscale-facing listener with capability tiers and audit logging. |
| `gstack-ios-qa-mint` | Owner-grant CLI for the tailnet allowlist (`grant`/`revoke`/`list`). |

End-to-end walkthrough: [docs/howto-ios-testing-with-gstack.md](docs/howto-ios-testing-with-gstack.md).

### Safety + scoping

| Skill | What it does |
@@ -230,7 +230,7 @@ Each skill feeds into the next. `/office-hours` writes a design doc that `/plan-
| `/sync-gbrain` | **Keep Brain Current**: re-index this repo's code into gbrain via `gbrain sources add` + `gbrain sync --strategy code`, refresh the `## GBrain Search Guidance` block in CLAUDE.md, and auto-remove guidance when the capability check fails. `--incremental` (default), `--full`, `--dry-run`. Idempotent; safe to re-run. |
| `/gstack-upgrade` | **Self-Updater**: upgrade gstack to latest. Detects global vs vendored install, syncs both, shows what changed. |
| `/ios-qa` | **iOS Live-Device QA (v1.40+)**: drive a real iPhone over USB CoreDevice via an embedded `StateServer` in the app. Read Swift source, codegen typed `@Observable` accessors, run the agent loop. Optional `--tailnet` flag turns your Mac mini into a DIY device farm reachable by OpenClaw or any HTTP-capable agent on your Tailscale tailnet. Capability-tier allowlist (observe/interact/mutate/restore), per-device session lock, audit log. |
| `/ios-fix`, `/ios-design-review`, `/ios-clean`, `/ios-sync` | iOS bug-fix loop, designer's-eye HIG audit, debug-bridge cleanup, and accessor resync. See `docs/skills.md`. |
| `/ios-fix`, `/ios-design-review`, `/ios-clean`, `/ios-sync` | iOS bug-fix loop, designer's-eye HIG audit, debug-bridge cleanup, and accessor resync. See `docs/skills.md`. End-to-end walkthrough: [docs/howto-ios-testing-with-gstack.md](docs/howto-ios-testing-with-gstack.md). |

### New binaries (v0.19)

@@ -240,6 +240,8 @@ Beyond the slash-command skills, gstack ships standalone CLIs for workflows that
|---------|-------------|
| `gstack-model-benchmark` | **Cross-model benchmark**: run the same prompt through Claude, GPT (via Codex CLI), and Gemini; compare latency, tokens, cost, and (optionally) LLM-judge quality score. Auth detected per provider, unavailable providers skip cleanly. Output as table, JSON, or markdown. `--dry-run` validates flags + auth without spending API calls. |
| `gstack-taste-update` | **Design taste learning**: writes approvals and rejections from `/design-shotgun` into a persistent per-project taste profile. Decays 5%/week. Feeds back into future variant generation so the system learns what you actually pick. |
| `gstack-ios-qa-daemon` | **iOS device-farm daemon**: Mac-side broker between an agent and a connected iPhone over USB CoreDevice. Loopback by default; `--tailnet` opens a Tailscale-facing listener with identity-gated capability tiers. Single-instance via flock on `~/.gstack/ios-qa-daemon.pid`. See [docs/howto-ios-testing-with-gstack.md](docs/howto-ios-testing-with-gstack.md). |
| `gstack-ios-qa-mint` | **iOS allowlist manager**: owner-grant CLI for the tailnet allowlist. `grant`/`revoke`/`list` against `~/.gstack/ios-qa-allowlist.json` (mode 0600). Remote agents never auto-allowlist; this is the explicit-intent path. |

### Continuous checkpoint mode (opt-in, local by default)

@@ -0,0 +1,180 @@
# How to test iOS apps with GStack iOS

This is the end-to-end walkthrough for the iOS device-farm capability that ships with gstack: install the canonical Swift templates into your app, connect a real iPhone over USB, and drive it from any agent (Claude Code locally, or any HTTP-capable agent over Tailscale). No simulators, no XCTest harness, no WebDriverAgent.

Everything below has been verified end-to-end on a real iPhone 17 Pro Max running iOS 26.5. The same flow works on any iOS 16+ device, with one caveat covered under Common failures: synthesized taps only fire SwiftUI Buttons on iOS 18+.

## What you'll need

- macOS with Xcode 16.0+ installed (`xcrun devicectl --version` must succeed). Xcode 16 ships the CoreDevice tunnel `devicectl` uses to reach the device over USB.
- A real iPhone running iOS 16 or later. Unlocked, paired with your Mac, with **Developer Mode** enabled in Settings → Privacy & Security.
- An Apple developer team. The free personal team works fine for live-device debug deploys. You'll need the team ID (e.g. `623FYQ2M88`), not the certificate ID. Find it in Xcode → Settings → Accounts → your Apple ID → team list. The setup signs the app for your device on first deploy via `-allowProvisioningUpdates -allowProvisioningDeviceRegistration`.
- gstack installed (`./setup` complete; `bin/gstack-ios-qa-daemon` must be on disk and executable).
- Bun runtime on PATH (`bun --version`). The Mac-side daemon is a bun process.

For the optional remote-agent (Tailscale) mode, you'll additionally need Tailscale installed on the Mac with `/var/run/tailscale.sock` readable.

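The prerequisites above can be checked in one pass before you touch the app. What follows is a minimal preflight sketch, not something gstack ships: the `check` and `preflight` function names are made up for illustration, and `GSTACK_DIR` is an assumed install location (derived from the templates path used later in this guide) that you may need to change.

```shell
#!/bin/sh
# Preflight sketch for the prerequisites above. GSTACK_DIR is an assumed
# install location; adjust it to wherever ./setup placed gstack.
GSTACK_DIR="${GSTACK_DIR:-$HOME/.claude/skills/gstack}"

# check LABEL CMD [ARGS...]: run CMD quietly, print "ok: LABEL" or
# "MISSING: LABEL", and return non-zero when the command fails.
check() {
  label=$1
  shift
  if "$@" >/dev/null 2>&1; then
    echo "ok: $label"
  else
    echo "MISSING: $label"
    return 1
  fi
}

# preflight: run every required check, then the optional Tailscale one.
# Returns non-zero if any required prerequisite is missing.
preflight() {
  failures=0
  check "xcrun devicectl (Xcode 16+)" xcrun devicectl --version || failures=$((failures + 1))
  check "bun on PATH" bun --version || failures=$((failures + 1))
  check "gstack-ios-qa-daemon executable" \
    test -x "$GSTACK_DIR/bin/gstack-ios-qa-daemon" || failures=$((failures + 1))
  # Only needed for --tailnet mode, so a miss here is not counted as a failure.
  check "tailscale socket readable (optional)" test -r /var/run/tailscale.sock || true
  [ "$failures" -eq 0 ]
}

# Example: preflight || echo "fix the MISSING items before continuing"
```

The sketch cannot confirm that the phone is unlocked, paired, or in Developer Mode; those still need a look at the device itself.
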
## Architecture in one breath

```
┌─────────────────┐   tailnet (opt)    ┌──────────────────────┐   USB CoreDevice    ┌─────────────────────┐
│  Remote agent   │ ─────────────────▶ │ gstack-ios-qa-daemon │ ──────────────────▶ │ iOS app StateServer │
│ (Claude, GPT,   │  bearer + session  │    (Mac, bun/TS)     │   IPv6 ULA tunnel   │   (loopback only)   │
│  OpenClaw, ...) │                    │                      │                     │                     │
└─────────────────┘                    └──────────────────────┘                     └─────────────────────┘
```

- iOS app embeds a `StateServer` (`DebugBridge` SPM library, `#if DEBUG` only) listening on `::1` + `127.0.0.1` port 9999. Bearer-token gated. Boot token rotates within ~5 seconds of daemon spawn so anything scraping `os_log` past then sees a dead credential.
- Mac daemon brokers traffic over the CoreDevice IPv6 tunnel that `xcrun devicectl` opens automatically when a paired device is connected.
- In Tailscale mode, the daemon exposes a separate listener bound to your tailnet IP, with capability tiers (observe / interact / mutate / restore) enforced per session token. Tokens are minted explicitly by the Mac owner via `gstack-ios-qa-mint`; remote callers never auto-allowlist.

The iOS `StateServer` is loopback-only **always**, even in remote mode. Identity validation happens Mac-side because the iPhone has no way to validate a Tailscale identity.

## Step 1: Add the DebugBridge templates to your iOS app

The templates live at `~/.claude/skills/gstack/ios-qa/templates/` after `./setup`. The fastest install is to invoke the `/ios-qa` skill in Claude Code from your app's root: it reads your Swift source, codegens typed `@Observable` state accessors, and lays down the templates with your bundle ID. Or do it by hand:

1. Copy these into a `DebugBridge/` SPM package inside your app workspace:
   - `Sources/DebugBridgeCore/StateServer.swift` (from `StateServer.swift.template`)
   - `Sources/DebugBridgeCore/DebugBridgeManager.swift` (from `DebugBridgeManager.swift.template`)
   - `Sources/DebugBridgeTouch/DebugBridgeTouch.m` + `Sources/DebugBridgeTouch/include/DebugBridgeTouch.h` (from the two `.template` files)
   - `Sources/DebugBridgeUI/Bridges.swift` (from `Bridges.swift.template`)
   - `Sources/DebugBridgeUI/DebugOverlay.swift` (from `DebugOverlay.swift.template`)
   - `Package.swift` (from `Package.swift.template`)
2. Add the package as a local dependency of your app. Depend on the `DebugBridgeUI` product with `condition: .when(configuration: .debug)`. `DebugBridgeCore` and `DebugBridgeTouch` come in transitively.
3. In your `@main` App init, gate the wiring on `#if DEBUG`:

   ```swift
   // File scope: Swift only allows imports at the top level of a file.
   #if DEBUG
   import DebugBridgeCore
   #if canImport(UIKit)
   import DebugBridgeUI
   #endif
   #endif

   // Inside your @main App's init():
   #if DEBUG
   StateServer.shared.start()
   #if canImport(UIKit)
   DebugBridgeUIWiring.installAll()
   #endif
   #endif
   ```

The three targets split as follows: `DebugBridgeCore` is cross-platform (so `swift build` on a CI Mac host can validate the bulk of the code without UIKit), while `DebugBridgeUI` and `DebugBridgeTouch` are iOS-only (they link UIKit). `DebugBridgeTouch` is Objective-C: it carries the KIF-derived UITouch synthesis with the iOS 18+ `_UIHitTestContext` fix that makes SwiftUI Button taps actually fire.

The structural Release-build guard is the `.when(configuration: .debug)` clause in `Package.swift`. SwiftPM refuses to link any `DebugBridge*` target in a Release build, so the bridge cannot ship to TestFlight even if you forget to clean up.

## Step 2: Build + install to the device

From the app's project directory:

```
xcodebuild \
  -scheme YourAppScheme \
  -configuration Debug \
  -destination 'generic/platform=iOS' \
  -derivedDataPath /tmp/build \
  -allowProvisioningUpdates -allowProvisioningDeviceRegistration \
  CODE_SIGN_STYLE=Automatic \
  DEVELOPMENT_TEAM=YOUR_TEAM_ID \
  build
```

Then install + launch:

```
UDID=$(xcrun devicectl list devices 2>/dev/null | awk 'NR>2 && $0!="" {print $(NF-2); exit}')
xcrun devicectl device install app --device "$UDID" /tmp/build/Build/Products/Debug-iphoneos/YourApp.app
xcrun devicectl device process launch --device "$UDID" --terminate-existing your.bundle.id
```

If the phone is locked you'll get `FBSOpenApplicationServiceErrorDomain error 1` with the reason `Locked`. Unlock and retry. First-time installs surface a Trust dialog on the phone; tap Trust, then re-run.

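Because the locked-phone failure is transient, the launch step lends itself to a small retry wrapper. The sketch below is one way to do it, assuming the failure output contains the word `Locked` as quoted above; `launch_with_retry`, the attempt count, and the pause length are illustrative choices rather than anything gstack provides.

```shell
# launch_with_retry UDID BUNDLE_ID [ATTEMPTS]
# Retries the launch while devicectl reports the phone as locked; any other
# failure is printed and returned immediately. The attempt count and the
# 3-second pause are arbitrary choices for this sketch.
launch_with_retry() {
  udid=$1
  bundle_id=$2
  attempts=${3:-5}
  i=1
  while [ "$i" -le "$attempts" ]; do
    if out=$(xcrun devicectl device process launch \
        --device "$udid" --terminate-existing "$bundle_id" 2>&1); then
      return 0
    fi
    case "$out" in
      *Locked*)
        echo "Phone is locked; unlock it (attempt $i of $attempts)" >&2
        sleep 3
        ;;
      *)
        printf '%s\n' "$out" >&2
        return 1
        ;;
    esac
    i=$((i + 1))
  done
  return 1
}

# Example: launch_with_retry "$UDID" your.bundle.id
```

Matching on the word `Locked` is deliberately loose; if your Xcode version words the error differently, adjust the `case` pattern.
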
## Step 3: Start the Mac-side daemon

Two options.

**Option A: let the skill spawn it.** Run `/ios-qa` in Claude Code from anywhere; the skill spawns the daemon on demand, bootstraps the tunnel, rotates the boot token, and exposes the device through the proxy. Cleanest path for local-USB use.

**Option B: start it yourself.** Run:

```
gstack-ios-qa-daemon
```

The daemon prints `READY: port=<n> pid=<pid>` once both loopback listeners are bound. The default port is 9099. Spawners can read that line with a ~5 second timeout to confirm readiness; you can also point `curl` at the printed port.

Either way the daemon takes an exclusive flock on `~/.gstack/ios-qa-daemon.pid`. Running it twice from two Claude Code sessions is safe; the second invocation discovers the running daemon's port and joins.

Set these env vars to target a specific device or bundle:

```
GSTACK_IOS_TARGET_UDID=248C3A58-B843-5BDB-8F5D-89ADB7D7BF6A
GSTACK_IOS_TARGET_BUNDLE_ID=com.yourorg.yourapp
GSTACK_IOS_DAEMON_PORT=9099 # loopback listener port; default 9099
```

If `GSTACK_IOS_TARGET_UDID` is unset, the daemon picks the first paired connected device.

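If you spawn the daemon from your own script (Option B), the `READY:` handshake can be consumed roughly as follows. This is a sketch under assumptions: it redirects the daemon's output to a log file and polls that file, which presumes the `READY:` line reaches stdout or stderr; the helper names and the log path are hypothetical.

```shell
# parse_ready_port LINE: print the port from a "READY: port=<n> pid=<pid>" line.
parse_ready_port() {
  printf '%s\n' "$1" | sed -n 's/^READY: port=\([0-9][0-9]*\) pid=[0-9][0-9]*.*$/\1/p'
}

# wait_for_ready LOG_FILE [TIMEOUT_SECONDS]
# Polls LOG_FILE once per second until a READY line appears, then prints the
# port. Returns non-zero if the timeout (default 5 seconds) passes first.
wait_for_ready() {
  log_file=$1
  timeout_s=${2:-5}
  waited=0
  while [ "$waited" -lt "$timeout_s" ]; do
    if line=$(grep -m 1 '^READY: ' "$log_file" 2>/dev/null); then
      parse_ready_port "$line"
      return 0
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 1
}

# Example (the log path is arbitrary; the env var is one of those listed above):
#   GSTACK_IOS_TARGET_BUNDLE_ID=com.yourorg.yourapp \
#     gstack-ios-qa-daemon > /tmp/ios-qa-daemon.log 2>&1 &
#   port=$(wait_for_ready /tmp/ios-qa-daemon.log) && curl -s "http://127.0.0.1:$port/healthz"
```

Polling a log file keeps the sketch portable across shells; a spawner written in another language would more likely read the child's output stream directly.
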
## Step 4: Drive the device

Once the daemon is running, you have an HTTP surface at `http://127.0.0.1:9099` (or `[::1]:9099`). The skill flow calls it for you, but the raw endpoints are:

| Endpoint | What it does | Auth |
|---|---|---|
| `GET /healthz` | Version probe. | none (loopback) |
| `POST /auth/rotate` | Daemon-only; rotates the boot token to an in-memory-only value. | boot token |
| `POST /session/acquire` | Acquire the per-device session lock. Returns `{session_id, ttl_seconds}`. | bearer |
| `POST /session/release` | Release the lock. | bearer + session |
| `GET /screenshot` | Capture a PNG of the active window. Returns `{png_base64: "..."}`. | bearer |
| `GET /elements` | Accessibility-tree snapshot. | bearer |
| `GET /state/snapshot` | Dump every `@Snapshotable` field as JSON. | bearer |
| `POST /state/restore` | Atomically restore a full snapshot. | bearer + session, mutate tier |
| `POST /tap` `{x,y}` | Synthesize a real UITouch at window coordinates. SwiftUI Buttons fire. | bearer + session, interact tier |
| `POST /swipe` `{from_x,from_y,to_x,to_y}` | Scroll the nearest enclosing UIScrollView. | bearer + session, interact tier |
| `POST /type` `{text}` | Set text on the current first responder. | bearer + session, interact tier |

Mutating requests require both an `Authorization: Bearer <token>` header AND an `X-Session-Id` header. Read endpoints (`/screenshot`, `/elements`, `GET /state/*`) only need the bearer.

The state snapshot is opt-in per field via a `@Snapshotable` property wrapper on your canonical state struct. Fields you don't annotate never appear in the snapshot, which keeps tokens, PII, and auth state out of recorded fixtures by default.

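To make the header rules concrete, here is a rough `curl` wrapper around the endpoints in the table. It is a sketch, not the skill's own client: it assumes `TOKEN` already holds a valid bearer token, that POST bodies are sent as JSON with a `Content-Type: application/json` header, and that responses are flat JSON objects. The function names (`api`, `json_string_field`, `drive_once`) and the tap coordinates are invented for the example.

```shell
# Assumptions for this sketch: TOKEN already holds a valid bearer token (how
# you obtain it depends on your setup), and POST bodies are JSON.
BASE_URL="http://127.0.0.1:${GSTACK_IOS_DAEMON_PORT:-9099}"

# json_string_field NAME JSON: print the value of a string field from a flat
# JSON object. Good enough for {session_id: "..."} and {png_base64: "..."}.
json_string_field() {
  printf '%s\n' "$2" | sed -n "s/.*\"$1\"[[:space:]]*:[[:space:]]*\"\([^\"]*\)\".*/\1/p"
}

# api METHOD ENDPOINT [JSON_BODY]: call the daemon with the bearer token, plus
# the X-Session-Id header once SESSION_ID is set.
api() {
  method=$1
  endpoint=$2
  body=${3:-}
  set -- -sS -X "$method" -H "Authorization: Bearer $TOKEN"
  if [ -n "${SESSION_ID:-}" ]; then
    set -- "$@" -H "X-Session-Id: $SESSION_ID"
  fi
  if [ -n "$body" ]; then
    set -- "$@" -H "Content-Type: application/json" -d "$body"
  fi
  curl "$@" "$BASE_URL$endpoint"
}

# One pass: lock the device, save a screenshot, tap, release the lock.
drive_once() {
  SESSION_ID=$(json_string_field session_id "$(api POST /session/acquire)")
  json_string_field png_base64 "$(api GET /screenshot)" | base64 --decode > screen.png
  api POST /tap '{"x":200,"y":400}'
  api POST /session/release
  SESSION_ID=
}
```

The `sed`-based field extraction avoids a dependency on a JSON tool, at the cost of only handling simple string fields; swap in a real JSON parser for anything beyond a quick experiment.
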
## Step 5: Make remote agents work (optional)

To let an agent on another machine drive the device, run the daemon with `--tailnet`:

```
gstack-ios-qa-daemon --tailnet
```

The daemon probes `/var/run/tailscale.sock` first; if the socket is missing or unreadable, it refuses to open the tailnet listener at all (loopback still runs). Remote mode never half-starts.

Then add an allowlist entry for each identity that should be able to connect:

```
gstack-ios-qa-mint grant --remote 'alice@example.com' --capability interact
gstack-ios-qa-mint grant --remote 'tag:ci' --capability mutate --ttl 86400 --note 'nightly'
gstack-ios-qa-mint list
```

Capability tiers are nested: `observe` (read endpoints only) ⊂ `interact` (taps, swipes, type) ⊂ `mutate` (`POST /state/*`) ⊂ `restore` (`POST /state/restore`). Pick the smallest tier that does the job. The allowlist file is at `~/.gstack/ios-qa-allowlist.json` (mode 0600). The daemon reads it on every `/auth/mint` request, so changes take effect immediately without restarting.

The remote agent then hits `POST /auth/mint` against the daemon's tailnet listener. The daemon canonicalizes the caller's identity via tailscaled's WhoIs endpoint, checks the allowlist, and returns a short-lived session token (1 hour default, 24 hour cap). Every authenticated mutating request lands in `~/.gstack/security/ios-qa-audit.jsonl`; rejected requests land in `~/.gstack/security/attempts.jsonl`.

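The nesting of the capability tiers means a wider grant implies every narrower one. The comparison can be sketched as a simple rank check; this illustrates the ordering described above and is not the daemon's actual enforcement code. `tier_rank` and `tier_allows` are hypothetical names.

```shell
# tier_rank TIER: print the position of TIER in the nested ordering.
tier_rank() {
  case "$1" in
    observe)  echo 1 ;;
    interact) echo 2 ;;
    mutate)   echo 3 ;;
    restore)  echo 4 ;;
    *)        return 1 ;;
  esac
}

# tier_allows GRANTED REQUIRED: succeed when GRANTED is the same tier as
# REQUIRED or a wider one. Unknown tier names fail closed with status 2.
tier_allows() {
  granted=$(tier_rank "$1") || return 2
  required=$(tier_rank "$2") || return 2
  [ "$granted" -ge "$required" ]
}

# Example: tier_allows mutate interact && echo "a mutate grant can also tap"
```

Failing closed on an unknown tier name mirrors the explicit-intent stance of the allowlist: a typo in a grant should deny access rather than widen it.
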
## Step 6: Ship a release build

Before you ship to TestFlight or the App Store, run `/ios-clean`. It removes the `DebugBridge` SPM dependency and strips the `#if DEBUG` wiring from your `@main` App. The structural guard in `Package.swift` (`condition: .when(configuration: .debug)`) means a Release build wouldn't link the bridge even if you forgot to clean up, but `/ios-clean` gives you a tidy diff to review and ship.

## Common failures

| Symptom | What broke |
|---|---|
| `xcodebuild` fails with `Could not locate device support files for iOS X.Y` | Run `xcodebuild -downloadPlatform iOS` to fetch the device support package for your iPhone's iOS version (~8GB). |
| Install succeeds, `process launch` fails with `Locked` | The phone is locked. Unlock and retry. |
| First install on a paired device fails with no clear error | The phone needs to Trust the Mac. Open Settings → General → VPN & Device Management on the phone and confirm. |
| `Developer Mode` toggle missing from Settings → Privacy & Security | Connect the device to Xcode → Window → Devices and Simulators once, or try any `devicectl device install` against it. iOS will surface the toggle after the first attempt. |
| `xcrun devicectl device copy from` returns ERROR 7000 | The source path is wrong: the boot token lives at `tmp/gstack-ios-qa.token` inside the app's data container (NSTemporaryDirectory), not at the path's root. |
| `/healthz` returns 200 but `/tap` returns ok:true with no UI change | The phone is paired but the StateServer port may have changed across launches. Re-resolve the CoreDevice IPv6 (`dscacheutil -q host -a name '<DeviceName>.coredevice.local'`). |
| `403 identity_not_allowed` from `/auth/mint` | The remote caller's identity isn't on the Mac's allowlist. Run `gstack-ios-qa-mint grant --remote <identity> --capability interact` on the Mac. |
| Daemon won't open the tailnet listener | Tailscale isn't installed, or `/var/run/tailscale.sock` is unreadable. Fix Tailscale, then restart the daemon. Loopback still runs in the meantime. |
| SwiftUI Button tap returns `ok:true` but the action never fires | You're on iOS 17 or older where `_UIHitTestContext` doesn't exist. The DebugBridgeTouch implementation falls back to plain `hitTest:` which doesn't resolve into SwiftUI's gesture container. Update to iOS 18+ on the device, or tap a UIKit control instead. |

## What this gets you

You can write an agent loop in any language that speaks HTTP. Take a screenshot, ask a model what to do, send a tap. Capture state snapshots before and after to record deterministic fixtures for `/ios-fix` regression tests. Add a colleague to the allowlist and they drive your device farm from their laptop without ever touching the hardware. Plug the same daemon into CI by minting a `tag:ci` session token with mutate-tier capability and a 24-hour TTL.

The whole stack is a Mac you already own, an iPhone you already own, a free Apple developer account, and gstack. No paid device-farm subscription. No simulator drift. The thing the user sees is what the agent drives.