gstack/ios-qa/SKILL.md.tmpl
Garry Tan 54fe375300 feat(ios): author 5 iOS device-farm skill templates + generated docs
Authors ios-qa, ios-fix, ios-design-review, ios-clean, ios-sync as upstream gstack skills. Each follows the standard SKILL.md.tmpl pattern with preamble-tier:3 frontmatter. The fork at time-attack/gstack shipped these but as byte-identical .md/.tmpl pairs that wouldn't pass skill-docs.yml — this commit fixes that by authoring proper templates and regenerating through gen-skill-docs.
2026-05-17 19:05:39 -07:00


---
name: ios-qa
preamble-tier: 3
version: 1.0.0
description: |
  Live-device iOS QA for SwiftUI apps. Connects to a real iPhone via USB
  CoreDevice IPv6 tunnel, reads Swift source to understand every screen, then
  runs a vision-driven agent loop: screenshot → analyze → decide → act →
  verify → repeat. All interaction happens via HTTP to an embedded
  StateServer in the app under test. Optionally exposes the device farm over
  Tailscale so remote agents (OpenClaw, Codex, any HTTP-capable agent) can
  drive the device from anywhere.
  Use when asked to "ios qa", "test my iPhone app", "find bugs on the device",
  or "qa the iOS app". (gstack)
voice-triggers:
- "iOS quality check"
- "test the iPhone app"
- "run iOS QA"
allowed-tools:
- Bash
- Read
- Write
- Edit
- Grep
- Glob
- AskUserQuestion
triggers:
- ios qa
- test the iphone app
- test my ios app
- find bugs on the device
- qa the ios app
---
{{PREAMBLE}}
# Live-device iOS QA
This skill drives a real iPhone via USB. The agent reads your Swift source,
generates typed state accessors, deploys a debug bridge, and runs a closed
find→fix→verify loop. No simulator, no XCTest, no WebDriverAgent.
## Architecture
```
┌──────────────────────┐  USB CoreDevice (IPv6)    ┌──────────────────┐
│ gstack-ios-qa daemon │ ────────────────────────▶ │ iOS app          │
│ (Mac, bun/TS)        │  bearer + X-Session-Id    │ StateServer      │
│                      │                           │ (loopback only)  │
│ - boot token rotate  │                           │ - /tap /swipe    │
│ - session minting    │                           │ - /type /state   │
│ - audit + redact     │                           │ - /snapshot      │
└──────────────────────┘                           └──────────────────┘
           │ Tailscale (optional, --tailnet)
┌──────────────────────┐
│ Remote agent         │
│ (OpenClaw, etc.)     │
└──────────────────────┘
```
The iOS app's `StateServer` binds loopback only (`::1` + `127.0.0.1`). Tailnet
ingress is exclusively the Mac daemon's job. The daemon validates Tailscale
identities via the local `tailscaled` socket and mints short-lived session
tokens (default 1h) for remote agents.
## Prerequisites
- macOS (the daemon uses `devicectl` from Xcode).
- iPhone connected via USB, paired and trusted.
- Xcode + Swift toolchain installed (`swift --version` reports >= 5.9).
- App source available on disk, with at least one `@Observable` class.
- For remote-control mode: Tailscale installed and the user logged in.
## Phase 0: Session warm-start (optional)
If `~/.gstack/ios-qa-session.json` exists and the device is still connected,
skip Phases 1-2 and jump to Phase 3. The session cache holds the rotated token,
UDID, tunnel address, and accessor hash. Invalidate the cache when:
- The user passes `--cold` to force a full bootstrap.
- An accessor hash mismatch is detected on the first state query.
- The daemon reports that the cached UDID is no longer connected.
```bash
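SESSION="$HOME/.gstack/ios-qa-session.json"
# Skip the warm-start path if the cache is missing or the user forced a cold start.
if [ -f "$SESSION" ] && [ "${COLD:-0}" != "1" ]; then
  # Pass the path as an argument so it is never interpolated into Python source.
  CACHED_UDID=$(python3 -c "import json,sys; print(json.load(open(sys.argv[1]))['udid'])" "$SESSION")
  CACHED_PORT=$(python3 -c "import json,sys; print(json.load(open(sys.argv[1]))['daemon_port'])" "$SESSION")
  # /healthz only proves the daemon is up; the daemon reports separately
  # whether the cached UDID is still connected.
  if curl -sf "http://127.0.0.1:$CACHED_PORT/healthz" > /dev/null; then
    echo "Warm start: daemon alive, cached device $CACHED_UDID"
  fi
fi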
```
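The invalidation rules listed for the session cache can be folded into a single decision. The following is a minimal sketch, assuming the caller has already collected the four inputs; `start_mode` and its argument order are hypothetical and not part of gstack.

```bash
# Hypothetical helper: prints "cold" or "warm" for the given inputs.
#   $1 = 1 if the user passed --cold, else 0
#   $2 = accessor hash stored in the session cache
#   $3 = accessor hash observed on the first state query
#   $4 = 1 if the daemon reports the cached UDID as connected, else 0
start_mode() {
  cold_flag=$1
  cached_hash=$2
  live_hash=$3
  connected=$4
  if [ "$cold_flag" = "1" ]; then echo cold; return 0; fi
  if [ "$cached_hash" != "$live_hash" ]; then echo cold; return 0; fi
  if [ "$connected" != "1" ]; then echo cold; return 0; fi
  echo warm
}

start_mode 0 abc123 abc123 1   # every check passes
start_mode 0 abc123 def456 1   # accessor hash mismatch
```

Any single failed check forces a full bootstrap, so a stale cache is never trusted partially.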
## Phase 1: Read source, plan codegen
1. Walk the app source (passed as `--source <dir>`) and identify all `@Observable`
classes. Note any property marked with the `@Snapshotable` wrapper — those
are the snapshot-eligible fields.
2. Run `swift run --package-path $GSTACK_HOME/ios-qa/scripts/gen-accessors-tool gen-accessors --input <source-dir>`.
First invocation builds the swift-syntax dependency tree (cold: 2-5 min).
Subsequent runs are content-hash-cached and finish in ~50ms.
3. Show the user the accessor list and ask whether to install the DebugBridge
SPM dependency into their `Package.swift` (one AskUserQuestion).
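Step 2 mentions that repeat runs are content-hash-cached. One way such a cache key could be derived is sketched below. This is an illustration of the idea, not the tool's actual implementation: `sha256` and `source_hash` are hypothetical helpers, and the sketch assumes either `sha256sum` or `shasum` is on `PATH`.

```bash
# Hypothetical wrapper: use sha256sum where available, else shasum (common on macOS).
sha256() {
  if command -v sha256sum > /dev/null 2>&1; then
    sha256sum "$@"
  else
    shasum -a 256 "$@"
  fi
}

# Hypothetical cache key: one digest covering every .swift file under directory $1.
source_hash() {
  (
    cd "$1" || exit 1
    # Hash each file in a stable path order, then hash the resulting list.
    find . -type f -name '*.swift' | LC_ALL=C sort | while IFS= read -r f; do
      sha256 "$f"
    done
  ) | sha256 | cut -d' ' -f1
}

# Usage: compare against a previously stored key and skip codegen when they match.
# [ "$(source_hash ./Sources)" = "$(cat .accessor-hash 2> /dev/null)" ] && echo "cache hit"
```

Because each per-file line includes the relative path, renaming a file changes the key as well as editing it.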
## Phase 2: Bootstrap the device bridge
1. Add the `DebugBridge` SPM target (Debug-config-only via
`.when(configuration: .debug)`).
2. Add `DebugBridgeManager.shared.start()` to the app's `@main` entry, gated
on `#if DEBUG`.
3. Build + deploy to the device with `xcodebuild -scheme <SchemeName>
-destination 'platform=iOS,id=<UDID>' build install`.
4. Launch via `devicectl device process launch --device <UDID> --console <bundle-id>`.
Capture the boot token printed to `os_log` on first run.
5. Spawn the Mac-side daemon, `gstack-ios-qa-daemon`, on demand. The daemon
acquires an exclusive flock on `~/.gstack/ios-qa-daemon.pid`. If another
daemon is already alive, the second invocation discovers its port and connects to it.
6. The daemon immediately calls `POST /auth/rotate` on the iOS StateServer with a
fresh in-memory-only token. The boot token becomes useless ~5s later.
Anything scraping `os_log` past this point sees a dead credential.
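The single-instance behavior in step 5 can be illustrated with a small lock sketch. Note the swap: this uses an atomic `mkdir` lock as a stand-in, not the flock the daemon is described as taking, and unlike a flock it is not released automatically if the holder dies. `LOCK_DIR`, `acquire_lock`, and `release_lock` are hypothetical names.

```bash
# Stand-in for the daemon's flock: an atomic mkdir lock. NOT the daemon's real mechanism.
# LOCK_DIR is a hypothetical path chosen for this sketch.
LOCK_DIR="${LOCK_DIR:-${TMPDIR:-/tmp}/ios-qa-daemon.lock.d}"

acquire_lock() {
  # mkdir either creates the directory or fails, so only one caller can win.
  if mkdir "$LOCK_DIR" 2> /dev/null; then
    echo "$$" > "$LOCK_DIR/pid"
    return 0
  fi
  return 1
}

release_lock() {
  rm -rf "$LOCK_DIR"
}

if acquire_lock; then
  echo "lock acquired: this process would start the daemon"
  release_lock
else
  echo "lock held by pid $(cat "$LOCK_DIR/pid" 2> /dev/null): connect to the running daemon instead"
fi
```

The losing caller takes the "discover and connect" branch rather than failing, which mirrors the second-invocation behavior described in step 5.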
## Phase 3: Vision-driven agent loop
Each iteration:
1. `GET /screenshot` (via daemon) → save PNG.
2. `GET /elements` → accessibility tree.
3. `GET /state/snapshot` (only `@Snapshotable` fields) → current state.
4. Decide next action based on what's on the screen vs the test goal.
5. `POST /session/acquire` to grab the device lock.
6. Execute `POST /tap`, `/swipe`, `/type`, or `POST /state/<key>` write.
7. Re-screenshot; compare; record finding if buggy.
8. `POST /session/release` once the iteration is done.
Each authenticated mutating request through the tailnet listener (if remote
mode is active) writes an audit row to
`~/.gstack/security/ios-qa-audit.jsonl`.
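One iteration of the loop can be sketched as a sequence of HTTP calls. The endpoint paths and the `X-Session-Id` header come from this document; everything else is an assumption made for illustration: the `DAEMON_PORT` default, the `QA_TOKEN` and `QA_SESSION` variables, the `Authorization: Bearer` header form, the `/tap` JSON body, and the `QA_DRY_RUN` switch. The sketch defaults to dry-run, so it prints requests instead of sending them.

```bash
DAEMON_PORT="${DAEMON_PORT:-8765}"   # hypothetical port
QA_DRY_RUN="${QA_DRY_RUN:-1}"        # 1 = print requests instead of sending them

# Hypothetical wrapper: qa_call METHOD PATH [JSON_BODY]
qa_call() {
  method=$1
  path=$2
  body=${3:-}
  if [ "$QA_DRY_RUN" = "1" ]; then
    echo "$method $path"
    return 0
  fi
  # Build the curl argument list; header names beyond X-Session-Id are assumed.
  set -- -sf -X "$method" \
    -H "Authorization: Bearer $QA_TOKEN" \
    -H "X-Session-Id: $QA_SESSION"
  if [ -n "$body" ]; then
    set -- "$@" -H "Content-Type: application/json" -d "$body"
  fi
  curl "$@" "http://127.0.0.1:$DAEMON_PORT$path"
}

run_iteration() {
  qa_call GET /screenshot                    # step 1: redirect to a PNG file in a live run
  qa_call GET /elements                      # step 2
  qa_call GET /state/snapshot                # step 3
  # Step 4 (choosing the action) is the agent's job; a fixed tap stands in for it.
  qa_call POST /session/acquire              # step 5
  qa_call POST /tap '{"x": 120, "y": 300}'   # step 6: body shape is hypothetical
  qa_call GET /screenshot                    # step 7
  qa_call POST /session/release              # step 8
}

run_iteration
```

A live version would also want to release the session when a step fails, so the device lock is not left held.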
## Modes
**Local-USB mode (default).** Daemon binds loopback only; no Tailscale
required. The spawning skill gets full-surface access. Best for solo
development.
**Tailnet mode (`--tailnet`).** Daemon additionally binds the Tailscale
interface (never `0.0.0.0`). Requires `tailscaled` to be running locally and
the daemon to be able to read `/var/run/tailscale.sock`. Fails closed if the
socket is missing, permission-denied, or returns an unparseable WhoIs
response. Remote agents call `POST /auth/mint` over the tailnet; the daemon
canonicalizes the identity via WhoIs, checks the allowlist file, and mints a
session token. See `ios-qa/docs/tailscale-acl-example.md`.
**Capability tiers (tailnet mode).** Minted tokens default to `interact`
(taps, swipes, typing). Tiers above `interact` require an explicit owner mint.
From lowest to highest:
- **observe:** `/screenshot`, `/elements`, `GET /state/*`, `/healthz`,
`/session/heartbeat`.
- **interact:** observe + `/tap`, `/swipe`, `/type`.
- **mutate:** interact + `POST /state/<key>`.
- **restore:** mutate + `POST /state/restore`.
Owner mints via `gstack-ios-qa-mint --remote <identity> --capability <tier>`
on the Mac. Self-service mint over tailnet only succeeds for already-allowlisted
identities.
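Because each tier includes the one below it, an authorization check reduces to comparing ranks. The sketch below follows the tier list in this section; the function names are hypothetical, the HTTP method for `/healthz` is assumed to be `GET`, `/session/heartbeat` is matched for any method because the document does not state one, and `/state/score` is a made-up key.

```bash
# Rank of each tier; a higher rank includes everything below it.
tier_rank() {
  case "$1" in
    observe)  echo 0 ;;
    interact) echo 1 ;;
    mutate)   echo 2 ;;
    restore)  echo 3 ;;
    *)        return 1 ;;
  esac
}

# Lowest tier allowed to call METHOD ($1) on PATH ($2).
required_tier() {
  case "$1 $2" in
    "POST /state/restore")                      echo restore ;;
    "POST /state/"*)                            echo mutate ;;
    "POST /tap" | "POST /swipe" | "POST /type") echo interact ;;
    "GET /screenshot" | "GET /elements")        echo observe ;;
    "GET /state/"* | "GET /healthz")            echo observe ;;
    *" /session/heartbeat")                     echo observe ;;
    *)                                          return 1 ;;   # unknown route: deny
  esac
}

# tier_allows TOKEN_TIER METHOD PATH -> exit status 0 if permitted
tier_allows() {
  have=$(tier_rank "$1") || return 1
  need_tier=$(required_tier "$2" "$3") || return 1
  need=$(tier_rank "$need_tier")
  [ "$have" -ge "$need" ]
}

tier_allows interact POST /tap && echo "interact may tap"
tier_allows interact POST /state/score || echo "interact may not write state"
```

Unknown tiers and unknown routes both fall through to a denial, which matches the fail-closed posture described for tailnet mode.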
**Recording mode (`--recording`).** DebugOverlay renders a small diagonal
"AGENT DEMO" watermark in a corner so screencasts are unambiguous about the
device being agent-driven.
## Demo mode
If the user says "demo", "demo mode", "show me", or "I want to see it
working", run in **DEMO MODE**. This changes how the agent interacts with
the app:
**DEMO MODE OVERRIDES ALL OTHER RULES.** When demo mode is active, the
agent MUST drive every action through visible UI (`/tap`, `/swipe`, `/type`)
and NEVER use `POST /state/*` writes to skip steps. Viewers see the agent
type every key, tap every button. The on-device DebugOverlay attribution
chip shows "Driven by Claude Code (demo)" or the remote agent identity.
In demo mode, the screencap rate is bumped to 4fps so the recording feels
live.
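The demo-mode rule amounts to a request filter that refuses state writes. A minimal sketch, assuming the agent routes every request through one wrapper; `demo_allows` and the `/state/score` key are hypothetical.

```bash
# Returns 0 if METHOD ($1) PATH ($2) is permitted in demo mode, 1 if it must be refused.
demo_allows() {
  case "$1 $2" in
    "POST /state/"*) return 1 ;;   # state writes would skip visible UI steps
    *)               return 0 ;;
  esac
}

demo_allows POST /tap && echo "tap is allowed in demo mode"
demo_allows POST /state/score || echo "state write refused in demo mode"
```

Reads such as `GET /state/snapshot` stay allowed, since only writes let the agent skip steps that viewers should see.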
## Failure modes + recovery
| Symptom | Likely cause | Action |
|---|---|---|
| `curl: connection refused` to daemon | daemon crashed | Re-run `/ios-qa`; spawn-race lock will fail closed |
| `403 identity_not_allowed` from `/auth/mint` | identity missing from allowlist | Run `gstack-ios-qa-mint --remote <identity>` on the Mac |
| `409 schema_mismatch` on `/state/restore` | snapshot from older app build | Discard the snapshot; re-capture |
| `503 device_disconnected` from proxy | USB tunnel dropped | Reconnect device; daemon auto-reconnects within 30s |
| `429 rate_limited` from `/auth/mint` | >10 mints/min from one identity | Wait 60s; check audit log for anomalies |
| `413 body_too_large` on `/state/restore` | snapshot >1MB | Increase `--max-body` or trim snapshot |
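
The HTTP rows of the table can be turned into a small dispatcher so the agent reacts consistently. This is a sketch under stated assumptions: `recovery_action` is a hypothetical helper, and it assumes the status code and error code have already been parsed out of the response.

```bash
# Hypothetical dispatcher: recovery_action STATUS ERROR_CODE prints the next step.
recovery_action() {
  case "$1 $2" in
    "403 identity_not_allowed") echo "ask owner to run gstack-ios-qa-mint --remote <identity>" ;;
    "409 schema_mismatch")      echo "discard snapshot and re-capture" ;;
    "503 device_disconnected")  echo "reconnect device, retry for up to 30s" ;;
    "429 rate_limited")         echo "wait 60s, then check the audit log" ;;
    "413 body_too_large")       echo "trim snapshot or raise --max-body" ;;
    *)                          echo "unrecognized: stop and report"; return 1 ;;
  esac
}

recovery_action 503 device_disconnected
```

Anything not listed in the table returns a non-zero status, so the caller stops instead of guessing at a recovery.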
## Cleanup
Use `/ios-clean` to remove the DebugBridge SPM dependency and all `#if DEBUG`
wiring before a Release build. This is a convenience flow; the structural
Release-build guard (Package.swift `.when(configuration: .debug)` + CI
`swift build -c release` check) is the safety-critical path.