gstack/ios-qa/SKILL.md.tmpl
Garry Tan c2f2acebf6 feat(ios): hoist DebugBridgeTouch into canonical templates
Bridges.swift.template imports DebugBridgeTouch but no .m/.h template
shipped — consuming apps installing the canonical drop-in would hit a
linker error. Closes that gap with the fixture's verified working code.

Changes:

- New ios-qa/templates/DebugBridgeTouch.{h,m}.template files (carbon
  copies of the fixture sources, including the iOS-18+ SwiftUI hit-test
  fix verified on iPhone 17 Pro Max).
- Package.swift.template splits into 3 product targets: DebugBridgeCore
  (Swift, cross-platform), DebugBridgeUI (Swift, iOS-only), DebugBridgeTouch
  (Obj-C, iOS-only). Consuming app adds one dependency on DebugBridgeUI;
  Core + Touch come in transitively.
- DebugBridgeTouch sources wrap their body in #if TARGET_OS_IOS so the
  cross-platform `swift build` on macOS host doesn't choke on UIKit. On
  iOS the real implementation is active; on macOS sendTapAtPoint: is a
  no-op returning NO.
- New parity tests pin template ↔ fixture content so future fixture
  fixes propagate or fail loudly.
- Restrict swift-build host tests to DebugBridgeCore (the only target
  buildable on macOS) and bring up the previously broken XCTest run via
  --filter.

Verified post-change: real iPhone 17 Pro Max, iOS 26.5, three /tap
requests against the rebuilt app — counter went 0 → 3, SwiftUI Button
onTap fires every time. Templates now sufficient to ship to any
consuming iOS app.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 07:37:31 -07:00


---
name: ios-qa
preamble-tier: 3
version: 1.0.0
description: |
  Live-device iOS QA for SwiftUI apps. Connects to a real iPhone via USB
  CoreDevice IPv6 tunnel, reads Swift source to understand every screen, then
  runs a vision-driven agent loop: screenshot → analyze → decide → act →
  verify → repeat. All interaction happens via HTTP to an embedded
  StateServer in the app under test. Optionally exposes the device farm over
  Tailscale so remote agents (OpenClaw, Codex, any HTTP-capable agent) can
  drive the device from anywhere.
  Use when asked to "ios qa", "test my iPhone app", "find bugs on the device",
  or "qa the iOS app". (gstack)
voice-triggers:
- "iOS quality check"
- "test the iPhone app"
- "run iOS QA"
allowed-tools:
- Bash
- Read
- Write
- Edit
- Grep
- Glob
- AskUserQuestion
triggers:
- ios qa
- test the iphone app
- test my ios app
- find bugs on the device
- qa the ios app
---
{{PREAMBLE}}
# Live-device iOS QA
This skill drives a real iPhone via USB. The agent reads your Swift source,
generates typed state accessors, deploys a debug bridge, and runs a closed
find→fix→verify loop. No simulator, no XCTest, no WebDriverAgent.
## Architecture
```
┌──────────────────────┐   USB CoreDevice (IPv6)   ┌──────────────────┐
│ gstack-ios-qa daemon │ ────────────────────────▶ │ iOS app          │
│ (Mac, bun/TS)        │   bearer + X-Session-Id   │ StateServer      │
│                      │                           │ (loopback only)  │
│ - boot token rotate  │                           │ - /tap /swipe    │
│ - session minting    │                           │ - /type /state   │
│ - audit + redact     │                           │ - /snapshot      │
└──────────────────────┘                           └──────────────────┘
           │ Tailscale (optional, --tailnet)
┌──────────────────────┐
│ Remote agent         │
│ (OpenClaw, etc.)     │
└──────────────────────┘
```
The iOS app's `StateServer` binds loopback only (`::1` + `127.0.0.1`). Tailnet
ingress is exclusively the Mac daemon's job. The daemon validates Tailscale
identities via the local `tailscaled` socket and mints short-lived session
tokens (default 1h) for remote agents.
## Prerequisites
- macOS (the daemon uses `devicectl` from Xcode).
- iPhone connected via USB, paired and trusted.
- Xcode + Swift toolchain installed (`swift --version` reports >= 5.9).
- App source available on disk, with at least one `@Observable` class.
- For remote-control mode: Tailscale installed and the user logged in.
## Phase 0: Session warm-start (optional)
If `~/.gstack/ios-qa-session.json` exists and the device is still connected,
skip Phases 1-2 and jump to Phase 3. The session cache holds the rotated token,
UDID, tunnel address, and accessor hash. Invalidate the cache when:
- The user passes `--cold` to force a full bootstrap.
- An accessor hash mismatch is detected on the first state query.
- The daemon reports the cached UDID is no longer connected.
```bash
SESSION="$HOME/.gstack/ios-qa-session.json"
if [ -f "$SESSION" ] && [ "${COLD:-}" != "1" ]; then
  # Read one field from the cache. The path is passed as an argument rather
  # than interpolated into the Python source, so odd characters cannot break it.
  read_field() {
    python3 -c 'import json,sys; print(json.load(open(sys.argv[1]))[sys.argv[2]])' "$SESSION" "$1"
  }
  CACHED_UDID=$(read_field udid)
  CACHED_PORT=$(read_field daemon_port)
  # A healthy daemon on the cached port is the warm-start signal.
  if curl -sf "http://127.0.0.1:$CACHED_PORT/healthz" > /dev/null; then
    echo "Warm start: daemon alive, device $CACHED_UDID connected"
  fi
fi
```
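
The three invalidation rules above can be read as a single predicate. The sketch below is illustrative only: the function name and its argument order are assumptions, and it compares the accessor hash up front, whereas the skill detects a mismatch on the first state query.

```bash
# Sketch: return 0 for a warm start, 1 when a full bootstrap is needed.
# Hypothetical helper; the real cache schema and checks may differ.
# Usage: should_warm_start <cold> <cached_hash> <current_hash> <cached_udid> [connected_udid...]
should_warm_start() {
  local cold="$1" cached_hash="$2" current_hash="$3" cached_udid="$4"
  shift 4
  # Rule 1: --cold always forces a full bootstrap.
  if [ "$cold" = "1" ]; then return 1; fi
  # Rule 2: an accessor hash mismatch invalidates the cache.
  if [ "$cached_hash" != "$current_hash" ]; then return 1; fi
  # Rule 3: the cached UDID must still be among the connected devices.
  local udid
  for udid in "$@"; do
    if [ "$udid" = "$cached_udid" ]; then return 0; fi
  done
  return 1
}
```

Any rule that fails falls through to the cold path, so a stale or partial cache never produces a warm start.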
## Phase 1: Read source, plan codegen
1. Walk the app source (passed as `--source <dir>`) and identify all `@Observable`
classes. Note any property marked with the `@Snapshotable` wrapper — those
are the snapshot-eligible fields.
2. Run `swift run --package-path $GSTACK_HOME/ios-qa/scripts/gen-accessors-tool gen-accessors --input <source-dir>`.
First invocation builds the swift-syntax dependency tree (cold: 2-5 min).
Subsequent runs are content-hash-cached and finish in ~50ms.
3. Show the user the accessor list and ask whether to install the DebugBridge
SPM dependency into their `Package.swift` (one AskUserQuestion).
## Phase 2: Bootstrap the device bridge
1. Add the `DebugBridge` SPM dependency to the app's `Package.swift`. The package
   ships three Debug-config-only library products:
   - `DebugBridgeCore` (Swift, cross-platform): StateServer + bridge protocols.
   - `DebugBridgeTouch` (Objective-C, iOS-only): KIF-derived in-process touch
     synthesis with iOS 18+ `_UIHitTestContext` SwiftUI hit-testing.
   - `DebugBridgeUI` (Swift, iOS-only): Screenshot / Elements / Mutation
     bridge implementations.

   The app target depends on `DebugBridgeUI` with `.when(configuration: .debug)`
   (transitively pulls in Core + Touch). Release builds refuse to link these
   targets.
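2. Wire the bridges from the `@main` App init, gated on `#if DEBUG`. The imports
   sit at file scope; the calls go inside the App's `init()`:
   ```swift
   // File scope: imports cannot appear inside a type or function body.
   #if DEBUG
   import DebugBridgeCore
   #if canImport(UIKit)
   import DebugBridgeUI
   #endif
   #endif

   // Inside the @main App's init():
   #if DEBUG
   StateServer.shared.start()
   #if canImport(UIKit)
   DebugBridgeUIWiring.installAll()
   #endif
   #endif
   ```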
3. Build + deploy to the device with `xcodebuild -scheme <SchemeName>
-destination 'platform=iOS,id=<UDID>' build install`.
4. Launch via `devicectl device process launch --device <UDID> --console <bundle-id>`.
Capture the boot token printed to `os_log` on first run.
5. Spawn the Mac-side daemon (on-demand) — `gstack-ios-qa-daemon`. Daemon
acquires an exclusive flock on `~/.gstack/ios-qa-daemon.pid`. If another
daemon is alive, the second invocation discovers its port and connects.
6. Daemon immediately calls `POST /auth/rotate` on the iOS StateServer with a
fresh in-memory-only token. The boot token becomes useless ~5s later.
Anything scraping `os_log` past this point sees a dead credential.
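
The rotation in step 6 could look roughly like the sketch below. It is not the daemon's implementation (the daemon is bun/TS). The JSON body shape, the `BOOT_TOKEN` and `STATE_SERVER_URL` variables, and the `CURL` override are all assumptions made for illustration.

```bash
# Sketch only. Assumed, not taken from the skill: the request body shape,
# BOOT_TOKEN, STATE_SERVER_URL, and the CURL override (handy for dry runs).
CURL="${CURL:-curl}"

# Produce 32 random bytes as 64 lowercase hex characters. The value lives
# in a shell variable only and is never written to disk.
fresh_token() {
  head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \t\n'
}

# Replace the boot token with a fresh one. Prints the new token on success
# and returns 1 if the StateServer rejects the request.
rotate_token() {
  local new_token
  new_token=$(fresh_token)
  "$CURL" -sf -X POST \
    -H "Authorization: Bearer ${BOOT_TOKEN}" \
    -H "Content-Type: application/json" \
    -d "{\"token\":\"${new_token}\"}" \
    "${STATE_SERVER_URL}/auth/rotate" > /dev/null || return 1
  printf '%s\n' "$new_token"
}
```

Generating the token inside the rotating process is what makes the value printed to `os_log` a dead credential: nothing that replaces it ever reaches a log or a file.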
## Phase 3: Vision-driven agent loop
Each iteration:
1. `GET /screenshot` (via daemon) → save PNG.
2. `GET /elements` → accessibility tree.
3. `GET /state/snapshot` (only `@Snapshotable` fields) → current state.
4. Decide next action based on what's on the screen vs the test goal.
5. `POST /session/acquire` to grab the device lock.
6. Execute `POST /tap`, `/swipe`, `/type`, or `POST /state/<key>` write.
7. Re-screenshot; compare; record finding if buggy.
8. `POST /session/release` once the iteration is done.

Each authenticated mutating request through the tailnet listener (if remote
mode is active) writes an audit row to
`~/.gstack/security/ios-qa-audit.jsonl`.
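
One iteration of the loop above, sketched as shell functions. The endpoint paths and the bearer + `X-Session-Id` headers come from this document. The `TOKEN`, `SESSION_ID`, and `DAEMON_PORT` variables, the JSON tap body, the output file names, and the `CURL` override are assumptions.

```bash
# Sketch only; variable names and the tap body shape are hypothetical.
CURL="${CURL:-curl}"

# Send one request to the daemon with the bearer and session headers.
qa_call() {
  local method="$1" path="$2" body="${3:-}"
  set -- -sf -X "$method" \
    -H "Authorization: Bearer ${TOKEN}" \
    -H "X-Session-Id: ${SESSION_ID}"
  if [ -n "$body" ]; then
    set -- "$@" -H "Content-Type: application/json" -d "$body"
  fi
  "$CURL" "$@" "http://127.0.0.1:${DAEMON_PORT}${path}"
}

# Run steps 1-8 for a single tap at (x, y), saving artifacts into dir.
qa_iteration() {
  local x="$1" y="$2" dir="${3:-.}" status=0
  qa_call GET /screenshot > "$dir/before.png" || return 1       # 1. capture
  qa_call GET /elements > "$dir/elements.json" || return 1      # 2. a11y tree
  qa_call GET /state/snapshot > "$dir/state.json" || return 1   # 3. state
  # 4. The agent picks the action from the files above; here it is a tap.
  qa_call POST /session/acquire > /dev/null || return 1         # 5. lock
  qa_call POST /tap "{\"x\":$x,\"y\":$y}" > /dev/null || status=1  # 6. act
  qa_call GET /screenshot > "$dir/after.png" || status=1        # 7. verify
  qa_call POST /session/release > /dev/null || status=1         # 8. unlock
  return "$status"
}
```

Once the lock is held, the sketch always attempts the release, even if the tap or the second screenshot fails, so a failed action does not leave the device locked.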
## Modes
**Local-USB mode (default).** Daemon binds loopback only; no Tailscale
required. The spawning skill gets full-surface access. Best for solo
development.

**Tailnet mode (`--tailnet`).** Daemon additionally binds the Tailscale
interface (never `0.0.0.0`). Requires `tailscaled` to be running locally and
the daemon to be able to read `/var/run/tailscale.sock`. Fails closed if the
socket is missing, permission-denied, or returns an unparseable WhoIs
response. Remote agents hit `POST /auth/mint` over tailnet; the daemon
canonicalizes identity via WhoIs, checks the allowlist file, and mints a
session token. See `ios-qa/docs/tailscale-acl-example.md`.

**Capability tiers (tailnet mode).** Minted tokens default to `interact`
(taps, swipes, typing). Higher tiers require an explicit owner mint. Each tier
includes everything below it:
- **observe:** `/screenshot`, `/elements`, `GET /state/*`, `/healthz`,
  `/session/heartbeat`.
- **interact:** observe + `/tap`, `/swipe`, `/type`.
- **mutate:** interact + `POST /state/<key>`.
- **restore:** mutate + `POST /state/restore`.

Owner mints via `gstack-ios-qa-mint --remote <identity> --capability <tier>`
on the Mac. Self-service mint over tailnet only succeeds for already-allowlisted
identities.

**Recording mode (`--recording`).** DebugOverlay renders a small diagonal
"AGENT DEMO" watermark in a corner so screencasts are unambiguous about the
device being agent-driven.
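
The cumulative capability tiers can be sketched as a rank comparison. This is an illustration, not the daemon's code: the function names are hypothetical, and matching `/session/heartbeat` on any HTTP method is an assumption, since the tier list does not state its method.

```bash
# Sketch only; function names are hypothetical.

# Rank tiers so that a higher number includes every tier below it.
tier_rank() {
  case "$1" in
    observe) echo 1 ;;
    interact) echo 2 ;;
    mutate) echo 3 ;;
    restore) echo 4 ;;
    *) echo 0 ;;
  esac
}

# Minimum tier for a request. /state/restore is matched before the
# generic /state/<key> write so it is not mistaken for a mutate call.
required_tier() {
  case "$1 $2" in
    "GET /screenshot"|"GET /elements"|"GET /healthz"|"GET /state/"*) echo observe ;;
    *" /session/heartbeat") echo observe ;;
    "POST /tap"|"POST /swipe"|"POST /type") echo interact ;;
    "POST /state/restore") echo restore ;;
    "POST /state/"*) echo mutate ;;
    *) echo none ;;
  esac
}

# Usage: tier_allows <token_tier> <method> <path>
# Unknown tiers and unknown routes are denied.
tier_allows() {
  local need
  need=$(required_tier "$2" "$3")
  [ "$need" != "none" ] && [ "$(tier_rank "$1")" -ge "$(tier_rank "$need")" ]
}
```

For example, `tier_allows interact POST /tap` succeeds while `tier_allows interact POST /state/counter` fails, where `counter` stands in for any state key.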
## Demo mode
If the user says "demo", "demo mode", "show me", or "I want to see it
working", run in **DEMO MODE**. This changes how the agent interacts with
the app:

**DEMO MODE OVERRIDES ALL OTHER RULES.** When demo mode is active, the
agent MUST drive every action through visible UI (`/tap`, `/swipe`, `/type`)
and NEVER use `POST /state/*` writes to skip steps. Viewers see the agent
type every key, tap every button. The on-device DebugOverlay attribution
chip shows "Driven by Claude Code (demo)" or the remote agent identity.

In demo mode, the screencap rate is bumped to 4 fps so the recording feels
live.
## Failure modes + recovery
| Symptom | Likely cause | Action |
|---|---|---|
| `curl: connection refused` to daemon | daemon crashed | Re-run `/ios-qa`; spawn-race lock will fail closed |
| `403 identity_not_allowed` from `/auth/mint` | identity missing from allowlist | Run `gstack-ios-qa-mint --remote <identity>` on the Mac |
| `409 schema_mismatch` on `/state/restore` | snapshot from older app build | Discard the snapshot; re-capture |
| `503 device_disconnected` from proxy | USB tunnel dropped | Reconnect device; daemon auto-reconnects within 30s |
| `429 rate_limited` from `/auth/mint` | >10 mints/min from one identity | Wait 60s; check audit log for anomalies |
| `413 body_too_large` on `/state/restore` | snapshot >1MB | Increase `--max-body` or trim snapshot |
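
A caller can turn the table above into a retry policy. The sketch below is an assumption about how a client script might do that, not part of the skill; only the status codes and wait times come from the table.

```bash
# Sketch only; retry_delay is a hypothetical helper.
# Print the seconds to wait before retrying, or "none" when a retry alone
# will not help and the operator has to act first.
retry_delay() {
  case "$1" in
    429) echo 60 ;;            # rate limited: wait out the per-minute window
    503) echo 30 ;;            # USB tunnel dropped: daemon reconnects within 30s
    403|409|413) echo none ;;  # allowlist, snapshot, or body size needs fixing
    *) echo none ;;
  esac
}
```

Only `429` and `503` are treated as transient here; the other rows describe conditions that stay broken until the allowlist, the snapshot, or the body size changes.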
## Cleanup
Use `/ios-clean` to remove the DebugBridge SPM dependency and all `#if DEBUG`
wiring before a Release build. This is a convenience flow; the structural
Release-build guard (Package.swift `.when(configuration: .debug)` + CI
`swift build -c release` check) is the safety-critical path.