gstack/.github/workflows/windows-free-tests.yml
Garry Tan 6841d82a98 fix(windows-ci): gen:skill-docs in workflow + known-bad list for env-specific tests
Round 9 of windows-free-tests fixes. Round 8 cleared shard 7; shard 8
surfaced 4 fails:

1+2. test/gen-skill-docs.test.ts golden-file regression for Codex + Factory
   ship skills failed with ENOENT on `.agents/skills/gstack-ship/SKILL.md`
   and `.factory/skills/gstack-ship/SKILL.md`. These are gitignored
   gen-skill-docs outputs that the Mac/Linux CI workflows already
   regenerate elsewhere — the windows-free-tests lane never did.

   Fix: add `bun run gen:skill-docs --host all` step to
   windows-free-tests.yml after `bun install`.

3. test/host-config.test.ts:377 "detect finds claude" asserts the `claude`
   binary is on PATH. True when running inside Claude Code; false on a
   bare CI runner.

4. browse/test/findport.test.ts:117 asserts Bun.serve.stop() is
   fire-and-forget (returns undefined). Bun's Windows behavior for this
   polyfill differs; the assertion is Bun-on-non-Windows-specific.

Both 3 and 4 are environment/runtime-specific failures that don't fit a
regex pattern. Added a KNOWN_WINDOWS_INCOMPATIBLE explicit list to
scripts/test-free-shards.ts so they're curated by exact path, with a
reason string. The list is for cases where pattern matching can't infer
the failure shape from the source file alone.
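
The explicit list described above could be sketched roughly as follows. This is a minimal illustration, not the actual contents of scripts/test-free-shards.ts: the `KnownIncompatible` shape and the helpers `windowsExclusionReason` and `windowsSafeSubset` are hypothetical names, and only the two paths and the idea of a reason string come from this commit.

```typescript
// Hypothetical entry shape: an exact test path plus a human-readable reason.
interface KnownIncompatible {
  path: string;
  reason: string;
}

// The two entries named in this commit; the reason wording is illustrative.
const KNOWN_WINDOWS_INCOMPATIBLE: readonly KnownIncompatible[] = [
  {
    path: "test/host-config.test.ts",
    reason: "asserts the `claude` binary is on PATH; absent on a bare CI runner",
  },
  {
    path: "browse/test/findport.test.ts",
    reason: "asserts Bun.serve.stop() returns undefined; Bun on Windows differs",
  },
];

// Normalize separators so Windows-style paths match the curated entries.
function normalizePath(p: string): string {
  return p.replace(/\\/g, "/");
}

// Returns the curated reason if the file is excluded, otherwise undefined.
function windowsExclusionReason(testPath: string): string | undefined {
  const wanted = normalizePath(testPath);
  return KNOWN_WINDOWS_INCOMPATIBLE.find((e) => e.path === wanted)?.reason;
}

// Keeps only the files that are not on the explicit list.
function windowsSafeSubset(testPaths: string[]): string[] {
  return testPaths.filter((p) => windowsExclusionReason(p) === undefined);
}
```

Matching on the exact path rather than a pattern keeps each exclusion auditable: every skipped file carries its own reason, which is the property the regex-based filter cannot provide for these two cases.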

Curated subset: 66 → 64 tests (~50% of free suite). 14 unit tests in
test/test-free-shards.test.ts still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 00:15:50 -07:00


name: Windows Free Tests
# Curated subset of the free test suite that runs on windows-latest.
#
# Codex's v1.18.0.0 review flagged that the existing evals.yml workflow uses
# a Linux container, so a windows-latest matrix entry there isn't a drop-in.
# This workflow is non-container, runs the curated Windows-safe subset, plus
# targeted resolver tests that exercise the Bun.which-based claude binary
# resolution + the GSTACK_CLAUDE_BIN override path on Windows.
#
# What this DOES NOT do (out of scope for v1.18.0.0):
# - Run the full free suite on Windows. The 24 tests that hardcode /bin/sh,
#   spawn('sh',...), or raw /tmp/ paths are excluded by scripts/test-free-shards.ts
#   --windows-only. They need POSIX-bound surfaces to be ported off shell
#   primitives before they can run on Windows. Tracked as a follow-up TODO.
# - Run Playwright/browser-backed tests. Browse server bring-up on Windows is
#   a separate concern (PR #1238 windows-pty-bun-pty-fix is in flight).

on:
  pull_request:
    branches: [main]
  workflow_dispatch:

concurrency:
  group: windows-free-${{ github.head_ref }}
  cancel-in-progress: true

jobs:
  windows-free-tests:
    runs-on: windows-latest
    timeout-minutes: 15
    steps:
      - uses: actions/checkout@v4

      - uses: oven-sh/setup-bun@v1
        with:
          bun-version: latest

      - name: Configure git identity (required by tests that init temp repos)
        run: |
          git config --global user.email "windows-ci@gstack.test"
          git config --global user.name "Windows CI"
          git config --global init.defaultBranch main
        shell: bash

      - name: Install dependencies
        run: bun install --frozen-lockfile

      - name: Build server-node.mjs (required by Windows browse path)
        # browse/src/cli.ts module-level throws on Windows if server-node.mjs
        # is missing: Bun can't drive Playwright's Chromium on Windows
        # (oven-sh/bun#4253). The bundle must exist for any test that
        # transitively loads cli.ts to even import. We build only the
        # Node-compatible server bundle here; full `bun run build` would
        # also compile every binary which is slow and unnecessary for tests.
        run: bash browse/scripts/build-node-server.sh
        shell: bash

      - name: Generate host SKILL.md outputs (.agents, .factory)
        # The golden-file regression tests in test/gen-skill-docs.test.ts read
        # .agents/skills/gstack-ship/SKILL.md and .factory/skills/gstack-ship/
        # SKILL.md. Both are gitignored and generated on demand by gen:skill-docs.
        # On Mac/Linux CI the existing eval workflow regenerates these as part
        # of its own pipeline; the windows-free-tests lane doesn't share that
        # so it must regenerate explicitly.
        run: bun run gen:skill-docs --host all
        shell: bash

      - name: Show curated subset (for build log audit trail)
        run: bun run scripts/test-free-shards.ts --windows-only --list
        shell: bash

      - name: Run curated Windows-safe subset
        run: bun run test:windows
        shell: bash

      - name: Targeted Claude resolver tests (real PATHEXT coverage on Windows)
        run: bun test browse/test/claude-bin.test.ts
        shell: bash

      - name: gstack-paths helper test (resolves $GSTACK_STATE_ROOT etc. on Windows)
        run: bun test test/gstack-paths.test.ts
        shell: bash
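
The header comment mentions a Bun.which-based resolver with a GSTACK_CLAUDE_BIN override, which the targeted resolver step exercises. A minimal sketch of that precedence is shown below, under stated assumptions: `resolveClaudeBin` and the `Which` type are hypothetical names, the lookup function is injected so the example runs without Bun, and returning the override as-is (without validating it) is an assumption rather than the project's confirmed behavior.

```typescript
// Hypothetical lookup signature: returns an absolute path or null,
// which is the same shape Bun.which(command) returns.
type Which = (command: string) => string | null;

// Assumed precedence: a non-empty GSTACK_CLAUDE_BIN wins, otherwise
// fall back to a PATH lookup for "claude".
function resolveClaudeBin(
  env: Record<string, string | undefined>,
  which: Which,
): string | null {
  const override = (env["GSTACK_CLAUDE_BIN"] ?? "").trim();
  if (override.length > 0) {
    return override;
  }
  return which("claude");
}

// Example with a fake lookup standing in for the real PATH search.
const fakeWhich: Which = (command) =>
  command === "claude" ? "C:\\tools\\claude.cmd" : null;

const fromOverride = resolveClaudeBin(
  { GSTACK_CLAUDE_BIN: "D:\\bin\\claude.exe" },
  fakeWhich,
);
const fromPath = resolveClaudeBin({}, fakeWhich);
```

Injecting the lookup keeps the precedence logic testable on any platform; under Bun, `Bun.which` could be passed in place of `fakeWhich`, while the Windows-specific PATHEXT behavior is left to the real test on the runner.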