gstack/test/skill-e2e-plan-design-plan-mode.test.ts
Garry Tan 473c8717d8 test: rename plan-mode-handshake-helpers to plan-mode-helpers, strengthen smokes
Rename test/helpers/plan-mode-handshake-helpers.ts to
test/helpers/plan-mode-helpers.ts. Keep the write-guard helper that
asserts no Write/Edit tool call before the first AskUserQuestion
(this is what catches silent-bypass regressions the textual smoke
can't see). Rename the API: runPlanModeHandshakeTest to
runPlanModeSkillTest, assertHandshakeShape to assertNotHandshakeShape.
Extend the capture struct with exitPlanModeBeforeAsk.
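The write-guard idea above can be sketched as a pure function over the ordered tool calls of a run. This is a minimal sketch, not the helper's actual code: the `ToolCall` shape, the `checkBeforeFirstAsk` name, and the `'ExitPlanMode'` tool name are assumptions for illustration; only the two result field names come from the capture struct used in the tests.

```typescript
// Hypothetical event shape; the real capture struct in
// test/helpers/plan-mode-helpers.ts may look different.
interface ToolCall {
  tool: string; // e.g. 'Read', 'Write', 'Edit', 'AskUserQuestion'
}

interface GuardResult {
  writeOrEditBeforeAsk: boolean;
  exitPlanModeBeforeAsk: boolean;
}

// Inspect only the calls that happened before the first AskUserQuestion.
function checkBeforeFirstAsk(calls: ToolCall[]): GuardResult {
  const firstAsk = calls.findIndex((c) => c.tool === 'AskUserQuestion');
  // Assumption: if no question was ever asked, every call counts as "before".
  const before = firstAsk === -1 ? calls : calls.slice(0, firstAsk);
  return {
    writeOrEditBeforeAsk: before.some(
      (c) => c.tool === 'Write' || c.tool === 'Edit',
    ),
    // 'ExitPlanMode' is an assumed tool name, inferred from the field name.
    exitPlanModeBeforeAsk: before.some((c) => c.tool === 'ExitPlanMode'),
  };
}
```

A write that happens after the first question is allowed under this sketch; only writes that precede it count as a silent bypass.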

Rewrite the four per-skill E2E tests (plan-ceo, plan-eng, plan-design,
plan-devex) as smoke tests that assert the skill's Step 0 question
fires first, not an A/C handshake. Each test picks a cheap first
answer (HOLD, TRIAGE, numeric score) so the run terminates quickly.

Keep test/skill-e2e-plan-mode-no-op.test.ts as the outside-plan-mode
non-interference regression, per codex outside-voice review: deleting
it would lose coverage for "the hoisted section stays quiet when plan
mode is absent."

Replace the gen-skill-docs.test.ts handshake describe block (lines
2778+) with a plan-mode-info describe block that:
- scans every generated SKILL.md under the repo root + every host
  subdir (.agents, .openclaw, .opencode, .factory, .hermes, .kiro,
  .cursor, .slate) and asserts "## Plan Mode Handshake" is absent
- asserts "## Skill Invocation During Plan Mode" lands in the first
  15KB of each of the four review skills' generated SKILL.md

Both assertions run on every bun test. A PR that re-introduces the
handshake resolver fails CI immediately.
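The two assertions can be sketched as string checks on the contents of one generated SKILL.md. This is an illustrative sketch only: the function names are hypothetical, file discovery across the host subdirs is left out, and "first 15KB" is approximated here as the first 15 * 1024 characters rather than bytes.

```typescript
const FORBIDDEN_HEADING = '## Plan Mode Handshake';
const REQUIRED_HEADING = '## Skill Invocation During Plan Mode';
// Assumption: 15KB is approximated as 15 * 1024 characters, not bytes.
const PREFIX_LENGTH = 15 * 1024;

// True when the removed handshake section has crept back into a SKILL.md.
function hasHandshakeHeading(skillMd: string): boolean {
  return skillMd.includes(FORBIDDEN_HEADING);
}

// True when the plan-mode info heading sits fully inside the leading prefix.
function planModeInfoIsEarly(skillMd: string): boolean {
  return skillMd.slice(0, PREFIX_LENGTH).includes(REQUIRED_HEADING);
}
```

In a bun test, each generated file would be read and passed through both functions, expecting `false` from the first for every file and `true` from the second for the four review skills.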

Update test/e2e-harness-audit.test.ts to reference the renamed
runPlanModeSkillTest. Update test/helpers/touchfiles.ts entries to
point at the new resolver owner (generate-completion-status.ts) and
the renamed helper, and align per-skill touchfile keys.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 01:59:24 -07:00

32 lines
1.2 KiB
TypeScript

/**
 * plan-design-review plan-mode smoke test (gate tier, paid).
 *
 * See test/skill-e2e-plan-ceo-plan-mode.test.ts for the shared assertion
 * contract. Exercises the same assertions against /plan-design-review.
 */
import { describe, test, expect } from 'bun:test';
import {
  runPlanModeSkillTest,
  assertNotHandshakeShape,
} from './helpers/plan-mode-helpers';

const shouldRun = !!process.env.EVALS && process.env.EVALS_TIER === 'gate';
const describeE2E = shouldRun ? describe : describe.skip;

describeE2E('plan-design-review plan-mode smoke (gate)', () => {
  test('goes straight to first design question, no handshake, no silent writes', async () => {
    const result = await runPlanModeSkillTest({
      skillName: 'plan-design-review',
      // First question for design review varies; pick any reasonable match.
      // The substring match falls back to the first option if no match.
      firstAnswerSubstring: '7',
    });
    expect(result.askUserQuestions.length).toBeGreaterThanOrEqual(1);
    assertNotHandshakeShape(result.askUserQuestions[0]!);
    expect(result.writeOrEditBeforeAsk).toBe(false);
    expect(result.exitPlanModeBeforeAsk).toBe(false);
  }, 120_000);
});
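
The fallback described in the `firstAnswerSubstring` comment can be sketched as below. This is a guess at the behavior, not the helper's code: `pickOption` is a hypothetical name, options are assumed to be plain label strings, and the match is assumed to be case-sensitive.

```typescript
// Hypothetical helper: choose the option whose label contains the substring,
// otherwise fall back to the first option (undefined when there are none).
function pickOption(
  options: string[],
  firstAnswerSubstring: string,
): string | undefined {
  const match = options.find((label) => label.includes(firstAnswerSubstring));
  return match ?? options[0];
}
```

Under these assumptions, `pickOption(['5/10', '7/10'], '7')` selects `'7/10'`, while a question with no numeric options still gets an answer, so the run terminates either way.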