mirror of
https://github.com/garrytan/gstack.git
synced 2026-05-07 14:06:42 +02:00
7b4738bca0
* feat(test/helpers): runPlanSkillFloorCheck - minimal AskUserQuestion-floor observer

  Adds a focused PTY observer that exits at the first non-permission numbered-option render. Catches the May 2026 transcript-bug class (model wrote plan + ExitPlanMode without firing any AUQ) without needing to fingerprint or navigate past the AUQ.

  Why separate from runPlanSkillCounting: plan-mode AUQs render every option on a single logical line via cursor-positioning escapes that stripAnsi can't simulate, so parseNumberedOptions returns < 2 options and never records a fingerprint. Counting tests work on 25-min budgets because eventually one frame parses cleanly; gate-tier floor tests need to exit early on the first observation. Trades fingerprint precision for early-exit reliability.

  Also drops the COMPLETION_SUMMARY_RE check from this helper: it matches "GSTACK REVIEW REPORT" anywhere in the buffer, including when the agent does recon by reading existing plan files. plan_ready (claude's actual "Ready to execute" confirmation) is the reliable terminal signal for "agent finished without asking."

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(resolvers): generateAntiShortcutClause shared resolver

  Adds a {{ANTI_SHORTCUT_CLAUSE}} placeholder backed by a single resolver function in scripts/resolvers/review.ts. Plan-* review skills can now include the clause via one placeholder line in their .tmpl rather than cloning the paragraph four times. Future tightening edits one resolver; all four skills update on the next gen-skill-docs.

  Wired into the existing RESOLVERS map alongside generateReviewDashboard and generatePlanFileReviewReport. No gen-skill-docs.ts change is needed because the generator already does generic placeholder substitution against that map.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(plan-*-review): anti-shortcut clause in all four review skills

  Inserts the {{ANTI_SHORTCUT_CLAUSE}} placeholder immediately after the **Anti-skip rule:** paragraph in plan-{eng,ceo,design,devex}-review SKILL.md.tmpl. The four templates use different surrounding section headers (eng "Review Sections (after scope is agreed)" vs ceo/design/devex variants), so anchoring on the paragraph rather than the heading works across all four.

  Closes the May 2026 transcript-bug loophole: existing STOP gates name forbidden actions only AFTER a per-section finding is identified. The anti-shortcut clause adds the pre-emptive rule ("the plan file is the OUTPUT of the interactive review, not a substitute for it"), covering the case the transcript exhibited (skip per-section walk, dump every finding into one plan write, call ExitPlanMode).

  Regenerated SKILL.md for all hosts via bun run gen:skill-docs --host all.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test: gate-tier AskUserQuestion floor tests for all plan-* review skills

  Adds 4 finding-floor tests (one per plan-* skill) that catch the May 2026 transcript-bug class: the model wrote a plan and called ExitPlanMode without firing any review-phase AskUserQuestion. Asserts via runPlanSkillFloorCheck that ANY non-permission AUQ render fires before the agent reaches plan_ready.

  Verified:
  - Eng floor: passed in 59s
  - CEO floor: passed in 197s
  - Design floor: passed
  - Devex floor: passed
  - Total ~$2-6 per CI run; only triggers on diff against the 4 plan-* templates, the shared resolver review.ts, the seeds fixture, or the PTY runner helper.

  Fixtures live in test/fixtures/forcing-finding-seeds.ts, one constant per skill. Each seed is engineered to force at least one obvious finding under that skill's review focus (architectural smell for eng, scope-creep for ceo, UI-slop for design, painful onboarding for devex).

  Touchfiles wiring:
  - E2E_TOUCHFILES: 4 plan-*-finding-floor entries with deps on the matching skill template, the shared resolver, the seeds fixture, and the PTY runner helper
  - E2E_TIERS: all 4 entries marked 'gate'
  - touchfiles.test.ts: count assertion bumped 21→22 with explicit plan-ceo-finding-floor containment check

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v1.27.1.0)

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
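The touchfiles wiring described above amounts to diff-based test selection: a test runs only when a changed file matches one of its declared dependencies. A minimal sketch of that idea follows. The `demoTouchfiles` entries, the directory paths, and the `selectTests` and `dependencyMatches` helpers are all assumptions made for illustration; they do not reproduce the repository's actual E2E_TOUCHFILES contents or its matching logic.

```typescript
// Hypothetical map from test name to the files it depends on.
// Paths are illustrative only, not the repository's real entries.
const demoTouchfiles: Record<string, string[]> = {
  'plan-eng-finding-floor': ['plan-eng-review/', 'scripts/resolvers/review.ts'],
  'plan-ceo-finding-floor': ['plan-ceo-review/', 'scripts/resolvers/review.ts'],
};

// Assumed matching rule: a dependency ending in "/" matches any file under
// that directory; anything else must equal the changed path exactly.
function dependencyMatches(dep: string, changedFile: string): boolean {
  return dep.endsWith('/') ? changedFile.startsWith(dep) : changedFile === dep;
}

// Return the sorted names of tests with at least one dependency touched by the diff.
function selectTests(
  touchfiles: Record<string, string[]>,
  changedFiles: string[],
): string[] {
  return Object.entries(touchfiles)
    .filter(([, deps]) =>
      deps.some((dep) => changedFiles.some((file) => dependencyMatches(dep, file))),
    )
    .map(([name]) => name)
    .sort();
}
```

Under these assumptions, a diff that touches only the shared resolver selects both tests, while a diff that touches a single skill directory selects only that skill's test.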
81 lines
4.4 KiB
TypeScript
/**
 * RESOLVERS record: maps {{PLACEHOLDER}} names to generator functions.
 * Each resolver takes a TemplateContext and returns the replacement string.
 */

import type { TemplateContext, ResolverFn } from './types';

// Domain modules
import { generatePreamble } from './preamble';
import { generateTestFailureTriage } from './preamble';
import { generateCommandReference, generateSnapshotFlags, generateBrowseSetup } from './browse';
import { generateDesignMethodology, generateDesignHardRules, generateDesignOutsideVoices, generateDesignReviewLite, generateDesignSketch, generateDesignSetup, generateDesignMockup, generateDesignShotgunLoop, generateTasteProfile, generateUXPrinciples } from './design';
import { generateTestBootstrap, generateTestCoverageAuditPlan, generateTestCoverageAuditShip, generateTestCoverageAuditReview } from './testing';
import { generateReviewDashboard, generatePlanFileReviewReport, generateAntiShortcutClause, generateSpecReviewLoop, generateBenefitsFrom, generateCodexSecondOpinion, generateAdversarialStep, generateCodexPlanReview, generatePlanCompletionAuditShip, generatePlanCompletionAuditReview, generatePlanVerificationExec, generateScopeDrift, generateCrossReviewDedup } from './review';
import { generateSlugEval, generateSlugSetup, generateBaseBranchDetect, generateDeployBootstrap, generateQAMethodology, generateCoAuthorTrailer, generateChangelogWorkflow } from './utility';
import { generateLearningsSearch, generateLearningsLog } from './learnings';
import { generateConfidenceCalibration } from './confidence';
import { generateInvokeSkill } from './composition';
import { generateReviewArmy } from './review-army';
import { generateDxFramework } from './dx';
import { generateModelOverlay } from './model-overlay';
import { generateGBrainContextLoad, generateGBrainSaveResults } from './gbrain';
import { generateQuestionPreferenceCheck, generateQuestionLog, generateInlineTuneFeedback } from './question-tuning';
import { generateMakePdfSetup } from './make-pdf';

export const RESOLVERS: Record<string, ResolverFn> = {
  SLUG_EVAL: generateSlugEval,
  SLUG_SETUP: generateSlugSetup,
  COMMAND_REFERENCE: generateCommandReference,
  SNAPSHOT_FLAGS: generateSnapshotFlags,
  PREAMBLE: generatePreamble,
  BROWSE_SETUP: generateBrowseSetup,
  BASE_BRANCH_DETECT: generateBaseBranchDetect,
  QA_METHODOLOGY: generateQAMethodology,
  DESIGN_METHODOLOGY: generateDesignMethodology,
  DESIGN_HARD_RULES: generateDesignHardRules,
  UX_PRINCIPLES: generateUXPrinciples,
  DESIGN_OUTSIDE_VOICES: generateDesignOutsideVoices,
  DESIGN_REVIEW_LITE: generateDesignReviewLite,
  REVIEW_DASHBOARD: generateReviewDashboard,
  PLAN_FILE_REVIEW_REPORT: generatePlanFileReviewReport,
  ANTI_SHORTCUT_CLAUSE: generateAntiShortcutClause,
  TEST_BOOTSTRAP: generateTestBootstrap,
  TEST_COVERAGE_AUDIT_PLAN: generateTestCoverageAuditPlan,
  TEST_COVERAGE_AUDIT_SHIP: generateTestCoverageAuditShip,
  TEST_COVERAGE_AUDIT_REVIEW: generateTestCoverageAuditReview,
  TEST_FAILURE_TRIAGE: generateTestFailureTriage,
  SPEC_REVIEW_LOOP: generateSpecReviewLoop,
  DESIGN_SKETCH: generateDesignSketch,
  DESIGN_SETUP: generateDesignSetup,
  DESIGN_MOCKUP: generateDesignMockup,
  DESIGN_SHOTGUN_LOOP: generateDesignShotgunLoop,
  BENEFITS_FROM: generateBenefitsFrom,
  CODEX_SECOND_OPINION: generateCodexSecondOpinion,
  ADVERSARIAL_STEP: generateAdversarialStep,
  SCOPE_DRIFT: generateScopeDrift,
  DEPLOY_BOOTSTRAP: generateDeployBootstrap,
  CODEX_PLAN_REVIEW: generateCodexPlanReview,
  PLAN_COMPLETION_AUDIT_SHIP: generatePlanCompletionAuditShip,
  PLAN_COMPLETION_AUDIT_REVIEW: generatePlanCompletionAuditReview,
  PLAN_VERIFICATION_EXEC: generatePlanVerificationExec,
  CO_AUTHOR_TRAILER: generateCoAuthorTrailer,
  LEARNINGS_SEARCH: generateLearningsSearch,
  LEARNINGS_LOG: generateLearningsLog,
  CONFIDENCE_CALIBRATION: generateConfidenceCalibration,
  INVOKE_SKILL: generateInvokeSkill,
  CHANGELOG_WORKFLOW: generateChangelogWorkflow,
  REVIEW_ARMY: generateReviewArmy,
  CROSS_REVIEW_DEDUP: generateCrossReviewDedup,
  DX_FRAMEWORK: generateDxFramework,
  MODEL_OVERLAY: generateModelOverlay,
  TASTE_PROFILE: generateTasteProfile,
  BIN_DIR: (ctx) => ctx.paths.binDir,
  GBRAIN_CONTEXT_LOAD: generateGBrainContextLoad,
  GBRAIN_SAVE_RESULTS: generateGBrainSaveResults,
  QUESTION_PREFERENCE_CHECK: generateQuestionPreferenceCheck,
  QUESTION_LOG: generateQuestionLog,
  INLINE_TUNE_FEEDBACK: generateInlineTuneFeedback,
  MAKE_PDF_SETUP: generateMakePdfSetup,
};
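As a rough illustration of how a map shaped like RESOLVERS might be consumed, the sketch below performs generic `{{PLACEHOLDER}}` substitution against a small resolver record. The `DemoContext` type, the `demoResolvers` map, the `GREETING` entry, and the `renderTemplate` function are hypothetical names for this example only. The real `TemplateContext`, `ResolverFn`, and generator live in files not shown here and may behave differently, for example in how unknown placeholders are handled.

```typescript
// Hypothetical stand-ins for the real TemplateContext / ResolverFn types,
// which are defined in ./types and are not shown in this file.
interface DemoContext {
  paths: { binDir: string };
}
type DemoResolverFn = (ctx: DemoContext) => string;

// A tiny resolver record in the same shape as RESOLVERS.
const demoResolvers: Record<string, DemoResolverFn> = {
  BIN_DIR: (ctx) => ctx.paths.binDir,
  GREETING: () => 'hello',
};

// Replace every {{NAME}} token with the output of its resolver.
// Assumption: unknown placeholders throw, so a typo in a template fails loudly.
function renderTemplate(
  template: string,
  resolvers: Record<string, DemoResolverFn>,
  ctx: DemoContext,
): string {
  return template.replace(/\{\{([A-Z0-9_]+)\}\}/g, (_match, name: string) => {
    const resolver = resolvers[name];
    if (!resolver) {
      throw new Error(`No resolver registered for {{${name}}}`);
    }
    return resolver(ctx);
  });
}

const demoOutput = renderTemplate(
  'Run {{BIN_DIR}}/tool and say {{GREETING}}.',
  demoResolvers,
  { paths: { binDir: '/opt/demo/bin' } },
);
console.log(demoOutput); // prints "Run /opt/demo/bin/tool and say hello."
```

With this shape, adding a new placeholder only requires a new key in the record. The substitution loop itself stays unchanged, which matches the commit note that no generator change was needed for ANTI_SHORTCUT_CLAUSE.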