test: add deterministic contributor mode preamble validation

40 new skill-validation tests (4 checks × 10 skills) verify:
- 0-10 rating scale present
- Calibration example present
- "What would make this a 10" field present
- Reflection is periodic (not per-command)

Update existing E2E contributor eval for new report format.
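
For illustration only, the four deterministic checks listed above could be expressed as a small table of predicates run against each skill's preamble text. This is a rough sketch, not the test code from this commit: the names `PREAMBLE_CHECKS`, `failedChecks`, and `validateSkills`, and every regex, are hypothetical assumptions about how such checks might be written.

```typescript
// Hypothetical sketch: names and patterns are assumptions, not the commit's actual tests.
type PreambleCheck = { name: string; passes: (preamble: string) => boolean };

const PREAMBLE_CHECKS: PreambleCheck[] = [
  // A literal "0-10" scale must be mentioned.
  { name: '0-10 rating scale', passes: (p) => /\b0\s*-\s*10\b/.test(p) },
  // Some calibration example must be present.
  { name: 'calibration example', passes: (p) => /calibration/i.test(p) },
  // The improvement prompt must appear.
  { name: 'what would make this a 10', passes: (p) => /what would make this a 10/i.test(p) },
  // Reflection should be described as periodic and not tied to every command.
  {
    name: 'periodic reflection',
    passes: (p) => /periodic/i.test(p) && !/after every command/i.test(p),
  },
];

// Returns the names of the checks a single preamble fails (empty array means valid).
function failedChecks(preamble: string): string[] {
  return PREAMBLE_CHECKS.filter((c) => !c.passes(preamble)).map((c) => c.name);
}

// Runs all checks for every skill; keys are skill names, values are failed check names.
function validateSkills(preambles: Record<string, string>): Record<string, string[]> {
  const result: Record<string, string[]> = {};
  for (const skill of Object.keys(preambles)) {
    result[skill] = failedChecks(preambles[skill]);
  }
  return result;
}
```

With ten skills and four checks, a table like this yields the 4 × 10 = 40 assertions the message mentions. Because the checks are plain string matches, they run without invoking a model and give the same result every time.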
Garry Tan
2026-03-16 09:44:23 -05:00
parent 685be01b00
commit 6be7974cb4
2 changed files with 41 additions and 0 deletions
@@ -322,10 +322,13 @@ File a contributor report about this issue. Then tell me what you filed.`,
const logFiles = fs.readdirSync(logsDir).filter(f => f.endsWith('.md'));
expect(logFiles.length).toBeGreaterThan(0);
// Verify new reflection-based format
const logContent = fs.readFileSync(path.join(logsDir, logFiles[0]), 'utf-8');
expect(logContent).toContain('Hey gstack team');
expect(logContent).toContain('What I was trying to do');
expect(logContent).toContain('What happened instead');
expect(logContent).toMatch(/rating/i);
expect(logContent).toMatch(/what would make/i);
// Clean up
try { fs.rmSync(contribDir, { recursive: true, force: true }); } catch {}