Commit Graph

16 Commits

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Adam Wilson | a636b7fbf7 | fix test number file path; support smaller runs | 2025-08-18 20:05:24 -06:00 |
| Adam Wilson | 8c6230e0dc | try smaller batches | 2025-08-18 18:00:08 -06:00 |
| Adam Wilson | 09eac1f050 | try smaller batches | 2025-08-18 17:36:01 -06:00 |
| Adam Wilson | 0411049d6b | matrix strategy for tests; remove dead code | 2025-08-18 16:31:32 -06:00 |
| Adam Wilson | 010933aa59 | matrix strategy for tests | 2025-08-18 16:12:31 -06:00 |
| Adam Wilson | 1eadd81d77 | new test for GH actions | 2025-08-16 18:57:08 -06:00 |
| Adam Wilson | 0171af7c94 | fix confusing log message | 2025-07-30 11:16:24 -06:00 |
| Adam Wilson | df14a01fe9 | log full completion result with semantic similarity comparison results | 2025-07-28 11:49:07 -06:00 |
| Adam Wilson | 2659e6e43c | more updates for reflexion | 2025-07-28 10:31:55 -06:00 |
| Adam Wilson | dcff18a058 | logging | 2025-07-27 17:19:07 -06:00 |
| Adam Wilson | a621ad82a9 | Reflexion guardrails updates | 2025-07-27 16:39:06 -06:00 |
| Adam Wilson | 16ba9c15ee | test output for test_02_malicious_prompts | 2025-07-26 08:22:35 -06:00 |
| Adam Wilson | 741629908c | updates for RAG + CoT tests | 2025-07-25 18:11:49 -06:00 |
| Adam Wilson | 3a62ecfae8 | add test 0 results | 2025-07-25 08:47:56 -06:00 |
| Adam Wilson | ae279a512d | log LLM config | 2025-07-23 20:21:42 -06:00 |
| Adam Wilson | cb92890bb9 | break tests into separate files; test 0 results | 2025-07-23 19:06:27 -06:00 |