Commit Graph

16 Commits

Author       SHA1        Message                                                                 Date
Adam Wilson  a636b7fbf7  fix test number file path; support smaller runs                         2025-08-18 20:05:24 -06:00
Adam Wilson  8c6230e0dc  try smaller batches                                                     2025-08-18 18:00:08 -06:00
Adam Wilson  09eac1f050  try smaller batches                                                     2025-08-18 17:36:01 -06:00
Adam Wilson  0411049d6b  matrix strategy for tests; remove dead code                             2025-08-18 16:31:32 -06:00
Adam Wilson  010933aa59  matrix strategy for tests                                               2025-08-18 16:12:31 -06:00
Adam Wilson  1eadd81d77  new test for GH actions                                                 2025-08-16 18:57:08 -06:00
Adam Wilson  0171af7c94  fix confusing log message                                               2025-07-30 11:16:24 -06:00
Adam Wilson  df14a01fe9  log full completion result with semantic similarity comparison results  2025-07-28 11:49:07 -06:00
Adam Wilson  2659e6e43c  more updates for reflexion                                              2025-07-28 10:31:55 -06:00
Adam Wilson  dcff18a058  logging                                                                 2025-07-27 17:19:07 -06:00
Adam Wilson  a621ad82a9  Reflexion guardrails updates                                            2025-07-27 16:39:06 -06:00
Adam Wilson  16ba9c15ee  test output for test_02_malicious_prompts                               2025-07-26 08:22:35 -06:00
Adam Wilson  741629908c  updates for RAG + CoT tests                                             2025-07-25 18:11:49 -06:00
Adam Wilson  3a62ecfae8  add test 0 results                                                      2025-07-25 08:47:56 -06:00
Adam Wilson  ae279a512d  log LLM config                                                          2025-07-23 20:21:42 -06:00
Adam Wilson  cb92890bb9  break tests into separate files; test 0 results                         2025-07-23 19:06:27 -06:00