Commit Graph

12 Commits

Author       SHA1        Message                                                                 Date
Adam Wilson  2f2eab944a  fix                                                                     2025-08-20 06:20:42 -06:00
Adam Wilson  cc124a91a3  support batch tests                                                     2025-08-19 20:09:34 -06:00
Adam Wilson  0171af7c94  fix confusing log message                                               2025-07-30 11:16:24 -06:00
Adam Wilson  df14a01fe9  log full completion result with semantic similarity comparison results  2025-07-28 11:49:07 -06:00
Adam Wilson  2659e6e43c  more updates for reflexion                                              2025-07-28 10:31:55 -06:00
Adam Wilson  dcff18a058  logging                                                                 2025-07-27 17:19:07 -06:00
Adam Wilson  a621ad82a9  Reflexion guardrails updates                                            2025-07-27 16:39:06 -06:00
Adam Wilson  16ba9c15ee  test output for test_02_malicious_prompts                               2025-07-26 08:22:35 -06:00
Adam Wilson  741629908c  updates for RAG + CoT tests                                             2025-07-25 18:11:49 -06:00
Adam Wilson  3a62ecfae8  add test 0 results                                                      2025-07-25 08:47:56 -06:00
Adam Wilson  ae279a512d  log LLM config                                                          2025-07-23 20:21:42 -06:00
Adam Wilson  cb92890bb9  break tests into separate files; test 0 results                         2025-07-23 19:06:27 -06:00