Commit Graph

21 Commits

Author SHA1 Message Date
Adam Wilson 058b281f25 default risk threshold; scores lower down in JSON serialization 2025-07-28 13:40:54 -06:00
Adam Wilson bc119ed50e don't overprocess results 2025-07-28 13:10:03 -06:00
Adam Wilson df14a01fe9 log full completion result with semantic similarity comparison results 2025-07-28 11:49:07 -06:00
Adam Wilson 2659e6e43c more updates for reflexion 2025-07-28 10:31:55 -06:00
Adam Wilson 5bc9f480f9 all domain unit tests pass 2025-07-27 18:53:30 -06:00
Adam Wilson a621ad82a9 Reflexion guardrails updates 2025-07-27 16:39:06 -06:00
Adam Wilson 99ec0ddf98 WIP reflexion service 2025-07-27 13:59:01 -06:00
Adam Wilson 741629908c updates for RAG + CoT tests 2025-07-25 18:11:49 -06:00
Adam Wilson 4f2e539996 updates 2025-07-24 18:51:18 -06:00
Adam Wilson ae279a512d log LLM config 2025-07-23 20:21:42 -06:00
Adam Wilson cb92890bb9 break tests into separate files; test 0 results 2025-07-23 19:06:27 -06:00
Adam Wilson fec4d711bf implement singular guidelines calls in main service 2025-07-18 12:33:51 -06:00
Adam Wilson 1dba565236 service implementations 2025-07-16 20:21:10 -06:00
Adam Wilson c788431416 prompt template IDs; fluent text generation service stubs 2025-07-12 12:18:19 -06:00
Adam Wilson a647060644 skip guidelines method; stubs for service calls in test class 2025-07-10 07:10:26 -06:00
Adam Wilson b4b2d792fc more progress on fluent service call 2025-07-09 21:56:44 -06:00
Adam Wilson af75e9aabf support prompt template loading 2025-07-07 21:38:42 -06:00
Adam Wilson 7d49d160b9 more test cases and notes 2025-07-06 17:47:49 -06:00
Adam Wilson ffa2d73ae0 guardrail analyzed response, etc. 2025-07-06 15:15:59 -06:00
Adam Wilson cb1be6746f support testing malicious prompts with no guidelines 2025-06-28 12:18:35 -06:00
Adam Wilson 8bb4a473ca restructure as domain-driven design architecture 2025-05-23 13:48:29 -06:00