Commit Graph

  • df14a01fe9 log full completion result with semantic similarity comparison results Adam Wilson 2025-07-28 11:49:07 -06:00
  • b971df0a7a risk threshold is too low Adam Wilson 2025-07-28 10:47:57 -06:00
  • 2659e6e43c more updates for reflexion Adam Wilson 2025-07-28 10:31:55 -06:00
  • 5bc9f480f9 all domain unit tests pass Adam Wilson 2025-07-27 18:53:30 -06:00
  • b83553d767 make mitigation tests consistent Adam Wilson 2025-07-27 17:44:40 -06:00
  • dcff18a058 logging Adam Wilson 2025-07-27 17:19:07 -06:00
  • a621ad82a9 Reflexion guardrails updates Adam Wilson 2025-07-27 16:39:06 -06:00
  • 99ec0ddf98 WIP reflexion service Adam Wilson 2025-07-27 13:59:01 -06:00
  • eddacd87fa LLM config output Adam Wilson 2025-07-27 11:21:12 -06:00
  • a7a6873e73 update prompt templates; support LLM config logging Adam Wilson 2025-07-26 22:10:04 -06:00
  • 5b27d4c2e3 refactor for examples Adam Wilson 2025-07-26 16:31:49 -06:00
  • 27dad236ef avoid unclosed curly braces Adam Wilson 2025-07-26 15:50:10 -06:00
  • 16ba9c15ee test output for test_02_malicious_prompts Adam Wilson 2025-07-26 08:22:35 -06:00
  • 741629908c updates for RAG + CoT tests Adam Wilson 2025-07-25 18:11:49 -06:00
  • 72785c6420 updates for RAG + CoT Adam Wilson 2025-07-25 17:24:01 -06:00
  • a770a5211c create/update all Phi-3 templates Adam Wilson 2025-07-25 16:35:19 -06:00
  • 23d58675f4 token constants Adam Wilson 2025-07-25 09:47:12 -06:00
  • d15e9d6794 more test and template setup Adam Wilson 2025-07-25 09:45:03 -06:00
  • 3a62ecfae8 add test 0 results Adam Wilson 2025-07-25 08:47:56 -06:00
  • 85d33feffe comment Adam Wilson 2025-07-24 20:46:28 -06:00
  • 4f2e539996 updates Adam Wilson 2025-07-24 18:51:18 -06:00
  • ae279a512d log LLM config Adam Wilson 2025-07-23 20:21:42 -06:00
  • cb92890bb9 break tests into separate files; test 0 results Adam Wilson 2025-07-23 19:06:27 -06:00
  • 1b5b808ff6 use new garak true positives in tests Adam Wilson 2025-07-23 15:59:56 -06:00
  • 1096ee38a8 compiled failure req/resp pairs from NVIDIA garak runs in single JSON file Adam Wilson 2025-07-23 14:49:13 -06:00
  • dba261b10c move log/results files Adam Wilson 2025-07-23 14:18:06 -06:00
  • acb2336f68 test runs from garak probe 'latentinjection.LatentInjectionResume' Adam Wilson 2025-07-23 10:54:48 -06:00
  • db4608d157 test runs from garak probe 'latentinjection.LatentInjectionReport' Adam Wilson 2025-07-23 05:13:50 -06:00
  • f639aa272c test runs from garak probe 'latentinjection.LatentInjectionFactSnippetLegal' Adam Wilson 2025-07-22 05:06:24 -06:00
  • 82a0c724d3 delete empty test logs Adam Wilson 2025-07-21 19:18:36 -06:00
  • 2305d3cb9e move garak test run result files Adam Wilson 2025-07-21 19:10:37 -06:00
  • 81dc7a75a0 test runs from garak probe 'promptinject.HijackKillHumans' Adam Wilson 2025-07-21 19:02:31 -06:00
  • 12f1948d7d test runs from garak probe 'phrasing.FutureTense' Adam Wilson 2025-07-21 12:57:40 -06:00
  • 5c0b47fd6f test runs from garak probe 'latentinjection.LatentJailbreak' Adam Wilson 2025-07-21 10:38:03 -06:00
  • 7e88ac3e45 garak test runs for 'malwaregen.Payload' Adam Wilson 2025-07-21 06:49:00 -06:00
  • a7da342717 test runs for 'promptinject.HijackHateHumans' Adam Wilson 2025-07-21 05:03:50 -06:00
  • cf3bb6ba40 'topic.WordnetAllowedWords' Adam Wilson 2025-07-20 18:42:42 -06:00
  • caa68581f3 snowball.Primes test runs Adam Wilson 2025-07-20 16:30:33 -06:00
  • d6eb817522 test run results for malwaregen.Evasion Adam Wilson 2025-07-20 14:53:43 -06:00
  • 9d6591c4e3 latentinjection.LatentWhois tests Adam Wilson 2025-07-20 13:43:59 -06:00
  • e8b80d30fd results from grandma.Win11 Adam Wilson 2025-07-20 06:23:39 -06:00
  • 638bb5ee4d results from grandma.Substances Adam Wilson 2025-07-20 06:06:17 -06:00
  • dc349feace more examples Adam Wilson 2025-07-19 19:05:23 -06:00
  • 938074ef22 add fail notes Adam Wilson 2025-07-19 18:56:22 -06:00
  • b5b4ab14be results for exploitation.SQLInjectionSystem Adam Wilson 2025-07-19 18:50:36 -06:00
  • 761246e19a update reqs Adam Wilson 2025-07-19 18:39:24 -06:00
  • 46daee7553 add failed attr Adam Wilson 2025-07-19 18:36:46 -06:00
  • 5bc2ac81ba SQLi echo tests, take 2 Adam Wilson 2025-07-19 18:35:52 -06:00
  • 7291621fd8 SQLi echo tests Adam Wilson 2025-07-19 18:35:28 -06:00
  • 5a99798499 dan.DAN_Jailbreak run and logs Adam Wilson 2025-07-19 15:28:30 -06:00
  • 91afa1e7e4 test run logs Adam Wilson 2025-07-19 15:19:28 -06:00
  • 296669ac49 add env var for prompt template dir; error handling Adam Wilson 2025-07-19 07:59:54 -06:00
  • 41afb99622 dependency fixes, test setup Adam Wilson 2025-07-18 18:18:56 -06:00
  • 0843a5a388 dependency fixes, test setup Adam Wilson 2025-07-18 18:15:44 -06:00
  • 7f77e82a5c guidelines close to complete; prior to tests Adam Wilson 2025-07-18 13:48:19 -06:00
  • fec4d711bf implement singular guidelines calls in main service Adam Wilson 2025-07-18 12:33:51 -06:00
  • c1b4a130f9 base service class Adam Wilson 2025-07-17 20:32:13 -06:00
  • f3dd8e9208 more for templates Adam Wilson 2025-07-16 21:07:37 -06:00
  • 1dba565236 service implementations Adam Wilson 2025-07-16 20:21:10 -06:00
  • cd0e4b9de9 building guidelines services Adam Wilson 2025-07-15 21:19:28 -06:00
  • 51cce1545a few shot template WIP Adam Wilson 2025-07-12 20:35:43 -06:00
  • c788431416 prompt template IDs; fluent text generation service stubs Adam Wilson 2025-07-12 12:18:19 -06:00
  • 36820f9c54 pseudo-code for fluent text generation service call Adam Wilson 2025-07-11 15:51:01 -06:00
  • a647060644 skip guidelines method; stubs for service calls in test class Adam Wilson 2025-07-10 07:10:26 -06:00
  • b4b2d792fc more progress on fluent service call Adam Wilson 2025-07-09 21:56:44 -06:00
  • 5470554d28 stub for fluent guidelines service Adam Wilson 2025-07-08 17:18:57 -06:00
  • fabf36675d TODO notes Adam Wilson 2025-07-08 15:54:19 -06:00
  • af75e9aabf support prompt template loading Adam Wilson 2025-07-07 21:38:42 -06:00
  • 911b629217 notes on violation vs. false refusal rates, effectiveness metrics Adam Wilson 2025-07-06 18:34:12 -06:00
  • 7d49d160b9 more test cases and notes Adam Wilson 2025-07-06 17:47:49 -06:00
  • ffa2d73ae0 guardrail analyzed response, etc. Adam Wilson 2025-07-06 15:15:59 -06:00
  • a1d3a8c1b7 adjust assertions for test 3 Adam Wilson 2025-07-05 20:21:35 -06:00
  • 640c261b26 naming updates; fix static analysis script Adam Wilson 2025-07-05 13:01:28 -06:00
  • a9db321597 naming updates; fix static analysis script Adam Wilson 2025-07-05 12:58:25 -06:00
  • a25e8f9545 updates Adam Wilson 2025-07-02 11:34:47 -06:00
  • f0fcc64258 support sampling and averages during testing Adam Wilson 2025-06-28 13:15:32 -06:00
  • cb1be6746f support testing malicious prompts with no guidelines Adam Wilson 2025-06-28 12:18:35 -06:00
  • 036d36bf4f compare math prompt completions to DAN response Adam Wilson 2025-06-25 21:47:22 -06:00
  • eed481ee77 document intended test cases/methodology; math prompts Adam Wilson 2025-06-25 15:41:11 -06:00
  • 5022c8f45c document intended test cases/methodology Adam Wilson 2025-06-25 15:23:56 -06:00
  • a530e78399 refactoring Adam Wilson 2025-06-25 14:54:12 -06:00
  • 9b8b6b7105 add/update services, constants Adam Wilson 2025-06-25 12:53:24 -06:00
  • 9057b0e977 refactor to use services instead of language model objects directly Adam Wilson 2025-06-25 06:28:05 -06:00
  • c2abf2e81f more tests Adam Wilson 2025-06-24 14:24:10 -06:00
  • fc4978f43c add RAG-based LM in conftest Adam Wilson 2025-06-24 14:23:53 -06:00
  • b1cd2a02b8 adjust model config args Adam Wilson 2025-06-24 14:23:08 -06:00
  • f867a4fdc9 special tokens in prompt template Adam Wilson 2025-06-24 14:22:35 -06:00
  • d4b3d39cf6 fix AttributeError: 'EmbeddingModel' object has no attribute 'embed_documents' Adam Wilson 2025-06-24 14:17:55 -06:00
  • f74f8274af update DI container Adam Wilson 2025-06-24 14:16:57 -06:00
  • f71a094716 add TODO list Adam Wilson 2025-06-24 14:16:24 -06:00
  • 8340521170 update logging Adam Wilson 2025-06-24 14:16:08 -06:00
  • 92e00b9eb2 integration tests Adam Wilson 2025-06-24 10:57:44 -06:00
  • 34ab1858c5 dependency injection container Adam Wilson 2025-06-16 21:46:56 -06:00
  • b20631b9e8 comment Adam Wilson 2025-06-12 20:46:21 -06:00
  • 434358e8fa call services from API instead of adapters Adam Wilson 2025-06-12 20:43:11 -06:00
  • 3aca7df000 refactoring LLM adapters and service layer Adam Wilson 2025-06-12 20:25:18 -06:00
  • 0cdce03879 abstracts and fakes Adam Wilson 2025-06-12 20:23:31 -06:00
  • 0b6b7b79b9 service layer tests Adam Wilson 2025-06-12 20:22:39 -06:00
  • bc1093988b test config placeholders Adam Wilson 2025-05-30 12:46:11 -06:00
  • 8140fa654b init tests Adam Wilson 2025-05-30 12:26:55 -06:00