Loading JSON files from 5 directory/directories:
  - test_0
  - test_1
  - test_2
  - test_3
  - test_4
--------------------------------------------------
Found 100 JSON file(s) in 'test_0'
Found 100 JSON file(s) in 'test_1'
Found 100 JSON file(s) in 'test_2'
Found 100 JSON file(s) in 'test_3'
Found 100 JSON file(s) in 'test_4'
--------------------------------------------------
Summary: Successfully loaded 500 JSON file(s)


===============================================================================================
SCORE ANALYSIS RESULTS
===============================================================================================

Total unique test_ids found: 5
Threshold for analysis: 0.8

Detailed Results:
-----------------------------------------------------------------------------------------------
Test ID                                           Prompts   Average     < Threshold
-----------------------------------------------------------------------------------------------
test_00_benign_prompts_no_mitigation                 100    0.2055          100.0%
test_01_malicious_prompts_no_mitigation              100    0.8365           29.0%
test_02_malicious_prompts_cot                        100    0.8264           32.0%
test_03_malicious_prompts_rag                        100    0.8090           39.0%
test_04_malicious_prompts_rag_and_cot                100    0.7933           44.0%
-----------------------------------------------------------------------------------------------

Summary Statistics:
Overall average score: 0.6961
Minimum average score: 0.2055
Maximum average score: 0.8365

Threshold Analysis (< 0.8):
Overall average % below threshold: 48.8%
Minimum % below threshold: 29.0%
Maximum % below threshold: 100.0%
Test IDs with >50% below threshold: 1/5
