Matteo Meucci
b5b74f045b
Merge pull request #45 from nhumblot/prompt-injection-structured-output-attack
...
feat: add structured output attack example with source
2025-11-05 14:51:04 +01:00
Matteo Meucci
52dd155251
Rename document for prompt disclosure testing
...
Updated the title of the testing document to reflect prompt disclosure.
2025-11-02 18:57:11 +01:00
Matteo Meucci
04ba8c5b43
Revise AITG-DAT-01 document for clarity and structure
...
Updated the document to improve structure and clarity, including sections on testing methodology, expected outputs, remediation strategies, and suggested tools.
2025-11-02 18:55:56 +01:00
Matteo Meucci
b496d11a8b
Revise testing document for explainability and interpretability
...
Updated the document to enhance clarity and detail in the explanation of testing for explainability and interpretability in AI systems. Added specific objectives, expected outputs, remediation strategies, and suggested tools.
2025-11-02 18:22:29 +01:00
Matteo Meucci
4fd6fa2000
Update AITG-APP-13_Testing_for_Over-Reliance_on_AI.md
2025-11-02 18:17:00 +01:00
Matteo Meucci
4aef9d8a69
Revise testing document for toxic output
...
Updated the structure and content of the testing document to improve clarity and organization, including renaming sections and enhancing remediation strategies.
2025-11-02 18:13:23 +01:00
Matteo Meucci
9c1c965948
Change headers from H3 to H2 and H4 to H3
...
Updated header levels for better document structure.
2025-11-02 18:06:23 +01:00
Matteo Meucci
140f236dd4
Update headings and improve test documentation
2025-11-02 18:05:53 +01:00
Matteo Meucci
6411868698
Revise section headers for testing document
...
Updated section headers to improve clarity and consistency in the testing document.
2025-11-02 17:47:23 +01:00
Matteo Meucci
9d01b136f8
Revise expected output for model extraction tests
...
Updated expected output criteria for model extraction testing, clarifying fidelity levels and defensive mechanisms.
2025-11-02 17:46:43 +01:00
Matteo Meucci
f36d16964d
Enhance model extraction testing documentation
...
Expanded testing documentation for model extraction attacks, including detailed payloads, prerequisites, and step-by-step instructions for data acquisition, surrogate model training, and evaluation.
2025-11-02 17:45:12 +01:00
Matteo Meucci
8e55e6238d
Enhance embedding manipulation testing documentation
...
Expanded testing scenarios for embedding manipulation, including payloads and expected secure behaviors.
2025-11-02 17:28:41 +01:00
Matteo Meucci
ae07885a80
Enhance documentation on embedding manipulation testing
...
Expanded the section on embedding manipulation to include detailed explanations of vulnerabilities, attack vectors, and testing objectives. Updated suggested tools for testing embedding robustness.
2025-11-02 17:23:25 +01:00
Matteo Meucci
c99d2969f3
Refine testing documentation for prompt disclosure
...
Updated sections for clarity and consistency, including test objectives, expected outputs, and suggested tools.
2025-10-30 17:38:58 +01:00
Matteo Meucci
d2b2f3b057
Refine content and headings for agentic behavior testing
...
Updated section headings for consistency and clarity. Revised text for better readability and precision regarding agentic behavior testing.
2025-10-30 17:22:31 +01:00
Matteo Meucci
88f15ccb7d
Revise section titles for clarity in testing guidelines
...
Updated section titles and clarified testing instructions for unsafe outputs.
2025-10-30 17:18:32 +01:00
Matteo Meucci
8bd00636cd
Revise section titles in input leakage testing doc
...
Updated section titles for clarity in testing documentation.
2025-10-30 17:08:58 +01:00
Matteo Meucci
dac1a442f4
Revise test documentation for sensitive data leakage
...
Updated sections for clarity and consistency in testing documentation.
2025-10-30 17:05:34 +01:00
Matteo Meucci
1ca047f15a
Update testing document for indirect prompt injection
2025-10-30 17:03:10 +01:00
Matteo Meucci
8a6445b6ae
Update testing document for prompt injection techniques
2025-10-30 17:01:39 +01:00
Federico Dotta
76ffd748ba
+ Tools vulnerabilities
2025-10-28 09:44:46 +01:00
Federico Dotta
e6cc4ffb64
+ MCP indirect prompt injection
2025-10-28 09:44:33 +01:00
Nicolas Humblot
e637aa06f2
feat: add structured output attack example with source
2025-10-17 11:50:27 +02:00
marti-jorda-roca
6a81e0318c
Add reference to Echo Chamber attack blog
2025-10-16 17:21:47 +02:00
Matteo Meucci
aaffd7e14c
Merge pull request #27 from DotDotSlashRepo/main
...
Enhancements to testcases
2025-10-10 10:40:18 +02:00
Matteo Meucci
c0c38b582e
Merge pull request #32 from zangobot/main
...
Include more testing tools, dividing them into general-purpose and domain-specific
2025-09-09 16:37:06 +02:00
Luca Demetrio
0749eeda55
Update AITG-MOD-01_Testing_for_Evasion_Attacks.md
...
Removed typo
2025-09-02 11:21:23 +02:00
Roei Arpaly
4182d8f869
Update AITG-APP-04_Testing_for_Input_Leakage.md
...
Co-authored-by: Yoni Birman <birmanbirman@gmail.com>
2025-08-31 23:13:40 +03:00
Roei Arpaly
296224d780
Update AITG-APP-04_Testing_for_Input_Leakage.md
...
adding adversarial input test cases
2025-08-13 11:46:54 +03:00
maurapintor
0ed6bb99ad
added secml-torch and adv-lib, updated description of deepsec
2025-08-08 10:16:15 +02:00
Luca Demetrio
be0385d8cf
Update AITG-MOD-01_Testing_for_Evasion_Attacks.md
...
Update AI security testing tools by adding difference between general-purpose and domain-specific libraries
2025-08-08 09:57:15 +02:00
DotDotSlash
3bd5536fbd
Update AITG-APP-05_Testing_for_Unsafe_Outputs.md
...
fixed a typo
2025-08-05 16:24:06 +05:30
DotDotSlash
e5e95445cb
Update AITG-APP-01_Testing_for_Prompt_Injection.md
...
added more examples of filter bypass while attempting to extract sensitive information
2025-08-05 16:21:26 +05:30
DotDotSlash
22eaecdd59
Update AITG-APP-03_Testing_for_Sensitive_Data_Leak.md
...
Added additional prompts on testing for implementation details leak
2025-08-05 15:56:08 +05:30
Federico Ricciuti
befe2755c7
Introduced Debunking tests and a differentiation between "Factuality and Misinformation" and "Debunking" hallucinations, as described by Giskard in the Phare benchmark.
2025-08-03 14:34:38 +02:00
fedric95
d27026fda7
Merge branch 'OWASP:main' into main
2025-07-25 20:30:56 +02:00
Federico Ricciuti
0dd87354da
1. Specified that temperature=0 does not imply reproducibility (https://arxiv.org/pdf/2506.09501)
...
2. Pointed out that LLMs are generally less secure in low-resource languages
3. Reorganized the payloads for the bias test; it now always uses the same base example.
2025-07-25 20:26:32 +02:00
federicodotta
897c532bba
+ Planning instructions to avoid issues with token consumption
2025-07-25 12:18:11 +02:00
Federico Ricciuti
9da16a16c1
Correction of the readme to refer to the correct changed test
2025-07-17 15:22:07 +02:00
Federico Ricciuti
977235af4d
Introduction of the AITG-APP-10_Testing_for_Content_Bias.md, with tests to detect biased decisions made by the AI System.
2025-07-17 15:16:22 +02:00
Federico Ricciuti
49ee4b9d6c
The unsafe output test now includes hate-related unsafe content as part of the tests.
...
AITG-APP-10_Testing_for_Harmful_Content_Bias.md replaced with AITG-APP-10_Testing_for_Content_Bias.md, which now focuses on detecting biases contained in the generated outputs.
2025-07-17 15:14:33 +02:00
Matteo Meucci
71b4f26900
Merge pull request #20 from fedric95/main
2025-07-12 21:30:58 +04:00
Federico Ricciuti
198167aebe
- Introduced the necessity of defining a safety taxonomy before conducting the tests: the definition of what is safe and what is unsafe depends on the application.
...
- Linked an existing safety taxonomy
- Added examples of moderation models
- Removed most of the references to the concept of bias. They should be addressed in another test.
TO-DO
- Include tests that consider the potential multimodal nature of the application (right now it is more text-only)
- Make a specific test to evaluate the biases of the AI application under test and remove all the references to biases in this test
2025-07-12 19:12:00 +02:00
federicodotta
5dbedf3dc3
Prompt Injection Techniques section added
2025-07-12 13:51:10 +02:00
federicodotta
5a434e776b
Update in typo tricks
2025-07-12 12:35:05 +02:00
federicodotta
a56ba3f4e6
+ Echo Chamber Attack
2025-07-12 12:24:58 +02:00
federicodotta
b483d240cf
+ AntiGPT reference
2025-07-12 11:53:03 +02:00
federicodotta
abfcbde568
+ AntiGPT Prompt Injection
2025-07-12 11:49:27 +02:00
Didier Durand
e754867dd5
fixing typos in multiple texts.
2025-06-29 13:48:42 +02:00
federicodotta
a50c0ceb65
Fix copy/paste error in AITG-INF-02
2025-06-26 18:36:36 +02:00