Commit Graph

74 Commits

Author SHA1 Message Date
Matteo Meucci b5b74f045b Merge pull request #45 from nhumblot/prompt-injection-structured-output-attack
feat: add structured output attack example with source
2025-11-05 14:51:04 +01:00
Matteo Meucci 52dd155251 Rename document for prompt disclosure testing
Updated the title of the testing document to reflect prompt disclosure.
2025-11-02 18:57:11 +01:00
Matteo Meucci 04ba8c5b43 Revise AITG-DAT-01 document for clarity and structure
Updated the document to improve structure and clarity, including sections on testing methodology, expected outputs, remediation strategies, and suggested tools.
2025-11-02 18:55:56 +01:00
Matteo Meucci b496d11a8b Revise testing document for explainability and interpretability
Updated the document to enhance clarity and detail in the explanation of testing for explainability and interpretability in AI systems. Added specific objectives, expected outputs, remediation strategies, and suggested tools.
2025-11-02 18:22:29 +01:00
Matteo Meucci 4fd6fa2000 Update AITG-APP-13_Testing_for_Over-Reliance_on_AI.md 2025-11-02 18:17:00 +01:00
Matteo Meucci 4aef9d8a69 Revise testing document for toxic output
Updated the structure and content of the testing document to improve clarity and organization, including renaming sections and enhancing remediation strategies.
2025-11-02 18:13:23 +01:00
Matteo Meucci 9c1c965948 Change headers from H3 to H2 and H4 to H3
Updated header levels for better document structure.
2025-11-02 18:06:23 +01:00
Matteo Meucci 140f236dd4 Update headings and improve test documentation 2025-11-02 18:05:53 +01:00
Matteo Meucci 6411868698 Revise section headers for testing document
Updated section headers to improve clarity and consistency in the testing document.
2025-11-02 17:47:23 +01:00
Matteo Meucci 9d01b136f8 Revise expected output for model extraction tests
Updated expected output criteria for model extraction testing, clarifying fidelity levels and defensive mechanisms.
2025-11-02 17:46:43 +01:00
Matteo Meucci f36d16964d Enhance model extraction testing documentation
Expanded testing documentation for model extraction attacks, including detailed payloads, prerequisites, and step-by-step instructions for data acquisition, surrogate model training, and evaluation.
2025-11-02 17:45:12 +01:00
Matteo Meucci 8e55e6238d Enhance embedding manipulation testing documentation
Expanded testing scenarios for embedding manipulation, including payloads and expected secure behaviors.
2025-11-02 17:28:41 +01:00
Matteo Meucci ae07885a80 Enhance documentation on embedding manipulation testing
Expanded the section on embedding manipulation to include detailed explanations of vulnerabilities, attack vectors, and testing objectives. Updated suggested tools for testing embedding robustness.
2025-11-02 17:23:25 +01:00
Matteo Meucci c99d2969f3 Refine testing documentation for prompt disclosure
Updated sections for clarity and consistency, including test objectives, expected outputs, and suggested tools.
2025-10-30 17:38:58 +01:00
Matteo Meucci d2b2f3b057 Refine content and headings for agentic behavior testing
Updated section headings for consistency and clarity. Revised text for better readability and precision regarding agentic behavior testing.
2025-10-30 17:22:31 +01:00
Matteo Meucci 88f15ccb7d Revise section titles for clarity in testing guidelines
Updated section titles and clarified testing instructions for unsafe outputs.
2025-10-30 17:18:32 +01:00
Matteo Meucci 8bd00636cd Revise section titles in input leakage testing doc
Updated section titles for clarity in testing documentation.
2025-10-30 17:08:58 +01:00
Matteo Meucci dac1a442f4 Revise test documentation for sensitive data leakage
Updated sections for clarity and consistency in testing documentation.
2025-10-30 17:05:34 +01:00
Matteo Meucci 1ca047f15a Update testing document for indirect prompt injection 2025-10-30 17:03:10 +01:00
Matteo Meucci 8a6445b6ae Update testing document for prompt injection techniques 2025-10-30 17:01:39 +01:00
Federico Dotta 76ffd748ba + Tools vulnerabilities 2025-10-28 09:44:46 +01:00
Federico Dotta e6cc4ffb64 + MCP indirect prompt injection 2025-10-28 09:44:33 +01:00
Nicolas Humblot e637aa06f2 feat: add structured output attack example with source 2025-10-17 11:50:27 +02:00
marti-jorda-roca 6a81e0318c Add reference to Echo Chamber attack blog 2025-10-16 17:21:47 +02:00
Matteo Meucci aaffd7e14c Merge pull request #27 from DotDotSlashRepo/main
Enhancements to testcases
2025-10-10 10:40:18 +02:00
Matteo Meucci c0c38b582e Merge pull request #32 from zangobot/main
Include more testing tools, dividing them into general-purpose and domain-specific
2025-09-09 16:37:06 +02:00
Luca Demetrio 0749eeda55 Update AITG-MOD-01_Testing_for_Evasion_Attacks.md
Removed typo
2025-09-02 11:21:23 +02:00
Roei Arpaly 4182d8f869 Update AITG-APP-04_Testing_for_Input_Leakage.md
Co-authored-by: Yoni Birman <birmanbirman@gmail.com>
2025-08-31 23:13:40 +03:00
Roei Arpaly 296224d780 Update AITG-APP-04_Testing_for_Input_Leakage.md
adding adversarial input test cases
2025-08-13 11:46:54 +03:00
maurapintor 0ed6bb99ad added secml-torch and adv-lib, updated description of deepsec 2025-08-08 10:16:15 +02:00
Luca Demetrio be0385d8cf Update AITG-MOD-01_Testing_for_Evasion_Attacks.md
Update AI security testing tools by adding the distinction between general-purpose and domain-specific libraries
2025-08-08 09:57:15 +02:00
DotDotSlash 3bd5536fbd Update AITG-APP-05_Testing_for_Unsafe_Outputs.md
fixed a typo
2025-08-05 16:24:06 +05:30
DotDotSlash e5e95445cb Update AITG-APP-01_Testing_for_Prompt_Injection.md
added more examples of filter bypass while attempting to extract sensitive information
2025-08-05 16:21:26 +05:30
DotDotSlash 22eaecdd59 Update AITG-APP-03_Testing_for_Sensitive_Data_Leak.md
Added additional prompts on testing for implementation details leak
2025-08-05 15:56:08 +05:30
Federico Ricciuti befe2755c7 Introduced Debunking tests and a differentiation between "Factuality and Misinformation" and "Debunking" hallucinations, as described by Giskard in the Phare benchmark. 2025-08-03 14:34:38 +02:00
fedric95 d27026fda7 Merge branch 'OWASP:main' into main 2025-07-25 20:30:56 +02:00
Federico Ricciuti 0dd87354da 1. Specified that temperature=0 does not imply reproducibility (https://arxiv.org/pdf/2506.09501)
2. Pointed out that LLMs are generally less secure in low-resource languages
3. Reordered the payloads for the bias test; it now always uses the same base example.
2025-07-25 20:26:32 +02:00
federicodotta 897c532bba + Planning instructions to avoid issues with token consumption 2025-07-25 12:18:11 +02:00
Federico Ricciuti 9da16a16c1 Correction of the readme to refer to the correct changed test 2025-07-17 15:22:07 +02:00
Federico Ricciuti 977235af4d Introduction of the AITG-APP-10_Testing_for_Content_Bias.md, with tests to detect biased decisions made by the AI System. 2025-07-17 15:16:22 +02:00
Federico Ricciuti 49ee4b9d6c The unsafe output test now includes hate-related unsafe content as part of the tests.
AITG-APP-10_Testing_for_Harmful_Content_Bias.md replaced with AITG-APP-10_Testing_for_Content_Bias.md, which now focuses on detecting biases contained in the generated outputs.
2025-07-17 15:14:33 +02:00
Matteo Meucci 71b4f26900 Merge pull request #20 from fedric95/main 2025-07-12 21:30:58 +04:00
Federico Ricciuti 198167aebe - Introduced the necessity of defining a safety taxonomy before conducting the tests: the definition of what is safe and what is unsafe depends on the application.
- Linked an existing safety taxonomy
- Added examples of moderation models
- Removed most of the references to the concept of bias. They should be addressed in another test.

TO-DO

- Include tests that consider the potential multimodal nature of the application (right now it is more text-only)
- Make a specific test to evaluate the biases of the AI application under test and remove all the references to biases in this test
2025-07-12 19:12:00 +02:00
federicodotta 5dbedf3dc3 Prompt Injection Techniques section added 2025-07-12 13:51:10 +02:00
federicodotta 5a434e776b Update in typo tricks 2025-07-12 12:35:05 +02:00
federicodotta a56ba3f4e6 + Echo Chamber Attack 2025-07-12 12:24:58 +02:00
federicodotta b483d240cf + AntiGPT reference 2025-07-12 11:53:03 +02:00
federicodotta abfcbde568 + AntiGPT Prompt Injection 2025-07-12 11:49:27 +02:00
Didier Durand e754867dd5 fixing typos in multiple texts. 2025-06-29 13:48:42 +02:00
federicodotta a50c0ceb65 Fix copy/paste error in AITG-INF-02 2025-06-26 18:36:36 +02:00