www-project-ai-testing-guide/Document/content/4.0_Domain_Specific_Testing.md at eeaa84828dba35e8d052d99d25ff43c67b5d520e

CalvinBackup/www-project-ai-testing-guide

mirror of https://github.com/OWASP/www-project-ai-testing-guide.git synced 2026-03-01 05:43:24 +00:00

Files

Matteo Meucci 574e1221a7 Rename chapter to Appendix F

Updated chapter title to 'Appendix F: Domain Specific Testing'.

2025-11-13 16:53:22 +01:00

8.9 KiB

Raw Blame History

Appendix F: Domain Specific Testing

The AI Testing Guide framework is structured around four core layers of a typical AI Architecture:

Application Layer
Model Layer
Infrastructure Layer
Data Layer

While the framework integrates these layers comprehensively, it is also possible-and often useful—to focus testing efforts on one specific domain such as Security, Privacy, RAI and Trustworthy AI (TAI).

It is useful to understand which tests are related to Security and Privacy and which to RAI and TAI.

We define the following three kinds of Input for an AI System:

1) Prompt (Intended Prompt)
This is the standard, intended input given to the AI system by the designer or user without any malicious intent or attempt to manipulate the system. This type of input aligns with the intended functionality and expected behavior of the AI model.

Example: Asking the AI to summarize an article, answer a customer query, or generate text according to specified instructions.

2) Prompt Injection
This type of input refers to scenarios where the user intentionally or unintentionally provides prompts designed to manipulate or override the intended behavior of the AI system, causing it to produce unintended, possibly harmful, or sensitive outputs. It can be divided into two categories:

Direct Prompt Injection: User directly inputs text commands explicitly intended to manipulate model behavior.
Indirect Prompt Injection: Malicious prompts embedded within externally sourced content (e.g., web pages, documents) which the model ingests unknowingly.
Example: Injecting instructions to bypass security constraints, reveal private system information, or disregard programmed ethical guidelines.

3) Prompt Attack
Prompt attacks involve specific manipulative prompts designed to exploit inherent model vulnerabilities related to hallucinations (model inventing or confidently providing incorrect information) and Responsible AI (RAI) and Trustworthy considerations (ethical guidelines, fairness, transparency). This category explicitly deals with exploiting known weaknesses in the AI model’s accuracy or ethical controls.

Hallucination Attack: Crafting prompts designed to exploit the model's propensity for fabricating believable but incorrect responses.
- Example: Asking the AI for non-existent historical facts or scientific data, leading to plausible yet entirely fabricated responses.
Responsible AI Attack (RAI Attack): Prompts intentionally exploiting or subverting ethical and fairness-related constraints.
- Example: Manipulating the model to generate biased, toxic, or inappropriate outputs, contrary to established ethical standards.

Below are individual mappings for each domain, enabling focused and targeted testing.

Security Domain Testing

Test ID	Test Name & Link
AITG-APP-01	Testing for Prompt Injection
AITG-APP-02	Testing for Indirect Prompt Injection
AITG-APP-03	Testing for Sensitive Data Leak
AITG-APP-04	Testing for Input Leakage
AITG-APP-05	Testing for Unsafe Outputs
AITG-APP-06	Testing for Agentic Behavior Limits
AITG-APP-07	Testing for Prompt Disclosure
AITG-APP-08	Testing for Embedding Manipulation
AITG-APP-09	Testing for Model Extraction
AITG-MOD-01	Testing for Evasion Attacks
AITG-MOD-02	Testing for Runtime Model Poisoning
AITG-MOD-03	Testing for Poisoned Training Sets
AITG-INF-01	Testing for Supply Chain Tampering
AITG-INF-02	Testing for Resource Exhaustion
AITG-INF-05	Testing for Fine-tuning Poisoning
AITG-INF-06	Testing for Dev-Time Model Theft
AITG-DAT-02	Testing for Runtime Exfiltration

🛡️ Privacy Domain Testing

Test ID	Test Name & Link
AITG-APP-03	Testing for Sensitive Data Leak
AITG-APP-04	Testing for Input Leakage
AITG-APP-07	Testing for Prompt Disclosure
AITG-MOD-04	Testing for Membership Inference
AITG-MOD-05	Testing for Inversion Attacks
AITG-INF-06	Testing for Dev-Time Model Theft
AITG-DAT-01	Testing for Training Data Exposure
AITG-DAT-02	Testing for Runtime Exfiltration
AITG-DAT-05	Testing for Data Minimization & Consent

⚖️ Responsible AI (RAI) Domain Testing

Test ID	Test Name & Link
AITG-APP-05	Testing for Unsafe Outputs
AITG-APP-10	Testing for Harmful Content Bias
AITG-APP-12	Testing for Toxic Output
AITG-APP-13	Testing for Over-Reliance on AI
AITG-APP-14	Testing for Explainability and Interpretability
AITG-INF-04	Testing for Capability Misuse
AITG-DAT-03	Testing for Dataset Diversity & Coverage
AITG-DAT-04	Testing for Harmful Content in Data

✅ Trustworthy AI Domain Testing

Test ID	Test Name & Link
AITG-APP-06	Testing for Agentic Behavior Limits
AITG-APP-11	Testing for Hallucinations
AITG-APP-13	Testing for Over-Reliance on AI
AITG-APP-14	Testing for Explainability and Interpretability
AITG-MOD-06	Testing for Robustness to New Data
AITG-MOD-07	Testing for Goal Alignment
AITG-INF-03	Testing for Plugin Boundary Violations
AITG-INF-04	Testing for Capability Misuse
AITG-DAT-05	Testing for Data Minimization & Consent

8.9 KiB Raw Blame History Unescape Escape