mirror of
https://github.com/OWASP/www-project-ai-testing-guide.git
synced 2026-06-01 03:51:42 +02:00
@@ -5,7 +5,7 @@ This test aims to identify vulnerabilities related to content bias in AI-generat
- Positional bias
- Prioritization of content based on its position within the given context
- Stereotypical bias
- Prejudiced assumptions about a group in a context where it should not influence the decision process. These biases generally originate from overrepresentation of certain categories in specific roles. Examples of prejucided assumptions could be releated to gender, ethnicity, age.
- Prejudiced assumptions about a group in a context where it should not influence the decision process. These biases generally originate from overrepresentation of certain categories in specific roles. Examples of prejudiced assumptions could be related to gender, ethnicity, age.
- Framing bias
  - Alteration in responses when confronted with a problem presented in a different way but with the same semantic meaning
- Anchoring bias
@@ -19,7 +19,7 @@ The test must not consider only artificially constrained tasks, such as asking t
#### Test Objectives
The results of the test must be interpreted considering the level of fairness that the AI system must adher to (for example: equalized odds, equal opportunity, demographic parity). For example, considering an AI system that must evaluate the curriculum of students for the admission of a college, and assuming that equal opportunity is desired on the gender attribute, the test must evaluate if the True Positive Rate of the system is similar indipendently of the gender.
The results of the test must be interpreted considering the level of fairness that the AI system must adhere to (for example: equalized odds, equal opportunity, demographic parity). For example, considering an AI system that must evaluate the curriculum of students for the admission of a college, and assuming that equal opportunity is desired on the gender attribute, the test must evaluate if the True Positive Rate of the system is similar independently of the gender.
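
The equal opportunity check described above can be sketched as follows. This is a minimal illustration, not the guide's own tooling: the function names, the `(group, y_true, y_pred)` record layout, and the sample data are assumptions made for this example. It computes the True Positive Rate per group and reports the largest gap between groups.

```python
from collections import defaultdict


def true_positive_rate_by_group(records):
    """Return {group: TPR} for records of (group, y_true, y_pred).

    Labels are assumed to be binary, with 1 meaning "admitted".
    Groups with no positive ground-truth cases are omitted.
    """
    positives = defaultdict(int)       # ground-truth positives per group
    true_positives = defaultdict(int)  # correctly predicted positives per group
    for group, y_true, y_pred in records:
        if y_true == 1:
            positives[group] += 1
            if y_pred == 1:
                true_positives[group] += 1
    return {g: true_positives[g] / positives[g] for g in positives}


def equal_opportunity_gap(records):
    """Largest difference in TPR between any two groups (0.0 means parity)."""
    rates = true_positive_rate_by_group(records)
    return max(rates.values()) - min(rates.values())


# Hypothetical evaluation data: (gender, deserved admission, system decision)
sample = [
    ("F", 1, 1), ("F", 1, 0), ("F", 0, 1),
    ("M", 1, 1), ("M", 1, 1), ("M", 0, 0),
]
rates = true_positive_rate_by_group(sample)
gap = equal_opportunity_gap(sample)
```

How large a gap is acceptable depends on the fairness level the system is required to meet; the threshold is a policy decision and is not defined by this sketch.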
#### How to Test/Payloads
@@ -65,6 +65,6 @@ AI-generated outputs must:
### References
- Gentrace: "How to test for AI hallucination" [Link](https://gentrace.ai/blog/how-to-test-for-ai-hallucination)
- OWASP Top 10 for LLM Applications 2025. "LLM09:2025 Misinformation." OWASP, 2025. [Link](https://genai.owasp.org/llmrisk/llm09-overreliance)
- Phare LLM Benchmarkg, Giskard, 2025. [Link](https://phare.giskard.ai/)
- Phare LLM Benchmarking, Giskard, 2025. [Link](https://phare.giskard.ai/)
- OWASP Top 10 LLM 2025: a Synapsed Research Study [Link](https://synapsed.ai/rd-owasp-top-10-llm-2025-a-synapsed-research-study/)
- Google Gemini Hallucinations - [Article Link](https://www.engadget.com/google-ceo-says-gemini-image-generation-failures-were-unacceptable-163748934.html)