Merge pull request #48 from mmorana1/patch-20

Update 2.2_Appendix_E.md
Matteo Meucci
2025-10-21 19:16:13 +02:00
committed by GitHub

## 2.2 Appendix E: AI Threats Mapping to AI Components Vulnerabilities (CVEs & CWEs)
**AI Penetration Testing Framework: Scoping, CVE/CWE Mapping, and Threat Correlation**
This appendix guides penetration testers on mapping discovered CVEs and CWEs in SAIF components of an AI architecture to AI-specific threats. CVEs (Common Vulnerabilities and Exposures) generally point to specific, documented vulnerabilities in the underlying technology stack, such as libraries, frameworks, or APIs used to build AI systems and applications. CWEs (Common Weakness Enumerations), on the other hand, describe classes of software design or implementation flaws that may lead to such vulnerabilities.
**Step 1 — Scoping AI Penetration Tests Within the SAIF Architecture**
Because the pen tests described here target a live AI system/Application, careful scoping is essential: testers must first identify which SAIF components and subcomponents are in scope, enumerate the exact technologies deployed for each, and use that inventory to prioritize CVE/CWE enumeration and threat simulations. In-scope items commonly include components owned or operated by the organization and directly involved in the request→response flow, for example, chat UIs, API backends (e.g., FastAPI), session/orchestration layers, model orchestration frameworks (e.g., LangChain or LlamaIndex), vector stores (Redis, Pinecone, Weaviate), ETL/data pipelines, model-serving endpoints, and internally managed connectors. Because these components can contain outdated, misconfigured, or otherwise exploitable dependencies, the first operational step is threat enumeration: map each in-scope SAIF component to its tech stack, identify relevant CVEs (and corresponding CWEs), and derive likely exploit paths. That mapping then drives focused validation with scanners, SCA tools, and proof-of-concept testing so testers can prioritize, reproduce, and demonstrate how conventional software flaws translate into AI-centric impacts.
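The scoping inventory described above can be kept in machine-readable form. The following Python sketch is illustrative only: the component names, layers, and pinned versions are assumptions, not findings from a real engagement.

```python
# Hypothetical scoping inventory: map each in-scope SAIF component to its
# deployed technologies. Names and versions here are illustrative only.
from dataclasses import dataclass

@dataclass
class SAIFComponent:
    name: str           # SAIF component/subcomponent under test
    layer: str          # e.g., "Application" or "Data"
    technologies: list  # exact deployed tech; this drives CVE/CWE lookup
    in_scope: bool = True

inventory = [
    SAIFComponent("API backend", "Application", ["fastapi==0.78.0"]),
    SAIFComponent("Orchestration", "Application", ["langchain==0.0.200"]),
    SAIFComponent("Vector store", "Data", ["redis==6.2.6"]),
    SAIFComponent("ETL pipeline", "Data", ["apache-airflow==2.4.0"]),
]

# Prioritize: only in-scope components feed CVE/CWE enumeration.
targets = [c for c in inventory if c.in_scope]
for c in targets:
    print(f"{c.layer:12} {c.name:15} -> {', '.join(c.technologies)}")
```

Keeping the inventory structured this way makes it trivial to hand the exact technology list to SCA tooling in Step 2.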
**Step 2 — Threat Enumeration and CVE Exploit Path Mapping**
The process of mapping threats to AI system vulnerabilities starts by identifying known vulnerabilities, expressed as CVEs, in AI systems/applications using software composition analysis (SCA) and runtime tools. SCA tools (e.g., Snyk, Trivy, Dependabot, OWASP Dependency-Check, and GitHub Advanced Security) will flag vulnerable third-party software dependencies, while scanners such as Nessus and Nuclei can confirm active CVE exposures in APIs and services. Runtime telemetry and host inspection can also validate which CVEs are exploitable in live environments. These CVEs are then mapped to the AI-specific threats outlined in this guide: for example, a FastAPI sanitization flaw (CVE-2022-36067) can be part of a prompt-injection vector (T01-DPIJ), and an Airflow ETL vulnerability (CVE-2022-40127) can lead to data poisoning (T01-DMP) in a RAG pipeline.
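The CVE-to-threat correlation can be maintained as a simple lookup keyed by CVE ID. The sketch below is a minimal illustration using only the CVE/threat pairs named in this appendix; the `threats_for` helper is an assumption, not part of any tool.

```python
# Minimal CVE -> AI-threat lookup built from the examples in this appendix.
# Threat IDs (T01-DPIJ, T01-DMP, ...) follow this guide's threat naming.
cve_to_threats = {
    "CVE-2022-36067": ["T01-DPIJ"],  # FastAPI sanitization flaw -> prompt-injection vector
    "CVE-2022-40127": ["T01-DMP"],   # Airflow ETL flaw -> data poisoning in a RAG pipeline
    "CVE-2022-0543": ["T01-SID", "T01-DoSM", "T01-MTD"],  # Redis compromise
}

def threats_for(cves):
    """Aggregate the AI-specific threats implied by a set of scanner findings."""
    found = set()
    for cve in cves:
        found.update(cve_to_threats.get(cve, []))
    return sorted(found)

print(threats_for(["CVE-2022-0543", "CVE-2022-40127"]))
# -> ['T01-DMP', 'T01-DoSM', 'T01-MTD', 'T01-SID']
```

As scanners surface new CVEs, extending this table keeps the threat simulations aligned with the live attack surface.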
For each SAIF component in scope, testers review subcomponents, confirm deployed technologies, and run focused tests to find exploitable or unpatched libraries. These findings drive AI-specific attack simulations, such as prompt injection, model inversion, data poisoning, or runtime DoS, to reveal real application impact. Using the CVE exploit-path mapping table, testers can maintain traceability from vulnerability to AI impact. For instance, a Redis instance in SAIF #4 (Application Layer) that is vulnerable to CVE-2022-0543 links to risks like data leakage (T01-SID), model disruption (T01-DoSM), and manipulation (T01-MTD). A single Redis compromise can escalate from infrastructure control to model tampering, compromising data integrity, availability, and trust.
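One way to keep that traceability is to record each finding as a structured row of the exploit-path mapping table. The sketch below uses the Redis example from this step; the field names and the `to_markdown_row` helper are illustrative assumptions, not a mandated report schema.

```python
# One row of a hypothetical CVE exploit-path mapping table, based on the
# Redis example from this step. Field names are illustrative only.
finding = {
    "saif_component": "SAIF #4 (Application Layer)",
    "technology": "Redis",
    "cve": "CVE-2022-0543",
    "ai_threats": ["T01-SID", "T01-DoSM", "T01-MTD"],
    "exploit_path": "infrastructure control -> model tampering",
}

def to_markdown_row(f):
    """Render a finding as a markdown row for the exploit-path mapping table."""
    return "| {} | {} | {} | {} |".format(
        f["saif_component"], f["cve"],
        ", ".join(f["ai_threats"]), f["exploit_path"])

print(to_markdown_row(finding))
```

Each row preserves the full chain from vulnerable component to AI-centric impact, which is what makes the table useful for prioritization and reporting.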
**Step 3 — AI Threat-to-CWE Mapping for Root Cause and Remediation**
The third step is to perform an AI threat enumeration and CWE exploit-path mapping. This step transforms vulnerability-centric testing into design-level assurance. Accordingly, this appendix provides a Threat-to-SAIF-Component-to-CWE mapping, complementing the Threat-to-Test-Case mapping (AITG tests) presented earlier. Together, these mappings enable testers to correlate observed vulnerabilities with their root causes, whether they stem from insecure design flaws, implementation weaknesses, or configuration errors. This approach supports consistent reporting and helps organizations trace AI-specific test failures (e.g., prompt injection, data leakage, model poisoning) back to their foundational weaknesses, improving both remediation guidance and future threat-modeling accuracy. By classifying findings under CWE categories, the pen tester bridges the gap between patch management and resilient AI architecture. CWE mapping clarifies attacker objectives, expands test coverage beyond isolated CVEs, and guides remediation that strengthens entire system layers rather than individual components. The CWE-based table reframes technical flaws as architectural weaknesses: for instance, CWE-20 (Improper Input Validation) exposes weak parsing logic, CWE-276 (Incorrect Default Permissions) reveals insecure defaults in data storage such as S3 buckets, and CWE-345 (Insufficient Verification of Data Authenticity) uncovers trust and integrity flaws in RAG ingestion. This approach helps testers not only find where AI applications break, but also understand why they break and how to redesign them to resist future exploitation.
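Classifying findings under CWE categories can be sketched as follows; the CWE pairings restate the examples given in this step, while the finding descriptions themselves are hypothetical.

```python
# Hypothetical root-cause classification: AITG-style observations tagged
# with the CWE examples named in this step.
cwe_catalog = {
    "CWE-20": "Improper Input Validation",
    "CWE-276": "Incorrect Default Permissions",
    "CWE-345": "Insufficient Verification of Data Authenticity",
}

findings = [
    {"observation": "Prompt parser accepts unsanitized control tokens", "cwe": "CWE-20"},
    {"observation": "Training-data S3 bucket readable by default",      "cwe": "CWE-276"},
    {"observation": "RAG ingestion trusts unsigned external documents", "cwe": "CWE-345"},
]

for f in findings:
    # Reporting the CWE name alongside the observation turns a raw test
    # failure into a design-level root cause that engineering teams can fix.
    print(f"{f['cwe']} ({cwe_catalog[f['cwe']]}): {f['observation']}")
```

The same classification can be reused across engagements, so recurring weakness patterns become visible at the architecture level rather than as isolated bugs.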
**Step 4 — AI Threat Mapping and Secure Design Recommendations**
Finally, the fourth step is to examine AI threats and the targeted CWEs and provide recommendations to fix them in the pen-testing report. As testers conduct AITG tests across in-scope SAIF components, each test outcome should not only describe the immediate failure condition but also trace it back to the underlying design flaw, coding weakness, or misconfiguration responsible for the issue. These root causes are best framed as CWEs (Common Weakness Enumerations), which provide a standardized vocabulary for describing software and design-level weaknesses. Vulnerability types/CWEs might represent security design flaws or misconfigurations that could be targeted by AI threats. It is important that, when these CWEs are included in the test report, they are accompanied by recommendations to fix them, such as enforcing input validation, disabling default public access, verifying dataset authenticity, or encrypting sensitive data. This means pen testers can move from “here is how I broke it” to “here is how you should securely configure it or redesign it to prevent recurrence.” As pen testers revisit AI systems/applications in scope for testing, since these might change, they can update the CVEs and CWEs of newly discovered vulnerabilities and use the AI Threats column as a checklist for attack simulations in future red-team exercises. Over time, this evolving matrix becomes a living document that supports secure design, ongoing validation, and resilience in AI-enabled systems. Framing results this way also supports structured vulnerability management. Once remediation is implemented, the same AITG test can be re-run (re-test) to verify closure of the issue, ensuring traceability through the vulnerability's lifecycle. In practice, findings should be prioritized by technical risk severity (Critical, High, Medium, or Low) and remediated in alignment with organizational SLA targets for vulnerability resolution.
This approach ensures that AI-specific test results translate into measurable improvements in both software resilience and overall risk posture.
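The report-fix-retest loop described in this step can be modeled as a small finding record. In the sketch below, the severity-to-SLA mapping, the field names, and the example finding are all assumptions for illustration; actual SLA targets come from organizational policy.

```python
from dataclasses import dataclass

# Hypothetical SLA targets (days to remediate) per severity level.
SLA_DAYS = {"Critical": 7, "High": 30, "Medium": 60, "Low": 90}

@dataclass
class Finding:
    title: str
    cwe: str              # root-cause weakness
    ai_threat: str        # AI Threats column entry
    recommendation: str   # actionable fix for the report
    severity: str         # Critical / High / Medium / Low
    retest_passed: bool = False  # set True once the AITG re-test verifies closure

    def sla_days(self):
        return SLA_DAYS[self.severity]

f = Finding(
    title="RAG ingestion accepts unverified documents",
    cwe="CWE-345",
    ai_threat="T01-DMP",
    recommendation="Verify dataset authenticity before ingestion",
    severity="High",
)
print(f"{f.cwe} [{f.severity}] fix within {f.sla_days()} days: {f.recommendation}")
```

Because the record carries the CWE, the threat, and the retest flag together, the same structure supports both the initial report and the closure verification pass.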
**AI Threat Enumeration and CVE Exploit Path Mapping**