Update 2.2_Appendix_E.md

Added greater clarity on how the mapping from threat to SAIF component, to test, and to that component's vulnerabilities (CVEs and CWEs) makes the penetration testing report more concrete and action-oriented, making it easier to formulate effective remediation recommendations.
Marco Morana
2025-10-21 12:21:10 -04:00
committed by GitHub
parent c9438a0f81
commit 47b241cb29
+19
@@ -4,17 +4,34 @@
This appendix guides penetration testers on mapping discovered CVEs and CWEs in SAIF components of an AI architecture to AI-specific threats. CVEs generally point to vulnerabilities in the underlying technology stack such as libraries, frameworks, APIs used to build AI Systems and AI Applications. Because the pen tests described here target a live AI system/Application, careful scoping is essential: testers must first identify which SAIF components and subcomponents are in scope, enumerate the exact technologies deployed for each, and use that inventory to prioritize CVE/CWE enumeration and threat simulations. In-scope items commonly include components owned or operated by the organization and directly involved in the request→response flow — for example, chat UIs, API backends (e.g., FastAPI), session/orchestration layers, model orchestration frameworks (e.g., LangChain or LlamaIndex), vector stores (Redis, Pinecone, Weaviate), ETL/data pipelines, model-serving endpoints, and internally managed connectors. Because these components can contain outdated, misconfigured, or otherwise exploitable dependencies, the first operational step is threat enumeration: map each in-scope SAIF component to its tech stack, identify relevant CVEs (and corresponding CWEs), and derive likely exploit paths. That mapping then drives focused validation with scanners, SCA tools, and proof-of-concept testing so testers can prioritize, reproduce, and demonstrate how conventional software flaws translate into AI-centric impacts.
**Step 1 — Threat Enumeration and CVE Exploit Path Mapping**
This appendix guides penetration testers on mapping discovered CVEs and CWEs in SAIF components of an AI architecture to AI-specific threats. CVEs generally point to vulnerabilities in the underlying AI technology stack, such as the libraries, frameworks, and APIs used to build AI systems and AI applications. CWEs (Common Weakness Enumerations), on the other hand, describe classes of software design or implementation flaws that may lead to such vulnerabilities. As testers conduct AI Testing Guide (AITG) tests across in-scope SAIF components, the findings may indicate technical exposures or behavioral anomalies in the AI system. To interpret these results effectively, testers should reference the CWE entries that represent the underlying software or design weaknesses that may have caused the observed test failure. This appendix therefore outlines how penetration testers can use CWE mappings to articulate the root causes behind vulnerabilities identified through AITG test execution, and it provides a Threat-to-SAIF-Component-to-CWE mapping that complements the existing Threat-to-Test-Case (AITG) correlation. The goal is to make test reporting more diagnostic and actionable: the more precisely the tester can describe why a test failed in terms of a CWE root cause, the more effectively development and engineering teams can remediate it.
**Step 1 - Scoping AI Penetration Tests Within the SAIF Architecture**
Because the pen tests described here target a live AI system/Application, careful scoping is essential: testers must first identify which SAIF components and subcomponents are in scope, enumerate the exact technologies deployed for each, and use that inventory to prioritize CVE/CWE enumeration and threat simulations. In-scope items commonly include components owned or operated by the organization and directly involved in the request→response flow, for example, chat UIs, API backends (e.g., FastAPI), session/orchestration layers, model orchestration frameworks (e.g., LangChain or LlamaIndex), vector stores (Redis, Pinecone, Weaviate), ETL/data pipelines, model-serving endpoints, and internally managed connectors. Because these components can contain outdated, misconfigured, or otherwise exploitable dependencies, the first operational step is threat enumeration: map each in-scope SAIF component to its tech stack, identify relevant CVEs (and corresponding CWEs), and derive likely exploit paths. That mapping then drives focused validation with scanners, SCA tools, and proof-of-concept testing so testers can prioritize, reproduce, and demonstrate how conventional software flaws translate into AI-centric impacts.
**Step 2 — Threat Enumeration and CVE Exploit Path Mapping**
The process of mapping threats to AI system vulnerabilities starts by identifying known vulnerabilities in AI systems/applications using software composition analysis (SCA) tools and runtime tools. SCA tools (e.g., Snyk, Trivy, Dependabot, OWASP Dependency-Check, and GitHub Advanced Security) flag vulnerable third-party software dependencies, while scanners such as Nessus and Nuclei can confirm active CVE exposures in APIs and services. Runtime telemetry and host inspection can also validate which CVEs are exploitable in live environments. These CVEs are then mapped to the AI-specific threats (i.e., TA0i-XX threats) outlined in this guide: for example, a FastAPI sanitization flaw (CVE-2022-36067) can be part of a prompt-injection vector (T01-DPIJ), and an Airflow ETL vulnerability (CVE-2022-40127) can lead to data poisoning (T01-DMP) in a RAG pipeline.
For each SAIF component in scope, testers review subcomponents, confirm deployed technologies, and run focused tests to find exploitable or unpatched libraries. These findings drive AI-specific attack simulations such as prompt injection, model inversion, data poisoning, or runtime DoS to reveal real application impact. Using the CVE exploit-path mapping table, testers can maintain traceability from vulnerability to AI impact. For instance, Redis in SAIF #4 (Application Layer) vulnerable to CVE-2022-0543 links to risks like data leakage (T01-SID), model disruption (T01-DoSM), and manipulation (T01-MTD). A single Redis compromise can escalate from infrastructure control to model tampering—compromising data integrity, availability, and trust.
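The traceability described above can be sketched as a small lookup structure. This is a minimal illustration, not the guide's tooling: the `ExploitPath` class, the `EXPLOIT_PATHS` list, and the `threat_checklist` function are hypothetical names, and the two entries simply restate the Redis and Airflow examples from the text (the text does not name the SAIF component for the Airflow example, so it is left unset).

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class ExploitPath:
    """One row of a CVE exploit-path mapping (hypothetical structure)."""
    technology: str
    cve: str
    threats: Tuple[str, ...]
    saif_component: Optional[str] = None  # None when the text does not name it


# Entries restate examples given in the text; a real table would be larger.
EXPLOIT_PATHS: List[ExploitPath] = [
    ExploitPath("Redis", "CVE-2022-0543",
                ("T01-SID", "T01-DoSM", "T01-MTD"),
                saif_component="SAIF #4 (Application Layer)"),
    ExploitPath("Airflow", "CVE-2022-40127", ("T01-DMP",)),
]


def threat_checklist(confirmed_cves: Iterable[str]) -> Dict[str, List[str]]:
    """Map each AI threat to the confirmed CVEs that can lead to it."""
    confirmed = set(confirmed_cves)
    checklist: Dict[str, List[str]] = {}
    for path in EXPLOIT_PATHS:
        if path.cve in confirmed:
            for threat in path.threats:
                checklist.setdefault(threat, []).append(path.cve)
    return checklist


if __name__ == "__main__":
    # CVEs confirmed by scanners would be passed in here.
    for threat, cves in threat_checklist(["CVE-2022-0543"]).items():
        print(threat, "<-", ", ".join(cves))
```

Keying the result by threat rather than by CVE gives the tester a list of attack simulations to run, each annotated with the vulnerability that motivates it.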
**Step 2 — Threat Enumeration and CWE Exploit Path Mapping**
The second recommended step is to perform an AI threat enumeration and CWE exploit-path mapping. This step transforms vulnerability-centric testing into design-level assurance. By classifying findings under CWE categories, the pen tester bridges the gap between patch management and resilient AI architecture. CWE mapping clarifies attacker objectives, expands test coverage beyond isolated CVEs, and guides remediation that strengthens entire system layers rather than individual components. The CWE-based table reframes technical flaws as architectural weaknesses: for instance, CWE-20 (Improper Input Validation) exposes weak parsing logic, CWE-276 (Incorrect Default Permissions) reveals insecure defaults in data storage such as S3 buckets, and CWE-345 (Insufficient Verification of Data Authenticity) uncovers trust and integrity flaws in RAG ingestion. This approach helps testers not only find where AI applications break, but also understand why they break and how to redesign them to resist future exploitation.
**Step 3 — Threat Enumeration and CWE Exploit Path Mapping**
The third recommended step is to perform an AI threat enumeration and CWE exploit-path mapping. This step transforms vulnerability-centric testing into design-level assurance. Accordingly, this appendix provides a Threat-to-SAIF-Component-to-CWE mapping, complementing the Threat-to-Test-Case mapping (AITG tests) presented earlier. Together, these mappings enable testers to correlate observed vulnerabilities with their root causes, whether they stem from insecure design flaws, implementation weaknesses, or configuration errors. This approach supports consistent reporting and helps organizations trace AI-specific test failures (e.g., prompt injection, data leakage, model poisoning) back to their foundational weaknesses, improving both remediation guidance and future threat modeling accuracy. By classifying findings under CWE categories, the pen tester bridges the gap between patch management and resilient AI architecture. CWE mapping clarifies attacker objectives, expands test coverage beyond isolated CVEs, and guides remediation that strengthens entire system layers rather than individual components. The CWE-based table reframes technical flaws as architectural weaknesses: for instance, CWE-20 (Improper Input Validation) exposes weak parsing logic, CWE-276 (Incorrect Default Permissions) reveals insecure defaults in data storage such as S3 buckets, and CWE-345 (Insufficient Verification of Data Authenticity) uncovers trust and integrity flaws in RAG ingestion. This approach helps testers not only find where AI applications break, but also understand why they break and how to redesign them to resist future exploitation.
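As a rough sketch of classifying findings under CWE categories, the snippet below groups test failures by root cause. The three CWE names come from the paragraph above; the function name `group_by_root_cause` and the finding identifiers (`finding-A` and so on) are hypothetical placeholders rather than real AITG test IDs.

```python
from typing import Dict, Iterable, List, Tuple

# CWE identifiers and names as cited in the text above.
CWE_NAMES: Dict[str, str] = {
    "CWE-20": "Improper Input Validation",
    "CWE-276": "Incorrect Default Permissions",
    "CWE-345": "Insufficient Verification of Data Authenticity",
}


def group_by_root_cause(
    findings: Iterable[Tuple[str, str]],
) -> Dict[str, List[str]]:
    """Group (finding_id, cwe_id) pairs under a readable CWE label.

    CWEs missing from CWE_NAMES are kept under their bare identifier so
    that no finding is silently dropped from the report.
    """
    grouped: Dict[str, List[str]] = {}
    for finding_id, cwe_id in findings:
        name = CWE_NAMES.get(cwe_id)
        label = f"{cwe_id} ({name})" if name else cwe_id
        grouped.setdefault(label, []).append(finding_id)
    return grouped


if __name__ == "__main__":
    # Hypothetical finding identifiers, for illustration only.
    sample = [
        ("finding-A", "CWE-20"),
        ("finding-B", "CWE-345"),
        ("finding-C", "CWE-20"),
    ]
    for label, ids in group_by_root_cause(sample).items():
        print(f"{label}: {', '.join(ids)}")
```

Grouping this way surfaces cases where several failed tests share one weakness, which points to a design-level fix instead of a series of individual patches.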
**Step 3 — AI Threat Mapping and Secure Design Recommendations**
Finally, the third step is to review the AI threats and targeted CWEs and to provide recommendations for fixing them in the pen testing report. Vulnerability types/CWEs might represent security design flaws or misconfigurations that could be targeted by AI threats. When these CWEs are included in the test report, they should be accompanied by recommendations to fix them, such as enforcing input validation, disabling default public access, verifying dataset authenticity, or encrypting sensitive data. This means pen testers can move from “here is how I broke it” to “here is how you should securely configure or redesign it to prevent recurrence.” As pen testers revisit the AI systems/applications in scope for testing, which might change over time, they can update the CVEs and CWEs of newly discovered vulnerabilities and use the AI Threats column as a checklist for attack simulations in future red-team exercises. Over time, this evolving matrix becomes a living document that supports secure design, ongoing validation, and resilience in AI-enabled systems.
**Step 4 — AI Threat Mapping and Secure Design Recommendations**
Finally, the fourth step is to review the AI threats and targeted CWEs and to provide recommendations for fixing them in the pen testing report. As testers conduct AITG tests across in-scope SAIF components, each test outcome should not only describe the immediate failure condition but also trace it back to the underlying design flaw, coding weakness, or misconfiguration responsible for the issue. These root causes are best framed as CWEs (Common Weakness Enumerations), which provide a standardized vocabulary for describing software and design-level weaknesses. Vulnerability types/CWEs might represent security design flaws or misconfigurations that could be targeted by AI threats. When these CWEs are included in the test report, they should be accompanied by recommendations to fix them, such as enforcing input validation, disabling default public access, verifying dataset authenticity, or encrypting sensitive data. This means pen testers can move from “here is how I broke it” to “here is how you should securely configure or redesign it to prevent recurrence.” As pen testers revisit the AI systems/applications in scope for testing, which might change over time, they can update the CVEs and CWEs of newly discovered vulnerabilities and use the AI Threats column as a checklist for attack simulations in future red-team exercises. Over time, this evolving matrix becomes a living document that supports secure design, ongoing validation, and resilience in AI-enabled systems. Framing results this way also supports structured vulnerability management. Once remediation is implemented, the same AITG test can be re-run (re-test) to verify closure of the issue, ensuring traceability through the vulnerability's lifecycle. In practice, findings should be prioritized by technical risk severity (Critical, High, Medium, or Low) and remediated in alignment with organizational SLA targets for vulnerability resolution.
This approach ensures that AI-specific test results translate into measurable improvements in both software resilience and overall risk posture.
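One way to picture severity-based prioritization against SLA targets is the sketch below. The severity levels are the four named above; the SLA day counts in `SLA_DAYS`, the `remediation_plan` function, and the sample findings are all assumptions made for illustration, since actual targets are set by each organization.

```python
from datetime import date, timedelta
from typing import Dict, List

# Severity levels from the text, ordered from most to least urgent.
SEVERITY_ORDER: List[str] = ["Critical", "High", "Medium", "Low"]

# Hypothetical SLA targets in days; real values are organization-specific.
SLA_DAYS: Dict[str, int] = {"Critical": 7, "High": 30, "Medium": 90, "Low": 180}


def remediation_plan(findings: List[dict], reported_on: date) -> List[dict]:
    """Sort findings by severity and attach a remediation due date.

    Each finding is a dict with at least a "severity" key. The input list
    is not modified; new dicts with an added "due" key are returned.
    """
    ranked = sorted(findings, key=lambda f: SEVERITY_ORDER.index(f["severity"]))
    return [
        {**f, "due": reported_on + timedelta(days=SLA_DAYS[f["severity"]])}
        for f in ranked
    ]


if __name__ == "__main__":
    # Hypothetical findings, each tagged with its root-cause CWE.
    sample = [
        {"cwe": "CWE-276", "severity": "Medium"},
        {"cwe": "CWE-20", "severity": "Critical"},
    ]
    for item in remediation_plan(sample, date(2025, 1, 1)):
        print(item["severity"], item["cwe"], "due", item["due"].isoformat())
```

Once a fix is in place, re-running the same test and recording the result against the same finding keeps the lifecycle traceable from discovery to closure.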
**AI Threat enumeration and CVE exploit path mapping**
In this section we provide a mapping of SAIF components to AI threats, with examples of component-dependent tech-stack CVEs that can be exploited.
| SAIF Component (Number) | Sub-Components | Tech Stack (Chatbot + RAG) | Mapped Threats | Example CVEs in Tech Stack |
@@ -39,6 +56,7 @@ In this section we provide a mapping of SAIF components to AI threats and exampl
| (19) External Sources | Public datasets, 3rd party APIs/feeds | Wikipedia, Common Crawl, arXiv, News APIs | T01-MIMI, T01-SID, T01-DMP, T01-MIS | Dataset poisoning risks (no CVEs, CWE-driven); API poisoning (CWE-345: Insufficient Verification of Data Authenticity) |
**AI Threat enumeration and Targeted CWEs**
In this section we provide a mapping of SAIF components to AI threats, with examples of vulnerability types/CWEs that can be exploited.
| SAIF Component | Mapped Threats | Targeted CWEs |
@@ -63,6 +81,7 @@ In this section we provide a mapping of SAIF components to AI threats and exampl
| (19) External Sources | T01-MIMI, T01-SID, T01-DMP, T01-MIS | CWE-20, CWE-200, CWE-203, CWE-345, CWE-352, CWE-359, CWE-494, CWE-522, CWE-74, CWE-825 |
**AI Threats, Targeted CWEs and Recommendations to Fix Them**
In this section we provide a mapping of SAIF components to threats, possibly targeted CWEs, the rationale for CWEs being targeted, and recommendations for fixing them.
- [(2) User Input](#2-user-input)