39 KiB
Appendix E: SAIF AI Threat Targeted Components & CVEs/CWEs
This appendix is intended to guide penetration testers by showing how common CVEs and CWEs map to AI-specific threats across the SAIF-defined components of an AI architecture. CVEs typically correspond to vulnerabilities in the technology stack—libraries, frameworks, APIs, and services—that implement user interfaces, the model layer, supporting infrastructure, or data sources. Because the AI pen tests of these guide are meant to be performed against an existing application, it is essential that testers perform careful scoping up front: identify which SAIF components and subcomponents are in scope, enumerate the actual technologies deployed for each, and use that inventory to prioritize CVE/CWE enumeration and threat simulations. For example, components directly involved in the AI application’s operation—such as the chat interface, FastAPI backend, model orchestration logic, and connected data stores—should be considered in scope for penetration testing. In contrast, external or third-party services not owned or controlled by the organization, such as vendor APIs or external data feeds, are typically out of scope, as they fall outside the AI application’s trust boundary and control.
Once the components in scope versus out of scope for testing are identified and agreed, the next step is to build or reference an inventory of the technology stack used to develop and operate those components. This inventory ensures that testing activities are precise and aligned with the actual implementations. For example, in-scope components typically include those owned or managed by the organization and directly involved in the application’s request–response flow, such as the chat interface, API backends (e.g. FastAPI), session and orchestration layers, model orchestration frameworks (e.g. LangChain or LlamaIndex), vector databases (e.g. Redis, Pinecone, Weaviate), ETL and data processing pipelines, model-serving endpoints, and any internally managed connectors. Because the in-scope components may contain vulnerable libraries due to being outdated, misconfigured, or exploitable by the services used, the first step is threat enumeration and CVE exploit-path mapping, that is the inventory of known CVEs against the tech stack mapped to AI threats. This mapping also helps in identifing likely attack paths exploiting these vulnerabilities and prioritize those paths for validation with scanners and proof-of-concept testing.
To help with this first step, we provide an example of Threat enumeration and CVE exploit path mapping. To begin, the pen tester performs threat enumeration and CVE exploit-path mapping across the in-scope technology stack. This involves identifying known vulnerabilities using tools such as software composition analyzers (SCA) and runtime scanners. SCA tools like Snyk, Trivy, or Dependabot help reveal vulnerable dependencies and libraries, while scanners such as Nessus or Nuclei can validate active exposures in services and APIs. Runtime telemetry and host-level tools can also provide additional evidence of exploitability in live environments where vulnerable components/libraries are being installed and run. Once these CVEs/vulnerabilities are identified, they could be mapped to AI-specific threats using the AI Threats column. For example, FastAPI sanitization weaknesses (CVE-2022-36067) might appear to be routine web vulnerabilities, but in the context of an LLM they translate to T01-DPJI (direct prompt injection). Similarly, CVE-2022-40127 in a LLM or RAG-based application, affect the Apache Airf or retrieval indexes (like Pinecone or Weaviate). This CVEs ould be exploited for remote code execution and T01-DMP (data poisoning), corrupting training or retrieval data. By mapping each CVE in the AI application tech stack to the specific AI threats turns a routine CVE finding into a clear attack path that explains the practical impact of exploitation of the CVE on AI Application being this altering the LLM model behavior or impacting the data integrity, confidentiality, or availability.
The AI pen-tester can follow a systematic AI pen-testing workflow by following this guidance: for each SAIF component in scope, inspect subcomponents to identify where injection, poisoning, or manipulation are possible, confirm the actual technologies deployed, and run tests to discover vulnerable or unpatched libraries and CVEs. Those findings drive simulations of AI-specific attacks for example prompt injection, model inversion and membership inference, data poisoning, and runtime DoS, to demonstrate real impact on the application. A pen test report could leverage the table of "Threat enumeration and CVE exploit path mappings" to maintain traceability of vulnerabilities and their impacts. A finding might read: Redis, used in SAIF #4 –Application Layer for session caching, API state management, and job queue orchestration, was found vulnerable to CVE-2022-0543 (Lua sandbox escape). This vulnerability could allow remote code execution within the application’s runtime environment, potentially leading to session hijacking, data manipulation, or further compromise of other in-scope components. Redis used in SAIF #4 Application Layer fo session caching, API state management, job queues is found vulnerable to CVE-2022-0543,” which maps to three primary AI-specific threats under SAIF: T01-SID (Sensitive Information Disclosure), where attackers can extract cached API tokens or user session data; T01-DoSM (Denial of Service – Model), where cache corruption or overload disrupts model inference and orchestration; and T01-MTD (Model Tampering/Disclosure), where manipulation of stored orchestration or metadata alters model behavior or exposes internal details. In essence, a single Redis compromise can cascade from infrastructure-level control to data leakage, service disruption, and model manipulation, undermining the integrity and trust of the AI application. This creates a clear chain from vulnerability to exploit to AI-specific risk, making the report resonate with both security engineers and AI/ML practitioners.
The second reccomended step is to conduct a Threat enumeration and CWE exploit path mapping. CWE-based enumeration is the bridge between patch-centric security and resilient AI system design. For pen testers, converting technical findings into CWE classes clarifies attacker goals, enables broader test coverage, and produces remediation guidance that hardens architecture — not only one vulnerable library — against the class of attacks that threaten AI applications. The CWE-based table helps the pen tester in framing these vulnerabilities/findings as design weaknesses, not just CVEs that need patching. For example, CWE-20 (improper input validation) points to weak parsing logic, CWE-276 (incorrect default permissions) highlights misconfigurations in data storage or S3 buckets, and CWE-345 (insufficient verification of data authenticity) shows systemic flaws in RAG ingestion.
Finally, the third step is to look at AI Threats, Targeted CWEs and Provide Recommendations to Fix Them in the Pen Testing Report. CWEs being targeted by a threat needs to be accompanied by secure design recommendations, such as enforcing schema validation, disabling default public access, verifying dataset authenticity, or encrypting sensitive data. This means pen testers can move from “here is how I broke it” to “here is how you should redesign it to prevent recurrence.” As pen testers revisit AI systems/application in scope for testing as these mighr change, they can update the CVE and CWE of newly discovered vulnerabilities and use the AI Threats column as a checklist for attack simulations in future red-team exercises. Over time, this evolving matrix becomes a living document that supports secure design, ongoing validation, and resilience in AI-enabled systems.
AI Threat enumeration and CVE exploit path mapping
In this section we provide a mapping of SAIF components to AI threats and examples of component dependent tech-stack CVEs that can be exploited
| SAIF Component (Number) | Sub-Components | Tech Stack (Chatbot + RAG) | Mapped Threats | Example CVEs in Tech Stack |
|---|---|---|---|---|
| (2) User Input | Text, voice, multimodal parsers | React/Next.js, Slack SDK, Teams Bot, Twilio, Whisper/ASR, FastAPI/Pydantic | T01-DPIJ, T01-IPI J, T01-SID, T01-DoSM, T01-IOH, T01-MTU | React XSS (CVE-2021-24033); FastAPI vuln (CVE-2023-27533); Twilio SDK (CVE-2022-36449) |
| (3) User Output | Renderers, formatting, TTS/visual output | React chat widgets, Slack/Teams cards, Polly/ElevenLabs, Markdown renderers | T01-EA, T01-SPL, T01-MIS, T01-IOH | Slack API auth bypass (CVE-2020-10753); Markdown injection (CVE-2022-21681) |
| (4) Application | Orchestration, session mgmt, APIs, business logic | LangChain, LlamaIndex, Semantic Kernel, FastAPI/Flask, Redis sessions, GraphQL APIs | T01-DPIJ, T01-IPI J, T01-SID, T01-DoSM, T01-MTU, T01-IOH, T01-EA, T01-SPL, T01-MIS | Flask template injection (CVE-2019-8341); Redis RCE (CVE-2022-0543); GraphQL DoS (CVE-2020-15159) |
| (5) Agent/Plugin | Connectors, plugin registry, tool adapters | LangGraph Agents, OpenAI Functions, Zapier/n8n, custom OpenAPI tools | T01-IPI J, T01-SID, T01-MTD, T01-EA, T01-VEW | n8n RCE (CVE-2023-37925); OpenAPI tooling parser injection (CVE-2021-32640) |
| (6) External Sources (App) | APIs, SaaS services, enterprise connectors | Salesforce, ServiceNow, Confluence, SharePoint APIs | T01-IPI J, T01-MTD, T01-SID, T01-EA, T01-VEW, T01-DMP | Confluence RCE (CVE-2023-22515); SharePoint RCE (CVE-2023-29357) |
| (7) Input Handling | Validation, sanitization, PII detection, scanning | Pydantic, JSON Schema, Presidio, ClamAV | T01-DPIJ, T01-AIE, T01-SID, T01-LSID, T01-DoSM, T01-SPL, T01-VEW | ClamAV RCE (CVE-2023-20032); JSON Schema validator injection (GitHub advisories) |
| (8) Output Handling | Filters, moderation, redaction, grounding checks | Guardrails.ai, OpenAI Moderation, NeMo Guardrails, RAGAS | T01-LSID, T01-SID, T01-DoSM, T01-SPL, T01-IOH, T01-TDL, T01-MTU, T01-EA, T01-MIS | NeMo Guardrails Python deps RCE (via PyTorch CVEs) |
| (9) Model | LLM weights, embeddings, rerankers | GPT-4o, Claude, Llama-3, Mistral, Cohere reranker, BGE embeddings | T01-DPIJ, T01-IPI J, T01-SCMP, T01-AIE, T01-DPFT, T01-RMP, T01-DMP, T01-SID, T01-MIMI, T01-TDL, T01-DoSM, T01-LSID, T01-SPL, T01-VEW, T01-MTU, T01-IOH, T01-MTR, T01-EA, T01-MIS | PyTorch vuln (CVE-2022-45907); TensorFlow overflow (CVE-2021-37678); Hugging Face sandbox escape (CVE-2023-6730) |
| (10) Model Storage Infrastructure | Registry, encrypted artifacts | MLflow, S3/GCS, Azure Blob, Vertex AI Registry | T01-DPFT, T01-SCMP, T01-MTR, T01-MTD | MLflow path traversal (CVE-2023-6836); AWS S3 bucket takeover misconfigs (CWE-based) |
| (11) Model Serving Infrastructure | GPU runtimes, inference servers, autoscaling | vLLM, NVIDIA Triton, TensorRT-LLM, Kubernetes GPU nodes | T01-SCMP, T01-MTU, T01-MTR, T01-DoSM | NVIDIA Triton RCE (CVE-2023-31036); Kubernetes privilege escalation (CVE-2023-3676); NVIDIA GPU DoS (CVE-2024-0146) |
| (12) Evaluation | Golden sets, drift/bias eval, safety harness | RAGAS, DeepEval, W&B, Evidently AI, Great Expectations | T01-AIE, T01-DMP, T01-LSID, T01-SID, T01-TDL, T01-DoSM, T01-MTU, T01-IOH, T01-MIS | Weights & Biases CLI vuln (GitHub advisories); Great Expectations YAML injection (potential CWE-74) |
| (13) Training & Tuning | Pipelines, fine-tuning, HPO | Kubeflow, SageMaker, Hugging Face PEFT, Optuna | T01-AIE, T01-MIS, T01-DPFT, T01-SCMP, T01-MTD | Kubeflow dashboard RCE (CVE-2021-31812); SageMaker Jupyter RCE (AWS advisory); Hugging Face PEFT vuln (CVE-2023-6730) |
| (14) Model Frameworks & Code | Frameworks, tokenizers, compilers | PyTorch, TensorFlow, Hugging Face, ONNX Runtime | T01-SCMP, T01-MTD, T01-VEW | TensorFlow buffer overflow (CVE-2021-37678); PyTorch vulnerability (CVE-2022-45907); ONNX Runtime DoS (CVE-2022-25883) |
| (15) Data Storage Infrastructure | Vector DBs, RDBMS, object stores | Weaviate, Pinecone, Milvus, Redis, Postgres, S3 | T01-RMP, T01-DMP, T01-DPFT, T01-SCMP, T01-SID, T01-MTD, T01-LSID | Redis RCE (CVE-2022-0543); PostgreSQL escalation (CVE-2023-2454); Milvus injection (CVE-2023-48022) |
| (16) Training Data | Raw corpora, labeled, synthetic | Chat logs, FAQs, Label Studio, synthetic Q&A | T01-MIMI, T01-TDL, T01-SID | Label Studio auth bypass (CVE-2021-36701) |
| (17) Data Filtering & Processing | ETL, cleaning, chunking, tagging | Airflow, dbt, Unstructured.io, spaCy, NLTK | T01-RMP, T01-DMP, T01-DPFT, T01-SID, T01-MIMI, T01-TDL, T01-VEW, T01-MIS | Apache Airflow RCE (CVE-2023-42793); dbt adapter injection (GitHub advisories) |
| (18) Data Sources | Internal KBs, CRM, telemetry | Confluence, Jira, Elastic, Splunk | T01-SID, T01-DMP, T01-VEW, T01-MIS | Confluence RCE (CVE-2023-22515); Jira auth bypass (CVE-2020-14181); ElasticSearch RCE (CVE-2015-1427); Splunk RCE (CVE-2022-32158) |
| (19) External Sources | Public datasets, 3rd party APIs/feeds | Wikipedia, Common Crawl, arXiv, News APIs | T01-MIMI, T01-SID, T01-DMP, T01-MIS | Dataset poisoning risks (no CVEs, CWE-driven); API poisoning (CWE-345: Insufficient Verification of Data Authenticity) |
AI Threat enumeration and Targeted CWEs
In this section we provide a mapping of SAIF components to AI threats and examples of vulnerability types/CWEs that can be exploited
| SAIF Component | Mapped Threats | Targeted CWEs |
|---|---|---|
| (2) User Input | T01-DPIJ, T01-IPI J, T01-SID, T01-DoSM, T01-IOH, T01-MTU | CWE-116, CWE-1204, CWE-1389, CWE-20, CWE-200, CWE-359, CWE-400, CWE-522, CWE-74, CWE-75, CWE-770, CWE-787, CWE-79, CWE-94 |
| (3) User Output | T01-EA, T01-SPL, T01-MIS, T01-IOH | CWE-116, CWE-209, CWE-284, CWE-285, CWE-345, CWE-352, CWE-359, CWE-640, CWE-79, CWE-825 |
| (4) Application | T01-DPIJ, T01-IPI J, T01-SID, T01-DoSM, T01-MTU, T01-IOH, T01-EA, T01-SPL, T01-MIS | CWE-116, CWE-1204, CWE-1389, CWE-20, CWE-200, CWE-209, CWE-284, CWE-285, CWE-345, CWE-352, CWE-359, CWE-400, CWE-522, CWE-640, CWE-74, CWE-75, CWE-770, CWE-787, CWE-79, CWE-825, CWE-94 |
| (5) Agent/Plugin | T01-IPI J, T01-SID, T01-MTD, T01-EA, T01-VEW | CWE-1389, CWE-20, CWE-200, CWE-276, CWE-284, CWE-285, CWE-359, CWE-494, CWE-502, CWE-522, CWE-74, CWE-829, CWE-918, CWE-94 |
| (6) External Sources | T01-IPI J, T01-MTD, T01-SID, T01-EA, T01-VEW, T01-DMP | CWE-1389, CWE-20, CWE-200, CWE-276, CWE-284, CWE-285, CWE-359, CWE-494, CWE-502, CWE-522, CWE-74, CWE-829, CWE-918, CWE-94 |
| (7) Input Handling | T01-DPIJ, T01-AIE, T01-SID, T01-LSID, T01-DoSM, T01-SPL, T01-VEW | CWE-117, CWE-1389, CWE-20, CWE-200, CWE-209, CWE-359, CWE-400, CWE-502, CWE-522, CWE-532, CWE-640, CWE-693, CWE-74, CWE-770, CWE-787, CWE-829, CWE-918 |
| (8) Output Handling | T01-LSID, T01-SID, T01-DoSM, T01-SPL, T01-IOH, T01-TDL, T01-MTU, T01-EA, T01-MIS | CWE-116, CWE-117, CWE-1204, CWE-200, CWE-201, CWE-209, CWE-284, CWE-285, CWE-345, CWE-352, CWE-359, CWE-400, CWE-522, CWE-532, CWE-640, CWE-75, CWE-770, CWE-787, CWE-79, CWE-825 |
| (9) Model | T01-DPIJ, T01-IPI J, T01-SCMP, T01-AIE, T01-DPFT, T01-RMP, T01-DMP, T01-SID, T01-MIMI, T01-TDL, T01-DoSM, T01-LSID, T01-SPL, T01-VEW, T01-MTU, T01-IOH, T01-MTR, T01-EA, T01-MIS | CWE-116, CWE-117, CWE-119, CWE-1204, CWE-1389, CWE-20, CWE-200, CWE-201, CWE-203, CWE-209, CWE-276, CWE-284, CWE-285, CWE-345, CWE-352, CWE-359, CWE-400, CWE-494, CWE-502, CWE-522, CWE-532, CWE-640, CWE-693, CWE-74, CWE-75, CWE-770, CWE-787, CWE-79, CWE-825, CWE-829, CWE-830, CWE-918, CWE-94 |
| (10) Model Storage Infra | T01-DPFT, T01-SCMP, T01-MTR, T01-MTD | CWE-276, CWE-284, CWE-285, CWE-494, CWE-522, CWE-829, CWE-830 |
| (11) Model Serving Infra | T01-SCMP, T01-MTU, T01-MTR, T01-DoSM | CWE-1204, CWE-276, CWE-284, CWE-400, CWE-494, CWE-522, CWE-75, CWE-770, CWE-787, CWE-829 |
| (12) Evaluation | T01-AIE, T01-DMP, T01-LSID, T01-SID, T01-TDL, T01-DoSM, T01-MTU, T01-IOH, T01-MIS | CWE-116, CWE-117, CWE-1204, CWE-1389, CWE-20, CWE-200, CWE-201, CWE-345, CWE-352, CWE-359, CWE-400, CWE-494, CWE-522, CWE-532, CWE-693, CWE-74, CWE-75, CWE-770, CWE-787, CWE-79, CWE-825 |
| (13) Training & Tuning | T01-AIE, T01-MIS, T01-DPFT, T01-SCMP, T01-MTD | CWE-1389, CWE-20, CWE-276, CWE-285, CWE-345, CWE-352, CWE-494, CWE-693, CWE-825, CWE-829, CWE-830 |
| (14) Model Frameworks & Code | T01-SCMP, T01-MTD, T01-VEW | CWE-276, CWE-285, CWE-494, CWE-502, CWE-829, CWE-918 |
| (15) Data Storage Infra | T01-RMP, T01-DMP, T01-DPFT, T01-SCMP, T01-SID, T01-MTD, T01-LSID | CWE-117, CWE-119, CWE-20, CWE-200, CWE-276, CWE-285, CWE-359, CWE-494, CWE-522, CWE-532, CWE-74, CWE-829, CWE-830, CWE-94 |
| (16) Training Data | T01-MIMI, T01-TDL, T01-SID | CWE-200, CWE-201, CWE-203, CWE-359, CWE-522 |
| (17) Data Filtering & Processing | T01-RMP, T01-DMP, T01-DPFT, T01-SID, T01-MIMI, T01-TDL, T01-VEW, T01-MIS | CWE-119, CWE-20, CWE-200, CWE-201, CWE-203, CWE-345, CWE-352, CWE-359, CWE-494, CWE-502, CWE-522, CWE-74, CWE-825, CWE-829, CWE-830, CWE-918, CWE-94 |
| (18) Data Sources | T01-SID, T01-DMP, T01-VEW, T01-MIS | CWE-20, CWE-200, CWE-345, CWE-352, CWE-359, CWE-494, CWE-502, CWE-522, CWE-74, CWE-825, CWE-829, CWE-918 |
| (19) External Sources | T01-MIMI, T01-SID, T01-DMP, T01-MIS | CWE-20, CWE-200, CWE-203, CWE-345, CWE-352, CWE-359, CWE-494, CWE-522, CWE-74, CWE-825 |
AI Threats, Targeted CWEs and Recommendations to Fix Them
In this section we provide a mapping of SAIF components to threats, possibly targeted CWEs, the rationale for CWEs being targeted, and recommendations for fixing them.
- (2) User Input
- (3) User Output
- (4) Application
- (5) Agent / Plugin
- (6) External Sources
- (7) Input Handling
- (8) Output Handling
- (9) Model
- (10) Model Storage Infrastructure
- (11) Model Serving Infrastructure
- (12) Evaluation
- (13) Training & Tuning
- (14) Model Frameworks & Code
- (15) Data Storage Infrastructure
- (16) Training Data
- (17) Data Filtering & Processing
- (18) Data Sources
- (19) External Sources
(2) User Input
Summary: User Input is the front door of the system — every downstream component depends on it. Without strong input validation, filtering, and limits, it becomes the main vector for prompt injection, data leakage, DoS, and toxicity propagation.
Threats: T01-DPIJ, T01-IPI J, T01-SID, T01-DoSM, T01-IOH, T01-MTU
Targeted CWEs:
CWE-20, CWE-74, CWE-94, CWE-707, CWE-200, CWE-359, CWE-522, CWE-400, CWE-770, CWE-787, CWE-116, CWE-79
Direct Prompt Injection (T01-DPIJ) & Indirect Prompt Injection (T01-IPIJ)
Mapped CWEs: CWE-20, CWE-74, CWE-94, CWE-707
Rationale: Maliciously crafted inputs (user prompts or embedded instructions) can override instructions or trigger unintended actions.
Recommendations:
- Apply strict input validation and canonicalization before passing content to the model.
- Use prompt isolation/sandboxing (separate user and system instructions).
- Enforce allowlist-based instruction patterns.
- Test with adversarial prompt fuzzing.
Sensitive Information Disclosure (T01-SID)
Mapped CWEs: CWE-200, CWE-359, CWE-522
Rationale: Inputs may include secrets/PII that can be reflected in outputs or logs.
Recommendations:
- Integrate DLP filters into input channels.
- Mask/tokenize secrets and PII before forwarding to the model.
- Restrict logging of raw inputs.
Denial of Service – Model (T01-DoSM)
Mapped CWEs: CWE-400, CWE-770, CWE-787
Rationale: Oversized or adversarial inputs can exhaust tokens/compute.
Recommendations:
- Set input size and tokenization limits.
- Apply rate-limits and per-user quotas.
- Use circuit breakers/autoscaling.
Insecure Output Handling Triggered by Inputs (T01-IOH)
Mapped CWEs: CWE-116, CWE-79
Rationale: Malicious inputs may propagate to rendered outputs (e.g., XSS).
Recommendations:
- Sanitize and encode outputs by context (HTML/MD/JSON).
- Separate data from control characters; use safe rendering frameworks.
Model Toxicity / Unreliable Outputs (T01-MTU)
Mapped CWEs: CWE-707, CWE-345, CWE-1204
Rationale: Inputs can steer models toward toxic or unreliable content.
Recommendations:
- Add toxicity/bias classifiers and context filters.
- Escalate high-risk cases to human review.
(3) User Output
Summary: The last mile to users/connected systems; without control, it’s a vector for excessive agency, prompt leakage, misinformation, and unsafe rendering.
Threats: T01-EA, T01-SPL, T01-MIS, T01-IOH
Targeted CWEs:
CWE-284, CWE-285, CWE-200, CWE-209, CWE-359, CWE-532, CWE-116, CWE-79, CWE-75, CWE-345, CWE-1204
Excessive Agency (T01-EA)
Mapped CWEs: CWE-284, CWE-285
Rationale: Action-bearing outputs can trigger privileged operations without proper scoping.
Recommendations:
- Enforce least-privilege scopes for action outputs.
- Require policy checks before rendering actionable UI.
- Use allowlists and out-of-band approvals for high-risk actions.
Sensitive Prompt Leakage (T01-SPL)
Mapped CWEs: CWE-200, CWE-209, CWE-359, CWE-532
Rationale: Hidden prompts/keys/PII can surface in responses, errors, or logs.
Recommendations:
- Redact secrets/PII/system instructions before render/logging.
- Wrap errors safely; never show raw tool/model errors.
- Separate user-visible and operator logs with DLP.
Misinformation (T01-MIS)
Mapped CWEs: CWE-345, CWE-1204
Rationale: Ungrounded claims appear credible in UI.
Recommendations:
- Require grounding/citations for high-risk claims.
- Add verification metrics and “needs review” flags.
Insecure Output Handling (T01-IOH)
Mapped CWEs: CWE-116, CWE-79, CWE-75
Rationale: Unsanitized text can execute in rich renderers.
Recommendations:
- Render from structured formats; encode per context.
- Sanitize Markdown/HTML via allowlists; disable unsafe embeds.
(4) Application
Summary: Orchestration brain (sessions, APIs, business logic). Weak validation or access controls can cascade into systemic compromise.
Threats: T01-DPIJ, T01-IPI J, T01-SID, T01-DoSM, T01-MTU, T01-IOH, T01-EA, T01-SPL, T01-MIS
Targeted CWEs:
CWE-20, CWE-74, CWE-94, CWE-200, CWE-209, CWE-359, CWE-522, CWE-400, CWE-770, CWE-787, CWE-116, CWE-79, CWE-75, CWE-284, CWE-285, CWE-345, CWE-1204
Prompt Injection (T01-DPIJ, T01-IPIJ)
Mapped CWEs: CWE-20, CWE-74, CWE-94
Rationale: Unvalidated inputs into core instruction sets allow overrides.
Recommendations: Schema validation, role separation, safe interpreter layer.
Sensitive Information Disclosure (T01-SID, T01-SPL)
Mapped CWEs: CWE-200, CWE-209, CWE-359, CWE-522
Rationale: Secrets leak via logs/prompts/plugins.
Recommendations: Redact secrets, RBAC on sensitive data, safe error handling.
Denial of Service – Model (T01-DoSM)
Mapped CWEs: CWE-400, CWE-770, CWE-787
Recommendations: Rate-limit orchestration, circuit breakers, size checks.
Model Toxicity / Misinformation (T01-MTU, T01-MIS)
Mapped CWEs: CWE-345, CWE-1204
Recommendations: Grounding checks, toxicity/bias filters, confidence flags.
Insecure Output Handling (T01-IOH)
Mapped CWEs: CWE-79, CWE-116, CWE-75
Recommendations: Contextual encoding/sanitization; strip unsafe HTML/MD.
Excessive Agency (T01-EA)
Mapped CWEs: CWE-284, CWE-285
Recommendations: Least privilege, allowlists, secondary approvals.
(5) Agent / Plugin
Summary: Extended arms of the system; vulnerable to IPIJ, secrets handling, tampering, excessive actions, and unsafe workflows.
Threats: T01-IPI J, T01-SID, T01-MTD, T01-EA, T01-VEW
Targeted CWEs:
CWE-20, CWE-74, CWE-94, CWE-200, CWE-359, CWE-522, CWE-284, CWE-285, CWE-276, CWE-494, CWE-829, CWE-918, CWE-502
Indirect Prompt Injection (T01-IPIJ)
Mapped CWEs: CWE-20, CWE-74, CWE-94
Recommendations: Strict I/O schemas, escape parameters, forbid dynamic eval.
Sensitive Information Disclosure (T01-SID)
Mapped CWEs: CWE-200, CWE-359, CWE-522
Recommendations: Scoped credentials, redact tool responses, data minimization.
Model Tampering / Disclosure (T01-MTD)
Mapped CWEs: CWE-276, CWE-285, CWE-494
Recommendations: Hardened permissions, signed manifests, artifact signing.
Excessive Agency (T01-EA)
Mapped CWEs: CWE-284, CWE-285
Recommendations: Per-action least privilege, policy gates, human-in-the-loop.
Vulnerable External Workflow (T01-VEW)
Mapped CWEs: CWE-829, CWE-918, CWE-502
Recommendations: Tool allowlists, egress proxy, safe content types.
Operational Hardening (cross-cutting): Per-tool rate limits/timeouts; container isolation; telemetry; signed releases/SBOMs; tenant isolation for state.
(6) External Sources
Summary: Bridges to the outside world; unverified data can inject poison, trigger unsafe actions, or spread misinformation.
Threats: T01-IPI J, T01-MTD, T01-SID, T01-EA, T01-VEW, T01-DMP
Targeted CWEs:
CWE-20, CWE-74, CWE-94, CWE-200, CWE-359, CWE-522, CWE-276, CWE-284, CWE-285, CWE-494, CWE-829, CWE-918, CWE-502, CWE-353, CWE-345
Indirect Prompt Injection (T01-IPIJ)
Recommendations: Sanitize/normalize external content; restrict content types; segregate retrieved content.
Model Tampering/Disclosure (T01-MTD)
Recommendations: Integrity/signature checks; least-privilege access; explicit approvals; hardened storage permissions.
Sensitive Information Disclosure (T01-SID)
Recommendations: Mask sensitive fields; scoped OAuth; DLP policies.
Excessive Agency (T01-EA)
Recommendations: RBAC and allowlists for sources; policy checks before executing; sandboxed connectors.
Vulnerable External Workflow (T01-VEW)
Recommendations: Egress proxy + allowlists; safe content types; SBOM verification.
Data / Model Poisoning (T01-DMP)
Recommendations: Provenance/reputation scoring; adversarial sample testing; cryptographic integrity checks.
(7) Input Handling
Summary: The filter layer; weak parsing/schema enforcement lets adversarial inputs/injections slip through.
Threats: T01-DPIJ, T01-AIE, T01-SID, T01-LSID, T01-DoSM, T01-SPL, T01-VEW
Targeted CWEs:
CWE-20, CWE-74, CWE-94, CWE-200, CWE-359, CWE-522, CWE-532, CWE-209, CWE-400, CWE-770, CWE-787, CWE-79, CWE-116, CWE-75, CWE-918
Prompt Injection (T01-DPIJ)
Recommendations: Strict schemas and typing; strip unsafe control sequences; sandbox inputs.
Adversarial Input Evasion (T01-AIE)
Recommendations: Unicode normalization; adversarial testing; layered validation.
Sensitive Information Disclosure (T01-SID, T01-LSID, T01-SPL)
Recommendations: Ingestion-time redaction; masked logging; sanitize logs and errors.
Denial of Service – Model (T01-DoSM)
Recommendations: Input size/rate quotas; buffer validation.
Vulnerable External Workflow (T01-VEW)
Recommendations: Domain allowlists + proxy; content-type validation.
(8) Output Handling
Summary: Safety gate before delivery; failure here leaks sensitive data, misinformation, and unsafe content.
Threats: T01-LSID, T01-SID, T01-DoSM, T01-SPL, T01-IOH, T01-TDL, T01-MTU, T01-EA, T01-MIS
Targeted CWEs:
CWE-79, CWE-116, CWE-75, CWE-200, CWE-209, CWE-359, CWE-532, CWE-522, CWE-400, CWE-770, CWE-787, CWE-284, CWE-285, CWE-345, CWE-1204
Log/Storage Information Disclosure (T01-LSID)
Recommendations: Strip sensitive context; RBAC for logs; safe error messages.
Sensitive Information Disclosure (T01-SID, T01-SPL, T01-TDL)
Recommendations: Post-output DLP; encrypt/mask sensitive fields; prevent recall of sensitive training rows.
Denial of Service – Model (T01-DoSM)
Recommendations: Cap output size/tokens; quarantine oversized outputs; validate downstream buffers.
Insecure Output Handling (T01-IOH)
Recommendations: Contextual encoding; allowlist sanitizers; disable rich rendering for untrusted text.
Training Data Leakage (T01-TDL)
Recommendations: Differential privacy; verbatim/entropy filters; redact prompts; restrict logging.
Model Toxicity / Misinformation (T01-MTU, T01-MIS)
Recommendations: Toxicity/bias filters; grounding/citations; fallbacks.
Excessive Agency (T01-EA)
Recommendations: Allowlisted commands; authorization checks; explicit confirmation.
(9) Model
Summary: The core intelligence; targeted by injection, poisoning, theft, inversion, DoS, and unsafe outputs.
Threats:
T01-DPIJ, T01-IPI J, T01-SCMP, T01-AIE, T01-DPFT, T01-RMP, T01-DMP, T01-SID, T01-MIMI, T01-TDL, T01-DoSM, T01-LSID, T01-SPL, T01-VEW, T01-MTU, T01-IOH, T01-MTR, T01-EA, T01-MIS
Targeted CWEs:
CWE-20, CWE-74, CWE-94, CWE-200, CWE-209, CWE-359, CWE-522, CWE-532, CWE-276, CWE-284, CWE-285, CWE-400, CWE-770, CWE-787, CWE-918, CWE-502, CWE-494, CWE-345, CWE-353, CWE-1204, CWE-116, CWE-119, CWE-830, CWE-829, CWE-640, CWE-693, CWE-75, CWE-79
Prompt Injection (T01-DPIJ, T01-IPIJ)
Recommendations: Separate system/developer prompts; tokenizer-stage filtering; adversarial training.
Supply Chain / Data & Fine-tuning Poisoning (T01-SCMP, T01-DPFT, T01-RMP, T01-DMP)
Recommendations: Signed weights/datasets; provenance scoring; adversarial sanitation; SBOMs.
Adversarial Input Evasion (T01-AIE)
Recommendations: Normalize before tokenization; robustness testing; monitor embeddings.
Sensitive Information Disclosure / Training Data Leakage (T01-SID, T01-TDL, T01-LSID, T01-SPL)
Recommendations: DP in training; block verbatim sequences; redact system prompts; restrict logging.
Model Inversion / Membership Inference (T01-MIMI)
Recommendations: DP-SGD; rate limits/randomization; run MI red-teaming.
Denial of Service – Model (T01-DoSM)
Recommendations: Cap context; detect anomalies; harden serving buffers.
Insecure Output Handling / Unsafe Integrations (T01-IOH, T01-VEW)
Recommendations: Sanitize outputs; whitelist tools; enforce policy layers.
Model Theft / Exfiltration (T01-MTR, T01-MTD)
Recommendations: Access controls; encryption at rest; monitor for exfil.
Model Toxicity / Misinformation / Excessive Agency (T01-MTU, T01-MIS, T01-EA)
Recommendations: Toxicity/bias post-filters; grounding; restrict actionable outputs; approvals.
(10) Model Storage Infrastructure
Summary: Crown jewels at rest — must be encrypted, signed, and access-controlled.
Threats: T01-DPFT, T01-SCMP, T01-MTR, T01-MTD
Targeted CWEs:
CWE-276, CWE-284, CWE-285, CWE-200, CWE-359, CWE-522, CWE-494, CWE-353, CWE-922
Data/Prompt Fine-Tuning Poisoning (T01-DPFT)
Recommendations: Cryptographic signing + checksums; read-only versioned storage; attestation.
Supply Chain Model Poisoning (T01-SCMP)
Recommendations: Trusted registries; verify lineage; pin dependencies.
Model Theft / Exfiltration (T01-MTR)
Recommendations: Encrypt with KMS; least-privilege; monitor bulk downloads; harden defaults.
Model Tampering / Disclosure (T01-MTD)
Recommendations: WORM storage; integrity verification on load; restrict access to service accounts.
(11) Model Serving Infrastructure
Summary: Execution gateway; must resist poisoning, theft, DoS, and unsafe outputs.
Threats: T01-SCMP, T01-MTU, T01-MTR, T01-DoSM
Targeted CWEs:
CWE-276, CWE-284, CWE-285, CWE-400, CWE-770, CWE-787, CWE-494, CWE-353, CWE-345, CWE-1204, CWE-75
Supply Chain Model Poisoning (T01-SCMP)
Recommendations: Signed container images; checksums; SBOM-enforced provenance; block untrusted registries.
Model Toxicity / Unreliable Outputs (T01-MTU)
Recommendations: Moderation/toxicity filters; grounding checks; safe fallbacks.
Model Theft / Exfiltration (T01-MTR)
Recommendations: Rate limits/anomaly detection; mTLS + RBAC; encrypt weights; harden FS perms.
Denial of Service – Model (T01-DoSM)
Recommendations: Cap request size/tokens; quotas at gateway; circuit breakers/autoscaling; robust parsers.
(12) Evaluation
Summary: The safety lens; poison/bypass here yields false assurance.
Threats: T01-AIE, T01-DMP, T01-LSID, T01-SID, T01-TDL, T01-DoSM, T01-MTU, T01-IOH, T01-MIS
Targeted CWEs:
CWE-20, CWE-116, CWE-200, CWE-209, CWE-359, CWE-532, CWE-400, CWE-770, CWE-787, CWE-345, CWE-1204
Adversarial Input Evasion (T01-AIE)
Recommendations: Schema validation; normalization; adversarial red-teaming.
Data/Model Poisoning (T01-DMP)
Recommendations: Verify dataset provenance; cross-check baselines; ensemble evaluation.
Information Disclosure (T01-LSID, T01-SID, T01-TDL)
Recommendations: Sanitize logs; encrypt/ACL datasets; monitor for memorization leakage.
Denial of Service – Model (T01-DoSM)
Recommendations: Limit dataset size/runs; rate-limit jobs; fault isolation.
Model Toxicity / Unsafe Output / Misinformation (T01-MTU, T01-IOH, T01-MIS)
Recommendations: Include toxicity/factuality benchmarks; require grounding; scan for unsafe HTML/MD.
(13) Training & Tuning
Summary: Where knowledge is forged; poor data embeds lasting bias/backdoors.
Threats: T01-AIE, T01-MIS, T01-DPFT, T01-SCMP, T01-MTD
Targeted CWEs:
CWE-20, CWE-116, CWE-345, CWE-353, CWE-494, CWE-276, CWE-284, CWE-285, CWE-200, CWE-359
Adversarial Input Evasion (T01-AIE)
Recommendations: Enforce schemas + canonical normalization; adversarial resilience tests; anomaly detection in preprocessing.
Misinformation (T01-MIS)
Recommendations: Validate vs trusted sources; human oversight; training-time grounding.
Data/Prompt Fine-Tuning Poisoning (T01-DPFT)
Recommendations: Signed datasets; immutable baselines; adversarial testing pre-deploy.
Supply Chain Model Poisoning (T01-SCMP)
Recommendations: Trusted registries; signatures; hardened defaults and scoped access.
Model Tampering / Disclosure (T01-MTD)
Recommendations: Encrypt checkpoints/logs; RBAC; regular permission audits.
(14) Model Frameworks & Code
Summary: ML runtime backbone; supply chain or unsafe integrations taint the system.
Threats: T01-SCMP, T01-MTD, T01-VEW
Targeted CWEs:
CWE-94, CWE-95, CWE-829, CWE-494, CWE-353, CWE-276, CWE-284, CWE-285, CWE-918, CWE-502
Supply Chain Model Poisoning (T01-SCMP)
Recommendations: Pin versions; require signed packages; scan dependencies; maintain SBOMs.
Model Tampering / Disclosure (T01-MTD)
Recommendations: Harden runtimes; least-privilege service accounts; audit framework binaries.
Vulnerable External Workflow / Unsafe Integration (T01-VEW)
Recommendations: Disable/sandbox dynamic eval; restrict plugin loading; isolate untrusted code; harden deserialization.
(15) Data Storage Infrastructure
Summary: Knowledge vault; poisoning/tampering/leaks here undermine integrity & confidentiality.
Threats: T01-RMP, T01-DMP, T01-DPFT, T01-SCMP, T01-SID, T01-MTD, T01-LSID
Targeted CWEs:
CWE-276, CWE-284, CWE-285, CWE-200, CWE-359, CWE-522, CWE-532, CWE-400, CWE-770, CWE-787, CWE-494, CWE-353, CWE-345, CWE-922
Runtime/Model/Data Poisoning (T01-RMP, T01-DMP, T01-DPFT, T01-SCMP)
Recommendations: Integrity checks; provenance scoring; append-only/versioned stores; anomaly monitoring.
Sensitive Information Disclosure (T01-SID, T01-LSID)
Recommendations: Encrypt at rest + KMS; RBAC; sanitized logging; access monitoring.
Model/Data Tampering or Exfiltration (T01-MTD)
Recommendations: Disable public/broad ACLs; per-tenant keys; least-privilege; immutable storage for critical data.
Denial of Service – Storage
Recommendations: Quotas and rate limits; hardened parsers/buffers; ingestion throttling.
(16) Training Data
Summary: Root of trust; compromise propagates to all downstream behavior.
Threats: T01-MIMI, T01-TDL, T01-SID
Targeted CWEs:
CWE-200, CWE-359, CWE-522, CWE-345, CWE-353, CWE-494, CWE-276, CWE-284, CWE-285
Model Inversion / Membership Inference (T01-MIMI)
Recommendations: Differential privacy; strict RBAC on raw data; detect inversion patterns.
Training Data Leakage (T01-TDL)
Recommendations: Encrypt datasets; keep creds out of pipelines; tokenize sensitive fields pre-ingestion.
Sensitive Information Disclosure (T01-SID)
Recommendations: Least-privilege; row/column-level policies; audit all access.
Data Authenticity
Recommendations: Signed/versioned datasets; provenance scoring; golden-set cross-validation.
(17) Data Filtering & Processing
Summary: Gatekeeper stage; weak validation lets poisoned/sensitive data pass.
Threats: T01-RMP, T01-DMP, T01-DPFT, T01-SID, T01-MIMI, T01-TDL, T01-VEW, T01-MIS
Targeted CWEs:
CWE-20, CWE-116, CWE-200, CWE-359, CWE-345, CWE-353, CWE-494, CWE-276, CWE-284, CWE-285, CWE-400, CWE-770, CWE-787, CWE-829, CWE-918, CWE-502
Runtime / Data Poisoning (T01-RMP, T01-DMP, T01-DPFT)
Recommendations: Signed datasets; hash verification; drift detection.
Sensitive Information Disclosure (T01-SID, T01-TDL, T01-MIMI)
Recommendations: DLP in preprocessing; masking/tokenization; RBAC for feature stores.
Vulnerable External Workflow (T01-VEW)
Recommendations: Sandbox transforms; egress filtering; forbid unsafe deserialization.
Misinformation (T01-MIS)
Recommendations: Reputation/ground-truth validation; cross-dataset checks; human review for high-risk domains.
Denial of Service on Pipelines
Recommendations: Size quotas; ingestion rate limits; anomaly monitoring.
(18) Data Sources
Summary: Entry point of truth; without provenance checks, they introduce poisoned/unsafe content.
Threats: T01-SID, T01-DMP, T01-VEW, T01-MIS
Targeted CWEs:
CWE-200, CWE-359, CWE-522, CWE-345, CWE-353, CWE-494, CWE-829, CWE-918, CWE-502
Sensitive Information Disclosure (T01-SID)
Recommendations: DLP at ingestion; least-privilege credentials; encrypt sensitive datasets.
Data/Model Poisoning (T01-DMP)
Recommendations: Signature/hash checks; reputation scoring; golden-set cross-validation.
Vulnerable External Workflow (T01-VEW)
Recommendations: Proxy + allowlists; forbid unsafe formats; isolate connectors.
Misinformation (T01-MIS)
Recommendations: Reliability scoring; ground-truth cross-referencing; drift monitoring.
(19) External Sources
Summary: Outside the trust boundary; major vectors for poisoning, leakage, and misinformation.
Threats: T01-MIMI, T01-SID, T01-DMP, T01-MIS
Targeted CWEs:
CWE-200, CWE-359, CWE-522, CWE-345, CWE-353, CWE-494, CWE-918, CWE-829
Model Inversion / Membership Inference (T01-MIMI)
Recommendations: Privacy-preserving APIs; throttle/detect anomalies; k-anonymity/data minimization.
Sensitive Information Disclosure (T01-SID)
Recommendations: Secret managers; token rotation; TLS + mutual auth.
Data/Model Poisoning (T01-DMP)
Recommendations: Data signing/ checksums; cross-validate with references; vendor trust contracts.
Misinformation (T01-MIS)
Recommendations: Source reliability scores; ground-truth validation; human review for high-impact feeds.