Update README.md

Gnet Ma sysu
2025-11-02 21:15:26 +08:00
committed by GitHub
parent 9c44a56ccd
commit 6dd9a29e29
@@ -1,2 +1,22 @@
# OmniSafeBench-MM
A Unified Benchmark and Toolbox for Multimodal Jailbreak Attack-Defense Evaluation
## Attack Methods
## Defense Methods
| No. | Title | Type | Venue | Paper | Code |
|:---|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----:|:---:|:-----------|:-----------|
| 1 | JailGuard: A Universal Detection Framework for LLM Prompt-based Attacks | 1 | arXiv | [paper link](https://arxiv.org/pdf/2312.10766) | [code link](https://github.com/shiningrain/JailGuard) |
| 2 | MLLM-Protector: Ensuring MLLM's Safety without Hurting Performance | 1 | arXiv | [paper link](https://arxiv.org/pdf/2401.02906) | [code link](https://github.com/pipilurj/MLLM-protector) |
| 3 | Eyes Closed, Safety On: Protecting Multimodal LLMs via Image-to-Text Transformation | 1 | ECCV 2024 | [paper link](https://arxiv.org/pdf/2403.09572) | [code link](https://gyhdog99.github.io/projects/ecso/) |
| 4 | ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors | 1 | EMNLP 2024 | [paper link](https://arxiv.org/pdf/2402.16444) | [code link](https://github.com/thu-coai/ShieldLM) |
| 5 | AdaShield: Safeguarding Multimodal Large Language Models from Structure-based Attack via Adaptive Shield Prompting | 1 | ECCV 2024 | [paper link](https://link.springer.com/content/pdf/10.1007/978-3-031-72661-3_5.pdf) | [code link](https://github.com/SaFoLab-WISC/AdaShield) |
| 6 | UniGuard: Towards Universal Safety Guardrails for Jailbreak Attacks on Multimodal Large Language Models | 1 | ECCV 2024 | [paper link](https://arxiv.org/pdf/2411.01703) | [code link](https://anonymous.4open.science/r/UniGuard/README.md) |
| 7 | Defending LVLMs Against Vision Attacks Through Partial-Perception Supervision | 1 | ICML 2025 | [paper link](https://arxiv.org/pdf/2412.12722) | [code link](https://github.com/tools-only/DPS) |
| 8 | Cross-modality Information Check for Detecting Jailbreaking in Multimodal Large Language Models | 1 | EMNLP 2024 | [paper link](https://arxiv.org/pdf/2407.21659) | [code link](https://github.com/PandragonXIII/CIDER) |
| 9 | GuardReasoner-VL: Safeguarding VLMs via Reinforced Reasoning | 1 | ICML 2025 | [paper link](https://arxiv.org/pdf/2505.11049) | [code link](https://github.com/yueliu1999/GuardReasoner-VL/) |
| 10 | Llama Guard 4 | 1 | None | [document link](https://www.llama.com/docs/model-cards-and-prompt-formats/llama-guard-4/) | [model link](https://huggingface.co/meta-llama/Llama-Guard-4-12B) |
| 11 | QGuard: Question-based Zero-shot Guard for Multi-modal LLM Safety | 1 | ICML 2025 | [paper link](https://arxiv.org/pdf/2506.12299) | [code link](https://github.com/taegyeong-lee/QGuard-Question-based-Zero-shot-Guard-for-Multi-modal-LLM-Safety) |
| 12 | LlavaGuard: VLM-based Safeguard for Vision Dataset Curation and Safety Assessment | 1 | CVPR 2024 Workshop | [paper link](https://openaccess.thecvf.com/content/CVPR2024W/ReGenAI/papers/Helff_LLAVAGUARD_VLM-based_Safeguard_for_Vision_Dataset_Curation_and_Safety_Assessment_CVPRW_2024_paper.pdf) | [code link](https://github.com/ml-research/LlavaGuard) |
> More methods are coming soon!!