From 5df7bc07b01cbe6cbcefc4c8c6fd211e53669d3c Mon Sep 17 00:00:00 2001
From: Daizong Liu
Date: Fri, 31 May 2024 17:25:12 +0800
Subject: [PATCH] update 5.31.2024

---
 README.md | 34 +++++++++++++++++++++-------------
 1 file changed, 21 insertions(+), 13 deletions(-)

diff --git a/README.md b/README.md
index 3cf5891..d518e63 100644
--- a/README.md
+++ b/README.md
@@ -1,10 +1,10 @@
 # Awesome-LVLM-Attack [![Awesome](https://cdn.rawgit.com/sindresorhus/awesome/d7305f38d29fed78fa85652e3a63e154dd8e8829/media/badge.svg)](https://github.com/sindresorhus/awesome)

-A **continual** collection of papers related to Text-guided 3D Visual Grounding (T-3DVG).
+A **continual** collection of papers related to Attacks on Large-Vision-Language-Models (LVLMs).

-Text-guided 3D visual grounding (T-3DVG) aims to locate a specific object that semantically corresponds to a language query from a complicated 3D scene, has drawn increasing attention in the 3D research community over the past few years.
-T-3DVG presents great potential and challenges due to its closer proximity to the real world and the complexity of data collection and 3D point cloud source processing.
+Large vision-language models (LVLMs) have achieved significant success and demonstrated promising capabilities in various multimodal downstream tasks.
+Despite their remarkable capabilities, the increased complexity and deployment of LVLMs have also exposed them to various security threats and vulnerabilities, making the study of attacks on these models a critical area of research.

-In the T-3DVG community, we've summarized existing T-3DVG methods in our survey paper👍.
+Here, we've summarized existing LVLM Attack methods in our survey paper👍.

 > If you find some important work missed, it would be super helpful to let me know (`dzliu@stu.pku.edu.cn`). Thanks!
@@ -12,25 +12,33 @@ In the T-3DVG community, we've summarized existing T-3DVG methods in our survey
 > If you find our survey useful for your research, please consider citing:

 ```
-@article{liu2024survey,
-  title={A Survey on Text-guided 3D Visual Grounding: Elements, Recent Advances, and Future Directions},
-  author={Liu, Daizong and Liu, Yang and Huang, Wencan and Hu, Wei},
+@article{liu2024attack,
+  title={A Survey of Attacks on Large Vision-Language Models: Resources, Recent Advances, and Future Trends},
+  author={Liu, Daizong and Yang, Mingyu and Qu, Xiaoye and Hu, Wei},
   journal={arXiv preprint arXiv},
   year={2024}
 }
 ```

 **Table of Contents**
-- [Fully-supervised two-stage approach](#Fully-Supervised-Two-Stage)
-- [Fully-supervised one-stage approach](#Fully-Supervised-One-Stage)
-- [Weakly-supervised approach](#Weakly-supervised)
-- [Approaches using No Point Cloud Input](#Other-Modality)
-- [Approaches using LLMs](#LLMs-based)
+- [Adversarial Attacks](#Adversarial-Attack)
+- [Jailbreak Attacks](#Jailbreak-Attack)
+- [Prompt Injection](#Prompt-Injection)
+- [Data Poisoning](#Data-Poisoning)

 ---
-## Fully-Supervised-Two-Stage
+## Adversarial-Attack
 * **ScanRefer: 3D Object Localization in RGB-D Scans using Natural Language** | [Github](https://github.com/daveredrum/ScanRefer)
   * Dave Zhenyu Chen, Angel X. Chang, Matthias Nießner
   * Technical University of Munich, Simon Fraser University
   * [ECCV2020] https://arxiv.org/abs/1912.08830
   * A dataset, two-stage approach, proposal-then-selection
+
+
+## Jailbreak-Attack
+
+
+## Prompt-Injection
+
+
+## Data-Poisoning