mirror of https://github.com/liudaizong/Awesome-LVLM-Attack.git synced 2026-07-25 14:30:52 +02:00

Daizong Liu 59fafca695 update 5.31.2024

2024-05-31 17:09:29 +08:00

README.md

Awesome-LVLM-Attack

A continual collection of papers related to Text-guided 3D Visual Grounding (T-3DVG).

Text-guided 3D visual grounding (T-3DVG), which aims to locate a specific object that semantically corresponds to a language query in a complex 3D scene, has drawn increasing attention in the 3D research community over the past few years. T-3DVG presents both great potential and significant challenges: it is closer to real-world settings than its 2D counterpart, and both data collection and 3D point cloud processing are complex.

We have summarized existing T-3DVG methods in our survey paper👍.

If you find that some important work is missing, it would be super helpful to let me know (dzliu@stu.pku.edu.cn). Thanks!

If you find our survey useful for your research, please consider citing:

@article{liu2024survey,
  title={A Survey on Text-guided 3D Visual Grounding: Elements, Recent Advances, and Future Directions},
  author={Liu, Daizong and Liu, Yang and Huang, Wencan and Hu, Wei},
  journal={arXiv preprint arXiv},
  year={2024}
}

Table of Contents

Fully-supervised two-stage approach
Fully-supervised one-stage approach
Weakly-supervised approach
Approaches using No Point Cloud Input
Approaches using LLMs

Fully-Supervised-Two-Stage

ScanRefer: 3D Object Localization in RGB-D Scans using Natural Language | Github
- Dave Zhenyu Chen, Angel X. Chang, Matthias Nießner
- Technical University of Munich, Simon Fraser University
- [ECCV2020] https://arxiv.org/abs/1912.08830
- A dataset, two-stage approach, proposal-then-selection
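
For readers new to the "two-stage, proposal-then-selection" label used above, here is a minimal Python sketch of the general pattern. It is not ScanRefer's implementation. The function names (`propose_objects`, `select_proposal`, `ground`), the toy scene, and the cosine-similarity scoring are all assumptions made for illustration. Real methods use a learned 3D detector for stage one and a learned cross-modal matching module for stage two.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def propose_objects(scene):
    """Stage 1 (hypothetical stand-in for a 3D detector).

    Returns a list of (box, feature) candidates. Here the toy scene
    already carries boxes and features, so this just unpacks them.
    """
    return [(obj["box"], obj["feature"]) for obj in scene]


def select_proposal(proposals, query_feature):
    """Stage 2: score every proposal against the query and keep the best one."""
    scores = [cosine(feature, query_feature) for _, feature in proposals]
    best = max(range(len(scores)), key=scores.__getitem__)
    return proposals[best][0], scores[best]


def ground(scene, query_feature):
    """Run proposal generation, then selection."""
    return select_proposal(propose_objects(scene), query_feature)


# Toy data (assumed): boxes are (cx, cy, cz, dx, dy, dz); features are made up.
toy_scene = [
    {"box": (0.0, 0.0, 0.0, 1.0, 1.0, 1.0), "feature": [1.0, 0.0, 0.0]},
    {"box": (2.0, 0.0, 0.0, 1.0, 2.0, 1.0), "feature": [0.0, 1.0, 0.0]},
]
toy_query = [0.1, 0.9, 0.0]  # stands in for an encoded language query

box, score = ground(toy_scene, toy_query)
print(box, round(score, 3))
```

The key property of this design is that stage two can only pick among the candidates stage one produced, so a missed detection cannot be recovered later. One-stage approaches, listed in the next category, avoid this dependency by predicting the target directly.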