ImOV3D: Learning Open Vocabulary Point Clouds 3D Object Detection from Only 2D Images (NeurIPS2024)

yangtiming/ImOV3D


【NeurIPS 2024 🇨🇦】ImOV3D: Learning Open Vocabulary Point Clouds 3D Object Detection from Only 2D Images

  • We are the first to accomplish Open-Vocabulary 3D Object Detection tasks without using any 3D ground truth data.
  • If you find ImOV3D useful, please consider giving it a 🌟.

ImOV3D Project on arXiv ImOV3D Project Page

Timing Yang*, Yuanliang Ju*, Li Yi
Shanghai Qi Zhi Institute, IIIS Tsinghua University, Shanghai AI Lab

Overall Pipeline

Environment Setup

To set up the project environment, follow these steps:

Create a virtual environment:

conda env create -f environment.yml

After creating the virtual environment, activate it with:

conda activate ImOV3D

PointNet++ Backbone Installation

cd pointnet2
python setup.py install
cd ..

Dataset Preparation

Pretrain Stage

For detailed guidance on setting up the dataset for the pretraining stage, see the dataset instructions.

Adaptation

See Data Preparation for SUNRGBD or ScanNet.

You can also download the data from Baidu.

Format

--[data_name]  # Root directory of the dataset
  ├── [data_name]_2d_bbox_train       # Training data with 2D bounding boxes
  ├── [data_name]_2d_bbox_val         # Validation data with 2D bounding boxes
  ├── [data_name]_pc_bbox_votes_train # Training data with point cloud bounding box votes
  ├── [data_name]_pc_bbox_votes_val   # Validation data with point cloud bounding box votes
  ├── [data_name]_trainval_train      # Training data (2D image + Calib)
  └── [data_name]_trainval_eval       # Evaluation data (2D image + Calib)
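Before launching training, it can help to confirm that a dataset root matches the layout above. The following is a minimal sketch, not part of the ImOV3D code base: the function name `missing_subdirs` and the idea of checking by suffix are assumptions made for illustration, and only the six subdirectory suffixes come from the tree above.

```python
from pathlib import Path
from typing import List

# Suffixes copied from the directory tree above; each real folder is
# expected to be named "[data_name]_<suffix>".
EXPECTED_SUFFIXES = [
    "2d_bbox_train",
    "2d_bbox_val",
    "pc_bbox_votes_train",
    "pc_bbox_votes_val",
    "trainval_train",
    "trainval_eval",
]


def missing_subdirs(root: Path, data_name: str) -> List[str]:
    """Return the expected subdirectory names that are absent under `root`.

    `missing_subdirs` is a hypothetical helper for this sketch; it only
    checks that the folders exist, not what they contain.
    """
    expected = [f"{data_name}_{suffix}" for suffix in EXPECTED_SUFFIXES]
    return [name for name in expected if not (Path(root) / name).is_dir()]
```

Calling `missing_subdirs(Path("/path/to/data"), "sunrgbd")` returns an empty list when all six folders are present. The exact value of `[data_name]` (including its capitalization) is not specified here, so adjust the second argument to match the downloaded data.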

Pretrained Weights

| Module | Description |
| --- | --- |
| PointCloudRender | Finetuned ControlNet |

| Dataset | Description | Logs |
| --- | --- | --- |
| LVIS | Pretrain Stage | SUNRGBD, ScanNet |
| SUNRGBD | Adaptation Stage | SUNRGBD |
| ScanNet | Adaptation Stage | ScanNet |

You can download them from Baidu.

Training and Evaluation

1️⃣ Pretrain

Pretrain ImOV3D on the LVIS dataset:

bash ./scripts/train_lvis.sh

2️⃣ Adaptation

For the SUNRGBD dataset:

bash ./scripts/train_sunrgbd.sh

For the ScanNet dataset:

bash ./scripts/train_scannet.sh

3️⃣ Evaluation

To evaluate a trained model, run the evaluation script:

bash ./scripts/eval.sh
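The three stages above run in a fixed order: pretraining, adaptation on one target dataset, then evaluation. As a rough sketch of how they could be chained, the driver below builds that sequence from the script paths listed in this README. The names `plan` and `run`, the `"sunrgbd"`/`"scannet"` keys, and the `dry_run` flag are assumptions introduced for illustration. Whether `eval.sh` must be edited to select a dataset or checkpoint is not stated here, so check the script before relying on this.

```python
import subprocess
from pathlib import Path
from typing import List

# Script paths as given in this README.
PRETRAIN_SCRIPT = "./scripts/train_lvis.sh"
ADAPT_SCRIPTS = {
    "sunrgbd": "./scripts/train_sunrgbd.sh",  # key names are hypothetical labels
    "scannet": "./scripts/train_scannet.sh",
}
EVAL_SCRIPT = "./scripts/eval.sh"


def plan(dataset: str) -> List[str]:
    """Return the scripts to run, in order, for one target dataset."""
    if dataset not in ADAPT_SCRIPTS:
        raise ValueError(f"unknown dataset: {dataset!r}")
    return [PRETRAIN_SCRIPT, ADAPT_SCRIPTS[dataset], EVAL_SCRIPT]


def run(dataset: str, repo_root: Path = Path("."), dry_run: bool = True) -> None:
    """Run each stage with bash from the repository root.

    With dry_run=True (the default) the commands are only printed.
    check=True stops the sequence as soon as one stage fails.
    """
    for script in plan(dataset):
        cmd = ["bash", script]
        if dry_run:
            print(" ".join(cmd))
            continue
        subprocess.run(cmd, cwd=repo_root, check=True)
```

Stopping on the first failure is a deliberate choice in this sketch: adaptation presumably depends on the pretraining output, so continuing after a failed stage would waste compute.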

Contact

If you have any questions, please feel free to contact us:

Timing Yang: [email protected]
Yuanliang Ju: [email protected]

Acknowledgement

Our code is based on ImVoteNet, OV-3DET, Detic, ControlNet, ZoeDepth, surface_normal_uncertainty.

Citation

@article{yang2024imov3d,
  title={ImOV3D: Learning Open-Vocabulary Point Clouds 3D Object Detection from Only 2D Images},
  author={Yang, Timing and Ju, Yuanliang and Yi, Li},
  journal={NeurIPS 2024},
  year={2024}
}
