Monocular 3D Object Detection with Bounding Box Denoising in 3D by Perceiver

Project Page

https://xianpeng919.github.io/monoxiver

Abstract

The main challenge of monocular 3D object detection is the accurate localization of the 3D center. Motivated by a new and strong observation that this challenge can be remedied by a 3D-space local-grid search scheme in an ideal case, we propose a stage-wise approach that combines the information flow from 2D-to-3D (3D bounding box proposal generation from a single 2D image) and 3D-to-2D (proposal verification by denoising with 3D-to-2D contexts) in a top-down manner. Specifically, we first obtain initial proposals from off-the-shelf backbone monocular 3D detectors. Then, we generate a 3D anchor space by local-grid sampling from the initial proposals. Finally, we perform 3D bounding box denoising at the 3D-to-2D proposal verification stage. To effectively learn discriminative features for denoising highly overlapped proposals, this paper uses the Perceiver I/O model to fuse the 3D-to-2D geometric information and the 2D appearance information. With the encoded latent representation of a proposal, the verification head is implemented with a self-attention module. Our method, named MonoXiver, is generic and can be easily adapted to any backbone monocular 3D detector. Experimental results on the well-established KITTI dataset and the challenging large-scale Waymo dataset show that MonoXiver consistently achieves improvement with limited computation overhead.
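The local-grid sampling step mentioned in the abstract can be illustrated with a minimal sketch. It is not the repository's implementation: the function name `local_grid_anchors`, the grid spacing `step`, the grid extent `radius`, and the choice to sample along all three axes are assumptions made for illustration only.

```python
from itertools import product


def local_grid_anchors(center, step=0.5, radius=1):
    """Return candidate 3D centers on a regular grid around `center`.

    `step` (metres between grid points) and `radius` (grid cells on each
    side of the proposal) are illustrative assumptions, not MonoXiver's
    actual settings.
    """
    x, y, z = center
    offsets = range(-radius, radius + 1)
    return [
        (x + i * step, y + j * step, z + k * step)
        for i, j, k in product(offsets, repeat=3)
    ]


# Hypothetical proposal center (x, y, z) from a backbone detector.
proposal_center = (1.0, 1.5, 20.0)
anchors = local_grid_anchors(proposal_center, step=0.5, radius=1)
```

Each anchor would then be scored by the verification stage, so a proposal whose center is slightly off can be replaced by a better-placed neighbor on the grid.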

1. Installation

Please refer to INSTALL.md.

2. Training and Testing

2.1. Prepare data

Download the KITTI dataset and organize it following the official mmdetection3D instructions. Then generate the data by running:

python custom_create_mono3d_data_tools/create_data.py kitti --root-path ./data/kitti --out-dir ./data/kitti --extra-tag kitti
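Before running the command, it can help to confirm that the dataset folders are in place. The sketch below assumes the standard KITTI layout described in the mmdetection3D documentation (`ImageSets`, `training/{calib,image_2,label_2,velodyne}`, `testing/{calib,image_2,velodyne}`); the helper `missing_kitti_dirs` is hypothetical and not part of this repository, and the list should be adjusted if this project expects a different layout.

```python
from pathlib import Path

# Assumed layout, taken from the mmdetection3D KITTI instructions.
EXPECTED_SUBDIRS = [
    "ImageSets",
    "training/calib",
    "training/image_2",
    "training/label_2",
    "training/velodyne",
    "testing/calib",
    "testing/image_2",
    "testing/velodyne",
]


def missing_kitti_dirs(root):
    """Return the expected subfolders that do not exist under `root`."""
    root = Path(root)
    return [sub for sub in EXPECTED_SUBDIRS if not (root / sub).is_dir()]


if __name__ == "__main__":
    missing = missing_kitti_dirs("./data/kitti")
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("KITTI folder layout looks complete.")
```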

Prepare the pretrained checkpoints and put them in the ./ckpts folder.

2.2. Train and test models

Training: see ddd_mmdet_train.sh

./scripts/ddd_mmdet_train.sh [relative_config_filename] [remove_old_if_exist_0_or_1] [name_tag] [gpus] [nb_gpus] [port] [resume_dir]

Testing: see ddd_mmdet_test.sh

./scripts/ddd_mmdet_test.sh [relative_config_filename] [ckpt_dir] [mode] [gpus] [nb_gpus] [port]

3. Checkpoints

3.1. Pretrained RPN head

Model	Link
MonoCon (3-Class)	Link
MonoCon (Car-only)	Link

3.2. MonoXiver Checkpoint

Method	AP40@Easy	AP40@Mod.	AP40@Hard	Link
MonoXiver	29.20	22.54	19.53	Model
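For readers unfamiliar with the AP40 columns: on KITTI, AP40 is commonly computed as the interpolated average precision over 40 equally spaced recall positions (1/40, 2/40, ..., 1). The sketch below shows only that final averaging step, assuming the precision/recall pairs are already available; the function name `ap_r40` is hypothetical, and the official evaluation additionally handles detection matching, difficulty filtering, and IoU thresholds that are not reproduced here.

```python
def ap_r40(recalls, precisions):
    """Interpolated average precision over 40 recall positions.

    `recalls` and `precisions` are parallel sequences describing points on
    a precision/recall curve. For each recall threshold r in
    {1/40, ..., 40/40}, the best precision among points with recall >= r is
    used (0 if no such point exists).
    """
    total = 0.0
    for n in range(1, 41):
        threshold = n / 40
        reachable = [p for r, p in zip(recalls, precisions) if r >= threshold]
        total += max(reachable, default=0.0)
    return total / 40


# Toy curve: precision 1.0 up to recall 0.5, then 0.5 up to recall 1.0.
toy_ap = ap_r40([0.5, 1.0], [1.0, 0.5])
```

Reported values such as those in the table are this quantity expressed as a percentage.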

Contact

Please feel free to report issues here, or send any related questions to Xianpeng Liu (xliu59 at ncsu dot edu).
