October 2019
tl;dr: The bbox dimensions H, W, D (diagonal) plus the average object size h, w, b (breadth, along the depth dimension) are good enough to regress the distance, with relative error ~10%, up to 300 meters.
This method seems much more promising than the one presented in object distance estimation.
This idea is quite similar to the more elaborate ICCV 2019 paper monoloco.
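To make the inputs concrete, here is a minimal sketch of the six-value feature vector described above, next to a plain pinhole-camera baseline and the relative-error metric. The function names, the use of inverse bbox sizes, and the focal length are assumptions made for illustration and are not taken from the paper. The inverse form is motivated only by pinhole geometry, where distance scales as real size over image size.

```python
import math


def bbox_features(H, W, h, w, b):
    """Build a 6-value input from bbox height/width (pixels) and
    class-average object size h, w, b (meters).

    Using inverse pixel sizes is an assumption motivated by the
    pinhole model, not something confirmed by these notes.
    """
    D = math.hypot(H, W)  # bbox diagonal in pixels
    return [1.0 / H, 1.0 / W, 1.0 / D, h, w, b]


def pinhole_distance(H_px, h_m, focal_px):
    """Baseline estimate: distance = focal length * real height / image height."""
    return focal_px * h_m / H_px


def relative_error(pred, gt):
    """Relative distance error, e.g. 0.10 for a 10% error."""
    return abs(pred - gt) / gt


# Hypothetical example: a 1.6 m tall car seen as a 10 px tall bbox
# by a camera with an assumed 1000 px focal length.
features = bbox_features(H=10, W=12, h=1.6, w=1.8, b=4.0)
estimate = pinhole_distance(H_px=10, h_m=1.6, focal_px=1000.0)
```

A learned regressor would take `features` as input and be scored with `relative_error` against the laser-scanner distance. The pinhole baseline is only a sanity check on the same quantities.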
- 2000 bboxes are used. The distance GT is measured with a laser scanner.
- We can add the backplane width for better estimation of depth.
- The method used to extract GT information from the point cloud may be noisy. How can this noise be quantified and avoided?
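
One possible way to approach the noise question, sketched under assumptions: take the laser ranges of the points that fall on an object, use their median as the GT distance, and use the median absolute deviation (MAD) as a noise score. This is not the paper's procedure. The function name and the sample values are hypothetical.

```python
import statistics


def robust_gt_distance(ranges_m):
    """Return (median range, MAD) for the laser points assigned to one object.

    The median resists stray background points. The MAD gives a
    per-object noise score that could be used to filter unreliable GT.
    """
    med = statistics.median(ranges_m)
    mad = statistics.median([abs(r - med) for r in ranges_m])
    return med, mad


# Hypothetical ranges: four points on the object, one background outlier.
gt, noise = robust_gt_distance([50.0, 50.2, 49.8, 50.1, 80.0])
```

Objects whose MAD exceeds a chosen threshold could be dropped from the training set. The threshold itself would have to be tuned on the data.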