Type: Conference Paper
KAUST Department: Visual Computing Center (VCC); Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division; Computer Science Program
Date: 2021-11-02
Preprint Posting Date: 2020-11-28
Online Publication Date: 2021-11-02
Print Publication Date: 2021-06
Permanent link to this record: http://hdl.handle.net/10754/666232

Abstract:
We address the problem of estimating a high quality dense depth map from a single RGB input image. We start out with a baseline encoder-decoder convolutional neural network architecture and pose the question of how the global processing of information can help improve overall depth estimation. To this end, we propose a transformer-based architecture block that divides the depth range into bins whose center value is estimated adaptively per image. The final depth values are estimated as linear combinations of the bin centers. We call our new building block AdaBins. Our results show a decisive improvement over the state-of-the-art on several popular depth datasets across all metrics. We also validate the effectiveness of the proposed block with an ablation study and provide the code and corresponding pre-trained weights of the new state-of-the-art model.
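The abstract's core idea, dividing the depth range into adaptively sized bins and predicting each pixel's depth as a linear combination of the bin centers, can be sketched in a few lines of NumPy. The function name, the softmax normalization of the per-pixel scores, and the default depth range below are illustrative assumptions, not the authors' exact implementation:

```python
import numpy as np

def adabins_depth(bin_widths, pixel_scores, d_min=1e-3, d_max=10.0):
    """Illustrative sketch of AdaBins-style post-processing: the depth
    range [d_min, d_max] is partitioned into bins whose widths are
    predicted per image; each pixel's depth is a linear combination of
    the bin centers, weighted by that pixel's per-bin scores.
    (Details are assumptions based on the abstract, not the paper's code.)
    """
    # Normalize predicted widths so they partition the depth range.
    w = np.asarray(bin_widths, dtype=float)
    w = w / w.sum()
    # Right bin edges, then centers (edge minus half the bin width).
    edges = d_min + (d_max - d_min) * np.cumsum(w)
    centers = edges - (d_max - d_min) * w / 2.0
    # Per-pixel softmax over bin scores -> mixture weights.
    s = np.asarray(pixel_scores, dtype=float)
    p = np.exp(s - s.max(axis=-1, keepdims=True))
    p = p / p.sum(axis=-1, keepdims=True)
    # Final depth map: weighted sum of bin centers per pixel.
    return p @ centers
```

For an H×W×N score map and N predicted bin widths, this returns an H×W depth map whose values always lie inside the predicted bin centers' range, which is one practical appeal of the linear-combination formulation.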
Citation:
Farooq Bhat, S., Alhashim, I., & Wonka, P. (2021). AdaBins: Depth Estimation Using Adaptive Bins. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). doi:10.1109/cvpr46437.2021.00400

Publisher: IEEE
Conference/Event name: 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
ISBN: 978-1-6654-4510-8
arXiv: 2011.14141
Additional Links:
https://ieeexplore.ieee.org/document/9578024/
https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9578024
DOI: 10.1109/CVPR46437.2021.00400