Channel-wise Attention-based Network for Self-Supervised Monocular Depth Estimation


Jiaxing Yan, Hong Zhao, Penghui Bu and Yusheng Jin


Self-supervised learning has shown very promising results for monocular depth estimation. Scene structure and local details are both significant clues for high-quality depth estimation. Recent works suffer from a lack of explicit modeling of scene structure and proper handling of detail information, which leads to a performance bottleneck and blurry artifacts in the predicted results. In this paper, we propose the Channel-wise Attention-based Depth Estimation Network (CADepth-Net) with two effective contributions: 1) The structure perception module employs the self-attention mechanism to capture long-range dependencies and aggregates discriminative features in the channel dimension, explicitly enhancing the perception of scene structure and yielding better scene understanding and a more robust feature representation. 2) The detail emphasis module re-calibrates channel-wise feature maps and selectively emphasizes informative features, aiming to highlight crucial local detail information and fuse features at different levels more efficiently, resulting in more precise and sharper depth predictions. Furthermore, we validate the effectiveness of our method, and extensive experiments show that our model achieves state-of-the-art results on the KITTI benchmark and the Make3D dataset.
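As a rough illustration of the channel-wise self-attention mechanism the abstract describes (capturing long-range dependencies by aggregating features across channels), the sketch below computes a C x C channel affinity matrix and re-weights the feature map with a residual connection. This is a generic NumPy sketch of the mechanism, not the authors' implementation; the function names `softmax` and `channel_attention` and all shapes are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax along the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def channel_attention(feat):
    """Channel-wise self-attention over a (C, H, W) feature map (illustrative).

    Each channel attends to every other channel through a C x C affinity
    matrix; the re-weighted features are then added back residually.
    """
    C, H, W = feat.shape
    x = feat.reshape(C, H * W)            # flatten spatial dimensions
    affinity = softmax(x @ x.T, axis=-1)  # (C, C) channel affinities
    out = affinity @ x                    # aggregate features across channels
    return (out + x).reshape(C, H, W)     # residual connection

# toy example: 4 channels on an 8x8 feature map
rng = np.random.default_rng(0)
f = rng.standard_normal((4, 8, 8))
print(channel_attention(f).shape)  # (4, 8, 8)
```

Attending over the channel dimension rather than the spatial one keeps the affinity matrix at C x C instead of (HW) x (HW), which is what makes this style of attention cheap enough for dense-prediction decoders.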


  Important Dates

All deadlines are 23:59 Pacific Time (PT). No extensions will be granted.

Paper registration July 30, 2021 (extended from July 23)
Paper submission July 30, 2021
Supplementary August 8, 2021
Tutorial submission August 15, 2021
Tutorial notification August 31, 2021
Rebuttal period September 16-22, 2021
Paper notification October 1, 2021
Camera ready October 15, 2021
Demo submission November 15, 2021 (extended from July 30)
Demo notification November 19, 2021 (extended from October 1)
Tutorial November 30, 2021
Main conference December 1-3, 2021