International Conference on 3D Vision - Monocular Depth Estimation Primed by Salient Point Detection and Normalized Hessian Loss

Monocular Depth Estimation Primed by Salient Point Detection and Normalized Hessian Loss
Authors: Lam Huynh, Matteo Pedone, Phong Ha Nguyen, Jiri Matas, Esa Rahtu and Janne Heikkila
Abstract: Deep neural networks have recently thrived on single-image depth estimation. That being said, current developments on this topic highlight an apparent compromise between accuracy and network size. This work proposes an accurate and lightweight framework for monocular depth estimation based on a self-attention mechanism stemming from salient point detection. Specifically, we utilize a sparse set of keypoints to train a FuSaNet model that consists of two major components: Fusion-Net and Saliency-Net. In addition, we introduce a normalized Hessian loss term invariant to scaling and shear along the depth direction, which is shown to substantially improve the accuracy. The proposed method achieves state-of-the-art results on NYU-Depth-v2 and KITTI while using a model that is 3.1-38.4 times smaller, in terms of the number of parameters, than baseline approaches. Experiments on SUN-RGBD further demonstrate the generalizability of the proposed method.
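The invariance claimed for the normalized Hessian loss can be illustrated with a small numerical sketch. This is not the authors' formulation, which the abstract does not spell out. It assumes a finite-difference Hessian of the depth map, a global L2 normalization, and an L1 comparison, and the function names `hessian_field`, `normalized_hessian`, and `normalized_hessian_loss` are hypothetical. The idea it shows is general: second derivatives vanish for any term that is affine in the image coordinates (a shear or offset along the depth direction), and dividing by the norm cancels a positive scale factor.

```python
import numpy as np


def hessian_field(depth):
    """Second derivatives (d_xx, d_xy, d_yy) of a 2D depth map via finite differences."""
    dy, dx = np.gradient(depth)   # derivatives along rows (y) and columns (x)
    dxy, dxx = np.gradient(dx)    # derivatives of d_x along y and x
    dyy, _ = np.gradient(dy)      # derivative of d_y along y
    return np.stack([dxx, dxy, dyy])


def normalized_hessian(depth, eps=1e-12):
    """Hessian field divided by its global L2 norm (an assumed normalization)."""
    h = hessian_field(depth)
    return h / (np.linalg.norm(h) + eps)


def normalized_hessian_loss(pred, target):
    """Mean absolute difference between normalized Hessian fields (illustrative only)."""
    diff = normalized_hessian(pred) - normalized_hessian(target)
    return float(np.mean(np.abs(diff)))


rng = np.random.default_rng(0)
depth = rng.random((16, 16)) + 1.0
ys, xs = np.mgrid[0:16, 0:16]

# Scale the depth and add a term that is affine in the pixel coordinates.
transformed = 2.5 * depth + 0.3 * xs - 0.7 * ys + 4.0
other = rng.random((16, 16)) + 1.0

loss_same = normalized_hessian_loss(transformed, depth)   # numerically zero
loss_other = normalized_hessian_loss(other, depth)        # clearly nonzero
```

Under these assumptions, `loss_same` is zero up to floating-point error even though the two depth maps differ in scale, offset, and tilt, while `loss_other` stays well above zero for an unrelated depth map.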

Paper registration

~~July 23~~ July 30, 2021

Paper submission

July 30, 2021

Supplementary

August 8, 2021

Tutorial submission

August 15, 2021

Tutorial notification

August 31, 2021

Rebuttal period

September 16-22, 2021

Paper notification

October 1, 2021

Camera ready

October 15, 2021

Demo submission

~~July 30~~ Nov 15, 2021

Demo notification

~~Oct 1~~ Nov 19, 2021

Tutorial

November 30, 2021

Main conference

December 1-3, 2021