A Dataset-Dispersion Perspective on Reconstruction Versus Recognition in Single-View 3D Reconstruction Networks
Yefan Zhou, Yiru Shen, Yujun Yan, Chen Feng and Yaoqing Yang
Neural networks (NNs) for single-view 3D reconstruction (SVR) have gained increasing popularity. Recent work points out that, for SVR, most cutting-edge NNs rely primarily on recognition rather than shape reconstruction, i.e., they tend to produce shapes by implicitly performing classification. However, it remains unclear when and why NNs prefer recognition to reconstruction and vice versa, and whether conventional reconstruction scores can quantify this phenomenon. In this paper, we show that a leading factor in determining recognition versus reconstruction is how clustered the training data is. We introduce the dispersion score, a new data-driven metric, to measure how clustered the training data is. Our main claim is that NNs are biased towards recognition when training images are less clustered and when training shapes are more clustered. We verify the main claim and validate the effectiveness of the proposed dispersion score through experiments on both synthetic and benchmark datasets. The proposed dispersion score provides a principled way to analyze reconstruction quality and supplies novel information beyond that given by conventional reconstruction scores.
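To make the notion of "how clustered the data is" concrete, the sketch below computes one possible clusteredness measure: the mean distance from each point to its assigned k-means centroid. This is an illustrative assumption only. The abstract does not define the dispersion score, and the function names (`kmeans`, `illustrative_dispersion`), the choice of k-means, the Euclidean distance, and the farthest-point initialization are all hypothetical choices made for this example, not the paper's method.

```python
import math


def _nearest(point, centers):
    """Index of the center closest to `point` (Euclidean distance)."""
    return min(range(len(centers)), key=lambda j: math.dist(point, centers[j]))


def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means with deterministic farthest-point initialization.

    Hypothetical helper for illustration; not taken from the paper.
    """
    # Start from the first point, then repeatedly add the point that is
    # farthest from all centers chosen so far.
    centers = [tuple(points[0])]
    while len(centers) < k:
        farthest = max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        centers.append(tuple(farthest))

    for _ in range(iters):
        labels = [_nearest(p, centers) for p in points]
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                # Coordinate-wise mean of the cluster members.
                new_centers.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                # Keep the old center if a cluster becomes empty.
                new_centers.append(centers[j])
        if new_centers == centers:
            break
        centers = new_centers

    # Recompute labels so they always match the returned centers.
    labels = [_nearest(p, centers) for p in points]
    return labels, centers


def illustrative_dispersion(points, k):
    """Mean distance of each point to its assigned centroid.

    Lower values mean the data is more tightly clustered (assumed convention).
    """
    labels, centers = kmeans(points, k)
    return sum(math.dist(p, centers[lab]) for p, lab in zip(points, labels)) / len(points)


if __name__ == "__main__":
    tight = [(0, 0), (0, 2), (10, 0), (10, 2)]   # two compact groups
    spread = [(0, 0), (0, 8), (10, 0), (10, 8)]  # same layout, wider groups
    print(illustrative_dispersion(tight, 2), illustrative_dispersion(spread, 2))
```

Under this assumed convention, the same function could be applied separately to feature vectors of training images and of training shapes, so that the two values can be compared across datasets.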