MeshMVS: Multi-View Stereo Guided Mesh Reconstruction
Rakesh Shrestha, Zhiwen Fan, Siyu Zhu, Qingkun Su, Zuozhuo Dai and Ping Tan
Deep learning based 3D shape generation methods generally utilize latent features extracted from color images to encode the semantics of objects and guide the shape generation process. These color image semantics only implicitly encode 3D information, potentially limiting the accuracy of the generated shapes. In this paper we propose a multi-view mesh generation method which incorporates geometry information explicitly by using the features from intermediate depth representations of multi-view stereo and regularizing the 3D shapes against these depth images. First, our system predicts a coarse 3D volume from the color images by probabilistically merging voxel occupancy grids from the prediction of individual views. Then the depth images from multi-view stereo along with the rendered depth images of the coarse shape are used as a contrastive input whose features guide the refinement of the coarse shape through a series of graph convolution networks. Notably, we achieve superior results than state-of-the-art multi-view shape generation methods with 34% decrease in Chamfer distance to ground truth and 14% increase in F1-score on ShapeNet dataset.