Demo Session


Chaired by
Margarita Chli and Tony Tung

Mix3D: Out-of-Context Data Augmentation


Authors

Alexey Nekrasov, Jonas Schult, Or Litany, Bastian Leibe and Francis Engelmann

Description

During this demo, we will demonstrate our model Mix3D [7] for semantic segmentation of 3D scenes. Mix3D is trained specifically to generalize beyond the structural priors of current 3D datasets. Since most indoor 3D datasets have very similar training and test set distributions, the actual generalization capabilities of our approach do not become apparent on these datasets. Therefore, in this demo, we apply our Mix3D model, as well as competing methods, to newly recorded, challenging “in-the-wild” 3D scenes. Besides demonstrating the improved generalization capabilities of Mix3D, we also want to identify its limitations and failure cases. To this end, we challenge the participants of the demo to scan and upload their own 3D scenes, with the goal of making our model fail. The most interesting failure cases will be presented live during the demo.
Website for uploading 3D scans before the demo presentation: https://mix3d-demo.nekrasov.dev/
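
For context on the augmentation itself: Mix3D creates out-of-context training samples by merging two scenes into a single point cloud. The sketch below illustrates this idea in NumPy; the function name and the centering step are our simplified reading of the technique, not the authors' released code.

```python
import numpy as np

def mix3d(points_a, labels_a, points_b, labels_b):
    """Merge two 3D scenes into one out-of-context training sample.

    points_*: (N, 3) float arrays of point coordinates
    labels_*: (N,) integer arrays of per-point semantic labels
    """
    # Center each scene at the origin so the two scenes overlap.
    points_a = points_a - points_a.mean(axis=0)
    points_b = points_b - points_b.mean(axis=0)
    # Concatenate points and labels; the network trains on the union.
    mixed_points = np.concatenate([points_a, points_b], axis=0)
    mixed_labels = np.concatenate([labels_a, labels_b], axis=0)
    return mixed_points, mixed_labels

# Example: mix two random scenes standing in for real scans.
pts_a, lbl_a = np.random.rand(1000, 3) * 5.0, np.zeros(1000, dtype=np.int64)
pts_b, lbl_b = np.random.rand(800, 3) * 5.0, np.ones(800, dtype=np.int64)
mixed_pts, mixed_lbl = mix3d(pts_a, lbl_a, pts_b, lbl_b)
assert mixed_pts.shape == (1800, 3)
```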

TANDEM: Tracking and Dense Mapping in Real-time using Deep Multi-view Stereo


Authors

Lukas Koestler, Nan Yang, Niclas Zeller and Daniel Cremers

Description

This demo will show a live room-scale reconstruction from a handheld monocular camera using TANDEM [3]. For pose estimation, TANDEM performs photometric bundle adjustment over a sliding window of keyframes. To increase robustness, we propose a novel tracking frontend that performs dense direct image alignment using depth maps rendered from a global model, which is built incrementally from dense depth predictions. To predict the dense depth maps, we propose the Cascade View-Aggregation MVSNet (CVA-MVSNet), which utilizes the entire active keyframe window by hierarchically constructing 3D cost volumes with adaptive view aggregation to balance the different stereo baselines between the keyframes. Finally, the predicted depth maps are fused into a consistent global map represented as a truncated signed distance function (TSDF) voxel grid.
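
To make the final fusion step concrete, the following is a minimal, textbook-style sketch of integrating one predicted depth map into a TSDF voxel grid. It is a KinectFusion-style running average, not TANDEM's actual implementation; the variable names and truncation parameter are our assumptions.

```python
import numpy as np

def integrate_depth(tsdf, weights, voxel_centers, depth, K, T_cw, trunc=0.1):
    """One TSDF integration step (textbook-style, not TANDEM's code).

    tsdf, weights:  flat (V,) arrays holding the running fused TSDF
    voxel_centers:  (V, 3) voxel centers in world coordinates
    depth:          (H, W) predicted depth map in meters
    K:              (3, 3) camera intrinsics
    T_cw:           (4, 4) world-to-camera transform
    """
    H, W = depth.shape
    # Transform voxel centers into the camera frame.
    pts_h = np.hstack([voxel_centers, np.ones((len(voxel_centers), 1))])
    pts_c = (T_cw @ pts_h.T).T[:, :3]
    z = pts_c[:, 2]
    z_safe = np.where(z > 1e-6, z, 1e-6)
    # Project into the image plane.
    uv = (K @ pts_c.T).T
    u = np.round(uv[:, 0] / z_safe).astype(int)
    v = np.round(uv[:, 1] / z_safe).astype(int)
    valid = (z > 1e-6) & (u >= 0) & (u < W) & (v >= 0) & (v < H)
    # Signed distance along the viewing ray, truncated to [-trunc, trunc].
    sdf = np.full(len(z), -np.inf)
    sdf[valid] = depth[v[valid], u[valid]] - z[valid]
    update = valid & (sdf > -trunc)
    sdf_t = np.clip(sdf[update], -trunc, trunc)
    # Running weighted average of observations per voxel.
    w = weights[update]
    tsdf[update] = (tsdf[update] * w + sdf_t) / (w + 1.0)
    weights[update] = w + 1.0
    return tsdf, weights
```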

DOPE + PoseBERT demo: Real-time Hand Mesh Recovery for Animating a Robotic Gripper


Authors

Fabien Baradel, Romain Bregier, Philippe Weinzaepfel, Yannis Kalantidis and Gregory Rogez

Description

We propose to showcase a real-time demonstration of 3D hand mesh recovery from an input video stream. In our demo, given monocular (webcam) video input, the detected hand pose is used to animate a robotic hand in real time via a kinematic retargeting procedure. Figure 1 shows a screenshot from a physical installation of our demo in our building. When using the connected webcam as input, our demo runs at 30 FPS on an Nvidia RTX 2080 GPU; the demo can also be adapted to accept remote webcam video as input.
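
As a toy illustration of kinematic retargeting (our own simplification, not the authors' procedure), one could map the estimated hand keypoints to a parallel-jaw gripper command by converting the thumb-index fingertip distance into a normalized aperture:

```python
import numpy as np

def retarget_to_gripper(joints_3d, thumb_tip=4, index_tip=8,
                        open_dist=0.12, closed_dist=0.02):
    """Toy kinematic retargeting: hand keypoints -> gripper aperture.

    joints_3d: (21, 3) hand joints in meters (MediaPipe-style ordering
               assumed for the fingertip indices)
    Returns an aperture command in [0, 1] (0 = closed, 1 = fully open).
    """
    dist = np.linalg.norm(joints_3d[thumb_tip] - joints_3d[index_tip])
    # Linearly map fingertip distance to the gripper's travel range.
    aperture = (dist - closed_dist) / (open_dist - closed_dist)
    return float(np.clip(aperture, 0.0, 1.0))

# Example with a synthetic hand pose.
joints = np.zeros((21, 3))
joints[8] = [0.06, 0.0, 0.0]   # index fingertip 6 cm from the thumb tip
print(retarget_to_gripper(joints))  # 0.4, i.e. a partially open gripper
```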

RealisticHands: A Hybrid Model for 3D Hand Reconstruction


Authors

Michael Seeber, Roi Poranne, Marc Pollefeys and Martin R. Oswald

Description

Robustly estimating 3D hand meshes from RGB images is a highly desirable task, made challenging by the numerous degrees of freedom and by issues such as self-similarity and occlusions. Previous methods generally either use parametric 3D hand models or follow a model-free approach. While the former can be considered more robust, e.g. to occlusions, they are less expressive. We propose a hybrid approach, utilizing a deep neural network and differentiable-rendering-based optimization to demonstrably achieve the best of both worlds. In addition, we explore Virtual Reality (VR) as an application. Most VR headsets are nowadays equipped with multiple cameras, which we can leverage by extending our method to the egocentric stereo domain. This extension proves to be more resilient to the aforementioned issues. Finally, as a use case, we show that the improved image-model alignment can be used to acquire the user’s hand texture, which leads to a more realistic virtual hand representation. In the demo, we plan to showcase how our method reconstructs a textured 3D mesh from a camera stream.
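
To give a flavor of the optimization half of such a hybrid pipeline, the sketch below refines a low-dimensional pose vector by gradient descent on a 2D keypoint reprojection loss. It is a stand-in only: the toy linear forward model replaces both the hand model and the differentiable-rendering objective, and all names here are our assumptions.

```python
import torch

def project(points_3d, focal=500.0, cx=320.0, cy=240.0):
    """Pinhole projection of (N, 3) camera-frame points to pixels."""
    x, y, z = points_3d[:, 0], points_3d[:, 1], points_3d[:, 2]
    return torch.stack([focal * x / z + cx, focal * y / z + cy], dim=1)

def refine_pose(theta, basis, mean_shape, target_2d, steps=200, lr=1e-2):
    """Fit low-dimensional pose parameters to observed 2D keypoints.

    theta:      (P,) pose parameters to optimize (toy stand-in for a
                parametric hand model such as MANO)
    basis:      (P, N, 3) linear pose basis
    mean_shape: (N, 3) mean 3D keypoints in the camera frame
    target_2d:  (N, 2) detected 2D keypoints
    """
    theta = theta.clone().requires_grad_(True)
    opt = torch.optim.Adam([theta], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        # Toy linear model: keypoints = mean + sum_i theta_i * basis_i
        joints = mean_shape + torch.einsum('p,pnk->nk', theta, basis)
        loss = ((project(joints) - target_2d) ** 2).mean()
        loss.backward()
        opt.step()
    return theta.detach()

# Synthetic example: recover theta from keypoints the model generated.
P, N = 6, 21
basis = torch.randn(P, N, 3) * 0.01
mean_shape = torch.randn(N, 3) * 0.05 + torch.tensor([0.0, 0.0, 0.5])
true_theta = torch.randn(P)
target = project(mean_shape + torch.einsum('p,pnk->nk', true_theta, basis))
print(refine_pose(torch.zeros(P), basis, mean_shape, target))
```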

TIST: Theme Inspired Style Transfer


Authors

Kumar Abhinav, Alpana Dubey and Suma Mani Kuriakose

Description

As we move towards product customization, additive manufacturing, and extended reality environments, the demand for 3D models is ever increasing. This growing demand has given rise to multiple 3D design software tools, which help in modeling, analyzing, and translating 3D models. However, there is still a lack of tools that aid designers’ creativity. In this work, we present an approach for 3D style transfer to bridge this gap.

Sketch-to-3D (S3D) Design Assistant


Authors

Nitish Bhardwaj, Dhornala Bharadwaj, Alpana Dubey, Kumar Abhinav and Suma Mani Kuriakose

Description

The design process of a 3D model starts from raw hand-drawn sketches. These sketches are then used either to search for a 3D model in a repository or to generate a 3D model from scratch. Current methods to generate 3D models from sketches are either manual or tightly coupled with 3D modeling platforms. CAD tools like Autodesk and CATIA offer software solutions for designing 3D models from sketches. Sketching in these tools can greatly help with 3D modeling, but it requires high expertise and deep knowledge of the tool, since sketching in a 3D interface demands professional artistic skills. Moreover, most existing approaches are based on geometric manipulation and thus do not generalize. In this work, we present the Sketch-to-3D (S3D) design assistant, which helps designers and stakeholders build a better foundation for their design process. The S3D design assistant is built using advanced deep neural network-based approaches and offers two functionalities: SearchForSketch (SFS) and SingleSketch2Mesh (SS2M).
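
The SearchForSketch functionality suggests the common embed-and-retrieve pattern for sketch-based 3D model search. Below is a hedged sketch of that pattern; the joint embedding space and the cosine-similarity ranking are illustrative assumptions on our part, not the authors' architecture.

```python
import numpy as np

def retrieve(sketch_embedding, model_embeddings, top_k=5):
    """Return indices of the top-k 3D models closest to a sketch.

    sketch_embedding:  (D,) embedding of the query sketch
    model_embeddings:  (M, D) precomputed embeddings of repository models
    Both are assumed to live in a joint embedding space produced by
    separate sketch/shape encoders (an illustrative assumption).
    """
    # Cosine similarity between the sketch and every repository model.
    s = sketch_embedding / np.linalg.norm(sketch_embedding)
    m = model_embeddings / np.linalg.norm(model_embeddings, axis=1,
                                          keepdims=True)
    sims = m @ s
    return np.argsort(-sims)[:top_k]

# Example with random embeddings standing in for encoder outputs.
rng = np.random.default_rng(0)
query = rng.normal(size=128)
repo = rng.normal(size=(1000, 128))
print(retrieve(query, repo, top_k=3))
```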

Fully Autonomous Live 3D Reconstruction with an MAV: Hardware- and Software-Setup


Authors

Yves Kompis, Luca Bartolomei and Margarita Chli

Description

In this demo, we present a versatile Micro-Aerial Vehicle (MAV) setup demonstrating fully autonomous live 3D reconstruction in real-world scenarios. With the aim of reproducibility, we provide key information on both the hardware and software setup.

Visual Teach and Repeat with Deep Learned Features


Authors

Mona Gridseth, Yuchen Wu and Timothy D. Barfoot

Description

We provide a demo of Visual Teach and Repeat for autonomous path following on a mobile robot, which uses deep learned features to tackle localization across challenging appearance change.
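
As a minimal illustration of repeat-phase localization (our own simplification, not the authors' pipeline), learned descriptors from the live image can be matched against those stored along the taught path using mutual nearest neighbors; the resulting correspondences would then feed a pose solver that computes the path-following correction:

```python
import numpy as np

def mutual_nn_matches(desc_live, desc_taught):
    """Match learned feature descriptors with mutual nearest neighbors.

    desc_live:   (N, D) descriptors from the live camera image
    desc_taught: (M, D) descriptors stored along the taught path
    Returns (K, 2) index pairs of mutually consistent matches.
    """
    # Pairwise cosine similarity (descriptors assumed L2-normalized).
    sims = desc_live @ desc_taught.T
    nn_lt = sims.argmax(axis=1)            # best taught match per live
    nn_tl = sims.argmax(axis=0)            # best live match per taught
    live_idx = np.arange(len(desc_live))
    mutual = nn_tl[nn_lt[live_idx]] == live_idx
    return np.stack([live_idx[mutual], nn_lt[mutual]], axis=1)

# Example with random unit descriptors.
rng = np.random.default_rng(1)
d1 = rng.normal(size=(200, 64))
d1 /= np.linalg.norm(d1, axis=1, keepdims=True)
d2 = rng.normal(size=(180, 64))
d2 /= np.linalg.norm(d2, axis=1, keepdims=True)
print(mutual_nn_matches(d1, d2).shape)
```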