Photo-Realistic Scene Modeling and Visualization using Online Photo Collections
Reconstructing 3D scenes from online photo collections has attracted tremendous interest from both academia and industry. Progress over the past decade has been exceptional in terms of both scale and reconstruction quality. Yet we are still far from creating 3D models that support consumer-level graphics applications. The challenge is twofold. First, modern geometry reconstruction and illumination/reflectance estimation techniques generate low-quality 3D models that are severely contaminated by visual artifacts such as geometry holes, over-inflated boundaries, noisy surface details, and low-resolution texture. These artifacts are often extremely noticeable and thus greatly limit the applicability of these 3D reconstruction approaches. Second, the real world is dynamic, but very little research has been devoted to modeling and visualizing transient objects in photos. As a result, even best-of-breed work typically produces "ghost town" 3D models. In this thesis, I first introduce the Visual Turing Test with two of the first relightable city-scale Multi-View Stereo (MVS) models. The results show that poor geometry reconstruction and the lack of transient scene elements significantly reduce the photorealism of the rendered images. Our grand vision is that 3D reconstruction research will eventually be able to pass the Visual Turing Test. While we are still far from that goal, this dissertation proposes new approaches to photo-realistic scene modeling and visualization. This line of research addresses both aspects of the challenge, namely reducing severe artifacts and incorporating some transient objects (people) in renderings, by improving key components of modern 3D reconstruction pipelines.
To be more specific, our work pushes the limits of (1) Structure-from-Motion research, by solving ground-to-aerial geo-registration with pixel-level accuracy; (2) Multi-View Stereo, by incorporating occluding contour information, which dramatically improves geometry; (3) lighting/texture estimation, by explicitly modeling outdoor illumination and optimizing for lighting parameters and scene albedo; (4) image-based rendering, to improve the visualization of a scene with erroneous geometry; and (5) the modeling of transient objects. This dissertation describes work that can be considered an early effort toward the goal of making 3D reconstruction technologies widely applicable in real-world graphics applications.