Structure from Data
March 6, 2008
Philippos Mordohai
University of Pennsylvania
Thursday, March 6, 11:00AM
Babbio 304
Stevens Institute of Technology
Abstract
Obtaining structure from data is a fundamental problem in computer science. In computer vision specifically, structure inference is especially challenging because a dimension is lost when the 3D world is projected onto images. In the first part of the talk, I will present a fundamental approach to perceptual organization that is very robust even though it relies on neither strong prior assumptions nor global optimization. A second challenge in structure inference is the magnitude of the datasets that must be processed in meaningful real-world applications. In the second part of the talk, I will show methods for large-scale processing of high-resolution video and range data that scale to entire cities.
I will begin with tensor voting, a computational framework for perceptual organization of generic tokens founded on the Gestalt principles of proximity and good continuation. While these principles are widely used in computer vision, our approach is unique for a number of reasons. Arguably the most important is its unified representation of all structure types, such as surfaces, curves and junctions in 3D, which facilitates interactions among tokens belonging to structures of different dimensionality. The framework is applicable to a wide range of problems in computer vision and machine learning. Among them, I will briefly describe an approach to binocular stereo matching that exploits monocular cues.
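As a rough illustration of how a single tensor can represent different structure types, the following Python sketch accumulates second-order tensors at 2D tokens and reads saliencies off their eigenvalues. It is a simplified sketch and not the implementation presented in the talk: the vote form (identity minus the outer product of the unit vector joining two tokens), the Gaussian decay, the scale parameter `sigma`, and the function names are all assumptions made for illustration.

```python
import numpy as np

def ball_vote_tensors(points, sigma=2.0):
    """Accumulate a 2x2 second-order tensor at every token from all other tokens.

    Assumed simplified vote: weight * (I - v v^T / |v|^2), where v joins the
    voter to the receiver and weight decays as exp(-|v|^2 / sigma^2).
    """
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    tensors = np.zeros((n, 2, 2))
    for i in range(n):              # receiver
        for j in range(n):          # voter
            if i == j:
                continue
            v = pts[i] - pts[j]
            d2 = float(v @ v)
            if d2 == 0.0:           # coincident tokens carry no direction
                continue
            weight = np.exp(-d2 / sigma**2)
            tensors[i] += weight * (np.eye(2) - np.outer(v, v) / d2)
    return tensors

def saliencies(tensor):
    """Split a 2D tensor into curve (stick) and junction (ball) saliency."""
    eigvals, eigvecs = np.linalg.eigh(tensor)   # eigenvalues in ascending order
    lam2, lam1 = eigvals
    normal = eigvecs[:, 1]                      # eigenvector of the largest eigenvalue
    return lam1 - lam2, lam2, normal

# Tokens sampled along a straight line: every vote agrees on the normal direction.
line = [(x, 0.0) for x in range(10)]
tensors = ball_vote_tensors(line)
curve_saliency, junction_saliency, normal = saliencies(tensors[5])
```

For tokens on a line, the accumulated tensor is dominated by one eigenvalue, so the curve saliency is high and the junction saliency is near zero. Tokens where several structures meet would instead receive votes with conflicting orientations and a larger second eigenvalue.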
In the second part of the talk, I will describe two applications of structure inference to large data collections. The first is a real-time, video-based 3D reconstruction system that generates accurate, detailed models from multiple video streams captured by a moving platform. Besides robustness and efficiency, the processing pipeline features several novel reconstruction techniques tailored to large-scale processing. I will also show results from an ongoing effort to segment and recognize objects in colored point clouds captured in urban environments by terrestrial and airborne range scanners. Within a few months, we have achieved automatic object extraction and rough classification on datasets that exceed one billion points.
Finally, I will briefly discuss potential directions for future research.