There has been considerable interest recently in building 3D maps of environments using inexpensive depth cameras like the Microsoft Kinect sensor. We exploit the fact that typical indoor scenes have an abundance of planar features by modeling environments as sets of plane polygons. To this end, we build upon the Fast Sampling Plane Filtering (FSPF) algorithm that extracts points belonging to local neighborhoods of planes from depth images, even in the presence of clutter. We introduce an algorithm that uses the FSPF-generated plane filtered point clouds to generate convex polygons from individual observed depth images. We then contribute an approach of merging these detected polygons across successive frames while accounting for a complete history of observed plane filtered points without explicitly maintaining a list of all observed points. The FSPF and polygon merging algorithms run in real time at full camera frame rates with low CPU requirements: in a real world indoor environment scene, the FSPF and polygon merging algorithms take 2.5 ms on average to process a single 640 × 480 depth image. We provide experimental results demonstrating the computational efficiency of the algorithm and the accuracy of the detected plane polygons by comparing with ground truth.
Available at: http://works.bepress.com/joydeep-biswas/6/