Neural SLAM with Point Cloud Projection for Faster RGB-D Tracking
Dense visual simultaneous localization and mapping (SLAM) systems have advanced significantly with the integration of learning-based scene representations. These representations improve mapping fidelity and robustness to noise but are computationally expensive, particularly in repeated rendering. Neural point cloud-based systems, despite their adaptability and state-of-the-art performance, suffer from time-intensive nearest-neighbor searches. We propose PoP-SLAM, which employs a novel projection-first rendering strategy that substantially accelerates tracking, enabling real-time performance. We further introduce a direct occlusion detection mechanism that leverages keyframe depth information for efficient and accurate occlusion handling without requiring volume rendering. Extensive evaluations on synthetic (Replica) and real-world (TUM-RGBD) datasets demonstrate that PoP-SLAM significantly improves tracking speed, achieving around 4 frames per second, while maintaining superior tracking accuracy on the TUM-RGBD dataset.
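To make the two mechanisms concrete, the following is a minimal NumPy sketch, not the paper's implementation: it projects neural points into the image plane with assumed pinhole intrinsics and then applies a depth-based occlusion test against a stored keyframe depth map. The function name project_and_cull and the tolerance tau are illustrative assumptions.

```python
import numpy as np

def project_and_cull(points_w, T_cw, K, keyframe_depth, tau=0.05):
    """Project world-space neural points into a camera and cull occluded ones.

    points_w:       (N, 3) neural point positions in world coordinates
    T_cw:           (4, 4) world-to-camera rigid transform
    K:              (3, 3) pinhole intrinsics
    keyframe_depth: (H, W) depth map stored with the keyframe, in meters
    tau:            occlusion tolerance in meters (illustrative value)
    """
    H, W = keyframe_depth.shape

    # Transform points into the camera frame.
    pts_h = np.hstack([points_w, np.ones((len(points_w), 1))])
    pts_c = (T_cw @ pts_h.T).T[:, :3]
    z = pts_c[:, 2]
    in_front = z > 1e-6
    z_safe = np.where(in_front, z, 1.0)  # avoid dividing by zero behind camera

    # Projection-first: splat each point to its pixel directly, instead of
    # searching the point cloud for nearest neighbors along every camera ray.
    uvw = (K @ pts_c.T).T
    u = uvw[:, 0] / z_safe
    v = uvw[:, 1] / z_safe
    in_image = in_front & (u >= 0) & (u < W) & (v >= 0) & (v < H)

    # Direct occlusion test: a point is kept only if it is not deeper than
    # the keyframe depth at its pixel by more than tau (no volume rendering).
    ui = np.clip(u.astype(int), 0, W - 1)
    vi = np.clip(v.astype(int), 0, H - 1)
    kf_z = keyframe_depth[vi, ui]
    visible = in_image & (kf_z > 0) & (z <= kf_z + tau)

    return u, v, z, visible
```

The intended efficiency argument is that projection touches each point once, whereas per-pixel nearest-neighbor queries scale with both image resolution and point count, and a single depth comparison stands in for a full volume-rendering pass when only visibility is needed.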