
LongSplat is a new framework that achieves high-quality novel view synthesis from casually captured long videos, without requiring camera poses. It overcomes challenges like irregular motion, pose drift, and memory limits to deliver state-of-the-art 3D reconstructions.
Key Highlights:
- Joint Optimization: Simultaneously refines camera poses and 3D Gaussians, ensuring globally consistent reconstructions (see the first sketch after this list).
- Robust Pose Estimation: Leverages learned 3D priors for accurate camera tracking even under complex trajectories.
- Octree Anchor Formation: A density-driven adaptive strategy that reduces memory usage while preserving fine scene details (see the second sketch after this list).
- Superior Performance: Outperforms COLMAP, LocalRF, CF-3DGS, and other baselines on the Free, Hike, and Tanks & Temples datasets, avoiding the pose drift and out-of-memory (OOM) failures seen in prior methods.
- Efficiency at Scale: Real-time rendering speed (281 FPS) with a compact 101 MB model on an RTX 4090.
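
For intuition, here is a minimal, self-contained PyTorch sketch of the joint pose-and-Gaussian optimization idea: the camera pose and the scene parameters share one loss and are refined together. It is not LongSplat's implementation; the toy `project` function stands in for full Gaussian rasterization, only Gaussian centers are optimized, and every name and number is illustrative.

```python
# Illustrative sketch only: joint refinement of a camera pose and Gaussian
# centers with a single reprojection-style loss. A real 3DGS pipeline would
# rasterize full anisotropic Gaussians and use photometric losses instead.
import torch

torch.manual_seed(0)

def rodrigues(v):
    """Axis-angle vector -> 3x3 rotation matrix (differentiable)."""
    theta = v.norm() + 1e-8
    k = v / theta
    z = torch.zeros((), dtype=v.dtype)
    K = torch.stack([torch.stack([z, -k[2], k[1]]),
                     torch.stack([k[2], z, -k[0]]),
                     torch.stack([-k[1], k[0], z])])
    return torch.eye(3, dtype=v.dtype) + torch.sin(theta) * K \
        + (1 - torch.cos(theta)) * (K @ K)

def project(points, rot_vec, trans, f=500.0):
    """Toy stand-in for rendering: pinhole projection of Gaussian centers."""
    cam = points @ rodrigues(rot_vec).T + trans
    return f * cam[:, :2] / cam[:, 2:3]

# Synthetic "observations": centers seen from an unknown camera pose.
true_means = torch.randn(64, 3)
true_rot = torch.tensor([0.05, -0.02, 0.10])
true_trans = torch.tensor([0.2, -0.1, 4.0])
target_px = project(true_means, true_rot, true_trans)

# Learnable scene and pose, both refined jointly from noisy initializations.
means = (true_means + 0.05 * torch.randn_like(true_means)).requires_grad_()
rot_vec = torch.tensor([0.0, 0.0, 0.01], requires_grad=True)
trans = torch.tensor([0.0, 0.0, 3.5], requires_grad=True)

optimizer = torch.optim.Adam([
    {"params": [means], "lr": 1e-2},           # Gaussian parameters
    {"params": [rot_vec, trans], "lr": 1e-3},  # camera pose (smaller steps)
])

for step in range(500):
    optimizer.zero_grad()
    loss = (project(means, rot_vec, trans) - target_px).abs().mean()
    loss.backward()
    optimizer.step()

print(f"final reprojection error: {loss.item():.4f} px")
```

Giving the pose a smaller learning rate than the scene is one common way to keep the two sets of updates stable; in LongSplat itself, camera tracking is additionally anchored by learned 3D priors, as noted above.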
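And a rough illustration of the density-adaptive octree idea behind anchor formation: subdivide a cell only where the point cloud is dense, and emit one anchor per leaf, so detailed regions get fine anchors while sparse regions stay coarse and cheap. The `octree_anchors` helper, its thresholds, and the toy point cloud are assumptions for illustration, not LongSplat's actual scheme.

```python
# Illustrative sketch only: density-driven octree subdivision that yields
# coarse anchors in sparse space and fine anchors in dense regions.
import numpy as np

def octree_anchors(points, center, half, max_points=32, max_depth=6, depth=0):
    """Return anchor centers for the axis-aligned cube (center, half-extent)."""
    inside = np.all(np.abs(points - center) <= half, axis=1)
    pts = points[inside]
    if len(pts) == 0:
        return []                                  # empty cell: no anchor
    if len(pts) <= max_points or depth == max_depth:
        return [pts.mean(axis=0)]                  # one anchor for this leaf
    anchors = []                                   # dense cell: split into 8 children
    for dx in (-0.5, 0.5):
        for dy in (-0.5, 0.5):
            for dz in (-0.5, 0.5):
                child_center = center + half * np.array([dx, dy, dz])
                anchors += octree_anchors(pts, child_center, half / 2,
                                          max_points, max_depth, depth + 1)
    return anchors

points = np.random.randn(5000, 3)                  # toy point cloud
anchors = octree_anchors(points, center=np.zeros(3), half=np.full(3, 4.0))
print(f"{len(anchors)} anchors for {len(points)} points")
```

The intent of such a scheme is to let memory scale with scene detail rather than grow uniformly with capture length, which is what keeps long casual videos tractable.
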
Why It Matters:
Casually recorded videos from phones and action cameras are everywhere, but extracting reliable 3D scenes is extremely difficult. LongSplat shows that 3D Gaussian Splatting can be made robust and memory-efficient for long, unconstrained videos, paving the way for VR/AR, digital tourism, video editing, and navigation applications.
Explore More:
- Paper: https://arxiv.org/abs/2508.14041
- Project Page: https://linjohnss.github.io/longsplat/
- GitHub Repository: https://github.com/NVlabs/LongSplat
- LearnOpenCV Blogs:
  - MASt3R-SLAM: https://learnopencv.com/mast3r-slam-realtime-dense-slam-explained/
  - 3D Gaussian Splatting: https://learnopencv.com/3d-gaussian-splatting/