Research Papers

SimLingo: Vision-Only Closed-Loop Autonomous Driving with Language-Action Alignment

SimLingo unifies autonomous driving, vision-language understanding, and action reasoning-all from camera input only. It introduces Action Dreaming to test how well models follow instructions, and outperforms all prior methods on

Research Papers

SAM4D: Segment Anything in Camera and LiDAR Streams

SAM4D introduces a 4D foundation model for promptable segmentation across camera and LiDAR streams, addressing the limitations of frame-centric and modality-isolated approaches in autonomous driving. Key Highlights: Paper Resources

Research Papers

VideoGameBench: Can Vision-Language Models Complete Popular Video Games?

VideoGameBench is a rigorous benchmark that evaluates VLMs’ real-time decision-making, perception, memory, and planning by challenging them to complete 1990s-era video games with only raw visual inputs and minimal control

Research Papers

LeGO-LOAM: Lightweight and Ground-Optimized Lidar Odometry and Mapping on Variable Terrain

LeGO-LOAM introduces a cutting-edge lidar odometry and mapping framework designed to deliver real-time, accurate 6-DOF pose estimation for ground vehicles, optimized for challenging, variable terrain environments. It significantly reduces computational

Research Papers

Reliable-loc: Robust Sequential LiDAR Global Localization in Large-Scale Street Scenes Based on Verifiable Cues

Reliable-loc introduces a resilient LiDAR-based global localization system for wearable mapping devices in complex, GNSS-denied street environments with sparse features and incomplete prior maps. Key Highlights: Paper Resources Related articles

Research Papers

The Roller Coaster SLAM Dataset: High-Dynamic Visual-Inertial Benchmarks from Amusement Rides

This is the world’s first SLAM dataset recorded onboard real roller coasters, offering extreme motion dynamics, perceptual challenges, and unique conditions for benchmarking SLAM algorithms under aggressive real-world trajectories. Key

Research Papers

Tightly Coupled Range Inertial Odometry and Mapping with Exact Point Cloud Downsampling

This paper introduces a SLAM framework that achieves real-time CPU-only performance in dense, registration-error-minimization-based odometry and mapping by leveraging exact point cloud downsampling via coreset extraction, eliminating the need for

Research Papers

MP-SfM: Monocular Surface Priors for Robust Structure-from-Motion

MP-SfM redefines classical Structure-from-Motion by tightly integrating monocular depth and surface normal priors into incremental SfM, enabling robust 3D reconstruction from sparse, unstructured image collections. Key Highlights: Resources Paper: https://arxiv.org/abs/2504.20040Github:

Research Papers

NormalCrafter: Learning Temporally Consistent Normals from Video Diffusion Priors

NormalCrafter introduces a novel approach for surface normal estimation in videos, leveraging diffusion priors to achieve high spatial fidelity and temporal consistency over arbitrary-length sequences. Key Highlights: Project Related articles

Research Papers

OpenLiDARMap: Zero-Drift Point Cloud Mapping using Map Priors

OpenLiDARMap presents a GNSS-free mapping framework that combines sparse public map priors with LiDAR data through scan-to-map and scan-to-scan alignment. This approach achieves georeferenced and drift-free point cloud maps. Key

Research Papers

MedSAM2: Segment Anything in 3D Medical Images and Videos

MedSAM2 introduces a robust foundation model for promptable segmentation in 3D medical images and temporal video data, built by fine-tuning SAM2.1 on a large-scale curated medical dataset. Key Highlights: Resources

Research Papers

GenZ-ICP: Generalizable and Degeneracy-Robust LiDAR Odometry Using Adaptive Weighting

GenZ-ICP introduces an innovative iterative Closest Point (ICP) method that enhances LiDAR-based pose estimation by adaptively integrating point-to-plane and point-to-point error metrics, ensuring robust performance across diverse and degenerative environments.

Research Papers

Research Papers

SimLingo: Vision-Only Closed-Loop Autonomous Driving with Language-Action Alignment

SAM4D: Segment Anything in Camera and LiDAR Streams

VideoGameBench: Can Vision-Language Models Complete Popular Video Games?

LeGO-LOAM: Lightweight and Ground-Optimized Lidar Odometry and Mapping on Variable Terrain

Reliable-loc: Robust Sequential LiDAR Global Localization in Large-Scale Street Scenes Based on Verifiable Cues

The Roller Coaster SLAM Dataset: High-Dynamic Visual-Inertial Benchmarks from Amusement Rides

Tightly Coupled Range Inertial Odometry and Mapping with Exact Point Cloud Downsampling

MP-SfM: Monocular Surface Priors for Robust Structure-from-Motion

NormalCrafter: Learning Temporally Consistent Normals from Video Diffusion Priors

OpenLiDARMap: Zero-Drift Point Cloud Mapping using Map Priors

MedSAM2: Segment Anything in 3D Medical Images and Videos

GenZ-ICP: Generalizable and Degeneracy-Robust LiDAR Odometry Using Adaptive Weighting

Free Courses

Courses

Partnership

Resources

General Link

Free Courses

Courses

Partnership

Resources

General Link

Research Papers

Become a Member

Free Courses

Courses

Partnership

Resources

General Link

Free Courses

Courses

Partnership

Resources

General Link