News

Introduction to CLIP

Hello! Let me show you an image. Can you describe what you see? Perfect! You nailed it: a bird sitting peacefully on a railing. Now, let’s flip it. I’ll describe

SimLingo: Vision-Only Closed-Loop Autonomous Driving with Language-Action Alignment

SimLingo unifies autonomous driving, vision-language understanding, and action reasoning, all from camera input only. It introduces Action Dreaming to test how well models follow instructions, and outperforms all prior methods on

Research Papers

OpenCV 4.12.0 Is Now Available

OpenCV’s summer update for 2025 is now available in all your favorite flavors on the Releases page. It includes a big list of changes to Core, Imgproc, Calib3d, DNN, Objdetect,

News, OpenCV

Introducing HAL riscv-rvv: Unleashing the power of RISC-V CPUs with RVV 1.0

This article was written by Yuantao Feng of the OpenCV China Team. What is RISC-V and RVV 1.0? RISC-V (pronounced “risk-five”) is an open standard instruction set architecture (ISA) based

OpenCV

SAM4D: Segment Anything in Camera and LiDAR Streams

SAM4D introduces a 4D foundation model for promptable segmentation across camera and LiDAR streams, addressing the limitations of frame-centric and modality-isolated approaches in autonomous driving.

Research Papers

Vector Embeddings Explained

You’ve just finished listening to your favorite high-energy workout song on Spotify, and the next track that automatically plays is one you’ve never heard, but it’s a perfect fit for

VideoGameBench: Can Vision-Language Models Complete Popular Video Games?

VideoGameBench is a rigorous benchmark that evaluates VLMs’ real-time decision-making, perception, memory, and planning by challenging them to complete 1990s-era video games with only raw visual inputs and minimal control

Research Papers

Announcing The Winners of the First Perception Challenge for Bin-Picking (BPC)

OpenCV and sponsors at Intrinsic, BOP, and University of Hawaiʻi at Mānoa are excited to announce the prize winners of the first Perception Challenge for Bin-Picking, first revealed at CVPR

Competition

LeGO-LOAM: Lightweight and Ground-Optimized Lidar Odometry and Mapping on Variable Terrain

LeGO-LOAM introduces a cutting-edge lidar odometry and mapping framework designed to deliver real-time, accurate 6-DOF pose estimation for ground vehicles, optimized for challenging, variable terrain environments. It significantly reduces computational

Research Papers

Applications of Vision Language Models – Real World Use Cases with PaliGemma2 Mix

Imagine machines that don’t just capture pixels but truly understand them, recognizing objects, reading text, interpreting scenes, and even “speaking” about images as fluently as a human. VLMs merge computer

Deep Learning

Introduction to Vision Language Models

You can now listen to this article as audio! Imagine an expert sommelier. They don’t just identify a wine; they experience it through multiple senses. They see

AI Careers

Reliable-loc: Robust Sequential LiDAR Global Localization in Large-Scale Street Scenes Based on Verifiable Cues

Reliable-loc introduces a resilient LiDAR-based global localization system for wearable mapping devices in complex, GNSS-denied street environments with sparse features and incomplete prior maps.

Research Papers

