BlenderFusion is a novel framework that merges 3D graphics editing with diffusion models to enable precise, 3D-aware visual compositing. Unlike prior approaches that struggle with multi-object and camera disentanglement, BlenderFusion keeps these controls separate by carrying out object and camera edits in a 3D graphics editor (Blender) and then compositing the final image with a diffusion model.
Ever wondered how those slick background removal tools actually work? You upload a photo, click a button, and boom: the subject pops while the clutter disappears. But behind that magic sits a carefully engineered computer-vision pipeline.
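If you want to poke at the idea before reading on, here is a minimal sketch of subject extraction with the open-source rembg package; the library choice and file names are illustrative assumptions, not necessarily what the tools discussed in the post use.

```python
# Minimal background-removal sketch using the open-source `rembg` package.
# rembg wraps a pretrained salient-object segmentation model (U^2-Net by default):
# it predicts a per-pixel alpha matte for the subject and composites it onto a
# transparent background. File names here are placeholders.
from rembg import remove
from PIL import Image

with Image.open("portrait.jpg") as input_image:
    # remove() returns an RGBA image: subject kept, background made transparent.
    cutout = remove(input_image)
    cutout.save("portrait_cutout.png")
```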
The Google DeepMind team has unveiled the latest evolution in its family of open models, Gemma 3, and it's a monumental leap forward. While the AI space is crowded, Gemma 3 stands out with multimodal input, a 128K-token context window, and sizes ranging from 1B to 27B parameters.
LongSplat is a new framework that achieves high-quality novel view synthesis from casually captured long videos, without requiring camera poses. It overcomes challenges such as irregular camera motion, pose drift, and the memory demands of long sequences.
DINOv3 is a next-generation vision foundation model trained purely with self-supervised learning. It introduces innovations that enable robust dense feature learning at scale, with models reaching 7B parameters, and reports state-of-the-art results across dense prediction tasks using a frozen backbone.
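To get a feel for how such a backbone is consumed downstream, here is a hedged sketch of pulling dense patch features from a self-supervised ViT with Hugging Face transformers; the checkpoint name is an assumption for illustration and may not match the released DINOv3 weights.

```python
# Hedged sketch: extracting dense patch features from a self-supervised ViT
# backbone with Hugging Face transformers. The checkpoint identifier below is
# an assumption for illustration; substitute whichever weights you actually use.
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModel

checkpoint = "facebook/dinov3-vits16-pretrain-lvd1689m"  # assumed model ID

processor = AutoImageProcessor.from_pretrained(checkpoint)
model = AutoModel.from_pretrained(checkpoint).eval()

image = Image.open("example.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# One embedding per image patch (plus special tokens); these dense features are
# what segmentation, depth, and matching heads consume without fine-tuning.
patch_features = outputs.last_hidden_state
print(patch_features.shape)  # (1, num_tokens, hidden_dim)
```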
Genie 3 is a general-purpose world model which, given just a text prompt, generates dynamic, interactive environments in real time, rendered at 720p and 24 fps, while maintaining consistency over several minutes of interaction.
In the complex world of modern medicine, two forms of data reign supreme: the visual and the textual. On one side, a deluge of images such as X-rays, MRIs, and pathology slides.
In the fast-paced world of artificial intelligence, a new model is making waves for its innovative approach and impressive performance: MOLMO (Multimodal Open Language Model), developed by the Allen Institute for AI (Ai2).
PartCrafter is the first unified 3D generative model that jointly synthesizes multiple semantically meaningful and geometrically distinct 3D parts from a single RGB image, without any segmentation required. Powered by a diffusion transformer with a compositional latent space, it denoises all parts simultaneously in a single generation pass.
Hello! Let me show you an image. Can you describe what you see? Perfect! You nailed it: a bird sitting peacefully on a railing. Now, let's flip it: I'll describe a scene, and you generate the picture.
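The first half of that exchange (image in, description out) is easy to try yourself. Here is a small sketch using the Hugging Face image-to-text pipeline with an off-the-shelf BLIP captioner, which stands in for whatever model the post actually discusses; the file name is a placeholder.

```python
# Small image-captioning sketch using the Hugging Face `image-to-text` pipeline.
# The BLIP checkpoint and input file name are illustrative stand-ins.
from transformers import pipeline

captioner = pipeline("image-to-text", model="Salesforce/blip-image-captioning-base")

# Pass a local path or URL; the pipeline handles loading and preprocessing.
result = captioner("bird_on_railing.jpg")
print(result[0]["generated_text"])  # e.g. "a bird sitting on a railing"
```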
SimLingo unifies autonomous driving, vision-language understanding, and action reasoning, all from camera input only. It introduces Action Dreaming to test how well models follow instructions, and outperforms all prior methods on public CARLA driving benchmarks.
SAM4D introduces a 4D foundation model for promptable segmentation across camera and LiDAR streams, addressing the limitations of frame-centric and modality-isolated approaches in autonomous driving.