
PartCrafter is the first unified 3D generative model that jointly synthesizes multiple semantically meaningful, geometrically distinct 3D parts from a single RGB image, without requiring any segmentation input. Powered by compositional latent diffusion and hierarchical attention, it enables end-to-end part-aware 3D mesh generation for both objects and scenes.
Key Highlights:
- One-Shot Part-Aware 3D Generation – Simultaneous generation of multiple structured parts directly from a single image
- No Segmentation Needed – Avoids two-stage pipelines; no 2D/3D segmentation models required
- Local-Global Attention Transformer – Captures fine part-level detail and global coherence during mesh generation
- Compositional Latent Space – Each part is represented by its own disentangled set of tokens
- Part ID Embedding + Cross-Attention – Enables identity-aware generation even for occluded or invisible parts
- Scalable to Complex Scenes – Handles up to 16 parts per object and multi-object room scenes from the 3D-FRONT dataset
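To make the local-global attention idea concrete, here is a minimal NumPy sketch, not the actual PartCrafter implementation: each part's tokens first get an identity embedding (the "Part ID Embedding" above), attend locally within their own token set for part-level detail, then attend globally across all parts' tokens for coherence. All function names, shapes, and the additive local+global combination are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # scaled dot-product attention, batched over leading dims
    scores = q @ k.swapaxes(-1, -2) / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def local_global_block(tokens, part_id_emb):
    # tokens:      (P, N, D) -- P parts, N tokens per part, D channels
    # part_id_emb: (P, D)    -- one learned embedding per part identity
    x = tokens + part_id_emb[:, None, :]        # identity-aware tokens
    local = attention(x, x, x)                  # each part attends to itself
    P, N, D = x.shape
    flat = x.reshape(1, P * N, D)               # concatenate all parts' tokens
    glob = attention(flat, flat, flat).reshape(P, N, D)  # cross-part attention
    return local + glob                         # combine (a simplification)

rng = np.random.default_rng(0)
out = local_global_block(rng.normal(size=(4, 8, 16)), rng.normal(size=(4, 16)))
print(out.shape)  # each part keeps its own disentangled token set: (4, 8, 16)
```

The key property this sketch illustrates is that the compositional latent space is preserved: tokens stay grouped by part, so each part can be decoded into its own mesh, while the global pass lets occluded parts borrow context from visible ones.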
Why it matters:
PartCrafter unlocks the next generation of part-aware 3D content creation, from robotics and gaming to digital twins and AR. By eliminating the need for segmentation and enabling controllable part-level synthesis, it brings modular 3D generation to the forefront of vision and graphics research.
Explore More:
- Paper: https://arxiv.org/abs/2506.05573
- GitHub: https://github.com/wgsxm/PartCrafter
- Project Page: https://wgsxm.github.io/projects/partcrafter/
- Related: TripoSG, PartGen, MIDI