
PartCrafter is the first unified 3D generative model that jointly synthesizes multiple semantically meaningful, geometrically distinct 3D parts from a single RGB image, without requiring any segmentation input. Powered by compositional latent diffusion and hierarchical attention, it enables end-to-end part-aware 3D mesh generation for both objects and scenes.
Key Highlights:
- One-Shot Part-Aware 3D Generation – Simultaneous generation of multiple structured parts directly from a single image
- No Segmentation Needed – Avoids two-stage pipelines; no 2D/3D segmentation models required
- Local-Global Attention Transformer – Captures fine part-level detail and global coherence during mesh generation
- Compositional Latent Space – Each part is represented by its own disentangled set of tokens
- Part ID Embedding + Cross-Attention – Enables identity-aware generation even for occluded or invisible parts
- Scalable to Complex Scenes – Handles up to 16 parts per object and multi-object room scenes from the 3D-FRONT dataset
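To make the local-global attention idea concrete, here is a minimal NumPy sketch, not the actual PartCrafter implementation: each part's tokens first get an identity embedding (the "Part ID Embedding" above), attend locally within their own token set for part-level detail, then attend globally across all parts' tokens for coherence. All function names, shapes, and the additive local+global combination are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # scaled dot-product attention, batched over leading dims
    scores = q @ k.swapaxes(-1, -2) / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def local_global_block(tokens, part_id_emb):
    # tokens:      (P, N, D) -- P parts, N tokens per part, D channels
    # part_id_emb: (P, D)    -- one learned embedding per part identity
    x = tokens + part_id_emb[:, None, :]        # identity-aware tokens
    local = attention(x, x, x)                  # each part attends to itself
    P, N, D = x.shape
    flat = x.reshape(1, P * N, D)               # concatenate all parts' tokens
    glob = attention(flat, flat, flat).reshape(P, N, D)  # cross-part attention
    return local + glob                         # combine (a simplification)

rng = np.random.default_rng(0)
out = local_global_block(rng.normal(size=(4, 8, 16)), rng.normal(size=(4, 16)))
print(out.shape)  # each part keeps its own disentangled token set: (4, 8, 16)
```

The key property this sketch illustrates is that the compositional latent space is preserved: tokens stay grouped by part, so each part can be decoded into its own mesh, while the global pass lets occluded parts borrow context from visible ones.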
Why it matters:
PartCrafter unlocks the next generation of part-aware 3D content creation, from robotics and gaming to digital twins and AR. By eliminating the need for segmentation and enabling controllable part-level synthesis, it brings modular 3D generation to the forefront of vision and graphics research.
Explore More:
- Paper: https://arxiv.org/abs/2506.05573
- GitHub: https://github.com/wgsxm/PartCrafter
- Project Page: https://wgsxm.github.io/projects/partcrafter/
- Related: TripoSG, PartGen, MIDI