News
EgoX introduces a novel framework for translating third-person (exocentric) videos into realistic first-person (egocentric) videos using only a single input video. The work tackles a highly challenging problem of extreme
Omni-Attribute introduces a new paradigm for fine-grained visual concept personalization, solving a long-standing problem in image generation: how to transfer only the desired attribute (identity, hairstyle, lighting, style, etc.) without
GeoVista introduces a new frontier in multimodal reasoning by enabling agentic geolocalization, a dynamic process where a model inspects high-resolution images, zooms into regions of interest, retrieves web information in
BlockVid represents a major leap forward in long-video generation, tackling one of the hardest open problems in video generation, i.e, producing coherent, high-fidelity, minute-long clips without collapse, drift, or degradation
WorldGrow redefines 3D world generation by enabling infinite, continuous 3D scene creation through a hierarchical block-wise synthesis and inpainting pipeline. Developed by researchers from Shanghai Jiao Tong University, Huawei Inc.,
Nano3D revolutionizes 3D asset editing by enabling training-free, part-level shape modifications like removal, addition, and replacement without any manual masks. Developed by researchers from Tsinghua University, Peking University, HKUST, CASIA,
