---
license: apache-2.0
pipeline_tag: image-to-3d
tags:
- text-to-3d
- image-to-3d
library_name: 3dtopia-xl
---

# 3DTopia-XL

This repo contains the pretrained weights for *3DTopia-XL: Scaling High-quality 3D Asset Generation via Primitive Diffusion*.

[Project Page](https://3dtopia.github.io/3DTopia-XL/) | [arXiv](https://arxiv.org/abs/2409.12957) | [Weights](https://huggingface.co/FrozenBurning/3DTopia-XL) | [Code](https://github.com/3DTopia/3DTopia-XL)

## Introduction

3DTopia-XL scales high-quality 3D asset generation using a Diffusion Transformer (DiT) built upon an expressive and efficient 3D representation, **PrimX**. The denoising process takes 5 seconds to generate a 3D PBR asset from text or image input, and the result is ready for use in a graphics pipeline.

## Model Details

The model is trained on a subset of roughly 256K objects from [Objaverse](https://huggingface.co/datasets/allenai/objaverse). For more details, please refer to our paper.

## Usage

To download the model:

```python
from huggingface_hub import hf_hub_download

# Single-view DiT checkpoint (fp16)
ckpt_path = hf_hub_download(repo_id="frozenburning/3DTopia-XL", filename="model_sview_dit_fp16.pt")
# VAE checkpoint (fp16)
vae_ckpt_path = hf_hub_download(repo_id="frozenburning/3DTopia-XL", filename="model_vae_fp16.pt")
```

Please refer to our [repo](https://github.com/3DTopia/3DTopia-XL) for more details on loading and inference.

## Citation

```
@article{chen2024primx,
  title={3DTopia-XL: High-Quality 3D PBR Asset Generation via Primitive Diffusion},
  author={Chen, Zhaoxi and Tang, Jiaxiang and Dong, Yuhao and Cao, Ziang and Hong, Fangzhou and Lan, Yushi and Wang, Tengfei and Xie, Haozhe and Wu, Tong and Saito, Shunsuke and Pan, Liang and Lin, Dahua and Liu, Ziwei},
  journal={arXiv preprint arXiv:2409.12957},
  year={2024}
}
```