---
license: openrail++
tags:
- text-to-image
- PixArt-Σ
---

THIS IS A REDISTRIBUTION OF PIXART-Σ-XL-512-MS

### 🧨 Diffusers

> [!IMPORTANT]
> Make sure to upgrade diffusers to >= 0.28.0:
> ```bash
> pip install -U diffusers
> ```
> In addition, make sure to install `transformers`, `safetensors`, `sentencepiece`, and `accelerate`:
> ```bash
> pip install transformers accelerate safetensors sentencepiece
> ```
> For `diffusers<0.28.0`, check this [script](https://github.com/PixArt-alpha/PixArt-sigma#2-integration-in-diffusers) for help.

To just use the base model, you can run:

```python
import torch
from diffusers import PixArtSigmaPipeline

# Use the first GPU when available, otherwise fall back to the CPU.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
# Half precision, intended for GPU inference.
weight_dtype = torch.float16

pipe = PixArtSigmaPipeline.from_pretrained(
    "dattrong/pixart-sigma-512",
    torch_dtype=weight_dtype,
    use_safetensors=True,
)
pipe.to(device)

# Enable memory optimizations.
# pipe.enable_model_cpu_offload()

prompt = "A small cactus with a happy face in the Sahara desert."
image = pipe(prompt).images[0]
image.save("./cactus.png")
```

When using `torch >= 2.0`, you can improve the inference speed by 20-30% with `torch.compile`. Simply wrap the transformer with `torch.compile` before running the pipeline:

```py
pipe.transformer = torch.compile(pipe.transformer, mode="reduce-overhead", fullgraph=True)
```

If you are limited by GPU VRAM, you can enable *CPU offloading* by calling `pipe.enable_model_cpu_offload()` instead of `pipe.to(device)`:

```diff
- pipe.to(device)
+ pipe.enable_model_cpu_offload()
```
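
The device selection and CPU offloading options described above can be combined in one place. The following is a minimal sketch, not part of the original model card: `Placement`, `choose_placement`, and `load_pipeline` are hypothetical helper names, and falling back to `float32` on CPU is an assumption (half precision is generally aimed at GPU inference). Only `PixArtSigmaPipeline.from_pretrained`, `pipe.to`, and `pipe.enable_model_cpu_offload` come from the snippets above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    """Hypothetical container describing how the pipeline should be placed."""
    device: str        # "cuda" or "cpu"
    dtype_name: str    # name of a torch dtype attribute, e.g. "float16"
    cpu_offload: bool  # use enable_model_cpu_offload() instead of .to(device)


def choose_placement(cuda_available: bool, low_vram: bool = False) -> Placement:
    """Hypothetical helper: pick device, dtype, and offloading strategy."""
    if not cuda_available:
        # Assumption: float32 is the safer dtype on CPU; offloading needs a GPU.
        return Placement(device="cpu", dtype_name="float32", cpu_offload=False)
    # On GPU, keep float16 as in the model card and offload only when VRAM is tight.
    return Placement(device="cuda", dtype_name="float16", cpu_offload=low_vram)


def load_pipeline(model_id: str = "dattrong/pixart-sigma-512", low_vram: bool = False):
    """Hypothetical helper: load the pipeline according to choose_placement()."""
    # Imported lazily so choose_placement() can be used without torch installed.
    import torch
    from diffusers import PixArtSigmaPipeline

    plan = choose_placement(torch.cuda.is_available(), low_vram)
    pipe = PixArtSigmaPipeline.from_pretrained(
        model_id,
        torch_dtype=getattr(torch, plan.dtype_name),
        use_safetensors=True,
    )
    if plan.cpu_offload:
        # Requires accelerate; replaces the explicit .to(device) call.
        pipe.enable_model_cpu_offload()
    else:
        pipe.to(plan.device)
    return pipe
```

With this sketch, `pipe = load_pipeline(low_vram=True)` would stand in for the manual `from_pretrained` and `.to(...)` calls, after which `pipe(prompt).images[0]` is used as before.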