---
language:
- en
pipeline_tag: text-to-image
---
# You Only Sample Once (YOSO)
![overview](overview.jpg)

YOSO was proposed in "[You Only Sample Once: Taming One-Step Text-To-Image Synthesis by Self-Cooperative Diffusion GANs](https://www.arxiv.org/abs/2403.12931)" by *Yihong Luo, Xiaolong Chen, Xinghua Qu, Jing Tang*.

Official repository for the paper: [YOSO](https://github.com/Luo-Yihong/YOSO).

This model is fine-tuned from [PixArt-XL-2-512x512](https://huggingface.co/PixArt-alpha/PixArt-XL-2-512x512) and performs text-to-image generation with one-step inference.

## Usage
```python
import torch
from diffusers import PixArtAlphaPipeline, LCMScheduler, Transformer2DModel

# Load the YOSO fine-tuned transformer weights.
transformer = Transformer2DModel.from_pretrained(
    "Luo-Yihong/yoso_pixart512", torch_dtype=torch.float16).to('cuda')

# Build the PixArt-alpha pipeline around the YOSO transformer;
# the remaining components come from the base checkpoint.
pipe = PixArtAlphaPipeline.from_pretrained(
    "PixArt-alpha/PixArt-XL-2-512x512",
    transformer=transformer,
    torch_dtype=torch.float16,
    use_safetensors=True,
)
pipe = pipe.to('cuda')

# Use the LCM scheduler with v-prediction.
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
pipe.scheduler.config.prediction_type = "v_prediction"

# Fix the seed so the result is reproducible.
generator = torch.manual_seed(318)
imgs = pipe(
    prompt="Pirate ship trapped in a cosmic maelstrom nebula, rendered in cosmic beach whirlpool engine, volumetric lighting, spectacular, ambient lights, light pollution, cinematic atmosphere, art nouveau style, illustration art artwork by SenseiJaye, intricate detail.",
    num_inference_steps=1,  # one-step sampling
    num_images_per_prompt=1,
    generator=generator,
    guidance_scale=1.,
)[0]
imgs[0]
```
![Ship](ship.jpg)
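
To run several prompts with the same one-step settings, the call above can be wrapped in a small helper. This is only a sketch: `one_step_kwargs` and `generate_one_step` are hypothetical names that are not part of this repository or of `diffusers`, and `pipe` is assumed to be the pipeline object built in the usage example.

```python
def one_step_kwargs(prompt, generator=None, num_images=1):
    """Collect the call arguments used in the usage example (hypothetical helper)."""
    return dict(
        prompt=prompt,
        num_inference_steps=1,           # one-step sampling, as in the example
        num_images_per_prompt=num_images,
        generator=generator,             # e.g. torch.manual_seed(318), or None
        guidance_scale=1.0,              # same value as in the usage example
    )


def generate_one_step(pipe, prompts, generator=None):
    """Call `pipe` once per prompt and return all images in one flat list.

    `pipe` is assumed to behave like the pipeline above: indexing its
    result with [0] yields the list of generated images.
    """
    images = []
    for prompt in prompts:
        images.extend(pipe(**one_step_kwargs(prompt, generator))[0])
    return images
```

With the pipeline from the usage example, `generate_one_step(pipe, ["a red fox", "a snowy cabin"], generator)` would return one image per prompt, which can then be saved with each image's `save` method.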

## BibTeX
```
@misc{luo2024sample,
      title={You Only Sample Once: Taming One-Step Text-to-Image Synthesis by Self-Cooperative Diffusion GANs}, 
      author={Yihong Luo and Xiaolong Chen and Xinghua Qu and Jing Tang},
      year={2024},
      eprint={2403.12931},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
```