# You Only Sample Once (YOSO)

![overview](overview.jpg)

YOSO was proposed in *You Only Sample Once: Taming One-Step Text-To-Image Synthesis by Self-Cooperative Diffusion GANs* by *Yihong Luo, Xiaolong Chen, Jing Tang*.

## Usage

### 1-step inference

1-step inference is currently supported only on SD v1.5. For better results, prepare the informative initialization described in the paper.

```python
import torch
from diffusers import DiffusionPipeline, LCMScheduler

pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16)
pipeline = pipeline.to('cuda')
pipeline.scheduler = LCMScheduler.from_config(pipeline.scheduler.config)
pipeline.load_lora_weights('Luo-Yihong/yoso_sd1.5_lora')
generator = torch.manual_seed(318)
steps = 1
bs = 1
latents = ...  # maybe some latent codes of real images or SD generation
latent_mean = latents.mean(dim=0)  # average over the batch dimension
# Assumption: T is the terminal training timestep; the original snippet leaves it undefined.
T = torch.tensor([pipeline.scheduler.config.num_train_timesteps - 1])
# SD v1.5 latents have 4 channels at 64x64 for 512x512 images.
noise = torch.randn([bs, 4, 64, 64], device=latent_mean.device, dtype=latent_mean.dtype)
input_latent = pipeline.scheduler.add_noise(latent_mean.repeat(bs, 1, 1, 1), noise, T)
imgs = pipeline(
    prompt="A photo of a dog",
    num_inference_steps=steps,
    num_images_per_prompt=1,
    generator=generator,
    guidance_scale=1.5,
    # Match the pipeline's device and half precision.
    latents=input_latent.to(pipeline.device, dtype=pipeline.unet.dtype),
)[0]
imgs
```

Simple inference without the informative initialization also works, at the cost of lower quality:

```python
import torch
from diffusers import DiffusionPipeline, LCMScheduler

pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16)
pipeline = pipeline.to('cuda')
pipeline.scheduler = LCMScheduler.from_config(pipeline.scheduler.config)
pipeline.load_lora_weights('Luo-Yihong/yoso_sd1.5_lora')
generator = torch.manual_seed(318)
steps = 1
imgs = pipeline(
    prompt="A photo of a corgi in forest, highly detailed, 8k, XT3.",
    num_inference_steps=steps,
    num_images_per_prompt=1,
    generator=generator,
    guidance_scale=1.,
)[0]
imgs[0]
```

![Corgi](corgi.jpg)

### 2-step inference

A small classifier-free guidance (CFG) scale can be used to enhance image quality.
```python
import torch
from diffusers import DiffusionPipeline, LCMScheduler

pipeline = DiffusionPipeline.from_pretrained("stablediffusionapi/realistic-vision-v51", torch_dtype=torch.float16)
pipeline = pipeline.to('cuda')
pipeline.scheduler = LCMScheduler.from_config(pipeline.scheduler.config)
pipeline.load_lora_weights('Luo-Yihong/yoso_sd1.5_lora')
generator = torch.manual_seed(318)
steps = 2
imgs = pipeline(
    prompt="A photo of a man, XT3",
    num_inference_steps=steps,
    num_images_per_prompt=1,
    generator=generator,
    guidance_scale=1.5,
)[0]
imgs
```

![man](man.jpg)

You can also try some interesting applications, for example:

```python
from diffusers.utils import make_image_grid

# Reuses the `pipeline` built in the previous snippet.
generator = torch.manual_seed(318)
steps = 2
img_list = []
for age in [2, 20, 30, 50, 60, 80]:
    imgs = pipeline(
        prompt=f"A photo of a cute girl, {age} yr old, XT3",
        num_inference_steps=steps,
        num_images_per_prompt=1,
        generator=generator,
        guidance_scale=1.1,
    )[0]
    img_list.append(imgs[0])
# Show all ages side by side in a single row.
make_image_grid(img_list, rows=1, cols=len(img_list))
```

![life](life.jpg)

You can increase the number of steps to improve sample quality.
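
The `add_noise` call in the 1-step example blends the latent mean with Gaussian noise. The sketch below is a minimal pure-Python illustration of the two blending weights, not the authors' code. It assumes the default SD v1.5 `scaled_linear` beta schedule (`beta_start=0.00085`, `beta_end=0.012`, 1000 training timesteps) and assumes the terminal timestep 999 for `T`, which this document does not specify. The helper names `sd15_alphas_cumprod` and `add_noise_weights` are hypothetical.

```python
import math


def sd15_alphas_cumprod(num_train_timesteps=1000, beta_start=0.00085, beta_end=0.012):
    """Cumulative product of (1 - beta_t) for a 'scaled_linear' schedule.

    The default values are assumed to match the SD v1.5 scheduler config.
    """
    lo, hi = math.sqrt(beta_start), math.sqrt(beta_end)
    alphas_cumprod, running = [], 1.0
    for i in range(num_train_timesteps):
        # 'scaled_linear': interpolate linearly in sqrt(beta), then square.
        beta = (lo + (hi - lo) * i / (num_train_timesteps - 1)) ** 2
        running *= 1.0 - beta
        alphas_cumprod.append(running)
    return alphas_cumprod


def add_noise_weights(t, alphas_cumprod):
    """Weights (w_signal, w_noise) in x_t = w_signal * x_0 + w_noise * noise."""
    a = alphas_cumprod[t]
    return math.sqrt(a), math.sqrt(1.0 - a)


alphas_cumprod = sd15_alphas_cumprod()
# Assumed terminal timestep T = 999.
signal_w, noise_w = add_noise_weights(999, alphas_cumprod)
print(f"signal weight: {signal_w:.4f}, noise weight: {noise_w:.4f}")
```

Under these assumed defaults the signal weight is small but nonzero, so the latent mean still contributes to `input_latent` even at the last timestep.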