File size: 2,494 Bytes
ddd56e8 3f523df ee34371 6487dd2 581062e 6487dd2 4dff679 6487dd2 3f523df 581062e 6262708 581062e 6262708 581062e 00bed5d 3f523df ddd56e8 581062e ddd56e8 6109f69 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 |
# You Only Sample Once (YOSO)
## Usage
### 1-step inference
1-step inference is only allowed based on SD v1.5 for now. And you should prepare the informative initialization according to the paper for better results.
```python
pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype = torch.float16)
pipeline = pipeline.to('cuda')
pipeline.scheduler = LCMScheduler.from_config(pipeline.scheduler.config)
pipeline.load_lora_weights('Yihong666/yoso_sd1.5_lora')
generator = torch.manual_seed(318)
steps = 1
bs = 1
latents = ... # maybe some latent codes of real images or SD generation
latent_mean = latent.mean(dim=0)
noise = torch.randn([1,bs,64,64])
input_latent = pipeline.scheduler.add_noise(latent_mean.repeat(bs,1,1,1),noise,T)
imgs= pipeline(prompt="A photo of a dog",
num_inference_steps=steps,
num_images_per_prompt = 1,
generator = generator,
guidance_scale=1.5,
latents = input_latent,
)[0]
imgs
```
The simple inference without informative initialization, but worse quality:
```python
pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype = torch.float16)
pipeline = pipeline.to('cuda')
pipeline.scheduler = LCMScheduler.from_config(pipeline.scheduler.config)
pipeline.load_lora_weights('Yihong666/yoso_sd1.5_lora')
generator = torch.manual_seed(318)
steps = 1
imgs = pipeline(prompt="A photo of a corgi in forest, highly detailed, 8k, XT3.",
num_inference_steps=1,
num_images_per_prompt = 1,
generator = generator,
guidance_scale=1.,
)[0]
imgs[0]
```
![Corgi](corgi.jpg)
### 2-step inference
We note that a small CFG can be used to enhance the image quality.
```python
pipeline = DiffusionPipeline.from_pretrained("stablediffusionapi/realistic-vision-v51", torch_dtype = torch.float16)
pipeline = pipeline.to('cuda')
pipeline.scheduler = LCMScheduler.from_config(pipeline.scheduler.config)
pipeline.load_lora_weights('Yihong666/yoso_sd1.5_lora')
generator = torch.manual_seed(318)
steps = 2
imgs= pipeline(prompt="A photo of a man, XT3",
num_inference_steps=steps,
num_images_per_prompt = 1,
generator = generator,
guidance_scale=1.5,
)[0]
imgs
```
![man](man.jpg) |