File size: 2,395 Bytes

994e46a
 
 
 
 
 
 
 
 
 
 
40f663f
6a63c98
 
73fa0da
 
6a63c98
73fa0da
 
6a63c98
 
 
994e46a
 
 
fcf7a9c
715e74d
 
994e46a
 
 
e81c441
994e46a
b70e3a2
fcf7a9c
994e46a
fcf7a9c
 
 
994e46a
 
 
 
 
 
 
fcf7a9c
6a63c98
994e46a
b70e3a2
994e46a
fcf7a9c
994e46a
 
 
fcf7a9c
994e46a
 
 
 
6a63c98
b70e3a2
994e46a
fcf7a9c
 
994e46a
 
fcf7a9c
994e46a
 
 
 
 
fcf7a9c
 
 
994e46a
fcf7a9c
994e46a
e81c441
994e46a
fcf7a9c
994e46a
715e74d

---
language:
- en
pipeline_tag: unconditional-image-generation
tags:
- Diffusion Models
- Stable Diffusion
- Perturbed-Attention Guidance
- PAG
---

# Perturbed-Attention Guidance for SDXL

<div style="display:flex">
  <video width=50% autoplay loop>
    <source src="https://huggingface.co/multimodalart/sdxl_perturbed_attention_guidance/resolve/main/pag_sdxl.mp4" type="video/mp4">
  </video>
  <video width=50% autoplay loop>
    <source src="https://huggingface.co/multimodalart/sdxl_perturbed_attention_guidance/resolve/main/pag_uncond.mp4" type="video/mp4">
  </video>
</div>


[Project](https://ku-cvlab.github.io/Perturbed-Attention-Guidance/) / [arXiv](https://arxiv.org/abs/2403.17377) / [GitHub](https://github.com/KU-CVLAB/Perturbed-Attention-Guidance)

This repository is based on [Diffusers](https://huggingface.co/docs/diffusers/index). The pipeline is a modification of StableDiffusionXLPipeline to add Perturbed-Attention Guidance (PAG).

The original Perturbed-Attention Guidance for unconditional models and SD1.5 by [Hyoungwon Cho](https://huggingface.co/hyoungwoncho) is availiable at [hyoungwoncho/sd_perturbed_attention_guidance](https://huggingface.co/hyoungwoncho/sd_perturbed_attention_guidance)

## Quickstart

Loading Custom Pipeline:

```py
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    custom_pipeline="multimodalart/sdxl_perturbed_attention_guidance",
    torch_dtype=torch.float16
)

device="cuda"
pipe = pipe.to(device)
```

Unconditional sampling with PAG:
![image/jpeg](uncond_generation_pag.jpg)

```py
output = pipe(
        "",
        num_inference_steps=50,
        guidance_scale=0.0,
        pag_scale=5.0,
        pag_applied_layers=['mid']
    ).images
```

Sampling with PAG and CFG:
![image/jpeg](cfgpag.jpg)
```py
output = pipe(
        "the spirit of a tamagotchi wandering in the city of Vienna",
        num_inference_steps=25,
        guidance_scale=4.0,
        pag_scale=3.0,
        pag_applied_layers=['mid']
    ).images
```

## Parameters

`guidance_scale` : gudiance scale of CFG (ex: `7.5`)

`pag_scale` : gudiance scale of PAG (ex: `4.0`)

`pag_applied_layers`: layer to apply perturbation (ex: ['mid'])

`pag_applied_layers_index` : index of the layers to apply perturbation (ex: ['m0', 'm1'])

## Stable Diffusion XL Demo

Soon