|
--- |
|
license: apache-2.0 |
|
datasets: |
|
- poloclub/diffusiondb |
|
base_model: |
|
- PixArt-alpha/PixArt-Sigma-XL-2-1024-MS |
|
pipeline_tag: text-to-image |
|
library_name: diffusers |
|
--- |
|
# AMD Nitro Diffusion |
|
|
|
|
|
![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/6355aded9c72a7e742f341a4/AsUvS7acUDLZhKOMRSH37.jpeg) |
|
|
|
## Introduction |
|
AMD Nitro Diffusion is a series of efficient text-to-image generation models that are distilled from popular diffusion models on AMD Instinct™ GPUs. The release consists of: |
|
|
|
* [Stable Diffusion 2.1 Nitro](https://huggingface.co/amd/SD2.1-Nitro): a UNet-based one-step model distilled from [Stable Diffusion 2.1](https://huggingface.co/stabilityai/stable-diffusion-2-1-base). |
|
* [PixArt-Sigma Nitro](https://huggingface.co/amd/PixArt-Sigma-Nitro): a high resolution transformer-based one-step model distilled from [PixArt-Sigma](https://pixart-alpha.github.io/PixArt-sigma-project/). |
|
|
|
⚡️ [Open-source code](https://github.com/AMD-AIG-AIMA/AMD-Diffusion-Distillation)! The models are based on our re-implementation of [Latent Adversarial Diffusion Distillation](https://arxiv.org/abs/2403.12015), the method used to build the popular Stable Diffusion 3 Turbo model. Since the original authors didn't provide training code, we release our re-implementation to help advance further research in the field. |
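
For readers who want a feel for what adversarial distillation means in practice, the toy loop below sketches the core idea under heavy simplification: a one-step student is trained to produce latents that a discriminator cannot tell apart from teacher latents, using a standard hinge GAN loss. The tiny MLPs and random "teacher latents" are placeholders purely for illustration; the actual re-implementation (diffusion-feature discriminator, text conditioning, multi-step support) lives in the GitHub repo linked above.

```python
import torch
import torch.nn as nn

# Toy stand-ins: in LADD the student is the distilled diffusion backbone doing a
# single denoising step in latent space, and the discriminator operates on
# features of a frozen diffusion model. Tiny MLPs keep the loop easy to read.
latent_dim = 16
student = nn.Sequential(nn.Linear(latent_dim, 64), nn.SiLU(), nn.Linear(64, latent_dim))
discriminator = nn.Sequential(nn.Linear(latent_dim, 64), nn.SiLU(), nn.Linear(64, 1))

opt_g = torch.optim.Adam(student.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-4)

for step in range(100):
    # In the real setup these latents come from encoding teacher-generated images;
    # random tensors keep the toy self-contained and runnable.
    teacher_latents = torch.randn(32, latent_dim)
    noise = torch.randn(32, latent_dim)

    # Discriminator update (hinge loss): teacher latents are "real", student outputs are "fake".
    fake = student(noise).detach()
    d_loss = (torch.relu(1.0 - discriminator(teacher_latents)).mean()
              + torch.relu(1.0 + discriminator(fake)).mean())
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Student update: produce latents in a single step that the discriminator scores as real.
    g_loss = -discriminator(student(noise)).mean()
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
```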
|
|
|
|
|
|
|
## Details |
|
|
|
* **Model architecture**: PixArt-Sigma Nitro has the same architecture as PixArt-Sigma and is compatible with the diffusers pipeline. |
|
* **Inference steps**: This model is distilled to perform inference in just a single step. However, the training code also supports distilling a model for 2, 4 or 8 steps. |
|
* **Hardware**: We use a single node consisting of 4 AMD Instinct™ MI250 GPUs for distilling PixArt-Sigma Nitro. |
|
* **Dataset**: We use 1M prompts from [DiffusionDB](https://huggingface.co/datasets/poloclub/diffusiondb) and generate the corresponding images from the base PixArt-Sigma model; a sketch of this data-generation step is shown after this list.
|
* **Training cost**: The distillation process achieves reasonable results in less than 2 days on a single node. |
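
To make the dataset bullet above concrete, here is a minimal sketch of how prompts could be pulled from DiffusionDB and rendered into teacher images with the base PixArt-Sigma checkpoint. The `2m_text_only` subset name, the fp16 settings, and the tiny prompt count are illustrative assumptions; the actual data-generation and training scripts are in the GitHub repo.

```python
from datasets import load_dataset
from diffusers import PixArtSigmaPipeline
import torch

# Prompt-only DiffusionDB subset (assumed config name; check the dataset card for available subsets).
prompts = load_dataset("poloclub/diffusiondb", "2m_text_only",
                       split="train", trust_remote_code=True)

# Teacher: the original PixArt-Sigma model that the Nitro model is distilled from.
teacher = PixArtSigmaPipeline.from_pretrained(
    "PixArt-alpha/PixArt-Sigma-XL-2-1024-MS", torch_dtype=torch.float16
).to("cuda")

# Render teacher images for a handful of prompts as an illustration.
for i, example in enumerate(prompts.select(range(4))):
    image = teacher(prompt=example["prompt"], num_inference_steps=20).images[0]
    image.save(f"teacher_{i:05d}.png")
```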
|
|
|
|
|
|
|
## Quickstart |
|
|
|
```python
from diffusers import PixArtSigmaPipeline
import torch
from safetensors.torch import load_file

# Load the base PixArt-Sigma pipeline, then swap in the distilled transformer weights.
pipe = PixArtSigmaPipeline.from_pretrained("PixArt-alpha/PixArt-Sigma-XL-2-1024-MS")

ckpt_path = '<path to distilled checkpoint>'
transformer_state_dict = load_file(ckpt_path)
pipe.transformer.load_state_dict(transformer_state_dict)
pipe = pipe.to("cuda")

# Single-step inference: guidance is disabled (guidance_scale=0) and a single
# custom timestep [400] is passed to the pipeline.
image = pipe(prompt='a photo of a cat',
             num_inference_steps=1,
             guidance_scale=0,
             timesteps=[400]).images[0]
```
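
The distilled checkpoint can be downloaded from the [PixArt-Sigma Nitro](https://huggingface.co/amd/PixArt-Sigma-Nitro) repository; one way to fetch it programmatically is sketched below (the `filename` argument is a placeholder, so substitute the actual `.safetensors` file listed in the repo):

```python
from huggingface_hub import hf_hub_download

# Placeholder filename: replace with the checkpoint file actually listed in the model repo.
ckpt_path = hf_hub_download(repo_id="amd/PixArt-Sigma-Nitro",
                            filename="<distilled checkpoint>.safetensors")
```

The returned `ckpt_path` plugs directly into the snippet above, and the resulting `image` is a PIL image that can be written to disk with `image.save("cat.png")`.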
|
|
|
For more details on training and evaluation, please visit the [GitHub repo](https://github.com/AMD-AIG-AIMA/AMD-Diffusion-Distillation).
|
|
|
|
|
|
|
## Results |
|
|
|
|
|
Compared to [PixArt-Sigma](https://pixart-alpha.github.io/PixArt-sigma-project/), our model achieves a 90.9% reduction in FLOPs at the cost of just 3.7% lower CLIP score and 10.5% higher FID. |
|
|
|
| Model | FID ↓ | CLIP ↑ | FLOPs | Latency on AMD Instinct MI250 (sec) |
| :---: | :---: | :---: | :---: | :---: |
| PixArt-Sigma, 20 steps | 34.14 | 0.3289 | 187.96 | 7.46 |
| **PixArt-Sigma Nitro**, 1 step | 37.75 | 0.3167 | 17.04 | 0.53 |
|
|
|
|
|
|
|
## License |
|
Copyright (c) 2018-2024 Advanced Micro Devices, Inc. All Rights Reserved. |
|
Licensed under the Apache License, Version 2.0 (the "License"); |
|
you may not use this file except in compliance with the License. |
|
You may obtain a copy of the License at |
|
http://www.apache.org/licenses/LICENSE-2.0 |
|
Unless required by applicable law or agreed to in writing, software |
|
distributed under the License is distributed on an "AS IS" BASIS, |
|
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
|
See the License for the specific language governing permissions and |
|
limitations under the License. |