Upload README.md
README.md
CHANGED
@@ -1,180 +1,10 @@
|
|
1 |
-
|
2 |
-
|
3 |
-
|
4 |
-
|
5 |
-
|
6 |
-
|
7 |
-
|
8 |
-
|
9 |
-
|
10 |
-
|
11 |
-
<sup>1</sup>School of Cyber Science and Technology, Shenzhen Campus of Sun Yat-sen University<br> <sup>2</sup>S-Lab, Nanyang Technological University<br> <sup>3</sup>Huawei Noah's Ark Lab<br>* Equal contribution.
|
12 |
-
</div>
|
13 |
-
|
14 |
-
:fire::fire::fire: We have released the code, cheers!
|
15 |
-
|
16 |
-
|
17 |
-
:star: If S3Diff is helpful for you, please help star this repo. Thanks! :hugs:
|
18 |
-
|
19 |
-
|
20 |
-
## :book: Table Of Contents
|
21 |
-
|
22 |
-
- [Update](#update)
|
23 |
-
- [TODO](#todo)
|
24 |
-
- [Abstract](#abstract)
|
25 |
-
- [Framework Overview](#framework_overview)
|
26 |
-
- [Visual Comparison](#visual_comparison)
|
27 |
-
- [Setup](#setup)
|
28 |
-
- [Training](#training)
|
29 |
-
- [Inference](#inference)
|
30 |
-
|
31 |
-
<!-- - [Installation](#installation)
|
32 |
-
- [Inference](#inference) -->
|
33 |
-
|
34 |
-
## <a name="update"></a>:new: Update
|
35 |
-
|
36 |
-
- **2024.10.07**: Add gradio demo 🚀
|
37 |
-
- **2024.09.25**: The code is released :fire:
|
38 |
-
- **2024.09.25**: This repo is released :fire:
|
39 |
-
<!-- - [**History Updates** >]() -->
|
40 |
-
|
41 |
-
## <a name="todo"></a>:hourglass: TODO
|
42 |
-
|
43 |
-
- [x] Release Code :computer:
|
44 |
-
- [x] Release Checkpoints :link:
|
45 |
-
|
46 |
-
## <a name="abstract"></a>:fireworks: Abstract
|
47 |
-
|
48 |
-
> Diffusion-based image super-resolution (SR) methods have achieved remarkable success by leveraging large pre-trained text-to-image diffusion models as priors. However, these methods still face two challenges: the requirement for dozens of sampling steps to achieve satisfactory results, which limits efficiency in real scenarios, and the neglect of degradation models, which are critical auxiliary information in solving the SR problem. In this work, we introduced a novel one-step SR model, which significantly addresses the efficiency issue of diffusion-based SR methods. Unlike existing fine-tuning strategies, we designed a degradation-guided Low-Rank Adaptation (LoRA) module specifically for SR, which corrects the model parameters based on the pre-estimated degradation information from low-resolution images. This module not only facilitates a powerful data-dependent or degradation-dependent SR model but also preserves the generative prior of the pre-trained diffusion model as much as possible. Furthermore, we tailor a novel training pipeline by introducing an online negative sample generation strategy. Combined with the classifier-free guidance strategy during inference, it largely improves the perceptual quality of the super-resolution results. Extensive experiments have demonstrated the superior efficiency and effectiveness of the proposed model compared to recent state-of-the-art methods.
|
49 |
-
|
50 |
-
## <a name="framework_overview"></a>:eyes: Framework Overview
|
51 |
-
|
52 |
-
<img src=assets/pic/main_framework.jpg>
|
53 |
-
|
54 |
-
:star: Overview of S3Diff. We enhance a pre-trained diffusion model for one-step SR by injecting LoRA layers into the VAE encoder and UNet. Additionally, we employ a pre-trained Degradation Estimation Network to assess the image degradation, which guides the LoRAs together with the introduced block ID embeddings. We tailor a new training pipeline that includes online negative prompting, reusing generated LR images with negative text prompts. The network is trained with a combination of a reconstruction loss and a GAN loss.
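
As a rough illustration of the idea described above, the sketch below shows one way a degradation estimate could modulate a LoRA update: the output of a frozen weight `W` is augmented with a low-rank path `x · A · C · B`, where the small `r × r` matrix `C` is derived from the degradation scores. The function names, the diagonal form of `C`, and the toy shapes are assumptions made for illustration and are not taken from the S3Diff code; block ID embeddings are omitted entirely.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python product of an (n x k) matrix and a (k x m) matrix."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def modulation_from_degradation(degradation: List[float], rank: int) -> Matrix:
    """Hypothetical mapping: place the degradation scores on the diagonal
    of an r x r matrix. A real model would use a learned network here."""
    return [
        [degradation[i % len(degradation)] if i == j else 0.0 for j in range(rank)]
        for i in range(rank)
    ]


def lora_forward(x: Matrix, w: Matrix, a: Matrix, b: Matrix, c: Matrix) -> Matrix:
    """y = x W + x A C B: frozen path plus degradation-modulated low-rank path."""
    base = matmul(x, w)
    low_rank = matmul(matmul(matmul(x, a), c), b)
    return [[p + q for p, q in zip(r1, r2)] for r1, r2 in zip(base, low_rank)]
```

With a zero degradation score the low-rank path vanishes and the layer falls back to the frozen weight; larger scores scale the correction, which is the sense in which the adaptation is degradation-dependent in this toy version.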
|
55 |
-
|
56 |
-
## <a name="visual_comparison"></a>:chart_with_upwards_trend: Visual Comparison
|
57 |
-
|
58 |
-
### Image Slide Results
|
59 |
-
[<img src="assets/pic/imgsli1.png" height="235px"/>](https://imgsli.com/MzAzNjIy) [<img src="assets/pic/imgsli2.png" height="235px"/>](https://imgsli.com/MzAzNjQ1) [<img src="assets/pic/imgsli3.png" height="235px"/>](https://imgsli.com/MzAzNjU4)
|
60 |
-
[<img src="assets/pic/imgsli4.png" height="272px"/>](https://imgsli.com/MzAzNjU5) [<img src="assets/pic/imgsli5.png" height="272px"/>](https://imgsli.com/MzAzNjI2)
|
61 |
-
### Synthetic Dataset
|
62 |
-
|
63 |
-
<img src=assets/pic/div2k_comparison.jpg>
|
64 |
-
|
65 |
-
### Real-World Dataset
|
66 |
-
|
67 |
-
<img src=assets/pic/london2.jpg>
|
68 |
-
<img src=assets/pic/realsr_vis3.jpg>
|
69 |
-
|
70 |
-
<!-- </details> -->
|
71 |
-
|
72 |
-
## <a name="setup"></a> ⚙️ Setup
|
73 |
-
```bash
|
74 |
-
conda create -n s3diff python=3.10
|
75 |
-
conda activate s3diff
|
76 |
-
pip install -r requirements.txt
|
77 |
-
```
|
78 |
-
Or use the conda env file that contains all the required dependencies.
|
79 |
-
|
80 |
-
```bash
|
81 |
-
conda env create -f environment.yaml
|
82 |
-
```
|
83 |
-
|
84 |
-
:star: Since our code relies on peft, we strongly recommend following the provided environment requirements, especially for diffusers.
|
85 |
-
|
86 |
-
## <a name="training"></a> :wrench: Training
|
87 |
-
|
88 |
-
#### Step1: Download the pretrained models
|
89 |
-
The code downloads the pretrained models automatically. For offline training, download the pretrained [SD-Turbo](https://huggingface.co/stabilityai/sd-turbo) model manually.
|
90 |
-
|
91 |
-
#### Step2: Prepare training data
|
92 |
-
We train S3Diff on [LSDIR](https://github.com/ofsoundof/LSDIR) + 10K samples from [FFHQ](https://github.com/NVlabs/ffhq-dataset), following [SeeSR](https://github.com/cswry/SeeSR) and [OSEDiff](https://github.com/cswry/OSEDiff).
|
93 |
-
|
94 |
-
#### Step3: Training for S3Diff
|
95 |
-
|
96 |
-
Please modify the paths to training datasets in `configs/sr.yaml`
|
97 |
-
Then run:
|
98 |
-
|
99 |
-
```bash
|
100 |
-
sh run_training.sh
|
101 |
-
```
|
102 |
-
|
103 |
-
If you need to conduct offline training, modify `run_training.sh` as follows, and fill in `sd_path` with your local path:
|
104 |
-
|
105 |
-
```bash
|
106 |
-
accelerate launch --num_processes=4 --gpu_ids="0,1,2,3" --main_process_port 29300 src/train_s3diff.py \
|
107 |
-
--sd_path="path_to_checkpoints/sd-turbo" \
|
108 |
-
--de_net_path="assets/mm-realsr/de_net.pth" \
|
109 |
-
--output_dir="./output" \
|
110 |
-
--resolution=512 \
|
111 |
-
--train_batch_size=4 \
|
112 |
-
--enable_xformers_memory_efficient_attention \
|
113 |
-
--viz_freq 25
|
114 |
-
```
|
115 |
-
|
116 |
-
## <a name="inference"></a> 💫 Inference
|
117 |
-
|
118 |
-
#### Step1: Download datasets for inference
|
119 |
-
|
120 |
-
#### Step2: Download the pretrained models
|
121 |
-
|
122 |
-
The code downloads the pretrained models automatically. For offline inference, download the pretrained [SD-Turbo](https://huggingface.co/stabilityai/sd-turbo) model and the S3Diff checkpoint [[HuggingFace](https://huggingface.co/zhangap/S3Diff) | [GoogleDrive](https://drive.google.com/drive/folders/1cWYQYRFpadC4K2GuH8peg_hWEoFddZtj?usp=sharing)] manually.
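
For the offline case, one possible way to fetch both Hugging Face repositories ahead of time is `snapshot_download` from the `huggingface_hub` package. This is a minimal sketch, not part of the S3Diff repository: the folder layout under `path_to_checkpoints` and the helper names `plan_downloads` and `download_all` are assumptions, and the paths later passed to `--sd_path` and `--pretrained_path` must point to wherever the files actually end up.

```python
from pathlib import Path
from typing import Dict

# Repository IDs taken from the Hugging Face links above.
REPOS = {
    "sd-turbo": "stabilityai/sd-turbo",
    "s3diff": "zhangap/S3Diff",
}


def plan_downloads(root: str) -> Dict[str, Path]:
    """Map each repo ID to a local target folder under `root` (hypothetical layout)."""
    return {repo_id: Path(root) / name for name, repo_id in REPOS.items()}


def download_all(root: str) -> None:
    """Fetch every repository. Needs `pip install huggingface_hub` and network access."""
    # Imported lazily so that planning the layout works without the package installed.
    from huggingface_hub import snapshot_download

    for repo_id, target in plan_downloads(root).items():
        snapshot_download(repo_id=repo_id, local_dir=str(target))
```

Calling `download_all("path_to_checkpoints")` once on a machine with network access would populate the folders, which can then be copied to the offline machine.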
|
123 |
-
|
124 |
-
#### Step3: Inference for S3Diff
|
125 |
-
|
126 |
-
Please add the paths to the evaluation datasets in `configs/sr_test.yaml` and the path of the ground-truth (GT) folder in `run_inference.sh`.
|
127 |
-
Then run:
|
128 |
-
|
129 |
-
```bash
|
130 |
-
sh run_inference.sh
|
131 |
-
```
|
132 |
-
|
133 |
-
If you need to conduct offline inference, modify `run_inference.sh` as follows, and fill in your local paths:
|
134 |
-
|
135 |
-
```bash
|
136 |
-
accelerate launch --num_processes=1 --gpu_ids="0," --main_process_port 29300 src/inference_s3diff.py \
|
137 |
-
--sd_path="path_to_checkpoints/sd-turbo" \
|
138 |
-
--de_net_path="assets/mm-realsr/de_net.pth" \
|
139 |
-
--pretrained_path="path_to_checkpoints/s3diff.pkl" \
|
140 |
-
--output_dir="./output" \
|
141 |
-
--ref_path="path_to_ground_truth_folder" \
|
142 |
-
--align_method="wavelet"
|
143 |
-
```
|
144 |
-
|
145 |
-
#### Gradio Demo
|
146 |
-
|
147 |
-
Please install Gradio first:
|
148 |
-
```bash
|
149 |
-
pip install gradio
|
150 |
-
```
|
151 |
-
|
152 |
-
Run the following command to launch the Gradio web demo. Have fun! 🤗
|
153 |
-
|
154 |
-
```bash
|
155 |
-
python src/gradio_s3diff.py
|
156 |
-
```
|
157 |
-

|
158 |
-
|
159 |
-
## :smiley: Citation
|
160 |
-
|
161 |
-
Please cite us if our work is useful for your research.
|
162 |
-
|
163 |
-
```
|
164 |
-
@article{2024s3diff,
|
165 |
-
author = {Aiping Zhang and Zongsheng Yue and Renjing Pei and Wenqi Ren and Xiaochun Cao},
|
166 |
-
title = {Degradation-Guided One-Step Image Super-Resolution with Diffusion Priors},
|
167 |
-
journal = {arxiv},
|
168 |
-
year = {2024},
|
169 |
-
}
|
170 |
-
```
|
171 |
-
|
172 |
-
## :notebook: License
|
173 |
-
|
174 |
-
This project is released under the [Apache 2.0 license](LICENSE).
|
175 |
-
|
176 |
-
|
177 |
-
## :envelope: Contact
|
178 |
-
|
179 |
-
If you have any questions, please feel free to contact [email protected].
|
180 |
-
|
|
|
1 |
+
---
|
2 |
+
title: S3Diff Demo
|
3 |
+
emoji: 🧪
|
4 |
+
colorFrom: blue
|
5 |
+
colorTo: green
|
6 |
+
sdk: gradio
|
7 |
+
sdk_version: "3.43.1" # Set the Gradio version to 3.43.1
|
8 |
+
app_file: src/gradio_s3diff.py # Path to the main app file
|
9 |
+
pinned: false
|
10 |
+
---
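
The added block is YAML front matter that configures a Hugging Face Space: `sdk` and `sdk_version` select the Gradio runtime, and `app_file` names the script to launch. To make the fields concrete, here is a small standard-library sketch that reads such flat `key: value` front matter into a dictionary. The parser and its name `parse_front_matter` are illustrative assumptions; Spaces reads the file itself, and a full YAML parser would be needed for anything beyond this flat case.

```python
from typing import Dict

FRONT_MATTER = '''---
title: S3Diff Demo
emoji: 🧪
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: "3.43.1"
app_file: src/gradio_s3diff.py
pinned: false
---
'''


def parse_front_matter(text: str) -> Dict[str, str]:
    """Parse flat `key: value` pairs between two `---` delimiter lines.

    Handles only this simple case (no nesting, no lists). Inline
    ` # comments` are dropped and surrounding double quotes removed.
    """
    lines = text.strip().splitlines()
    if len(lines) < 2 or lines[0].strip() != "---" or lines[-1].strip() != "---":
        raise ValueError("front matter must be delimited by '---' lines")
    config: Dict[str, str] = {}
    for line in lines[1:-1]:
        line = line.split(" #", 1)[0].strip()  # strip an inline comment, if any
        if not line:
            continue
        key, _, value = line.partition(":")
        config[key.strip()] = value.strip().strip('"')
    return config


config = parse_front_matter(FRONT_MATTER)
```

All values come back as strings here, so `pinned` is the string `false` rather than a boolean.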
|