xierui.0097 committed
Commit 4662946 · 1 Parent(s): e3dae60

Add application file

Files changed (1):
  1. README.md +15 -106

README.md CHANGED
@@ -1,106 +1,15 @@
- <div align="center">
- <h1>
- STAR: Spatial-Temporal Augmentation with Text-to-Video Models for Real-World Video Super-Resolution
- </h1>
- <div>
- <a href='https://github.com/CSRuiXie' target='_blank'>Rui Xie<sup>1*</sup></a>,&emsp;
- <a href='https://github.com/yhliu04' target='_blank'>Yinhong Liu<sup>1*</sup></a>,&emsp;
- <a href='https://scholar.google.com/citations?user=Uhp3JKgAAAAJ&hl=zh-CN&oi=sra' target='_blank'>Chen Zhao<sup>1</sup></a>,&emsp;
- <a href='https://scholar.google.com/citations?hl=zh-CN&user=yWq1Fd4AAAAJ' target='_blank'>Penghao Zhou<sup>2</sup></a>,&emsp;
- <a href='https://scholar.google.com/citations?hl=zh-CN&user=Ds5wwRoAAAAJ' target='_blank'>Zhenheng Yang<sup>2</sup></a><br>
- <a href='https://scholar.google.com/citations?hl=zh-CN&user=w03CHFwAAAAJ' target='_blank'>Jun Zhou<sup>3</sup></a>,&emsp;
- <a href='https://cszn.github.io/' target='_blank'>Kai Zhang<sup>1</sup></a>,&emsp;
- <a href='https://jessezhang92.github.io/' target='_blank'>Zhenyu Zhang<sup>1</sup></a>,&emsp;
- <a href='https://scholar.google.com.hk/citations?user=6CIDtZQAAAAJ&hl=zh-CN' target='_blank'>Jian Yang<sup>1</sup></a>,&emsp;
- <a href='https://tyshiwo.github.io/index.html' target='_blank'>Ying Tai<sup>1&#8224;</sup></a>
- </div>
- <div>
- <sup>1</sup>Nanjing University,&emsp;<sup>2</sup>ByteDance,&emsp;<sup>3</sup>Southwest University
- </div>
- <div>
- <h4 align="center">
- <a href="https://nju-pcalab.github.io/projects/STAR" target='_blank'>
- <img src="https://img.shields.io/badge/🌟-Project%20Page-blue">
- </a>
- <a href="https://arxiv.org/abs/2407.07667" target='_blank'>
- <img src="https://img.shields.io/badge/arXiv-2312.06640-b31b1b.svg">
- </a>
- <a href="https://youtu.be/hx0zrql-SrU" target='_blank'>
- <img src="https://img.shields.io/badge/Demo%20Video-%23FF0000.svg?logo=YouTube&logoColor=white">
- </a>
- </h4>
- </div>
- </div>
-
-
- ### 🔆 Updates
- - **2024.12.01** The pretrained STAR model (I2VGen-XL version) and inference code have been released.
-
-
- ## 🔎 Method Overview
- ![STAR](assets/overview.png)
-
-
- ## 📷 Results Display
- ![STAR](assets/teaser.png)
- ![STAR](assets/real_world.png)
- 👀 More visual results can be found on our [Project Page](https://nju-pcalab.github.io/projects/STAR) and in the [Video Demo](https://youtu.be/hx0zrql-SrU).
-
-
- ## ⚙️ Dependencies and Installation
- ```
- # Clone this repository
- git clone https://github.com/NJU-PCALab/STAR.git
- cd STAR
-
- # Create an environment
- conda create -n star python=3.10
- conda activate star
- pip install -r requirements.txt
- sudo apt-get update && sudo apt-get install -y ffmpeg libsm6 libxext6
- ```
-
-
- ## 🚀 Inference
- #### Step 1: Download the pretrained STAR model from [HuggingFace](https://huggingface.co/SherryX/STAR).
- We provide two versions: `heavy_deg.pt` for heavily degraded videos and `light_deg.pt` for lightly degraded videos (e.g., low-resolution videos downloaded from video websites).
-
- Put the weights into `pretrained_weight/`.
-
-
- #### Step 2: Prepare testing data
- Put the testing videos in `input/video/`.
-
- For the prompt, there are three options: 1. No prompt. 2. Automatically generate a prompt [using Pllava](https://github.com/hpcaitech/Open-Sora/tree/main/tools/caption#pllava-captioning). 3. Manually write the prompt. Put the txt files in `input/text/`.
-
-
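The three prompt options above reduce to "find a prompt for each video, or fall back to an empty one." A minimal Python sketch of that pairing, assuming prompts match videos by file stem; `collect_prompts` is a hypothetical helper, not the repository's actual loader:

```python
# Hypothetical sketch: pair each video in input/video/ with a prompt
# from input/text/ by matching file stems (the repo's script may differ).
from pathlib import Path

def collect_prompts(video_dir: str, text_dir: str) -> dict:
    prompts = {}
    for video in sorted(Path(video_dir).glob("*.mp4")):
        txt = Path(text_dir) / f"{video.stem}.txt"
        # Option 1 (no prompt) is the fallback when no txt file exists.
        prompts[video.name] = txt.read_text().strip() if txt.exists() else ""
    return prompts
```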
- #### Step 3: Change the paths
- Update the paths in `video_super_resolution/scripts/inference_sr.sh` to your local paths, including `video_folder_path`, `txt_file_path`, `model_path`, and `save_dir`.
-
-
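A sketch of what the edited assignments in `inference_sr.sh` might look like. The variable names come from the step above; the values and the single-prompt-file layout are illustrative assumptions only:

```shell
# Illustrative paths only -- replace with your local ones.
video_folder_path="input/video"               # directory containing the test videos
txt_file_path="input/text/prompt.txt"         # prompt file (filename is hypothetical)
model_path="pretrained_weight/heavy_deg.pt"   # or light_deg.pt for light degradation
save_dir="results"                            # output directory for restored videos
```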
- #### Step 4: Run the inference command
- ```
- bash video_super_resolution/scripts/inference_sr.sh
- ```
-
-
- ## ❤️ Acknowledgments
- This project is based on [I2VGen-XL](https://github.com/ali-vilab/VGen), [VEnhancer](https://github.com/Vchitect/VEnhancer), and [CogVideoX](https://github.com/THUDM/CogVideo). Thanks for their awesome work.
-
-
- ## πŸŽ“Citations
91
- If our project helps your research or work, please consider citing our paper:
92
-
93
- ```
94
- @misc{xie2024addsr,
95
- title={AddSR: Accelerating Diffusion-based Blind Super-Resolution with Adversarial Diffusion Distillation},
96
- author={Rui Xie and Ying Tai and Kai Zhang and Zhenyu Zhang and Jun Zhou and Jian Yang},
97
- year={2024},
98
- eprint={2404.01717},
99
- archivePrefix={arXiv},
100
- primaryClass={cs.CV}
101
- }
102
- ```
103
-
104
-
105
- ## πŸ“§ Contact
106
- If you have any inquiries, please don't hesitate to reach out via email at `[email protected]`
 
+ title: STAR
+ emoji: 🌟
+ colorFrom: red
+ colorTo: yellow
+ sdk: gradio
+ sdk_version: 4.37.2
+ app_file: app.py
+ pinned: false
+ disable_embedding: true
+ tags:
+ - Video Super-Resolution
+ - Video Restoration
+ - Video-to-Video
+ - Text-to-Video Model
+ short_description: Video super-resolution with text-to-video model