Le0jc committed · verified
Commit 7caa8e6 · Parent(s): d1a2283

Update README.md

Files changed (1): README.md (+0, −111)
README.md CHANGED
@@ -91,32 +91,6 @@ All videos are available in this [Link](https://cloudbook-public-daily.oss-cn-ha
 - [x] Release diffusers version and optimize the GPU memory usage
 - [x] Release complete version of Tora
 
- ## 🧨 Diffusers version
-
- Please refer to [the diffusers version](diffusers-version/README.md) for details.
-
- ## 🐍 Installation
-
- Please make sure your Python version is between 3.10 and 3.12, inclusive.
-
- ```bash
- # Clone this repository.
- git clone https://github.com/alibaba/Tora.git
- cd Tora
-
- # Install PyTorch (we use PyTorch 2.4.0) and torchvision following the official instructions: https://pytorch.org/get-started/previous-versions/. For example:
- conda create -n tora python==3.10
- conda activate tora
- conda install pytorch==2.4.0 torchvision==0.19.0 pytorch-cuda=12.1 -c pytorch -c nvidia
-
- # Install requirements
- cd modules/SwissArmyTransformer
- pip install -e .
- cd ../../sat
- pip install -r requirements.txt
- cd ..
- ```
-
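Before moving on, it can be worth confirming that the environment matches the versions pinned above; a minimal check (not part of the original README):

```bash
# Print the installed PyTorch version and whether CUDA is usable.
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```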
 ## 📦 Model Weights
 
 ### Folder Structure
@@ -182,91 +156,6 @@ git clone https://www.modelscope.cn/xiaoche/Tora.git
 - T5: [text_encoder](https://huggingface.co/THUDM/CogVideoX-2b/tree/main/text_encoder), [tokenizer](https://huggingface.co/THUDM/CogVideoX-2b/tree/main/tokenizer)
 - Tora t2v model weights: [Link](https://cloudbook-public-daily.oss-cn-hangzhou.aliyuncs.com/Tora_t2v/mp_rank_00_model_states.pt). Downloading this weight requires following the [CogVideoX License](CogVideoX_LICENSE).
 
- ## 🔄 Inference
-
- ### Text to Video
- It requires around 30 GiB of GPU memory, tested on an NVIDIA A100.
-
- ```bash
- cd sat
- PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True torchrun --standalone --nproc_per_node=$N_GPU sample_video.py --base configs/tora/model/cogvideox_5b_tora.yaml configs/tora/inference_sparse.yaml --load ckpts/tora/t2v --output-dir samples --point_path trajs/coaster.txt --input-file assets/text/t2v/examples.txt
- ```
-
- You can change `--input-file` and `--point_path` to your own prompt and trajectory-point files. Please note that the trajectory is drawn on a 256x256 canvas.
-
- Replace `$N_GPU` with the number of GPUs you want to use.
-
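As a concrete illustration of the trajectory file mentioned above, here is a hypothetical one; the one-"x y"-point-per-line layout is an assumption, so check the bundled examples such as `trajs/coaster.txt` for the authoritative format:

```bash
# Hypothetical trajectory file: assumes one "x y" point per line,
# with coordinates on the 256x256 canvas. Verify against trajs/coaster.txt.
cat > trajs/my_traj.txt << 'EOF'
32 128
96 96
160 128
224 160
EOF
```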
- ### Image to Video
-
- ```bash
- cd sat
- PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True torchrun --standalone --nproc_per_node=$N_GPU sample_video.py --base configs/tora/model/cogvideox_5b_tora_i2v.yaml configs/tora/inference_sparse.yaml --load ckpts/tora/i2v --output-dir samples --point_path trajs/sawtooth.txt --input-file assets/text/i2v/examples.txt --img_dir assets/images --image2video
- ```
-
- The first-frame images should be placed in `--img_dir`. The names of these images should be specified in the corresponding text prompt in `--input-file`, separated by `@@`.
-
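For illustration, a hypothetical `--input-file` entry pairing an image name with its prompt; the exact ordering and syntax should be checked against `assets/text/i2v/examples.txt`:

```bash
# Hypothetical prompt line: image name and prompt joined by the `@@` separator.
# Confirm the expected order against assets/text/i2v/examples.txt.
echo 'sawtooth.png@@A paper boat drifts along a winding stream through a misty forest.' >> assets/text/i2v/my_prompts.txt
```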
- ### Recommendations for Text Prompts
-
- For text prompts, we highly recommend using GPT-4 to enhance the details. Simple prompts may negatively impact both visual quality and motion control effectiveness.
-
- You can refer to the following resources for guidance:
-
- - [CogVideoX Documentation](https://github.com/THUDM/CogVideo/blob/main/inference/convert_demo.py)
- - [OpenSora Scripts](https://github.com/hpcaitech/Open-Sora/blob/main/scripts/inference.py)
-
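To make the advice concrete, this sketch appends an enriched prompt to a prompt file (the wording is our own hypothetical example, not from the repo):

```bash
# A terse prompt such as "A boat sails on a lake" tends to underperform; an
# enriched version like the one below works better (hypothetical wording).
echo 'A small white sailboat glides across a glassy alpine lake at golden hour, leaving a gentle rippling wake as pine-covered slopes rise in the background.' >> assets/text/t2v/my_prompts.txt
```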
- ## 🖥️ Gradio Demo
-
- Usage:
-
- ```bash
- cd sat
- python app.py --load ckpts/tora/t2v
- ```
-
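Once the app starts, Gradio prints a local URL to the console (http://127.0.0.1:7860 by default) that you can open in a browser.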
- ## 🧠 Training
-
- ### Data Preparation
-
- Following [this guide](https://github.com/THUDM/CogVideo/blob/main/sat/README.md#preparing-the-dataset), structure the datasets as follows:
-
- ```
- .
- ├── labels
- │   ├── 1.txt
- │   ├── 2.txt
- │   ├── ...
- └── videos
-     ├── 1.mp4
-     ├── 2.mp4
-     ├── ...
- ```
-
- Training data examples are in `sat/training_examples`.
-
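Each label file holds the caption for the video with the same basename; a minimal sketch, assuming a one-prompt-per-file layout (cross-check with `sat/training_examples`):

```bash
# Hypothetical caption for videos/1.mp4; see sat/training_examples for real samples.
echo 'A red kite loops and dives against a clear blue sky above a beach.' > labels/1.txt
```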
- ### Text to Video
-
- It requires around 60 GiB of GPU memory, tested on an NVIDIA A100.
-
- Replace `$N_GPU` with the number of GPUs you want to use.
-
- - Stage 1
-
- ```bash
- PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True torchrun --standalone --nproc_per_node=$N_GPU train_video.py --base configs/tora/model/cogvideox_5b_tora.yaml configs/tora/train_dense.yaml --experiment-name "t2v-stage1"
- ```
-
- - Stage 2
-
- ```bash
- PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True torchrun --standalone --nproc_per_node=$N_GPU train_video.py --base configs/tora/model/cogvideox_5b_tora.yaml configs/tora/train_sparse.yaml --experiment-name "t2v-stage2"
- ```
-
- ## 🎯 Troubleshooting
-
- ### 1. ValueError: Non-consecutive added token...
-
- Upgrade the transformers package to 4.44.2. See [this issue](https://github.com/THUDM/CogVideo/issues/213).
-
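The fix itself is a one-line upgrade:

```bash
pip install transformers==4.44.2
```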
 ## 🤝 Acknowledgements
 
 We would like to express our gratitude to the following open-source projects that have been instrumental in the development of our project: