GuanjieChen commited on
Commit
bd8e317
·
verified ·
1 Parent(s): 09caf32

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +2 -2
README.md CHANGED
@@ -10,7 +10,7 @@ base_model:
10
  ---
11
  # Accelerating Vision Diffusion Transformers with Skip Branches
12
 
13
- This repository contains all the checkpoints of models the paper: **[Accelerating Vision Diffusion Transformers with Skip Branches](https://arxiv.org/abs/2411.17616)**. In this work, we enhance standard DiT models by introducing **Skip-DiT**, which incorporates skip branches to improve feature smoothness. We also propose **Skip-Cache**, a method that leverages skip branches to cache DiT features across timesteps during inference.The effectiveness of our approach is validated on various DiT backbones for both video and image generation, demonstrating how skip branches preserve generation quality while achieving significant speedup. Experimental results show that **Skip-Cache** provides a1.5x speedup with minimal computational cost and a 2.2x speedup with only a slight reduction in quantitative metrics. All the codes and checkpoints are publicly available at [huggingface](https://huggingface.co/GuanjieChen/Skip-DiT/tree/main) and [github](https://github.com/OpenSparseLLMs/Skip-DiT.git). More visualizations can be found [here](#visualization).
14
 
15
  ### Pipeline
16
  ![pipeline](visuals/pipeline.jpg)
@@ -18,7 +18,7 @@ Illustration of Skip-DiT and Skip-Cache for DiT visual generation caching. (a) T
18
 
19
  ### Feature Smoothness
20
  ![feature](visuals/feature.jpg)
21
- Feature smoothness analysis of DiT in the class-to-video generation task using DDPM. Normalized disturbances, controlled by strength coefficients $\alpha$ and $\beta$, are introduced to the model with and without skip connections. We compare the similarity between the original and perturbed features. The feature difference surface of Latte, with and without skip connections, is visualized at steps 10 and 250 of DDPM. The results show significantly better feature smoothness in Skip-DiT. Furthermore, we identify feature smoothness as a critical factor limiting the effectiveness of cross-timestep feature caching in DiT. This insight provides a deeper understanding of caching efficiency and its impact on performance.
22
 
23
  ### Pretrained Models
24
  | Model | Task | Training Data | Backbone | Size(G) | Skip-Cache |
 
10
  ---
11
  # Accelerating Vision Diffusion Transformers with Skip Branches
12
 
13
+ This repository contains all the checkpoints of models the paper: **[Accelerating Vision Diffusion Transformers with Skip Branches](https://arxiv.org/abs/2411.17616)**. In this work, we enhance standard DiT models by introducing **Skip-DiT**, which incorporates skip branches to improve feature smoothness. We also propose **Skip-Cache**, a method that leverages skip branches to cache DiT features across timesteps during inference. The effectiveness of our approach is validated on various DiT backbones for both video and image generation, demonstrating how skip branches preserve generation quality while achieving significant speedup. Experimental results show that **Skip-Cache** provides a 1.5x speedup with minimal computational cost and a 2.2x speedup with only a slight reduction in quantitative metrics. All the codes and checkpoints are publicly available at [huggingface](https://huggingface.co/GuanjieChen/Skip-DiT/tree/main) and [github](https://github.com/OpenSparseLLMs/Skip-DiT.git). More visualizations can be found [here](#visualization).
14
 
15
  ### Pipeline
16
  ![pipeline](visuals/pipeline.jpg)
 
18
 
19
  ### Feature Smoothness
20
  ![feature](visuals/feature.jpg)
21
+ Feature smoothness analysis of DiT in the class-to-video generation task using DDPM. Normalized disturbances, controlled by strength coefficients $\alpha$ and $\beta$, are introduced to the model with and without skip connections. We compare the similarity between the original and perturbed features. The feature difference surface of Latte, with and without skip connections, is visualized in steps 10 and 250 of DDPM. The results show significantly better feature smoothness in Skip-DiT. Furthermore, we identify feature smoothness as a critical factor limiting the effectiveness of cross-timestep feature caching in DiT. This insight provides a deeper understanding of caching efficiency and its impact on performance.
22
 
23
  ### Pretrained Models
24
  | Model | Task | Training Data | Backbone | Size(G) | Skip-Cache |