Lmxyy committed on
Commit ef2ed40 · verified · 1 Parent(s): df765bc

Update README.md

Files changed (1)
  1. README.md +3 -4
README.md CHANGED
@@ -32,11 +32,12 @@ library_name: diffusers
32
  <div style="display: flex; justify-content: center; align-items: center; text-align: center;">
33
  <a href="https://arxiv.org/abs/2411.05007">[Paper]</a>&ensp;
34
  <a href='https://github.com/mit-han-lab/nunchaku'>[Code]</a>&ensp;
 
35
  <a href='https://hanlab.mit.edu/projects/svdquant'>[Website]</a>&ensp;
36
  <a href='https://hanlab.mit.edu/blog/svdquant'>[Blog]</a>
37
  </div>
38
 
39
- ![teaser](https://github.com/mit-han-lab/nunchaku/raw/refs/heads/main/assets/teaser.jpg)
40
  SVDQuant is a post-training quantization technique for 4-bit weights and activations that well maintains visual fidelity. On 12B FLUX.1-dev, it achieves 3.6× memory reduction compared to the BF16 model. By eliminating CPU offloading, it offers 8.7× speedup over the 16-bit model when on a 16GB laptop 4090 GPU, 3× faster than the NF4 W4A16 baseline. On PixArt-∑, it demonstrates significantly superior visual quality over other W4A4 or even W4A8 baselines. "E2E" means the end-to-end latency including the text encoder and VAE decoder.
41
 
42
  ## Method
@@ -78,9 +79,7 @@ image.save("example.png")
78
  ```
79
 
80
  ### Comfy UI
81
-
82
- ![comfyui](https://github.com/mit-han-lab/nunchaku/blob/main/assets/comfyui.jpg?raw=true)
83
- Please check [comfyui/README.md](comfyui/README.md) for the usage.
84
 
85
  ## Limitations
86
 
 
32
  <div style="display: flex; justify-content: center; align-items: center; text-align: center;">
33
  <a href="https://arxiv.org/abs/2411.05007">[Paper]</a>&ensp;
34
  <a href='https://github.com/mit-han-lab/nunchaku'>[Code]</a>&ensp;
35
+ <a href='https://svdquant.mit.edu'>[Demo]</a>&ensp;
36
  <a href='https://hanlab.mit.edu/projects/svdquant'>[Website]</a>&ensp;
37
  <a href='https://hanlab.mit.edu/blog/svdquant'>[Blog]</a>
38
  </div>
39
 
40
+ ![teaser](https://github.com/mit-han-lab/nunchaku/blob/main/app/flux.1/depth_canny/assets/demo.jpg)
41
  SVDQuant is a post-training quantization technique for 4-bit weights and activations that well maintains visual fidelity. On 12B FLUX.1-dev, it achieves 3.6× memory reduction compared to the BF16 model. By eliminating CPU offloading, it offers 8.7× speedup over the 16-bit model when on a 16GB laptop 4090 GPU, 3× faster than the NF4 W4A16 baseline. On PixArt-∑, it demonstrates significantly superior visual quality over other W4A4 or even W4A8 baselines. "E2E" means the end-to-end latency including the text encoder and VAE decoder.
42
 
43
  ## Method
 
79
  ```
80
 
81
  ### Comfy UI
82
+ roW
 
 
83
 
84
  ## Limitations
85