BlinkDL/clip-guided-binary-autoencoder


BlinkDL committed on Sep 29, 2022

Commit bc54ee8 · 1 Parent(s): 6f9f76f

Update README.md

Files changed (1)

README.md +2 -4


@@ -4,10 +4,8 @@ license: apache-2.0
 This model can encode 224x224 RGB image into 28x28x13bit (1274 bytes) latent. The compression rate is 28x28x13/(224x224x24)=1/118, or 0.203 bpp (same as VQGAN_f8_8192).
-12M params for Encoder + Decoder. Trained on LAION-Aesthetics V2 5+ for 60M images.
-Guided by https://huggingface.co/laion/CLIP-ViT-B-32-laion2B-s34B-b79K (it's great. better than OpenAI CLIP B/32) and https://github.com/dingkeyan93/DISTS.
-No GAN loss. So probably the image is slightly blurred in some cases?
 (still training. final checkpt will be better)

 This model can encode 224x224 RGB image into 28x28x13bit (1274 bytes) latent. The compression rate is 28x28x13/(224x224x24)=1/118, or 0.203 bpp (same as VQGAN_f8_8192).
+12M params for Encoder + Decoder. Trained on LAION-Aesthetics V2 5+ for 130M images.
+Guided by https://huggingface.co/laion/CLIP-ViT-B-32-laion2B-s34B-b79K (it's great. better than OpenAI CLIP B/32) and https://github.com/dingkeyan93/DISTS. No GAN loss.
 (still training. final checkpt will be better)
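
The size figures quoted in the README (1274 bytes, 1/118, 0.203 bpp) follow directly from the stated shapes. A minimal sketch of that arithmetic is below; the function name `latent_stats` and its parameter names are hypothetical and are not part of the model's code. It assumes the raw image is uncompressed 24-bit RGB, as the README's formula does.

```python
from fractions import Fraction


def latent_stats(image_side=224, grid_side=28, bits_per_cell=13, raw_bits_per_pixel=24):
    """Recompute the README's size figures (hypothetical helper, not the model's API)."""
    latent_bits = grid_side * grid_side * bits_per_cell        # 28 * 28 * 13
    raw_bits = image_side * image_side * raw_bits_per_pixel    # 224 * 224 * 24
    return {
        "latent_bits": latent_bits,
        # Round up to whole bytes; here the division is exact.
        "latent_bytes": (latent_bits + 7) // 8,
        # Exact compression ratio, latent size over raw size.
        "ratio": Fraction(latent_bits, raw_bits),
        # Bits per pixel of the original image.
        "bpp": latent_bits / (image_side * image_side),
    }


if __name__ == "__main__":
    stats = latent_stats()
    print(stats["latent_bytes"])       # bytes per encoded image
    print(float(1 / stats["ratio"]))   # how many times smaller than raw RGB
    print(stats["bpp"])                # bits per pixel
```

With the default arguments the ratio reduces to 13/1536, which is roughly 1/118, and the bits-per-pixel value is 13/64, which rounds to the 0.203 bpp in the README. A model with 8x downsampling and an 8192-entry codebook (13 bits per token) gives the same 13/64 figure, which is consistent with the README's comparison to VQGAN_f8_8192.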