16ch-VAE

Disclaimer: this VAE is not intended to be a replacement for SD3's VAE since the latent spaces are entirely different.

A fully open-source 16-channel (16ch) VAE that reproduces the SD3 VAE. It is useful for people who are building their own image generation models and need an off-the-shelf VAE. Trained natively in fp16.

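| VAE                | rFID   | PSNR    | LPIPS  |
|--------------------|--------|---------|--------|
| SD1.5 VAE          | 0.3131 | 26.4332 | 0.0328 |
| SDXL VAE           | 0.3511 | 26.7577 | 0.032  |
| SD3 VAE            | 0.0257 | 30.3231 | 0.0132 |
| 16ch-VAE           | 0.0667 | 31.5151 | 0.0136 |
| 16ch-VAE with FFT* | 0.1584 | 31.0542 | 0.0281 |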
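
For readers who want to sanity-check reconstruction quality on their own images, here is a minimal sketch of the PSNR metric from the table (higher is better; for rFID and LPIPS, lower is better). This card does not say how its numbers were computed, so the function name `psnr`, the `[0, 1]` pixel range, and the synthetic test image below are assumptions for illustration, not the evaluation code behind the table.

```python
import numpy as np


def psnr(reference, reconstruction, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images.

    Assumes pixel values lie in [0, max_value]; this range is an
    assumption of the sketch, not something stated by the model card.
    """
    reference = np.asarray(reference, dtype=np.float64)
    reconstruction = np.asarray(reconstruction, dtype=np.float64)
    if reference.shape != reconstruction.shape:
        raise ValueError("images must have the same shape")

    # Mean squared error over every pixel and channel.
    mse = np.mean((reference - reconstruction) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))


if __name__ == "__main__":
    # Synthetic stand-in for an (image, VAE reconstruction) pair.
    rng = np.random.default_rng(0)
    image = rng.random((3, 64, 64))
    noisy = np.clip(image + rng.normal(0.0, 0.03, image.shape), 0.0, 1.0)
    print(f"PSNR: {psnr(image, noisy):.2f} dB")
```

To evaluate a real VAE, replace `noisy` with the decoded output of the model for `image`, converted to the same value range, and average the result over the evaluation set.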

Usage

Support in diffusers is pending the merge of https://github.com/huggingface/diffusers/pull/8769.
