A Stable Diffusion 1.5 VAE decoder, fine-tuned for better pixel art generation by aliasing the decoder's output.

comparison

Fine-tuning used 50,000 images for 1 epoch at an effective batch size of 12. I preprocessed the images by quantizing each 8x8 tile to its average color. On an RTX 3090, fine-tuning took about 4 hours. The only loss was MSE, at a learning rate of 1e-5. The training dataset was generated from other Stable Diffusion models and consists mostly of cartoon-like images.
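The 8x8 tile quantization can be sketched roughly as follows. This is a minimal NumPy illustration, not the actual preprocessing script. The function name `quantize_tiles`, the HxWxC `uint8` layout, and the requirement that height and width be multiples of the tile size are all assumptions made for the example.

```python
import numpy as np


def quantize_tiles(image: np.ndarray, tile: int = 8) -> np.ndarray:
    """Replace every tile x tile block of an HxWxC image with its mean color.

    Hypothetical helper: assumes H and W are exact multiples of `tile`.
    """
    h, w, c = image.shape
    if h % tile or w % tile:
        raise ValueError("height and width must be multiples of the tile size")
    # View the image as (tile rows, tile height, tile cols, tile width, channels).
    blocks = image.reshape(h // tile, tile, w // tile, tile, c).astype(np.float32)
    # Average over the two within-tile axes, keeping dims so the result broadcasts.
    means = blocks.mean(axis=(1, 3), keepdims=True)
    # Spread each tile's mean back over the tile and restore the HxWxC layout.
    flat = np.broadcast_to(means, blocks.shape).reshape(h, w, c)
    return np.rint(flat).astype(image.dtype)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.integers(0, 256, size=(16, 16, 3), dtype=np.uint8)
    out = quantize_tiles(img)
    print(out.shape, out.dtype)
```

Each tile in the result is a single flat color, which gives the hard-edged, blocky look of pixel art at 1/8 of the original resolution.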
