A fine-tuned VAE decoder for Stable Diffusion 1.5, aimed at better pixel art generation by aliasing the output of the decoder.

Fine-tuning ran for 1 epoch over 50 thousand images with an effective batch size of 12. I preprocessed the images by quantizing each 8x8 tile to its average color. On an RTX 3090, fine-tuning took about 4 hours. I used only MSE loss at a learning rate of 1e-5. The training data set was generated with other Stable Diffusion models and is mostly cartoon-like images.
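For illustration, the tile quantization step could look roughly like the sketch below. This is not the original preprocessing script: the function name `quantize_tiles`, the use of NumPy, the height x width x channels layout, and the requirement that image dimensions be multiples of the tile size are all assumptions made for the example.

```python
import numpy as np


def quantize_tiles(image: np.ndarray, tile: int = 8) -> np.ndarray:
    """Replace every tile x tile block of an H x W x C image with its mean color.

    Hypothetical helper, not the author's code. Assumes H and W are
    multiples of `tile`.
    """
    h, w, c = image.shape
    if h % tile or w % tile:
        raise ValueError("height and width must be multiples of the tile size")

    # Split both spatial axes into (block index, offset inside block).
    blocks = image.reshape(h // tile, tile, w // tile, tile, c).astype(np.float32)

    # Average over the two "offset inside block" axes: one color per tile.
    means = blocks.mean(axis=(1, 3), keepdims=True)

    # Paint each tile with its mean color and restore the H x W x C layout.
    out = np.broadcast_to(means, blocks.shape).reshape(h, w, c)

    # Round only for integer images (e.g. uint8); keep float images as-is.
    if np.issubdtype(image.dtype, np.integer):
        out = np.rint(out)
    return out.astype(image.dtype)
```

A 512x512 image processed this way becomes a 64x64 grid of flat-colored tiles, which is the kind of blocky target the decoder would be trained to reproduce under MSE loss.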
