About finetuning current SDXL weights with the EQ-SDXL-VAE

#2
by eeyrw - opened

You said in the intro: "You can try to use this VAE to finetune your sdxl model and expect a better final result, but it may require lot of time to achieve it...". I am still very interested in utilizing the existing model weights. So my question is: how much is "a lot"? I have ~500k samples; how many iterations are required to align the UNet of SDXL with the new latent space?

A lot of training time.
Although some reported results say that "a few k steps with a small LoRA works well".
Your setup is definitely OK.
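
For concreteness, here is a minimal sketch of what such a small-LoRA setup could look like with the peft integration in diffusers; the rank, target modules, and learning rate are illustrative assumptions, not values anyone reported:

    import torch
    from diffusers import UNet2DConditionModel
    from peft import LoraConfig

    # load the SDXL UNet that should be adapted to the new latent space
    unet = UNet2DConditionModel.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0", subfolder="unet"
    )
    unet.requires_grad_(False)  # freeze the base weights; only the LoRA trains

    # a small LoRA on the attention projections (rank 8 is a guess)
    unet.add_adapter(LoraConfig(
        r=8,
        lora_alpha=8,
        init_lora_weights="gaussian",
        target_modules=["to_k", "to_q", "to_v", "to_out.0"],
    ))

    # only the LoRA parameters go to the optimizer
    optimizer = torch.optim.AdamW(
        [p for p in unet.parameters() if p.requires_grad], lr=1e-4
    )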

I just thought a dataset like LAION-400M was needed. It turns out that a scale of a few thousand samples is said to work.

My thought was a dataset on the scale of Danbooru (8M) or CC12M.
And yes, I'm also surprised that a few k, or just a few dozen k, iterations are enough.

I spent a night on a quick try: I finetuned a LoRA for about 48k iterations and got a very poor result, so I suspect something is wrong in my finetuning process. Do I need to modify my training script with respect to the VAE? I ask because I notice there are some parameters not used by the original VAE, such as:

            "shift_factor": 0.8640247167934477,

In my training script, the VAE encoding part goes like this:

    # sample latents from the VAE posterior and apply the scaling factor
    model_input = vae.encode(pixel_values).latent_dist.sample()
    model_input = model_input * vae.config.scaling_factor
    model_input = model_input.to(weight_dtype)

Should I change the code to:

    model_input = model_input * vae.config.scaling_factor + vae.config.shift_factor

?
By the way, I use StableDiffusionXLPipeline from diffusers for inference.
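
For comparison, diffusers pipelines that use a shift_factor (SD3, for example) subtract it before scaling rather than add it after, so I wonder if the encode/decode should instead look like this (just my reading of those pipelines, not confirmed for this VAE):

    # encode: subtract the shift before applying the scale (SD3-style convention)
    model_input = vae.encode(pixel_values).latent_dist.sample()
    model_input = (model_input - vae.config.shift_factor) * vae.config.scaling_factor
    model_input = model_input.to(weight_dtype)

    # decode: invert both steps before calling the VAE decoder
    latents = latents / vae.config.scaling_factor + vae.config.shift_factor
    image = vae.decode(latents).sample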
