float8_e5m2 weights?

#4
by matatonic - opened

This is a really great model! Thanks for putting this together.

Can you upload float8_e5m2 weights as well, and/or the original BF16 version of the merged model?
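For anyone unsure what float8_e5m2 means: it is an 8-bit float with 1 sign bit, 5 exponent bits, and 2 mantissa bits, so its bit layout matches the top byte of an IEEE half-precision value. Here is a minimal pure-Python sketch of that rounding, not the uploader's conversion process. The function names are made up for illustration, and going through float16 first can double-round arbitrary Python floats. In practice you would more likely cast tensors with PyTorch's `torch.float8_e5m2` dtype.

```python
import struct

def float_to_e5m2_bits(x: float) -> int:
    """Hypothetical helper: round x to float16, then to the 8-bit e5m2 pattern."""
    try:
        (h,) = struct.unpack("<H", struct.pack("<e", x))
    except OverflowError:
        # Too large for float16, so it overflows to a signed infinity.
        return 0xFC if x < 0 else 0x7C
    if (h & 0x7C00) == 0x7C00:
        # Inf or NaN: keep the top byte, forcing a nonzero mantissa for NaN.
        return (h >> 8) | (0x02 if (h & 0x03FF) else 0)
    upper, lower = h >> 8, h & 0xFF
    # Round to nearest, ties to even, on the 8 mantissa bits being dropped.
    if lower > 0x80 or (lower == 0x80 and (upper & 1)):
        upper += 1
    return upper

def e5m2_bits_to_float(b: int) -> float:
    """Hypothetical helper: decode an e5m2 byte by padding it back out to float16."""
    (x,) = struct.unpack("<e", struct.pack("<H", (b & 0xFF) << 8))
    return x

def quantize_e5m2(x: float) -> float:
    """Round-trip a value through e5m2 to see what survives."""
    return e5m2_bits_to_float(float_to_e5m2_bits(x))

if __name__ == "__main__":
    for v in (1.0, 1.1, 1.2, 3.0, 60000.0):
        print(v, "->", quantize_e5m2(v))
```

With only 2 mantissa bits there are just four representable values per power of two, which is why e5m2 files are small but noticeably lossier than BF16.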

I'm curious how you did this and what tools you used-- not in a general sense, since I'm fully aware of how to bake a VAE into a checkpoint-- but the specifics of your process with this exact model.

I'm entirely self-taught on AI stuff (the pandemic hit in 2020, I got introduced to ChatGPT and Automatic1111, and I was hooked, as I'm sure happened to tons of others on this site)... so much so that I've pursued it professionally over the last 6 years and am currently employed in the AI field, training generative models for a small company. Nothing crazy: I started by helping recreate an art style for an asset class their former dev had "lost the info and prompts for", so they needed a quick fix to get it back. Point being, I'm no AI whiz; I'm probably very remedial, at best, compared to someone with formal training or schooling.

...and while I'm aware of methods to bake VAEs into checkpoints, I'm more interested in YOUR process for doing it. As I'm sure you're well aware, results from something like this can land anywhere between success and utter failure depending on the script and parameters used. I'd sure appreciate it if you'd share either the config file and training/merging script you used, or the weights/recipe, if you'd be so kind.

The reason I'm asking is that this is, by a wide margin, probably my favorite Schnell model that I've used. I even like it better than most dev finetunes I've encountered. And not just for the obvious convenience: it's more that this is combined with its adherence when generating text from a text prompt (I've never "baked" a T5 into base model weights before). I'm shocked at how remarkably versatile this one is and how much of the original weights and overall imagery have been retained so god damned well. It's really a great accomplishment, and I don't see nearly enough people using it.

If you haven't given yourself a pat on the back for this one, you should-- because this shit is second to none. You really did a fantastic fucking job with this, no bullshit.

Shoot me a message if you're willing to hook a bro up with them recipes/configs/whatever black magic wizardry you used to pull this off as a one-file-to-rule-them-all flux model, because every time I use it, I'm more and more impressed with it, and I'd like to learn everything I can about why it works so well.

Thanks in advance for the reply. Even if it's a no, that's cool-- I'd just be mad at myself if I didn't at least ask.

And, FWIW, whichever way you answer, 'Yes' or 'No'-- thanks for making it. Sincerely.

If you have a Ko-fi or PayPal, I'll gladly donate a few bucks, even if you don't want to share your secrets, because I like it THAT much.

Best,
-Chad
