GGUF for this one?

#5 by UnsignedLongshanks

Hey man, appreciate your work. This model flew completely under my radar, and probably everyone else's too given the size. I was curious whether it would be possible to convert this one to GGUF format? I reckon quants wouldn't be needed, since they probably wouldn't be as good as (or as popular as) the smaller Flux variants like Dev and Schnell, but even at full size this model is pretty similar in size to the 30B LLM I can run in GGUF format with KoboldCPP by offloading half its blocks to the second of my two RTX 4090s. I might have to dump CLIP/VAE to the CPU, but worse things could happen.
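To be clear about the split I mean, something like this in torch pseudocode (the module names are made up, just to illustrate the idea):

```python
# Made-up module layout, only showing the split I'm after: half the
# transformer blocks on each GPU, text encoder + VAE parked on CPU.
def place_model(model):
    blocks = list(model.blocks)  # hypothetical attribute
    half = len(blocks) // 2
    for i, block in enumerate(blocks):
        block.to("cuda:0" if i < half else "cuda:1")
    model.clip.to("cpu")  # prompt encoding can run on CPU
    model.vae.to("cpu")   # latent decoding too
```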

If you're too busy to do it yourself I'd be happy to try it; let me know if there are any tutorials you'd recommend, since this seems a bit too niche to find anything useful by googling. With llama.cpp it was pretty straightforward, but I imagine this would be different. Thanks!

Hi. This model was mostly just an experiment to test how well self-merges work, plus a very small amount of training to make it usable again. It's probably worse than base Flux in every metric except size lol.

I could've sworn someone put up some quants for this model on civitai before, but I can't find them now. If you want to take a shot at it, the tools folder in the ComfyUI-GGUF repo has most of the info you'll need to quantize it yourself. It's pretty much just the llama.cpp repo with a patch on top to make it handle image models.
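Roughly, the first step is just writing the safetensors weights back out as a GGUF file. Here's a minimal sketch of that using the `gguf` and `safetensors` pip packages (the filenames and the "flux" arch string are placeholders; the actual tools/convert.py handles more, like key fixes and keeping some tensors at higher precision):

```python
# Bare-bones safetensors -> GGUF dump. Placeholder filenames; the real
# conversion script in the ComfyUI-GGUF tools folder does more than this.
import torch
from safetensors.torch import load_file
from gguf import GGUFWriter

state_dict = load_file("flux-merge.safetensors")  # hypothetical path

writer = GGUFWriter("flux-merge-F16.gguf", arch="flux")
for name, tensor in state_dict.items():
    # Cast everything to F16 for the unquantized base file; the actual
    # quant formats get made from this file afterwards.
    writer.add_tensor(name, tensor.to(torch.float16).cpu().numpy())

writer.write_header_to_file()
writer.write_kv_data_to_file()
writer.write_tensors_to_file()
writer.close()
```

From there, if I remember right, you build llama.cpp with the repo's patch applied and run its quantize binary on the F16 file to get the smaller quant types.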
