How to quantize the hunyuan model to fp8

#1
by hz094 - opened

Hi sir, thanks for the excellent work. I am curious about how you quantized the Hunyuan model; could you share more details?

You need torch and llama.cpp. You could try converting the safetensors to GGUF and testing it first; simply execute: ggc t
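Before converting, it can help to check which dtypes the safetensors checkpoint currently holds (for example BF16 versus an fp8 type such as F8_E4M3). The following is only a minimal sketch, not the tool's own code: it assumes the standard safetensors layout (an 8-byte little-endian header length followed by a JSON header), and the function names `read_safetensors_header` and `dtype_summary` are hypothetical helpers introduced here for illustration.

```python
import json
import struct
from collections import Counter


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file as a dict (hypothetical helper)."""
    with open(path, "rb") as f:
        # The first 8 bytes hold the header length as an unsigned little-endian 64-bit int.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is an optional free-form entry, not a tensor, so drop it.
    header.pop("__metadata__", None)
    return header


def dtype_summary(path):
    """Count tensors per dtype string, e.g. "BF16" or "F8_E4M3" (hypothetical helper)."""
    header = read_safetensors_header(path)
    return Counter(info["dtype"] for info in header.values())
```

Calling `dtype_summary` on the checkpoint path returns a count per dtype, which gives a quick way to confirm whether the weights are already stored in an fp8 type or still in a wider format. Only the header is read, so this does not load the tensor data into memory.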

Screenshot 2024-12-27 001107.png

Screenshot 2024-12-27 001148.png
