Do I need to download all parts to choose between f16 and q8? I can't find the quants.

#3
by JCBh - opened

See title. Thank you for your work on this.

@JCBh This is the *.safetensors repo, which holds the model in its original, unquantized format. You need all the parts to load the model in code, and you can then do whatever you like with it, such as quantizing it yourself.
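To make "all the parts" concrete, here is a minimal sketch, assuming the `huggingface_hub` Python package is installed. The file patterns reflect the usual layout of a sharded safetensors repo (weight shards plus config and tokenizer files) and are an assumption, not a listing of this repo. The helper names `files_needed` and `download_full_model` are hypothetical.

```python
from fnmatch import fnmatch

# Patterns for the files typically needed to load a sharded safetensors model.
# This is an assumption about the usual repo layout, not taken from this repo.
NEEDED_PATTERNS = ["*.safetensors", "*.json", "tokenizer*"]


def files_needed(repo_files):
    """Return the repo files that match any of the patterns above, in order."""
    return [f for f in repo_files if any(fnmatch(f, p) for p in NEEDED_PATTERNS)]


def download_full_model(repo_id):
    """Download all matching files of an unquantized repo; returns the local folder."""
    # Imported lazily so the pure helper above works without the package.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, allow_patterns=NEEDED_PATTERNS)
```

Calling `download_full_model(...)` with the id of the safetensors repo should fetch every shard in one go; a partial set of shards is not enough to load the model.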

If you want to download a quantized version, look for a repo in a quantized format, for example *.GGUF, and use the relevant software to load the model (LM Studio, for example). Check tutorials for how this works.

This is the repo for GGUF:
https://huggingface.co/Orenguteng/Llama-3.1-8B-Lexi-Uncensored-GGUF
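GGUF repos usually ship each quantization level as its own file, so you only need the one you want rather than the whole repo. A rough sketch of fetching a single quant, assuming the `huggingface_hub` package and assuming the filenames contain a tag such as `Q8_0` or `F16` (check the repo's file list, since the exact names are not given here). The helper names `pick_quant` and `download_quant` are hypothetical.

```python
def pick_quant(repo_files, quant):
    """Return the .gguf filenames whose name contains the quant tag (case-insensitive)."""
    tag = quant.lower()
    return [
        f for f in repo_files
        if f.lower().endswith(".gguf") and tag in f.lower()
    ]


def download_quant(repo_id, quant):
    """Download the first .gguf file matching the quant tag; returns its local path."""
    # Imported lazily so pick_quant can be used without the package or network.
    from huggingface_hub import hf_hub_download, list_repo_files

    matches = pick_quant(list_repo_files(repo_id), quant)
    if not matches:
        raise FileNotFoundError(f"no .gguf file matching {quant!r} in {repo_id}")
    return hf_hub_download(repo_id=repo_id, filename=matches[0])
```

For example, `download_quant("Orenguteng/Llama-3.1-8B-Lexi-Uncensored-GGUF", "q8_0")` would download one file, provided the repo has a file with that tag in its name. The resulting path can then be opened in LM Studio or any other GGUF-compatible loader.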

Orenguteng changed discussion status to closed
