Do I download all parts to choose between f16 and q8? I can't find the quants.
#3
by
JCBh
- opened
As in the title. Thank you for your work on this.
@JCBh This is the *.safetensors repo, which holds the model in its original, unquantized format. You need all the parts to load the model in code, and then you can do whatever you like with it, such as quantizing it yourself.
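To illustrate why all the parts are needed: a sharded *.safetensors checkpoint normally ships with a `model.safetensors.index.json` file whose `weight_map` says which part stores each tensor. Here is a rough sketch that lists the parts such an index refers to. The index contents and shard names below are made up and heavily shortened for illustration; they are not taken from this repo.

```python
import json


def shards_needed(index_json: str) -> list[str]:
    """Return the sorted shard filenames referenced by a sharded-checkpoint index."""
    index = json.loads(index_json)
    # "weight_map" maps each tensor name to the shard file that stores it.
    return sorted(set(index["weight_map"].values()))


# Hypothetical, shortened index; a real one lists every tensor in the model.
example_index = json.dumps({
    "metadata": {"total_size": 0},
    "weight_map": {
        "model.embed_tokens.weight": "model-00001-of-00002.safetensors",
        "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
        "lm_head.weight": "model-00002-of-00002.safetensors",
    },
})

print(shards_needed(example_index))
```

Because the tensors are spread across the parts, a loader that finds one part missing cannot rebuild the full model.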
If you want to download a quantized version, look for a repo in a quantized format, for example *.GGUF, and use suitable software to load the model (LM Studio, for example). Check tutorials for how this works.
This is the repo for GGUF:
https://huggingface.co/Orenguteng/Llama-3.1-8B-Lexi-Uncensored-GGUF
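If you would rather fetch a single quant from a script than through the website, something like the sketch below might work. It assumes the `huggingface_hub` package is installed and that the quant level (f16, q8, and so on) appears in the GGUF filenames, which I have not checked for this repo. `pick_gguf` and `download_quant` are hypothetical helper names, and the filenames in the example are made up.

```python
def pick_gguf(filenames, quant):
    """Return the GGUF filenames whose name contains the quant tag (case-insensitive)."""
    tag = quant.lower()
    return [f for f in filenames if f.lower().endswith(".gguf") and tag in f.lower()]


def download_quant(repo_id, quant):
    """Download the first GGUF file in repo_id matching quant and return its local path."""
    # Imported lazily so pick_gguf works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download, list_repo_files

    matches = pick_gguf(list_repo_files(repo_id), quant)
    if not matches:
        raise FileNotFoundError(f"no {quant} GGUF file found in {repo_id}")
    # Only the chosen file is downloaded, not the whole repo.
    return hf_hub_download(repo_id=repo_id, filename=matches[0])


# Made-up filenames, only to show the matching logic.
example_files = ["README.md", "model-F16.gguf", "model-Q8_0.gguf"]
print(pick_gguf(example_files, "q8"))
```

With the repo above, the call would look like `download_quant("Orenguteng/Llama-3.1-8B-Lexi-Uncensored-GGUF", "q8")`. Unlike the *.safetensors parts, each GGUF file is normally a complete model at one quant level, so you only need the one you pick.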
Orenguteng
changed discussion status to
closed