Cran-May/tempmotacilla-cinerea-0308
#764 by Cran-May - opened
By the way, is it possible to offer a Q4_K-L quant like bartowski's work?
Compared to Q4_K, Q4_K-L usually has better quality.
./llama-quantize --imatrix model.imatrix --output-tensor-type q8_0 --token-embedding-type q8_0 ./Model-Conversion-F32.gguf ./Model-Quant-Q4_K_M.gguf Q4_K_M
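To make the difference explicit: the command above is an ordinary Q4_K_M quantization plus two type overrides that keep the output tensor and the token embeddings at q8_0. The following is only a rough sketch that prints both command lines side by side without running anything. The helper name `quantize_cmd` and the output file name `Model-Quant-Q4_K_L.gguf` are assumptions made for illustration; the input and imatrix paths are copied from the command above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Paths copied from the command above.
SRC=./Model-Conversion-F32.gguf
IMATRIX=model.imatrix

# Hypothetical helper: prints (does not run) a llama-quantize command line.
# Usage: quantize_cmd OUTPUT_FILE QUANT_TYPE [extra llama-quantize options...]
quantize_cmd() {
  local out="$1" type="$2"
  shift 2
  echo ./llama-quantize --imatrix "$IMATRIX" "$@" "$SRC" "$out" "$type"
}

# Plain Q4_K_M: llama-quantize picks its default types for all tensors.
plain=$(quantize_cmd ./Model-Quant-Q4_K_M.gguf Q4_K_M)

# "Q4_K_L"-style variant: same base type, but output tensor and token
# embeddings are forced to q8_0. The output file name is an assumption.
large=$(quantize_cmd ./Model-Quant-Q4_K_L.gguf Q4_K_M \
  --output-tensor-type q8_0 --token-embedding-type q8_0)

echo "$plain"
echo "$large"
```

Running either printed line by hand produces the corresponding file; comparing the two file sizes shows how many extra bytes the q8_0 overrides cost.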
It's queued!
> Compared to Q4_K, Q4_K-L usually has better quality.
And so does Q5_K_S, probably with a much better quality-per-bit ratio. The non-standard format(s) you are requesting were basically discredited a long time ago. See https://huggingface.co/mradermacher/Rombo-LLM-V3.0-Qwen-32b-i1-GGUF/discussions/1 and https://old.reddit.com/r/KoboldAI/comments/1j6bx40/the_highest_quality_quantization_varient_gguf_and/ for recent discussions.
mradermacher changed discussion status to closed