4-bit quantization
#2
by ibalampanis · opened
Hello!
How did you manage to quantize it to Q4_K_M? llama.cpp offers only q8_0, f16, and f32, right?
Thanks.
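For context, a hedged sketch of the usual llama.cpp workflow: the `convert_hf_to_gguf.py` script indeed only emits full/half-precision or q8_0 GGUF files, but the separate `llama-quantize` tool (built alongside llama.cpp) can then requantize that file down to Q4_K_M. The model paths and filenames below are placeholders, not anything from this thread.

```shell
# Step 1: convert the HF model to a high-precision GGUF file
# (convert_hf_to_gguf.py supports f32, f16, bf16, q8_0 as --outtype).
python convert_hf_to_gguf.py ./my-hf-model --outtype f16 --outfile model-f16.gguf

# Step 2: requantize the GGUF file to Q4_K_M with the llama-quantize binary.
./llama-quantize model-f16.gguf model-Q4_K_M.gguf Q4_K_M
```

So the 4-bit K-quants come from the second step, not from the conversion script itself.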