I'm unable to make these models work

#8
by hdnh2006 - opened

Hello,

I have tried both the 7B and 2B models and I am unable to make them work correctly. I have tried your official GPTQ quantization, and I also made my own GGUF quantization, and the results are poor.
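For anyone who wants to reproduce, my GGUF quantization was done with the usual llama.cpp flow, roughly like this (a sketch, not exact commands; script and binary names vary between llama.cpp versions, and the model path is illustrative):

```shell
# Typical llama.cpp GGUF quantization flow (names vary by llama.cpp version;
# paths are illustrative placeholders).
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp && make

# Convert the Hugging Face checkpoint to a full-precision GGUF file
python convert_hf_to_gguf.py /path/to/model --outfile model-f16.gguf

# Quantize down to 4-bit (Q4_K_M is a common choice)
./llama-quantize model-f16.gguf model-q4_k_m.gguf Q4_K_M

# Run a quick prompt to eyeball output quality
./llama-cli -m model-q4_k_m.gguf -p "Hello, how are you?" -n 64
```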

With your GPTQ quantization

image.png

With my GGUF quantization

I have made a YouTube video (in Spanish) about it in case you want to check it: https://youtu.be/CPkvcREEMc8

Without quantization

I have an RTX 4060 16GB, so I tried the 2B model at full precision. I don't have enough hardware to run the 7B model at full precision, and the results are still the same.
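For context on why the 7B model doesn't fit on my card, here is my rough back-of-the-envelope VRAM estimate (weights only; the KV cache and activations add several more GB on top, so these numbers are a lower bound, not a precise figure):

```python
def weight_vram_gib(n_params_billions: float, bytes_per_param: float) -> float:
    """Rough GiB needed just to hold the model weights.

    Ignores KV cache, activations, and framework overhead.
    """
    return n_params_billions * 1e9 * bytes_per_param / 1024**3

# fp16 = 2 bytes per parameter
fp16_7b = weight_vram_gib(7, 2.0)  # ~13 GiB: barely fits in 16 GB, no headroom
fp16_2b = weight_vram_gib(2, 2.0)  # ~3.7 GiB: comfortable on a 16 GB card

# 4-bit quantization = ~0.5 bytes per parameter
q4_7b = weight_vram_gib(7, 0.5)    # ~3.3 GiB: why quantization is attractive
```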

image.png

Is there any chance you could provide the benchmarks?

Thanks in advance.