I'm unable to make these models work

#8
by hdnh2006 - opened

Hello,

I have tried both the 7B and 2B models and I am unable to make them work correctly. I have tried your official GPTQ quantization, and I also made my own GGUF quantization, and the results are poor.
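For anyone who wants to reproduce, my GGUF quantization was done with the usual llama.cpp flow, roughly like this (a sketch, not exact commands; script and binary names vary between llama.cpp versions, and the model path is illustrative):

```shell
# Typical llama.cpp GGUF quantization flow (names vary by llama.cpp version;
# paths are illustrative placeholders).
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp && make

# Convert the Hugging Face checkpoint to a full-precision GGUF file
python convert_hf_to_gguf.py /path/to/model --outfile model-f16.gguf

# Quantize down to 4-bit (Q4_K_M is a common choice)
./llama-quantize model-f16.gguf model-q4_k_m.gguf Q4_K_M

# Run a quick prompt to eyeball output quality
./llama-cli -m model-q4_k_m.gguf -p "Hello, how are you?" -n 64
```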

With your GPTQ quantization

image.png

With my GGUF quantization

I have made a YouTube video (in Spanish) about it in case you want to check it: https://youtu.be/CPkvcREEMc8

Without quantization

I have an RTX 4060 16GB, so I tried the 2B model at full precision. I don't have enough hardware to run the 7B model at full precision, and the results are still the same.
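For context on why the 7B model doesn't fit on my card, here is my rough back-of-the-envelope VRAM estimate (weights only; the KV cache and activations add several more GB on top, so these numbers are a lower bound, not a precise figure):

```python
def weight_vram_gib(n_params_billions: float, bytes_per_param: float) -> float:
    """Rough GiB needed just to hold the model weights.

    Ignores KV cache, activations, and framework overhead.
    """
    return n_params_billions * 1e9 * bytes_per_param / 1024**3

# fp16 = 2 bytes per parameter
fp16_7b = weight_vram_gib(7, 2.0)  # ~13 GiB: barely fits in 16 GB, no headroom
fp16_2b = weight_vram_gib(2, 2.0)  # ~3.7 GiB: comfortable on a 16 GB card

# 4-bit quantization = ~0.5 bytes per parameter
q4_7b = weight_vram_gib(7, 0.5)    # ~3.3 GiB: why quantization is attractive
```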

image.png

Is there any chance you could provide the benchmarks?

Thanks in advance.