Q8 quantized version?
Hello,
Thank you very much for making this model, it works very well!
However the Q4 quantized version seems quite a bit more limited compared to the results you posted, especially on reasoning abilities. Would it be possible for you to publish a Q8_0 quantized version please?
Of course, I will be exporting a Q8 quantized version soon.
hi @lrq3000 , I created the Q8 quantized version, but if you want to use it on vLLM you should use the AWQ version instead.
Thank you so much! It is awesome! The Q8 version is much more powerful in terms of reasoning abilities, thank you for generating it! Thank you also for the AWQ version, I will check it out if I decide to host it :-)
Wow I am very eager to see the new release :D I have now subscribed, keep up your awesome work!