Can we get a GPTQ-quantized model?
Hello,
I noticed that you have experimented with quantization for your model, but the results were not as good as expected:
"When quantized to 4 bits, the model demonstrates unusual behavior, possibly due to its complexity. We suggest using a minimum quantization of 8 bits, although this has not been tested."
I recommend trying the new GPTQ quantization method, using the combined options "act-order" + "true-sequential" + "groupsize 128".
With these options, the quantized model's performance comes much closer to that of the 16-bit model.
Check out the following link for more information: https://github.com/qwopqwop200/GPTQ-for-LLaMa/tree/triton
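For reference, a quantization run with those options might look roughly like this (the model path, calibration dataset, and output filename below are placeholders; please check the README on the triton branch for the exact invocation):

```bash
# Quantize a LLaMA-style model to 4-bit with GPTQ
# (option names as in the GPTQ-for-LLaMa repo; adjust paths to your setup)
CUDA_VISIBLE_DEVICES=0 python llama.py /path/to/fp16-model c4 \
    --wbits 4 \
    --true-sequential \
    --act-order \
    --groupsize 128 \
    --save model-4bit-128g.pt
```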
If you decide to use this quantization method, please consider uploading the resulting model to your repository. That way, users with high-performance machines (16-bit) and those with lower-end hardware (4-bit) can both enjoy your models.
Best regards,
+1