JamePeng2023/Llama-3.2-3B-Instruct-abliterated-SpinQuant-w4a8

JamePeng2023 committed on Feb 16

Commit eeeeeea · verified · 1 Parent(s): 72d7fe9

Update README.md

Files changed (1)

README.md +5 -1

@@ -3,6 +3,10 @@ license: llama3.2
 base_model:
 - huihui-ai/Llama-3.2-3B-Instruct-abliterated
 ---
-Quantize the model from https://huggingface.co/huihui-ai/Llama-3.2-3B-Instruct-abliterated using the SpinQuant quantization method from https://github.com/facebookresearch/SpinQuant for on-device deployment in an Android app with Executorch.
+Using the SpinQuant quantization method from https://github.com/facebookresearch/SpinQuant, I quantized the Llama-3.2-3B-Instruct-abliterated model from https://huggingface.co/huihui-ai/Llama-3.2-3B-Instruct-abliterated.
+This quantization is for on-device deployment to Android apps with Executorch.
+To make it easier for everyone to quickly test and deploy the Executorch on-device demo, I've also converted the quantized PTH file to PTE format and uploaded it.
 2025-02-16 19:40:24,099 - spinquant - INFO - wiki2 ppl is: 11.502239227294922
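
Reviewer note: the log line kept in the README reports a WikiText-2 perplexity. As a rough aid for reading that number, here is a minimal Python sketch that extracts the value from the log line and converts it to a mean per-token negative log-likelihood. It assumes the usual definition, perplexity = exp(mean NLL in nats); whether the SpinQuant evaluation script uses exactly this definition is not confirmed by this commit. The helper names `parse_ppl` and `ppl_to_mean_nll` are hypothetical and not part of any library.

```python
import math
import re

# Log line copied verbatim from the README in this commit.
LOG_LINE = "2025-02-16 19:40:24,099 - spinquant - INFO - wiki2 ppl is: 11.502239227294922"


def parse_ppl(line: str) -> float:
    """Extract the perplexity value from a 'ppl is: <number>' log line (hypothetical helper)."""
    match = re.search(r"ppl is:\s*([0-9]+(?:\.[0-9]+)?)", line)
    if match is None:
        raise ValueError("no perplexity value found in log line")
    return float(match.group(1))


def ppl_to_mean_nll(ppl: float) -> float:
    """Invert ppl = exp(mean NLL); assumes natural-log (nats) cross-entropy."""
    return math.log(ppl)


ppl = parse_ppl(LOG_LINE)
mean_nll = ppl_to_mean_nll(ppl)
print(f"perplexity={ppl:.4f}, mean NLL={mean_nll:.4f} nats/token")
```

A mean NLL is often easier to compare across runs than raw perplexity, since differences in it are additive. The same parsing could be applied to a log from the unquantized base model to estimate the quantization cost, if such a log is available.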