kaitchup
/

Llama-3.3-70B-Instruct-AutoRound-GPTQ-4bit

Text Generation

text-generation-inference

4-bit precision

Model card Files Files and versions Community

bnjmnmarie commited on Dec 9, 2024

Commit

37a224e

·

verified ·

1 Parent(s): b751fd4

Update README.md

Files changed (1) hide show

README.md +21 -1

README.md CHANGED Viewed

@@ -1,4 +1,24 @@
 ---
 library_name: transformers
 license: llama3.3
----

 ---
+language:
+- en
 library_name: transformers
+tags:
+- auto-gptq
+- AutoRound
 license: llama3.3
+---
+## Model Details
+This is [meta-llama/Llama-3.3-70B-Instruct](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct) quantized with [AutoRound](https://github.com/intel/auto-round/tree/main) (symmetric quantization) and serialized with the GPTQ format in 4-bit. The model has been created, tested, and evaluated by The Kaitchup.
+Details on the quantization process and how to use the model here:
+[How to Quantize and Run Llama 3.3 70B Instruct on Your GPU](https://kaitchup.substack.com/p/how-to-quantize-and-run-llama-33)
+![Llama 3.3 70B Instruct_ Zero-Shot MMLU (Accuracy).png](https://cdn-uploads.huggingface.co/production/uploads/64b93e6bd6c468ac7536607e/0UOK_IsUinziw4GUMqq3_.png)
+![Llama 3.3 70B Instruct_ Model Size.png](https://cdn-uploads.huggingface.co/production/uploads/64b93e6bd6c468ac7536607e/hB8AdtD1DfVC90kOkYowz.png)
+- **Developed by:** [The Kaitchup](https://kaitchup.substack.com/)
+- **Language(s) (NLP):** English
+- **License:** Llama 3.3