fsaudm/Meta-Llama-3.1-70B-Instruct-NF4

fsaudm committed on Aug 15, 2024

Commit c11b64d • 1 Parent(s): f37de11

Update README.md

Files changed (1)

README.md +1 -1

@@ -20,7 +20,7 @@ tags:
 # Model Card for Model ID
 <!-- Provide a quick summary of what the model is/does. -->
-This is a quantized version of `Llama 3.1 70B Instruct`. Quantization to 8-bit using `bistandbytes` and `accelerate`.
+This is a quantized version of `Llama 3.1 70B Instruct`. Quantization to **4-bit** using `bistandbytes` and `accelerate`.
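
For context on why the 8-bit to 4-bit correction matters, here is a rough back-of-the-envelope sketch of the weight storage at each precision. It is not taken from the repository: the parameter count is assumed to be a nominal 70e9, the helper name `weight_footprint_gib` is hypothetical, and the estimate ignores per-block quantization constants, any layers left unquantized, activations, and the KV cache.

```python
def weight_footprint_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate storage for the weights alone, in GiB (hypothetical helper)."""
    return n_params * bits_per_weight / 8 / 2**30


# Assumption: nominal parameter count for a "70B" model, not an exact figure.
N_PARAMS = 70e9

# Compare half precision, 8-bit, and 4-bit storage of the same weights.
for bits in (16, 8, 4):
    print(f"{bits:>2}-bit: ~{weight_footprint_gib(N_PARAMS, bits):.0f} GiB")
```

Under these assumptions, a 4-bit checkpoint needs about half the weight storage of an 8-bit one, so the precision stated in the model card directly affects what hardware a reader would plan for.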