prince-canuma
committed on
Update README.md
README.md CHANGED
@@ -10,8 +10,6 @@ tags:
-- moe
 ---
 # Model Card for Mixtral-8x22B
-This repo is derived from the full-precision model here: [v2ray/Mixtral-8x22B-v0.1](https://huggingface.co/v2ray/Mixtral-8x22B-v0.1), you can download from there if you want to. \
 
 The Mixtral-8x22B Large Language Model (LLM) is a pretrained generative Sparse Mixture of Experts.
 
 Model details:
@@ -21,6 +19,9 @@ Model details:
 - 🤓 32K vocab size
 - ✂️ Similar tokenizer as 7B
 
+Model quantized and added by [Prince Canuma](https://twitter.com/Prince_Canuma) using the full-precision model here: [v2ray/Mixtral-8x22B-v0.1](https://huggingface.co/v2ray/Mixtral-8x22B-v0.1).
+
+
 ## Run the model in 4-bit precision
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
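The diff cuts the `## Run the model in 4-bit precision` snippet off at its first import. For context, here is a minimal sketch of what such a 4-bit load typically looks like with transformers plus bitsandbytes; the quantization settings (NF4, bf16 compute) are common defaults assumed here rather than taken from this card, and the repo id below is the full-precision source named in the diff, since the quantized repo's own id is not visible in this hunk:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Full-precision source repo named in the card; swap in the quantized repo id
# from this model page if you are loading the pre-quantized weights instead.
model_id = "v2ray/Mixtral-8x22B-v0.1"

# 4-bit quantization via bitsandbytes; NF4 with bf16 compute is a common
# choice, assumed here since the card's snippet is truncated in the diff.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # shard the experts across available GPUs
)

prompt = "The Mixtral-8x22B model is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```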