About

Static quantization of https://huggingface.co/Vezora/Mistral-22B-v0.2. IQ quants can be found here (Richard Erkhov's work): https://huggingface.co/RichardErkhov/Vezora_-_Mistral-22B-v0.2-gguf

Provided Quants

| Filename | Quant type | File Size | Description |
| -------- | ---------- | --------- | ----------- |
| Mistral-22B-v0.2-Q5_K_M.gguf | Q5_K_M | 15.71 GB | High quality, recommended. |
| Mistral-22B-v0.2-Q4_K_M.gguf | Q4_K_M | 13.33 GB | Good quality, uses about 4.83 bits per weight, recommended. |
| Mistral-22B-v0.2-Q4_K_S.gguf | Q4_K_S | 12.65 GB | Slightly lower performance than Q4_K_M, fastest, best choice for 16 GB RAM devices, recommended. |
| Mistral-22B-v0.2-Q3_K_M.gguf | Q3_K_M | 10.75 GB | Even lower quality. |
| Mistral-22B-v0.2-Q2_K.gguf | Q2_K | 8.26 GB | Very low quality. |
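
A minimal sketch of fetching one of the quants above and running it locally, assuming this repo's id NLPark/Mistral-22B-v0.2-GGUF and the llama-cpp-python backend (any GGUF-compatible runtime works; pick a different filename from the table to trade quality for size):

```python
# Sketch: download a quant from this repo and run it with llama-cpp-python.
# Repo id and filename are taken from the table above; adjust the quant
# (e.g. Q4_K_S for 16 GB RAM devices) and generation settings to taste.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model_path = hf_hub_download(
    repo_id="NLPark/Mistral-22B-v0.2-GGUF",
    filename="Mistral-22B-v0.2-Q4_K_M.gguf",
)

# n_gpu_layers=-1 offloads all layers to the GPU if one is available.
llm = Llama(model_path=model_path, n_ctx=4096, n_gpu_layers=-1)
out = llm("Explain GGUF quantization in one sentence.", max_tokens=128)
print(out["choices"][0]["text"])
```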