@@ -61,6 +61,22 @@ Through systematic experiments to determine the weights of different languages,
 The approach boosts their performance on SEA languages while maintaining proficiency in English and Chinese without significant compromise.
 Finally, we continually pre-train the Qwen1.5-0.5B model with 400 billion tokens, and the other models with 200 billion tokens, to obtain the Sailor models.
+### GGUF model list
+For this 0.5B model, we recommend only the 8-bit and 16-bit GGUF models for most purposes❗
+| Name                                                         | Quant method | Bits | Size    | Use case                                                     |
+| ------------------------------------------------------------ | ------------ | ---- | ------- | ------------------------------------------------------------ |
+| [ggml-model-Q2_K.gguf](https://huggingface.co/sail/Sailor-0.5B-Chat-gguf/blob/main/ggml-model-Q2_K.gguf) | Q2_K         | 2    | 298 MB  | smallest, significant quality loss |
+| [ggml-model-Q3_K_L.gguf](https://huggingface.co/sail/Sailor-0.5B-Chat-gguf/blob/main/ggml-model-Q3_K_L.gguf) | Q3_K_L       | 3    | 364 MB  | small, substantial quality loss                              |
+| [ggml-model-Q3_K_M.gguf](https://huggingface.co/sail/Sailor-0.5B-Chat-gguf/blob/main/ggml-model-Q3_K_M.gguf) | Q3_K_M       | 3    | 350 MB  | very small, balanced quality                                 |
+| [ggml-model-Q3_K_S.gguf](https://huggingface.co/sail/Sailor-0.5B-Chat-gguf/blob/main/ggml-model-Q3_K_S.gguf) | Q3_K_S       | 3    | 333 MB  | very small, high quality loss                                |
+| [ggml-model-Q4_K_M.gguf](https://huggingface.co/sail/Sailor-0.5B-Chat-gguf/blob/main/ggml-model-Q4_K_M.gguf) | Q4_K_M       | 4    | 407 MB  | small, balanced quality                                      |
+| [ggml-model-Q4_K_S.gguf](https://huggingface.co/sail/Sailor-0.5B-Chat-gguf/blob/main/ggml-model-Q4_K_S.gguf) | Q4_K_S       | 4    | 397 MB  | very small, greater quality loss                             |
+| [ggml-model-Q5_K_M.gguf](https://huggingface.co/sail/Sailor-0.5B-Chat-gguf/blob/main/ggml-model-Q5_K_M.gguf) | Q5_K_M       | 5    | 459 MB  | small, balanced quality                                      |
+| [ggml-model-Q5_K_S.gguf](https://huggingface.co/sail/Sailor-0.5B-Chat-gguf/blob/main/ggml-model-Q5_K_S.gguf) | Q5_K_S       | 5    | 453 MB  | small, very low quality loss                                 |
+| [ggml-model-Q6_K.gguf](https://huggingface.co/sail/Sailor-0.5B-Chat-gguf/blob/main/ggml-model-Q6_K.gguf) | Q6_K         | 6    | 515 MB  | small, extremely low quality loss                            |
+| [ggml-model-Q8_0.gguf](https://huggingface.co/sail/Sailor-0.5B-Chat-gguf/blob/main/ggml-model-Q8_0.gguf) | Q8_0         | 8    | 665 MB  | small, extremely low quality loss                            |
+| [ggml-model-f16.gguf](https://huggingface.co/sail/Sailor-0.5B-Chat-gguf/blob/main/ggml-model-f16.gguf) | f16          | 16   | 1.25 GB | original size, no quality loss                               |
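The links in the table open each file's page on Hugging Face. As a minimal sketch of fetching one of the recommended files directly, the snippet below builds the download URL for the 8-bit model. It assumes the standard Hugging Face `resolve/main` download path; the variable names (`REPO`, `QUANT`, `FILE`, `URL`) are illustrative and not part of this repository.

```shell
# Repository and quantization to fetch (Q8_0 is one of the two recommended variants).
REPO="sail/Sailor-0.5B-Chat-gguf"
QUANT="Q8_0"

# File names in the table follow the pattern ggml-model-<quant>.gguf.
FILE="ggml-model-${QUANT}.gguf"

# Assumption: "resolve/main" serves the raw file, while "blob/main" is the viewer page.
URL="https://huggingface.co/${REPO}/resolve/main/${FILE}"

echo "Download URL: ${URL}"

# Uncomment to download (requires curl and network access):
# curl -L -o "${FILE}" "${URL}"
```

Setting `QUANT` to `f16` selects the 16-bit file instead.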
 ### How to run with `llama.cpp`
 ```shell