Update README.md
README.md
@@ -61,6 +61,22 @@ Through systematic experiments to determine the weights of different languages,
The approach boosts their performance on SEA languages while maintaining proficiency in English and Chinese without significant compromise.
Finally, we continually pre-train the Qwen1.5-0.5B model with 400 billion tokens, and the other models with 200 billion tokens, to obtain the Sailor models.

### GGUF model list

For this 0.5B model, we recommend only the 8-bit and 16-bit GGUF models for most purposes❗

| Name | Quant method | Bits | Size | Use case |
| --- | --- | --- | --- | --- |
| [ggml-model-Q2_K.gguf](https://huggingface.co/sail/Sailor-0.5B-Chat-gguf/blob/main/ggml-model-Q2_K.gguf) | Q2_K | 2 | 298 MB | smallest, significant quality loss |
| [ggml-model-Q3_K_L.gguf](https://huggingface.co/sail/Sailor-0.5B-Chat-gguf/blob/main/ggml-model-Q3_K_L.gguf) | Q3_K_L | 3 | 364 MB | small, substantial quality loss |
| [ggml-model-Q3_K_M.gguf](https://huggingface.co/sail/Sailor-0.5B-Chat-gguf/blob/main/ggml-model-Q3_K_M.gguf) | Q3_K_M | 3 | 350 MB | very small, balanced quality |
| [ggml-model-Q3_K_S.gguf](https://huggingface.co/sail/Sailor-0.5B-Chat-gguf/blob/main/ggml-model-Q3_K_S.gguf) | Q3_K_S | 3 | 333 MB | very small, high quality loss |
| [ggml-model-Q4_K_M.gguf](https://huggingface.co/sail/Sailor-0.5B-Chat-gguf/blob/main/ggml-model-Q4_K_M.gguf) | Q4_K_M | 4 | 407 MB | small, balanced quality |
| [ggml-model-Q4_K_S.gguf](https://huggingface.co/sail/Sailor-0.5B-Chat-gguf/blob/main/ggml-model-Q4_K_S.gguf) | Q4_K_S | 4 | 397 MB | very small, greater quality loss |
| [ggml-model-Q5_K_M.gguf](https://huggingface.co/sail/Sailor-0.5B-Chat-gguf/blob/main/ggml-model-Q5_K_M.gguf) | Q5_K_M | 5 | 459 MB | small, balanced quality |
| [ggml-model-Q5_K_S.gguf](https://huggingface.co/sail/Sailor-0.5B-Chat-gguf/blob/main/ggml-model-Q5_K_S.gguf) | Q5_K_S | 5 | 453 MB | small, very low quality loss |
| [ggml-model-Q6_K.gguf](https://huggingface.co/sail/Sailor-0.5B-Chat-gguf/blob/main/ggml-model-Q6_K.gguf) | Q6_K | 6 | 515 MB | small, extremely low quality loss |
| [ggml-model-Q8_0.gguf](https://huggingface.co/sail/Sailor-0.5B-Chat-gguf/blob/main/ggml-model-Q8_0.gguf) | Q8_0 | 8 | 665 MB | small, extremely low quality loss |
| [ggml-model-f16.gguf](https://huggingface.co/sail/Sailor-0.5B-Chat-gguf/blob/main/ggml-model-f16.gguf) | f16 | 16 | 1.25 GB | original size, no quality loss |

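To fetch one of the files listed above from the command line, something like the following may work. This is a minimal sketch, not part of the original README: it assumes the Hub serves the raw file when `/blob/` in the table links is replaced with `/resolve/`, and the `QUANT` and `DOWNLOAD` variables are illustrative names chosen here.

```shell
# Sketch: build the direct-download URL for one GGUF file from the table.
# Assumption: raw files are served under /resolve/main/ rather than /blob/main/.
REPO="sail/Sailor-0.5B-Chat-gguf"
QUANT="${QUANT:-Q8_0}"            # illustrative variable; Q8_0 or f16 are the recommended choices
FILE="ggml-model-${QUANT}.gguf"
URL="https://huggingface.co/${REPO}/resolve/main/${FILE}"

echo "$URL"

# Only hit the network when explicitly asked to (DOWNLOAD is a hypothetical switch).
if [ "${DOWNLOAD:-0}" = "1" ]; then
  curl --fail -L -o "$FILE" "$URL"
fi
```

Running it with `DOWNLOAD=1 QUANT=f16` would attempt to download the 16-bit file instead of the 8-bit default.
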
### How to run with `llama.cpp`

```shell