feihu.hf committed
Commit · edc3bdc
Parent(s): 28409f2
update config.json
README.md CHANGED
@@ -34,7 +34,8 @@ Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (
 - Number of Parameters (Non-Embedding): 1.31B
 - Number of Layers: 28
 - Number of Attention Heads (GQA): 12 for Q and 2 for KV
-
+- Context Length: Full 32,768 tokens
+- Note: Currently, only vLLM supports YaRN for length extrapolation. If you want to process sequences up to 131,072 tokens, please refer to the non-GGUF models.
 - Quantization: q2_K, q3_K_M, q4_0, q4_K_M, q5_0, q5_K_M, q6_K, q8_0
 
 For more details, please refer to our [blog](https://qwenlm.github.io/blog/qwen2.5-coder/), [GitHub](https://github.com/QwenLM/Qwen2.5-Coder), and [Documentation](https://qwen.readthedocs.io/en/latest/).
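The added note says YaRN-based length extrapolation to 131,072 tokens is only available through the non-GGUF models served with vLLM. As a point of reference, and matching this commit's subject line ("update config.json"), the Qwen2.5 model cards document enabling YaRN in the non-GGUF checkpoints by adding a `rope_scaling` entry to `config.json`. A minimal sketch follows; the factor of 4.0 assumes the 32,768-token native context scaled to 131,072 tokens, so adjust it if your target length differs.

```json
{
  "rope_scaling": {
    "factor": 4.0,
    "original_max_position_embeddings": 32768,
    "type": "yarn"
  }
}
```

This entry applies to the Transformers/vLLM `config.json` of the non-GGUF checkpoints; the GGUF files listed above carry their own embedded metadata and remain at the full 32,768-token context.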