feihu.hf committed on
Commit edc3bdc · 1 Parent(s): 28409f2

update config.json

Files changed (1)
  1. README.md +2 -1
README.md CHANGED
@@ -34,7 +34,8 @@ Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (
 - Number of Parameters (Non-Embedding): 1.31B
 - Number of Layers: 28
 - Number of Attention Heads (GQA): 12 for Q and 2 for KV
-{{GGUF_LONG_SUMMARY}}
+- Context Length: Full 32,768 tokens
+- Note: Currently, only vLLM supports YaRN for length extrapolation. If you want to process sequences up to 131,072 tokens, please refer to the non-GGUF models.
 - Quantization: q2_K, q3_K_M, q4_0, q4_K_M, q5_0, q5_K_M, q6_K, q8_0

 For more details, please refer to our [blog](https://qwenlm.github.io/blog/qwen2.5-coder/), [GitHub](https://github.com/QwenLM/Qwen2.5-Coder), and [Documentation](https://qwen.readthedocs.io/en/latest/).
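The GQA figures in the diff (12 query heads sharing 2 KV heads across 28 layers) determine how much memory the KV cache needs at the full 32,768-token context. A minimal sketch of that arithmetic follows; the head dimension of 128 is an assumption (typical for Qwen2.5-family models), not something stated in this diff.

```python
# Estimate fp16 KV-cache size from the model-card numbers above.
# Layer and head counts come from the README diff; head_dim = 128 is assumed.
num_layers = 28
num_q_heads = 12      # query heads
num_kv_heads = 2      # GQA: 12 query heads share 2 KV heads
head_dim = 128        # assumption, not in the model card
bytes_per_value = 2   # fp16
context_len = 32768   # full context length from the README

# Each token stores one K and one V vector per layer per KV head.
kv_bytes_per_token = 2 * num_layers * num_kv_heads * head_dim * bytes_per_value
total_gib = kv_bytes_per_token * context_len / 2**30
print(f"{kv_bytes_per_token} bytes/token, {total_gib:.2f} GiB at full context")
# GQA shrinks the cache by num_q_heads / num_kv_heads = 6x vs. caching all 12 heads
```

Under these assumptions, the cache is about 28 KiB per token, i.e. under 1 GiB even at the full 32K context, which is why GQA with few KV heads matters for long-context inference on modest hardware.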