Qwen/QwQ-32B-Preview

eltociear committed 25 days ago

Commit 1b83e26 • 1 Parent(s): 1032e81

docs: update README.md

Paramaters -> Parameters

Files changed (1)

README.md +1 -1

README.md CHANGED

@@ -25,7 +25,7 @@ library_name: transformers
 - Training Stage: Pretraining & Post-training
 - Architecture: transformers with RoPE, SwiGLU, RMSNorm, and Attention QKV bias
 - Number of Parameters: 32.5B
-- Number of Paramaters (Non-Embedding): 31.0B
+- Number of Parameters (Non-Embedding): 31.0B
 - Number of Layers: 64
 - Number of Attention Heads (GQA): 40 for Q and 8 for KV
 - Context Length: Full 32,768 tokens
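
For anyone sanity-checking the figures in the touched list, here is a minimal Python sketch of the arithmetic they imply. The variable names are illustrative and do not come from the README. The values are copied from the list as rounded there, so the results are approximate.

```python
# Values copied from the README list shown in the diff; names are illustrative.
total_params_b = 32.5          # Number of Parameters, in billions
non_embedding_params_b = 31.0  # Number of Parameters (Non-Embedding), in billions
num_q_heads = 40               # attention heads for Q (GQA)
num_kv_heads = 8               # attention heads for KV (GQA)

# Parameters left over for embeddings: total minus non-embedding.
embedding_params_b = total_params_b - non_embedding_params_b

# With grouped-query attention, query heads are split evenly across KV heads.
assert num_q_heads % num_kv_heads == 0
q_heads_per_kv_head = num_q_heads // num_kv_heads

print(f"embedding parameters: about {embedding_params_b:.1f}B")
print(f"query heads per KV head: {q_heads_per_kv_head}")
```

The commit itself changes only the spelling of one label, so none of these numbers differ between the old and new versions.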