Update README.md
README.md CHANGED
```diff
@@ -29,6 +29,7 @@ pipeline_tag: text-generation
 - **Layers**: 36
 - **Attention Heads (GQA)**: 24 for Q, 4 for KV
 - **Context Length**: Supports a full context of 131,072 tokens and generation of up to 8,192 tokens
+- **Quantization**: AWQ 4-bit

 ### Requirements
 The code of Qwen2.5 is included in the latest Hugging Face transformers, and we advise you to use the latest version of transformers.
```
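The GQA and context-length figures in the bullet list above determine the KV-cache footprint at inference time. A rough sizing sketch, assuming a per-head dimension of 128 (typical for this model class, but an assumption here, not stated on the card) and 16-bit cache entries:

```python
# Rough KV-cache sizing from the architecture figures above.
# head_dim = 128 is an assumption for illustration; the other numbers
# come from the model card.
num_layers = 36
num_kv_heads = 4      # GQA: only the KV heads are cached
num_q_heads = 24
head_dim = 128        # assumed per-head dimension
context_len = 131_072
bytes_per_elem = 2    # fp16/bf16 cache entries

# One K and one V tensor per layer, each [num_kv_heads, context_len, head_dim]
kv_cache_bytes = num_layers * num_kv_heads * head_dim * context_len * 2 * bytes_per_elem
# The same cache under full multi-head attention (24 KV heads instead of 4)
mha_cache_bytes = num_layers * num_q_heads * head_dim * context_len * 2 * bytes_per_elem

print(f"GQA KV cache at full context: {kv_cache_bytes / 2**30:.1f} GiB")  # 9.0 GiB
print(f"Reduction vs. full MHA: {mha_cache_bytes // kv_cache_bytes}x")    # 6x
```

The 6x ratio is simply 24 query heads sharing 4 KV heads: only KV heads contribute to the cache, which is what makes the 131,072-token context practical.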
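The version advice in the Requirements section can be made concrete: Qwen2-family architectures were added to Hugging Face transformers in 4.37.0, and earlier versions fail with `KeyError: 'qwen2'` when loading the config. A minimal sketch of that version gate (the helper names below are illustrative, not part of any library):

```python
# Illustrative version gate for the requirement above: Qwen2-family models
# need transformers >= 4.37.0; earlier versions raise KeyError: 'qwen2'.
def version_tuple(version: str) -> tuple:
    """Parse 'X.Y.Z' into a comparable tuple of ints."""
    return tuple(int(part) for part in version.split("."))

def supports_qwen2(transformers_version: str) -> bool:
    """True if this transformers version can load Qwen2/Qwen2.5 configs."""
    return version_tuple(transformers_version) >= (4, 37, 0)

print(supports_qwen2("4.36.2"))  # False -> would raise KeyError: 'qwen2'
print(supports_qwen2("4.45.0"))  # True
```

In practice, simply installing the latest release (`pip install -U transformers`) satisfies this check.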