Update README.md
README.md CHANGED
@@ -30,7 +30,7 @@ pipeline_tag: text-generation
 - **Training Stages**: Pretraining & Instruction Tuning
 - **Architecture**: Transformer with RoPE, SwiGLU, RMSNorm, and Attention QKV bias
 - **Layers**: 64
-- **Attention Heads (GQA)**:
+- **Attention Heads (GQA)**: 16 for Q and 2 for KV
 - **Context Length**: Supports a full context of 131,072 tokens and generation of up to 8,192 tokens
 - **Quantization**: AWQ 4 bit
 - **Base model**: Qwen/Qwen2.5-3B-Instruct-AWQ
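The `+` line above fills in the previously missing GQA head counts: 16 query heads attend over 2 shared key/value heads, so each KV head serves a group of 8 query heads. As a rough illustration (not code from this repository; all names and shapes here are hypothetical), grouped-query attention with that layout can be sketched in plain PyTorch:

```python
import torch

# Hypothetical sketch of the GQA layout described above: 16 query heads
# share 2 key/value heads, i.e. 8 query heads per KV head.
batch, seq_len, head_dim = 1, 8, 128
n_q_heads, n_kv_heads = 16, 2
group = n_q_heads // n_kv_heads  # 8 query heads per KV head

q = torch.randn(batch, n_q_heads, seq_len, head_dim)
k = torch.randn(batch, n_kv_heads, seq_len, head_dim)
v = torch.randn(batch, n_kv_heads, seq_len, head_dim)

# Expand each KV head across its query-head group before attention,
# which is roughly how GQA implementations broadcast the shared KV heads.
k = k.repeat_interleave(group, dim=1)  # -> (1, 16, seq_len, head_dim)
v = v.repeat_interleave(group, dim=1)

scores = (q @ k.transpose(-2, -1)) / head_dim**0.5
out = torch.softmax(scores, dim=-1) @ v  # (1, 16, seq_len, head_dim)
```

The win is memory: only 2 KV heads need to be cached during generation instead of 16, which matters at the 131,072-token context this model supports.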
@@ -109,7 +109,7 @@ def generate(messages):
     generated_text = tokenizer.decode(output[0], skip_special_tokens=False)#.split('<|im_start|>assistant')[1]
     return generated_text
 
-model_name = 'FractalGPT/RuQwen2.5-
+model_name = 'FractalGPT/RuQwen2.5-3B-Instruct-AWQ'
 model = Qwen2ForCausalLMWithBias.from_pretrained(model_name, torch_dtype=torch.float16)
 tokenizer = AutoTokenizer.from_pretrained(model_name)
 
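This hunk completes the previously truncated `model_name` string so the README's loading snippet actually runs. For completeness, a hedged usage sketch: it assumes the `generate(messages)` helper shown in the hunk header, along with the `model` and `tokenizer` globals it relies on, are already defined as in the full README (only part of that code is visible in this diff); the prompt string is illustrative.

```python
# Assumes the README's generate(messages) helper and the model/tokenizer
# globals it uses are defined as in the snippet above.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Briefly describe what AWQ 4-bit quantization does."},
]
# Prints the decoded completion; special tokens are kept because the
# helper decodes with skip_special_tokens=False.
print(generate(messages))
```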