Update README.md
README.md CHANGED
@@ -29,8 +29,8 @@ pipeline_tag: text-generation
  - **Type**: Instruction-tuned Causal Language Model
  - **Training Stages**: Pretraining & Instruction Tuning
  - **Architecture**: Transformer with RoPE, SwiGLU, RMSNorm, and Attention QKV bias
- - **Layers**:
- - **Attention Heads (GQA)**:
+ - **Layers**: 64
+ - **Attention Heads (GQA)**: 40 for Q and 8 for KV
  - **Context Length**: Supports a full context of 131,072 tokens and generation of up to 8,192 tokens
  - **Quantization**: AWQ 4 bit
  - **Base model**: Qwen/Qwen2.5-3B-Instruct-AWQ

@@ -109,7 +109,7 @@ def generate(messages):
      generated_text = tokenizer.decode(output[0], skip_special_tokens=False)#.split('<|im_start|>assistant')[1]
      return generated_text

- model_name = 'FractalGPT/RuQwen2.5-
+ model_name = 'FractalGPT/RuQwen2.5-32B-Instruct-AWQ'
  model = Qwen2ForCausalLMWithBias.from_pretrained(model_name, torch_dtype=torch.float16)
  tokenizer = AutoTokenizer.from_pretrained(model_name)
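The architecture figures added in the first hunk (64 layers, 40 query heads and 8 key/value heads for GQA, 131,072-token context) can be sanity-checked against the repository's `config.json` without downloading the weights. A minimal sketch, assuming the repo id shown in the diff and the standard transformers/Qwen2 config field names (`num_hidden_layers`, `num_attention_heads`, `num_key_value_heads`, `max_position_embeddings`):

```python
from transformers import AutoConfig

# Fetch only the configuration, not the AWQ weights.
config = AutoConfig.from_pretrained("FractalGPT/RuQwen2.5-32B-Instruct-AWQ")

print(config.num_hidden_layers)        # README claims 64
print(config.num_attention_heads)      # README claims 40 query heads (GQA)
print(config.num_key_value_heads)      # README claims 8 key/value heads (GQA)
print(config.max_position_embeddings)  # README claims a 131,072-token context
```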
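The second hunk shows only the tail of the README's `generate(messages)` helper plus the model/tokenizer loading lines; `Qwen2ForCausalLMWithBias` and the rest of `generate()` are defined earlier in the README and are outside this diff. The sketch below shows one plausible way to prompt the loaded model using the standard transformers chat-template API rather than the README's own helper; the message contents are illustrative and not taken from the diff.

```python
import torch

# Assumes `model` and `tokenizer` were loaded as in the snippet above.
# apply_chat_template / generate are standard transformers APIs; the
# messages are an illustrative example, not part of the original README.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Briefly describe what 4-bit AWQ quantization does."},
]

prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=256)

print(tokenizer.decode(output[0], skip_special_tokens=True))
```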