Update README.md
README.md CHANGED

@@ -29,7 +29,7 @@ pipeline_tag: text-generation
 - **Type**: Instruction-tuned Causal Language Model
 - **Training Stages**: Pretraining & Instruction Tuning
 - **Architecture**: Transformer with RoPE, SwiGLU, RMSNorm, and Attention QKV bias
-- **Layers**:
+- **Layers**: 36
 - **Attention Heads (GQA)**: 16 for Q and 2 for KV
 - **Context Length**: Supports a full context of 131,072 tokens and generation of up to 8,192 tokens
 - **Quantization**: AWQ 4 bit
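The card in this hunk lists grouped-query attention with 16 query heads sharing 2 key/value heads. A minimal sketch of what that head sharing means (the names and the mapping below are illustrative, not taken from the diff):

```python
# Grouped-Query Attention (GQA) head sharing as listed in the card:
# 16 query heads, 2 key/value heads. Each KV head is shared by a
# contiguous group of query heads, so the group size is 16 // 2 = 8.
NUM_Q_HEADS = 16   # "16 for Q" in the card
NUM_KV_HEADS = 2   # "2 for KV" in the card

group_size = NUM_Q_HEADS // NUM_KV_HEADS

# Map each query head index to the KV head whose keys/values it reuses.
kv_head_for_q = [q_head // group_size for q_head in range(NUM_Q_HEADS)]

print(group_size)     # 8
print(kv_head_for_q)  # [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1]
```

Sharing the KV heads this way shrinks the KV cache by 8x versus full multi-head attention, which is what makes the 131,072-token context listed above practical at inference time.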