---
library_name: transformers
tags:
- mergekit
- merge
license: mit
---

# ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M

**ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M** is a custom merged language model based on **Qwen2.5-7B** with enhanced reasoning, roleplaying, and long-context capabilities. This model supports up to **1 million token** context lengths, making it ideal for ultra-long text processing, deep reasoning tasks, and immersive roleplay interactions.

---

## 🧠 **Model Details**

- **Base Model**: `Qwen/Qwen2.5-7B-Instruct-1M`
- **Models Used in Merge**:
  - `Qwen/Qwen2.5-7B-Instruct-1M`
  - `bunnycore/Qwen2.5-7B-RRP-1M`
  - `Triangle104/Q2.5-Instruct-1M_Harmony`
- **Merge Method**: `MODEL_STOCK` (optimized layer-wise weight averaging)

---

## **Overview**

**Qwen2.5-7B-CelestialHarmony-1M** enhances the **Qwen2.5-7B series** with a fine-tuned balance of roleplaying dynamics, structured reasoning, and long-context memory. The model is particularly well-suited for:

- **Roleplaying** 🧙‍♂️: Immersive character-based storytelling with deep contextual awareness.
- **Reasoning & Thought Processing** 🧠: Capable of structured logical thinking, especially when prompted with `<think>` tags.
- **Ultra-Long Context Handling**: Efficient processing of sequences up to **1,010,000 tokens** using optimized sparse attention.

---

## ⚙️ **Technical Specifications**

| Specification | Value |
|---------------|-------|
| **Model Type** | Causal Language Model |
| **Parameters** | 7.61B |
| **Non-Embedding Parameters** | 6.53B |
| **Layers** | 28 |
| **Attention Heads (GQA)** | 28 (Q), 4 (KV) |
| **Max Context Length** | 1,010,000 tokens |
| **Max Generation Length** | 8,192 tokens |
| **Merge Method** | Model Stock |
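
If you want to sanity-check these numbers against the shipped checkpoint, they can be read from its configuration. This is a minimal sketch assuming the model keeps the standard Qwen2 config field names; verify against the checkpoint's `config.json` if anything differs.

```python
from transformers import AutoConfig

# Assumes standard Qwen2 configuration field names.
cfg = AutoConfig.from_pretrained("ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M")

print("layers:", cfg.num_hidden_layers)           # expected 28
print("query heads:", cfg.num_attention_heads)    # expected 28 (Q)
print("kv heads:", cfg.num_key_value_heads)       # expected 4 (KV)
print("max positions:", cfg.max_position_embeddings)
```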

---

## 🔬 **Merging Details**

This model was merged using the **Model Stock** method, which averages the weights of multiple fine-tuned models layer by layer to produce a more balanced and performant model.
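
For intuition only, the sketch below shows a plain uniform layer-wise average of two checkpoints. This is *not* the actual Model Stock algorithm, which (as implemented in mergekit) derives per-layer interpolation ratios from the geometry between each fine-tuned model and the base model; the checkpoint list and output path here are illustrative.

```python
# Simplified illustration: a uniform layer-wise average of two checkpoints.
# The real Model Stock method chooses per-layer interpolation weights instead
# of averaging uniformly. Loading two 7B models needs substantial RAM.
import torch
from transformers import AutoModelForCausalLM

checkpoints = [
    "Qwen/Qwen2.5-7B-Instruct-1M",
    "bunnycore/Qwen2.5-7B-RRP-1M",
]

models = [
    AutoModelForCausalLM.from_pretrained(name, torch_dtype=torch.bfloat16)
    for name in checkpoints
]

merged = models[0]
with torch.no_grad():
    for key, param in merged.state_dict().items():
        if not torch.is_floating_point(param):
            continue  # skip non-float buffers
        # Average each parameter tensor across all source checkpoints.
        param.copy_(torch.stack([m.state_dict()[key] for m in models]).mean(dim=0))

merged.save_pretrained("./uniform-average-sketch")  # placeholder output path
```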

### **Merge YAML Configuration**

```yaml
base_model: Qwen/Qwen2.5-7B-Instruct-1M
dtype: bfloat16
merge_method: model_stock
models:
  - model: bunnycore/Qwen2.5-7B-RRP-1M
  - model: huihui-ai/Qwen2.5-7B-Instruct-1M-abliterated
tokenizer_source: Qwen/Qwen2.5-7B-Instruct-1M
```
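
To reproduce a merge like this locally, the configuration above can be passed to mergekit's command-line entry point. A rough sketch, assuming mergekit is installed from PyPI (or from source) and the YAML is saved as `config.yaml`; `./merged-model` is a placeholder output directory.

```bash
pip install mergekit
# Runs the merge described in config.yaml; drop --cuda to merge on CPU.
mergekit-yaml config.yaml ./merged-model --cuda
```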

---

## **Quickstart**

### **Install Required Packages**

Ensure you have the latest `transformers` library installed:

```bash
pip install transformers torch accelerate
```

### **Load and Use the Model**

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Tell me a short story about an ancient celestial warrior."
messages = [
    {"role": "system", "content": "You are a wise celestial storyteller."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(**model_inputs, max_new_tokens=512)
# Strip the prompt tokens so only the newly generated answer is decoded.
generated_ids = [
    output_ids[len(input_ids):]
    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]

print(response)
```

---

## ⚡ **Optimized Deployment with vLLM**

For long-context inference, use **vLLM**:

```bash
git clone -b dev/dual-chunk-attn git@github.com:QwenLM/vllm.git
cd vllm
pip install -e . -v
```

Run the model:

```bash
vllm serve ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M \
  --tensor-parallel-size 4 \
  --max-model-len 1010000 \
  --enable-chunked-prefill --max-num-batched-tokens 131072 \
  --enforce-eager \
  --max-num-seqs 1
```
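
Once the server is running, it exposes an OpenAI-compatible API (on port 8000 by default), so any OpenAI-style client can query it. A minimal sketch, assuming the default host and port:

```python
from openai import OpenAI

# vLLM's OpenAI-compatible server; the API key is unused but required by the client.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M",
    messages=[
        {"role": "system", "content": "You are a wise celestial storyteller."},
        {"role": "user", "content": "Tell me a short story about an ancient celestial warrior."},
    ],
    max_tokens=512,
)
print(response.choices[0].message.content)
```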

---

## 🎯 **Model Capabilities**

✅ **Roleplay & Storytelling** – Designed for engaging, character-driven interactions.
✅ **Long-Context Awareness** – Handles texts up to **1M tokens**.
✅ **Logical Thinking & Reasoning** – Supports the `<think>` tag to structure its thought process (see the prompt sketch below).
✅ **Optimized Merge Strategy** – Uses `Model Stock` for improved generalization.
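
One way to nudge the model toward explicit reasoning is simply to ask for it in the system prompt and let it wrap its reasoning in `<think>` tags. This is an illustrative prompt only, reusing the `tokenizer` from the quickstart above; the `<think>` convention is inherited from the reasoning-focused models in the merge, so treat it as a hint rather than a guaranteed behavior.

```python
# Illustrative prompting pattern; output quality and tag usage may vary.
messages = [
    {
        "role": "system",
        "content": "Think through the problem inside <think>...</think> tags, then answer concisely.",
    },
    {"role": "user", "content": "If a caravan covers 42 km per day, how far does it travel in 9 days?"},
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
```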

---

## **Acknowledgments**

This model is built on top of **Qwen2.5-7B**, with contributions from **bunnycore, Triangle104, and Sakalti**, leveraging the **Model Stock** merging methodology.

For further details, see:

- [Qwen2.5-1M Technical Report](https://arxiv.org/abs/2501.15383)
- [MergeKit Documentation](https://github.com/arcee-ai/mergekit)
- [vLLM for Long-Context Inference](https://github.com/QwenLM/vllm)

---
|