DatToad committed
Commit e69b5ad · verified · 1 Parent(s): 134a6b5

Update README.md

Files changed (1)
  1. README.md +70 -42
README.md CHANGED
@@ -1,42 +1,70 @@
- ---
- base_model: []
- library_name: transformers
- tags:
- - mergekit
- - merge
-
- ---
- # Chuluun8
-
- This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
-
- ## Merge Details
- ### Merge Method
-
- This model was merged using the [Model Stock](https://arxiv.org/abs/2403.19522) merge method using C:\FP16\Tess as a base.
-
- ### Models Merged
-
- The following models were included in the merge:
- * C:\FP16\Kunou
- * C:\FP16\Stink
- * C:\FP16\EVA-72B
- * C:\FP16\Magnum
-
- ### Configuration
-
- The following YAML configuration was used to produce this model:
-
- ```yaml
- models:
-   - model: C:\FP16\Magnum
-   - model: C:\FP16\EVA-72B
-   - model: C:\FP16\Kunou
-   - model: C:\FP16\Stink
- merge_method: model_stock
- base_model: C:\FP16\Tess
- parameters:
-   filter_wise: false
- dtype: float16
- name: C:\FP16\Chuluun8
- ```
+ ---
+ base_model:
+ - EVA-UNIT-01/EVA-Qwen2.5-72B-v0.2
+ library_name: transformers
+ tags:
+ - mergekit
+ - merge
+
+ ---
+ # Chuluun-Qwen2.5-72B-v0.08
+
+ ![image/png](https://huggingface.co/DatToad/Chuluun-Qwen2.5-72B-v0.08/resolve/main/Chuluun8.png)
+
+ This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
+
+ I re-ran the original Chuluun formula, this time including the newly released Ink from Allura-Org. I've found the addition gives the model a lot more variability, likely because of the aggressive de-slop applied to its dataset. Sometimes this means a word choice will be strange and you'll want to edit manually, but it also means you'll see fewer ministrations sparkling with mischief.
+
+ Because of this, the best way to approach the model is to run multiple regens, choose the one you like, edit mercilessly, and continue. Like the original Chuluun, this variant is very steerable for complex storywriting and RP. It's probably also a little spicier than v0.01, with both Magnum and whatever the heck Fizz threw into the data for Ink.
+
+ I've also been hearing praise for a level of character intelligence not seen in other models, including Largestral finetunes and merges. I'm not about to claim any model of mine is smarter - using Tess as the base was a dumb idea, and somehow it worked.
+
+ # Tips for effective use
+
+ As with all writing-focused models, balancing intelligence with creativity is tricky. If this one seems to understand some details but not others, try v0.01; overall I think this model is more creative but a little less coherent. If v0.08 is a little too chaotic for your tastes, consider starting with v0.01 and switching to this model mid-story if things get stale.
+
+ All the models within the merge use ChatML format, so you'll want to use it too (a sketch of the template follows below). Use [Konnect's Qwenception](https://huggingface.co/Konnect1221/The-Inception-Presets-Methception-LLamaception-Qwenception) prompt or whatever you prefer; the model seems to do fine with any decent sysprompt. Lower temps than with v0.01 are suggested because of Ink's de-slopped dataset: testers reported anywhere between 1 and 1.2 as a baseline, but plan to adjust to taste. Consider dynatemp for this model as well. If dialogue gets repetitive, that's usually a sign you need more temp.
+
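+ For reference, this is the ChatML turn structure these models expect; the braced fields are placeholders:
+
+ ```
+ <|im_start|>system
+ {system prompt}<|im_end|>
+ <|im_start|>user
+ {user message}<|im_end|>
+ <|im_start|>assistant
+ ```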
+
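+ And for concreteness, a minimal generation sketch at those settings, assuming the `transformers` library (plus `accelerate` for `device_map="auto"`) and hardware that can hold a 72B model; the messages and the 1.1 temperature are placeholders to adjust:
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_id = "DatToad/Chuluun-Qwen2.5-72B-v0.08"
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForCausalLM.from_pretrained(
+     model_id, device_map="auto", torch_dtype="auto"
+ )
+
+ # apply_chat_template renders the ChatML turns shown above
+ messages = [
+     {"role": "system", "content": "You are a collaborative storyteller."},  # placeholder
+     {"role": "user", "content": "Continue the scene."},  # placeholder
+ ]
+ inputs = tokenizer.apply_chat_template(
+     messages, add_generation_prompt=True, return_tensors="pt"
+ ).to(model.device)
+
+ # testers reported ~1.0-1.2 as a baseline; raise it if dialogue gets repetitive
+ output = model.generate(inputs, max_new_tokens=512, do_sample=True, temperature=1.1)
+ print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
+ ```
+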
+ ## Merge Details
+ ### Merge Method
+
+ This model was merged using the [Model Stock](https://arxiv.org/abs/2403.19522) merge method, with migtissera/Tess-v2.5.2-Qwen2-72B as the base. (Roughly, Model Stock averages the merged models' weights and interpolates the result back toward the base.)
+
+ ### Models Merged
+
+ The following models were included in the merge:
+ * EVA-UNIT-01/EVA-Qwen2.5-72B-v0.2
+ * Sao10K/72B-Qwen2.5-Kunou-v1
+ * anthracite-org/magnum-v4-72b
+ * allura-org/Qwen2.5-72b-RP-Ink
+
+ ### Configuration
+
+ The following YAML configuration was used to produce this model:
+
+ ```yaml
+ models:
+   - model: EVA-UNIT-01/EVA-Qwen2.5-72B-v0.2
+   - model: Sao10K/72B-Qwen2.5-Kunou-v1
+   - model: anthracite-org/magnum-v4-72b
+   - model: allura-org/Qwen2.5-72b-RP-Ink
+ merge_method: model_stock
+ base_model: migtissera/Tess-v2.5.2-Qwen2-72B
+ parameters:
+   filter_wise: false
+ dtype: float16
+ name: DatToad/Chuluun-Qwen2.5-72B-v0.08
+ ```
+
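+ If you want to reproduce the merge, here's a sketch using mergekit's Python API; the `chuluun8.yml` filename is hypothetical (the YAML above saved to disk), and the options mirror mergekit's documented defaults:
+
+ ```python
+ import torch
+ import yaml
+
+ from mergekit.config import MergeConfiguration
+ from mergekit.merge import MergeOptions, run_merge
+
+ # parse the merge recipe shown above
+ with open("chuluun8.yml", "r", encoding="utf-8") as fp:
+     merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))
+
+ # run the model_stock merge and write the merged weights to disk
+ run_merge(
+     merge_config,
+     out_path="./Chuluun-Qwen2.5-72B-v0.08",
+     options=MergeOptions(cuda=torch.cuda.is_available(), copy_tokenizer=True),
+ )
+ ```
+
+ The `mergekit-yaml chuluun8.yml ./output-directory` CLI entry point should do the same job.
+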
+ ### Thank Yous!
+
+ Credit as always to the people who make the finetunes that go into this - they do the hard work, I just throw them in the blender! Most of them have Ko-fis; training isn't cheap, and their time is valuable too. Special thanks to these contributors:
+
+ - Everyone in Allura-Org and friends in Discord, for the EVA and Ink models, as well as all the support and mentoring that gave me the knowledge to make merges like this possible.
+
+ - Testers Geeechan and CURSE, for invaluable feedback, especially on optimal settings
+
+ - Quant support from scene legends Bartowski and MikeRoz
+
+ - All of you who have encouraged me and sent thanks and appreciation for this work. It wouldn't mean very much if I kept this to myself.