Elizezen committed (verified)
Commit 69ae844 · 1 Parent(s): dddd129

Update README.md

Files changed (1):
1. README.md +165 -109
README.md CHANGED
@@ -1,109 +1,165 @@
- ---
- base_model: []
- library_name: transformers
- tags:
- - mergekit
- - merge
-
- ---
- # final_merge
-
- This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
-
- ## Merge Details
- ### Merge Method
-
- This model was merged using the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method using evol_merge_storage\input_models\Antler7B_2159541861 as a base.
-
- ### Models Merged
-
- The following models were included in the merge:
- * evol_merge_storage\input_models\chatntq-ja-7b-v1.0-westlake_932715917
- * evol_merge_storage\input_models\antler-starling-08_4074283220
- * evol_merge_storage\input_models\Phos7b-RP_654656604
-
- ### Configuration
-
- The following YAML configuration was used to produce this model:
-
- ```yaml
- base_model: evol_merge_storage\input_models\Antler7B_2159541861
- dtype: bfloat16
- merge_method: dare_ties
- parameters:
-   int8_mask: 1.0
-   normalize: 1.0
- slices:
- - sources:
-   - layer_range: [0, 8]
-     model: evol_merge_storage\input_models\Phos7b-RP_654656604
-     parameters:
-       density: 0.584107666175788
-       weight: 0.47231634419785595
-   - layer_range: [0, 8]
-     model: evol_merge_storage\input_models\chatntq-ja-7b-v1.0-westlake_932715917
-     parameters:
-       density: 0.9357007414387093
-       weight: 0.25531843586626907
-   - layer_range: [0, 8]
-     model: evol_merge_storage\input_models\antler-starling-08_4074283220
-     parameters:
-       density: 0.9750447748820433
-       weight: 0.4753247646722287
-   - layer_range: [0, 8]
-     model: evol_merge_storage\input_models\Antler7B_2159541861
- - sources:
-   - layer_range: [8, 16]
-     model: evol_merge_storage\input_models\Phos7b-RP_654656604
-     parameters:
-       density: 0.8802238329444649
-       weight: 0.4482746205621599
-   - layer_range: [8, 16]
-     model: evol_merge_storage\input_models\chatntq-ja-7b-v1.0-westlake_932715917
-     parameters:
-       density: 1.0
-       weight: 0.5524329574915081
-   - layer_range: [8, 16]
-     model: evol_merge_storage\input_models\antler-starling-08_4074283220
-     parameters:
-       density: 1.0
-       weight: 0.22634815425570032
-   - layer_range: [8, 16]
-     model: evol_merge_storage\input_models\Antler7B_2159541861
- - sources:
-   - layer_range: [16, 24]
-     model: evol_merge_storage\input_models\Phos7b-RP_654656604
-     parameters:
-       density: 0.9921437573982935
-       weight: 0.44636209472148164
-   - layer_range: [16, 24]
-     model: evol_merge_storage\input_models\chatntq-ja-7b-v1.0-westlake_932715917
-     parameters:
-       density: 0.8757091247914811
-       weight: 0.15431351637040108
-   - layer_range: [16, 24]
-     model: evol_merge_storage\input_models\antler-starling-08_4074283220
-     parameters:
-       density: 0.8667200206865777
-       weight: 0.37827962987746055
-   - layer_range: [16, 24]
-     model: evol_merge_storage\input_models\Antler7B_2159541861
- - sources:
-   - layer_range: [24, 32]
-     model: evol_merge_storage\input_models\Phos7b-RP_654656604
-     parameters:
-       density: 0.966615155256828
-       weight: 0.5041762338947331
-   - layer_range: [24, 32]
-     model: evol_merge_storage\input_models\chatntq-ja-7b-v1.0-westlake_932715917
-     parameters:
-       density: 1.0
-       weight: 0.22555101554235693
-   - layer_range: [24, 32]
-     model: evol_merge_storage\input_models\antler-starling-08_4074283220
-     parameters:
-       density: 0.7616963147939114
-       weight: 0.397020374822854
-   - layer_range: [24, 32]
-     model: evol_merge_storage\input_models\Antler7B_2159541861
- ```
+ ---
+ base_model: []
+ library_name: transformers
+ tags:
+ - mergekit
+ - merge
+
+ ---
+
+ # Antler 7B Evolve
+
+ <img src="https://huggingface.co/Elizezen/Antler-7B/resolve/main/OIG3.UAjshTXCEJU.jpg" alt="drawing" style="width:512px;"/>
+
+ ## Model Description
+
+ This is a merge of pre-trained language models created with [mergekit](https://github.com/cg123/mergekit), using **Evolutionary Model Merging**.
+
+ Generally better than Antler-7B at writing novels, especially at maintaining context, though it can fall short of the original model in erotic content.
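+
+ The per-slice densities and weights in the configuration at the bottom of this card were discovered by an evolutionary search rather than tuned by hand. A rough sketch of such a loop follows (illustrative only; the library choice and objective are assumptions, not the exact pipeline used for this model):
+
+ ```python
+ # Illustrative evolutionary search over merge parameters (assumed setup).
+ import cma
+
+ # 3 contributing models x 4 layer slices x (density, weight) = 24 parameters.
+ N_PARAMS = 24
+
+ def score_merge(params):
+     # Stand-in objective: a real search would materialize a merge with
+     # these densities/weights and score it on an evaluation set.
+     return sum((p - 0.7) ** 2 for p in params)
+
+ es = cma.CMAEvolutionStrategy(N_PARAMS * [0.5], 0.2, {"bounds": [0.0, 1.0]})
+ while not es.stop():
+     candidates = es.ask()  # sample candidate merge recipes
+     es.tell(candidates, [score_merge(c) for c in candidates])
+ best = es.result.xbest  # best densities/weights found
+ ```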
+
+ ## Example
+
+ ### Input
+
+ ```
+ It was early afternoon that day. Aya and I had put on matching yukata and headed out into town to enjoy the winter festival. Naturally, we were all in disguise to hide our ninja identities.
+ Unused to such clothes, I felt a little restless, but I walked along minding that my obi did not loosen and my outfit did not come undone. I noticed Aya walking just as stiffly beside me.
+ Before long the streets began to fill with life, and the lively voices of passersby reached our ears.
+ When we reached the square, it was bustling with crowds, and colorful lanterns lit up the surroundings. Stalls of every kind lined the square, with children and their parents dashing about everywhere.
+ On the stage near the center of the square, a show was underway, with performers singing to the beat of taiko drums.
+ "Wow, how pretty!"
+ Aya looked around with shining eyes. Seen like this, she looked like any child her age.
+ "Hey, no running off on your own."
+ "Ehehe, sorry."
+ Even while being scolded, her beaming face made it plain that she had been looking forward to this too.
+ When Aya is happy, so am I. How strange. I had never felt this way before.
+ Before I knew it, I was smiling as well.
+ "Kono-chan, Kono-chan! Look! This lantern is so cute!"
+ "Ooh."
+ Tugging at my sleeve, she led me to a lantern decorated with charming animal designs. Apparently one of the stalls was selling them.
+ At the stalls
+ ```
+
+ ### Output
+
+ ```
+ At the stalls, alongside standards like yakisoba and yakitori, there were winter-only hot-pot stands as well.
+ "Aya, is there anything you want to eat?"
+ "Hmm, let's see... okonomiyaki, maybe..."
+ "Okonomiyaki?"
+ I tilted my head at a dish I was hearing of for the first time. Eyes sparkling, Aya explained it to me.
+ A wheat-flour batter with cabbage, pork, and tempura crumbs, grilled and topped with okonomiyaki sauce: apparently a famous Osaka specialty.
+ "Now that sounds intriguing. Let's try that stall, then."
+ We headed for the stall we had picked out.
+ The okonomiyaki stall was doing even better business than expected, with a long line. Still, after about thirty minutes of waiting, our turn came.
+ "Ma'am, one of these, please."
+ "Comin' right up! Just a moment!"
+ The woman at the stall answered in a hearty voice and grilled up an okonomiyaki on the iron griddle with practiced hands.
+ "This is okonomiyaki."
+ Aya said, gazing rapturously at the finished okonomiyaki in her hands.
+ "Oh? It certainly looks delicious."
+ Drawn in by its colors as well, I brought a mouthful to my
+ ```
+
+ ### Intended Use
+
+ The model is intended mainly for generating novels. It may not be as capable at instruction-style responses.
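+
+ As a minimal usage sketch (the repository id below is an assumption; substitute the actual repo id), the model can be run with the standard `transformers` text-generation API, feeding it the opening of a story rather than an instruction:
+
+ ```python
+ # Minimal sketch: plain text continuation, no chat/instruction template.
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_id = "Elizezen/Antler-7B-evolve"  # assumed repo id
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForCausalLM.from_pretrained(
+     model_id, torch_dtype=torch.bfloat16, device_map="auto"
+ )
+
+ prompt = "..."  # opening passage of your story, as in the example above
+ inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+ outputs = model.generate(
+     **inputs,
+     max_new_tokens=256,
+     do_sample=True,
+     temperature=0.8,
+     top_p=0.95,
+ )
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ ```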
+
+
+ ## Merge Details
+ ### Merge Method
+
+ This model was merged with the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method, using evol_merge_storage\input_models\Antler7B_2159541861 as the base.
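+
+ For intuition, here is a toy sketch of what DARE-TIES does per parameter tensor (illustrative only; this is not mergekit's implementation): DARE randomly drops each model's delta from the base and rescales the survivors, then a TIES-style sign election keeps only contributions agreeing with the weighted-majority sign. The densities and weights below echo the first slice of the configuration further down.
+
+ ```python
+ # Toy DARE-TIES on flat tensors; illustrative, not mergekit's code.
+ import torch
+
+ def dare_ties(base, task_models, densities, weights):
+     deltas = [m - base for m in task_models]
+     # DARE: keep each delta entry with prob=density, rescale by 1/density
+     # so the expected value of the delta is preserved.
+     kept = [
+         torch.bernoulli(torch.full_like(d, p)) * d / p
+         for d, p in zip(deltas, densities)
+     ]
+     # TIES-style sign election: per parameter, zero out contributions
+     # whose sign disagrees with the weighted-majority sign.
+     weighted = [w * d for w, d in zip(weights, kept)]
+     elected = torch.sign(torch.stack(weighted).sum(dim=0))
+     agree = [d * (torch.sign(d) == elected) for d in weighted]
+     return base + torch.stack(agree).sum(dim=0)
+
+ base = torch.zeros(10)
+ finetunes = [base + 0.1 * torch.randn(10) for _ in range(3)]
+ merged = dare_ties(base, finetunes,
+                    densities=[0.58, 0.94, 0.98], weights=[0.47, 0.26, 0.48])
+ ```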
+
+ ### Models Merged
+
+ The following models were included in the merge:
+ * evol_merge_storage\input_models\chatntq-ja-7b-v1.0-westlake_932715917
+ * evol_merge_storage\input_models\antler-starling-08_4074283220
+ * evol_merge_storage\input_models\Phos7b-RP_654656604
+
+ ### Configuration
+
+ The following YAML configuration was used to produce this model:
+
+ ```yaml
+ base_model: evol_merge_storage\input_models\Antler7B_2159541861
+ dtype: bfloat16
+ merge_method: dare_ties
+ parameters:
+   int8_mask: 1.0
+   normalize: 1.0
+ slices:
+ - sources:
+   - layer_range: [0, 8]
+     model: evol_merge_storage\input_models\Phos7b-RP_654656604
+     parameters:
+       density: 0.584107666175788
+       weight: 0.47231634419785595
+   - layer_range: [0, 8]
+     model: evol_merge_storage\input_models\chatntq-ja-7b-v1.0-westlake_932715917
+     parameters:
+       density: 0.9357007414387093
+       weight: 0.25531843586626907
+   - layer_range: [0, 8]
+     model: evol_merge_storage\input_models\antler-starling-08_4074283220
+     parameters:
+       density: 0.9750447748820433
+       weight: 0.4753247646722287
+   - layer_range: [0, 8]
+     model: evol_merge_storage\input_models\Antler7B_2159541861
+ - sources:
+   - layer_range: [8, 16]
+     model: evol_merge_storage\input_models\Phos7b-RP_654656604
+     parameters:
+       density: 0.8802238329444649
+       weight: 0.4482746205621599
+   - layer_range: [8, 16]
+     model: evol_merge_storage\input_models\chatntq-ja-7b-v1.0-westlake_932715917
+     parameters:
+       density: 1.0
+       weight: 0.5524329574915081
+   - layer_range: [8, 16]
+     model: evol_merge_storage\input_models\antler-starling-08_4074283220
+     parameters:
+       density: 1.0
+       weight: 0.22634815425570032
+   - layer_range: [8, 16]
+     model: evol_merge_storage\input_models\Antler7B_2159541861
+ - sources:
+   - layer_range: [16, 24]
+     model: evol_merge_storage\input_models\Phos7b-RP_654656604
+     parameters:
+       density: 0.9921437573982935
+       weight: 0.44636209472148164
+   - layer_range: [16, 24]
+     model: evol_merge_storage\input_models\chatntq-ja-7b-v1.0-westlake_932715917
+     parameters:
+       density: 0.8757091247914811
+       weight: 0.15431351637040108
+   - layer_range: [16, 24]
+     model: evol_merge_storage\input_models\antler-starling-08_4074283220
+     parameters:
+       density: 0.8667200206865777
+       weight: 0.37827962987746055
+   - layer_range: [16, 24]
+     model: evol_merge_storage\input_models\Antler7B_2159541861
+ - sources:
+   - layer_range: [24, 32]
+     model: evol_merge_storage\input_models\Phos7b-RP_654656604
+     parameters:
+       density: 0.966615155256828
+       weight: 0.5041762338947331
+   - layer_range: [24, 32]
+     model: evol_merge_storage\input_models\chatntq-ja-7b-v1.0-westlake_932715917
+     parameters:
+       density: 1.0
+       weight: 0.22555101554235693
+   - layer_range: [24, 32]
+     model: evol_merge_storage\input_models\antler-starling-08_4074283220
+     parameters:
+       density: 0.7616963147939114
+       weight: 0.397020374822854
+   - layer_range: [24, 32]
+     model: evol_merge_storage\input_models\Antler7B_2159541861
+ ```
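+
+ To reproduce a merge from this recipe, the YAML above (saved as e.g. `config.yaml`) can be fed to mergekit's `mergekit-yaml config.yaml ./output-dir` entry point; the local `evol_merge_storage\input_models\...` paths would need to exist or be swapped for the corresponding Hugging Face model ids. Note that the recipe itself, including every per-slice density and weight, came out of the evolutionary search rather than hand tuning.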