---
base_model:
- cgato/L3-TheSpice-8b-v0.8.3
- NousResearch/Hermes-2-Pro-Llama-3-8B
- openlynn/Llama-3-Soliloquy-8B-v2
library_name: transformers
tags:
- mergekit
- merge
---
# merge
This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
## Merge Details
### Merge Method
This model was merged with the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method, with [openlynn/Llama-3-Soliloquy-8B-v2](https://huggingface.co/openlynn/Llama-3-Soliloquy-8B-v2) as the base model.
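
As a rough illustration of the idea behind this method, the sketch below applies DARE-style random dropping and rescaling to each model's difference from the base, then combines the results with a TIES-style sign election. It is a toy version on plain Python lists, not mergekit's implementation: the function names (`dare_sparsify`, `ties_sign_merge`, `dare_ties`) are hypothetical, and the sign election by weighted sum is an assumption about how the consensus step is done.

```python
import random


def dare_sparsify(delta, density, rng):
    """Keep each entry with probability `density`; rescale survivors by 1/density."""
    if not 0.0 < density <= 1.0:
        raise ValueError("density must be in (0, 1]")
    return [d / density if rng.random() < density else 0.0 for d in delta]


def ties_sign_merge(deltas, weights):
    """Per coordinate, elect a sign from the weighted sum and keep only agreeing terms."""
    merged = []
    for column in zip(*deltas):
        weighted = [w * d for w, d in zip(weights, column)]
        elected = sum(weighted)
        sign = (elected > 0) - (elected < 0)
        merged.append(sum(v for v in weighted if v * sign > 0))
    return merged


def dare_ties(base, finetuned, weights, densities, seed=0):
    """Toy DARE-TIES merge of flat parameter lists (hypothetical helper)."""
    rng = random.Random(seed)
    deltas = [
        dare_sparsify([f - b for f, b in zip(model, base)], density, rng)
        for model, density in zip(finetuned, densities)
    ]
    merged_delta = ties_sign_merge(deltas, weights)
    return [b + d for b, d in zip(base, merged_delta)]


if __name__ == "__main__":
    base = [0.0, 0.0]
    finetuned = [[1.0, -1.0], [1.0, 2.0]]
    print(dare_ties(base, finetuned, weights=[0.5, 0.5], densities=[1.0, 1.0]))
```

With `density` set to 1.0 nothing is dropped, so the example reduces to the sign-election step alone; lower densities randomly zero out entries and scale the rest up to compensate.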
### Models Merged
The following models were included in the merge:
* [cgato/L3-TheSpice-8b-v0.8.3](https://huggingface.co/cgato/L3-TheSpice-8b-v0.8.3)
* [NousResearch/Hermes-2-Pro-Llama-3-8B](https://huggingface.co/NousResearch/Hermes-2-Pro-Llama-3-8B)
### Configuration
The following YAML configuration was used to produce this model:
```yaml
base_model: openlynn/Llama-3-Soliloquy-8B-v2
dtype: bfloat16
merge_method: dare_ties
parameters:
  int8_mask: 1.0
  normalize: 0.0
slices:
- sources:
  - layer_range: [0, 4]
    model: openlynn/Llama-3-Soliloquy-8B-v2
    parameters:
      density: 1.0
      weight: 0.6861808716092435
  - layer_range: [0, 4]
    model: cgato/L3-TheSpice-8b-v0.8.3
    parameters:
      density: 0.6628290134113985
      weight: 0.5815923052193855
  - layer_range: [0, 4]
    model: NousResearch/Hermes-2-Pro-Llama-3-8B
    parameters:
      density: 1.0
      weight: 0.5113886163963061
- sources:
  - layer_range: [4, 8]
    model: cgato/L3-TheSpice-8b-v0.8.3
    parameters:
      density: 0.892655547455918
      weight: 0.038732602391021484
  - layer_range: [4, 8]
    model: cgato/L3-TheSpice-8b-v0.8.3
    parameters:
      density: 1.0
      weight: 0.1982145486303527
  - layer_range: [4, 8]
    model: openlynn/Llama-3-Soliloquy-8B-v2
    parameters:
      density: 1.0
      weight: 0.6843011350690802
- sources:
  - layer_range: [8, 12]
    model: NousResearch/Hermes-2-Pro-Llama-3-8B
    parameters:
      density: 0.7817511027396784
      weight: 0.13053333213489704
  - layer_range: [8, 12]
    model: cgato/L3-TheSpice-8b-v0.8.3
    parameters:
      density: 0.6963703515864826
      weight: 0.20525481492667985
  - layer_range: [8, 12]
    model: openlynn/Llama-3-Soliloquy-8B-v2
    parameters:
      density: 0.6983086326765777
      weight: 0.5843953969574106
- sources:
  - layer_range: [12, 16]
    model: NousResearch/Hermes-2-Pro-Llama-3-8B
    parameters:
      density: 0.9632895768462915
      weight: 0.2101146706607748
  - layer_range: [12, 16]
    model: cgato/L3-TheSpice-8b-v0.8.3
    parameters:
      density: 0.597557434542081
      weight: 0.6728172621848589
  - layer_range: [12, 16]
    model: openlynn/Llama-3-Soliloquy-8B-v2
    parameters:
      density: 0.756263557607837
      weight: 0.2581423726361908
- sources:
  - layer_range: [16, 20]
    model: NousResearch/Hermes-2-Pro-Llama-3-8B
    parameters:
      density: 1.0
      weight: 0.2116035543552448
  - layer_range: [16, 20]
    model: cgato/L3-TheSpice-8b-v0.8.3
    parameters:
      density: 1.0
      weight: 0.22654226422958418
  - layer_range: [16, 20]
    model: openlynn/Llama-3-Soliloquy-8B-v2
    parameters:
      density: 0.8925914810507647
      weight: 0.42243766315440867
- sources:
  - layer_range: [20, 24]
    model: NousResearch/Hermes-2-Pro-Llama-3-8B
    parameters:
      density: 0.7697608089825734
      weight: 0.1535118632140203
  - layer_range: [20, 24]
    model: cgato/L3-TheSpice-8b-v0.8.3
    parameters:
      density: 0.9886758076773643
      weight: 0.3305040603868546
  - layer_range: [20, 24]
    model: openlynn/Llama-3-Soliloquy-8B-v2
    parameters:
      density: 1.0
      weight: 0.40670083428654535
- sources:
  - layer_range: [24, 28]
    model: NousResearch/Hermes-2-Pro-Llama-3-8B
    parameters:
      density: 1.0
      weight: 0.4542810478500622
  - layer_range: [24, 28]
    model: cgato/L3-TheSpice-8b-v0.8.3
    parameters:
      density: 0.8330662483310117
      weight: 0.2587495367324508
  - layer_range: [24, 28]
    model: openlynn/Llama-3-Soliloquy-8B-v2
    parameters:
      density: 0.9845313983551542
      weight: 0.40378452705975915
- sources:
  - layer_range: [28, 32]
    model: NousResearch/Hermes-2-Pro-Llama-3-8B
    parameters:
      density: 1.0
      weight: 0.2951962192288415
  - layer_range: [28, 32]
    model: cgato/L3-TheSpice-8b-v0.8.3
    parameters:
      density: 0.960315594933433
      weight: 0.13142971773782525
  - layer_range: [28, 32]
    model: openlynn/Llama-3-Soliloquy-8B-v2
    parameters:
      density: 1.0
      weight: 0.30838472094518804
```
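
The configuration sets `normalize: 0.0`, and the per-slice weights do not add up to 1. The short check below sums the weights of the `[0, 4]` slice and shows what each model's share would be if they were rescaled. It assumes that `normalize` controls whether the weighted deltas are divided by the total weight; the helper names `weight_total` and `normalized` are hypothetical and only serve this illustration.

```python
# Weights copied from the layer_range [0, 4] slice of the configuration above.
slice_0_4_weights = {
    "openlynn/Llama-3-Soliloquy-8B-v2": 0.6861808716092435,
    "cgato/L3-TheSpice-8b-v0.8.3": 0.5815923052193855,
    "NousResearch/Hermes-2-Pro-Llama-3-8B": 0.5113886163963061,
}


def weight_total(weights):
    """Sum of the raw merge weights in one slice."""
    return sum(weights.values())


def normalized(weights):
    """Each model's share if the weights were rescaled to sum to 1 (assumed behaviour)."""
    total = weight_total(weights)
    return {name: w / total for name, w in weights.items()}


total = weight_total(slice_0_4_weights)
print(f"sum of weights: {total:.4f}")
for name, share in normalized(slice_0_4_weights).items():
    print(f"{name}: {share:.3f}")
```

The raw weights in this slice sum to roughly 1.78, so with normalization off the combined delta is larger than a weighted average of the individual deltas would be.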