---
license: cc-by-nc-4.0
language:
- en
---

# Update 2023-12-19

In light of the [dataset contamination issue among merged models](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard/discussions/474)
raised by the community in recent days, in particular concerning
[berkeley-nest/Starling-LM-7B-alpha](https://huggingface.co/berkeley-nest/Starling-LM-7B-alpha) and
[Q-bert/MetaMath-Cybertron-Starling](https://huggingface.co/Q-bert/MetaMath-Cybertron-Starling),
we decided to make another model without the models mentioned.
Additionally, their CC-BY-NC-4.0 license is restrictive, so they are not suitable for an open model.


# Model Description
This is an experiment to test merging 14 models using DARE TIES 🦙

The result is a base model that performs quite well but requires some further instruction fine-tuning.
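
For readers unfamiliar with the method, the following is a minimal sketch of the idea behind DARE TIES, written on plain Python lists instead of real model tensors. It is an illustration under simplifying assumptions, not the code the merge tool actually runs: the function names `dare_prune`, `sign_consensus_merge`, and `dare_ties_merge` are hypothetical, and details such as normalization and masking are left out. The `weight` and `density` arguments are meant to mirror the same-named parameters in the config.

```python
import random


def dare_prune(delta, density, rng):
    """DARE step: keep each entry with probability `density`,
    and rescale the survivors by 1/density so the expected value is unchanged."""
    return [d / density if rng.random() < density else 0.0 for d in delta]


def sign_consensus_merge(base, deltas, weights):
    """TIES-style step: for each parameter, elect the sign carrying the larger
    weighted mass, then add only the weighted deltas that agree with that sign."""
    merged = []
    for i, b in enumerate(base):
        weighted = [w * d[i] for w, d in zip(weights, deltas)]
        sign = 1.0 if sum(weighted) >= 0 else -1.0
        agreeing = sum(v for v in weighted if v * sign > 0)
        merged.append(b + agreeing)
    return merged


def dare_ties_merge(base, finetuned, weights, densities, seed=0):
    """Toy merge: compute each model's delta from the base, prune it with DARE,
    then combine the pruned deltas with sign consensus."""
    rng = random.Random(seed)
    deltas = []
    for model, density in zip(finetuned, densities):
        delta = [m - b for m, b in zip(model, base)]
        deltas.append(dare_prune(delta, density, rng))
    return sign_consensus_merge(base, deltas, weights)


if __name__ == "__main__":
    base = [0.0, 0.0, 0.0, 0.0]
    finetuned = [[0.5, -0.2, 0.1, 0.0], [0.4, 0.3, -0.1, 0.2]]
    print(dare_ties_merge(base, finetuned, weights=[0.5, 0.5], densities=[0.4, 0.4]))
```

With a `density` of 0.4, roughly 60% of each model's delta entries are dropped before merging, which is what lets many fine-tuned models be combined with less interference between them.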

The 14 models are as follows:
1. [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2)
2. [ehartford/dolphin-2.2.1-mistral-7b](https://huggingface.co/ehartford/dolphin-2.2.1-mistral-7b)
3. [SciPhi/SciPhi-Mistral-7B-32k](https://huggingface.co/SciPhi/SciPhi-Mistral-7B-32k)
4. [ehartford/samantha-1.2-mistral-7b](https://huggingface.co/ehartford/samantha-1.2-mistral-7b)
5. [Arc53/docsgpt-7b-mistral](https://huggingface.co/Arc53/docsgpt-7b-mistral)
6. [berkeley-nest/Starling-LM-7B-alpha](https://huggingface.co/berkeley-nest/Starling-LM-7B-alpha)
7. [Q-bert/MetaMath-Cybertron-Starling](https://huggingface.co/Q-bert/MetaMath-Cybertron-Starling)
8. [Open-Orca/Mistral-7B-OpenOrca](https://huggingface.co/Open-Orca/Mistral-7B-OpenOrca)
9. [v1olet/v1olet_marcoroni-go-bruins-merge-7B](https://huggingface.co/v1olet/v1olet_marcoroni-go-bruins-merge-7B)
10. [beowolx/MistralHermes-CodePro-7B-v1](https://huggingface.co/beowolx/MistralHermes-CodePro-7B-v1)
11. [TIGER-Lab/MAmmoTH-7B-Mistral](https://huggingface.co/TIGER-Lab/MAmmoTH-7B-Mistral)
12. [teknium/OpenHermes-2.5-Mistral-7B](https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B)
13. [Weyaxi/OpenHermes-2.5-neural-chat-v3-3-Slerp](https://huggingface.co/Weyaxi/OpenHermes-2.5-neural-chat-v3-3-Slerp)
14. [mlabonne/NeuralHermes-2.5-Mistral-7B](https://huggingface.co/mlabonne/NeuralHermes-2.5-Mistral-7B)

- base model: [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)

The YAML config file for this model is as follows:

```yaml
models:
  - model: mistralai/Mistral-7B-Instruct-v0.2
    # no parameters necessary for base model
  - model: ehartford/dolphin-2.2.1-mistral-7b
    parameters:
      weight: 0.08
      density: 0.4
  - model: SciPhi/SciPhi-Mistral-7B-32k
    parameters:
      weight: 0.08
      density: 0.4
  - model: ehartford/samantha-1.2-mistral-7b
    parameters:
      weight: 0.08
      density: 0.4
  - model: Arc53/docsgpt-7b-mistral
    parameters:
      weight: 0.08
      density: 0.4
  - model: berkeley-nest/Starling-LM-7B-alpha
    parameters:
      weight: 0.08
      density: 0.4
  - model: Q-bert/MetaMath-Cybertron-Starling
    parameters:
      weight: 0.08
      density: 0.4
  - model: Open-Orca/Mistral-7B-OpenOrca
    parameters:
      weight: 0.08
      density: 0.4
  - model: v1olet/v1olet_marcoroni-go-bruins-merge-7B
    parameters:
      weight: 0.08
      density: 0.4
  - model: beowolx/MistralHermes-CodePro-7B-v1
    parameters:
      weight: 0.08
      density: 0.4
  - model: TIGER-Lab/MAmmoTH-7B-Mistral
    parameters:
      weight: 0.08
      density: 0.4
  - model: teknium/OpenHermes-2.5-Mistral-7B
    parameters:
      weight: 0.08
      density: 0.4
  - model: Weyaxi/OpenHermes-2.5-neural-chat-v3-3-Slerp
    parameters:
      weight: 0.08
      density: 0.4
  - model: mlabonne/NeuralHermes-2.5-Mistral-7B
    parameters:
      weight: 0.08
      density: 0.4
  - model: mistralai/Mistral-7B-Instruct-v0.2
    parameters:
      weight: 0.08
      density: 0.5
merge_method: dare_ties
base_model: mistralai/Mistral-7B-v0.1
parameters:
  int8_mask: true
dtype: bfloat16

```
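
As a quick sanity check on the config above, the short sketch below tallies the weighted entries. The entries are transcribed by hand from the YAML, and the variable names are illustrative only. It shows that 14 entries carry a `weight`, that the weights sum to 1.12 and not 1.0, and that `mistralai/Mistral-7B-Instruct-v0.2` is the only entry with a `density` of 0.5. Whether the merge tool rescales the weights is not stated here, so treat the total as an observation, not as a problem.

```python
# Model names transcribed from the YAML config above (entries that carry parameters).
weighted_models = [
    "ehartford/dolphin-2.2.1-mistral-7b",
    "SciPhi/SciPhi-Mistral-7B-32k",
    "ehartford/samantha-1.2-mistral-7b",
    "Arc53/docsgpt-7b-mistral",
    "berkeley-nest/Starling-LM-7B-alpha",
    "Q-bert/MetaMath-Cybertron-Starling",
    "Open-Orca/Mistral-7B-OpenOrca",
    "v1olet/v1olet_marcoroni-go-bruins-merge-7B",
    "beowolx/MistralHermes-CodePro-7B-v1",
    "TIGER-Lab/MAmmoTH-7B-Mistral",
    "teknium/OpenHermes-2.5-Mistral-7B",
    "Weyaxi/OpenHermes-2.5-neural-chat-v3-3-Slerp",
    "mlabonne/NeuralHermes-2.5-Mistral-7B",
    "mistralai/Mistral-7B-Instruct-v0.2",
]

# Every weighted entry uses weight 0.08; density is 0.4 except for the last entry (0.5).
entries = [
    {
        "model": name,
        "weight": 0.08,
        "density": 0.5 if name == "mistralai/Mistral-7B-Instruct-v0.2" else 0.4,
    }
    for name in weighted_models
]

entry_count = len(entries)
total_weight = round(sum(e["weight"] for e in entries), 2)
odd_density = [e["model"] for e in entries if e["density"] != 0.4]

print(f"weighted entries: {entry_count}")
print(f"total weight: {total_weight}")
print(f"entries with density other than 0.4: {odd_density}")
```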