Aryanne committed · Commit d7b589a · verified · 1 Parent(s): 2b420b6

Upload README.md with huggingface_hub

---
base_model:
- cognitivecomputations/dolphin-2.2.1-mistral-7b
- l3utterfly/mistral-7b-v0.1-layla-v4-chatml
library_name: transformers
tags:
- mergekit
- merge

---
# merged

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details
### Merge Method

This model was merged using the task_swapping merge method, with /content/mergekit/test as the base model.

### Models Merged

The following models were included in the merge:
* [cognitivecomputations/dolphin-2.2.1-mistral-7b](https://huggingface.co/cognitivecomputations/dolphin-2.2.1-mistral-7b)
* /content/mergekit/tri
* [l3utterfly/mistral-7b-v0.1-layla-v4-chatml](https://huggingface.co/l3utterfly/mistral-7b-v0.1-layla-v4-chatml)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
base_model:
  model:
    path: /content/mergekit/test
dtype: bfloat16
merge_method: task_swapping
slices:
- sources:
  - layer_range: [0, 32]
    model:
      model:
        path: l3utterfly/mistral-7b-v0.1-layla-v4-chatml
    parameters:
      diagonal_offset: 4.0
      random_mask: 0.1
      random_mask_seed: 1956557.0
      weight: 0.4
  - layer_range: [0, 32]
    model:
      model:
        path: cognitivecomputations/dolphin-2.2.1-mistral-7b
    parameters:
      diagonal_offset: 4.0
      random_mask: 0.1
      random_mask_seed: 18019.0
      weight: 0.333
  - layer_range: [0, 32]
    model:
      model:
        path: /content/mergekit/tri
    parameters:
      diagonal_offset: 4.0
      random_mask: 0.05
      random_mask_seed: 666666.0
      weight: 0.5
  - layer_range: [0, 32]
    model:
      model:
        path: /content/mergekit/test
```
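The merged weights can be loaded like any other Mistral-7B checkpoint with Transformers. Below is a minimal usage sketch; the repo id is a placeholder (not stated in this card) and the ChatML prompt format is assumed because both merged Hub models use it.

```python
# Minimal usage sketch.
# Assumptions: "Aryanne/merged" is a placeholder repo id, and the ChatML
# prompt format of the merged models applies.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "Aryanne/merged"  # placeholder; replace with the actual repo id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,  # matches the dtype used in the merge config
    device_map="auto",
)

# ChatML-style prompt (assumed format)
prompt = (
    "<|im_start|>user\n"
    "Write a haiku about model merging.<|im_end|>\n"
    "<|im_start|>assistant\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)

# Print only the newly generated tokens
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```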