athirdpath committed on
Commit 843173f
1 Parent(s): 2745306

Update README.md

Files changed (1):
  1. README.md +78 -29
README.md CHANGED
@@ -1,39 +1,88 @@
  ---
- base_model: []
- library_name: transformers
- tags:
- - mergekit
- - merge
-
  ---
- # wibblestock
-
- This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
-
- ## Merge Details
- ### Merge Method
-
- This model was merged using the [Model Stock](https://arxiv.org/abs/2403.19522) merge method, with ./wibblel3 as the base.
-
- ### Models Merged
-
- The following models were included in the merge:
- * ./bigl3_2
- * ./bigl3
-
- ### Configuration
-
- The following YAML configuration was used to produce this model:
-
- ```yaml
- models:
-   - model: ./wibblel3
-   - model: ./bigl3
-   - model: ./bigl3_2
- merge_method: model_stock
- base_model: ./wibblel3
- parameters:
-   normalize: true
-   int8_mask: true
- dtype: float16
- ```
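As context for the configuration above: Model Stock interpolates between the average of the fine-tuned models and the base model, with the interpolation ratio derived from the angle between the fine-tuned weight deltas. A minimal NumPy sketch of that interpolation, following a hedged reading of the paper (the function and per-vector treatment here are illustrative, not mergekit's actual implementation):

```python
import numpy as np

def model_stock(base, finetuned):
    """Sketch of a Model Stock-style merge for one flattened weight tensor.

    Interpolates between the mean of the fine-tuned weights and the base
    weights; the ratio t comes from the average pairwise cosine between
    the fine-tuned deltas (hedged reading of arXiv:2403.19522).
    """
    k = len(finetuned)
    deltas = [w - base for w in finetuned]
    # Average pairwise cosine similarity between the fine-tuned deltas.
    cosines = [
        np.dot(deltas[i], deltas[j])
        / (np.linalg.norm(deltas[i]) * np.linalg.norm(deltas[j]))
        for i in range(k) for j in range(i + 1, k)
    ]
    cos_theta = float(np.mean(cosines))
    t = k * cos_theta / (1.0 + (k - 1) * cos_theta)
    w_avg = np.mean(finetuned, axis=0)
    return t * w_avg + (1.0 - t) * base
```

When the fine-tuned models agree (cos θ → 1), t → 1 and the result is simply their average; as they diverge, the merge is pulled back toward the base weights.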
  ---
+ license: llama3
  ---
+
+ This is a Model Stock merge of 3 models:
+ - Part Wave
+ - Part Block
+ - Part Funnel
+
+ With Part Funnel as the base.
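The combining step itself is described only in prose in the new README. For comparison with the configuration it replaces, a hypothetical mergekit-style config for that final step might look like the following (the ./part_* paths are illustrative placeholders, not taken from the commit):

```yaml
models:
  - model: ./part_funnel
  - model: ./part_wave
  - model: ./part_block
merge_method: model_stock
base_model: ./part_funnel
dtype: float16
```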
+
+ ---
+
+ Part Wave:
+ - sources:
+   - model: NousResearch/Meta-Llama-3-8B-Instruct
+     layer_range: [0, 12]
+ - sources:
+   - model: NousResearch/Meta-Llama-3-8B-Instruct
+     layer_range: [8, 18]
+ - sources:
+   - model: NousResearch/Meta-Llama-3-8B-Instruct
+     layer_range: [13, 23]
+ - sources:
+   - model: NousResearch/Meta-Llama-3-8B-Instruct
+     layer_range: [18, 32]
+
+ ---
+
+ Part Block:
+ - sources:
+   - model: NousResearch/Meta-Llama-3-8B-Instruct
+     layer_range: [0, 15]
+ - sources:
+   - model: NousResearch/Meta-Llama-3-8B-Instruct
+     layer_range: [8, 23]
+ - sources:
+   - model: NousResearch/Meta-Llama-3-8B-Instruct
+     layer_range: [16, 32]
+
+ ---
+
+ Part Funnel:
+ - sources:
+   - model: NousResearch/Meta-Llama-3-8B-Instruct
+     layer_range: [0, 15]
+ - sources:
+   - model: NousResearch/Meta-Llama-3-8B-Instruct
+     layer_range: [14, 14]
+ - sources:
+   - model: NousResearch/Meta-Llama-3-8B-Instruct
+     layer_range: [13, 13]
+ - sources:
+   - model: NousResearch/Meta-Llama-3-8B-Instruct
+     layer_range: [12, 12]
+ - sources:
+   - model: NousResearch/Meta-Llama-3-8B-Instruct
+     layer_range: [11, 11]
+ - sources:
+   - model: NousResearch/Meta-Llama-3-8B-Instruct
+     layer_range: [10, 10]
+ - sources:
+   - model: NousResearch/Meta-Llama-3-8B-Instruct
+     layer_range: [9, 9]
+ - sources:
+   - model: NousResearch/Meta-Llama-3-8B-Instruct
+     layer_range: [8, 23]
+ - sources:
+   - model: NousResearch/Meta-Llama-3-8B-Instruct
+     layer_range: [22, 22]
+ - sources:
+   - model: NousResearch/Meta-Llama-3-8B-Instruct
+     layer_range: [21, 21]
+ - sources:
+   - model: NousResearch/Meta-Llama-3-8B-Instruct
+     layer_range: [20, 20]
+ - sources:
+   - model: NousResearch/Meta-Llama-3-8B-Instruct
+     layer_range: [19, 19]
+ - sources:
+   - model: NousResearch/Meta-Llama-3-8B-Instruct
+     layer_range: [18, 18]
+ - sources:
+   - model: NousResearch/Meta-Llama-3-8B-Instruct
+     layer_range: [17, 17]
+ - sources:
+   - model: NousResearch/Meta-Llama-3-8B-Instruct
+     layer_range: [16, 32]
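A small sketch to sanity-check the slice lists above. It assumes half-open [start, end) semantics for layer_range, as in Python slicing; that is an assumption worth verifying against mergekit's docs, since under an inclusive reading the counts differ and single-index ranges like [14, 14] would select one layer instead of none:

```python
def expand_slices(slices):
    """Flatten passthrough-style slices into one stacked layer list.

    Assumes half-open [start, end) ranges, like Python slicing; under
    that reading a range such as [14, 14] contributes no layers.
    """
    layers = []
    for start, end in slices:
        layers.extend(range(start, end))
    return layers

# Slice lists transcribed from the three configs above (layer indices only).
part_wave = [(0, 12), (8, 18), (13, 23), (18, 32)]
part_block = [(0, 15), (8, 23), (16, 32)]
part_funnel = (
    [(0, 15)]
    + [(n, n) for n in range(14, 8, -1)]   # singles 14 down to 9
    + [(8, 23)]
    + [(n, n) for n in range(22, 16, -1)]  # singles 22 down to 17
    + [(16, 32)]
)
```

Under this assumption all three parts stack to the same 46-layer depth, and the funnel's single-index ranges are no-ops, which may or may not be what was intended.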