Mihaiii committed on
Commit
66e40e9
1 Parent(s): 623e5ce

Create README.md

Files changed (1)
  1. README.md +37 -0
README.md ADDED
@@ -0,0 +1,37 @@
+ ---
+ base_model: Mihaiii/Metis-0.3
+ inference: false
+ license: apache-2.0
+ license_name: apache-2.0
+ metrics:
+ - accuracy
+ ---
+
+ This is a merge of Metis-0.3 and Metis-0.1, with Metis-0.1 as the base model.
+ The merge was done with [mergekit](https://github.com/cg123/mergekit).
+
+ It works well with long system prompts.
+
+ It is not a general-purpose model: it is intended only for reasoning and text comprehension and should not be used for tasks such as storytelling.
+
+ This model was trained on a private dataset. The high GSM8K score is **NOT** due to the MetaMath dataset.
+
+ Merge config:
+ ```yaml
+ slices:
+   - sources:
+       - model: Mihaiii/Metis-0.3
+         layer_range: [0, 32]
+       - model: Mihaiii/Metis-0.1
+         layer_range: [0, 32]
+ merge_method: slerp
+ base_model: Mihaiii/Metis-0.1
+ parameters:
+   t:
+     - filter: self_attn
+       value: [0, 0.5, 0.3, 0.7, 1]
+     - filter: mlp
+       value: [1, 0.5, 0.7, 0.3, 0]
+     - value: 0.5 # fallback for rest of tensors
+ dtype: bfloat16
+ ```
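
For readers unfamiliar with `merge_method: slerp`, the following is a rough pure-Python sketch of what the config above asks for. It is not mergekit's code. It assumes that each `value` list is treated as evenly spaced anchor points interpolated linearly over the 32 layers, and that `t = 0` yields the base model (Metis-0.1) while `t = 1` yields the other model (Metis-0.3). The function names `slerp` and `layer_t` and the `dot_threshold` default are illustrative choices, not taken from the README.

```python
import math


def slerp(t, v0, v1, dot_threshold=0.9995):
    """Spherically interpolate between two flat weight vectors (lists of floats)."""
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    # A zero vector has no direction, so fall back to linear interpolation.
    if norm0 == 0.0 or norm1 == 0.0:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    # Cosine of the angle between the two vectors, clamped for acos.
    dot = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    dot = max(-1.0, min(1.0, dot))
    # Nearly colinear vectors: linear interpolation is numerically safer.
    if abs(dot) > dot_threshold:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    theta = math.acos(dot)
    s0 = math.sin((1 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]


def layer_t(anchors, layer, num_layers):
    """Map a layer index to t by linear interpolation between anchor values."""
    if len(anchors) == 1 or num_layers == 1:
        return float(anchors[0])
    # Position of this layer on the anchor axis, from 0 to len(anchors) - 1.
    pos = layer / (num_layers - 1) * (len(anchors) - 1)
    lo = min(int(pos), len(anchors) - 2)
    frac = pos - lo
    return (1 - frac) * anchors[lo] + frac * anchors[lo + 1]


# The self_attn schedule from the merge config, sampled at a few layers.
self_attn_t = [0, 0.5, 0.3, 0.7, 1]
for layer in (0, 16, 31):
    print(layer, round(layer_t(self_attn_t, layer, 32), 3))
```

Under these assumptions, the `self_attn` schedule starts at the base model in the first layer and ends at the other model in the last layer, while the `mlp` schedule runs in the opposite direction. All remaining tensors use a constant `t` of 0.5.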