fernandofernandes committed · Commit 305d5ed · Parent(s): 3a8d189

Update README.md

README.md CHANGED
@@ -8,12 +8,18 @@ An experimentation regarding 'lasering' each expert to denoise and enhance model
 
 This model has half size in comparison to the Mixtral 8x7b Instruct. And it basically has the same level of performance (we are working to get a better MMLU score).
 
-Used models (all lasered using laserRMT):
-
-
-
-
-
+Used models (all lasered using laserRMT, except for the base model):
+
+
+* mlabonne/Marcoro14-7B-slerp (base)
+
+* cognitivecomputations/dolphin-2.6-mistral-7b-dpo
+
+* beowolx/CodeNinja-1.0-OpenChat-7B
+
+* Q-bert/MetaMath-Cybertron-Starling
+
+* WizardLM/WizardMath-7B-V1.1
 
 It follows the implementation of laserRMT @ https://github.com/cognitivecomputations/laserRMT
 
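The hunk header describes 'lasering' each expert to denoise the model. The core idea behind LASER-style rank reduction, which laserRMT implements, is to replace a weight matrix with a truncated-SVD approximation, discarding the small singular values treated as noise. Below is a minimal sketch of that idea only; the function name, the fixed `keep` parameter, and the NumPy setup are illustrative assumptions, not the actual laserRMT API (which selects ranks per layer, e.g. via random-matrix-theory criteria).

```python
# Illustrative sketch of SVD-based rank reduction ("lasering") on a single
# weight matrix. Not the laserRMT API; names and parameters are hypothetical.
import numpy as np

def laser_denoise(W: np.ndarray, keep: int) -> np.ndarray:
    """Return a rank-`keep` approximation of W via truncated SVD."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    # Keep only the `keep` largest singular values; drop the rest as noise.
    return (U[:, :keep] * S[:keep]) @ Vt[:keep, :]

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 64))       # stand-in for one expert's weight matrix
W_low = laser_denoise(W, keep=8)    # same shape, but rank at most 8
print(W_low.shape, np.linalg.matrix_rank(W_low))
```

In laserRMT this truncation is applied selectively per layer (here, per expert) rather than with one fixed rank, with the cutoff chosen so that only noise-like singular values are removed.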