Commit 87ffdfd (1 parent: 409f695), committed by Steelskull
Update README.md
README.md CHANGED
@@ -58,14 +58,17 @@ Currently, I am still limited to Mergekit, for this method, which does not suppo
 
 ## ⚙️ Evals
 
-
-
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/64545af5ec40bbbd01242ca6/pS7KFYDheWmFEaGybxr3K.png)
+
+[My Leaderboard:](https://huggingface.co/spaces/Steelskull/YALL-Leaderboard)
+
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/64545af5ec40bbbd01242ca6/CF9_p8AWMFraCnfiMa_no.png)
 
 As you know, model merges usually lose some intelligence, especially with Passthrough merging: on the order of 3ish points per billion parameters duped IF you get the right merge. If not, you're looking at a much larger loss (anywhere from 3-8 points per billion duped).
-
+Using DLEC, I was able to increase Phi-2 from 2.78B -> 3.25B parameters with around a single point of loss or less.
 
-This method is still in active development, and
-I am also working on a single layer duping script as merge kit does not currently support this and I am
+This method is still in active development, and I am currently tweaking the algorithm to improve the layer selection process.
+I am also working on a single-layer duping script, as Mergekit does not currently support this, so I am merging layers that are unneeded and it's degrading performance.
 
 ## 🧩 Configuration
 
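To put the figures in the added Evals text in perspective, here is a rough sketch of the arithmetic. It assumes the "points per billion duped" rule of thumb scales linearly with the number of duplicated parameters, which is only one reading of the README. The function name `expected_loss` is hypothetical, and the rates are the author's rough estimates, not measured values.

```python
# Parameter counts quoted in the README, in billions.
BASE_PARAMS_B = 2.78    # Phi-2 before layer duplication
MERGED_PARAMS_B = 3.25  # after the DLEC merge


def expected_loss(added_params_b: float, points_per_billion: float) -> float:
    """Benchmark points lost under a linear 'points per billion duped' rule (assumption)."""
    return added_params_b * points_per_billion


added_b = MERGED_PARAMS_B - BASE_PARAMS_B   # duplicated parameters, in billions
good_merge = expected_loss(added_b, 3.0)    # "3ish points per billion" case
bad_merge = expected_loss(added_b, 8.0)     # upper end of the "3-8 points" range

print(f"duplicated parameters: {added_b:.2f}B")          # 0.47B
print(f"expected loss, good merge: {good_merge:.2f}")    # 1.41
print(f"expected loss, poor merge: {bad_merge:.2f}")     # 3.76
```

Under that linear reading, the reported loss of about one point or less for roughly 0.47B duplicated parameters is slightly below what the 3-points-per-billion rule of thumb would give.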