Update README.md
README.md
CHANGED
@@ -56,6 +56,17 @@ Currently, I am still limited to Mergekit, for this method, which does not suppo

# This method is still in development. I do not expect it to be "game changing", nor will I oversell it; it is purely done for fun. Please let me know how the model works for you.

+## ⚙️ Evals
+
+My Leaderboard:
+https://huggingface.co/spaces/Steelskull/YALL-Leaderboard
+
+As you know, model merging usually costs some intelligence, especially with passthrough merging: on the order of 3 points per billion parameters duplicated if you get the merge right, and a much larger loss if you don't (anywhere from 3 to 8 points per billion duplicated).
+With DLEC, I was able to grow Phi-2 from 2.78B to 3.25B parameters with a loss of one point or less.
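For a rough sense of scale, the figures above can be turned into a back-of-envelope estimate. This is only a sketch: the helper name `expected_loss` is hypothetical, and it assumes the loss grows linearly with the number of duplicated parameters, which the numbers above suggest but do not prove.

```python
# Back-of-envelope estimate using only the figures quoted in this README.
# Assumption: loss scales linearly with the number of duplicated parameters.


def expected_loss(params_before_b, params_after_b, points_per_billion):
    """Estimated score loss for a merge that duplicates layers.

    Sizes are in billions of parameters; the rate is points lost per
    billion duplicated parameters.
    """
    duplicated_b = params_after_b - params_before_b
    return duplicated_b * points_per_billion


if __name__ == "__main__":
    before_b, after_b = 2.78, 3.25  # Phi-2 sizes quoted above
    for rate in (3, 8):  # good-merge rate and upper-end rate quoted above
        loss = expected_loss(before_b, after_b, rate)
        print(f"{rate} points per billion -> about {loss:.2f} points lost")
```

By that yardstick, the 0.47B parameters added to Phi-2 would be expected to cost roughly 1.4 to 3.8 points with a plain passthrough merge, compared with the one point or less reported for DLEC.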
+
+This method is still in active development, and I am currently tweaking the algorithm to improve the layer selection process.
+I am also working on a single-layer duplication script, since Mergekit does not currently support this; for now I am forced to merge layers that are not needed, which degrades performance.
+
## 🧩 Configuration

```yaml