SteelStorage/phi-2-DLEC

Text Generation

abacaj/phi-2-super

text-generation-inference

Inference Endpoints

Steelskull committed on Mar 21, 2024 · Commit 409f695 (verified) · 1 Parent(s): 2ac61c5

Update README.md

Files changed (1)

README.md +11 -0

@@ -56,6 +56,17 @@ Currently, I am still limited to Mergekit, for this method, which does not suppo
 # This method is still in development. I do not expect it to be "game changing", nor will I oversell it; it is done purely for fun. Please let me know how the model works for you.
+## ⚙️ Evals
+My Leaderboard:
+https://huggingface.co/spaces/Steelskull/YALL-Leaderboard
+As you know, model merges usually lose some intelligence, especially passthrough merges: roughly 3 points per billion duplicated parameters IF you get the right merge, and a much larger loss if you don't (anywhere from 3-8 points per billion duplicated).
+With DLEC, I was able to increase Phi-2 from 2.78b -> 3.25b with a loss of one point or less.
+This method is still in active development, and I am currently tweaking the algorithm to improve the layer selection process.
+I am also working on a single-layer duplication script, since Mergekit does not currently support this; I am being forced to merge layers that are not needed, and it's degrading performance.
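
To put the figures above in context, here is a quick back-of-the-envelope sketch of the loss that the quoted rule of thumb would predict for this size increase. It assumes "points" means average leaderboard score points and that the loss scales linearly with the number of duplicated parameters; the function and variable names are hypothetical and only illustrate the arithmetic.

```python
# Parameter counts quoted in the README, in billions.
BASE_PARAMS_B = 2.78
MERGED_PARAMS_B = 3.25


def expected_loss(points_per_billion: float, added_billions: float) -> float:
    """Predicted score loss, assuming loss scales linearly with duplicated parameters."""
    return points_per_billion * added_billions


# Parameters added by duplicating layers (about 0.47 billion).
added_b = MERGED_PARAMS_B - BASE_PARAMS_B

# Rule-of-thumb range from the README: 3 points per billion for a good
# passthrough merge, up to 8 points per billion for a bad one.
good_merge_loss = expected_loss(3.0, added_b)
bad_merge_loss = expected_loss(8.0, added_b)

print(f"added parameters: {added_b:.2f}b")
print(f"predicted loss (good merge): {good_merge_loss:.2f} points")
print(f"predicted loss (bad merge):  {bad_merge_loss:.2f} points")
```

Under these assumptions, a typical passthrough merge of this size would be expected to lose roughly 1.4 to 3.8 points, so the reported loss of one point or less sits below the low end of that range.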
 ## 🧩 Configuration
 ```yaml