Models upscaled with the Block Expansion (BE) method. Unlike the more common Depth Up-Scaling (DUS), BE doesn't require fine-tuning to recover lost performance.
- Pretergeek/OpenChat-3.5-0106_8.11B_36Layers-Interleaved (Text Generation)
- Pretergeek/OpenChat-3.5-0106_8.99B_40Layers-Interleaved (Text Generation)
- Pretergeek/OpenChat-3.5-0106_10.7B_48Layers-Interleaved (Text Generation)
- Pretergeek/OpenChat-3.5-0106_8.11B_36Layers-Appended (Text Generation)
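
The "Interleaved" and "Appended" suffixes in the model names suggest two ways of placing the added blocks. The sketch below is only an illustration of that idea, not the author's actual procedure. It assumes a 32-layer base model, that each new block is a copy of an existing block whose output projection is zeroed so that it initially acts as an identity, and that interleaving spaces the new blocks evenly. The names `expanded_layout` and `residual_block` are hypothetical helpers invented for this example.

```python
def expanded_layout(num_original, num_new, mode="interleaved"):
    """Return labels describing the expanded layer stack.

    "orig:i" is original block i; "new:i" is an added block copied from
    original block i (assumed to have a zeroed output projection).
    """
    if num_new <= 0 or num_original <= 0:
        raise ValueError("layer counts must be positive")
    if mode == "appended":
        # Assumption: every new block copies the last original block
        # and is placed at the end of the stack.
        originals = [f"orig:{i}" for i in range(num_original)]
        return originals + [f"new:{num_original - 1}"] * num_new
    if mode == "interleaved":
        if num_original % num_new != 0:
            raise ValueError("num_original must be divisible by num_new")
        group = num_original // num_new
        layout = []
        for i in range(num_original):
            layout.append(f"orig:{i}")
            # Assumption: one new block follows each group of originals.
            if (i + 1) % group == 0:
                layout.append(f"new:{i}")
        return layout
    raise ValueError(f"unknown mode: {mode}")


def residual_block(x, w_out):
    """Toy residual block: output = x + w_out * relu(x), elementwise."""
    return [xi + w_out * max(xi, 0.0) for xi in x]


if __name__ == "__main__":
    # 32 original layers is an assumption about the base model.
    for total in (36, 40, 48):
        layout = expanded_layout(32, total - 32, "interleaved")
        print(total, len(layout), layout[:10])
    print(expanded_layout(32, 4, "appended")[-5:])
    # With a zeroed output weight the block returns its input unchanged.
    print(residual_block([1.0, -2.0, 3.5], 0.0))
```

In this toy setting, a block whose residual branch is multiplied by zero passes its input through untouched, which is the intuition behind BE leaving the model's outputs unchanged before any further training.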