macadeliccc committed
Commit 7a42420 • Parent(s): 1ed3a5f

Update README.md

README.md CHANGED
@@ -10,7 +10,7 @@ Credit to Fernando Fernandes and Eric Hartford for their project [laserRMT](http

 This model is a medium-sized MoE implementation based on [cognitivecomputations/dolphin-2.6-mistral-7b-dpo-laser](https://huggingface.co/cognitivecomputations/dolphin-2.6-mistral-7b-dpo-laser)

-A 2x7b configuration offers better performance than a standard 7b model even if loaded in 4 bit.
+A 2x7b configuration offers better performance than a standard 7b model even if loaded in 4 bit. (9G VRAM)

 If this 2x7b model is loaded in 4 bit the hellaswag score is .8270 which is higher than the base model achieves on its own in full precision.
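For reference, a minimal sketch of loading a 2x7b model like this in 4 bit with transformers and bitsandbytes, which is the setup the README's 9G VRAM and hellaswag figures refer to. The repo id below is a placeholder assumption, not taken from this commit; substitute the actual model path.

```python
# Minimal sketch: 4-bit loading with transformers + bitsandbytes.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "macadeliccc/<this-2x7b-model>"  # placeholder repo id (assumption)

# NF4 4-bit quantization; the README reports roughly 9G of VRAM in this mode.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

prompt = "Explain mixture-of-experts models in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```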