Upload folder using huggingface_hub
README.md
CHANGED
@@ -15,9 +15,9 @@ tags:
 
 
 
-# phixtral-
+# phixtral-3x2_8
 
-phixtral-
+phixtral-3x2_8 is the first Mixture of Experts (MoE) made with two [microsoft/phi-2](https://huggingface.co/microsoft/phi-2) models, inspired by the [mistralai/Mixtral-8x7B-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1) architecture. It performs better than each individual expert.
 
 You can try it out using this [Space](https://huggingface.co/spaces/mlabonne/phixtral-chat).
 
@@ -25,12 +25,7 @@ You can try it out using this [Space](https://huggingface.co/spaces/mlabonne/phi
 
 The evaluation was performed using [LLM AutoEval](https://github.com/mlabonne/llm-autoeval) on Nous suite.
 
-
-|----------------------------------------------------------------|------:|------:|---------:|-------:|------:|
-|[**phixtral-2x2_8**](https://huggingface.co/mlabonne/phixtral-2x2_8)| **34.1**| **70.44**| **48.78**| **37.82**| **47.78**|
-|[dolphin-2_6-phi-2](https://huggingface.co/cognitivecomputations/dolphin-2_6-phi-2)| 33.12| 69.85| 47.39| 37.2| 46.89|
-|[phi-2-dpo](https://huggingface.co/lxuechen/phi-2-dpo)| 30.39| 71.68| 50.75| 34.9| 46.93|
-|[phi-2](https://huggingface.co/microsoft/phi-2)| 27.98| 70.8| 44.43| 35.21| 44.61|
+TBD
 
 Check [YALL - Yet Another LLM Leaderboard](https://huggingface.co/spaces/mlabonne/Yet_Another_LLM_Leaderboard) to compare it with other models.
 
@@ -58,7 +53,7 @@ Here's a [Colab notebook](https://colab.research.google.com/drive/1k6C_oJfEKUq0m
 import torch
 from transformers import AutoModelForCausalLM, AutoTokenizer
 
-model_name = "phixtral-
+model_name = "phixtral-3x2_8"
 instruction = '''
 def print_prime(n):
 """
@@ -95,9 +90,9 @@ text = tokenizer.batch_decode(outputs)[0]
 print(text)
 ```
 
-Inspired by [mistralai/Mixtral-8x7B-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1), you can specify the `num_experts_per_tok` and `num_local_experts` in the [`config.json`](https://huggingface.co/mlabonne/phixtral-
+Inspired by [mistralai/Mixtral-8x7B-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1), you can specify the `num_experts_per_tok` and `num_local_experts` in the [`config.json`](https://huggingface.co/mlabonne/phixtral-3x2_8/blob/main/config.json#L26-L27) file (2 for both by default). This configuration is automatically loaded in `configuration.py`.
 
-[vince62s](https://huggingface.co/vince62s) implemented the MoE inference code in the `modeling_phi.py` file. In particular, see the [MoE class](https://huggingface.co/mlabonne/phixtral-
+[vince62s](https://huggingface.co/vince62s) implemented the MoE inference code in the `modeling_phi.py` file. In particular, see the [MoE class](https://huggingface.co/mlabonne/phixtral-3x2_8/blob/main/modeling_phi.py#L293-L317).
 
 ## 🤝 Acknowledgments
 
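
Note on the two MoE settings named in the README change: the sketch below illustrates how `num_experts_per_tok` and `num_local_experts` typically interact in Mixtral-style top-k routing. It is not the code from `modeling_phi.py`. The function name `route_token`, the example logits, and the choice to apply softmax over only the selected logits are assumptions made for illustration.

```python
import math


def route_token(router_logits, num_experts_per_tok):
    """Pick the top-k experts for one token and return (expert_index, weight) pairs.

    Hypothetical helper: len(router_logits) plays the role of num_local_experts.
    """
    num_local_experts = len(router_logits)
    if not 1 <= num_experts_per_tok <= num_local_experts:
        raise ValueError("num_experts_per_tok must be between 1 and num_local_experts")

    # Indices of the highest-scoring experts, best first.
    ranked = sorted(range(num_local_experts), key=lambda i: router_logits[i], reverse=True)
    top = ranked[:num_experts_per_tok]

    # Softmax over the selected logits only (an assumption), so the weights sum to 1.
    peak = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


# Three local experts, two active per token.
routing = route_token([0.1, 2.0, -1.0], num_experts_per_tok=2)
for expert_index, weight in routing:
    print(f"expert {expert_index}: weight {weight:.3f}")
```

In this sketch, raising `num_experts_per_tok` runs more experts per token and so costs more compute, and the value cannot exceed `num_local_experts`.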