---
license: apache-2.0
language:
- en
---
|
<p align="center"> <img src="https://cdn-lfs-us-1.huggingface.co/repos/58/11/5811c78d8fc8a7e29f637f442dc17b5fdc3ee97e6ce5e3ead6c9eaeed704e08f/12a4f1bdfdaabdc5114d8e72465b60c97c5e2037a7d5c22ff5fd53cfa80e58ab?response-content-disposition=inline%3B+filename*%3DUTF-8%27%27DeepSeek-Mixtral.png%3B+filename%3D%22DeepSeek-Mixtral.png%22%3B&response-content-type=image%2Fpng&Expires=1715149068&Policy=eyJTdGF0ZW1lbnQiOlt7IkNvbmRpdGlvbiI6eyJEYXRlTGVzc1RoYW4iOnsiQVdTOkVwb2NoVGltZSI6MTcxNTE0OTA2OH19LCJSZXNvdXJjZSI6Imh0dHBzOi8vY2RuLWxmcy11cy0xLmh1Z2dpbmdmYWNlLmNvL3JlcG9zLzU4LzExLzU4MTFjNzhkOGZjOGE3ZTI5ZjYzN2Y0NDJkYzE3YjVmZGMzZWU5N2U2Y2U1ZTNlYWQ2YzllYWVlZDcwNGUwOGYvMTJhNGYxYmRmZGFhYmRjNTExNGQ4ZTcyNDY1YjYwYzk3YzVlMjAzN2E3ZDVjMjJmZjVmZDUzY2ZhODBlNThhYj9yZXNwb25zZS1jb250ZW50LWRpc3Bvc2l0aW9uPSomcmVzcG9uc2UtY29udGVudC10eXBlPSoifV19&Signature=RZNhKGJhLpnd2j%7EeImEHG8wBlntw9yJ6xJcNcbQXdngDetFhqFK46fJ3ndgzAoxbgwSHrgpYTdAR9ZSzinuY8TuvUgXEX64dZhvmgLIzcfdqfMIKOOg4XME45rZpWdQApAn%7EsSGNNwJPGvXh3MHXPjo0fOxiCf5zSPNl342EInA8FY%7E2jXEykwrfAK5OBWpbEi65WSbBSs6r3ob-66dURDEKfvfPN22VMvAYfiBiajvo6tQcL8cQOK5BWeQcsAZCDOTSxljD8--g2nXU2pl5WXh6Kv74szFWA4zEL7GOaZLRNdcTUHQmxen6144xngrv%7ERnd2jTRNCpH27M7rbGpvA__&Key-Pair-Id=KCD77M1F0VK2B" width="auto" title="LlaMoE-Medium model image"> </p> |
|
|
|
|
|
## Mixtral Experts with DeepSeek-MoE Architecture

Discord: https://discord.gg/cognitivecomputations

This model is a direct extraction of the 8 experts from [Mixtral-8x7b-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1), transferred into the DeepSeek-MoE architecture.
|
|
|
- **Expert Configuration:** 2 experts are active per token.
- **Performance:** Performance is identical to Mixtral-8x7B-Instruct-v0.1, if not slightly better.
- **Evaluations:** Evals will come when compute clears up. The model also appears more malleable to training.
- **Experimentation:** This is the first of a few MoE expert extraction and modification projects we're working on, with more to come. Enjoy.
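To illustrate what "2 experts per token" means, here is a minimal, generic sketch of top-2 gating in plain Python. It is not this model's routing code: the function name `top_k_route`, the example logits, and the `renormalize` flag are all hypothetical, and whether the selected weights are rescaled to sum to 1 is an assumption that varies between MoE implementations.

```python
import math


def top_k_route(router_logits, k=2, renormalize=True):
    """Return [(expert_index, weight), ...] for the k highest-scoring experts.

    Illustrative only: names and the renormalize option are assumptions,
    not taken from this model's implementation.
    """
    # Softmax over all experts (shifted by the max for numerical stability)
    peak = max(router_logits)
    exps = [math.exp(x - peak) for x in router_logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep the k most probable experts
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]

    if renormalize:
        # Rescale the kept weights so they sum to 1
        kept = sum(probs[i] for i in top)
        return [(i, probs[i] / kept) for i in top]
    return [(i, probs[i]) for i in top]


# Hypothetical router scores for one token across 8 experts
logits = [0.1, 2.0, -1.0, 0.5, 1.5, -0.3, 0.0, 0.7]
routes = top_k_route(logits)
print(routes)
```

Only the two selected experts would run for this token, and their outputs would be combined using the returned weights.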
|
|
|
## Instruction Format

To leverage instruction fine-tuning, enclose each prompt in `[INST]` and `[/INST]` tokens. Only the very first instruction should begin with a begin-of-sentence (BOS) token id; subsequent instructions should not. Each assistant generation ends with an end-of-sentence (EOS) token id.
|
|
|
### Example

```plaintext
text = "<s>[INST] What is your favourite condiment? [/INST]"
"Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!</s>"
"[INST] Do you have mayonnaise recipes? [/INST]"
```
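As a rough sketch of how the format above could be assembled by hand, the helper below renders a list of messages into a single prompt string. The function name `build_prompt` is hypothetical, and the exact whitespace between turns is an assumption; the tokenizer's own chat template remains the authoritative source.

```python
BOS, EOS = "<s>", "</s>"


def build_prompt(messages):
    """Render alternating user/assistant messages into the [INST] format.

    Hypothetical helper for illustration; spacing between turns is assumed.
    """
    parts = [BOS]  # the begin-of-sentence marker appears once, at the very start
    for i, message in enumerate(messages):
        expected_role = "user" if i % 2 == 0 else "assistant"
        if message["role"] != expected_role:
            raise ValueError("roles must alternate user/assistant, starting with user")
        if message["role"] == "user":
            parts.append(f"[INST] {message['content']} [/INST]")
        else:
            # Each assistant turn ends with the end-of-sentence marker
            parts.append(f"{message['content']}{EOS}")
    return "".join(parts)


messages = [
    {"role": "user", "content": "What is your favourite condiment?"},
    {"role": "assistant", "content": "Well, I'm quite partial to a good squeeze of fresh lemon juice."},
    {"role": "user", "content": "Do you have mayonnaise recipes?"},
]
prompt = build_prompt(messages)
print(prompt)
```

If you tokenize a string like this yourself, check whether your tokenizer also prepends its own BOS token, since that would duplicate the leading `<s>`.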
|
|
|
### Applying the Chat Template

This format can be implemented using the `apply_chat_template()` method from the `transformers` library:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "cognitivecomputations/DeepMixtral-8x7b-Instruct"

# Load the model and tokenizer; device_map="auto" places the weights on the available devices
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Define the conversation messages
messages = [
    {"role": "user", "content": "What is your favourite condiment?"},
    {"role": "assistant", "content": "Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!"},
    {"role": "user", "content": "Do you have mayonnaise recipes?"}
]

# Apply the chat template and move the token ids to the model's device
# (the model is already placed by device_map="auto", so no model.to() call is needed)
encodeds = tokenizer.apply_chat_template(messages, return_tensors="pt")
model_inputs = encodeds.to(model.device)

# Generate a response
generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])
```
|
|
|
Special thanks: Eric Hartford and Fernando Neto.

- Lucas Atkins (Crystalcareai)
|
|
|
|