---
license: apache-2.0
language:
- en
---
<p align="center"> <img src="https://cdn-lfs-us-1.huggingface.co/repos/58/11/5811c78d8fc8a7e29f637f442dc17b5fdc3ee97e6ce5e3ead6c9eaeed704e08f/12a4f1bdfdaabdc5114d8e72465b60c97c5e2037a7d5c22ff5fd53cfa80e58ab?response-content-disposition=inline%3B+filename*%3DUTF-8%27%27DeepSeek-Mixtral.png%3B+filename%3D%22DeepSeek-Mixtral.png%22%3B&response-content-type=image%2Fpng&Expires=1715149068&Policy=eyJTdGF0ZW1lbnQiOlt7IkNvbmRpdGlvbiI6eyJEYXRlTGVzc1RoYW4iOnsiQVdTOkVwb2NoVGltZSI6MTcxNTE0OTA2OH19LCJSZXNvdXJjZSI6Imh0dHBzOi8vY2RuLWxmcy11cy0xLmh1Z2dpbmdmYWNlLmNvL3JlcG9zLzU4LzExLzU4MTFjNzhkOGZjOGE3ZTI5ZjYzN2Y0NDJkYzE3YjVmZGMzZWU5N2U2Y2U1ZTNlYWQ2YzllYWVlZDcwNGUwOGYvMTJhNGYxYmRmZGFhYmRjNTExNGQ4ZTcyNDY1YjYwYzk3YzVlMjAzN2E3ZDVjMjJmZjVmZDUzY2ZhODBlNThhYj9yZXNwb25zZS1jb250ZW50LWRpc3Bvc2l0aW9uPSomcmVzcG9uc2UtY29udGVudC10eXBlPSoifV19&Signature=RZNhKGJhLpnd2j%7EeImEHG8wBlntw9yJ6xJcNcbQXdngDetFhqFK46fJ3ndgzAoxbgwSHrgpYTdAR9ZSzinuY8TuvUgXEX64dZhvmgLIzcfdqfMIKOOg4XME45rZpWdQApAn%7EsSGNNwJPGvXh3MHXPjo0fOxiCf5zSPNl342EInA8FY%7E2jXEykwrfAK5OBWpbEi65WSbBSs6r3ob-66dURDEKfvfPN22VMvAYfiBiajvo6tQcL8cQOK5BWeQcsAZCDOTSxljD8--g2nXU2pl5WXh6Kv74szFWA4zEL7GOaZLRNdcTUHQmxen6144xngrv%7ERnd2jTRNCpH27M7rbGpvA__&Key-Pair-Id=KCD77M1F0VK2B" width="auto" title="LlaMoE-Medium model image"> </p>


## Mixtral Experts with DeepSeek-MoE Architecture

[![Discord](https://img.shields.io/discord/1156064224225808488?logo=Discord&logoColor=%23ffffff&label=Discord&link=https%3A%2F%2Fdiscord.gg%2FtCMkMDDHwm)](https://discord.gg/cognitivecomputations)
Discord: https://discord.gg/cognitivecomputations

This is a direct extraction of the 8 experts from [Mixtral-8x7b-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1), transferred into the DeepSeek-MoE architecture.

- **Expert Configuration:** The model routes 2 experts per token, matching the original Mixtral setup (a quick config check is sketched after this list).
- **Performance:** Performance is identical to Mixtral-8x7b-Instruct-v0.1, if not a little better.
- **Evaluations:** Evals will come when compute clears up; the model also appears more malleable to training.
- **Experimentation:** This is the first of a few MoE expert extraction and modification projects we're working on, with more to come. Enjoy.
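
As a rough sanity check of the routing configuration, the sketch below loads the repo's config and prints the expert counts. The attribute names (`num_experts_per_tok`, `n_routed_experts`) are assumed from the DeepSeek-MoE configuration and are not confirmed by this card; `getattr` is used so the script still runs if they differ.

```python
from transformers import AutoConfig

# Load the custom DeepSeek-MoE style config shipped with this repo.
# trust_remote_code is required because the architecture is custom.
config = AutoConfig.from_pretrained(
    "cognitivecomputations/DeepMixtral-8x7b-Instruct",
    trust_remote_code=True,
)

# Attribute names assumed from the DeepSeek-MoE config; adjust if they differ.
print("experts per token:", getattr(config, "num_experts_per_tok", None))  # expected: 2
print("routed experts:   ", getattr(config, "n_routed_experts", None))     # expected: 8
```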

## Instruction Format
To leverage the instruction fine-tuning, prompts should be wrapped in `[INST]` and `[/INST]` tokens. The very first instruction should begin with a beginning-of-sentence token id, while subsequent instructions should not. Each assistant generation ends with an end-of-sentence token id.

### Example
```plaintext
text = "<s>[INST] What is your favourite condiment? [/INST]"
"Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!</s>"
"[INST] Do you have mayonnaise recipes? [/INST]"
```
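
If you prefer to assemble the prompt by hand rather than through the tokenizer, a minimal helper along these lines reproduces the format above. The function name and message structure are illustrative; the chat template method in the next section is the more robust option.

```python
def build_prompt(messages):
    """Format a list of {"role": ..., "content": ...} dicts into the [INST] prompt format.

    The BOS token (<s>) is emitted once, at the start of the first instruction;
    each assistant turn is closed with the EOS token (</s>).
    """
    prompt = "<s>"
    for message in messages:
        if message["role"] == "user":
            prompt += f"[INST] {message['content']} [/INST]"
        elif message["role"] == "assistant":
            prompt += f"{message['content']}</s>"
    return prompt
```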

### Applying the Chat Template
This format can be implemented using the `apply_chat_template()` method from the `transformers` library:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"  # the device to move the inputs onto

# Load the model and tokenizer; trust_remote_code is needed for the custom
# DeepSeek-MoE architecture, and device_map="auto" places the weights across
# the available GPUs (no separate model.to(device) call is needed).
model = AutoModelForCausalLM.from_pretrained(
    "cognitivecomputations/DeepMixtral-8x7b-Instruct",
    trust_remote_code=True,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("cognitivecomputations/DeepMixtral-8x7b-Instruct")

# Define the conversation messages
messages = [
    {"role": "user", "content": "What is your favourite condiment?"},
    {"role": "assistant", "content": "Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!"},
    {"role": "user", "content": "Do you have mayonnaise recipes?"}
]

# Apply the chat template to get input ids, then move them to the GPU
encodeds = tokenizer.apply_chat_template(messages, return_tensors="pt")
model_inputs = encodeds.to(device)

# Generate a response and decode it back to text
generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])
```
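
Note that `do_sample=True` makes generation stochastic, so outputs will vary between runs; set it to `False` for greedy, reproducible decoding, and adjust `max_new_tokens` to control response length.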

Special thanks to Eric Hartford and Fernando Neto.

- Lucas Atkins (Crystalcareai)