Update README.md

05fe7c8 verified 11 months ago

4.11 kB

	---
	license: apache-2.0
	language:
	- en
	---
	<p align="center"> <img src="https://cdn-lfs-us-1.huggingface.co/repos/58/11/5811c78d8fc8a7e29f637f442dc17b5fdc3ee97e6ce5e3ead6c9eaeed704e08f/12a4f1bdfdaabdc5114d8e72465b60c97c5e2037a7d5c22ff5fd53cfa80e58ab?response-content-disposition=inline%3B+filename*%3DUTF-8%27%27DeepSeek-Mixtral.png%3B+filename%3D%22DeepSeek-Mixtral.png%22%3B&response-content-type=image%2Fpng&Expires=1715149068&Policy=eyJTdGF0ZW1lbnQiOlt7IkNvbmRpdGlvbiI6eyJEYXRlTGVzc1RoYW4iOnsiQVdTOkVwb2NoVGltZSI6MTcxNTE0OTA2OH19LCJSZXNvdXJjZSI6Imh0dHBzOi8vY2RuLWxmcy11cy0xLmh1Z2dpbmdmYWNlLmNvL3JlcG9zLzU4LzExLzU4MTFjNzhkOGZjOGE3ZTI5ZjYzN2Y0NDJkYzE3YjVmZGMzZWU5N2U2Y2U1ZTNlYWQ2YzllYWVlZDcwNGUwOGYvMTJhNGYxYmRmZGFhYmRjNTExNGQ4ZTcyNDY1YjYwYzk3YzVlMjAzN2E3ZDVjMjJmZjVmZDUzY2ZhODBlNThhYj9yZXNwb25zZS1jb250ZW50LWRpc3Bvc2l0aW9uPSomcmVzcG9uc2UtY29udGVudC10eXBlPSoifV19&Signature=RZNhKGJhLpnd2j%7EeImEHG8wBlntw9yJ6xJcNcbQXdngDetFhqFK46fJ3ndgzAoxbgwSHrgpYTdAR9ZSzinuY8TuvUgXEX64dZhvmgLIzcfdqfMIKOOg4XME45rZpWdQApAn%7EsSGNNwJPGvXh3MHXPjo0fOxiCf5zSPNl342EInA8FY%7E2jXEykwrfAK5OBWpbEi65WSbBSs6r3ob-66dURDEKfvfPN22VMvAYfiBiajvo6tQcL8cQOK5BWeQcsAZCDOTSxljD8--g2nXU2pl5WXh6Kv74szFWA4zEL7GOaZLRNdcTUHQmxen6144xngrv%7ERnd2jTRNCpH27M7rbGpvA__&Key-Pair-Id=KCD77M1F0VK2B" width="auto" title="LlaMoE-Medium model image"> </p>


	## Mixtral Experts with DeepSeek-MoE Architecture

	[![Discord](https://img.shields.io/discord/1156064224225808488?logo=Discord&logoColor=%23ffffff&label=Discord&link=https%3A%2F%2Fdiscord.gg%2FtCMkMDDHwm)](https://discord.gg/cognitivecomputations)
	Discord: https://discord.gg/cognitivecomputations

	This is a direct extraction of the 8 experts from [Mixtral-8x7b-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1), and a transfer of them into the DeepSeek-MoE Architecture.

	- Expert Configuration: It is 2 experts per token.
	- Performance: Performance is identical to instruct, if not a little better.
	- Evaluations: Evals will come when compute clears up, it also appears more malleable to training.
	- Experimentation: This is the first of a few MoE expert extraction and modification projects we're working on, more to come. Enjoy.

	## Instruction Format
	To leverage instruction fine-tuning, your prompts should be enclosed with `[INST]` and `[/INST]` tokens. The very first instruction should begin with a begin-of-sentence id, while subsequent instructions should not. Assistant generation will conclude with an end-of-sentence token id.

	### Example
	```plaintext
	text = "<s>[INST] What is your favourite condiment? [/INST]"
	"Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!</s>"
	"[INST] Do you have mayonnaise recipes? [/INST]"
	```

	### Applying the Chat Template
	This format can be implemented using the `apply_chat_template()` method from the `transformers` library:

	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer

	device = "cuda" # the device to load the model onto

	# Load the model and tokenizer
	model = AutoModelForCausalLM.from_pretrained("cognitivecomputations/DeepMixtral-8x7b-Instruct", trust_remote_code=True, device_map="auto")
	tokenizer = AutoTokenizer.from_pretrained("cognitivecomputations/DeepMixtral-8x7b-Instruct")

	# Define the conversation messages
	messages = [
	{"role": "user", "content": "What is your favourite condiment?"},
	{"role": "assistant", "content": "Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!"},
	{"role": "user", "content": "Do you have mayonnaise recipes?"}
	]

	# Apply chat template
	encodeds = tokenizer.apply_chat_template(messages, return_tensors="pt")
	model_inputs = encodeds.to(device)
	model.to(device)

	# Generate response
	generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True)
	decoded = tokenizer.batch_decode(generated_ids)
	print(decoded[0])
	```

	Special Thanks: Eric Hartford, and Fernando Neto.

	- Lucas Atkins (Crystalcareai)