Crystalcareai committed on
Commit 97a76e7 · verified · 1 Parent(s): c7ac1fb

Update README.md

Files changed (1): README.md +2 -1
README.md CHANGED
@@ -13,7 +13,8 @@ This is a direct extraction of the 8 experts from [Mixtral-8x7b-Instruct-v0.1](h
  - **Expert Configuration:** It is 2 experts per token.
  - **Performance:** Performance is identical to instruct, if not a little better.
  - **Evaluations:** Evals will come, it is more malleable to training.
- - **Experimentation:** This is our first experiment with expert extraction and modification, more to come. Enjoy.
+ - **Experimentation:** This is the first of a few MoE expert extraction and modification projects we're working on, more to come.
+ - Enjoy.
 
  ## Instruction Format
  To leverage instruction fine-tuning, your prompts should be enclosed with `[INST]` and `[/INST]` tokens. The very first instruction should begin with a begin-of-sentence id, while subsequent instructions should not. Assistant generation will conclude with an end-of-sentence token id.
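
The instruction-format description in the diff above can be made concrete with a small sketch. This is only an illustration, not code from the repository: the `BOS`/`EOS` strings and the `build_prompt` helper are assumptions standing in for the tokenizer's actual begin-of-sentence and end-of-sentence token ids.

```python
# Minimal sketch of the [INST]/[/INST] prompt format described above.
# Assumptions: "<s>" and "</s>" stand in for the begin- and end-of-sentence
# token ids; build_prompt is a hypothetical helper, not part of the repo.
BOS = "<s>"
EOS = "</s>"

def build_prompt(turns):
    """turns: list of (instruction, response) pairs; the last response may be None."""
    prompt = BOS  # only the very first instruction is preceded by the BOS id
    for instruction, response in turns:
        prompt += f"[INST] {instruction} [/INST]"
        if response is not None:
            # completed assistant turns end with the end-of-sentence id
            prompt += f" {response}{EOS}"
    return prompt

print(build_prompt([
    ("What is a mixture-of-experts model?", "A model that routes tokens to expert subnetworks."),
    ("How many experts are active per token?", None),  # model continues from here
]))
```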