Crystalcareai committed · Commit c9e7b81 (verified) · Parent: fa057a9

Update README.md

Files changed (1): README.md (+6, -0)
README.md CHANGED
@@ -7,3 +7,9 @@ language:
 
 
 This is a direct extraction of the 8 experts from [Mixtral-8x7b-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1), placed into the DeepSeek-MoE architecture.
+It uses 2 experts per token. Performance is good, and the model is likely more malleable to training. This is our first experiment with expert extraction and modification; more to come. Enjoy.
+
+Special thanks: Eric Hartford and Fernando Neto.
+
+-Lucas Atkins (Crystalcareai)
+
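For readers who want to try the resulting checkpoint, here is a minimal loading sketch, not documented usage from the authors. The repo id below is a hypothetical placeholder, `trust_remote_code=True` is assumed because DeepSeek-MoE checkpoints generally ship custom modeling code, and `num_experts_per_tok` is an assumption about the config key that exposes the 2-experts-per-token routing.

```python
# Minimal sketch of loading the extracted model with Hugging Face Transformers.
# Assumptions: the repo id is a placeholder, DeepSeek-MoE checkpoints require
# trust_remote_code, and the config exposes routing width as num_experts_per_tok.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "Crystalcareai/mixtral-deepseek-moe"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    trust_remote_code=True,  # DeepSeek-MoE uses custom modeling code (assumption)
    torch_dtype="auto",
)

# The README states 2 of the 8 extracted experts are routed per token.
print(getattr(model.config, "num_experts_per_tok", None))  # expected: 2

inputs = tokenizer("Hello, world!", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```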