Kquant03 committed · Commit 3d0d6eb · 1 Parent(s): 22a16ee

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -40,7 +40,7 @@ If all our tokens are sent to just a few popular experts, that will make trainin
  The difference between MoE and "frankenMoE" lies in the fact that the router layer in a model like the one on this repo is not trained simultaneously. There are rumors about someone developing a way for us to unscuff these frankenMoE models by training the router layer simultaneously. For now, frankenMoE remains psychotic. Raiden does improve upon the base heegyu/WizardVicuna-Uncensored-3B-0719, though.
  
  ## "Are there at least any datasets or plans for this model, in any way?"
- There are many datasets included as a result of merging four models...for one, Silicon Maid is a merge of xDan which is trained on the [OpenOrca Dataset](https://huggingface.co/datasets/Open-Orca/OpenOrca) and the [OpenOrca DPO pairs](https://huggingface.co/datasets/Intel/orca_dpo_pairs). Loyal-Macaroni-Maid uses OpenChat-3.5, Starling and NeuralChat which has so many datasets I'm not going to list them all here. Dolphin 2.6 Mistral also has a large variety of datasets. Panda-7B-v0.1 was fine tuned from base mistral for the purpose of writing compelling narratives that beat even mistral-instruct.
+ There are many datasets included as a result of merging four models...for one, Silicon Maid is a merge of xDan which is trained on the [OpenOrca Dataset](https://huggingface.co/datasets/Open-Orca/OpenOrca) and the [OpenOrca DPO pairs](https://huggingface.co/datasets/Intel/orca_dpo_pairs). Loyal-Macaroni-Maid uses OpenChat-3.5, Starling and NeuralChat which has so many datasets I'm not going to list them all here. Dolphin 2.6 Mistral also has a large variety of datasets. Panda-7B-v0.1 was fine tuned by the person collaborating on this project with me using a base mistral and a private dataset. Panda gives the model the creativity it has while the rest act as support.
  
  # Results
  ## Some results from the model's performance.