Update README.md
README.md
CHANGED
@@ -40,7 +40,7 @@ If all our tokens are sent to just a few popular experts, that will make trainin
 The difference between MoE and "frankenMoE" lies in the fact that the router layer in a model like the one in this repo is not trained simultaneously. There are rumors about someone developing a way for us to unscuff these frankenMoE models by training the router layer simultaneously. For now, frankenMoE remains psychotic. Raiden does improve upon the base heegyu/WizardVicuna-Uncensored-3B-0719, though.

 ## "Are there at least any datasets or plans for this model, in any way?"
-There are many datasets included as a result of merging four models... For one, Silicon Maid is a merge of xDan, which is trained on the [OpenOrca Dataset](https://huggingface.co/datasets/Open-Orca/OpenOrca) and the [OpenOrca DPO pairs](https://huggingface.co/datasets/Intel/orca_dpo_pairs). Loyal-Macaroni-Maid uses OpenChat-3.5, Starling, and NeuralChat, which have so many datasets that I'm not going to list them all here. Dolphin 2.6 Mistral also has a large variety of datasets. Panda-7B-v0.1 was fine tuned
+There are many datasets included as a result of merging four models... For one, Silicon Maid is a merge of xDan, which is trained on the [OpenOrca Dataset](https://huggingface.co/datasets/Open-Orca/OpenOrca) and the [OpenOrca DPO pairs](https://huggingface.co/datasets/Intel/orca_dpo_pairs). Loyal-Macaroni-Maid uses OpenChat-3.5, Starling, and NeuralChat, which have so many datasets that I'm not going to list them all here. Dolphin 2.6 Mistral also has a large variety of datasets. Panda-7B-v0.1 was fine-tuned by the person collaborating with me on this project, using a base Mistral and a private dataset. Panda gives the model its creativity, while the rest act as support.

 # Results
 ## Some results from the model's performance.
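The routing problem the hunk header alludes to (all tokens being sent to a few popular experts, and a frankenMoE's router not being trained jointly with the experts) can be illustrated with a minimal top-k gating sketch. This is a toy, hypothetical example: the expert count, dimensions, and every name here are illustrative assumptions, not this repo's actual code.

```python
# Toy sketch of top-k MoE routing. In a trained MoE the router weights are
# learned jointly with the experts; in a frankenMoE merge they are not, so
# (as with the random weights below) load across experts is often unbalanced.
import math
import random

random.seed(0)

NUM_EXPERTS = 4   # e.g. one expert per merged model (illustrative)
TOP_K = 2         # experts activated per token
DIM = 8           # toy hidden size

# Hypothetical router: one weight vector per expert (untrained = random here).
router = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]

def route(token):
    """Score each expert, softmax the scores, and pick the top-k experts."""
    logits = [sum(w * x for w, x in zip(expert_w, token))
              for expert_w in router]
    m = max(logits)                            # numerically stable softmax
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    probs = [e / z for e in exps]
    top_k = sorted(range(NUM_EXPERTS), key=lambda i: probs[i],
                   reverse=True)[:TOP_K]
    return top_k, probs

# Route a small batch of random tokens and count how often each expert fires.
tokens = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(16)]
counts = [0] * NUM_EXPERTS
for t in tokens:
    chosen, _ = route(t)
    for i in chosen:
        counts[i] += 1

print(counts)  # an untrained router tends to produce a skewed distribution
```

Joint training would add a load-balancing pressure on those counts; the untrained router of a frankenMoE has no such pressure, which is the inefficiency the hunk header describes.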