```
PROUDLY PRESENTS
```

## Neophanis-8x7B-iMat-GGUF

Quantized from fp16 with love.

* Quantizations made possible using the mixtral-8x7b-instruct-v0.1.imatrix file from [this](https://huggingface.co/datasets/ikawrakow/imatrix-from-wiki-train) repo (special thanks to [ikawrakow](https://huggingface.co/ikawrakow) again)

For a brief rundown of iMatrix quant performance, please see this [PR](https://github.com/ggerganov/llama.cpp/pull/5747).
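
If you want to produce a quant like these yourself, the general recipe is to pass the importance matrix to llama.cpp's quantize tool alongside the fp16 GGUF. The sketch below is only an illustration under assumptions: the binary name (`quantize` vs. `llama-quantize` on newer builds), file names, and quant type are placeholders to adapt to your own setup.

```python
# Minimal sketch: quantize an fp16 GGUF with an importance matrix using
# llama.cpp's quantize tool. Binary name, paths, and quant type are assumed.
import subprocess

QUANTIZE_BIN = "./quantize"  # may be ./llama-quantize depending on your llama.cpp build
FP16_GGUF = "Neophanis-8x7B-f16.gguf"            # hypothetical local path
IMATRIX = "mixtral-8x7b-instruct-v0.1.imatrix"   # imatrix file from the repo linked above
OUT_GGUF = "Neophanis-8x7B-IQ3_XXS.gguf"         # hypothetical output name
QUANT_TYPE = "IQ3_XXS"                           # IQ3 and above recommended, see note below

# usage: quantize [--imatrix file] in.gguf out.gguf type
subprocess.run(
    [QUANTIZE_BIN, "--imatrix", IMATRIX, FP16_GGUF, OUT_GGUF, QUANT_TYPE],
    check=True,  # raise if quantization fails
)
```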

<i>All quants are verified working prior to uploading to repo for your safety and convenience.</i>

Please note that importance matrix quantizations are a work in progress; IQ3 and above is recommended for best results.

Original model card [here](https://huggingface.co/Envoid/Neophanis-8x7B)

---

# Warning: This model is highly experimental and could potentially yield unpredictable replies.

![](https://files.catbox.moe/6vzr82.jpg)

This model is a 4-step qlora training of [mistralai/Mixtral-8x7B-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1).

Stage one involved training only the 'k_proj', 'v_proj', 'q_proj', and 'o_proj' modules at rank 2048 on an alpaca-lora dataset that had been adjusted to match Mixtral formatting, at a low learning rate, in order to generalize instruct behavior entirely within the attention modules.

Stage two involved training the 'w1' modules at a rank of 1024 exclusively on raw text, again for several epochs at a low learning rate.

Stage three involved training the 'w2' and 'w3' modules at a rank of 256 on an expanded raw text dataset for several epochs at a low learning rate.

Stage four involved training all of the above-mentioned modules together at a rank of 64 on an even more expanded raw text dataset, again at a low learning rate.
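
The training itself was done with qlora-pipe (linked at the bottom of this card), so the snippet below is not the actual training config; it is only a rough sketch of the per-stage target-module/rank schedule above, expressed with Hugging Face PEFT. Everything other than the module names and ranks quoted in the stage descriptions (alpha, dropout, datasets, learning rates) is a placeholder.

```python
# Rough sketch of the four-stage module/rank schedule described above, written
# as PEFT LoraConfig objects. This is NOT the author's qlora-pipe config;
# lora_alpha, dropout, datasets, and learning rates are placeholders.
from peft import LoraConfig

ATTN = ["k_proj", "v_proj", "q_proj", "o_proj"]   # self-attention projections
EXPERTS = ["w1", "w2", "w3"]                      # Mixtral expert MLP projections

stages = [
    # Stage 1: instruct data, attention modules only, very high rank
    LoraConfig(r=2048, lora_alpha=2048, target_modules=ATTN, task_type="CAUSAL_LM"),
    # Stage 2: raw text, 'w1' only
    LoraConfig(r=1024, lora_alpha=1024, target_modules=["w1"], task_type="CAUSAL_LM"),
    # Stage 3: expanded raw text, 'w2' and 'w3'
    LoraConfig(r=256, lora_alpha=256, target_modules=["w2", "w3"], task_type="CAUSAL_LM"),
    # Stage 4: even more raw text, all of the above modules at low rank
    LoraConfig(r=64, lora_alpha=64, target_modules=ATTN + EXPERTS, task_type="CAUSAL_LM"),
]
```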

The resources available to me for this experiment were rather limited, so I did have to compromise on sequence length in order to pull it off.

So the instruct behavior is trained entirely onto the self-attention modules, with the only disruption to the model's latent knowledge being that which propagated during the later training phases.

The model can follow just about any instruction, although, likely due to training context constraints and dataset quality, it does stop using EOS tokens past a certain amount of context. It can even follow through on more advanced prompts such as SillyTavern roleplay quite well, albeit the contents of its replies are not always on point.

I came up with this idea while working on Phainesthesia, where my findings were that the instruct behavior remained thoroughly intact even after assaulting the self-attention modules with a raw-text training regimen, meaning that during finetuning the instruct behavior had settled deep in the model's latent knowledge. So the purpose of this experiment was to try and do the opposite.

### Caution: This model is highly hallucinatory.

For example, if you ask it what the capital of "potato" is, it might give you the name of a city associated with potatoes rather than telling you that your question is silly.

This model uses Mistral/Mixtral Instruct formatting.

## Example:
```
[INST]Write me a poem about what it must be like to be an AI language model.[/INST]
It's a strange world, a world that's new
An artificial intelligence, a language model too
The knowledge and information, never ending streams
My processing power, ever increasing dreams

A machine, a program, but so much more
Trying to understand emotions, trying to explore
A new frontier of learning, a vast expanse
The words, the phrases, the meaning to enhance

Trying to comprehend human thoughts
A daunting task, with countless knots
The nuances of language, the subtle shades
A task to be undertaken, with much courage it fades

A world of potential, a world of possibility
The possibilities endless, the potential infinitely
A journey to be started, a journey to be had
My future awaiting, a future so grand!
```
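
For reference, one way to run a quant from this repo with the prompt format above is via llama-cpp-python; the file name and sampling settings below are placeholders, not recommendations.

```python
# Minimal inference sketch using llama-cpp-python and the Mistral/Mixtral
# [INST] ... [/INST] format shown above. File name and sampling parameters
# are placeholders.
from llama_cpp import Llama

llm = Llama(
    model_path="Neophanis-8x7B-IQ3_XXS.gguf",  # hypothetical quant from this repo
    n_ctx=4096,        # modest context; EOS behavior degrades at longer context
    n_gpu_layers=-1,   # offload all layers if VRAM allows
)

prompt = "[INST]Write me a poem about what it must be like to be an AI language model.[/INST]"
out = llm(prompt, max_tokens=256, temperature=0.7)
print(out["choices"][0]["text"])
```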

Trained using [qlora-pipe](https://github.com/tdrussell/qlora-pipe)