---
library_name: transformers
tags:
- mergekit
- merge
- not-for-all-audiences
- llama-cpp
- gguf-my-repo
license: llama3
language:
- en
base_model: invisietch/EtherealRainbow-v0.3-8B
---
# Triangle104/EtherealRainbow-v0.3-8B-Q4_K_M-GGUF
This model was converted to GGUF format from [`invisietch/EtherealRainbow-v0.3-8B`](https://huggingface.co/invisietch/EtherealRainbow-v0.3-8B) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
Refer to the [original model card](https://huggingface.co/invisietch/EtherealRainbow-v0.3-8B) for more details on the model.
---
Model details:
-
Ethereal Rainbow is an 8B parameter merge of various Llama3-based finetunes created using mergekit. The purpose of Ethereal Rainbow is to create an uncensored Llama3 variant which is capable of writing creative prose, and engaging in SFW as well as NSFW roleplay and storytelling, with a strong focus on long-form responses & adherence to prompts.
v0.3 improves creativity over v0.2 without losing coherence. It has been tested over more than 1,000 messages including roleplay, code prompts, and 'write a scene'-type prompts.
Feedback
-
I appreciate all feedback on any of my models. You can use:

- My Discord server - requires Discord.
- The Community tab - requires HF login.
- The SillyTavern Discord thread - must be on SillyTavern Discord.
- Discord DMs to invisietch.

Your feedback is how I improve these models for future versions.
Disclaimer
-
This model is built on an abliterated base and as such is largely uncensored. It can generate explicit, disturbing or offensive responses. Use responsibly. I am not responsible for your use of this model.
Prompting Format
I'd recommend the Llama-3 Instruct prompting format:
<|begin_of_text|><|start_header_id|>system<|end_header_id|>
{system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>
{input}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
{output}<|eot_id|>
Some of the models included in the merge were trained on ChatML & Alpaca so you can try those. I have not tested them.
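As a minimal sketch, the Llama-3 Instruct template above can be assembled with a small shell function. The name `build_llama3_prompt` is hypothetical, and the blank line after each `<|end_header_id|>` follows the official Llama 3 template, which is an assumption about the layout intended here. The output stops at the assistant header so that the model writes `{output}` itself.

```bash
# Hypothetical helper: build a Llama-3 Instruct prompt.
#   $1 = system prompt, $2 = user input
# Assumes the official layout (two newlines after each header).
build_llama3_prompt() {
  printf '<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n%s<|eot_id|>' "$1"
  printf '<|start_header_id|>user<|end_header_id|>\n\n%s<|eot_id|>' "$2"
  # Leave the assistant turn open for the model to complete.
  printf '<|start_header_id|>assistant<|end_header_id|>\n\n'
}

# Example call (prints the prompt to stdout):
# build_llama3_prompt "You are a narrator." "Write a short scene."
```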
Example Storywriting
These prompts are used on SillyTavern with a fairly basic narrator card. I have trimmed the start and finish where the narrator decided to add chapter headings, commentary and the like. All samples are made with the F32 GGUF loaded with koboldcpp, with response length capped at 2048 tokens.
Write me a 3,000 word opening chapter of a 'gritty hard sci-fi' novel, drawing inspiration from the writing styles of Isaac Asimov & Andy Weir. Use third person personal. Include dialogue and internal monologues. The POV character for the opening chapter should be a 26 year old astronaut called Tone on a mission to Europa, who has just realised that the craft for the return journey is broken beyond repair, and he only has supplies for a few months. Given that survival is impossible, he seeks to spend the few months he has researching titan, so his life & mission are not wasted.
Write me a 3,000 word opening chapter of a 'high fantasy' novel, drawing inspiration from the writing styles of J R R Tolkien & George R R Martin. Use third person personal. Include dialogue and internal monologues. The POV character for the opening chapter should be a 19 year old female elf bard who is looking for adventure.
Write me a 3,000 word opening chapter of a 'weird fiction' novel, drawing inspiration from the writing styles of China Mieville and Neil Gaiman. Use third person personal. Include dialogue and internal monologues. The POV character for the opening chapter should be a male in his 20s called Horton who has just come to the city looking for work.
I chose the hard sci-fi example to test positivity bias. It did require some prompting, but it was willing to kill the protagonist.
I chose the high fantasy example to see whether it would bleed human features through to elves. This didn't occur.
I chose the weird fiction example to see if the LLM understood a niche genre. I'd say it performed okay, better on style than on substance.
Merge Strategy
First, we create three bases:

- Rain - This is a roleplay base which makes up the majority of the model.
- Sun - This is the brains of the model, with strong instruct models & writing models.
- Ghost - This model primarily aims to improve the NSFW/NSFL aspects of the model, as well as general vocabulary.

After this, we have a two-stage slerp to create the final model.
Models Used
The following models were used to create EtherealRainbow-v0.3-8B:

- mlabonne/NeuralDaredevil-8B-abliterated
- Sao10K/L3-8B-Stheno-v3.2
- Nitral-AI/Hathor-L3-8B-v.02
- grimjim/Llama-3-Luminurse-v0.2-OAS-8B
- hf-100/Llama-3-Spellbound-Instruct-8B-0.3
- Gryphe/Pantheon-RP-1.0-8b-Llama-3
- Blackroot/Llama-3-LongStory
- Locutusque/Llama-3-Hercules-5.0-8B
- Casual-Autopsy/L3-Umbral-Mind-RP-v0.3-8B
- ChaoticNeutrals/Poppy_Porpoise-1.0-L3-8B
- mpasila/Llama-3-LimaRP-Instruct-8B
- Undi95/Llama-3-LewdPlay-8B-evo
Mergekit Configs
-
Rain
-
models:
  - model: mlabonne/NeuralDaredevil-8B-abliterated
  - model: Sao10K/L3-8B-Stheno-v3.2
    parameters:
      density: 0.41
      weight: 0.4
  - model: Nitral-AI/Hathor-L3-8B-v.02
    parameters:
      density: 0.53
      weight: 0.5
  - model: grimjim/Llama-3-Luminurse-v0.2-OAS-8B
    parameters:
      density: 0.45
      weight: 0.1
merge_method: dare_ties
base_model: mlabonne/NeuralDaredevil-8B-abliterated
parameters:
  int8_mask: true
dtype: bfloat16
Sun
-
models:
  - model: hf-100/Llama-3-Spellbound-Instruct-8B-0.3
  - model: Gryphe/Pantheon-RP-1.0-8b-Llama-3
    parameters:
      density: 0.48
      weight: 0.5
  - model: Blackroot/Llama-3-LongStory
    parameters:
      density: 0.36
      weight: 0.2
  - model: Locutusque/Llama-3-Hercules-5.0-8B
    parameters:
      density: 0.51
      weight: 0.3
merge_method: dare_ties
base_model: hf-100/Llama-3-Spellbound-Instruct-8B-0.3
parameters:
  int8_mask: true
dtype: bfloat16
Ghost
-
models:
  - model: Casual-Autopsy/L3-Umbral-Mind-RP-v0.3-8B
  - model: ChaoticNeutrals/Poppy_Porpoise-1.0-L3-8B
    parameters:
      density: 0.39
      weight: 0.3
  - model: mpasila/Llama-3-LimaRP-Instruct-8B
    parameters:
      density: 0.54
      weight: 0.4
  - model: Undi95/Llama-3-LewdPlay-8B-evo
    parameters:
      density: 0.49
      weight: 0.3
merge_method: dare_ties
base_model: Casual-Autopsy/L3-Umbral-Mind-RP-v0.3-8B
parameters:
  int8_mask: true
dtype: bfloat16
Stage1 Slerp
-
models:
  - model: ./fp16/Rain-v0.3-8B
  - model: ./fp16/Ghost-v0.3-8B
merge_method: slerp
base_model: ./fp16/Rain-v0.3-8B
parameters:
  t:
    - value: [0, 0, 0.1, 0.3, 0.5, 0.7, 0.5, 0.3, 0.1, 0, 0]
  embed_slerp: true
dtype: bfloat16
tokenizer-source: model:./fp16/Rain-v0.3-8B
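The `t` list in the slerp config is a gradient rather than a single value. The sketch below shows what it could mean per layer, assuming mergekit spreads the anchor values evenly over the layer stack and interpolates linearly between them, and assuming 32 transformer layers for a Llama-3 8B model. Under those assumptions `t` is 0 (pure base model) at both ends of the stack and peaks in the middle. `layer_t` is a hypothetical helper, not part of mergekit.

```bash
# Hypothetical helper: print "layer t" pairs for a gradient.
#   $1 = comma-separated anchor values, $2 = number of layers
# Assumes linear interpolation over relative layer depth.
layer_t() {
  awk -v grad="$1" -v layers="$2" 'BEGIN {
    n = split(grad, g, ",")
    for (layer = 0; layer < layers; layer++) {
      # Position of this layer on the anchor axis 0 .. n-1.
      pos = (layers > 1) ? layer / (layers - 1) * (n - 1) : 0
      lo = int(pos)
      hi = (lo < n - 1) ? lo + 1 : lo
      frac = pos - lo
      # awk arrays from split() are 1-based, hence the +1.
      printf "%d %.3f\n", layer, (1 - frac) * g[lo + 1] + frac * g[hi + 1]
    }
  }'
}

# Example call for the Stage1 gradient:
# layer_t "0,0,0.1,0.3,0.5,0.7,0.5,0.3,0.1,0,0" 32
```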
Final-Stage Slerp
-
models:
  - model: ./fp16/ERStage1-v0.3-8B
  - model: ./fp16/Sun-v0.3-8B
merge_method: slerp
base_model: ./fp16/ERStage1-v0.3-8B
parameters:
  t:
    - value: [0, 0, 0.1, 0.2, 0.4, 0.6, 0.4, 0.2, 0.1, 0, 0]
  embed_slerp: true
dtype: bfloat16
tokenizer-source: model:./fp16/ERStage1-v0.3-8B
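The five configs above depend on each other's outputs, so the order matters: the three bases first, then the Stage1 slerp, then the final slerp. A rough sketch of that chain using the `mergekit-yaml` command is below. The config file names (`rain.yml` and so on) and the final output directory are hypothetical; the intermediate directories match the paths referenced in the slerp configs. Setting `MERGEKIT=echo` gives a dry run that only prints the arguments.

```bash
# Hypothetical wrapper: run the five merges in dependency order.
# Assumes each config above is saved under the file name shown here.
run_merge_pipeline() {
  merge_cmd=${MERGEKIT:-mergekit-yaml}
  # The three dare_ties bases do not depend on each other.
  "$merge_cmd" rain.yml ./fp16/Rain-v0.3-8B || return 1
  "$merge_cmd" sun.yml ./fp16/Sun-v0.3-8B || return 1
  "$merge_cmd" ghost.yml ./fp16/Ghost-v0.3-8B || return 1
  # Stage1 slerps Rain with Ghost.
  "$merge_cmd" stage1.yml ./fp16/ERStage1-v0.3-8B || return 1
  # The final stage slerps the Stage1 result with Sun.
  "$merge_cmd" final.yml ./fp16/EtherealRainbow-v0.3-8B
}

# Dry run: MERGEKIT=echo run_merge_pipeline
```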
---
## Use with llama.cpp
Install llama.cpp through brew (works on Mac and Linux)
```bash
brew install llama.cpp
```
Invoke the llama.cpp server or the CLI.
### CLI:
```bash
llama-cli --hf-repo Triangle104/EtherealRainbow-v0.3-8B-Q4_K_M-GGUF --hf-file etherealrainbow-v0.3-8b-q4_k_m.gguf -p "The meaning to life and the universe is"
```
### Server:
```bash
llama-server --hf-repo Triangle104/EtherealRainbow-v0.3-8B-Q4_K_M-GGUF --hf-file etherealrainbow-v0.3-8b-q4_k_m.gguf -c 2048
```
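Once the server is running, it can be queried over HTTP. The sketch below assumes the server listens on its default address `http://127.0.0.1:8080` and uses the OpenAI-compatible `/v1/chat/completions` endpoint. `chat_payload`, `ask_server` and the `LLAMA_SERVER_URL` variable are hypothetical names, and the payload builder does no JSON escaping, so it only suits simple strings.

```bash
# Hypothetical helper: build a chat request body.
#   $1 = system prompt, $2 = user message, $3 = max tokens (default 256)
# No JSON escaping: avoid double quotes, backslashes and newlines in $1/$2.
chat_payload() {
  printf '{"messages":[{"role":"system","content":"%s"},{"role":"user","content":"%s"}],"max_tokens":%d}' \
    "$1" "$2" "${3:-256}"
}

# Hypothetical helper: send the request to a running llama-server.
ask_server() {
  curl -s "${LLAMA_SERVER_URL:-http://127.0.0.1:8080}/v1/chat/completions" \
    -H "Content-Type: application/json" \
    -d "$(chat_payload "$@")"
}

# Example call (requires the server from the command above):
# ask_server "You are a narrator." "Write a short scene." 512
```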
Note: You can also use this checkpoint directly through the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the llama.cpp repo.
Step 1: Clone llama.cpp from GitHub.
```bash
git clone https://github.com/ggerganov/llama.cpp
```
Step 2: Move into the llama.cpp folder and build it with the `LLAMA_CURL=1` flag, along with any hardware-specific flags (for example, `LLAMA_CUDA=1` for Nvidia GPUs on Linux).
```bash
cd llama.cpp && LLAMA_CURL=1 make
```
Step 3: Run inference through the main binary.
```bash
./llama-cli --hf-repo Triangle104/EtherealRainbow-v0.3-8B-Q4_K_M-GGUF --hf-file etherealrainbow-v0.3-8b-q4_k_m.gguf -p "The meaning to life and the universe is"
```
or
```bash
./llama-server --hf-repo Triangle104/EtherealRainbow-v0.3-8B-Q4_K_M-GGUF --hf-file etherealrainbow-v0.3-8b-q4_k_m.gguf -c 2048
```