Not-For-All-Audiences

llama-cpp

gguf-my-repo

Inference Endpoints

conversational

Model card Files Files and versions Community

EtherealRainbow-v0.3-8B-Q8_0-GGUF / README.md

Triangle104

Update README.md

9a216fe verified about 1 month ago

preview code

raw

history blame

8.41 kB

	---
	library_name: transformers
	tags:
	- mergekit
	- merge
	- not-for-all-audiences
	- llama-cpp
	- gguf-my-repo
	license: llama3
	language:
	- en
	base_model: invisietch/EtherealRainbow-v0.3-8B
	---

	# Triangle104/EtherealRainbow-v0.3-8B-Q8_0-GGUF
	This model was converted to GGUF format from [`invisietch/EtherealRainbow-v0.3-8B`](https://huggingface.co/invisietch/EtherealRainbow-v0.3-8B) using llama.cpp via the ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
	Refer to the [original model card](https://huggingface.co/invisietch/EtherealRainbow-v0.3-8B) for more details on the model.

	---
	Model details:
	-
	Ethereal Rainbow is an 8B parameter merge of various Llama3-based finetunes created using mergekit. The purpose of Ethereal Rainbow is to create an uncensored Llama3 variant which is capable of writing creative prose, and engaging in SFW as well as NSFW roleplay and storytelling, with a strong focus on long-form responses & adherence to prompts.

	v0.3 improves creativity over v0.2 without losing coherence. It has been tested over more than 1,000 messages including roleplay, code prompts, and 'write a scene'-type prompts.

	Feedback
	-
	I appreciate all feedback on any of my models, you can use:

	My Discord server - requires Discord.
	The Community tab - requires HF login.
	The SillyTavern Discord thread - must be on SillyTavern Discord.
	Discord DMs to invisietch.

	Your feedback is how I improve these models for future versions.

	Disclaimer
	-
	This model is built on an abliterated base and as such is largely uncensored. It can generate explicit, disturbing or offensive responses. Use responsibly. I am not responsible for your use of this model.
	Prompting Format

	I'd recommend Llama-3 Instruct prompting format:

	<\|begin_of_text\|><\|start_header_id\|>system<\|end_header_id\|>

	{system_prompt}<\|eot_id\|><\|start_header_id\|>user<\|end_header_id\|>

	{input}<\|eot_id\|><\|start_header_id\|>assistant<\|end_header_id\|>

	{output}<\|eot_id\|>

	Some of the models included in the merge were trained on ChatML & Alpaca so you can try those. I have not tested them.
	Example Storywriting

	These prompts are used on SillyTavern with a fairly basic narrator card. I have trimmed the start and finish where the narrator decided to add chapter headings, commentary and the like. All samples are made with the F32 GGUF loaded with koboldcpp, with response length capped at 2048 tokens.
	Write me a 3,000 word opening chapter of a 'gritty hard sci-fi' novel, drawing inspiration from the writing styles of Isaac Asimov & Andy Weir. Use third person personal. Include dialogue and internal monologues. The POV character for the opening chapter should be a 26 year old astronaut called Tone on a mission to Europa, who has just realised that the craft for the return journey is broken beyond repair, and he only has supplies for a few months. Given that survival is impossible, he seeks to spend the few months he has researching titan, so his life & mission are not wasted.

	Write me a 3,000 word opening chapter of a 'high fantasy' novel, drawing inspiration from the writing styles of J R R Tolkien & George R R Martin. Use third person personal. Include dialogue and internal monologues. The POV character for the opening chapter should be a 19 year old female elf bard who is looking for adventure.

	Write me a 3,000 word opening chapter of a 'weird fiction' novel, drawing inspiration from the writing styles of China Mieville and Neil Gaiman. Use third person personal. Include dialogue and internal monologues. The POV character for the opening chapter should be a male in his 20s called Horton who has just come to the city looking for work.

	I chose the hard sci-fi example to test positivity bias. It did require some prompting, but it was willing to kill the protagonist.

	I chose the high fantasy example to see whether it would bleed human features through to elves, this didn't occur.

	I chose the weird fiction example to see if the LLM understood a niche genre. I'd say it performed okay, better on style than on substance.
	Merge Strategy

	First, we create three bases:

	Rain - This is a roleplay base which makes up the majority of the model.
	Sun - This is the brains of the model, with strong instruct models & writing models.
	Ghost - This model primarily aims to improve the NSFW/NSFL aspects of the model, as well as general vocabulary.

	After this, we have a two-slerp stage to create the final model.
	Models Used

	The following models were used to create EtherealRainbow-v0.3-8B:

	mlabonne/NeuralDaredevil-8B-abliterated
	Sao10K/L3-8B-Stheno-v3.2
	Nitral-AI/Hathor-L3-8B-v.02
	grimjim/Llama-3-Luminurse-v0.2-OAS-8B
	hf-100/Llama-3-Spellbound-Instruct-8B-0.3
	Gryphe/Pantheon-RP-1.0-8b-Llama-3
	Blackroot/Llama-3-LongStory
	Locutusque/Llama-3-Hercules-5.0-8B
	Casual-Autopsy/L3-Umbral-Mind-RP-v0.3-8B
	ChaoticNeutrals/Poppy_Porpoise-1.0-L3-8B
	mpasila/Llama-3-LimaRP-Instruct-8B
	Undi95/Llama-3-LewdPlay-8B-evo

	Mergekit Configs
	-
	Rain
	-
	models:
	- model: mlabonne/NeuralDaredevil-8B-abliterated
	- model: Sao10K/L3-8B-Stheno-v3.2
	parameters:
	density: 0.41
	weight: 0.4
	- model: Nitral-AI/Hathor-L3-8B-v.02
	parameters:
	density: 0.53
	weight: 0.5
	- model: grimjim/Llama-3-Luminurse-v0.2-OAS-8B
	parameters:
	density: 0.45
	weight: 0.1
	merge_method: dare_ties
	base_model: mlabonne/NeuralDaredevil-8B-abliterated
	parameters:
	int8_mask: true
	dtype: bfloat16

	Sun
	-
	models:
	- model: hf-100/Llama-3-Spellbound-Instruct-8B-0.3
	- model: Gryphe/Pantheon-RP-1.0-8b-Llama-3
	parameters:
	density: 0.48
	weight: 0.5
	- model: Blackroot/Llama-3-LongStory
	parameters:
	density: 0.36
	weight: 0.2
	- model: Locutusque/Llama-3-Hercules-5.0-8B
	parameters:
	density: 0.51
	weight: 0.3
	merge_method: dare_ties
	base_model: hf-100/Llama-3-Spellbound-Instruct-8B-0.3
	parameters:
	int8_mask: true
	dtype: bfloat16

	Ghost
	-
	models:
	- model: Casual-Autopsy/L3-Umbral-Mind-RP-v0.3-8B
	- model: ChaoticNeutrals/Poppy_Porpoise-1.0-L3-8B
	parameters:
	density: 0.39
	weight: 0.3
	- model: mpasila/Llama-3-LimaRP-Instruct-8B
	parameters:
	density: 0.54
	weight: 0.4
	- model: Undi95/Llama-3-LewdPlay-8B-evo
	parameters:
	density: 0.49
	weight: 0.3
	merge_method: dare_ties
	base_model: Casual-Autopsy/L3-Umbral-Mind-RP-v0.3-8B
	parameters:
	int8_mask: true
	dtype: bfloat16

	Stage1 Slerp
	-
	models:
	- model: ./fp16/Rain-v0.3-8B
	- model: ./fp16/Ghost-v0.3-8B
	merge_method: slerp
	base_model: ./fp16/Rain-v0.3-8B
	parameters:
	t:
	- value: [0, 0, 0.1, 0.3, 0.5, 0.7, 0.5, 0.3, 0.1, 0, 0]
	embed_slerp: true
	dtype: bfloat16
	tokenizer-source: model:./fp16/Rain-v0.3-8B

	Final-Stage Slerp
	-
	models:
	- model: ./fp16/ERStage1-v0.3-8B
	- model: ./fp16/Sun-v0.3-8B
	merge_method: slerp
	base_model: ./fp16/ERStage1-v0.3-8B
	parameters:
	t:
	- value: [0, 0, 0.1, 0.2, 0.4, 0.6, 0.4, 0.2, 0.1, 0, 0]
	embed_slerp: true
	dtype: bfloat16
	tokenizer-source: model:./fp16/ERStage1-v0.3-8B

	---
	## Use with llama.cpp
	Install llama.cpp through brew (works on Mac and Linux)

	```bash
	brew install llama.cpp

	```
	Invoke the llama.cpp server or the CLI.

	### CLI:
	```bash
	llama-cli --hf-repo Triangle104/EtherealRainbow-v0.3-8B-Q8_0-GGUF --hf-file etherealrainbow-v0.3-8b-q8_0.gguf -p "The meaning to life and the universe is"
	```

	### Server:
	```bash
	llama-server --hf-repo Triangle104/EtherealRainbow-v0.3-8B-Q8_0-GGUF --hf-file etherealrainbow-v0.3-8b-q8_0.gguf -c 2048
	```

	Note: You can also use this checkpoint directly through the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the Llama.cpp repo as well.

	Step 1: Clone llama.cpp from GitHub.
	```
	git clone https://github.com/ggerganov/llama.cpp
	```

	Step 2: Move into the llama.cpp folder and build it with `LLAMA_CURL=1` flag along with other hardware-specific flags (for ex: LLAMA_CUDA=1 for Nvidia GPUs on Linux).
	```
	cd llama.cpp && LLAMA_CURL=1 make
	```

	Step 3: Run inference through the main binary.
	```
	./llama-cli --hf-repo Triangle104/EtherealRainbow-v0.3-8B-Q8_0-GGUF --hf-file etherealrainbow-v0.3-8b-q8_0.gguf -p "The meaning to life and the universe is"
	```
	or
	```
	./llama-server --hf-repo Triangle104/EtherealRainbow-v0.3-8B-Q8_0-GGUF --hf-file etherealrainbow-v0.3-8b-q8_0.gguf -c 2048
	```