README.md · DavidAU/L3-8B-Stheno-v3.2-Ultra-NEO-V1-IMATRIX-GGUF at d27bf86629f16ae68fbcbf72a313aa87c25bb4fe

L3-8B-Stheno-v3.2-Ultra-NEO-V1-IMATRIX-GGUF / README.md

DavidAU

Update README.md

d27bf86 verified 7 months ago

preview code

raw

history blame

1.85 kB

	---
	license: apache-2.0
	language:
	- en
	tags:
	- story
	- general usage
	- roleplay
	- creative
	- rp
	- fantasy
	- story telling
	- ultra high precision
	---
	<B>NEO CLASS Ultra Quants for : L3-8B-Stheno-v3.2</B>

	Additional quants are uploading...

	The NEO Class tech was created after countless investigations and over 120 lab experiments backed by
	real world testing and qualitative results.

	<b>NEO Class results: </b>

	Better overall function, instruction following, output quality and stronger connections to ideas, concepts and the world in general.

	In addition quants now operate above their "grade" so to speak :

	IE: Q4 / IQ4 operate at Q5KM/Q6 levels.

	Likewise for Q3/IQ3 operate at Q4KM/Q5 levels.

	Perplexity drop of 1191 points for Neo Class Imatrix quant of IQ4XS VS regular quant of IQ4XS.

	(lower is better)

	<B> A Funny thing happened on the way to the "lab" ... </b>

	Although this model uses a "Llama3" template we found that Command-R's template worked better specifically for creative purposes.

	This applies to both normal quants and Neo quants.

	Here is Command-R's template:

	{
	"name": "Cohere Command R",
	"inference_params": {
	"input_prefix": "<\|END_OF_TURN_TOKEN\|><\|START_OF_TURN_TOKEN\|><\|USER_TOKEN\|>",
	"input_suffix": "<\|END_OF_TURN_TOKEN\|><\|START_OF_TURN_TOKEN\|><\|CHATBOT_TOKEN\|>",
	"antiprompt": [
	"<\|START_OF_TURN_TOKEN\|>",
	"<\|END_OF_TURN_TOKEN\|>"
	],
	"pre_prompt_prefix": "<\|START_OF_TURN_TOKEN\|><\|SYSTEM_TOKEN\|>",
	"pre_prompt_suffix": ""
	}
	}

	This was "interesting" issue was confirmed by multiple users.

	<B> Model Notes: </B>

	Maximum context is 8k. Please see original model maker's page for details, and usage information for this model.

	Special thanks to the model creators at SAO10K for making such a fantastic model:

	[ https://huggingface.co/Sao10K/L3-8B-Stheno-v3.2 ]