---
license: apache-2.0
language:
- en
tags:
- story
- general usage
- roleplay
- creative
- rp
- fantasy
- story telling
- ultra high precision
---

NEO CLASS Ultra Quants for: L3-8B-Stheno-v3.2

Additional quants are uploading...

The NEO Class tech was created after countless investigations and over 120 lab experiments backed by real-world testing and qualitative results.

NEO Class results:

Better overall function, instruction following, output quality and stronger connections to ideas, concepts and the world in general.

In addition, quants now operate above their "grade", so to speak: Q4 / IQ4 quants operate at Q5KM/Q6 levels, and likewise Q3 / IQ3 quants operate at Q4KM/Q5 levels.

Perplexity drop of 1191 points for the Neo Class Imatrix quant of IQ4XS vs. the regular quant of IQ4XS (lower is better).

A funny thing happened on the way to the "lab" ...

Although this model uses a "Llama3" template, we found that Command-R's template worked better, specifically for creative purposes.

This applies to both normal quants and Neo quants.

Here is Command-R's template:

    {
      "name": "Cohere Command R",
      "inference_params": {
        "input_prefix": "<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|USER_TOKEN|>",
        "input_suffix": "<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>",
        "antiprompt": [
          "<|START_OF_TURN_TOKEN|>",
          "<|END_OF_TURN_TOKEN|>"
        ],
        "pre_prompt_prefix": "<|START_OF_TURN_TOKEN|><|SYSTEM_TOKEN|>",
        "pre_prompt_suffix": ""
      }
    }

This "interesting" issue was confirmed by multiple users.

Model Notes:

Maximum context is 8k. Please see the original model maker's page for details and usage information for this model.

Special thanks to the model creators at Sao10K for making such a fantastic model:

[ https://huggingface.co/Sao10K/L3-8B-Stheno-v3.2 ]
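
For anyone applying the Command-R template quoted above by hand rather than through a preset file, the following is a minimal sketch of how its fields might be combined into a single-turn prompt. The composition order (system prompt wrapped by `pre_prompt_prefix` / `pre_prompt_suffix`, then the user message wrapped by `input_prefix` / `input_suffix`) is an assumption about how such presets are typically applied, and `build_prompt` and `truncate_at_stop` are hypothetical helper names, not part of any library.

```python
# Field values copied from the Command-R preset quoted in this card.
TEMPLATE = {
    "input_prefix": "<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|USER_TOKEN|>",
    "input_suffix": "<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>",
    "antiprompt": ["<|START_OF_TURN_TOKEN|>", "<|END_OF_TURN_TOKEN|>"],
    "pre_prompt_prefix": "<|START_OF_TURN_TOKEN|><|SYSTEM_TOKEN|>",
    "pre_prompt_suffix": "",
}


def build_prompt(system_prompt, user_message, template=TEMPLATE):
    """Assemble a single-turn prompt (assumed ordering, see lead-in)."""
    return (
        template["pre_prompt_prefix"]
        + system_prompt
        + template["pre_prompt_suffix"]
        + template["input_prefix"]
        + user_message
        + template["input_suffix"]
    )


def truncate_at_stop(text, stops):
    """Cut generated text at the earliest stop string (antiprompt), if any."""
    cut = len(text)
    for stop in stops:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


if __name__ == "__main__":
    prompt = build_prompt("You are a storyteller.", "Begin the tale.")
    print(prompt)
```

The resulting string can be passed as a raw prompt to whichever inference backend is in use, with the `antiprompt` entries supplied as stop strings; `truncate_at_stop` shows the same effect applied after generation.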