|
--- |
|
license: apache-2.0 |
|
language: |
|
- en |
|
tags: |
|
- story |
|
- general usage |
|
- roleplay |
|
- creative |
|
- rp |
|
- fantasy |
|
- story telling |
|
- ultra high precision |
|
--- |
|
<B>NEO CLASS Ultra Quants for : L3-8B-Stheno-v3.2</B> |
|
|
|
Additional quants are uploading... |
|
|
|
The NEO Class tech was created after countless investigations and over 120 lab experiments, backed by |
|
real-world testing and qualitative results. |
|
|
|
<b>NEO Class results: </b> |
|
|
|
Better overall function, instruction following, output quality and stronger connections to ideas, concepts and the world in general. |
|
|
|
In addition, quants now operate above their "grade", so to speak: |
|
|
|
For example, Q4 / IQ4 quants operate at Q5KM/Q6 levels. |
|
|
|
Likewise, Q3/IQ3 quants operate at Q4KM/Q5 levels. |
|
|
|
Perplexity dropped by 1191 points for the NEO Class Imatrix quant of IQ4XS versus the regular quant of IQ4XS. |
|
|
|
(lower is better) |
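
For readers unfamiliar with the metric, here is a minimal sketch of how perplexity is commonly defined: the exponential of the negative mean log-probability the model assigns to each token. The function name and the sample log-probabilities are hypothetical and only illustrate the arithmetic; this is not the measurement setup used for the figure above.

```python
import math


def perplexity(token_logprobs):
    """Return exp(-mean(log p)) for a list of natural-log token probabilities."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    mean_logprob = sum(token_logprobs) / len(token_logprobs)
    return math.exp(-mean_logprob)


# Hypothetical data: a model that gives every token probability 0.25
# has a perplexity of about 4 (it is "choosing" among 4 equally likely tokens).
uniform_logprobs = [math.log(0.25)] * 8
print(perplexity(uniform_logprobs))
```

A lower value means the model was, on average, less "surprised" by the test text, which is why lower is better.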
|
|
|
<B> A Funny thing happened on the way to the "lab" ... </b> |
|
|
|
Although this model uses a "Llama3" template, we found that Command-R's template worked better, specifically for creative purposes. |
|
|
|
This applies to both normal quants and Neo quants. |
|
|
|
Here is Command-R's template: |
|
|
|
{ |
|
"name": "Cohere Command R", |
|
"inference_params": { |
|
"input_prefix": "<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|USER_TOKEN|>", |
|
"input_suffix": "<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>", |
|
"antiprompt": [ |
|
"<|START_OF_TURN_TOKEN|>", |
|
"<|END_OF_TURN_TOKEN|>" |
|
], |
|
"pre_prompt_prefix": "<|START_OF_TURN_TOKEN|><|SYSTEM_TOKEN|>", |
|
"pre_prompt_suffix": "" |
|
} |
|
} |
|
|
|
This "interesting" issue was confirmed by multiple users. |
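
If your front end does not load presets like the one above, the same template can be applied by hand. The sketch below copies the field values from the Command-R preset and assumes the usual assembly order (system block first, then the wrapped user turn); the helper names `build_prompt` and `trim_at_stop` are hypothetical and not part of any library.

```python
# Field values copied from the Command-R preset shown above.
COMMAND_R = {
    "pre_prompt_prefix": "<|START_OF_TURN_TOKEN|><|SYSTEM_TOKEN|>",
    "pre_prompt_suffix": "",
    "input_prefix": "<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|USER_TOKEN|>",
    "input_suffix": "<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>",
    "antiprompt": ["<|START_OF_TURN_TOKEN|>", "<|END_OF_TURN_TOKEN|>"],
}


def build_prompt(system, user, preset=COMMAND_R):
    """Assemble a single-turn prompt (assumed order: system block, then user turn)."""
    return (
        preset["pre_prompt_prefix"] + system + preset["pre_prompt_suffix"]
        + preset["input_prefix"] + user + preset["input_suffix"]
    )


def trim_at_stop(text, preset=COMMAND_R):
    """Cut generated text at the earliest antiprompt (stop string), if any."""
    cut = len(text)
    for stop in preset["antiprompt"]:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


prompt = build_prompt("You are a storyteller.", "Write an opening line.")
print(prompt)
```

The resulting string can be passed as a raw prompt to whichever runtime you use, with the two `antiprompt` strings supplied as stop sequences.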
|
|
|
<B> Model Notes: </B> |
|
|
|
Maximum context is 8k. Please see the original model maker's page for details and usage information for this model. |
|
|
|
Special thanks to the model creators at Sao10K for making such a fantastic model: |
|
|
|
[ https://huggingface.co/Sao10K/L3-8B-Stheno-v3.2 ] |