---
license: apache-2.0
tags:
- not-for-all-audiences
- writing
- roleplay
- gguf
- gguf-imatrix
base_model:
- nakodanei/Blue-Orchid-2x7b
model_type: mixtral
quantized_by: Green-Sky
language:
- en
---
|
|
|
llama.cpp conversion of https://huggingface.co/nakodanei/Blue-Orchid-2x7b/
|
|
|
Except for the f16 and q8_0 files, every quant uses `merge.imatrix`.
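Applying an importance matrix during quantization is done with llama.cpp's `quantize` tool via its `--imatrix` flag. A minimal sketch, assuming an f16 GGUF has already been produced by the conversion script (file names here are illustrative, not the actual upload names):

```shell
# Quantize the f16 GGUF down to IQ3_XXS, steering the
# quantization error with the merged importance matrix.
./quantize --imatrix merge.imatrix \
    blue-orchid-2x7b.f16.gguf \
    blue-orchid-2x7b.iq3_xxs.gguf \
    iq3_xxs
```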
|
|
|
`merge.imatrix` is a merge of `kalomaze-group_10_merged.172chunks.imatrix` and `wiki.train.400chunks.imatrix`, which took ~10min + ~20min to calculate on my machine.
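Merging can be done with the `imatrix` tool itself; a sketch, assuming a llama.cpp build where `imatrix` accepts repeated `--in-file` arguments to combine previously computed matrices:

```shell
# Combine two previously computed importance matrices into one.
# --in-file can be given multiple times; the collected activation
# statistics are combined and written out as a single file.
./imatrix \
    --in-file kalomaze-group_10_merged.172chunks.imatrix \
    --in-file wiki.train.400chunks.imatrix \
    -o merge.imatrix
```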
|
|
|
Calculating an imatrix over the full wiki.train set would have taken ~10h.
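The per-dataset matrices are computed with the `imatrix` tool against the f16 model; a sketch, assuming a local copy of the wiki.train text (file names are illustrative):

```shell
# Collect activation statistics over the first 400 chunks of
# wiki.train; capping --chunks keeps the runtime manageable.
./imatrix -m blue-orchid-2x7b.f16.gguf \
    -f wiki.train.raw \
    --chunks 400 \
    -o wiki.train.400chunks.imatrix
```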
|
|
|
For more info on imatrix handling, see https://github.com/ggerganov/llama.cpp/pull/5302
|
|
|
### ppl (wiki.test, context 512, 300 chunks)

| quant               | ppl (lower is better) |
|---------------------|-----------------------|
| f16 (baseline)      | 5.8839 +/- 0.05173    |
| q8_0                | xxx                   |
| q5_k_m              | xxx                   |
| q4_k_m              | xxx                   |
| iq3_xxs             | 6.1984 +/- 0.05475    |
| iq3_xxs (only-wiki) | 6.1796 +/- 0.05446    |
| iq3_xxs (only-kal)  | 6.1984 +/- 0.05475    |
| q2_k                | xxx                   |
| iq2_xs              | xxx                   |
| iq2_xxs             | xxx                   |
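The numbers above come from llama.cpp's `perplexity` tool; a sketch of how such a run looks (model file name is illustrative):

```shell
# Measure perplexity over the first 300 chunks of wiki.test
# at the default 512-token context.
./perplexity -m blue-orchid-2x7b.iq3_xxs.gguf \
    -f wiki.test.raw \
    --chunks 300
```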