llama.cpp conversion of https://huggingface.co/nakodanei/Blue-Orchid-2x7b/
Except for f16 and q8_0, every quant uses `merge.imatrix`.

`merge.imatrix` is a merge of `kalomaze-group_10_merged.172chunks.imatrix` and `wiki.train.400chunks.imatrix`, which took ~10min + ~20min to calculate on my machine. Calculating over the full wiki.train would have taken ~10h.

For more info on imatrix handling, see https://github.com/ggerganov/llama.cpp/pull/5302.
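For reference, a minimal sketch of the workflow with the llama.cpp tools. This is an assumption about the exact invocations, not the commands used for this repo: binary names and flags differ between llama.cpp versions (newer builds prefix the tools with `llama-`), and merging via `--in-file` may not exist in older builds.

```sh
# compute an imatrix over the first 400 chunks of wiki.train
./imatrix -m Blue-Orchid-2x7b.f16.gguf -f wiki.train.raw \
  --chunks 400 -o wiki.train.400chunks.imatrix

# combine two existing imatrix files into one
# (--in-file merging is an assumption; older builds may need a separate script)
./imatrix --in-file kalomaze-group_10_merged.172chunks.imatrix \
          --in-file wiki.train.400chunks.imatrix \
          -o merge.imatrix

# quantize the f16 model using the merged imatrix
./quantize --imatrix merge.imatrix \
  Blue-Orchid-2x7b.f16.gguf Blue-Orchid-2x7b.iq3_xxs.gguf IQ3_XXS
```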
## ppl (wiki.test, 512 context, 300 chunks)

quant | ppl (lower is better) |
---|---|
f16(baseline) | 5.8839 +/- 0.05173 |
q8_0 | 5.8880 +/- 0.05178 |
q5_k_m | 5.8912 +/- 0.05177 |
q5_k_m(without-imat) | 5.8893 +/- 0.05174 |
q4_k_m | 5.9248 +/- 0.05216 |
q4_k_m(without-imat) | 5.9492 +/- 0.05249 |
iq3_xxs | 6.1984 +/- 0.05475 |
iq3_xxs(only-wiki) | 6.1796 +/- 0.05446 |
iq3_xxs(only-kal) | 6.1984 +/- 0.05475 |
iq3_xxs(without-imat) | 6.4228 +/- 0.05756 |
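The numbers above were presumably produced with llama.cpp's `perplexity` tool; a sketch of an equivalent invocation (the model file name is illustrative, and flags vary by llama.cpp version):

```sh
# 512-token context, limited to the first 300 chunks of wiki.test
./perplexity -m Blue-Orchid-2x7b.q4_k_m.gguf -f wiki.test.raw -c 512 --chunks 300
```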
## Interesting observations
- Despite `merge.imatrix` being different from `kalomaze-group_10_merged.172chunks.imatrix`, they produce the exact same quantized iq3_xxs model file (same hash, checked multiple times; see the hash check sketch below).
- q5_k_m has a lower perplexity without the imatrix, but that is probably caused by kalomaze-group_10_merged diverging enough from wiki.
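A quick way to verify the identical-file observation (file names here are illustrative, not the exact names in this repo):

```sh
# identical hashes mean the two quantizations produced byte-identical files
sha256sum Blue-Orchid-2x7b.iq3_xxs.merge-imat.gguf \
          Blue-Orchid-2x7b.iq3_xxs.kal-imat.gguf
```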