---
license: apache-2.0
tags:
  - not-for-all-audiences
  - writing
  - roleplay
  - gguf
  - gguf-imatrix
base_model:
  - nakodanei/Blue-Orchid-2x7b
model_type: mixtral
quantized_by: Green-Sky
language:
  - en
---

llama.cpp conversion of https://huggingface.co/nakodanei/Blue-Orchid-2x7b/

except for f16 and q8_0, every quant uses the merge.imatrix
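
for reference, a sketch of how an imatrix-guided quant is produced with llama.cpp's `quantize` tool (the GGUF file names here are placeholders, not necessarily the exact ones in this repo):

```sh
# quantize the f16 GGUF down to IQ3_XXS, guided by the importance matrix
./quantize --imatrix merge.imatrix Blue-Orchid-2x7b-f16.gguf Blue-Orchid-2x7b-iq3_xxs.gguf IQ3_XXS
```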

merge.imatrix is a merge of kalomaze-group_10_merged.172chunks.imatrix and wiki.train.400chunks.imatrix, which took ~10min + ~20min to calculate on my machine.

computing an imatrix over the full wiki.train set would have taken ~10h

for more info on imatrix handling see https://github.com/ggerganov/llama.cpp/pull/5302
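
the per-dataset imatrices were computed with llama.cpp's `imatrix` tool, roughly like this (a sketch; the calibration file name and exact flags are assumptions and may differ between llama.cpp versions):

```sh
# importance matrix over the first 400 chunks of the wikitext-2 train set
./imatrix -m Blue-Orchid-2x7b-f16.gguf -f wiki.train.raw --chunks 400 -o wiki.train.400chunks.imatrix
```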

ppl (wiki.test, 512 context, 300 chunks)

| quant | ppl (lower is better) |
| ------------------- | ------------------ |
| f16 (baseline)      | 5.8839 +/- 0.05173 |
| q8_0                | xxx                |
| q5_k_m              | xxx                |
| q4_k_m              | xxx                |
| iq3_xxs             | 6.1984 +/- 0.05475 |
| iq3_xxs (only-wiki) | 6.1796 +/- 0.05446 |
| iq3_xxs (only-kal)  | 6.1984 +/- 0.05475 |
| q2_k                | xxx                |
| iq2_xs              | xxx                |
| iq2_xxs             | xxx                |
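
the ppl numbers above come from llama.cpp's `perplexity` tool, roughly as follows (assuming the standard wikitext-2 test file; the model file name is a placeholder):

```sh
# perplexity on wiki.test at 512 context, limited to the first 300 chunks
./perplexity -m Blue-Orchid-2x7b-iq3_xxs.gguf -f wiki.test.raw -c 512 --chunks 300
```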