---
license: apache-2.0
tags:
- not-for-all-audiences
- writing
- roleplay
- gguf
- gguf-imatrix
base_model:
- nakodanei/Blue-Orchid-2x7b
model_type: mixtral
quantized_by: Green-Sky
language:
- en
---
|
|
|
llama.cpp conversion of https://huggingface.co/nakodanei/Blue-Orchid-2x7b/
|
|
|
Except for the f16 and q8_0 files, every quant uses `merge.imatrix`.
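Applying an importance matrix during quantization is done with llama.cpp's `quantize` tool via its `--imatrix` flag. A minimal sketch, assuming an f16 GGUF has already been produced by the conversion script (file names here are illustrative, not the actual upload names):

```shell
# Quantize the f16 GGUF down to IQ3_XXS, steering the
# quantization error with the merged importance matrix.
./quantize --imatrix merge.imatrix \
    blue-orchid-2x7b.f16.gguf \
    blue-orchid-2x7b.iq3_xxs.gguf \
    iq3_xxs
```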
|
|
|
`merge.imatrix` is a merge of `kalomaze-group_10_merged.172chunks.imatrix` and `wiki.train.400chunks.imatrix`, which took ~10min + ~20min to calculate on my machine.
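Merging can be done with the `imatrix` tool itself; a sketch, assuming a llama.cpp build where `imatrix` accepts repeated `--in-file` arguments to combine previously computed matrices:

```shell
# Combine two previously computed importance matrices into one.
# --in-file can be given multiple times; the collected activation
# statistics are combined and written out as a single file.
./imatrix \
    --in-file kalomaze-group_10_merged.172chunks.imatrix \
    --in-file wiki.train.400chunks.imatrix \
    -o merge.imatrix
```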
|
|
|
Calculating an imatrix over the full wiki.train set would have taken ~10h.
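The per-dataset matrices are computed with the `imatrix` tool against the f16 model; a sketch, assuming a local copy of the wiki.train text (file names are illustrative):

```shell
# Collect activation statistics over the first 400 chunks of
# wiki.train; capping --chunks keeps the runtime manageable.
./imatrix -m blue-orchid-2x7b.f16.gguf \
    -f wiki.train.raw \
    --chunks 400 \
    -o wiki.train.400chunks.imatrix
```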
|
|
|
For more info on imatrix handling, see https://github.com/ggerganov/llama.cpp/pull/5302
|
|
|
### ppl (wiki.test, context 512, 300 chunks)

| quant               | ppl (lower is better) |
|---------------------|-----------------------|
| f16 (baseline)      | 5.8839 +/- 0.05173    |
| q8_0                | xxx                   |
| q5_k_m              | xxx                   |
| q4_k_m              | xxx                   |
| iq3_xxs             | 6.1984 +/- 0.05475    |
| iq3_xxs (only-wiki) | 6.1796 +/- 0.05446    |
| iq3_xxs (only-kal)  | 6.1984 +/- 0.05475    |
| q2_k                | xxx                   |
| iq2_xs              | xxx                   |
| iq2_xxs             | xxx                   |
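The numbers above come from llama.cpp's `perplexity` tool; a sketch of how such a run looks (model file name is illustrative):

```shell
# Measure perplexity over the first 300 chunks of wiki.test
# at the default 512-token context.
./perplexity -m blue-orchid-2x7b.iq3_xxs.gguf \
    -f wiki.test.raw \
    --chunks 300
```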