dranger003/miquliz-120b-v2.0-iMat.GGUF

	---
	license: cc-by-nc-2.0
	---
	GGUF importance matrix (imatrix) quants for https://huggingface.co/wolfram/miquliz-120b-v2.0
	The importance matrix was trained on roughly 100K tokens (200 batches of 512 tokens, 102,400 in total) from wiki.train.raw.
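
The token count follows from the batch settings. A quick check in Python, with illustrative variable names that are not options of any imatrix tool:

```python
# Values taken from the README sentence above; names are illustrative only.
n_batches = 200
tokens_per_batch = 512

total_tokens = n_batches * tokens_per_batch
print(total_tokens)  # 102400, i.e. roughly 100K tokens
```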

	With the IQ2_XXS quant, about 100 of the 141 layers seem to fit on a 24GB card at 2K context.
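
A minimal sketch of how that partial offload might be requested from llama.cpp, assuming a recent build where the CLI binary is named `llama-cli` (older builds call it `main`). The model file name is hypothetical, and the helper `build_llama_cpp_args` is introduced here only for illustration. `-m`, `-ngl` and `-c` are llama.cpp's options for model path, number of GPU layers and context size.

```python
import shlex


def build_llama_cpp_args(model_path, n_gpu_layers=100, ctx_size=2048,
                         binary="llama-cli"):
    """Build an argument list for a llama.cpp run with partial GPU offload.

    The defaults mirror the README figures (100 layers, 2K context);
    the binary name is an assumption about the local build.
    """
    return [
        binary,
        "-m", model_path,            # path to the GGUF file
        "-ngl", str(n_gpu_layers),   # layers to offload to the GPU
        "-c", str(ctx_size),         # context size in tokens
    ]


# Hypothetical file name; substitute the quant file actually downloaded.
args = build_llama_cpp_args("miquliz-120b-v2.0.IQ2_XXS.gguf")
print(shlex.join(args))

# Share of layers offloaded under the README's estimate.
print(f"{100 / 141:.1%} of layers on GPU")
```

The list can be passed to `subprocess.run(args)` as is. If the model does not fit in VRAM, lowering `n_gpu_layers` is the first thing to try, since the 100-layer figure is only an estimate for one 24GB card.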