|
--- |
|
license: apache-2.0 |
|
language: |
|
- en |
|
tags: |
|
- story |
|
- general usage |
|
- ultra high precision |
|
pipeline_tag: text-generation |
|
--- |
|
<B>NEO CLASS Ultra Quants for : TinyLlama-1.1B-Chat-v1.0-Ultra-NEO-V1-Imatrix-GGUF</B> |
|
|
|
The NEO Class tech was created after extensive investigation and over 120 lab experiments, backed by real-world testing and qualitative results.
|
|
|
<b>NEO Class results: </b> |
|
|
|
Better overall function, instruction following, and output quality, plus stronger connections to ideas, concepts, and the world in general.
|
|
|
In addition, quants now operate above their "grade", so to speak:
|
|
|
E.g., Q4/IQ4 quants operate at Q5_K_M/Q6 levels.
|
|
|
Likewise, Q3/IQ3 quants operate at Q4_K_M/Q5 levels.
|
|
|
Perplexity drops by 591 points for the NEO Class Imatrix IQ4_XS quant versus the regular IQ4_XS quant (lower perplexity is better).
|
|
|
For experimental "X" quants of this model, please go here:
|
|
|
[ https://huggingface.co/DavidAU/TinyLlama-1.1B-Chat-v1.0-Ultra-NEO-V1-X-Imatrix-GGUF ] |
|
|
|
<B> Model Notes: </B> |
|
|
|
Maximum context is 2k. Please see the original model maker's page for details and usage information for this model.
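As a minimal sketch of local usage, the snippet below builds a llama.cpp `llama-cli` command line for one of these GGUF quants, keeping the context size at the model's 2k maximum. The quant filename shown is an assumption for illustration; check this repo's "Files" tab for the actual filenames.

```python
# Minimal usage sketch (not part of the model card): assemble a
# llama.cpp `llama-cli` invocation for a NEO Class GGUF quant.
# The model filename below is hypothetical -- use a real file
# from this repo's "Files and versions" tab.

def build_llama_cli_cmd(model_path: str, prompt: str, n_ctx: int = 2048) -> list[str]:
    """Build an argument list for llama.cpp's `llama-cli` binary.

    n_ctx defaults to 2048 because this model's maximum context is 2k;
    do not raise it beyond that.
    """
    return [
        "llama-cli",
        "-m", model_path,      # path to the downloaded GGUF quant
        "-c", str(n_ctx),      # context window (2k max for this model)
        "-p", prompt,          # the prompt to generate from
    ]

cmd = build_llama_cli_cmd(
    "TinyLlama-1.1B-Chat-v1.0-Ultra-NEO-V1-IQ4_XS-imat.gguf",  # hypothetical name
    "Tell me a short story.",
)
print(" ".join(cmd))
```

The same quant can also be loaded with `llama-cpp-python` (`Llama(model_path=..., n_ctx=2048)`) if you prefer a Python API over the CLI.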
|
|
|
Special thanks to the model creators at TinyLlama for making such a fantastic model:
|
|
|
[ https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0 ] |