---
license: other
license_name: glm-4-9b-license
license_link: https://huggingface.co/THUDM/glm-4-9b-chat/blob/main/LICENSE
base_model: THUDM/LongWriter-glm4-9b
datasets:
- THUDM/LongWriter-6k
language:
- en
pipeline_tag: text-generation
---
|
|
|
# LongWriter-glm4-9b
|
|
|
Original model link: https://huggingface.co/THUDM/LongWriter-glm4-9b
|
|
|
Model by: **THUDM**
|
|
|
Quants by: **QuantPanda**
|
|
|
GGUF quantization for [llama.cpp](https://github.com/ggerganov/llama.cpp) and similar applications.
|
|
|
**Example:**

```sh
./llama-cli -m LongWriter-glm4-9B-Q5_K_M.gguf -p "You are a helpful AI assistant." --conversation
```
|
|
|
If the model takes too long to load, you can reduce the context size with the `--ctx-size` flag.
|
|
|
**Example with a smaller context size:**

```sh
./llama-cli -m LongWriter-glm4-9B-Q5_K_M.gguf -p "You are a helpful AI assistant." --conversation --ctx-size 4096
```