license: other
license_name: glm-4-9b-license
license_link: https://huggingface.co/THUDM/glm-4-9b-chat/blob/main/LICENSE
base_model: THUDM/LongWriter-glm4-9b
datasets:
  - THUDM/LongWriter-6k
language:
  - en
pipeline_tag: text-generation

LongWriter-glm4-9b

Original model link: https://huggingface.co/THUDM/LongWriter-glm4-9b

Model by: THUDM

Quants by: QuantPanda

GGUF quantizations of LongWriter-glm4-9b for llama.cpp and similar applications.

Example:

./llama-cli -m LongWriter-glm4-9B-Q5_K_M.gguf -p "You are a helpful AI assistant." --conversation

If the model takes too long to load, you can reduce the context size with --ctx-size.

Example with smaller context size:

./llama-cli -m LongWriter-glm4-9B-Q5_K_M.gguf -p "You are a helpful AI assistant." --conversation --ctx-size 4096
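
If you switch between context sizes often, a small shell helper can save retyping the command. The following is only a sketch: the function name `llama_cmd` and the `MODEL` variable are hypothetical, and the default of 4096 is simply the value from the example above. It prints the command instead of running it, so you can check it first.

```shell
#!/bin/sh
# Hypothetical helper (not part of llama.cpp): print the llama-cli command
# from the examples above for a chosen context size.
# MODEL is assumed to be in the current directory, next to ./llama-cli.
MODEL="LongWriter-glm4-9B-Q5_K_M.gguf"

llama_cmd() {
    # The first argument is the context size; fall back to 4096 if omitted.
    ctx_size="${1:-4096}"
    printf './llama-cli -m %s -p "You are a helpful AI assistant." --conversation --ctx-size %s\n' \
        "$MODEL" "$ctx_size"
}

# Print the command for an 8192-token context.
llama_cmd 8192
```

To run the printed command directly, one option is `eval "$(llama_cmd 8192)"`.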