---
license: other
license_name: glm-4-9b-license
license_link: https://huggingface.co/THUDM/glm-4-9b-chat/blob/main/LICENSE
base_model: THUDM/LongWriter-glm4-9b
datasets:
- THUDM/LongWriter-6k
language:
- en
pipeline_tag: text-generation
---
|
|
|
# LongWriter-glm4-9b
|
|
|
Original model link: https://huggingface.co/THUDM/LongWriter-glm4-9b
|
|
|
Model by: **THUDM**
|
|
|
Quants by: **QuantPanda**
|
|
|
GGUF quantization for [llama.cpp](https://github.com/ggerganov/llama.cpp) and similar applications.
|
|
|
**Example:**

```sh
./llama-cli -m LongWriter-glm4-9B-Q5_K_M.gguf -p "You are a helpful AI assistant." --conversation
```
|
|
|
If the model takes too long to load, you can reduce the context size with the `--ctx-size` flag.
|
|
|
**Example with a smaller context size:**

```sh
./llama-cli -m LongWriter-glm4-9B-Q5_K_M.gguf -p "You are a helpful AI assistant." --conversation --ctx-size 4096
```