## Llamacpp Quantizations of THUDM/glm-4-9b-chat

Quantized using <a href="https://github.com/ggerganov/llama.cpp/">llama.cpp</a> at commit hash 7d0e23d72ef4540d0d4409cb63ae682c17d53926. Notably, this includes b3333, the first official llama.cpp release that supports GLM-3 and GLM-4.
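
In case you want to reproduce a quant yourself, the general pipeline looks roughly like the sketch below. This is a minimal sketch, not the exact commands used for this repo: the script and binary names (`convert_hf_to_gguf.py`, `llama-quantize`) are assumed from llama.cpp checkouts around this era, and all paths and output file names are hypothetical.

```python
# Minimal sketch of a GGUF conversion + quantization pipeline.
# Assumes a built llama.cpp checkout (b3333+) in the current directory and a
# local snapshot of THUDM/glm-4-9b-chat; all paths/names here are hypothetical.
import subprocess

# 1. Convert the Hugging Face model to a full-precision GGUF file.
subprocess.run(
    [
        "python", "convert_hf_to_gguf.py",
        "path/to/glm-4-9b-chat",                # local HF snapshot
        "--outfile", "glm-4-9b-chat-f16.gguf",
        "--outtype", "f16",
    ],
    check=True,
)

# 2. Quantize the f16 GGUF down to a smaller type, e.g. Q4_K_M.
subprocess.run(
    [
        "./llama-quantize",
        "glm-4-9b-chat-f16.gguf",
        "glm-4-9b-chat-Q4_K_M.gguf",
        "Q4_K_M",
    ],
    check=True,
)
```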

Original model: https://huggingface.co/THUDM/glm-4-9b-chat

As of this writing, this is probably the only GLM-4 llama.cpp quantization created with llama.cpp b3333 or later.

I have tested the GGUF files with some simple prompts and they seem to work fine.

## Prompt format

```
[gMASK]<sop><|user|>
{prompt}
<|assistant|>
```
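
For example, applying this template programmatically might look like the minimal sketch below, assuming llama-cpp-python built against a llama.cpp with GLM-4 support (b3333+) and a hypothetical quant file name.

```python
# Minimal sketch: run a GLM-4 quant with the prompt format shown above.
# Assumes llama-cpp-python built against llama.cpp b3333+ (GLM-4 support);
# the model file name is hypothetical.
from llama_cpp import Llama

llm = Llama(model_path="glm-4-9b-chat-Q4_K_M.gguf", n_ctx=4096)

def glm4_prompt(user_message: str) -> str:
    # Wrap a single user turn in the GLM-4 template shown above.
    return f"[gMASK]<sop><|user|>\n{user_message}\n<|assistant|>"

out = llm(
    glm4_prompt("What is the GGUF file format?"),
    max_tokens=256,
    stop=["<|user|>"],  # stop if the model begins a new user turn
)
print(out["choices"][0]["text"])
```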

The model apparently supports function calling as well, if you supply a more elaborate system prompt. The original chat template is provided in https://huggingface.co/THUDM/glm-4-9b-chat/blob/main/tokenizer_config.json, but it is rather complicated if you don't need that functionality. (If you don't read Chinese, you're advised to translate the template into a language you understand and review it before adopting it for your purposes.)
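
As a rough illustration, a system turn can presumably be prepended as another role block. The sketch below assumes (without verifying against the original template) that system messages follow the same `<|role|>` pattern as the user and assistant turns above.

```python
# Sketch: prepend a system turn to the GLM-4 prompt.
# Assumption (not verified against tokenizer_config.json): system messages
# use the same <|role|> block pattern as user/assistant turns.
def glm4_prompt_with_system(system_message: str, user_message: str) -> str:
    return (
        "[gMASK]<sop>"
        f"<|system|>\n{system_message}\n"
        f"<|user|>\n{user_message}\n"
        "<|assistant|>"
    )
```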

## Quantizations

Due to resource limitations, we only provide a select handful of quantizations. Hopefully they are useful for your purposes.

## Legal / License

*"Built with glm-4"*

I simply copied the LICENSE file from https://huggingface.co/THUDM/glm-4-9b-chat, as required for redistribution.