Update README.md
README.md CHANGED

license: mit
---

## How to convert

First, clone [llama.cpp](https://github.com/ggerganov/llama.cpp) and build it with `make`.
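
A minimal sketch of this step, assuming a Unix-like system with `make` and a C/C++ toolchain available:

```
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make
```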

Then follow the instructions below to generate the GGUF files.

```
# convert Qwen HF models to GGUF FP16 format
python convert-hf-to-gguf.py --outfile qwen7b-chat-f16.gguf --outtype f16 Qwen-7B-Chat

# quantize the model to 4 bits (using the q4_0 method)
./quantize qwen7b-chat-f16.gguf qwen7b-chat-q4_0.gguf q4_0

# chat with the quantized Qwen model
./main -m qwen7b-chat-q4_0.gguf -n 512 --color -i -cml -f prompts/chat-with-qwen.txt
```
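
Other llama.cpp quantization types (for example `q5_0` or `q8_0`) can be swapped in for `q4_0` in the same command; lower-bit quants are smaller but lose more accuracy. For instance:

```
# 8-bit quantization instead of 4-bit
./quantize qwen7b-chat-f16.gguf qwen7b-chat-q8_0.gguf q8_0
```
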
## Files are split and require joining
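
The quantized files in this repo are distributed as split parts that must be concatenated before use. A sketch of the join step, with hypothetical shard names (substitute the actual part files from this repo):

```
# concatenate the parts back into a single GGUF file (shard names are hypothetical)
cat qwen7b-chat-q4_0.gguf.part-* > qwen7b-chat-q4_0.gguf
```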