Update README.md
README.md CHANGED

license: mit
---

## How to convert

First, clone [llama.cpp](https://github.com/ggerganov/llama.cpp) and build it with `make`.
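
A minimal sketch of this step, assuming a Unix-like system with `make` and a C/C++ toolchain available:

```
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make
```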

Then follow the instructions below to generate the GGUF files.

```
# convert Qwen HF models to GGUF FP16 format
python convert-hf-to-gguf.py --outfile qwen7b-chat-f16.gguf --outtype f16 Qwen-7B-Chat

# quantize the model to 4 bits (using the q4_0 method)
./quantize qwen7b-chat-f16.gguf qwen7b-chat-q4_0.gguf q4_0

# chat with the quantized Qwen model
./main -m qwen7b-chat-q4_0.gguf -n 512 --color -i -cml -f prompts/chat-with-qwen.txt
```
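
Other llama.cpp quantization types (for example `q5_0` or `q8_0`) can be swapped in for `q4_0` in the same command; lower-bit quants are smaller but lose more accuracy. For instance:

```
# 8-bit quantization instead of 4-bit
./quantize qwen7b-chat-f16.gguf qwen7b-chat-q8_0.gguf q8_0
```
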
## Files are split and require joining
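
The quantized files in this repo are distributed as split parts that must be concatenated before use. A sketch of the join step, with hypothetical shard names (substitute the actual part files from this repo):

```
# concatenate the parts back into a single GGUF file (shard names are hypothetical)
cat qwen7b-chat-q4_0.gguf.part-* > qwen7b-chat-q4_0.gguf
```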