Update README.md #9
by MaziyarPanahi - opened

README.md CHANGED
@@ -43,6 +43,12 @@ You can download only the quants you need instead of cloning the entire repository
 huggingface-cli download MaziyarPanahi/WizardLM-2-8x22B-GGUF --local-dir . --include '*Q2_K*gguf'
 ```
 
+On Windows:
+
+```sh
+huggingface-cli download MaziyarPanahi/WizardLM-2-8x22B-GGUF --local-dir . --include *Q4_K_S*gguf
+```
+
 ## Load sharded model
 
 `llama_load_model_from_file` will detect the number of files and will load additional tensors from the rest of files.
@@ -51,7 +57,6 @@ huggingface-cli download MaziyarPanahi/WizardLM-2-8x22B-GGUF --local-dir . --inc
 llama.cpp/main -m WizardLM-2-8x22B.Q2_K-00001-of-00005.gguf -p "Building a website can be done in 10 simple steps:\nStep 1:" -n 1024 -e
 ```
 
-
 ## Prompt template
 
 ```
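For the "Load sharded model" context lines above, here is a minimal C sketch of the call the README describes, assuming llama.cpp's C API (`llama.h`) from around the time of this card; the shard file name comes from the README itself, while the flags and comments are illustrative assumptions, not part of this PR.

```c
// Minimal sketch: load a split GGUF by pointing at the first shard only.
// llama_load_model_from_file detects the -00001-of-00005 suffix and reads the
// remaining tensors from the other split files, as the README states.
// Assumes llama.cpp is built and linked; newer releases deprecate this call in
// favour of llama_model_load_from_file.
#include "llama.h"
#include <stdio.h>

int main(void) {
    llama_backend_init(); // older llama.cpp versions take a NUMA flag here

    struct llama_model_params mparams = llama_model_default_params();
    mparams.n_gpu_layers = 0; // CPU-only for this sketch

    // Only the first shard is passed; the loader derives the rest of the paths.
    struct llama_model * model = llama_load_model_from_file(
        "WizardLM-2-8x22B.Q2_K-00001-of-00005.gguf", mparams);
    if (model == NULL) {
        fprintf(stderr, "failed to load model\n");
        return 1;
    }

    // ...create a context with llama_new_context_with_model() and generate...

    llama_free_model(model);
    llama_backend_free();
    return 0;
}
```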