Update README.md
README.md
CHANGED
@@ -47,6 +47,10 @@ ASSISTANT:
 <!-- compatibility_ggml start -->
 ## Compatibility
 
+**Note:** due to this model having a non-standard vocab size of 32,001, k-quants are slightly larger than they are for other models of the same size and type.
+
+For example, a 13B q4_K_M will be around 150MB larger. Inference speed should not be noticeably affected, and quality will be the same or higher.
+
 ### Original llama.cpp quant methods: `q4_0, q4_1, q5_0, q5_1, q8_0`
 
 These are guaranteed to be compatible with any UIs, tools and libraries released since late May. They may be phased out soon, as they are largely superseded by the new k-quant methods.
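The ~150MB figure comes from the vocab size (32,001) not being a multiple of the k-quant super-block size (QK_K = 256), so the large vocab-sized tensors are stored in a bigger fallback format. The sketch below is a back-of-the-envelope estimate, not taken from the commit: the choice of q8_0 as the fallback, which tensors are affected, and the comparison formats are assumptions about how llama.cpp handled non-divisible tensor rows at the time; the bits-per-weight values follow from the GGML block layouts.

```python
# Rough estimate of the extra file size a 32,001-entry vocab adds to a 13B q4_K_M.
# All "fallback" behaviour below is an assumption for illustration, not taken from the commit.

hidden_size = 5120                             # hidden dimension of a 13B Llama model
vocab_size = 32_001                            # 32,000 + 1 extra token; 32,001 % 256 != 0
params_per_tensor = hidden_size * vocab_size   # token-embedding / output matrix, ~164M weights each

# Approximate bits per weight for the GGML formats involved (block overhead included).
bpw_q4_K = 4.5       # 144 bytes per 256-weight super-block
bpw_q6_K = 6.5625    # 210 bytes per 256-weight super-block
bpw_q8_0 = 8.5       # 34 bytes per 32-weight block

# Assumption: because the rows are not a multiple of QK_K = 256, both large
# tensors fall back to q8_0 instead of their usual q4_K / q6_K formats.
extra_bits = params_per_tensor * ((bpw_q8_0 - bpw_q4_K) + (bpw_q8_0 - bpw_q6_K))
print(f"extra file size ≈ {extra_bits / 8 / 1e6:.0f} MB")   # ≈ 122 MB, the same ballpark as the README's ~150MB
```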