Commit 642f9c2 by TheBloke
1 Parent(s): a0e2e04

Update README.md

  1. README.md +4 -0
README.md CHANGED
@@ -47,6 +47,10 @@ ASSISTANT:
 <!-- compatibility_ggml start -->
 ## Compatibility
 
+**Note:** due to this model having a non-standard vocab size of 32,001, k-quants are slightly larger than they are for other models of the same size and type.
+
+For example, a 13B q4_K_M will be around 150MB larger. Inference speed should not be noticeably affected, and quality will be the same or higher.
+
 ### Original llama.cpp quant methods: `q4_0, q4_1, q5_0, q5_1, q8_0`
 
 These are guaranteed to be compatible with any UIs, tools and libraries released since late May. They may be phased out soon, as they are largely superseded by the new k-quant methods.
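
The size note added in this commit can be sanity-checked with a little arithmetic. The sketch below is illustrative only and is not taken from the README. It relies on two facts: k-quants pack weights into super-blocks of 256, and 13B LLaMA models have a hidden size of 5120. It also makes one assumption the README does not state: that a vocab-sized tensor which does not fit the super-block layout is stored in a higher-precision fallback format. The helper names and the 16-bit fallback are hypothetical.

```python
# Hypothetical helpers for illustration; not part of llama.cpp or the README.
QK_K = 256          # k-quant super-block size
HIDDEN_13B = 5120   # hidden size of 13B LLaMA models


def fits_k_quant(n_rows: int, n_cols: int, block: int = QK_K) -> bool:
    """Return True when both tensor dimensions are multiples of the block size."""
    return n_rows % block == 0 and n_cols % block == 0


def extra_bytes(n_rows: int, n_cols: int, bpw_kquant: float, bpw_fallback: float) -> float:
    """Extra storage when a tensor is kept at bpw_fallback instead of bpw_kquant."""
    return n_rows * n_cols * (bpw_fallback - bpw_kquant) / 8


print(fits_k_quant(32000, HIDDEN_13B))  # standard vocab: True
print(fits_k_quant(32001, HIDDEN_13B))  # this model's vocab: False

# ASSUMPTION: one vocab-sized tensor stored as 16-bit floats instead of
# Q6_K (210 bytes per 256 weights, i.e. 6.5625 bits per weight).
overhead = extra_bytes(32001, HIDDEN_13B, bpw_kquant=6.5625, bpw_fallback=16.0)
print(f"approx. extra size: {overhead / 1e6:.0f} MB")
```

Under these assumptions the overhead comes out in the low hundreds of megabytes, the same order of magnitude as the "around 150MB" quoted in the note. The exact figure depends on which tensors fall back and to which format. Storing a tensor at higher precision is also consistent with the note's claim that quality will be "the same or higher".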