Update README.md
README.md (CHANGED)

@@ -2,14 +2,12 @@
 license: apache-2.0
 inference: false
 ---
-![perplexity stats](https://huggingface.co/eachadea/ggml-vicuna-13b-1.1/resolve/main/perplexity.png)
 
 **NOTE: This GGML conversion is primarily for use with llama.cpp.**
 - 13B parameters
 - 4-bit quantized
 - Based on version 1.1
-- Used
-- For q4_2, "Q4_2 ARM #1046" was used. Will update regularly if new changes are made.
+- Used best available quantization for each format
 - **Choosing between q4_0, q4_1, and q4_2:**
 - 4_0 is the fastest. The quality is the poorest.
 - 4_1 is slower. The quality is noticeably better.
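The speed/quality tradeoff the README describes comes from how each format spends its per-block metadata: in llama.cpp, q4_0 stores a single scale per block of 32 weights (symmetric around zero), while q4_1 additionally stores the block minimum, so its 16 levels cover exactly the block's own range. A simplified NumPy sketch of that idea (illustrative only, not llama.cpp's actual implementation; function names are made up):

```python
import numpy as np

def quantize_q4_0(block):
    # Symmetric 4-bit: one scale per block, zero point fixed at 0.
    d = np.max(np.abs(block)) / 7.0
    if d == 0:
        return np.zeros_like(block)
    q = np.clip(np.round(block / d), -8, 7)
    return q * d  # dequantized values

def quantize_q4_1(block):
    # Asymmetric 4-bit: scale plus per-block minimum, so all 16 levels
    # are spent on the block's actual value range.
    lo, hi = block.min(), block.max()
    d = (hi - lo) / 15.0
    if d == 0:
        return np.full_like(block, lo)
    q = np.clip(np.round((block - lo) / d), 0, 15)
    return q * d + lo

rng = np.random.default_rng(0)
# Offset data (all values far from zero) is the worst case for q4_0:
# most of its symmetric range is wasted on values that never occur.
block = rng.uniform(10.0, 20.0, size=32)
err0 = np.abs(block - quantize_q4_0(block)).mean()
err1 = np.abs(block - quantize_q4_1(block)).mean()
print(f"q4_0 mean abs error: {err0:.3f}")
print(f"q4_1 mean abs error: {err1:.3f}")
```

On blocks like this, the extra bytes q4_1 spends on the minimum buy a noticeably smaller reconstruction error, which is the "quality is noticeably better" the README mentions; the extra per-block arithmetic is why it is slower.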