---
license: apache-2.0
inference: false
---

**NOTE: This GGML conversion is primarily for use with llama.cpp.**
- PR #896 was used for `q4_0`. Everything else uses the latest code as of upload time.
- A warning for `q4_2` and `q4_3`: these formats are still a work in progress. Do not expect any backwards compatibility until they are finalized.
- 7B can be found here: https://huggingface.co/eachadea/ggml-vicuna-7b-1.1
- **Choosing the right model:**
- `ggml-vicuna-13b-1.1-q4_0` - Fast, but less accurate.
- `ggml-vicuna-13b-1.1-q4_1` - More accurate, but slower.
- `ggml-vicuna-13b-1.1-q4_2` - Pretty much a better `q4_0`. Similarly fast, but more accurate.
- `ggml-vicuna-13b-1.1-q4_3` - Pretty much a better `q4_1`. More accurate, still slower.
- `ggml-vicuna-13b-1.0-uncensored` - An uncensored/unfiltered variant of the model, available in `q4_2` and `q4_3`. It is based on the previous release and still uses the `### Human:` syntax. Avoid it unless you specifically need it.
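The speed/accuracy tradeoff above comes down to how much metadata each format stores per block of quantized weights. As a rough back-of-envelope sketch (assuming `q4_0` stores 32 four-bit weights plus one fp32 scale per block, for 20 bytes, and `q4_1` additionally stores an fp32 minimum, for 24 bytes; the function name is just for illustration), you can estimate on-disk sizes like this:

```python
# Rough size estimate for a 13B-parameter model under 4-bit GGML formats.
# Assumed block layouts: q4_0 = 16 bytes of quants + 4-byte fp32 scale (20 B);
# q4_1 = the same plus a 4-byte fp32 minimum (24 B). Headers and any
# non-quantized layers are ignored.

def ggml_4bit_size_gib(n_params: float, block_bytes: int, block_size: int = 32) -> float:
    """Approximate on-disk size in GiB for n_params weights."""
    n_blocks = n_params / block_size
    return n_blocks * block_bytes / 1024**3

for name, block_bytes in [("q4_0", 20), ("q4_1", 24)]:
    print(f"{name}: ~{ggml_4bit_size_gib(13e9, block_bytes):.1f} GiB")
```

Under these assumptions `q4_1` pays roughly 20% more disk (and memory bandwidth) than `q4_0` for its extra per-block offset, which is where the accuracy-for-speed trade comes from.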
---
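Since these files are meant for llama.cpp, a minimal interactive invocation might look like the sketch below (assuming a `main` binary built from the llama.cpp source tree; the model path is whichever quantization you downloaded, and Vicuna 1.1 uses the `USER:`/`ASSISTANT:` prompt style rather than `### Human:`):

```shell
# Interactive chat with llama.cpp; adjust the model path to your download.
./main -m ./models/ggml-vicuna-13b-1.1-q4_2.bin \
  --color -i \
  -r "USER:" \
  -p "USER: Hello, who are you?
ASSISTANT:"
```

`-i` keeps the session interactive and `-r "USER:"` returns control to you whenever the model emits the user tag.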

# Vicuna Model Card