Update README.md
README.md
CHANGED
@@ -9,14 +9,14 @@ Other quantized models are available from TheBloke: [GGML](https://huggingface.c
 
 ## Model details
 
-|
-
-| [main](https://huggingface.co/R136a1/MythoMax-L2-13B-exl2/tree/main) | 5
-| [6bit](https://huggingface.co/R136a1/MythoMax-L2-13B-exl2/tree/6bit) | 6
-| [3bit](https://huggingface.co/R136a1/MythoMax-L2-13B-exl2/tree/3bit) | 3
-| -
-| - | 7
-| - | 8
+| Branch | Bits | Perplexity | Desc |
+|----------------------------------------------------------------------|------|------------|---------------------------------------|
+| [main](https://huggingface.co/R136a1/MythoMax-L2-13B-exl2/tree/main) | 5 | 6.1018 | Up to 6144 context size on T4 GPU |
+| [6bit](https://huggingface.co/R136a1/MythoMax-L2-13B-exl2/tree/6bit) | 6 | 6.1182 | 4096 context size (tokens) on T4 GPU |
+| [3bit](https://huggingface.co/R136a1/MythoMax-L2-13B-exl2/tree/3bit) | 3 | 6.3666 | Low bits quant while still good |
+| [4bit](https://huggingface.co/R136a1/MythoMax-L2-13B-exl2/tree/4bit) | 4 | 6.1601 | To be updated |
+| - | 7 | 6.1056 | 2048 max context size for T4 GPU |
+| - | 8 | 6.1027 | Just, why? |
 
 I'll upload the 7 and 8 bits quant if someone request it. (Idk y the 5 bits quant preplexity is lower than higher bits quant, I think I did something wrong?)
 
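The note in the README asks why the 5-bit quant reports a lower perplexity than the higher-bit quants. A minimal Python sketch, using only the figures from the added table, ranks the quants and shows how far each sits from the best score. The names `perplexity_by_bits`, `rank_by_perplexity`, and `gap_to_best` are illustrative and are not part of the repository.

```python
# Perplexity figures copied from the table added in this commit (bits -> perplexity).
perplexity_by_bits = {
    3: 6.3666,
    4: 6.1601,
    5: 6.1018,
    6: 6.1182,
    7: 6.1056,
    8: 6.1027,
}


def rank_by_perplexity(table):
    """Return (bits, perplexity) pairs sorted from lowest to highest perplexity."""
    return sorted(table.items(), key=lambda item: item[1])


def gap_to_best(table):
    """Return each quant's perplexity gap relative to the lowest value in the table."""
    best = min(table.values())
    return {bits: round(ppl - best, 4) for bits, ppl in table.items()}


ranking = rank_by_perplexity(perplexity_by_bits)
gaps = gap_to_best(perplexity_by_bits)

for bits, ppl in ranking:
    print(f"{bits}-bit: perplexity {ppl:.4f} (+{gaps[bits]:.4f} vs best)")
```

On these numbers the 5-, 7-, and 8-bit quants differ by less than 0.004, which is small next to the gap of about 0.26 to the 3-bit quant. Whether the inversion is noise or a real effect cannot be told from these figures alone.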