Update README.md
README.md
CHANGED
@@ -1,3 +1,9 @@
+---
+license: other
+language:
+- en
+---
+
 [EXL2](https://github.com/turboderp/exllamav2/tree/master#exllamav2) Quantization of [Gryphe's MythoMax L2 13B](https://huggingface.co/Gryphe/MythoMax-L2-13b).
 
 Other quantized models are available from TheBloke: [GGML](https://huggingface.co/TheBloke/MythoMax-L2-13B-GGML) - [GPTQ](https://huggingface.co/TheBloke/MythoMax-L2-13B-GPTQ) - [GGUF](https://huggingface.co/TheBloke/MythoMax-L2-13B-GGUF) - [AWQ](https://huggingface.co/TheBloke/MythoMax-L2-13B-AWQ)
@@ -8,14 +14,12 @@ Other quantized models are available from TheBloke: [GGML](https://huggingface.c
 
 ## Model details
 
-| Branch
-
-| [
-| [
-| [
-| [
-| - | 7 | 6.1056 | 2048 max context size for T4 GPU |
-| - | 8 | 6.1027 | Just, why? |
+| **Branch** | **Bits** | **Perplexity** | **Description** |
+|----------------------------------------------------------------------|----------|----------------|-------------------------------------------------------------|
+| [3bit](https://huggingface.co/R136a1/MythoMax-L2-13B-exl2/tree/3bit) | 3.73 | 5.8251 | Low-bit quant that is still good |
+| [4bit](https://huggingface.co/R136a1/MythoMax-L2-13B-exl2/tree/4bit) | 4.33 | 5.7784 | Can reach 6K context on a T4 GPU |
+| [main](https://huggingface.co/R136a1/MythoMax-L2-13B-exl2/tree/main) | 5.33 | 5.7427 | 4K context on a T4 GPU (recommended if you use Google Colab) |
+| [6bit](https://huggingface.co/R136a1/MythoMax-L2-13B-exl2/tree/6bit) | 6.13 | 5.7347 | For those who want better quality and can run it |
 
 I'll upload the 7- and 8-bit quants if someone requests them. (Idk why the 5-bit quant's perplexity is lower than the higher-bit quants', I think I did something wrong?)
 
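The branch table in this diff maps each quantization level to a Git branch of the `R136a1/MythoMax-L2-13B-exl2` repository. As a minimal sketch, assuming the `huggingface_hub` package is installed, one branch could be fetched by passing its name as `revision` to `snapshot_download`. The helper names `pick_branch` and `download_branch` are hypothetical and are not part of the README; the numbers are copied from the table.

```python
# Bits per weight and perplexity for each branch, copied from the table in the diff.
BRANCHES = {
    "3bit": (3.73, 5.8251),
    "4bit": (4.33, 5.7784),
    "main": (5.33, 5.7427),
    "6bit": (6.13, 5.7347),
}
REPO_ID = "R136a1/MythoMax-L2-13B-exl2"


def pick_branch(max_bits: float) -> str:
    """Return the branch with the most bits per weight that does not exceed max_bits."""
    fitting = {name: bits for name, (bits, _ppl) in BRANCHES.items() if bits <= max_bits}
    if not fitting:
        raise ValueError(f"no branch at or below {max_bits} bits per weight")
    return max(fitting, key=fitting.get)


def download_branch(branch: str) -> str:
    """Download one branch of the repo and return the local snapshot path."""
    # Imported lazily so pick_branch works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, revision=branch)


if __name__ == "__main__":
    branch = pick_branch(4.5)
    print(branch)
    # download_branch(branch)  # uncomment to fetch the weights (several GB)
```

With a budget of 4.5 bits per weight, `pick_branch` selects the `4bit` branch. The download call is left commented out so the sketch can be run without fetching the model.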