R136a1 committed
Commit 36d8bbd
1 Parent(s): ed66c38

Update README.md

Files changed (1): README.md (+12 -8)
README.md CHANGED
@@ -1,3 +1,9 @@
 [EXL2](https://github.com/turboderp/exllamav2/tree/master#exllamav2) Quantization of [Gryphe's MythoMax L2 13B](https://huggingface.co/Gryphe/MythoMax-L2-13b).

 Other quantized models are available from TheBloke: [GGML](https://huggingface.co/TheBloke/MythoMax-L2-13B-GGML) - [GPTQ](https://huggingface.co/TheBloke/MythoMax-L2-13B-GPTQ) - [GGUF](https://huggingface.co/TheBloke/MythoMax-L2-13B-GGUF) - [AWQ](https://huggingface.co/TheBloke/MythoMax-L2-13B-AWQ)
@@ -8,14 +14,12 @@ Other quantized models are available from TheBloke: [GGML](https://huggingface.c

 ## Model details

- | Branch | Bits | Perplexity | Desc |
- |----------------------------------------------------------------------|------|------------|---------------------------------------------------------|
- | [main](https://huggingface.co/R136a1/MythoMax-L2-13B-exl2/tree/main) | 5 | 6.1018 | Up to 6144 context size on T4 GPU |
- | [6bit](https://huggingface.co/R136a1/MythoMax-L2-13B-exl2/tree/6bit) | 6 | 6.1182 | 4096 context size (tokens) on T4 GPU |
- | [3bit](https://huggingface.co/R136a1/MythoMax-L2-13B-exl2/tree/3bit) | 3 | 6.3666 | Low bits quant while still good |
- | [4bit](https://huggingface.co/R136a1/MythoMax-L2-13B-exl2/tree/4bit) | 4 | 6.1601 | Slightly better than 4bit GPTQ, ez 8K context on T4 GPU |
- | - | 7 | 6.1056 | 2048 max context size for T4 GPU |
- | - | 8 | 6.1027 | Just, why? |

 I'll upload the 7- and 8-bit quants if someone requests them. (I don't know why the 5-bit quant's perplexity is lower than the higher-bit quants; I think I did something wrong.)
21
 
 
+ ---
+ license: other
+ language:
+ - en
+ ---
+
 [EXL2](https://github.com/turboderp/exllamav2/tree/master#exllamav2) Quantization of [Gryphe's MythoMax L2 13B](https://huggingface.co/Gryphe/MythoMax-L2-13b).

 Other quantized models are available from TheBloke: [GGML](https://huggingface.co/TheBloke/MythoMax-L2-13B-GGML) - [GPTQ](https://huggingface.co/TheBloke/MythoMax-L2-13B-GPTQ) - [GGUF](https://huggingface.co/TheBloke/MythoMax-L2-13B-GGUF) - [AWQ](https://huggingface.co/TheBloke/MythoMax-L2-13B-AWQ)
 

 ## Model details

+ | **Branch** | **Bits** | **Perplexity** | **Description** |
+ |----------------------------------------------------------------------|----------|----------------|--------------------------------------------------------------|
+ | [3bit](https://huggingface.co/R136a1/MythoMax-L2-13B-exl2/tree/3bit) | 3.73 | 5.8251 | Low-bit quant that still holds up well |
+ | [4bit](https://huggingface.co/R136a1/MythoMax-L2-13B-exl2/tree/4bit) | 4.33 | 5.7784 | Up to 6K context on a T4 GPU |
+ | [main](https://huggingface.co/R136a1/MythoMax-L2-13B-exl2/tree/main) | 5.33 | 5.7427 | 4K context on a T4 GPU (recommended if you use Google Colab) |
+ | [6bit](https://huggingface.co/R136a1/MythoMax-L2-13B-exl2/tree/6bit) | 6.13 | 5.7347 | For those who want better quality and can run it |
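Each row of the table is a separate branch of this repo, so you pin one quant by its branch name. A minimal sketch using `huggingface_hub`'s `snapshot_download` (the `local_dir` path here is illustrative, and the download needs network access):

```python
# Sketch: fetch a single quant branch of this repo.
# Requires: pip install huggingface_hub
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="R136a1/MythoMax-L2-13B-exl2",
    revision="4bit",  # branch name from the table above, e.g. "main", "3bit", "6bit"
    local_dir="MythoMax-L2-13B-exl2-4bit",  # illustrative destination path
)
```

Cloning with `git clone --single-branch --branch 4bit` (with git-lfs installed) works the same way.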
 
 

 I'll upload the 7- and 8-bit quants if someone requests them. (I don't know why the 5-bit quant's perplexity is lower than the higher-bit quants; I think I did something wrong.)
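For reading the Perplexity column: perplexity is just the exponential of the mean per-token negative log-likelihood, so lower is better and differences of a few hundredths (e.g. 5.7427 vs. 5.7347) are small. A minimal sketch of the relationship (the `perplexity` helper is illustrative, not part of exllamav2):

```python
import math

def perplexity(token_nlls):
    """Perplexity = exp(mean negative log-likelihood per token)."""
    return math.exp(sum(token_nlls) / len(token_nlls))

# A model with zero loss on every token has perplexity 1 (perfect prediction).
print(perplexity([0.0, 0.0, 0.0]))  # → 1.0

# If the mean NLL is ln(5.7427), the perplexity is exactly 5.7427,
# matching the "main" branch entry in the table above.
mean_nll = math.log(5.7427)
print(round(perplexity([mean_nll] * 4), 4))  # → 5.7427
```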