Update README.md
README.md CHANGED
@@ -14,7 +14,7 @@ Other quantized models are available from TheBloke: [GGML](https://huggingface.c
 | [main](https://huggingface.co/R136a1/MythoMax-L2-13B-exl2/tree/main) | 5 | 6.1018 | Up to 6144 context size on T4 GPU |
 | [6bit](https://huggingface.co/R136a1/MythoMax-L2-13B-exl2/tree/6bit) | 6 | 6.1182 | 4096 context size (tokens) on T4 GPU |
 | [3bit](https://huggingface.co/R136a1/MythoMax-L2-13B-exl2/tree/3bit) | 3 | 6.3666 | Low bits quant while still good |
-| [4bit](https://huggingface.co/R136a1/MythoMax-L2-13B-exl2/tree/4bit) | 4 | 6.1601 |
+| [4bit](https://huggingface.co/R136a1/MythoMax-L2-13B-exl2/tree/4bit) | 4 | 6.1601 | Slightly better than 4bit GPTQ |
 | - | 7 | 6.1056 | 2048 max context size for T4 GPU |
 | - | 8 | 6.1027 | Just, why? |