R136a1/MythoMax-L2-13B-exl2

R136a1 committed on Sep 24, 2023

Commit abdfc21 • 1 Parent(s): 3f0a176

Update README.md

Files changed (1)

README.md +8 -8

@@ -9,14 +9,14 @@ Other quantized models are available from TheBloke: [GGML](https://huggingface.c
 ## Model details
-| **Branch**                                                           | **Bits** | **Perplexity** |
-|----------------------------------------------------------------------|----------|----------------|
-| [main](https://huggingface.co/R136a1/MythoMax-L2-13B-exl2/tree/main) | 5        | 6.1018         |
-| [6bit](https://huggingface.co/R136a1/MythoMax-L2-13B-exl2/tree/6bit) | 6        | 6.1182         |
-| -                                                                    | 7        | 6.1056         |
-| -                                                                    | 8        | 6.1027         |
-I'll upload the 7 and 8 bits quant if someone request it.
+| **Branch**                                                           | **Bits** | **Perplexity** | **Description**                       |
+|----------------------------------------------------------------------|----------|----------------|---------------------------------------|
+| [main](https://huggingface.co/R136a1/MythoMax-L2-13B-exl2/tree/main) | 5        | 6.1018         | Up to 6144 context size on T4 GPU     |
+| [6bit](https://huggingface.co/R136a1/MythoMax-L2-13B-exl2/tree/6bit) | 6        | 6.1182         | 4096 context size (tokens) on T4 GPU  |
+| -                                                                    | 7        | 6.1056         | 2048 max context size for T4 GPU      |
+| -                                                                    | 8        | 6.1027         | Just, why?                            |
+I'll upload the 7- and 8-bit quants if someone requests them. (I don't know why the 5-bit quant's perplexity is lower than that of the higher-bit quants; this needs more testing.)
 ## Prompt Format