---
license: other
language:
- en
---
|
|
|
[EXL2](https://github.com/turboderp/exllamav2/tree/master#exllamav2) quantization of [Gryphe's MythoMax L2 13B](https://huggingface.co/Gryphe/MythoMax-L2-13b).
|
|
|
Other quantized models are available from TheBloke: [GGML](https://huggingface.co/TheBloke/MythoMax-L2-13B-GGML) - [GPTQ](https://huggingface.co/TheBloke/MythoMax-L2-13B-GPTQ) - [GGUF](https://huggingface.co/TheBloke/MythoMax-L2-13B-GGUF) - [AWQ](https://huggingface.co/TheBloke/MythoMax-L2-13B-AWQ)
|
|
|
|
|
|
|
|
|
|
|
## Model details
|
|
|
| **Branch** | **Bits (bpw)** | **Perplexity** | **Description** |
|----------------------------------------------------------------------|----------|----------------|-------------------------------------------------------------|
| [3bit](https://huggingface.co/R136a1/MythoMax-L2-13B-exl2/tree/3bit) | 3.73 | 5.8251 | Low-bit quant that is still usable |
| [4bit](https://huggingface.co/R136a1/MythoMax-L2-13B-exl2/tree/4bit) | 4.33 | 5.7784 | Fits 6K context on a T4 GPU |
| [main](https://huggingface.co/R136a1/MythoMax-L2-13B-exl2/tree/main) | 5.33 | 5.7427 | 4K context on a T4 GPU (recommended if you use Google Colab) |
| [6bit](https://huggingface.co/R136a1/MythoMax-L2-13B-exl2/tree/6bit) | 6.13 | 5.7347 | Higher quality, for those whose hardware can run it |
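As a quick reference, the T4 recommendations in the table above can be encoded as a small helper (the function name is illustrative, not part of the repo):

```python
def recommended_branch(max_context: int) -> str:
    """Pick a quant branch for a 16 GB T4 GPU based on desired context length.

    Based on the branch table: the 4.33 bpw quant fits ~6K context on a T4,
    while the 5.33 bpw "main" quant fits 4K context (the Colab recommendation).
    """
    if max_context > 4096:
        return "4bit"   # 4.33 bpw, ~6K context on a T4
    return "main"       # 5.33 bpw, 4K context (recommended for Google Colab)
```

For example, `recommended_branch(6000)` returns `"4bit"`, matching the table's guidance for longer contexts.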
|
|
|
I'll upload 7-bit and 8-bit quants if someone requests them. (I don't know why the 5-bit quant's perplexity is lower than that of higher-bit quants; I may have done something wrong.)
|
|
|
## Prompt Format
|
|
|
Alpaca format:
|
```
### Instruction:
{prompt}

### Response:
```
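A minimal sketch of assembling this prompt in Python (the helper name and the `instruction` parameter are illustrative):

```python
def make_alpaca_prompt(instruction: str) -> str:
    # Wrap a user instruction in the Alpaca template shown above;
    # the model generates its answer after the "### Response:" header.
    return f"### Instruction:\n{instruction}\n\n### Response:\n"
```

The resulting string can be passed as-is to whatever EXL2-compatible loader you use (e.g. text-generation-webui or an ExLlamaV2 generator).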