---
license: other
language:
  - en
---

EXL2 Quantization of Gryphe's MythoMax L2 13B.

Other quantized versions of this model are available from TheBloke: GGML, GPTQ, GGUF, and AWQ.

## Model details

Base perplexity: 5.7447

| Branch | Bits | Perplexity | Description |
| ------ | ---- | ---------- | ----------- |
| 3bit   | 3.73 | 5.8251     | Low-bit quant that still performs well |
| 4bit   | 4.33 | 5.7784     | Can run 6K context on a T4 GPU |
| main   | 5.33 | 5.7427     | 4K context on a T4 GPU (recommended if you use Google Colab) |
| 6bit   | 6.13 | 5.7347     | For those who want better quality and are able to run it |

I'll upload the 7-bit and 8-bit quants if someone requests them. (I don't know why the 5-bit quant's perplexity comes out lower than the higher-bit quants'; I think I may have done something wrong?)
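For reference, here is a minimal sketch of fetching one of the quant branches from the table above with `huggingface_hub`. The `repo_id` shown is an assumption based on this card, not a confirmed identifier; substitute the actual repo id.

```python
# Minimal sketch: download a single quant branch of this repo.
# Assumes huggingface_hub is installed (pip install huggingface_hub).
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="R136a1/MythoMax-L2-13B-exl2",  # assumed repo id, replace with the real one
    revision="4bit",                        # branch name from the table above
)
print(f"Model files downloaded to: {local_dir}")
```

Each quant lives on its own git branch, so `revision` selects which bit-width you get; omitting it downloads the `main` (5.33-bit) branch.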

## Prompt Format

Alpaca format:

```
### Instruction:

### Response:
```
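As a quick illustration, here is one way to fill in that template in Python. The `build_prompt` helper and its argument name are hypothetical, written for this card rather than taken from the repo.

```python
# Hypothetical helper (not part of this repo) that fills the Alpaca template.
def build_prompt(instruction: str) -> str:
    # The model's reply is generated after the "### Response:" marker.
    return f"### Instruction:\n{instruction}\n\n### Response:\n"

print(build_prompt("Write a short story about a gryphon."))
```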