---
license: other
language:
- en
---
EXL2 Quantization of Gryphe's MythoMax L2 13B.
Other quantized models are available from TheBloke: GGML, GPTQ, GGUF, and AWQ.
## Model details
| Branch | Bits | Perplexity | Description |
|---|---|---|---|
| main | 5 | 6.1018 | Up to 6144-token context on a T4 GPU |
| 6bit | 6 | 6.1182 | 4096-token context on a T4 GPU |
| 3bit | 3 | 6.3666 | Low-bit quant that still holds up well |
| 4bit | 4 | 6.1601 | Slightly better than 4-bit GPTQ; 8K context fits easily on a T4 GPU |
| - | 7 | 6.1056 | 2048-token max context on a T4 GPU |
| - | 8 | 6.1027 | Just, why? |
I'll upload the 7-bit and 8-bit quants if someone requests them. (I don't know why the 5-bit quant's perplexity is lower than that of the higher-bit quants; I may have made a mistake somewhere.)
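Each quant lives on its own branch of this repository, so a specific one can be fetched by passing the branch name as the `revision`. A minimal sketch using `huggingface_hub` (the `repo_id` below is a placeholder; substitute this repository's actual id):

```python
# Download a specific quant branch with huggingface_hub.
# NOTE: "user/MythoMax-L2-13B-exl2" is a placeholder repo id, not the
# real repository id; replace it before running.
from huggingface_hub import snapshot_download

model_dir = snapshot_download(
    repo_id="user/MythoMax-L2-13B-exl2",  # placeholder
    revision="6bit",                      # branch name from the table above
)
print(model_dir)  # local path to the downloaded model files
```

Omitting `revision` (or passing `"main"`) fetches the 5-bit quant from the default branch.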
## Prompt Format
Alpaca format:

```
### Instruction:

### Response:
```
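As an illustration, here is a small helper that assembles a prompt in this format (the function name and the sample instruction are my own, not part of the model card):

```python
# Build an Alpaca-style prompt string for this model.
# The helper name and the example instruction are illustrative assumptions.
def build_alpaca_prompt(instruction: str) -> str:
    return (
        "### Instruction:\n"
        f"{instruction}\n\n"
        "### Response:\n"
    )

print(build_alpaca_prompt("Write a short story about a gryphon."))
```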