README.md · R136a1/MythoMax-L2-13B-exl2 at 2a7b65a2e8189fb3f4e3daa5fc6affbcf365f35b

metadata

license: other
language:
  - en

EXL2 Quantization of Gryphe's MythoMax L2 13B.

Other quantized models are available from TheBloke: GGML - GPTQ - GGUF - AWQ

Model details

Branch	Bits	Perplexity	Description
main	5	6.1018	Up to 6144 tokens of context on a T4 GPU
6bit	6	6.1182	4096 tokens of context on a T4 GPU
-	7	6.1056	At most 2048 tokens of context on a T4 GPU
-	8	6.1027	Just, why?

I'll upload the 7-bit and 8-bit quants if someone requests them. (I don't know why the 5-bit quant's perplexity is lower than that of the higher-bit quants. I may have done something wrong.)
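Since each uploaded quant lives on its own branch, a specific one can be fetched by passing the branch name as the revision. The sketch below is one way to do that with `huggingface_hub.snapshot_download`; the repository ID is taken from this page's title, and the helper names `branch_for_bits` and `download_quant` are made up for illustration. It assumes `huggingface_hub` is installed.

```python
# Sketch only: helper names are hypothetical, not part of this repo.
REPO_ID = "R136a1/MythoMax-L2-13B-exl2"

# Branch names from the table above. The 7-bit and 8-bit quants have no
# branch because they are not uploaded.
BRANCH_BY_BITS = {5: "main", 6: "6bit"}


def branch_for_bits(bits: int) -> str:
    """Return the branch that holds the quant with the given bit width."""
    if bits not in BRANCH_BY_BITS:
        available = sorted(BRANCH_BY_BITS)
        raise ValueError(f"No uploaded {bits}-bit quant; available: {available}")
    return BRANCH_BY_BITS[bits]


def download_quant(bits: int, local_dir=None) -> str:
    """Download one quant and return the local folder path."""
    # Imported here so the mapping above is usable without the package.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        revision=branch_for_bits(bits),  # branch name selects the quant
        local_dir=local_dir,
    )
```

For example, `download_quant(6)` would pull the `6bit` branch into the local Hugging Face cache.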

Prompt Format

Alpaca format:

### Instruction:


### Response:
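A minimal sketch of filling in that template in Python. The function name `build_alpaca_prompt` is hypothetical, and the exact whitespace (instruction directly under its header, one blank line before the response header) is an assumption based on the layout above rather than something this README specifies.

```python
def build_alpaca_prompt(instruction: str) -> str:
    """Wrap an instruction in the Alpaca headers shown above.

    The prompt ends right after the response header so that the model's
    generated text continues from there.
    """
    return f"### Instruction:\n{instruction.strip()}\n\n### Response:\n"


if __name__ == "__main__":
    print(build_alpaca_prompt("Write a two-line poem about the sea."))
```

The returned string is what you would pass to your inference backend as the prompt.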