license: other
language:
  - en

EXL2 Quantization of Gryphe's MythoMax L2 13B.

Other quantized models are available from TheBloke: GGML - GPTQ - GGUF - AWQ

Model details

| Branch | Bits | Perplexity | Description |
|--------|------|------------|-------------|
| main | 5 | 6.1018 | Up to 6144 context size on T4 GPU |
| 6bit | 6 | 6.1182 | 4096 context size (tokens) on T4 GPU |
| - | 7 | 6.1056 | 2048 max context size for T4 GPU |
| - | 8 | 6.1027 | Just, why? |

I'll upload the 7-bit and 8-bit quants if someone requests them. (I don't know why the 5-bit quant's perplexity is lower than that of the higher-bit quants; I may have done something wrong.)
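If it helps, the table above can be read as a small lookup: pick the highest-bit branch that is actually uploaded and whose T4 context limit covers what you need. This is only a rough sketch; the `PUBLISHED_QUANTS` list and the `pick_branch` helper are hypothetical names made up for illustration, and the numbers are simply copied from the table.

```python
# Branches that are currently uploaded, copied from the table above:
# (branch name, bits, max context size on a T4 GPU).
# PUBLISHED_QUANTS and pick_branch are illustrative names, not part of any library.
PUBLISHED_QUANTS = [
    ("main", 5, 6144),
    ("6bit", 6, 4096),
]


def pick_branch(required_context: int) -> str:
    """Return the highest-bit uploaded branch whose T4 context limit fits."""
    candidates = [q for q in PUBLISHED_QUANTS if q[2] >= required_context]
    if not candidates:
        raise ValueError(
            f"No uploaded quant fits {required_context} tokens of context on a T4"
        )
    # Prefer more bits when several branches fit.
    return max(candidates, key=lambda q: q[1])[0]


if __name__ == "__main__":
    print(pick_branch(4096))
    print(pick_branch(6144))
```

The returned branch name can then be passed as the `revision` argument of `huggingface_hub.snapshot_download` to fetch that quant.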

Prompt Format

Alpaca format:

### Instruction:


### Response:
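
A minimal sketch of filling in the template above from Python. The `build_alpaca_prompt` helper is a hypothetical name, and the exact whitespace (instruction directly under its header, one blank line before the response header, no system preamble) is an assumption based on the template as shown here, so adjust it if your frontend expects something different.

```python
def build_alpaca_prompt(instruction: str) -> str:
    """Wrap an instruction in the Alpaca-style headers shown above.

    Assumption: the instruction goes directly under "### Instruction:" and
    the model's reply is generated after "### Response:".
    """
    return (
        "### Instruction:\n"
        f"{instruction.strip()}\n"
        "\n"
        "### Response:\n"
    )


if __name__ == "__main__":
    print(build_alpaca_prompt("Write a short story about a gryphon."))
```

The resulting string is what you would pass to the model as the prompt; generation continues from the line after `### Response:`.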
