metadata
license: other
language:
- en
EXL2 Quantization of Gryphe's MythoMax L2 13B.
Other quantized models are available from TheBloke: GGML - GPTQ - GGUF - AWQ
Model details
Branch | Bits | Perplexity | Description |
---|---|---|---|
main | 5 | 6.1018 | Up to 6144-token context on a T4 GPU |
6bit | 6 | 6.1182 | 4096-token context on a T4 GPU |
3bit | 3 | 6.3666 | Low-bit quant that still performs well |
4bit | 4 | 6.1601 | To be updated |
- | 7 | 6.1056 | 2048-token max context on a T4 GPU |
- | 8 | 6.1027 | Just, why? |
I'll upload the 7- and 8-bit quants if someone requests them. (I don't know why the 5-bit quant's perplexity is lower than that of the higher-bit quants; I may have done something wrong.)
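Since each quant lives on its own branch, a small helper can map a bit width to the branch name from the table above. This is only a sketch: `branch_for_bits` and `download_quant` are hypothetical names, the repository ID is not given here and must be supplied by you, and the download step assumes the `huggingface_hub` package is installed.

```python
# Branch names copied from the table above; the 7- and 8-bit quants have no branch yet.
BRANCH_BY_BITS = {5: "main", 6: "6bit", 3: "3bit", 4: "4bit"}


def branch_for_bits(bits: int) -> str:
    """Return the branch that holds the quant with the given bit width."""
    try:
        return BRANCH_BY_BITS[bits]
    except KeyError:
        raise ValueError(f"no uploaded branch for a {bits}-bit quant") from None


def download_quant(repo_id: str, bits: int, local_dir: str) -> str:
    """Download one branch of the repository (needs network access).

    `repo_id` is whatever "<user>/<repo>" this model card belongs to.
    """
    # Imported lazily so the branch lookup works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=repo_id,
        revision=branch_for_bits(bits),
        local_dir=local_dir,
    )
```

For example, `download_quant("<user>/<repo>", bits=6, local_dir="models/mythomax-6bit")` would fetch the `6bit` branch, with the placeholder replaced by the real repository ID.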
Prompt Format
Alpaca format:
### Instruction:
### Response:
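A minimal sketch of assembling a prompt in this format is shown here. The function name `build_alpaca_prompt`, the optional `system` text placed before the instruction, and the blank-line spacing are assumptions; only the two `###` headers come from this card.

```python
def build_alpaca_prompt(instruction: str, system: str = "") -> str:
    """Build an Alpaca-style prompt that ends right after the response header."""
    parts = []
    if system:
        # Optional leading text (e.g. a system prompt); an assumption, not required.
        parts.append(system.strip())
    parts.append("### Instruction:\n" + instruction.strip())
    # The model is expected to continue generating after this header.
    parts.append("### Response:\n")
    return "\n\n".join(parts)


if __name__ == "__main__":
    print(build_alpaca_prompt("Summarize the plot of Hamlet in two sentences."))
```

The returned string ends with `### Response:` and a newline, so the generated text begins on the line after the header.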