---
license: other
---
5-bit (5bpw) quantization of airoboros 70b 1.4.1 (https://huggingface.co/jondurbin/airoboros-l2-70b-gpt4-1.4.1), made with exllamav2.

Update 21/09/23

Re-quantized with the latest exllamav2 version, which fixed some measurement issues.

Also, 5bpw now fits on 2x24GB VRAM cards using gpu_split 21,21 and flash-attn (Linux only for now), at 4096 context with about 1 GB to spare, so a longer context may be worth trying.
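
As a rough sanity check on why a 21,21 split is plausible, the sketch below estimates the size of the quantized weights and compares it with the total split budget. It is only back-of-the-envelope arithmetic: the parameter count of about 69 billion is an assumption (it is not stated above), the helper names are hypothetical, and the estimate ignores the KV cache, activations, and any tensors stored at higher precision.

```python
def quantized_weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GiB."""
    total_bytes = n_params * bits_per_weight / 8
    return total_bytes / 2**30


def parse_gpu_split(spec: str) -> list[float]:
    """Turn a split string such as "21,21" into per-GPU budgets."""
    return [float(part) for part in spec.split(",")]


# Assumption: roughly 69e9 parameters for a Llama 2 70B class model.
ASSUMED_PARAMS = 69e9

weights_gib = quantized_weight_gib(ASSUMED_PARAMS, 5.0)
split_total = sum(parse_gpu_split("21,21"))

print(f"estimated weights: {weights_gib:.1f} GiB")
print(f"total split budget: {split_total:.0f}")
print(f"left for cache and overhead: {split_total - weights_gib:.1f}")
```

The remainder is what the context cache and other overhead have to fit into, which is why the context length and flash-attn matter at this bitrate.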