Can you produce a 2.4bpw quantization of this model?
#1 by xldistance - opened
An RTX 4090 with 24GB of VRAM can only run this model at 2.4bpw quantization.
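For a rough sense of where a limit like that comes from, here is a minimal sketch of the weight-size arithmetic (parameters times bits per weight, divided by 8 for bytes). The parameter count `N_PARAMS` is a hypothetical placeholder, not a figure from this thread, and the estimate covers weights only: the KV cache, activations and framework overhead need additional VRAM on top.

```python
def weight_footprint_gib(n_params: float, bpw: float) -> float:
    """Approximate size of the quantized weights in GiB (weights only)."""
    return n_params * bpw / 8 / 2**30


# Hypothetical parameter count, for illustration only; substitute the real one.
N_PARAMS = 70e9

if __name__ == "__main__":
    for bpw in (2.25, 2.4, 2.5, 2.75):
        size = weight_footprint_gib(N_PARAMS, bpw)
        print(f"{bpw:.2f} bpw -> {size:.1f} GiB of weights")
```

Comparing the printed sizes against the card's 24GB shows how much headroom is left for context at each bpw.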
xldistance changed discussion title from "Can you produce a 2.34bpw quantization of this model?" to "Can you produce a 2.4bpw quantization of this model?"
I've posted 2.75, 2.5 and 2.25bpw quants for Athene. I'm running perplexity scoring now and will update the READMEs with the scores when it's done.
Perplexity scores have now also been added.
Dracones changed discussion status to closed