ikawrakow/various-2bit-sota-gguf

What 2bit quantization approach are you using?

#13 opened about 1 year ago by

Not able to use with ctransformers

#12 opened about 1 year ago by

NousCapybara 34b

#11 opened about 1 year ago by

code-llama-70b

#10 opened about 1 year ago by

What -ctx and -chunks parameters did you use to make the iMatrix of the Llama 2 70b?

#9 opened about 1 year ago by

Quantize these amazing models

#8 opened about 1 year ago by deleted

mixtral-instruct-8x7b for Q2KS as well

#7 opened about 1 year ago by

Would love a deepseekcode 2bit quant. I bet others would love it too :)

#6 opened about 1 year ago by

[Model request] Saily 100b, Saily 220b

#5 opened about 1 year ago by

Could we combine AWQ and importance matrix calculation to further improve perplexity?

#4 opened about 1 year ago by

[Model Request] cognitivecomputations/dolphin-2.6-mistral-7b-dpo-laser

#3 opened about 1 year ago by

Magic Issues with nous-hermes-2-34b-2.16bpw.gguf (Log Attached...)

#2 opened about 1 year ago by

Could you please make a 2bit quant of the rocket 3b model?

#1 opened about 1 year ago by