any plan to release quantization that works with llama.cpp

#3
by ziyadalkhonein - opened

Any plan to release a quantization that works with llama.cpp? You know, not a lot of people have a V100 or A100.

You can run it in oobabooga in 4-bit, which takes less VRAM.
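For the llama.cpp route, the usual workflow is to convert the HF checkpoint and quantize it yourself. A minimal sketch, assuming a llama.cpp checkout and that the model architecture is supported; the exact script and binary names vary by llama.cpp version, and the paths/model names here are placeholders:

```shell
# Build llama.cpp (placeholder workflow; check the repo README for your version)
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp && make

# Convert the HF weights to llama.cpp's format in fp16
python3 convert.py /path/to/hf-model --outtype f16 --outfile model-f16.gguf

# Quantize to 4-bit (q4_0), roughly quartering memory use vs fp16
./quantize model-f16.gguf model-q4_0.gguf q4_0

# Run inference on CPU
./main -m model-q4_0.gguf -p "Hello" -n 64
```

This runs on CPU, so no V100/A100 is needed, at the cost of slower generation.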
