Thanks!

#1 opened by smcleod

Just wanted to say a quick thanks for getting the torrent and converting it, really appreciate it :)

smcleod changed discussion status to closed
Owner

uwo

@v2ray

How much VRAM does it require? Are there any GPTQ versions available?

Owner

@RageshAntony It's a 140B model, so it would require around 280GB of VRAM to load, plus a few more GB for the context and KV cache.
And AFAIK there's currently no GPTQ version available, but I think someone will make one soon.
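
For reference, a quick back-of-the-envelope sketch of where that 280GB figure comes from, assuming the weights are stored in fp16/bf16 at 2 bytes per parameter (the exact total depends on the precise parameter count, batch size, and context length):

```python
# Rough VRAM estimate for loading a dense 140B-parameter model.
# Assumption: weights in fp16/bf16, i.e. 2 bytes per parameter.
params = 140e9
bytes_per_param = 2

weights_gb = params * bytes_per_param / 1e9
print(f"weights alone: ~{weights_gb:.0f} GB")  # ~280 GB

# On top of the weights, activations and the KV cache add a few
# more GB, growing with batch size and context length.
```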

@v2ray 280GB??

Even an H100 only has 80GB. How do you load this then?

Owner

@RageshAntony You use multiple GPUs, or use a quantized version (which doesn't exist yet).
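
For illustration, a minimal sketch of what multi-GPU loading looks like with Hugging Face transformers. `device_map="auto"` (which requires the `accelerate` package) shards the weights across all visible GPUs, so e.g. 4x 80GB cards can hold ~280GB of fp16 weights. The repo id below is just a placeholder, not the actual repo:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo id -- substitute the model you actually want to load.
MODEL_ID = "some-org/some-140b-model"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# device_map="auto" splits the layers across every visible GPU
# instead of trying to fit the whole model on one device.
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,
    device_map="auto",
)
```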
