Quantization of model

#1
by kuliev-vitaly - opened

The model is too large to run even on two A100 GPUs. Quantization should help with the hardware requirements.
Could you please provide AWQ (4-bit) or FP8 quantizations?