Will quantised version be available?

#9
by angerhang - opened

Thanks for sharing but what are the recommended ways to quantise this model?
Or will quantised model be made available so that it is not as resource-intensive to do inference?

Thanks

Did you see https://huggingface.co/models?other=base_model:quantized:nvidia/Llama-3.1-Nemotron-70B-Instruct-HF?
Use the model tree section on model pages to see what quantizations are available.

Sign up or log in to comment