Quantization Performance

#4
by AutomaticHourglass - opened

How do the quantized versions perform? Is it OK to use q8, or should we only use fp16?

Here is a simple explanation of the differences between quantization levels.
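To make the q8 vs fp16 question a bit more concrete, here is a rough sketch that compares the round-trip error of the two on synthetic weights. It is only an illustration, not this model's actual quantizer: it assumes "q8" means symmetric 8-bit quantization with one scale per block, the block size of 32 is an assumption, and the weights are random Gaussian values rather than real model weights. The helper names (`to_fp16`, `quantize_q8_block`, and so on) are made up for this example.

```python
import random
import struct


def to_fp16(x):
    """Round-trip a Python float through IEEE 754 half precision."""
    return struct.unpack("e", struct.pack("e", x))[0]


def quantize_q8_block(block):
    """Symmetric 8-bit quantization of one block: one scale, ints in [-127, 127]."""
    amax = max(abs(v) for v in block)
    scale = amax / 127.0 if amax > 0 else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in block]
    return scale, q


def dequantize_q8_block(scale, q):
    """Map the stored integers back to approximate float values."""
    return [scale * v for v in q]


def rms_error(a, b):
    """Root-mean-square difference between two equally long lists."""
    return (sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)) ** 0.5


random.seed(0)
BLOCK = 32  # assumed block size, for illustration only
# Synthetic stand-in for model weights (assumption: small Gaussian values).
weights = [random.gauss(0.0, 0.02) for _ in range(BLOCK * 64)]

# fp16 path: every weight is rounded to half precision.
fp16 = [to_fp16(w) for w in weights]

# q8 path: quantize block by block, then dequantize.
q8 = []
for i in range(0, len(weights), BLOCK):
    scale, q = quantize_q8_block(weights[i:i + BLOCK])
    q8.extend(dequantize_q8_block(scale, q))

fp16_err = rms_error(weights, fp16)
q8_err = rms_error(weights, q8)
print(f"fp16 RMS error: {fp16_err:.3e}")
print(f"q8   RMS error: {q8_err:.3e}")
```

On data like this, q8 has a larger round-trip error than fp16, but the error stays small relative to the weights themselves. Weight error does not translate directly into output quality, so whether q8 is good enough for this model still has to be checked on an actual task or a perplexity run.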
