Are there instructions on how to quantize the model to achieve inference times under 2 s? Reduced precision is acceptable.