hf inference endpoint
#5
by tintin12
Has anyone tried deploying this through an HF Inference Endpoint? I get errors. I know the inference engine's command line has an option to pass a parameter indicating that it's an AWQ model, but the deployment interface doesn't provide such an option, so I get errors and can't run it.
No, I've never tried it on the hosted HF endpoints. Only with a local TGI deployment via a Docker container.
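For reference, this is roughly the kind of local invocation I mean; it's a sketch rather than an exact recipe, the model id, port, and volume path are placeholders, and it assumes the standard TGI container with its `--quantize awq` flag:

```bash
# Local TGI deployment with AWQ quantization.
# Placeholders: replace the model id and volume path with your own.
model=your-org/your-awq-model
volume=$PWD/data   # weights are cached here so they aren't re-downloaded each run

docker run --gpus all --shm-size 1g -p 8080:80 \
  -v $volume:/data \
  ghcr.io/huggingface/text-generation-inference:latest \
  --model-id $model \
  --quantize awq
```

Once it's up, a quick request such as `curl 127.0.0.1:8080/generate -X POST -H 'Content-Type: application/json' -d '{"inputs":"Hello","parameters":{"max_new_tokens":20}}'` should return generated text.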
If the hosted endpoint provides no way to specify the quantization method, then my guess would be that it's not supported, but beyond that I'm afraid I don't know. Maybe contact HF support?