Inference Endpoints Changelog 🚀

Community Article Published October 11, 2024

Week 41, Oct 7-13

This week we had a lot of nice UI/UX improvements:

  • clearer error on models that are too large for any instance type, like Llama 405B 😅

  • better logs loading message if the endpoint isn't ready

Additionally:

  • deprecated the "text2text-generation" task; it has been deprecated on the Hub and in the Inference API as well
  • you can now pass the "seed" parameter in the widget for diffusers models
  • small bug fixes on llama.cpp containers
  • you can play directly in the widget with OpenAI API parameters
  • Shoutout to Alvaro for making the NVLM-D-72B model compatible on endpoints 🙌

On the backend we're also improving autoscaling. This might not have an immediately noticeable impact for users, but it will soon ripple through to the front end as well. Stay tuned 👀