Correct machine to deploy the model on AWS Sagemaker

#11

by LorenzoCevolaniAXA - opened Feb 16, 2024

Feb 16, 2024

I am trying to deploy the model to an endpoint in AWS SageMaker.
I have tried several instances, from "ml.g5.4xlarge" with 4 GPUs, which should be the standard way of deploying a 13B model like this one, to the bigger "ml.g5.48xlarge" with 8 GPUs, and I always get an OOM error on one of the GPUs. Is there something I can try to make it work?
Do you have a configuration that is working on your side?

Ilanmeiss

Feb 12

Having the same issue. I tried ml.g5.12xlarge with 4 24GB GPUs. That should definitely be enough, but I had no success.

Ilanmeiss

Feb 12

I solved it on ml.g5.12xlarge.
You can follow this tutorial:
https://dgallitelli95.medium.com/using-aya-101-in-amazon-sagemaker-4c1f30dfa5cd
Note the version in get_huggingface_llm_image_uri("huggingface", version="1.1.0").
I first used version="2.0.2", as I usually do, and it did not work. It does work with version 1.1.0.
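
For reference, here is a minimal sketch of that setup with the SageMaker Python SDK. The instance type and container version come from this thread; the model ID, the IAM role ARN, the GPU count passed to the container, and the health-check timeout are assumptions you would need to adapt. The helper names `build_env` and `deploy` are hypothetical, and this is not the tutorial's exact code.

```python
import json

# Values reported as working in this thread.
INSTANCE_TYPE = "ml.g5.12xlarge"  # 4 GPUs with 24 GB each
LLM_IMAGE_VERSION = "1.1.0"       # version="2.0.2" reportedly did not work


def build_env(model_id: str, num_gpus: int = 4) -> dict:
    """Environment variables for the Hugging Face LLM container.

    num_gpus=4 is an assumption matching ml.g5.12xlarge; the container
    expects the value as a string.
    """
    return {
        "HF_MODEL_ID": model_id,
        "SM_NUM_GPUS": json.dumps(num_gpus),
    }


def deploy(model_id: str, role_arn: str):
    """Create the endpoint. Requires the sagemaker package and AWS credentials."""
    # Imported lazily so the configuration above can be inspected without the SDK.
    from sagemaker.huggingface import (
        HuggingFaceModel,
        get_huggingface_llm_image_uri,
    )

    image_uri = get_huggingface_llm_image_uri(
        "huggingface", version=LLM_IMAGE_VERSION
    )
    model = HuggingFaceModel(
        image_uri=image_uri,
        env=build_env(model_id),
        role=role_arn,
    )
    # A generous startup timeout (assumed value) gives a 13B model time to load.
    return model.deploy(
        initial_instance_count=1,
        instance_type=INSTANCE_TYPE,
        container_startup_health_check_timeout=600,
    )
```

Calling `deploy("<model-id>", "<your-role-arn>")` with your own values should return a predictor for the new endpoint. If it still runs out of memory, the container version is the first thing I would double-check.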

LorenzoCevolaniAXA

Feb 12

Thanks! I will try it out!

LorenzoCevolaniAXA changed discussion status to closed Feb 12
