[Cache Request] aws-neuron/Llama-2-7b-hf-neuron-budget

#59

by Gerald001 - opened Apr 18, 2024

Please add the following model to the Neuron cache: aws-neuron/Llama-2-7b-hf-neuron-budget

AWS Inferentia and Trainium org Apr 19, 2024

Llama 2 7B is already present in the cache. Go to the model card, select Deploy, and look at the Inferentia code snippet.

dacorvo changed discussion status to closed Apr 19, 2024
