aws-neuron/all-MiniLM-L6-v2-neuron


jburtoft committed on Sep 28

Commit d964909 • 1 Parent(s): 8779e29

Update README.md

Files changed (1)

README.md +41 -3

README.md CHANGED

@@ -1,3 +1,41 @@
----
-license: apache-2.0
----

+---
+license: apache-2.0
+---
+**This model is a Neuron-compiled version of [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2).**
+It was compiled with version 2.20 of the Neuron SDK. If you are running a different SDK version, you may need to run the compilation process again.
+See https://huggingface.co/docs/optimum-neuron/en/inference_tutorials/sentence_transformers for more details
+For information on how to run on SageMaker: https://huggingface.co/docs/optimum-neuron/en/inference_tutorials/sentence_transformers
+To run:
+```python
+from optimum.neuron import NeuronModelForSentenceTransformers
+from transformers import AutoTokenizer
+model_id = "jburtoft/all-MiniLM-L6-v2-neuron"
+# If you had to compile the model yourself, use the local output directory instead:
+# model_id = "all-MiniLM-L6-v2-neuron"
+model = NeuronModelForSentenceTransformers.from_pretrained(model_id)
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+# Run inference
+prompt = "I like to eat apples"
+encoded_input = tokenizer(prompt, return_tensors='pt')
+outputs = model(**encoded_input)
+token_embeddings = outputs.token_embeddings
+sentence_embedding = outputs.sentence_embedding
+print(f"token embeddings: {token_embeddings.shape}") # torch.Size([1, 7, 384])
+print(f"sentence_embedding: {sentence_embedding.shape}") # torch.Size([1, 384])
+```
+To compile:
+```bash
+optimum-cli export neuron -m sentence-transformers/all-MiniLM-L6-v2 --sequence_length 512 --batch_size 1 --task feature-extraction all-MiniLM-L6-v2-neuron
+```
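
Because the README warns that recompilation may be needed, it can help to assemble the compile command in a script. The following is a minimal sketch, not part of the commit: `build_export_command` is a hypothetical helper, and it uses only the flags that appear in the README's compile command. Actually running the command assumes a machine with the Neuron SDK and `optimum-neuron` installed.

```python
import shlex


def build_export_command(model_id, output_dir, sequence_length=512,
                         batch_size=1, task="feature-extraction"):
    """Build the optimum-cli argument list used in the README's compile step.

    Hypothetical helper: the defaults mirror the values in the README.
    """
    return [
        "optimum-cli", "export", "neuron",
        "-m", model_id,
        "--sequence_length", str(sequence_length),
        "--batch_size", str(batch_size),
        "--task", task,
        output_dir,
    ]


if __name__ == "__main__":
    cmd = build_export_command(
        "sentence-transformers/all-MiniLM-L6-v2",
        "all-MiniLM-L6-v2-neuron",
    )
    # Print the command so it can be inspected or pasted into a shell.
    print(shlex.join(cmd))
    # To run it directly (assumes the Neuron toolchain is installed):
    # import subprocess; subprocess.run(cmd, check=True)
```

The output directory passed here is the same local path that the commented-out `model_id` line in the README's inference snippet points at.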