Neuronx model for BAAI/bge-base-en-v1.5

This repository contains an AWS Inferentia2 and neuronx-compatible checkpoint for BAAI/bge-base-en-v1.5. You can find detailed information about the base model on its Model Card.

Usage on Amazon SageMaker

Coming soon.

Usage with optimum-neuron


from optimum.neuron import NeuronModelForFeatureExtraction
from transformers import AutoTokenizer
import torch
import torch_neuronx

# Load Model from Hugging Face repository
model = NeuronModelForFeatureExtraction.from_pretrained("aws-neuron/bge-base-en-v1-5-seqlen-384-bs-1")
tokenizer = AutoTokenizer.from_pretrained("aws-neuron/bge-base-en-v1-5-seqlen-384-bs-1")

# sentence input
inputs = "Hello, my dog is cute"

# Tokenize sentences
encoded_input = tokenizer(
    inputs,
    return_tensors="pt",
    truncation=True,
    max_length=model.config.neuron["static_sequence_length"],
)

# Compute embeddings
with torch.no_grad():
    # Pass tensors by keyword so each one reaches the matching forward() argument
    model_output = model(**encoded_input)

# Perform pooling. In this case, CLS pooling: take the first token's hidden state.
sentence_embeddings = model_output[0][:, 0]
# Normalize embeddings to unit L2 norm
sentence_embeddings = torch.nn.functional.normalize(sentence_embeddings, p=2, dim=1)
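
Because the embeddings above are L2-normalized, the dot product of two of them equals their cosine similarity. The following is a minimal sketch of that relationship in plain Python, so it runs without Neuron hardware. The helper names `l2_normalize` and `dot` and the two-dimensional vectors are hypothetical stand-ins for illustration; real embeddings come from the model.

```python
import math

def l2_normalize(vector):
    # Scale the vector so its L2 norm is 1, mirroring normalize(p=2) above.
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

def dot(a, b):
    # Dot product; for unit-length vectors this is the cosine similarity.
    return sum(x * y for x, y in zip(a, b))

# Toy vectors standing in for real sentence embeddings (assumption).
query = l2_normalize([3.0, 4.0])
passage = l2_normalize([4.0, 3.0])

similarity = dot(query, passage)
print(similarity)
```

With real model output, the same comparison could be done on the `sentence_embeddings` tensors by taking a matrix product of the normalized rows.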

input_shapes

{
  "sequence_length": 384,
  "batch_size": 1
}
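
The checkpoint is compiled for the fixed shapes listed above, so every input has to end up as one sequence of exactly 384 tokens. As a rough illustration of what fitting an input to that shape involves, here is a sketch in plain Python. The helper `pad_to_static_length`, the pad id of 0, and the example token ids are assumptions for illustration only; in practice the tokenizer and optimum-neuron handle truncation and padding.

```python
SEQUENCE_LENGTH = 384  # "sequence_length" from input_shapes
PAD_TOKEN_ID = 0       # assumption: BERT-style padding id

def pad_to_static_length(token_ids, length=SEQUENCE_LENGTH, pad_id=PAD_TOKEN_ID):
    # Truncate to the static length (simplified: a real tokenizer would
    # also keep the closing special token when truncating).
    ids = list(token_ids)[:length]
    # Real tokens get mask 1, padding positions get mask 0.
    mask = [1] * len(ids)
    padding = length - len(ids)
    return ids + [pad_id] * padding, mask + [0] * padding

# Hypothetical token ids for a short sentence.
input_ids, attention_mask = pad_to_static_length([101, 7592, 102])
print(len(input_ids), sum(attention_mask))
```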