---
license: apache-2.0
---

This model is compiled for neuronx devices (e.g. AWS Inferentia2, as found on `inf2` instances).

The original checkpoint is [`BAAI/bge-base-en-v1.5`](https://huggingface.co/BAAI/bge-base-en-v1.5).

## Export

Below is the command used to export this model:

```bash
optimum-cli export neuron -m BAAI/bge-base-en-v1.5 --sequence_length 384 --batch_size 1 --task feature-extraction bge_emb/
```
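
The export can also be run from Python. Here is a minimal sketch, assuming the `optimum.neuron` API accepts the static input shapes (`batch_size`, `sequence_length`) as keyword arguments when `export=True` is passed:

```python
from optimum.neuron import NeuronModelForSentenceTransformers

# Export the original checkpoint to the Neuron format with the same
# static shapes as the CLI command above (batch_size=1, sequence_length=384).
emb_model = NeuronModelForSentenceTransformers.from_pretrained(
    "BAAI/bge-base-en-v1.5",
    export=True,
    batch_size=1,
    sequence_length=384,
)

# Save the compiled artifacts to the same output directory as above.
emb_model.save_pretrained("bge_emb/")
```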

## Usage

Here is an example of using the compiled artifacts for inference:

```python
from transformers import AutoTokenizer
from optimum.neuron import NeuronModelForSentenceTransformers

# Load the tokenizer and the pre-compiled model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("optimum/bge-base-en-v1.5-neuronx")
emb_model = NeuronModelForSentenceTransformers.from_pretrained("optimum/bge-base-en-v1.5-neuronx")

inputs = tokenizer("Hamilton is considered to be the best musical of human history.", return_tensors="pt")
emb = emb_model(**inputs)

# The outputs contain "token_embeddings" and "sentence_embedding"
```
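
The `sentence_embedding` output can then be used for downstream tasks such as semantic similarity. Below is a minimal sketch that reuses `tokenizer` and `emb_model` from the snippet above and assumes the forward pass returns the `sentence_embedding` field noted in the comment; the `encode` helper is only for illustration, and since the model was compiled with `batch_size=1`, each sentence is encoded in a separate forward pass:

```python
import torch

def encode(text: str) -> torch.Tensor:
    # Encode a single sentence and return its pooled sentence embedding.
    inputs = tokenizer(text, return_tensors="pt")
    return emb_model(**inputs).sentence_embedding

query = encode("What is the best musical ever written?")
doc = encode("Hamilton is considered to be the best musical of human history.")

# Cosine similarity between the two sentence embeddings.
score = torch.nn.functional.cosine_similarity(query, doc)
print(score)
```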