---
license: apache-2.0
---

This model is compiled for neuronx devices (e.g. AWS Inferentia2, as found on `inf2` instances).

The original checkpoint is [`BAAI/bge-base-en-v1.5`](https://huggingface.co/BAAI/bge-base-en-v1.5).

## Export

Below is the command used to export this model:

```bash
optimum-cli export neuron -m BAAI/bge-base-en-v1.5 --sequence_length 384 --batch_size 1 --task feature-extraction bge_emb/
```
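
The export can also be run from Python. Here is a minimal sketch, assuming the `optimum.neuron` API accepts the static input shapes (`batch_size`, `sequence_length`) as keyword arguments when `export=True` is passed:

```python
from optimum.neuron import NeuronModelForSentenceTransformers

# Export the original checkpoint to the Neuron format with the same
# static shapes as the CLI command above (batch_size=1, sequence_length=384).
emb_model = NeuronModelForSentenceTransformers.from_pretrained(
    "BAAI/bge-base-en-v1.5",
    export=True,
    batch_size=1,
    sequence_length=384,
)

# Save the compiled artifacts to the same output directory as above.
emb_model.save_pretrained("bge_emb/")
```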

## Usage

Here is an example of using the compiled artifacts for inference:

```python
from transformers import AutoTokenizer
from optimum.neuron import NeuronModelForSentenceTransformers

# Load the tokenizer and the pre-compiled model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("optimum/bge-base-en-v1.5-neuronx")
emb_model = NeuronModelForSentenceTransformers.from_pretrained("optimum/bge-base-en-v1.5-neuronx")

inputs = tokenizer("Hamilton is considered to be the best musical of human history.", return_tensors="pt")
emb = emb_model(**inputs)

# The outputs contain "token_embeddings" and "sentence_embedding"
```
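
The `sentence_embedding` output can then be used for downstream tasks such as semantic similarity. Below is a minimal sketch that reuses `tokenizer` and `emb_model` from the snippet above and assumes the forward pass returns the `sentence_embedding` field noted in the comment; the `encode` helper is only for illustration, and since the model was compiled with `batch_size=1`, each sentence is encoded in a separate forward pass:

```python
import torch

def encode(text: str) -> torch.Tensor:
    # Encode a single sentence and return its pooled sentence embedding.
    inputs = tokenizer(text, return_tensors="pt")
    return emb_model(**inputs).sentence_embedding

query = encode("What is the best musical ever written?")
doc = encode("Hamilton is considered to be the best musical of human history.")

# Cosine similarity between the two sentence embeddings.
score = torch.nn.functional.cosine_similarity(query, doc)
print(score)
```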