---
library_name: sentence-transformers
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
license: mit
language:
- en
- ru
---
# Multilingual E5 WB
A fine-tuned version of [multilingual-e5-base](https://huggingface.co/intfloat/multilingual-e5-base) for the WB DS School and a RAG project.
## Evaluation Results
Since the model is used as a retriever, **the goal was to improve cosine similarity between questions and their answers.**
On the given dataset of QA pairs, the model's score on `EmbeddingSimilarityEvaluator` improved from **0.62 to 0.78.**
## Fine Tuning
**DataLoader**:
`torch.utils.data.dataloader.DataLoader` of length 790 with parameters:
```
{'batch_size': 12, 'sampler': 'torch.utils.data.sampler.SequentialSampler', 'batch_sampler': 'torch.utils.data.sampler.BatchSampler'}
```
**Loss**:
`sentence_transformers.losses.ContrastiveLoss.ContrastiveLoss` with parameters:
```
{'distance_metric': 'SiameseDistanceMetric.COSINE_DISTANCE', 'margin': 0.5, 'size_average': True}
```
**Parameters of the `fit()` method**:
```
{
"epochs": 10,
"evaluation_steps": 100,
"evaluator": "sentence_transformers.evaluation.EmbeddingSimilarityEvaluator.EmbeddingSimilarityEvaluator",
"optimizer_class": "<class 'torch.optim.adamw.AdamW'>",
"optimizer_params": {
"lr": 2e-05
},
"scheduler": "WarmupLinear",
"steps_per_epoch": null,
"warmup_steps": 790,
"weight_decay": 0.01
}
```