---
license: apache-2.0
language:
- ko
- en
metrics:
- accuracy
base_model:
- BAAI/bge-reranker-v2-m3
pipeline_tag: text-classification
library_name: sentence-transformers
---
# Reranker (Cross-Encoder)
Unlike an embedding model, a reranker takes a query and a passage as input and directly outputs a similarity score instead of an embedding. The raw score is unbounded, but it can be mapped to a value in [0, 1] with the sigmoid function sigmoid(x) = 1 / (1 + e^(-x)).
## Model Details
- Base model: BAAI/bge-reranker-v2-m3
- A multilingual reranker that has been further optimized for Korean.
## Usage with Transformers
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
# Load the fine-tuned Korean reranker and its tokenizer
model = AutoModelForSequenceClassification.from_pretrained('dragonkue/bge-reranker-v2-m3-ko')
tokenizer = AutoTokenizer.from_pretrained('dragonkue/bge-reranker-v2-m3-ko')
# Tokenize (query, passage) pairs; the first passage answers the query, the second is unrelated
features = tokenizer([['λͺ‡ 년도에 μ§€λ°©μ„Έμ™Έμˆ˜μž…λ²•μ΄ μ‹œν–‰λμ„κΉŒ?', 'μ‹€λ¬΄κ΅μœ‘μ„ 톡해 β€˜μ§€λ°©μ„Έμ™Έμˆ˜μž…λ²•β€™μ— λŒ€ν•œ μžμΉ˜λ‹¨μ²΄μ˜ 관심을 μ œκ³ ν•˜κ³  μžμΉ˜λ‹¨μ²΄μ˜ 차질 μ—†λŠ” 업무 좔진을 μ§€μ›ν•˜μ˜€λ‹€. μ΄λŸ¬ν•œ 쀀비과정을 거쳐 2014λ…„ 8μ›” 7일뢀터 β€˜μ§€λ°©μ„Έμ™Έμˆ˜μž…λ²•β€™μ΄ μ‹œν–‰λ˜μ—ˆλ‹€.'],
['λͺ‡ 년도에 μ§€λ°©μ„Έμ™Έμˆ˜μž…λ²•μ΄ μ‹œν–‰λμ„κΉŒ?', 'μ‹ν’ˆμ˜μ•½ν’ˆμ•ˆμ „μ²˜λŠ” 21일 κ΅­λ‚΄ μ œμ•½κΈ°μ—… μœ λ°”μ΄μ˜€λ‘œμ§μŠ€κ°€ 개발 쀑인 μ‹ μ’… μ½”λ‘œλ‚˜λ°”μ΄λŸ¬μŠ€ 감염증(μ½”λ‘œλ‚˜19) λ°±μ‹  ν›„λ³΄λ¬Όμ§ˆ β€˜μœ μ½”λ°±-19β€™μ˜ μž„μƒμ‹œν—˜ κ³„νšμ„ μ§€λ‚œ 20일 μŠΉμΈν–ˆλ‹€κ³  λ°ν˜”λ‹€.']], padding=True, truncation=True, return_tensors="pt")
model.eval()
# Score each pair; sigmoid maps the raw logits to relevance scores in [0, 1]
with torch.no_grad():
    logits = model(**features).logits
    scores = torch.sigmoid(logits)
print(scores)
# [9.9997962e-01 5.0702977e-07]
```
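If a GPU is available, you would typically move the model and the tokenized batch onto it first. A minimal sketch extending the example above (the device handling is an addition, not part of the original snippet):
```python
# Optional: run the same scoring on GPU when available
device = 'cuda' if torch.cuda.is_available() else 'cpu'
model.to(device)
features = {k: v.to(device) for k, v in features.items()}
with torch.no_grad():
    scores = torch.sigmoid(model(**features).logits).squeeze(-1)
print(scores.tolist())  # one relevance score per (query, passage) pair
```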
## Usage with SentenceTransformers
First install the Sentence Transformers library:
```
pip install -U sentence-transformers
```
```python
import torch
from sentence_transformers import CrossEncoder

# The sigmoid activation makes predict() return scores in [0, 1]
model = CrossEncoder('dragonkue/bge-reranker-v2-m3-ko', default_activation_function=torch.nn.Sigmoid())
scores = model.predict([['λͺ‡ 년도에 μ§€λ°©μ„Έμ™Έμˆ˜μž…λ²•μ΄ μ‹œν–‰λμ„κΉŒ?', 'μ‹€λ¬΄κ΅μœ‘μ„ 톡해 β€˜μ§€λ°©μ„Έμ™Έμˆ˜μž…λ²•β€™μ— λŒ€ν•œ μžμΉ˜λ‹¨μ²΄μ˜ 관심을 μ œκ³ ν•˜κ³  μžμΉ˜λ‹¨μ²΄μ˜ 차질 μ—†λŠ” 업무 좔진을 μ§€μ›ν•˜μ˜€λ‹€. μ΄λŸ¬ν•œ 쀀비과정을 거쳐 2014λ…„ 8μ›” 7일뢀터 β€˜μ§€λ°©μ„Έμ™Έμˆ˜μž…λ²•β€™μ΄ μ‹œν–‰λ˜μ—ˆλ‹€.'],
['λͺ‡ 년도에 μ§€λ°©μ„Έμ™Έμˆ˜μž…λ²•μ΄ μ‹œν–‰λμ„κΉŒ?', 'μ‹ν’ˆμ˜μ•½ν’ˆμ•ˆμ „μ²˜λŠ” 21일 κ΅­λ‚΄ μ œμ•½κΈ°μ—… μœ λ°”μ΄μ˜€λ‘œμ§μŠ€κ°€ 개발 쀑인 μ‹ μ’… μ½”λ‘œλ‚˜λ°”μ΄λŸ¬μŠ€ 감염증(μ½”λ‘œλ‚˜19) λ°±μ‹  ν›„λ³΄λ¬Όμ§ˆ β€˜μœ μ½”λ°±-19β€™μ˜ μž„μƒμ‹œν—˜ κ³„νšμ„ μ§€λ‚œ 20일 μŠΉμΈν–ˆλ‹€κ³  λ°ν˜”λ‹€.']])
print(scores)
# [9.9997962e-01 5.0702977e-07]
```
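Recent sentence-transformers releases also provide `CrossEncoder.rank`, which scores and sorts a list of candidate passages for a single query in one call. A minimal sketch reusing the model above (the passage list is a placeholder):
```python
query = 'λͺ‡ 년도에 μ§€λ°©μ„Έμ™Έμˆ˜μž…λ²•μ΄ μ‹œν–‰λμ„κΉŒ?'
passages = ['candidate passage 1', 'candidate passage 2']  # placeholders

# Returns the candidates sorted by score; return_documents=True includes the text
for hit in model.rank(query, passages, return_documents=True):
    print(f"{hit['score']:.4f}  {hit['text']}")
```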
## Usage with FlagEmbedding
First install the FlagEmbedding library:
```
pip install -U FlagEmbedding
```
```python
from FlagEmbedding import FlagReranker
# normalize=True below applies a sigmoid so scores fall in [0, 1]
reranker = FlagReranker('dragonkue/bge-reranker-v2-m3-ko')
scores = reranker.compute_score([['λͺ‡ 년도에 μ§€λ°©μ„Έμ™Έμˆ˜μž…λ²•μ΄ μ‹œν–‰λμ„κΉŒ?', 'μ‹€λ¬΄κ΅μœ‘μ„ 톡해 β€˜μ§€λ°©μ„Έμ™Έμˆ˜μž…λ²•β€™μ— λŒ€ν•œ μžμΉ˜λ‹¨μ²΄μ˜ 관심을 μ œκ³ ν•˜κ³  μžμΉ˜λ‹¨μ²΄μ˜ 차질 μ—†λŠ” 업무 좔진을 μ§€μ›ν•˜μ˜€λ‹€. μ΄λŸ¬ν•œ 쀀비과정을 거쳐 2014λ…„ 8μ›” 7일뢀터 β€˜μ§€λ°©μ„Έμ™Έμˆ˜μž…λ²•β€™μ΄ μ‹œν–‰λ˜μ—ˆλ‹€.'],
['λͺ‡ 년도에 μ§€λ°©μ„Έμ™Έμˆ˜μž…λ²•μ΄ μ‹œν–‰λμ„κΉŒ?', 'μ‹ν’ˆμ˜μ•½ν’ˆμ•ˆμ „μ²˜λŠ” 21일 κ΅­λ‚΄ μ œμ•½κΈ°μ—… μœ λ°”μ΄μ˜€λ‘œμ§μŠ€κ°€ 개발 쀑인 μ‹ μ’… μ½”λ‘œλ‚˜λ°”μ΄λŸ¬μŠ€ 감염증(μ½”λ‘œλ‚˜19) λ°±μ‹  ν›„λ³΄λ¬Όμ§ˆ β€˜μœ μ½”λ°±-19β€™μ˜ μž„μƒμ‹œν—˜ κ³„νšμ„ μ§€λ‚œ 20일 μŠΉμΈν–ˆλ‹€κ³  λ°ν˜”λ‹€.']], normalize=True)
print(scores)
# [9.9997962e-01 5.0702977e-07]
```
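FlagReranker also accepts a `use_fp16` flag; per the FlagEmbedding documentation, it speeds up computation with a slight performance degradation:
```python
# Optional: fp16 inference (faster on GPU, slightly lower accuracy)
reranker = FlagReranker('dragonkue/bge-reranker-v2-m3-ko', use_fp16=True)
```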
## Fine-tune
Refer to https://github.com/FlagOpen/FlagEmbedding for fine-tuning scripts and instructions.
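For reference, FlagEmbedding's reranker fine-tuning expects JSON-lines training data, one example per line with a query, positive passages, and hard negatives. A minimal sketch of the documented format (the file name and texts are placeholders):
```python
import json

# One training example in FlagEmbedding's documented JSONL format:
# {"query": str, "pos": [str, ...], "neg": [str, ...]}
example = {
    "query": "λͺ‡ 년도에 μ§€λ°©μ„Έμ™Έμˆ˜μž…λ²•μ΄ μ‹œν–‰λμ„κΉŒ?",
    "pos": ["passage that answers the query"],
    "neg": ["unrelated passage", "hard negative passage"],
}
with open("train_data.jsonl", "w", encoding="utf-8") as f:
    f.write(json.dumps(example, ensure_ascii=False) + "\n")
```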
## Evaluation
### Bi-Encoder and Cross-Encoder
Bi-encoders encode each text into a fixed-size vector independently, so similarities between texts can be computed efficiently. They are fast and well suited to tasks such as semantic search and classification over large datasets.
Cross-encoders jointly process each pair of texts and output a similarity score directly, which yields more accurate results. They are slower, since every pair must pass through the model, but they excel at re-ranking the top results of a first-stage retriever and play an important role in advanced RAG pipelines for improving the quality of generated text.
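A common pattern is therefore retrieve-then-rerank: a fast bi-encoder narrows a large corpus down to a handful of candidates, and the cross-encoder re-orders them. A minimal sketch, assuming a sentence-transformers bi-encoder (the retrieval model, corpus, and top-k value below are illustrative placeholders):
```python
import torch
from sentence_transformers import CrossEncoder, SentenceTransformer

# Stage 1: a bi-encoder retrieves candidates from the corpus
# (the retrieval model here is illustrative; any bi-encoder works)
bi_encoder = SentenceTransformer('BAAI/bge-m3')
corpus = ['passage 1 ...', 'passage 2 ...', 'passage 3 ...']  # placeholder passages
corpus_emb = bi_encoder.encode(corpus, convert_to_tensor=True, normalize_embeddings=True)

query = 'λͺ‡ 년도에 μ§€λ°©μ„Έμ™Έμˆ˜μž…λ²•μ΄ μ‹œν–‰λμ„κΉŒ?'
query_emb = bi_encoder.encode(query, convert_to_tensor=True, normalize_embeddings=True)
top = torch.topk(query_emb @ corpus_emb.T, k=min(2, len(corpus)))  # cosine similarity

# Stage 2: the cross-encoder re-scores only the retrieved candidates
reranker = CrossEncoder('dragonkue/bge-reranker-v2-m3-ko',
                        default_activation_function=torch.nn.Sigmoid())
candidates = [corpus[i] for i in top.indices.tolist()]
scores = reranker.predict([(query, p) for p in candidates])
for p, s in sorted(zip(candidates, scores), key=lambda x: -x[1]):
    print(f'{s:.4f}  {p}')
```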
### Korean Embedding Benchmark with AutoRAG
(https://github.com/Marker-Inc-Korea/AutoRAG-example-korean-embedding-benchmark)
This is a Korean retrieval benchmark for the financial domain. Each query appears to have a single gold passage, which is why F1, recall, and precision coincide at top-k 1 and recall equals k Γ— precision at larger k.
**Top-k 1**

**Bi-Encoder (Sentence Transformer)**

| Model name | F1 | Recall | Precision |
|---------------------------------------|------------|------------|------------|
| paraphrase-multilingual-mpnet-base-v2 | 0.3596 | 0.3596 | 0.3596 |
| KoSimCSE-roberta | 0.4298 | 0.4298 | 0.4298 |
| Cohere embed-multilingual-v3.0 | 0.3596 | 0.3596 | 0.3596 |
| openai ada 002 | 0.4737 | 0.4737 | 0.4737 |
| multilingual-e5-large-instruct | 0.4649 | 0.4649 | 0.4649 |
| Upstage Embedding | 0.6579 | 0.6579 | 0.6579 |
| paraphrase-multilingual-MiniLM-L12-v2 | 0.2982 | 0.2982 | 0.2982 |
| openai_embed_3_small | 0.5439 | 0.5439 | 0.5439 |
| ko-sroberta-multitask | 0.4211 | 0.4211 | 0.4211 |
| openai_embed_3_large | 0.6053 | 0.6053 | 0.6053 |
| KU-HIAI-ONTHEIT-large-v1 | 0.7105 | 0.7105 | 0.7105 |
| KU-HIAI-ONTHEIT-large-v1.1 | 0.7193 | 0.7193 | 0.7193 |
| kf-deberta-multitask | 0.4561 | 0.4561 | 0.4561 |
| gte-multilingual-base | 0.5877 | 0.5877 | 0.5877 |
| BGE-m3 | 0.6578 | 0.6578 | 0.6578 |
| bge-m3-korean | 0.5351 | 0.5351 | 0.5351 |
| **BGE-m3-ko** | **0.7456** | **0.7456** | **0.7456** |

**Cross-Encoder (Reranker)**

| Model name | F1 | Recall | Precision |
|---------------------------------------|------------|------------|------------|
| gte-multilingual-reranker-base | 0.7281 | 0.7281 | 0.7281 |
| jina-reranker-v2-base-multilingual | 0.8070 | 0.8070 | 0.8070 |
| bge-reranker-v2-m3 | 0.8772 | 0.8772 | 0.8772 |
| **bge-reranker-v2-m3-ko** | **0.9123** | **0.9123** | **0.9123** |

**Top-k 3**

**Bi-Encoder (Sentence Transformer)**

| Model name | F1 | Recall | Precision |
|---------------------------------------|------------|------------|------------|
| paraphrase-multilingual-mpnet-base-v2 | 0.2368 | 0.4737 | 0.1579 |
| KoSimCSE-roberta | 0.3026 | 0.6053 | 0.2018 |
| Cohere embed-multilingual-v3.0 | 0.2851 | 0.5702 | 0.1901 |
| openai ada 002 | 0.3553 | 0.7105 | 0.2368 |
| multilingual-e5-large-instruct | 0.3333 | 0.6667 | 0.2222 |
| Upstage Embedding | 0.4211 | 0.8421 | 0.2807 |
| paraphrase-multilingual-MiniLM-L12-v2 | 0.2061 | 0.4123 | 0.1374 |
| openai_embed_3_small | 0.3640 | 0.7281 | 0.2427 |
| ko-sroberta-multitask | 0.2939 | 0.5877 | 0.1959 |
| openai_embed_3_large | 0.3947 | 0.7895 | 0.2632 |
| KU-HIAI-ONTHEIT-large-v1 | 0.4386 | 0.8772 | 0.2924 |
| KU-HIAI-ONTHEIT-large-v1.1 | 0.4430 | 0.8860 | 0.2953 |
| kf-deberta-multitask | 0.3158 | 0.6316 | 0.2105 |
| gte-multilingual-base | 0.4035 | 0.8070 | 0.2690 |
| BGE-m3 | 0.4254 | 0.8508 | 0.2836 |
| bge-m3-korean | 0.3684 | 0.7368 | 0.2456 |
| **BGE-m3-ko** | **0.4517** | **0.9035** | **0.3011** |

**Cross-Encoder (Reranker)**

| Model name | F1 | Recall | Precision |
|---------------------------------------|------------|------------|------------|
| gte-multilingual-reranker-base | 0.4605 | 0.9211 | 0.3070 |
| jina-reranker-v2-base-multilingual | 0.4649 | 0.9298 | 0.3099 |
| bge-reranker-v2-m3 | 0.4781 | 0.9561 | 0.3187 |
| **bge-reranker-v2-m3-ko** | **0.4825** | **0.9649** | **0.3216** |

**Top-k 5**

**Bi-Encoder (Sentence Transformer)**

| Model name | F1 | Recall | Precision |
|---------------------------------------|------------|------------|------------|
| paraphrase-multilingual-mpnet-base-v2 | 0.1813 | 0.5439 | 0.1088 |
| KoSimCSE-roberta | 0.2164 | 0.6491 | 0.1298 |
| Cohere embed-multilingual-v3.0 | 0.2076 | 0.6228 | 0.1246 |
| openai ada 002 | 0.2602 | 0.7807 | 0.1561 |
| multilingual-e5-large-instruct | 0.2544 | 0.7632 | 0.1526 |
| Upstage Embedding | 0.2982 | 0.8947 | 0.1789 |
| paraphrase-multilingual-MiniLM-L12-v2 | 0.1637 | 0.4912 | 0.0982 |
| openai_embed_3_small | 0.2690 | 0.8070 | 0.1614 |
| ko-sroberta-multitask | 0.2164 | 0.6491 | 0.1298 |
| openai_embed_3_large | 0.2807 | 0.8421 | 0.1684 |
| KU-HIAI-ONTHEIT-large-v1 | 0.3041 | 0.9123 | 0.1825 |
| KU-HIAI-ONTHEIT-large-v1.1 | **0.3099** | **0.9298** | **0.1860** |
| kf-deberta-multitask | 0.2281 | 0.6842 | 0.1368 |
| gte-multilingual-base | 0.2865 | 0.8596 | 0.1719 |
| BGE-m3 | 0.3041 | 0.9123 | 0.1825 |
| bge-m3-korean | 0.2661 | 0.7982 | 0.1596 |
| **BGE-m3-ko** | **0.3099** | **0.9298** | **0.1860** |

**Cross-Encoder (Reranker)**

| Model name | F1 | Recall | Precision |
|---------------------------------------|------------|------------|------------|
| gte-multilingual-reranker-base | 0.3158 | 0.9474 | 0.1895 |
| jina-reranker-v2-base-multilingual | 0.3129 | 0.9386 | 0.1877 |
| bge-reranker-v2-m3 | **0.3216** | **0.9649** | **0.1930** |
| **bge-reranker-v2-m3-ko** | **0.3216** | **0.9649** | **0.1930** |