dragonkue committed on
Commit
4356376
•
1 Parent(s): 8ae8e25

Create README.md

  1. README.md +200 -0
README.md ADDED
@@ -0,0 +1,200 @@
+ ---
+ license: apache-2.0
+ language:
+ - ko
+ - en
+ metrics:
+ - accuracy
+ base_model:
+ - BAAI/bge-reranker-v2-m3
+ pipeline_tag: text-classification
+ library_name: sentence-transformers
+ ---
+
+ # Reranker (Cross-Encoder)
+
+ Unlike an embedding model, a reranker takes a query and a document together as input and directly outputs a relevance score instead of an embedding. To get a relevance score, pass a query and a passage to the reranker. The raw score is a logit, which can be mapped to a float value in [0, 1] with the sigmoid function.
+
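As a minimal sketch of the sigmoid mapping described above, the snippet below converts raw logits into scores between 0 and 1 using only the standard library. The logit values are made up for illustration and are not outputs of this model.

```python
import math


def sigmoid(logit: float) -> float:
    """Map an unbounded relevance logit to a score between 0 and 1."""
    # Branch on the sign so math.exp never receives a large positive argument.
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    z = math.exp(logit)
    return z / (1.0 + z)


# Hypothetical raw logits for three query-passage pairs (illustrative values only).
raw_logits = [5.2, 0.0, -4.7]
scores = [sigmoid(x) for x in raw_logits]

for logit, score in zip(raw_logits, scores):
    print(f"logit={logit:+.1f} -> score={score:.4f}")
```

Because the sigmoid is monotonic, ranking passages by score gives the same order as ranking them by raw logit. The mapping only changes the scale.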
+ ## Model Details
+ - Base model: BAAI/bge-reranker-v2-m3
+ - The multilingual model has been optimized for Korean.
+
+ ## Usage with Transformers
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForSequenceClassification
+ import torch
+
+ model = AutoModelForSequenceClassification.from_pretrained('dragonkue/bge-reranker-v2-m3-ko')
+ tokenizer = AutoTokenizer.from_pretrained('dragonkue/bge-reranker-v2-m3-ko')
+
+ # Each item is a [query, passage] pair. The query asks in what year a local
+ # non-tax revenue act took effect: the first passage answers it (August 7, 2014),
+ # the second is unrelated (approval of a COVID-19 vaccine trial plan).
+ pairs = [
+     ['몇 년도에 지방세외수입법이 시행됐을까?', '실무교육을 통해 ‘지방세외수입법’에 대한 자치단체의 관심을 제고하고 자치단체의 차질 없는 업무 추진을 지원하였다. 이러한 준비과정을 거쳐 2014년 8월 7일부터 ‘지방세외수입법’이 시행되었다.'],
+     ['몇 년도에 지방세외수입법이 시행됐을까?', '식품의약품안전처는 21일 국내 제약기업 유바이오로직스가 개발 중인 신종 코로나바이러스 감염증(코로나19) 백신 후보물질 ‘유코백-19’의 임상시험 계획을 지난 20일 승인했다고 밝혔다.'],
+ ]
+ features = tokenizer(pairs, padding=True, truncation=True, return_tensors="pt")
+
+ model.eval()
+ with torch.no_grad():
+     logits = model(**features).logits
+     scores = torch.sigmoid(logits)
+     print(scores)
+ ```
+
+
+ ## Usage with SentenceTransformers
+ First install the Sentence Transformers library:
+ ```
+ pip install -U sentence-transformers
+ ```
+
+ ```python
+ import torch
+ from sentence_transformers import CrossEncoder
+
+ model = CrossEncoder('dragonkue/bge-reranker-v2-m3-ko', default_activation_function=torch.nn.Sigmoid())
+
+ # predict() takes a list of [query, passage] pairs and returns one score per pair.
+ scores = model.predict([
+     ['몇 년도에 지방세외수입법이 시행됐을까?', '실무교육을 통해 ‘지방세외수입법’에 대한 자치단체의 관심을 제고하고 자치단체의 차질 없는 업무 추진을 지원하였다. 이러한 준비과정을 거쳐 2014년 8월 7일부터 ‘지방세외수입법’이 시행되었다.'],
+     ['몇 년도에 지방세외수입법이 시행됐을까?', '식품의약품안전처는 21일 국내 제약기업 유바이오로직스가 개발 중인 신종 코로나바이러스 감염증(코로나19) 백신 후보물질 ‘유코백-19’의 임상시험 계획을 지난 20일 승인했다고 밝혔다.'],
+ ])
+ print(scores)
+ ```
+
+ ## Usage with FlagEmbedding
+ First install the FlagEmbedding library:
+ ```
+ pip install -U FlagEmbedding
+ ```
+ ```python
+ from FlagEmbedding import FlagReranker
+
+ reranker = FlagReranker('dragonkue/bge-reranker-v2-m3-ko')
+
+ # normalize=True applies a sigmoid so that the scores fall in [0, 1].
+ scores = reranker.compute_score([
+     ['몇 년도에 지방세외수입법이 시행됐을까?', '실무교육을 통해 ‘지방세외수입법’에 대한 자치단체의 관심을 제고하고 자치단체의 차질 없는 업무 추진을 지원하였다. 이러한 준비과정을 거쳐 2014년 8월 7일부터 ‘지방세외수입법’이 시행되었다.'],
+     ['몇 년도에 지방세외수입법이 시행됐을까?', '식품의약품안전처는 21일 국내 제약기업 유바이오로직스가 개발 중인 신종 코로나바이러스 감염증(코로나19) 백신 후보물질 ‘유코백-19’의 임상시험 계획을 지난 20일 승인했다고 밝혔다.'],
+ ], normalize=True)
+ print(scores)
+ ```
+
+ ## Fine-tune
+ For fine-tuning, refer to the FlagEmbedding repository: https://github.com/FlagOpen/FlagEmbedding
+
+
+ ## Evaluation
+
+ ### Metrics
+ - nDCG, MRR, and mAP are rank-aware metrics, while accuracy, precision, and recall are not. (Example: for top-10 retrieval, rank-aware metrics give different scores depending on whether the correct document is ranked 1st or 10th, whereas accuracy, precision, and recall are the same as long as the document appears anywhere in the top 10.)
+
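The difference between rank-aware and rank-agnostic metrics can be sketched with a small example. The helper functions and document IDs below are hypothetical and written only for illustration. They are not the AutoRAG implementations used for the tables in this README.

```python
def reciprocal_rank(ranked_ids, relevant_id):
    """Return 1/rank of the relevant document, or 0.0 if it was not retrieved."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id == relevant_id:
            return 1.0 / rank
    return 0.0


def recall_at_k(ranked_ids, relevant_id, k):
    """Return 1.0 if the relevant document is anywhere in the top k, else 0.0."""
    return 1.0 if relevant_id in ranked_ids[:k] else 0.0


# Two hypothetical top-10 result lists for the same query; "d7" is the only relevant document.
hit_first = ["d7"] + [f"x{i}" for i in range(9)]
hit_last = [f"x{i}" for i in range(9)] + ["d7"]

# Recall@10 cannot tell the two rankings apart.
print(recall_at_k(hit_first, "d7", 10), recall_at_k(hit_last, "d7", 10))

# Reciprocal rank rewards placing the relevant document first.
print(reciprocal_rank(hit_first, "d7"), reciprocal_rank(hit_last, "d7"))
```

Averaging the reciprocal rank over all queries gives MRR, which is why a reranker that moves the correct passage from 10th to 1st place improves MRR even when recall at 10 stays the same.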
+
+
+ ### Bi-encoder and Cross-encoder
+
+ Bi-encoders convert each text into a fixed-size vector independently and compute similarity between the vectors efficiently. They are fast, which makes them well suited to tasks such as semantic search and classification over large datasets.
+
+ Cross-encoders take a pair of texts as a single input and compute a similarity score for the pair directly, which generally gives more accurate results. They are slower because every pair must be processed separately, but they excel at re-ranking a small set of top results and play an important role in advanced RAG pipelines, where better-ranked context improves text generation.
+
+
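The usual way to combine the two is a retrieve-then-rerank pipeline. The sketch below shows only the control flow, under loud assumptions: `embed` and `pair_score` are toy stand-ins (word counts and token overlap) for a real bi-encoder and cross-encoder, and all names and texts are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Stand-in for a bi-encoder: each text is encoded on its own (here, as word counts)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def pair_score(query, doc):
    """Stand-in for a cross-encoder: the score is computed from the (query, doc) pair jointly."""
    q_tokens = query.lower().split()
    d_tokens = set(doc.lower().split())
    # Fraction of query tokens that also occur in the document.
    return sum(t in d_tokens for t in q_tokens) / len(q_tokens)


def retrieve_then_rerank(query, docs, doc_vectors, top_k):
    q_vec = embed(query)
    # Stage 1: cheap similarity against precomputed document vectors.
    order = sorted(range(len(docs)), key=lambda i: cosine(q_vec, doc_vectors[i]), reverse=True)
    candidates = order[:top_k]
    # Stage 2: score only the top_k candidates, one (query, doc) pair at a time.
    return sorted(candidates, key=lambda i: pair_score(query, docs[i]), reverse=True)


docs = [
    "the local tax law took effect in 2014",
    "the vaccine trial plan was approved on the 20th",
    "a training course on the law was held",
]
doc_vectors = [embed(d) for d in docs]  # computed once, reused for every query

ranked = retrieve_then_rerank("when did the law take effect", docs, doc_vectors, top_k=2)
print([docs[i] for i in ranked])
```

The design point is that document vectors are computed once and reused, while the pairwise scorer runs once per candidate at query time. Keeping `top_k` small is what makes the slower second stage affordable.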
+ ### Korean Embedding Benchmark with AutoRAG
+ (https://github.com/Marker-Inc-Korea/AutoRAG-example-korean-embedding-benchmark)
+
+ This is a Korean embedding benchmark for the financial sector.
+
+
+ **Top-k 1**
+
+ Bi-Encoder (Sentence Transformer)
+
+ | Model name | F1 | Recall | Precision | mAP | mRR |
+ |---------------------------------------|------------|------------|------------|------------|------------|
+ | paraphrase-multilingual-mpnet-base-v2 | 0.3596 | 0.3596 | 0.3596 | 0.3596 | 0.3596 |
+ | KoSimCSE-roberta | 0.4298 | 0.4298 | 0.4298 | 0.4298 | 0.4298 |
+ | Cohere embed-multilingual-v3.0 | 0.3596 | 0.3596 | 0.3596 | 0.3596 | 0.3596 |
+ | openai ada 002 | 0.4737 | 0.4737 | 0.4737 | 0.4737 | 0.4737 |
+ | multilingual-e5-large-instruct | 0.4649 | 0.4649 | 0.4649 | 0.4649 | 0.4649 |
+ | Upstage Embedding | 0.6579 | 0.6579 | 0.6579 | 0.6579 | 0.6579 |
+ | paraphrase-multilingual-MiniLM-L12-v2 | 0.2982 | 0.2982 | 0.2982 | 0.2982 | 0.2982 |
+ | openai_embed_3_small | 0.5439 | 0.5439 | 0.5439 | 0.5439 | 0.5439 |
+ | ko-sroberta-multitask | 0.4211 | 0.4211 | 0.4211 | 0.4211 | 0.4211 |
+ | openai_embed_3_large | 0.6053 | 0.6053 | 0.6053 | 0.6053 | 0.6053 |
+ | KU-HIAI-ONTHEIT-large-v1 | 0.7105 | 0.7105 | 0.7105 | 0.7105 | 0.7105 |
+ | KU-HIAI-ONTHEIT-large-v1.1 | 0.7193 | 0.7193 | 0.7193 | 0.7193 | 0.7193 |
+ | kf-deberta-multitask | 0.4561 | 0.4561 | 0.4561 | 0.4561 | 0.4561 |
+ | gte-multilingual-base | 0.5877 | 0.5877 | 0.5877 | 0.5877 | 0.5877 |
+ | BGE-m3 | 0.6578 | 0.6578 | 0.6578 | 0.6578 | 0.6578 |
+ | bge-m3-korean | 0.5351 | 0.5351 | 0.5351 | 0.5351 | 0.5351 |
+ | **BGE-m3-ko** | **0.7456** | **0.7456** | **0.7456** | **0.7456** | **0.7456** |
+
+
+ Cross-Encoder (Reranker)
+
+ | Model name | F1 | Recall | Precision | mAP | mRR |
+ |---------------------------------------|------------|------------|------------|------------|------------|
+ | jinaai/jina-reranker-v2-base-multilingual | 0.8070 | 0.8070 | 0.8070 | 0.8070 | 0.8070 |
+ | Alibaba-NLP/gte-multilingual-reranker-base | 0.7281 | 0.7281 | 0.7281 | 0.7281 | 0.7281 |
+ | BAAI/bge-reranker-v2-m3 | 0.8772 | 0.8772 | 0.8772 | 0.8772 | 0.8772 |
+ | **bge-reranker-v2-m3-ko** | **0.9123** | **0.9123** | **0.9123** | **0.9123** | **0.9123** |
+
+
+ **Top-k 3**
+
+ Bi-Encoder (Sentence Transformer)
+
+ | Model name | F1 | Recall | Precision | mAP | mRR |
+ |---------------------------------------|------------|------------|------------|------------|------------|
+ | paraphrase-multilingual-mpnet-base-v2 | 0.2368 | 0.4737 | 0.1579 | 0.2032 | 0.2032 |
+ | KoSimCSE-roberta | 0.3026 | 0.6053 | 0.2018 | 0.2661 | 0.2661 |
+ | Cohere embed-multilingual-v3.0 | 0.2851 | 0.5702 | 0.1901 | 0.2515 | 0.2515 |
+ | openai ada 002 | 0.3553 | 0.7105 | 0.2368 | 0.3202 | 0.3202 |
+ | multilingual-e5-large-instruct | 0.3333 | 0.6667 | 0.2222 | 0.2909 | 0.2909 |
+ | Upstage Embedding | 0.4211 | 0.8421 | 0.2807 | **0.3509** | **0.3509** |
+ | paraphrase-multilingual-MiniLM-L12-v2 | 0.2061 | 0.4123 | 0.1374 | 0.1740 | 0.1740 |
+ | openai_embed_3_small | 0.3640 | 0.7281 | 0.2427 | 0.3026 | 0.3026 |
+ | ko-sroberta-multitask | 0.2939 | 0.5877 | 0.1959 | 0.2500 | 0.2500 |
+ | openai_embed_3_large | 0.3947 | 0.7895 | 0.2632 | 0.3348 | 0.3348 |
+ | KU-HIAI-ONTHEIT-large-v1 | 0.4386 | 0.8772 | 0.2924 | 0.3421 | 0.3421 |
+ | KU-HIAI-ONTHEIT-large-v1.1 | 0.4430 | 0.8860 | 0.2953 | 0.3406 | 0.3406 |
+ | kf-deberta-multitask | 0.3158 | 0.6316 | 0.2105 | 0.2792 | 0.2792 |
+ | gte-multilingual-base | 0.4035 | 0.8070 | 0.2690 | 0.3450 | 0.3450 |
+ | BGE-m3 | 0.4254 | 0.8508 | 0.2836 | 0.3421 | 0.3421 |
+ | bge-m3-korean | 0.3684 | 0.7368 | 0.2456 | 0.3143 | 0.3143 |
+ | **BGE-m3-ko** | **0.4517** | **0.9035** | **0.3011** | 0.3494 | 0.3494 |
+
+ Cross-Encoder (Reranker)
+
+ | Model name | F1 | Recall | Precision | mAP | mRR |
+ |---------------------------------------|------------|------------|------------|------------|------------|
+ | jinaai/jina-reranker-v2-base-multilingual | 0.4649 | 0.9298 | 0.3099 | 0.8626 | 0.8626 |
+ | Alibaba-NLP/gte-multilingual-reranker-base | 0.4605 | 0.9211 | 0.3070 | 0.8173 | 0.8173 |
+ | BAAI/bge-reranker-v2-m3 | 0.4781 | 0.9561 | 0.3187 | 0.9167 | 0.9167 |
+ | **bge-reranker-v2-m3-ko** | **0.4825** | **0.9649** | **0.3216** | **0.9371** | **0.9371** |
+
+
+ **Top-k 5**
+
+ Bi-Encoder (Sentence Transformer)
+
+ | Model name | F1 | Recall | Precision | mAP | mRR |
+ |---------------------------------------|------------|------------|------------|------------|------------|
+ | paraphrase-multilingual-mpnet-base-v2 | 0.1813 | 0.5439 | 0.1088 | 0.1575 | 0.1575 |
+ | KoSimCSE-roberta | 0.2164 | 0.6491 | 0.1298 | 0.1751 | 0.1751 |
+ | Cohere embed-multilingual-v3.0 | 0.2076 | 0.6228 | 0.1246 | 0.1640 | 0.1640 |
+ | openai ada 002 | 0.2602 | 0.7807 | 0.1561 | 0.2139 | 0.2139 |
+ | multilingual-e5-large-instruct | 0.2544 | 0.7632 | 0.1526 | 0.2194 | 0.2194 |
+ | Upstage Embedding | 0.2982 | 0.8947 | 0.1789 | **0.2237** | **0.2237** |
+ | paraphrase-multilingual-MiniLM-L12-v2 | 0.1637 | 0.4912 | 0.0982 | 0.1437 | 0.1437 |
+ | openai_embed_3_small | 0.2690 | 0.8070 | 0.1614 | 0.2148 | 0.2148 |
+ | ko-sroberta-multitask | 0.2164 | 0.6491 | 0.1298 | 0.1697 | 0.1697 |
+ | openai_embed_3_large | 0.2807 | 0.8421 | 0.1684 | 0.2088 | 0.2088 |
+ | KU-HIAI-ONTHEIT-large-v1 | 0.3041 | 0.9123 | 0.1825 | 0.2137 | 0.2137 |
+ | KU-HIAI-ONTHEIT-large-v1.1 | **0.3099** | **0.9298** | **0.1860** | 0.2148 | 0.2148 |
+ | kf-deberta-multitask | 0.2281 | 0.6842 | 0.1368 | 0.1724 | 0.1724 |
+ | gte-multilingual-base | 0.2865 | 0.8596 | 0.1719 | 0.2096 | 0.2096 |
+ | BGE-m3 | 0.3041 | 0.9123 | 0.1825 | 0.2193 | 0.2193 |
+ | bge-m3-korean | 0.2661 | 0.7982 | 0.1596 | 0.2116 | 0.2116 |
+ | **BGE-m3-ko** | **0.3099** | **0.9298** | **0.1860** | 0.2098 | 0.2098 |
+
+ Cross-Encoder (Reranker)
+
+ | Model name | F1 | Recall | Precision | mAP | mRR |
+ |---------------------------------------|------------|------------|------------|------------|------------|
+ | jinaai/jina-reranker-v2-base-multilingual | 0.3129 | 0.9386 | 0.1877 | 0.8643 | 0.8643 |
+ | Alibaba-NLP/gte-multilingual-reranker-base | 0.3158 | 0.9474 | 0.1895 | 0.8234 | 0.8234 |
+ | BAAI/bge-reranker-v2-m3 | **0.3216** | **0.9649** | **0.1930** | 0.9189 | 0.9189 |
+ | **bge-reranker-v2-m3-ko** | **0.3216** | **0.9649** | **0.1930** | **0.9371** | **0.9371** |
+
+
+