---
license: apache-2.0
language:
- ko
- en
metrics:
- accuracy
base_model:
- BAAI/bge-reranker-v2-m3
pipeline_tag: text-classification
library_name: sentence-transformers
---

# Reranker (Cross-Encoder)

Unlike an embedding model, a reranker takes a question and a document as input and directly outputs a similarity score instead of an embedding.
You obtain a relevance score by passing a query and a passage to the reranker, and the score can be mapped to a float value in [0, 1] with the sigmoid function.

## Model Details
- Base model: BAAI/bge-reranker-v2-m3
- The multilingual model has been optimized for Korean.

## Usage with Transformers

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained('dragonkue/bge-reranker-v2-m3-ko')
tokenizer = AutoTokenizer.from_pretrained('dragonkue/bge-reranker-v2-m3-ko')

# Each pair is [query, passage]. The Korean query asks in which year the
# Local Non-Tax Revenue Act took effect; only the first passage answers it.
pairs = [
    ['몇 년도에 지방세외수입법이 시행됐을까?', '실무교육을 통해 ‘지방세외수입법’에 대한 자치단체의 관심을 제고하고 자치단체의 차질 없는 업무 추진을 지원하였다. 이러한 준비과정을 거쳐 2014년 8월 7일부터 ‘지방세외수입법’이 시행되었다.'],
    ['몇 년도에 지방세외수입법이 시행됐을까?', '식품의약품안전처는 21일 국내 제약기업 유바이오로직스가 개발 중인 신종 코로나바이러스 감염증(코로나19) 백신 후보물질 ‘유코백-19’의 임상시험 계획을 지난 20일 승인했다고 밝혔다.'],
]
features = tokenizer(pairs, padding=True, truncation=True, return_tensors="pt")

model.eval()
with torch.no_grad():
    logits = model(**features).logits
    scores = torch.sigmoid(logits)  # map raw logits to [0, 1]
    print(scores)
```

## Usage with SentenceTransformers

First install the Sentence Transformers library:

```
pip install -U sentence-transformers
```

```python
import torch
from sentence_transformers import CrossEncoder

model = CrossEncoder('dragonkue/bge-reranker-v2-m3-ko', default_activation_function=torch.nn.Sigmoid())

# predict() expects a single list of [query, passage] pairs.
scores = model.predict([
    ['몇 년도에 지방세외수입법이 시행됐을까?', '실무교육을 통해 ‘지방세외수입법’에 대한 자치단체의 관심을 제고하고 자치단체의 차질 없는 업무 추진을 지원하였다. 이러한 준비과정을 거쳐 2014년 8월 7일부터 ‘지방세외수입법’이 시행되었다.'],
    ['몇 년도에 지방세외수입법이 시행됐을까?', '식품의약품안전처는 21일 국내 제약기업 유바이오로직스가 개발 중인 신종 코로나바이러스 감염증(코로나19) 백신 후보물질 ‘유코백-19’의 임상시험 계획을 지난 20일 승인했다고 밝혔다.'],
])
print(scores)
```

## Usage with FlagEmbedding

First install the FlagEmbedding library:

```
pip install -U FlagEmbedding
```

```python
from FlagEmbedding import FlagReranker

reranker = FlagReranker('dragonkue/bge-reranker-v2-m3-ko')

# normalize=True applies a sigmoid so the scores fall in [0, 1].
scores = reranker.compute_score([
    ['몇 년도에 지방세외수입법이 시행됐을까?', '실무교육을 통해 ‘지방세외수입법’에 대한 자치단체의 관심을 제고하고 자치단체의 차질 없는 업무 추진을 지원하였다. 이러한 준비과정을 거쳐 2014년 8월 7일부터 ‘지방세외수입법’이 시행되었다.'],
    ['몇 년도에 지방세외수입법이 시행됐을까?', '식품의약품안전처는 21일 국내 제약기업 유바이오로직스가 개발 중인 신종 코로나바이러스 감염증(코로나19) 백신 후보물질 ‘유코백-19’의 임상시험 계획을 지난 20일 승인했다고 밝혔다.'],
], normalize=True)
print(scores)
```

## Fine-tune

Refer to https://github.com/FlagOpen/FlagEmbedding

## Evaluation

### Metrics

- nDCG, MRR, and mAP are rank-aware metrics, while accuracy, precision, and recall do not consider ranking.
  (Example: when retrieving the top 10, rank-aware metrics give different scores depending on whether the correct document is in 1st place or in 10th place. Accuracy, precision, and recall, however, are the same as long as the document is anywhere in the top 10.)

### Bi-encoder and Cross-encoder

Bi-Encoders convert texts into fixed-size vectors and efficiently calculate similarities between them. They are fast and well suited to tasks like semantic search and classification, which makes them a good fit for processing large datasets quickly.

Cross-Encoders directly compare pairs of texts to compute similarity scores, which gives more accurate results. They are slower because every pair must be processed by the model, but they excel at re-ranking top results and are an important part of Advanced RAG techniques for improving text generation.

### Korean Embedding Benchmark with AutoRAG

(https://github.com/Marker-Inc-Korea/AutoRAG-example-korean-embedding-benchmark)

This is a Korean embedding benchmark for the financial sector.

**Top-k 1**

Bi-Encoder (Sentence Transformer)

| Model name                            | F1         | Recall     | Precision  | mAP        | mRR        |
|---------------------------------------|------------|------------|------------|------------|------------|
| paraphrase-multilingual-mpnet-base-v2 | 0.3596     | 0.3596     | 0.3596     | 0.3596     | 0.3596     |
| KoSimCSE-roberta                      | 0.4298     | 0.4298     | 0.4298     | 0.4298     | 0.4298     |
| Cohere embed-multilingual-v3.0        | 0.3596     | 0.3596     | 0.3596     | 0.3596     | 0.3596     |
| openai ada 002                        | 0.4737     | 0.4737     | 0.4737     | 0.4737     | 0.4737     |
| multilingual-e5-large-instruct        | 0.4649     | 0.4649     | 0.4649     | 0.4649     | 0.4649     |
| Upstage Embedding                     | 0.6579     | 0.6579     | 0.6579     | 0.6579     | 0.6579     |
| paraphrase-multilingual-MiniLM-L12-v2 | 0.2982     | 0.2982     | 0.2982     | 0.2982     | 0.2982     |
| openai_embed_3_small                  | 0.5439     | 0.5439     | 0.5439     | 0.5439     | 0.5439     |
| ko-sroberta-multitask                 | 0.4211     | 0.4211     | 0.4211     | 0.4211     | 0.4211     |
| openai_embed_3_large                  | 0.6053     | 0.6053     | 0.6053     | 0.6053     | 0.6053     |
| KU-HIAI-ONTHEIT-large-v1              | 0.7105     | 0.7105     | 0.7105     | 0.7105     | 0.7105     |
| KU-HIAI-ONTHEIT-large-v1.1            | 0.7193     | 0.7193     | 0.7193     | 0.7193     | 0.7193     |
| kf-deberta-multitask                  | 0.4561     | 0.4561     | 0.4561     | 0.4561     | 0.4561     |
| gte-multilingual-base                 | 0.5877     | 0.5877     | 0.5877     | 0.5877     | 0.5877     |
| BGE-m3                                | 0.6578     | 0.6578     | 0.6578     | 0.6578     | 0.6578     |
| bge-m3-korean                         | 0.5351     | 0.5351     | 0.5351     | 0.5351     | 0.5351     |
| **BGE-m3-ko**                         | **0.7456** | **0.7456** | **0.7456** | **0.7456** | **0.7456** |

Cross-Encoder (Reranker)

| Model name                                 | F1         | Recall     | Precision  | mAP        | mRR        |
|--------------------------------------------|------------|------------|------------|------------|------------|
| jinaai/jina-reranker-v2-base-multilingual  | 0.8070     | 0.8070     | 0.8070     | 0.8070     | 0.8070     |
| Alibaba-NLP/gte-multilingual-reranker-base | 0.7281     | 0.7281     | 0.7281     | 0.7281     | 0.7281     |
| BAAI/bge-reranker-v2-m3                    | 0.8772     | 0.8772     | 0.8772     | 0.8772     | 0.8772     |
| **bge-reranker-v2-m3-ko**                  | **0.9123** | **0.9123** | **0.9123** | **0.9123** | **0.9123** |

**Top-k 3**

Bi-Encoder (Sentence Transformer)

| Model name                            | F1         | Recall     | Precision  | mAP        | mRR        |
|---------------------------------------|------------|------------|------------|------------|------------|
| paraphrase-multilingual-mpnet-base-v2 | 0.2368     | 0.4737     | 0.1579     | 0.2032     | 0.2032     |
| KoSimCSE-roberta                      | 0.3026     | 0.6053     | 0.2018     | 0.2661     | 0.2661     |
| Cohere embed-multilingual-v3.0        | 0.2851     | 0.5702     | 0.1901     | 0.2515     | 0.2515     |
| openai ada 002                        | 0.3553     | 0.7105     | 0.2368     | 0.3202     | 0.3202     |
| multilingual-e5-large-instruct        | 0.3333     | 0.6667     | 0.2222     | 0.2909     | 0.2909     |
| Upstage Embedding                     | 0.4211     | 0.8421     | 0.2807     | **0.3509** | **0.3509** |
| paraphrase-multilingual-MiniLM-L12-v2 | 0.2061     | 0.4123     | 0.1374     | 0.1740     | 0.1740     |
| openai_embed_3_small                  | 0.3640     | 0.7281     | 0.2427     | 0.3026     | 0.3026     |
| ko-sroberta-multitask                 | 0.2939     | 0.5877     | 0.1959     | 0.2500     | 0.2500     |
| openai_embed_3_large                  | 0.3947     | 0.7895     | 0.2632     | 0.3348     | 0.3348     |
| KU-HIAI-ONTHEIT-large-v1              | 0.4386     | 0.8772     | 0.2924     | 0.3421     | 0.3421     |
| KU-HIAI-ONTHEIT-large-v1.1            | 0.4430     | 0.8860     | 0.2953     | 0.3406     | 0.3406     |
| kf-deberta-multitask                  | 0.3158     | 0.6316     | 0.2105     | 0.2792     | 0.2792     |
| gte-multilingual-base                 | 0.4035     | 0.8070     | 0.2690     | 0.3450     | 0.3450     |
| BGE-m3                                | 0.4254     | 0.8508     | 0.2836     | 0.3421     | 0.3421     |
| bge-m3-korean                         | 0.3684     | 0.7368     | 0.2456     | 0.3143     | 0.3143     |
| **BGE-m3-ko**                         | **0.4517** | **0.9035** | **0.3011** | 0.3494     | 0.3494     |

Cross-Encoder (Reranker)

| Model name                                 | F1         | Recall     | Precision  | mAP        | mRR        |
|--------------------------------------------|------------|------------|------------|------------|------------|
| jinaai/jina-reranker-v2-base-multilingual  | 0.4649     | 0.9298     | 0.3099     | 0.8626     | 0.8626     |
| Alibaba-NLP/gte-multilingual-reranker-base | 0.4605     | 0.9211     | 0.3070     | 0.8173     | 0.8173     |
| BAAI/bge-reranker-v2-m3                    | 0.4781     | 0.9561     | 0.3187     | 0.9167     | 0.9167     |
| **bge-reranker-v2-m3-ko**                  | **0.4825** | **0.9649** | **0.3216** | **0.9371** | **0.9371** |

**Top-k 5**

Bi-Encoder (Sentence Transformer)

| Model name                            | F1         | Recall     | Precision  | mAP        | mRR        |
|---------------------------------------|------------|------------|------------|------------|------------|
| paraphrase-multilingual-mpnet-base-v2 | 0.1813     | 0.5439     | 0.1088     | 0.1575     | 0.1575     |
| KoSimCSE-roberta                      | 0.2164     | 0.6491     | 0.1298     | 0.1751     | 0.1751     |
| Cohere embed-multilingual-v3.0        | 0.2076     | 0.6228     | 0.1246     | 0.1640     | 0.1640     |
| openai ada 002                        | 0.2602     | 0.7807     | 0.1561     | 0.2139     | 0.2139     |
| multilingual-e5-large-instruct        | 0.2544     | 0.7632     | 0.1526     | 0.2194     | 0.2194     |
| Upstage Embedding                     | 0.2982     | 0.8947     | 0.1789     | **0.2237** | **0.2237** |
| paraphrase-multilingual-MiniLM-L12-v2 | 0.1637     | 0.4912     | 0.0982     | 0.1437     | 0.1437     |
| openai_embed_3_small                  | 0.2690     | 0.8070     | 0.1614     | 0.2148     | 0.2148     |
| ko-sroberta-multitask                 | 0.2164     | 0.6491     | 0.1298     | 0.1697     | 0.1697     |
| openai_embed_3_large                  | 0.2807     | 0.8421     | 0.1684     | 0.2088     | 0.2088     |
| KU-HIAI-ONTHEIT-large-v1              | 0.3041     | 0.9123     | 0.1825     | 0.2137     | 0.2137     |
| KU-HIAI-ONTHEIT-large-v1.1            | **0.3099** | **0.9298** | **0.1860** | 0.2148     | 0.2148     |
| kf-deberta-multitask                  | 0.2281     | 0.6842     | 0.1368     | 0.1724     | 0.1724     |
| gte-multilingual-base                 | 0.2865     | 0.8596     | 0.1719     | 0.2096     | 0.2096     |
| BGE-m3                                | 0.3041     | 0.9123     | 0.1825     | 0.2193     | 0.2193     |
| bge-m3-korean                         | 0.2661     | 0.7982     | 0.1596     | 0.2116     | 0.2116     |
| **BGE-m3-ko**                         | **0.3099** | **0.9298** | **0.1860** | 0.2098     | 0.2098     |

Cross-Encoder (Reranker)

| Model name                                 | F1         | Recall     | Precision  | mAP        | mRR        |
|--------------------------------------------|------------|------------|------------|------------|------------|
| jinaai/jina-reranker-v2-base-multilingual  | 0.3129     | 0.9386     | 0.1877     | 0.8643     | 0.8643     |
| Alibaba-NLP/gte-multilingual-reranker-base | 0.3158     | 0.9474     | 0.1895     | 0.8234     | 0.8234     |
| BAAI/bge-reranker-v2-m3                    | **0.3216** | **0.9649** | **0.1930** | 0.9189     | 0.9189     |
| **bge-reranker-v2-m3-ko**                  | **0.3216** | **0.9649** | **0.1930** | **0.9371** | **0.9371** |
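
The distinction drawn in the Metrics section, between metrics that consider ranking and those that do not, can be illustrated with a small sketch. This is not the benchmark's evaluation code: the function names and document ids below are hypothetical, and the example assumes a single relevant document per query (an assumption, although it is consistent with the tables above, where Precision equals Recall divided by k).

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of relevant documents found in the top k; order is ignored."""
    top_k = ranked_ids[:k]
    return sum(1 for doc_id in relevant_ids if doc_id in top_k) / len(relevant_ids)


def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top k results that are relevant; order is ignored."""
    top_k = ranked_ids[:k]
    return sum(1 for doc_id in top_k if doc_id in relevant_ids) / k


def reciprocal_rank(ranked_ids, relevant_ids, k):
    """1 / rank of the first relevant document in the top k, or 0.0 if none."""
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


# Hypothetical ids: "gold" is the only relevant document for the query.
relevant = {"gold"}
gold_first = ["gold"] + [f"doc{i}" for i in range(1, 10)]  # gold at rank 1
gold_last = [f"doc{i}" for i in range(1, 10)] + ["gold"]   # gold at rank 10

for name, ranking in [("gold at rank 1", gold_first), ("gold at rank 10", gold_last)]:
    print(
        name,
        "| recall@10:", recall_at_k(ranking, relevant, 10),
        "| precision@10:", precision_at_k(ranking, relevant, 10),
        "| reciprocal rank:", reciprocal_rank(ranking, relevant, 10),
    )
```

Both rankings get the same recall@10 and precision@10, because the relevant document is in the top 10 either way. Only the reciprocal rank, which MRR averages over queries, separates them. Under the same single-relevant-document assumption, every metric reduces to the hit rate at k = 1, which would explain why all columns coincide in the Top-k 1 tables.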