diff --git "a/README.md" "b/README.md" new file mode 100644--- /dev/null +++ "b/README.md" @@ -0,0 +1,2835 @@ +--- +tags: +- sentence-transformers +- feature-extraction +- sentence-similarity +- transformers +- mteb +license: apache-2.0 +model-index: +- name: bge-en-mistral + results: + - dataset: + config: en + name: MTEB AmazonCounterfactualClassification (en) + revision: e8379541af4e31359cca9fbcf4b00f2671dba205 + split: test + type: mteb/amazon_counterfactual + metrics: + - type: accuracy + value: 93.1492537313433 + - type: ap + value: 72.56132559564212 + - type: f1 + value: 89.71796898040243 + - type: main_score + value: 93.1492537313433 + task: + type: Classification + - dataset: + config: en + name: MTEB AmazonCounterfactualClassification (en) + revision: e8379541af4e31359cca9fbcf4b00f2671dba205 + split: validation + type: mteb/amazon_counterfactual + metrics: + - type: accuracy + value: 93.04477611940298 + - type: ap + value: 68.51763006673485 + - type: f1 + value: 88.44832081571468 + - type: main_score + value: 93.04477611940298 + task: + type: Classification + - dataset: + config: default + name: MTEB AmazonPolarityClassification (default) + revision: e2d317d38cd51312af73b3d32a06d1a08b442046 + split: test + type: mteb/amazon_polarity + metrics: + - type: accuracy + value: 96.98372499999999 + - type: ap + value: 95.62303091773919 + - type: f1 + value: 96.98308191715637 + - type: main_score + value: 96.98372499999999 + task: + type: Classification + - dataset: + config: en + name: MTEB AmazonReviewsClassification (en) + revision: 1399c76144fd37290681b995c656ef9b2e06e26d + split: test + type: mteb/amazon_reviews_multi + metrics: + - type: accuracy + value: 61.461999999999996 + - type: f1 + value: 60.57257766583118 + - type: main_score + value: 61.461999999999996 + task: + type: Classification + - dataset: + config: en + name: MTEB AmazonReviewsClassification (en) + revision: 1399c76144fd37290681b995c656ef9b2e06e26d + split: validation + type: mteb/amazon_reviews_multi + metrics: + - type: accuracy + value: 61.204 + - type: f1 + value: 60.262736729265384 + - type: main_score + value: 61.204 + task: + type: Classification + - dataset: + config: default + name: MTEB ArxivClusteringP2P (default) + revision: a122ad7f3f0291bf49cc6f4d32aa80929df69d5d + split: test + type: mteb/arxiv-clustering-p2p + metrics: + - type: main_score + value: 54.43859683357485 + - type: v_measure + value: 54.43859683357485 + - type: v_measure_std + value: 14.511128158596337 + task: + type: Clustering + - dataset: + config: default + name: MTEB ArxivClusteringS2S (default) + revision: f910caf1a6075f7329cdf8c1a6135696f37dbd53 + split: test + type: mteb/arxiv-clustering-s2s + metrics: + - type: main_score + value: 49.33365996236564 + - type: v_measure + value: 49.33365996236564 + - type: v_measure_std + value: 14.61261944856548 + task: + type: Clustering + - dataset: + config: default + name: MTEB AskUbuntuDupQuestions (default) + revision: 2000358ca161889fa9c082cb41daa8dcfb161a54 + split: test + type: mteb/askubuntudupquestions-reranking + metrics: + - type: main_score + value: 65.15263966490278 + - type: map + value: 65.15263966490278 + - type: mrr + value: 77.90331090885107 + task: + type: Reranking + - dataset: + config: default + name: MTEB BIOSSES (default) + revision: d3fb88f8f02e40887cd149695127462bbcf29b4a + split: test + type: mteb/biosses-sts + metrics: + - type: main_score + value: 86.47365710792691 + task: + type: STS + - dataset: + config: default + name: MTEB Banking77Classification (default) + 
revision: 0fd18e25b25c072e09e0d92ab615fda904d66300 + split: test + type: mteb/banking77 + metrics: + - type: accuracy + value: 91.48701298701299 + - type: f1 + value: 91.4733869423637 + - type: main_score + value: 91.48701298701299 + task: + type: Classification + - dataset: + config: default + name: MTEB BiorxivClusteringP2P (default) + revision: 65b79d1d13f80053f67aca9498d9402c2d9f1f40 + split: test + type: mteb/biorxiv-clustering-p2p + metrics: + - type: main_score + value: 53.050461108038036 + - type: v_measure + value: 53.050461108038036 + - type: v_measure_std + value: 0.9436104839012786 + task: + type: Clustering + - dataset: + config: default + name: MTEB BiorxivClusteringS2S (default) + revision: 258694dd0231531bc1fd9de6ceb52a0853c6d908 + split: test + type: mteb/biorxiv-clustering-s2s + metrics: + - type: main_score + value: 48.38215568371151 + - type: v_measure + value: 48.38215568371151 + - type: v_measure_std + value: 0.9104384504649026 + task: + type: Clustering + - dataset: + config: default + name: MTEB EmotionClassification (default) + revision: 4f58c6b202a23cf9a4da393831edf4f9183cad37 + split: test + type: mteb/emotion + metrics: + - type: accuracy + value: 93.36 + - type: f1 + value: 89.73665936982262 + - type: main_score + value: 93.36 + task: + type: Classification + - dataset: + config: default + name: MTEB EmotionClassification (default) + revision: 4f58c6b202a23cf9a4da393831edf4f9183cad37 + split: validation + type: mteb/emotion + metrics: + - type: accuracy + value: 94.14 + - type: f1 + value: 91.63163961443355 + - type: main_score + value: 94.14 + task: + type: Classification + - dataset: + config: default + name: MTEB ImdbClassification (default) + revision: 3d86128a09e091d6018b6d26cad27f2739fc2db7 + split: test + type: mteb/imdb + metrics: + - type: accuracy + value: 96.9144 + - type: ap + value: 95.45276911068486 + - type: f1 + value: 96.91412729455966 + - type: main_score + value: 96.9144 + task: + type: Classification + - dataset: + config: en + name: MTEB MTOPDomainClassification (en) + revision: d80d48c1eb48d3562165c59d59d0034df9fff0bf + split: test + type: mteb/mtop_domain + metrics: + - type: accuracy + value: 98.42225262197901 + - type: f1 + value: 98.31652547061115 + - type: main_score + value: 98.42225262197901 + task: + type: Classification + - dataset: + config: en + name: MTEB MTOPDomainClassification (en) + revision: d80d48c1eb48d3562165c59d59d0034df9fff0bf + split: validation + type: mteb/mtop_domain + metrics: + - type: accuracy + value: 98.60850111856824 + - type: f1 + value: 98.49625189176408 + - type: main_score + value: 98.60850111856824 + task: + type: Classification + - dataset: + config: en + name: MTEB MTOPIntentClassification (en) + revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba + split: test + type: mteb/mtop_intent + metrics: + - type: accuracy + value: 94.00136798905609 + - type: f1 + value: 82.7022316533099 + - type: main_score + value: 94.00136798905609 + task: + type: Classification + - dataset: + config: en + name: MTEB MTOPIntentClassification (en) + revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba + split: validation + type: mteb/mtop_intent + metrics: + - type: accuracy + value: 93.89261744966441 + - type: f1 + value: 78.76796618262529 + - type: main_score + value: 93.89261744966441 + task: + type: Classification + - dataset: + config: en + name: MTEB MassiveIntentClassification (en) + revision: 4672e20407010da34463acc759c162ca9734bca6 + split: test + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 
82.92535305985204 + - type: f1 + value: 79.885538231847 + - type: main_score + value: 82.92535305985204 + task: + type: Classification + - dataset: + config: en + name: MTEB MassiveIntentClassification (en) + revision: 4672e20407010da34463acc759c162ca9734bca6 + split: validation + type: mteb/amazon_massive_intent + metrics: + - type: accuracy + value: 83.55140186915888 + - type: f1 + value: 81.09072707555056 + - type: main_score + value: 83.55140186915888 + task: + type: Classification + - dataset: + config: en + name: MTEB MassiveScenarioClassification (en) + revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 + split: test + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 85.60188298587758 + - type: f1 + value: 84.87416963499224 + - type: main_score + value: 85.60188298587758 + task: + type: Classification + - dataset: + config: en + name: MTEB MassiveScenarioClassification (en) + revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8 + split: validation + type: mteb/amazon_massive_scenario + metrics: + - type: accuracy + value: 85.01721593703886 + - type: f1 + value: 84.05277245992066 + - type: main_score + value: 85.01721593703886 + task: + type: Classification + - dataset: + config: default + name: MTEB MedrxivClusteringP2P (default) + revision: e7a26af6f3ae46b30dde8737f02c07b1505bcc73 + split: test + type: mteb/medrxiv-clustering-p2p + metrics: + - type: main_score + value: 45.86171497327639 + - type: v_measure + value: 45.86171497327639 + - type: v_measure_std + value: 1.551347259003324 + task: + type: Clustering + - dataset: + config: default + name: MTEB MedrxivClusteringS2S (default) + revision: 35191c8c0dca72d8ff3efcd72aa802307d469663 + split: test + type: mteb/medrxiv-clustering-s2s + metrics: + - type: main_score + value: 44.33336692345644 + - type: v_measure + value: 44.33336692345644 + - type: v_measure_std + value: 1.5931408596404715 + task: + type: Clustering + - dataset: + config: default + name: MTEB MindSmallReranking (default) + revision: 59042f120c80e8afa9cdbb224f67076cec0fc9a7 + split: test + type: mteb/mind_small + metrics: + - type: main_score + value: 30.597409734750503 + - type: map + value: 30.597409734750503 + - type: mrr + value: 31.397041548018457 + task: + type: Reranking + - dataset: + config: default + name: MTEB RedditClustering (default) + revision: 24640382cdbf8abc73003fb0fa6d111a705499eb + split: test + type: mteb/reddit-clustering + metrics: + - type: main_score + value: 72.33008348681277 + - type: v_measure + value: 72.33008348681277 + - type: v_measure_std + value: 2.9203215463933008 + task: + type: Clustering + - dataset: + config: default + name: MTEB RedditClusteringP2P (default) + revision: 385e3cb46b4cfa89021f56c4380204149d0efe33 + split: test + type: mteb/reddit-clustering-p2p + metrics: + - type: main_score + value: 72.72079657828903 + - type: v_measure + value: 72.72079657828903 + - type: v_measure_std + value: 11.930271663428735 + task: + type: Clustering + - dataset: + config: default + name: MTEB SICK-R (default) + revision: 20a6d6f312dd54037fe07a32d58e5e168867909d + split: test + type: mteb/sickr-sts + metrics: + - type: main_score + value: 83.86733787791422 + task: + type: STS + - dataset: + config: default + name: MTEB STS12 (default) + revision: a0d554a64d88156834ff5ae9920b964011b16384 + split: test + type: mteb/sts12-sts + metrics: + - type: main_score + value: 78.14269330480724 + task: + type: STS + - dataset: + config: default + name: MTEB STS13 (default) + revision: 7e90230a92c190f1bf69ae9002b8cea547a64cca + 
split: test + type: mteb/sts13-sts + metrics: + - type: main_score + value: 86.58640009300751 + task: + type: STS + - dataset: + config: default + name: MTEB STS14 (default) + revision: 6031580fec1f6af667f0bd2da0a551cf4f0b2375 + split: test + type: mteb/sts14-sts + metrics: + - type: main_score + value: 82.8292579957437 + task: + type: STS + - dataset: + config: default + name: MTEB STS15 (default) + revision: ae752c7c21bf194d8b67fd573edf7ae58183cbe3 + split: test + type: mteb/sts15-sts + metrics: + - type: main_score + value: 87.77203714228862 + task: + type: STS + - dataset: + config: default + name: MTEB STS16 (default) + revision: 4d8694f8f0e0100860b497b999b3dbed754a0513 + split: test + type: mteb/sts16-sts + metrics: + - type: main_score + value: 87.0439304006969 + task: + type: STS + - dataset: + config: en-en + name: MTEB STS17 (en-en) + revision: faeb762787bd10488a50c8b5be4a3b82e411949c + split: test + type: mteb/sts17-crosslingual-sts + metrics: + - type: main_score + value: 91.24736138013424 + task: + type: STS + - dataset: + config: en + name: MTEB STS22 (en) + revision: de9d86b3b84231dc21f76c7b7af1f28e2f57f6e3 + split: test + type: mteb/sts22-crosslingual-sts + metrics: + - type: main_score + value: 70.07326214706 + task: + type: STS + - dataset: + config: default + name: MTEB STSBenchmark (default) + revision: b0fddb56ed78048fa8b90373c8a3cfc37b684831 + split: test + type: mteb/stsbenchmark-sts + metrics: + - type: main_score + value: 88.42076443255168 + task: + type: STS + - dataset: + config: default + name: MTEB SciDocsRR (default) + revision: d3c5e1fc0b855ab6097bf1cda04dd73947d7caab + split: test + type: mteb/scidocs-reranking + metrics: + - type: main_score + value: 86.9584489124583 + - type: map + value: 86.9584489124583 + - type: mrr + value: 96.59475328592976 + task: + type: Reranking + - dataset: + config: default + name: MTEB SprintDuplicateQuestions (default) + revision: d66bd1f72af766a5cc4b0ca5e00c162f89e8cc46 + split: test + type: mteb/sprintduplicatequestions-pairclassification + metrics: + - type: main_score + value: 97.26819027722253 + - type: cos_sim_accuracy + value: 99.88019801980198 + - type: cos_sim_accuracy_threshold + value: 76.67685151100159 + - type: cos_sim_ap + value: 97.23260568085786 + - type: cos_sim_f1 + value: 93.91824526420737 + - type: cos_sim_f1_threshold + value: 75.82710981369019 + - type: cos_sim_precision + value: 93.63817097415506 + - type: cos_sim_recall + value: 94.19999999999999 + - type: dot_accuracy + value: 99.88019801980198 + - type: dot_accuracy_threshold + value: 76.67686343193054 + - type: dot_ap + value: 97.23260568085786 + - type: dot_f1 + value: 93.91824526420737 + - type: dot_f1_threshold + value: 75.8271336555481 + - type: dot_precision + value: 93.63817097415506 + - type: dot_recall + value: 94.19999999999999 + - type: euclidean_accuracy + value: 99.88019801980198 + - type: euclidean_accuracy_threshold + value: 68.29807758331299 + - type: euclidean_ap + value: 97.23259982599497 + - type: euclidean_f1 + value: 93.91824526420737 + - type: euclidean_f1_threshold + value: 69.53110694885254 + - type: euclidean_precision + value: 93.63817097415506 + - type: euclidean_recall + value: 94.19999999999999 + - type: manhattan_accuracy + value: 99.87821782178217 + - type: manhattan_accuracy_threshold + value: 3482.6908111572266 + - type: manhattan_ap + value: 97.26819027722253 + - type: manhattan_f1 + value: 93.92592592592592 + - type: manhattan_f1_threshold + value: 3555.5641174316406 + - type: manhattan_precision + value: 
92.78048780487805 + - type: manhattan_recall + value: 95.1 + - type: max_accuracy + value: 99.88019801980198 + - type: max_ap + value: 97.26819027722253 + - type: max_f1 + value: 93.92592592592592 + task: + type: PairClassification + - dataset: + config: default + name: MTEB SprintDuplicateQuestions (default) + revision: d66bd1f72af766a5cc4b0ca5e00c162f89e8cc46 + split: validation + type: mteb/sprintduplicatequestions-pairclassification + metrics: + - type: main_score + value: 98.02470052972619 + - type: cos_sim_accuracy + value: 99.88811881188118 + - type: cos_sim_accuracy_threshold + value: 75.25776028633118 + - type: cos_sim_ap + value: 97.97198133050095 + - type: cos_sim_f1 + value: 94.37531110004977 + - type: cos_sim_f1_threshold + value: 75.25776028633118 + - type: cos_sim_precision + value: 93.95441030723488 + - type: cos_sim_recall + value: 94.8 + - type: dot_accuracy + value: 99.88811881188118 + - type: dot_accuracy_threshold + value: 75.25776624679565 + - type: dot_ap + value: 97.97198133050095 + - type: dot_f1 + value: 94.37531110004977 + - type: dot_f1_threshold + value: 75.25776624679565 + - type: dot_precision + value: 93.95441030723488 + - type: dot_recall + value: 94.8 + - type: euclidean_accuracy + value: 99.88811881188118 + - type: euclidean_accuracy_threshold + value: 70.34507989883423 + - type: euclidean_ap + value: 97.97198133050095 + - type: euclidean_f1 + value: 94.37531110004977 + - type: euclidean_f1_threshold + value: 70.34507989883423 + - type: euclidean_precision + value: 93.95441030723488 + - type: euclidean_recall + value: 94.8 + - type: manhattan_accuracy + value: 99.89207920792079 + - type: manhattan_accuracy_threshold + value: 3481.599807739258 + - type: manhattan_ap + value: 98.02470052972619 + - type: manhattan_f1 + value: 94.52536413862381 + - type: manhattan_f1_threshold + value: 3481.599807739258 + - type: manhattan_precision + value: 94.95459132189707 + - type: manhattan_recall + value: 94.1 + - type: max_accuracy + value: 99.89207920792079 + - type: max_ap + value: 98.02470052972619 + - type: max_f1 + value: 94.52536413862381 + task: + type: PairClassification + - dataset: + config: default + name: MTEB StackExchangeClustering (default) + revision: 6cbc1f7b2bc0622f2e39d2c77fa502909748c259 + split: test + type: mteb/stackexchange-clustering + metrics: + - type: main_score + value: 81.32419328350603 + - type: v_measure + value: 81.32419328350603 + - type: v_measure_std + value: 2.666861121694755 + task: + type: Clustering + - dataset: + config: default + name: MTEB StackExchangeClusteringP2P (default) + revision: 815ca46b2622cec33ccafc3735d572c266efdb44 + split: test + type: mteb/stackexchange-clustering-p2p + metrics: + - type: main_score + value: 46.048387963107565 + - type: v_measure + value: 46.048387963107565 + - type: v_measure_std + value: 1.4102848576321703 + task: + type: Clustering + - dataset: + config: default + name: MTEB StackOverflowDupQuestions (default) + revision: e185fbe320c72810689fc5848eb6114e1ef5ec69 + split: test + type: mteb/stackoverflowdupquestions-reranking + metrics: + - type: main_score + value: 56.70574900554072 + - type: map + value: 56.70574900554072 + - type: mrr + value: 57.517109116373824 + task: + type: Reranking + - dataset: + config: default + name: MTEB SummEval (default) + revision: cda12ad7615edc362dbf25a00fdd61d3b1eaf93c + split: test + type: mteb/summeval + metrics: + - type: main_score + value: 30.76932903185174 + task: + type: Summarization + - dataset: + config: default + name: MTEB 
ToxicConversationsClassification (default) + revision: edfaf9da55d3dd50d43143d90c1ac476895ae6de + split: test + type: mteb/toxic_conversations_50k + metrics: + - type: accuracy + value: 93.173828125 + - type: ap + value: 46.040184641424396 + - type: f1 + value: 80.77280549412752 + - type: main_score + value: 93.173828125 + task: + type: Classification + - dataset: + config: default + name: MTEB TweetSentimentExtractionClassification (default) + revision: d604517c81ca91fe16a244d1248fc021f9ecee7a + split: test + type: mteb/tweet_sentiment_extraction + metrics: + - type: accuracy + value: 79.9320882852292 + - type: f1 + value: 80.22638685975485 + - type: main_score + value: 79.9320882852292 + task: + type: Classification + - dataset: + config: default + name: MTEB TwentyNewsgroupsClustering (default) + revision: 6125ec4e24fa026cec8a478383ee943acfbd5449 + split: test + type: mteb/twentynewsgroups-clustering + metrics: + - type: main_score + value: 68.98152919711418 + - type: v_measure + value: 68.98152919711418 + - type: v_measure_std + value: 1.2519720970652428 + task: + type: Clustering + - dataset: + config: default + name: MTEB TwitterSemEval2015 (default) + revision: 70970daeab8776df92f5ea462b6173c0b46fd2d1 + split: test + type: mteb/twittersemeval2015-pairclassification + metrics: + - type: main_score + value: 79.34189681158234 + - type: cos_sim_accuracy + value: 87.68552184538356 + - type: cos_sim_accuracy_threshold + value: 76.06316804885864 + - type: cos_sim_ap + value: 79.34189149773933 + - type: cos_sim_f1 + value: 72.16386554621849 + - type: cos_sim_f1_threshold + value: 73.62890243530273 + - type: cos_sim_precision + value: 71.82435964453737 + - type: cos_sim_recall + value: 72.5065963060686 + - type: dot_accuracy + value: 87.68552184538356 + - type: dot_accuracy_threshold + value: 76.06316208839417 + - type: dot_ap + value: 79.34189231911259 + - type: dot_f1 + value: 72.16386554621849 + - type: dot_f1_threshold + value: 73.62889647483826 + - type: dot_precision + value: 71.82435964453737 + - type: dot_recall + value: 72.5065963060686 + - type: euclidean_accuracy + value: 87.68552184538356 + - type: euclidean_accuracy_threshold + value: 69.19080018997192 + - type: euclidean_ap + value: 79.34189681158234 + - type: euclidean_f1 + value: 72.16386554621849 + - type: euclidean_f1_threshold + value: 72.62383103370667 + - type: euclidean_precision + value: 71.82435964453737 + - type: euclidean_recall + value: 72.5065963060686 + - type: manhattan_accuracy + value: 87.661679680515 + - type: manhattan_accuracy_threshold + value: 3408.807373046875 + - type: manhattan_ap + value: 79.29617544165136 + - type: manhattan_f1 + value: 72.1957671957672 + - type: manhattan_f1_threshold + value: 3597.7684020996094 + - type: manhattan_precision + value: 72.38726790450929 + - type: manhattan_recall + value: 72.00527704485488 + - type: max_accuracy + value: 87.68552184538356 + - type: max_ap + value: 79.34189681158234 + - type: max_f1 + value: 72.1957671957672 + task: + type: PairClassification + - dataset: + config: default + name: MTEB TwitterURLCorpus (default) + revision: 8b6510b0b1fa4e4c4f879467980e9be563ec1cdf + split: test + type: mteb/twitterurlcorpus-pairclassification + metrics: + - type: main_score + value: 87.8635519535718 + - type: cos_sim_accuracy + value: 89.80672953778088 + - type: cos_sim_accuracy_threshold + value: 73.09532165527344 + - type: cos_sim_ap + value: 87.84251379545145 + - type: cos_sim_f1 + value: 80.25858884373845 + - type: cos_sim_f1_threshold + value: 70.57080268859863 + 
- type: cos_sim_precision + value: 77.14103110353643 + - type: cos_sim_recall + value: 83.63874345549738 + - type: dot_accuracy + value: 89.80672953778088 + - type: dot_accuracy_threshold + value: 73.09532761573792 + - type: dot_ap + value: 87.84251881260793 + - type: dot_f1 + value: 80.25858884373845 + - type: dot_f1_threshold + value: 70.57079076766968 + - type: dot_precision + value: 77.14103110353643 + - type: dot_recall + value: 83.63874345549738 + - type: euclidean_accuracy + value: 89.80672953778088 + - type: euclidean_accuracy_threshold + value: 73.3548641204834 + - type: euclidean_ap + value: 87.84251335039049 + - type: euclidean_f1 + value: 80.25858884373845 + - type: euclidean_f1_threshold + value: 76.71923041343689 + - type: euclidean_precision + value: 77.14103110353643 + - type: euclidean_recall + value: 83.63874345549738 + - type: manhattan_accuracy + value: 89.78150347343501 + - type: manhattan_accuracy_threshold + value: 3702.7603149414062 + - type: manhattan_ap + value: 87.8635519535718 + - type: manhattan_f1 + value: 80.27105660516332 + - type: manhattan_f1_threshold + value: 3843.5962677001953 + - type: manhattan_precision + value: 76.9361101306036 + - type: manhattan_recall + value: 83.90822297505389 + - type: max_accuracy + value: 89.80672953778088 + - type: max_ap + value: 87.8635519535718 + - type: max_f1 + value: 80.27105660516332 + task: + type: PairClassification + - task: + type: Retrieval + dataset: + type: nfcorpus + name: MTEB NFCorpus + config: default + split: test + revision: None + metrics: + - type: ndcg_at_1 + value: 52.47678018575851 + - type: ndcg_at_3 + value: 47.43993801247414 + - type: ndcg_at_5 + value: 45.08173173082719 + - type: ndcg_at_10 + value: 41.850870119787835 + - type: ndcg_at_100 + value: 37.79284946590978 + - type: ndcg_at_1000 + value: 46.58046062123418 + - type: map_at_1 + value: 6.892464464226138 + - type: map_at_3 + value: 12.113195798233127 + - type: map_at_5 + value: 13.968475602788812 + - type: map_at_10 + value: 16.47564069781326 + - type: map_at_100 + value: 20.671726065190025 + - type: map_at_1000 + value: 22.328875914012006 + - type: precision_at_1 + value: 53.86996904024768 + - type: precision_at_3 + value: 43.96284829721363 + - type: precision_at_5 + value: 38.69969040247682 + - type: precision_at_10 + value: 30.928792569659457 + - type: precision_at_100 + value: 9.507739938080498 + - type: precision_at_1000 + value: 2.25882352941176 + - type: recall_at_1 + value: 6.892464464226138 + - type: recall_at_3 + value: 13.708153358278407 + - type: recall_at_5 + value: 16.651919797359145 + - type: recall_at_10 + value: 21.01801714352559 + - type: recall_at_100 + value: 37.01672102843443 + - type: recall_at_1000 + value: 69.8307270724072 + - task: + type: Retrieval + dataset: + type: msmarco + name: MTEB MSMARCO + config: default + split: dev + revision: None + metrics: + - type: ndcg_at_1 + value: 26.63323782234957 + - type: ndcg_at_3 + value: 38.497585804985754 + - type: ndcg_at_5 + value: 42.72761631631636 + - type: ndcg_at_10 + value: 46.78865753107054 + - type: ndcg_at_100 + value: 51.96170786623209 + - type: ndcg_at_1000 + value: 52.82713901970963 + - type: map_at_1 + value: 25.89063992359121 + - type: map_at_3 + value: 35.299466730340654 + - type: map_at_5 + value: 37.68771887933786 + - type: map_at_10 + value: 39.40908074468253 + - type: map_at_100 + value: 40.53444082323405 + - type: map_at_1000 + value: 40.57183037649452 + - type: precision_at_1 + value: 26.63323782234957 + - type: precision_at_3 + value: 
16.265520534861793 + - type: precision_at_5 + value: 11.902578796562304 + - type: precision_at_10 + value: 7.262177650430416 + - type: precision_at_100 + value: 0.9819484240687512 + - type: precision_at_1000 + value: 0.10571633237823287 + - type: recall_at_1 + value: 25.89063992359121 + - type: recall_at_3 + value: 46.99737344794652 + - type: recall_at_5 + value: 57.160936007640906 + - type: recall_at_10 + value: 69.43409742120343 + - type: recall_at_100 + value: 92.86413562559697 + - type: recall_at_1000 + value: 99.3230659025788 + - task: + type: Retrieval + dataset: + type: fiqa + name: MTEB FiQA2018 + config: default + split: test + revision: None + metrics: + - type: ndcg_at_1 + value: 57.407407407407405 + - type: ndcg_at_3 + value: 53.79975378289304 + - type: ndcg_at_5 + value: 56.453379423655406 + - type: ndcg_at_10 + value: 59.67151242793314 + - type: ndcg_at_100 + value: 65.34055762539253 + - type: ndcg_at_1000 + value: 67.07707746043032 + - type: map_at_1 + value: 30.65887045053714 + - type: map_at_3 + value: 44.09107110881799 + - type: map_at_5 + value: 48.18573748068346 + - type: map_at_10 + value: 51.03680979612876 + - type: map_at_100 + value: 53.03165194566928 + - type: map_at_1000 + value: 53.16191096190861 + - type: precision_at_1 + value: 57.407407407407405 + - type: precision_at_3 + value: 35.493827160493886 + - type: precision_at_5 + value: 26.913580246913547 + - type: precision_at_10 + value: 16.435185185185155 + - type: precision_at_100 + value: 2.2685185185184986 + - type: precision_at_1000 + value: 0.25864197530863964 + - type: recall_at_1 + value: 30.65887045053714 + - type: recall_at_3 + value: 48.936723427464194 + - type: recall_at_5 + value: 58.55942925387371 + - type: recall_at_10 + value: 68.45128551147073 + - type: recall_at_100 + value: 88.24599311867836 + - type: recall_at_1000 + value: 98.18121693121691 + - task: + type: Retrieval + dataset: + type: scidocs + name: MTEB SCIDOCS + config: default + split: test + revision: None + metrics: + - type: ndcg_at_1 + value: 28.7 + - type: ndcg_at_3 + value: 23.61736427940938 + - type: ndcg_at_5 + value: 20.845690325673885 + - type: ndcg_at_10 + value: 25.25865384510787 + - type: ndcg_at_100 + value: 36.18596641088721 + - type: ndcg_at_1000 + value: 41.7166868935345 + - type: map_at_1 + value: 5.828333333333361 + - type: map_at_3 + value: 10.689166666666676 + - type: map_at_5 + value: 13.069916666666668 + - type: map_at_10 + value: 15.4901164021164 + - type: map_at_100 + value: 18.61493245565425 + - type: map_at_1000 + value: 18.99943478016456 + - type: precision_at_1 + value: 28.7 + - type: precision_at_3 + value: 22.30000000000006 + - type: precision_at_5 + value: 18.55999999999997 + - type: precision_at_10 + value: 13.289999999999946 + - type: precision_at_100 + value: 2.905000000000005 + - type: precision_at_1000 + value: 0.4218999999999946 + - type: recall_at_1 + value: 5.828333333333361 + - type: recall_at_3 + value: 13.548333333333387 + - type: recall_at_5 + value: 18.778333333333308 + - type: recall_at_10 + value: 26.939999999999902 + - type: recall_at_100 + value: 58.91333333333344 + - type: recall_at_1000 + value: 85.57499999999972 + - task: + type: Retrieval + dataset: + type: fever + name: MTEB FEVER + config: defaultcqa + split: test + revision: None + metrics: + - type: ndcg_at_1 + value: 88.98889888988899 + - type: ndcg_at_3 + value: 91.82404417747676 + - type: ndcg_at_5 + value: 92.41785792357787 + - type: ndcg_at_10 + value: 92.82809814626805 + - type: ndcg_at_100 + value: 93.31730867509245 + - 
type: ndcg_at_1000 + value: 93.45171203408582 + - type: map_at_1 + value: 82.64125817343636 + - type: map_at_3 + value: 89.39970782792554 + - type: map_at_5 + value: 89.96799501378695 + - type: map_at_10 + value: 90.27479706587437 + - type: map_at_100 + value: 90.45185655778057 + - type: map_at_1000 + value: 90.46130471574544 + - type: precision_at_1 + value: 88.98889888988899 + - type: precision_at_3 + value: 34.923492349234245 + - type: precision_at_5 + value: 21.524152415244043 + - type: precision_at_10 + value: 11.033603360337315 + - type: precision_at_100 + value: 1.1521152115211895 + - type: precision_at_1000 + value: 0.11765676567657675 + - type: recall_at_1 + value: 82.64125817343636 + - type: recall_at_3 + value: 94.35195900542428 + - type: recall_at_5 + value: 95.9071323799047 + - type: recall_at_10 + value: 97.04234113887586 + - type: recall_at_100 + value: 98.77282371094255 + - type: recall_at_1000 + value: 99.5555567461508 + - task: + type: Retrieval + dataset: + type: arguana + name: MTEB ArguAna + config: default + split: test + revision: None + metrics: + - type: ndcg_at_1 + value: 66.50071123755335 + - type: ndcg_at_3 + value: 80.10869593172173 + - type: ndcg_at_5 + value: 81.89670542467924 + - type: ndcg_at_10 + value: 83.07967801208441 + - type: ndcg_at_100 + value: 83.5991349601075 + - type: ndcg_at_1000 + value: 83.5991349601075 + - type: map_at_1 + value: 66.50071123755335 + - type: map_at_3 + value: 76.83736367946898 + - type: map_at_5 + value: 77.8473210052158 + - type: map_at_10 + value: 78.35472690735851 + - type: map_at_100 + value: 78.47388207611678 + - type: map_at_1000 + value: 78.47388207611678 + - type: precision_at_1 + value: 66.50071123755335 + - type: precision_at_3 + value: 29.848269321953076 + - type: precision_at_5 + value: 18.762446657183045 + - type: precision_at_10 + value: 9.736842105262909 + - type: precision_at_100 + value: 0.9964438122332677 + - type: precision_at_1000 + value: 0.09964438122332549 + - type: recall_at_1 + value: 66.50071123755335 + - type: recall_at_3 + value: 89.5448079658606 + - type: recall_at_5 + value: 93.8122332859175 + - type: recall_at_10 + value: 97.36842105263158 + - type: recall_at_100 + value: 99.6443812233286 + - type: recall_at_1000 + value: 99.6443812233286 + - task: + type: Retrieval + dataset: + type: scifact + name: MTEB SciFact + config: default + split: test + revision: None + metrics: + - type: ndcg_at_1 + value: 66.0 + - type: ndcg_at_3 + value: 74.98853481223065 + - type: ndcg_at_5 + value: 77.29382051205019 + - type: ndcg_at_10 + value: 79.09159079425369 + - type: ndcg_at_100 + value: 80.29692802526776 + - type: ndcg_at_1000 + value: 80.55210036585547 + - type: map_at_1 + value: 62.994444444444454 + - type: map_at_3 + value: 71.7425925925926 + - type: map_at_5 + value: 73.6200925925926 + - type: map_at_10 + value: 74.50223544973547 + - type: map_at_100 + value: 74.82438594015447 + - type: map_at_1000 + value: 74.83420474892468 + - type: precision_at_1 + value: 66.0 + - type: precision_at_3 + value: 29.44444444444439 + - type: precision_at_5 + value: 19.40000000000008 + - type: precision_at_10 + value: 10.366666666666715 + - type: precision_at_100 + value: 1.0999999999999928 + - type: precision_at_1000 + value: 0.11200000000000007 + - type: recall_at_1 + value: 62.994444444444454 + - type: recall_at_3 + value: 80.89999999999998 + - type: recall_at_5 + value: 86.72777777777779 + - type: recall_at_10 + value: 91.88888888888887 + - type: recall_at_100 + value: 97.0 + - type: recall_at_1000 + value: 99.0 + - 
task: + type: Retrieval + dataset: + type: trec-covid + name: MTEB TRECCOVID + config: default + split: test + revision: None + metrics: + - type: ndcg_at_1 + value: 83.0 + - type: ndcg_at_3 + value: 79.86598407528447 + - type: ndcg_at_5 + value: 79.27684428714952 + - type: ndcg_at_10 + value: 79.07987651251462 + - type: ndcg_at_100 + value: 64.55029164391163 + - type: ndcg_at_1000 + value: 59.42333857860492 + - type: map_at_1 + value: 0.226053732680979 + - type: map_at_3 + value: 0.644034626013194 + - type: map_at_5 + value: 1.045196967937728 + - type: map_at_10 + value: 2.0197496659905085 + - type: map_at_100 + value: 13.316018005224159 + - type: map_at_1000 + value: 33.784766957424104 + - type: precision_at_1 + value: 88.0 + - type: precision_at_3 + value: 86.66666666666667 + - type: precision_at_5 + value: 85.20000000000002 + - type: precision_at_10 + value: 84.19999999999997 + - type: precision_at_100 + value: 67.88000000000001 + - type: precision_at_1000 + value: 26.573999999999998 + - type: recall_at_1 + value: 0.226053732680979 + - type: recall_at_3 + value: 0.6754273711472734 + - type: recall_at_5 + value: 1.1168649828059245 + - type: recall_at_10 + value: 2.2215081031265207 + - type: recall_at_100 + value: 16.694165236664727 + - type: recall_at_1000 + value: 56.7022214857503 + - task: + type: Retrieval + dataset: + type: climate-fever + name: MTEB ClimateFEVER + config: default + split: test + revision: None + metrics: + - type: ndcg_at_1 + value: 44.36482084690554 + - type: ndcg_at_3 + value: 38.13005747178844 + - type: ndcg_at_5 + value: 40.83474510717123 + - type: ndcg_at_10 + value: 45.4272998284769 + - type: ndcg_at_100 + value: 52.880220707479516 + - type: ndcg_at_1000 + value: 55.364753427333 + - type: map_at_1 + value: 19.200868621064064 + - type: map_at_3 + value: 28.33785740137525 + - type: map_at_5 + value: 31.67162504524064 + - type: map_at_10 + value: 34.417673164090075 + - type: map_at_100 + value: 36.744753097028976 + - type: map_at_1000 + value: 36.91262189016135 + - type: precision_at_1 + value: 44.36482084690554 + - type: precision_at_3 + value: 29.14223669923975 + - type: precision_at_5 + value: 22.410423452768388 + - type: precision_at_10 + value: 14.293159609120309 + - type: precision_at_100 + value: 2.248859934853431 + - type: precision_at_1000 + value: 0.2722475570032542 + - type: recall_at_1 + value: 19.200868621064064 + - type: recall_at_3 + value: 34.132464712269176 + - type: recall_at_5 + value: 42.35613463626491 + - type: recall_at_10 + value: 52.50814332247546 + - type: recall_at_100 + value: 77.16178067318128 + - type: recall_at_1000 + value: 90.59174809989138 + - task: + type: Retrieval + dataset: + type: hotpotqa + name: MTEB HotpotQA + config: default + split: test + revision: None + metrics: + - type: ndcg_at_1 + value: 89.9392302498312 + - type: ndcg_at_3 + value: 81.2061569376288 + - type: ndcg_at_5 + value: 83.53311592078133 + - type: ndcg_at_10 + value: 85.13780800141961 + - type: ndcg_at_100 + value: 87.02630661625386 + - type: ndcg_at_1000 + value: 87.47294723601075 + - type: map_at_1 + value: 44.9696151249156 + - type: map_at_3 + value: 76.46972766148966 + - type: map_at_5 + value: 78.47749268512187 + - type: map_at_10 + value: 79.49792611170005 + - type: map_at_100 + value: 80.09409086274644 + - type: map_at_1000 + value: 80.11950878917663 + - type: precision_at_1 + value: 89.9392302498312 + - type: precision_at_3 + value: 53.261309925724234 + - type: precision_at_5 + value: 33.79338284942924 + - type: precision_at_10 + value: 
17.69750168805041 + - type: precision_at_100 + value: 1.9141120864280805 + - type: precision_at_1000 + value: 0.19721809588118133 + - type: recall_at_1 + value: 44.9696151249156 + - type: recall_at_3 + value: 79.8919648885888 + - type: recall_at_5 + value: 84.48345712356516 + - type: recall_at_10 + value: 88.48750844024308 + - type: recall_at_100 + value: 95.70560432140446 + - type: recall_at_1000 + value: 98.60904794058068 + - task: + type: Retrieval + dataset: + type: nq + name: MTEB NQ + config: default + split: test + revision: None + metrics: + - type: ndcg_at_1 + value: 57.0683661645423 + - type: ndcg_at_3 + value: 67.89935813080585 + - type: ndcg_at_5 + value: 71.47769719452941 + - type: ndcg_at_10 + value: 73.88350836507092 + - type: ndcg_at_100 + value: 75.76561068060907 + - type: ndcg_at_1000 + value: 75.92437662684215 + - type: map_at_1 + value: 51.00424874468904 + - type: map_at_3 + value: 63.87359984550011 + - type: map_at_5 + value: 66.23696407879494 + - type: map_at_10 + value: 67.42415446608673 + - type: map_at_100 + value: 67.92692839842621 + - type: map_at_1000 + value: 67.93437922640133 + - type: precision_at_1 + value: 57.0683661645423 + - type: precision_at_3 + value: 29.692931633836416 + - type: precision_at_5 + value: 20.046349942062854 + - type: precision_at_10 + value: 10.950173812283 + - type: precision_at_100 + value: 1.1995944380069687 + - type: precision_at_1000 + value: 0.12146581691772171 + - type: recall_at_1 + value: 51.00424874468904 + - type: recall_at_3 + value: 75.93665507918116 + - type: recall_at_5 + value: 83.95133256083433 + - type: recall_at_10 + value: 90.78794901506375 + - type: recall_at_100 + value: 98.61915797605253 + - type: recall_at_1000 + value: 99.7827346465817 + - task: + type: Retrieval + dataset: + type: quora + name: MTEB QuoraRetrieval + config: default + split: test + revision: None + metrics: + - type: ndcg_at_1 + value: 84.61999999999999 + - type: ndcg_at_3 + value: 88.57366734033212 + - type: ndcg_at_5 + value: 89.89804048972175 + - type: ndcg_at_10 + value: 90.95410848372035 + - type: ndcg_at_100 + value: 91.83227134455773 + - type: ndcg_at_1000 + value: 91.88368412611601 + - type: map_at_1 + value: 73.4670089207039 + - type: map_at_3 + value: 84.87862925508942 + - type: map_at_5 + value: 86.68002324701408 + - type: map_at_10 + value: 87.7165466015312 + - type: map_at_100 + value: 88.28718809614146 + - type: map_at_1000 + value: 88.29877148480672 + - type: precision_at_1 + value: 84.61999999999999 + - type: precision_at_3 + value: 38.82333333333838 + - type: precision_at_5 + value: 25.423999999998642 + - type: precision_at_10 + value: 13.787999999998583 + - type: precision_at_100 + value: 1.5442999999999767 + - type: precision_at_1000 + value: 0.15672999999997972 + - type: recall_at_1 + value: 73.4670089207039 + - type: recall_at_3 + value: 89.98389854832143 + - type: recall_at_5 + value: 93.88541046010576 + - type: recall_at_10 + value: 96.99779417520634 + - type: recall_at_100 + value: 99.80318763957743 + - type: recall_at_1000 + value: 99.99638888888889 + - task: + type: Retrieval + dataset: + type: webis-touche2020 + name: MTEB Touche2020 + config: default + split: test + revision: None + metrics: + - type: ndcg_at_1 + value: 33.6734693877551 + - type: ndcg_at_3 + value: 34.36843900446739 + - type: ndcg_at_5 + value: 32.21323786731918 + - type: ndcg_at_10 + value: 30.47934263207554 + - type: ndcg_at_100 + value: 41.49598869753928 + - type: ndcg_at_1000 + value: 52.32963949183662 + - type: map_at_1 + value: 3.0159801678718168 
+ - type: map_at_3 + value: 7.13837927642557 + - type: map_at_5 + value: 9.274004610363466 + - type: map_at_10 + value: 12.957368366814324 + - type: map_at_100 + value: 19.3070585127604 + - type: map_at_1000 + value: 20.809777161133532 + - type: precision_at_1 + value: 34.69387755102041 + - type: precision_at_3 + value: 36.054421768707485 + - type: precision_at_5 + value: 32.24489795918368 + - type: precision_at_10 + value: 27.142857142857146 + - type: precision_at_100 + value: 8.326530612244898 + - type: precision_at_1000 + value: 1.5755102040816336 + - type: recall_at_1 + value: 3.0159801678718168 + - type: recall_at_3 + value: 8.321771388428257 + - type: recall_at_5 + value: 11.737532394366069 + - type: recall_at_10 + value: 19.49315139822179 + - type: recall_at_100 + value: 50.937064145519685 + - type: recall_at_1000 + value: 83.4358283484675 + - task: + type: Retrieval + dataset: + type: dbpedia-entity + name: MTEB DBPedia + config: default + split: test + revision: None + metrics: + - type: ndcg_at_1 + value: 64.375 + - type: ndcg_at_3 + value: 55.677549598242614 + - type: ndcg_at_5 + value: 53.44347199908503 + - type: ndcg_at_10 + value: 51.634197691802754 + - type: ndcg_at_100 + value: 56.202861267183415 + - type: ndcg_at_1000 + value: 63.146019108272576 + - type: map_at_1 + value: 9.789380503780919 + - type: map_at_3 + value: 16.146582195277016 + - type: map_at_5 + value: 19.469695222167193 + - type: map_at_10 + value: 24.163327344766145 + - type: map_at_100 + value: 35.47047690245571 + - type: map_at_1000 + value: 37.5147432331838 + - type: precision_at_1 + value: 76.25 + - type: precision_at_3 + value: 59.08333333333333 + - type: precision_at_5 + value: 52.24999999999997 + - type: precision_at_10 + value: 42.54999999999994 + - type: precision_at_100 + value: 13.460000000000008 + - type: precision_at_1000 + value: 2.4804999999999966 + - type: recall_at_1 + value: 9.789380503780919 + - type: recall_at_3 + value: 17.48487134027656 + - type: recall_at_5 + value: 22.312024269698806 + - type: recall_at_10 + value: 30.305380335237324 + - type: recall_at_100 + value: 62.172868946596424 + - type: recall_at_1000 + value: 85.32410301328747 + - task: + type: Retrieval + dataset: + type: BeIR/cqadupstack + name: MTEB CQADupstackPhysicsRetrieval + config: default + split: test + revision: None + metrics: + - type: ndcg_at_1 + value: 42.15591915303176 + - type: ndcg_at_3 + value: 48.15261407846446 + - type: ndcg_at_5 + value: 50.58031819816491 + - type: ndcg_at_10 + value: 53.159393156983015 + - type: ndcg_at_100 + value: 58.64024684800366 + - type: ndcg_at_1000 + value: 60.017254762428166 + - type: map_at_1 + value: 34.78577058702179 + - type: map_at_3 + value: 43.52147299813321 + - type: map_at_5 + value: 45.47857625732981 + - type: map_at_10 + value: 46.94467579029768 + - type: map_at_100 + value: 48.364473257035456 + - type: map_at_1000 + value: 48.460199893487435 + - type: precision_at_1 + value: 42.15591915303176 + - type: precision_at_3 + value: 22.842476740455762 + - type: precision_at_5 + value: 16.073147256977784 + - type: precision_at_10 + value: 9.566891241578338 + - type: precision_at_100 + value: 1.441770933589971 + - type: precision_at_1000 + value: 0.17045235803656864 + - type: recall_at_1 + value: 34.78577058702179 + - type: recall_at_3 + value: 51.705004026948195 + - type: recall_at_5 + value: 57.99470738835514 + - type: recall_at_10 + value: 65.73761786225693 + - type: recall_at_100 + value: 88.03733579833336 + - type: recall_at_1000 + value: 96.505175424102 + - task: + 
type: Retrieval + dataset: + type: BeIR/cqadupstack + name: MTEB CQADupstackStatsRetrieval + config: default + split: test + revision: None + metrics: + - type: ndcg_at_1 + value: 30.061349693251532 + - type: ndcg_at_3 + value: 36.63708916157646 + - type: ndcg_at_5 + value: 38.61671491681753 + - type: ndcg_at_10 + value: 41.350655796840066 + - type: ndcg_at_100 + value: 46.45326227358081 + - type: ndcg_at_1000 + value: 48.582285457159266 + - type: map_at_1 + value: 26.9244205862304 + - type: map_at_3 + value: 33.585406725744164 + - type: map_at_5 + value: 34.91193310921073 + - type: map_at_10 + value: 36.15920645617732 + - type: map_at_100 + value: 37.25917602757753 + - type: map_at_1000 + value: 37.35543998586382 + - type: precision_at_1 + value: 30.061349693251532 + - type: precision_at_3 + value: 16.002044989775 + - type: precision_at_5 + value: 11.012269938650379 + - type: precision_at_10 + value: 6.625766871165693 + - type: precision_at_100 + value: 1.0015337423312758 + - type: precision_at_1000 + value: 0.12638036809815958 + - type: recall_at_1 + value: 26.9244205862304 + - type: recall_at_3 + value: 40.92407975460122 + - type: recall_at_5 + value: 45.74576284315548 + - type: recall_at_10 + value: 54.04032657867014 + - type: recall_at_100 + value: 76.89573533447586 + - type: recall_at_1000 + value: 92.10000029943193 + - task: + type: Retrieval + dataset: + type: BeIR/cqadupstack + name: MTEB CQADupstackWebmastersRetrieval + config: default + split: test + revision: None + metrics: + - type: ndcg_at_1 + value: 36.16600790513834 + - type: ndcg_at_3 + value: 41.39539336351464 + - type: ndcg_at_5 + value: 44.286188181817465 + - type: ndcg_at_10 + value: 46.8079293900759 + - type: ndcg_at_100 + value: 52.77618002686582 + - type: ndcg_at_1000 + value: 54.74554787022661 + - type: map_at_1 + value: 29.947644735902585 + - type: map_at_3 + value: 36.84394907359118 + - type: map_at_5 + value: 38.9461665221235 + - type: map_at_10 + value: 40.38325122041743 + - type: map_at_100 + value: 42.15067269020822 + - type: map_at_1000 + value: 42.396412886053454 + - type: precision_at_1 + value: 36.16600790513834 + - type: precision_at_3 + value: 19.23583662714091 + - type: precision_at_5 + value: 14.268774703557394 + - type: precision_at_10 + value: 9.071146245059353 + - type: precision_at_100 + value: 1.7905138339920774 + - type: precision_at_1000 + value: 0.2537549407114581 + - type: recall_at_1 + value: 29.947644735902585 + - type: recall_at_3 + value: 43.95135576014935 + - type: recall_at_5 + value: 51.33413524177249 + - type: recall_at_10 + value: 58.597439631665615 + - type: recall_at_100 + value: 85.04925879936505 + - type: recall_at_1000 + value: 96.93189262162947 + - task: + type: Retrieval + dataset: + type: BeIR/cqadupstack + name: MTEB CQADupstackWordpressRetrieval + config: default + split: test + revision: None + metrics: + - type: ndcg_at_1 + value: 26.247689463955638 + - type: ndcg_at_3 + value: 33.25421096011386 + - type: ndcg_at_5 + value: 35.274958043979055 + - type: ndcg_at_10 + value: 37.895337114228504 + - type: ndcg_at_100 + value: 43.16359215810417 + - type: ndcg_at_1000 + value: 45.46544464874392 + - type: map_at_1 + value: 23.730646155069266 + - type: map_at_3 + value: 30.328510192859376 + - type: map_at_5 + value: 31.646131881091033 + - type: map_at_10 + value: 32.834529811633146 + - type: map_at_100 + value: 33.887475191512124 + - type: map_at_1000 + value: 33.98635376333761 + - type: precision_at_1 + value: 26.247689463955638 + - type: precision_at_3 + value: 
14.417744916820693 + - type: precision_at_5 + value: 10.018484288354932 + - type: precision_at_10 + value: 6.00739371534199 + - type: precision_at_100 + value: 0.9426987060998051 + - type: precision_at_1000 + value: 0.12476894639556387 + - type: recall_at_1 + value: 23.730646155069266 + - type: recall_at_3 + value: 38.561206845149364 + - type: recall_at_5 + value: 43.38560610577783 + - type: recall_at_10 + value: 51.21370222407728 + - type: recall_at_100 + value: 75.61661144095109 + - type: recall_at_1000 + value: 92.54472715089256 + - task: + type: Retrieval + dataset: + type: BeIR/cqadupstack + name: MTEB CQADupstackProgrammersRetrieval + config: default + split: test + revision: None + metrics: + - type: ndcg_at_1 + value: 38.81278538812785 + - type: ndcg_at_3 + value: 43.78338523503654 + - type: ndcg_at_5 + value: 47.097296563014325 + - type: ndcg_at_10 + value: 50.282579667519435 + - type: ndcg_at_100 + value: 55.729033960190286 + - type: ndcg_at_1000 + value: 57.33724814332862 + - type: map_at_1 + value: 31.69764033847938 + - type: map_at_3 + value: 39.42951244122387 + - type: map_at_5 + value: 41.943723140417774 + - type: map_at_10 + value: 43.61013816936983 + - type: map_at_100 + value: 45.02590557151775 + - type: map_at_1000 + value: 45.125950171245066 + - type: precision_at_1 + value: 38.81278538812785 + - type: precision_at_3 + value: 20.96651445966523 + - type: precision_at_5 + value: 15.388127853881361 + - type: precision_at_10 + value: 9.474885844748805 + - type: precision_at_100 + value: 1.400684931506831 + - type: precision_at_1000 + value: 0.17191780821917388 + - type: recall_at_1 + value: 31.69764033847938 + - type: recall_at_3 + value: 46.60687843152849 + - type: recall_at_5 + value: 55.17297638322793 + - type: recall_at_10 + value: 64.45674471217188 + - type: recall_at_100 + value: 87.1937426751484 + - type: recall_at_1000 + value: 97.32787875629423 + - task: + type: Retrieval + dataset: + type: BeIR/cqadupstack + name: MTEB CQADupstackEnglishRetrieval + config: default + split: test + revision: None + metrics: + - type: ndcg_at_1 + value: 45.22292993630573 + - type: ndcg_at_3 + value: 50.48933696278536 + - type: ndcg_at_5 + value: 52.51230339563936 + - type: ndcg_at_10 + value: 54.63834990956019 + - type: ndcg_at_100 + value: 58.4908966688059 + - type: ndcg_at_1000 + value: 60.25262455573039 + - type: map_at_1 + value: 36.14176917496391 + - type: map_at_3 + value: 45.293425362542706 + - type: map_at_5 + value: 47.228727919799 + - type: map_at_10 + value: 48.603664692804365 + - type: map_at_100 + value: 49.87291685915334 + - type: map_at_1000 + value: 49.99758620164822 + - type: precision_at_1 + value: 45.22292993630573 + - type: precision_at_3 + value: 24.607218683651517 + - type: precision_at_5 + value: 17.273885350318157 + - type: precision_at_10 + value: 10.401273885350104 + - type: precision_at_100 + value: 1.5840764331210677 + - type: precision_at_1000 + value: 0.20216560509553294 + - type: recall_at_1 + value: 36.14176917496391 + - type: recall_at_3 + value: 52.458133860965276 + - type: recall_at_5 + value: 58.30933220798927 + - type: recall_at_10 + value: 64.76267431694271 + - type: recall_at_100 + value: 81.11863633256955 + - type: recall_at_1000 + value: 91.95898877878803 + - task: + type: Retrieval + dataset: + type: BeIR/cqadupstack + name: MTEB CQADupstackMathematicaRetrieval + config: default + split: test + revision: None + metrics: + - type: ndcg_at_1 + value: 25.37313432835821 + - type: ndcg_at_3 + value: 31.513955857649872 + - type: ndcg_at_5 + value: 
33.894814999901286 + - type: ndcg_at_10 + value: 36.567795091777775 + - type: ndcg_at_100 + value: 42.692861355185926 + - type: ndcg_at_1000 + value: 45.1650634517594 + - type: map_at_1 + value: 20.137260127931768 + - type: map_at_3 + value: 27.513893824528164 + - type: map_at_5 + value: 29.228223959567245 + - type: map_at_10 + value: 30.486342453382235 + - type: map_at_100 + value: 31.93773531700923 + - type: map_at_1000 + value: 32.045221355885026 + - type: precision_at_1 + value: 25.37313432835821 + - type: precision_at_3 + value: 15.713101160862273 + - type: precision_at_5 + value: 11.218905472636896 + - type: precision_at_10 + value: 6.828358208955276 + - type: precision_at_100 + value: 1.1318407960198864 + - type: precision_at_1000 + value: 0.14776119402984852 + - type: recall_at_1 + value: 20.137260127931768 + - type: recall_at_3 + value: 35.516761430940534 + - type: recall_at_5 + value: 41.81044183842692 + - type: recall_at_10 + value: 49.84812658320122 + - type: recall_at_100 + value: 75.52224965471233 + - type: recall_at_1000 + value: 93.00114617278797 + - task: + type: Retrieval + dataset: + type: BeIR/cqadupstack + name: MTEB CQADupstackGamingRetrieval + config: default + split: test + revision: None + metrics: + - type: ndcg_at_1 + value: 52.10031347962383 + - type: ndcg_at_3 + value: 59.09283306711919 + - type: ndcg_at_5 + value: 61.70364710499664 + - type: ndcg_at_10 + value: 64.43508234673456 + - type: ndcg_at_100 + value: 68.08258162359128 + - type: ndcg_at_1000 + value: 68.78220525177915 + - type: map_at_1 + value: 45.67593534991653 + - type: map_at_3 + value: 55.17968153498597 + - type: map_at_5 + value: 57.073161405223026 + - type: map_at_10 + value: 58.55427425972989 + - type: map_at_100 + value: 59.58877825514076 + - type: map_at_1000 + value: 59.62753156251917 + - type: precision_at_1 + value: 52.10031347962383 + - type: precision_at_3 + value: 25.95611285266423 + - type: precision_at_5 + value: 17.667711598745708 + - type: precision_at_10 + value: 10.169278996864973 + - type: precision_at_100 + value: 1.2852664576802733 + - type: precision_at_1000 + value: 0.13786833855798794 + - type: recall_at_1 + value: 45.67593534991653 + - type: recall_at_3 + value: 63.87786043907147 + - type: recall_at_5 + value: 70.25761057674107 + - type: recall_at_10 + value: 77.97283230161469 + - type: recall_at_100 + value: 93.12900411473255 + - type: recall_at_1000 + value: 97.98040752351098 + - task: + type: Retrieval + dataset: + type: BeIR/cqadupstack + name: MTEB CQADupstackGisRetrieval + config: default + split: test + revision: None + metrics: + - type: ndcg_at_1 + value: 29.830508474576273 + - type: ndcg_at_3 + value: 36.43753958419226 + - type: ndcg_at_5 + value: 39.55362935996899 + - type: ndcg_at_10 + value: 43.11482816486947 + - type: ndcg_at_100 + value: 48.55701741086406 + - type: ndcg_at_1000 + value: 50.12437449225312 + - type: map_at_1 + value: 27.58676351896691 + - type: map_at_3 + value: 33.9831853645413 + - type: map_at_5 + value: 35.81743341404356 + - type: map_at_10 + value: 37.38087764923922 + - type: map_at_100 + value: 38.54334689204219 + - type: map_at_1000 + value: 38.60999368829795 + - type: precision_at_1 + value: 29.830508474576273 + - type: precision_at_3 + value: 15.21657250470804 + - type: precision_at_5 + value: 10.960451977401222 + - type: precision_at_10 + value: 6.779661016949213 + - type: precision_at_100 + value: 0.9977401129943356 + - type: precision_at_1000 + value: 0.11661016949152515 + - type: recall_at_1 + value: 27.58676351896691 + - type: 
recall_at_3 + value: 41.050040355125105 + - type: recall_at_5 + value: 48.356201237557165 + - type: recall_at_10 + value: 58.86871132633844 + - type: recall_at_100 + value: 83.44115081403217 + - type: recall_at_1000 + value: 95.14032985219426 + - task: + type: Retrieval + dataset: + type: BeIR/cqadupstack + name: MTEB CQADupstackUnixRetrieval + config: default + split: test + revision: None + metrics: + - type: ndcg_at_1 + value: 37.77985074626866 + - type: ndcg_at_3 + value: 42.68906535122145 + - type: ndcg_at_5 + value: 45.42572671347988 + - type: ndcg_at_10 + value: 48.503281334563006 + - type: ndcg_at_100 + value: 53.90759554634032 + - type: ndcg_at_1000 + value: 55.6750143459022 + - type: map_at_1 + value: 32.05179459843639 + - type: map_at_3 + value: 39.174397111663886 + - type: map_at_5 + value: 41.09602758395897 + - type: map_at_10 + value: 42.57548284992813 + - type: map_at_100 + value: 43.88590856115191 + - type: map_at_1000 + value: 43.97573928697477 + - type: precision_at_1 + value: 37.77985074626866 + - type: precision_at_3 + value: 19.40298507462699 + - type: precision_at_5 + value: 13.768656716417915 + - type: precision_at_10 + value: 8.330223880596947 + - type: precision_at_100 + value: 1.2266791044775944 + - type: precision_at_1000 + value: 0.14860074626865238 + - type: recall_at_1 + value: 32.05179459843639 + - type: recall_at_3 + value: 46.19290082326463 + - type: recall_at_5 + value: 53.065248391740916 + - type: recall_at_10 + value: 61.95742612487016 + - type: recall_at_100 + value: 84.95720140659506 + - type: recall_at_1000 + value: 96.7945875641771 + - task: + type: Retrieval + dataset: + type: BeIR/cqadupstack + name: MTEB CQADupstackTexRetrieval + config: default + split: test + revision: None + metrics: + - type: ndcg_at_1 + value: 24.36338609772884 + - type: ndcg_at_3 + value: 29.344263505458546 + - type: ndcg_at_5 + value: 31.648927411353355 + - type: ndcg_at_10 + value: 34.37718834167528 + - type: ndcg_at_100 + value: 39.988489670143565 + - type: ndcg_at_1000 + value: 42.59253219178224 + - type: map_at_1 + value: 20.102111701568827 + - type: map_at_3 + value: 26.034827870203504 + - type: map_at_5 + value: 27.635335063884625 + - type: map_at_10 + value: 28.955304300478456 + - type: map_at_100 + value: 30.17348927054766 + - type: map_at_1000 + value: 30.29821812881463 + - type: precision_at_1 + value: 24.36338609772884 + - type: precision_at_3 + value: 13.971094287680497 + - type: precision_at_5 + value: 10.178940123881386 + - type: precision_at_10 + value: 6.362697866482958 + - type: precision_at_100 + value: 1.0784583620096873 + - type: precision_at_1000 + value: 0.14810736407432443 + - type: recall_at_1 + value: 20.102111701568827 + - type: recall_at_3 + value: 32.51720798237882 + - type: recall_at_5 + value: 38.47052010632308 + - type: recall_at_10 + value: 46.560251311326375 + - type: recall_at_100 + value: 71.37281646052087 + - type: recall_at_1000 + value: 89.54176274473149 + - task: + type: Retrieval + dataset: + type: BeIR/cqadupstack + name: MTEB CQADupstackAndroidRetrieval + config: default + split: test + revision: None + metrics: + - type: ndcg_at_1 + value: 44.34907010014306 + - type: ndcg_at_3 + value: 50.3866971503038 + - type: ndcg_at_5 + value: 53.15366139760711 + - type: ndcg_at_10 + value: 56.56459368482132 + - type: ndcg_at_100 + value: 61.49499162448754 + - type: ndcg_at_1000 + value: 62.750952246569824 + - type: map_at_1 + value: 35.87684730898816 + - type: map_at_3 + value: 44.81019864282626 + - type: map_at_5 + value: 47.24254516428158 + 
- type: map_at_10 + value: 49.28704567095768 + - type: map_at_100 + value: 50.85906250580416 + - type: map_at_1000 + value: 50.96818352379094 + - type: precision_at_1 + value: 44.34907010014306 + - type: precision_at_3 + value: 24.463519313304776 + - type: precision_at_5 + value: 17.68240343347653 + - type: precision_at_10 + value: 11.173104434906978 + - type: precision_at_100 + value: 1.7095851216022702 + - type: precision_at_1000 + value: 0.21087267525035264 + - type: recall_at_1 + value: 35.87684730898816 + - type: recall_at_3 + value: 52.8360317975774 + - type: recall_at_5 + value: 60.826717819116716 + - type: recall_at_10 + value: 70.64783984145798 + - type: recall_at_100 + value: 90.90247835876467 + - type: recall_at_1000 + value: 98.27352916110131 + - task: + type: Retrieval + dataset: + type: BeIR/cqadupstack + name: MTEB CQADupstackRetrieval + config: default + split: test + revision: None + metrics: + - type: ndcg_at_1 + value: 36.038578730542476 + - type: ndcg_at_3 + value: 41.931365356453036 + - type: ndcg_at_5 + value: 44.479015523894994 + - type: ndcg_at_10 + value: 47.308084499970704 + - type: ndcg_at_100 + value: 52.498062430513606 + - type: ndcg_at_1000 + value: 54.2908789514719 + - type: map_at_1 + value: 30.38821701528966 + - type: map_at_3 + value: 37.974871761903636 + - type: map_at_5 + value: 39.85399878507757 + - type: map_at_10 + value: 41.31456611036795 + - type: map_at_100 + value: 42.62907836655835 + - type: map_at_1000 + value: 42.737235870659845 + - type: precision_at_1 + value: 36.038578730542476 + - type: precision_at_3 + value: 19.39960180094633 + - type: precision_at_5 + value: 13.79264655952497 + - type: precision_at_10 + value: 8.399223517333388 + - type: precision_at_100 + value: 1.2992373779520896 + - type: precision_at_1000 + value: 0.16327170951909567 + - type: recall_at_1 + value: 30.38821701528966 + - type: recall_at_3 + value: 45.51645512564165 + - type: recall_at_5 + value: 52.06077167834868 + - type: recall_at_10 + value: 60.38864106788279 + - type: recall_at_100 + value: 82.76968509918343 + - type: recall_at_1000 + value: 94.84170217080344 +--- + + +
+Model List | FAQ | Usage | Evaluation | Train | Contact | Citation | License
+
+For more details, please refer to our GitHub repository: [FlagEmbedding](https://github.com/FlagOpen/FlagEmbedding).
+
+If you are looking for a model with strong semantic representation capabilities, consider **BGE-EN-Mistral**. It combines in-context learning with the strengths of large language models and dense retrieval, achieving outstanding results.
+
+**BGE-EN-Mistral** primarily demonstrates the following capabilities:
+- In-context learning ability: providing a few task-specific examples alongside the query can significantly improve the model's performance on new tasks (see the sketch after this list).
+- Outstanding performance: The model has achieved state-of-the-art (SOTA) performance on both BEIR and AIR-Bench.
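+
+Below is a minimal, hypothetical sketch of how the in-context learning capability might be used. It assumes the model can be loaded with Hugging Face `transformers`, that embeddings are taken from the last non-padding token's hidden state (a common choice for Mistral-based retrievers), and that few-shot examples are simply prepended to the instructed query. The repository id `BAAI/bge-en-mistral`, the prompt format, and the pooling strategy are illustrative assumptions; refer to the forthcoming technical report for the official usage.
+
+```python
+# Hypothetical sketch only: the repo id, prompt format, and last-token pooling
+# below are assumptions, not the officially documented usage.
+import torch
+import torch.nn.functional as F
+from transformers import AutoModel, AutoTokenizer
+
+model_id = "BAAI/bge-en-mistral"  # placeholder checkpoint name
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+tokenizer.padding_side = "right"
+model = AutoModel.from_pretrained(model_id).eval()
+
+def last_token_pool(hidden_states, attention_mask):
+    # Use the hidden state of the last non-padding token as the text embedding.
+    last_idx = attention_mask.sum(dim=1) - 1
+    return hidden_states[torch.arange(hidden_states.size(0)), last_idx]
+
+# Few-shot examples are prepended to the query to trigger in-context learning.
+few_shot = (
+    "Instruct: Given a web search query, retrieve relevant passages.\n"
+    "Query: what is a cross-encoder reranker\n"
+    "Response: A cross-encoder jointly encodes a query-passage pair to score relevance.\n\n"
+)
+query = few_shot + "Instruct: Given a web search query, retrieve relevant passages.\nQuery: how can an LLM read longer contexts"
+passages = [
+    "Activation Beacon condenses activations so an LLM can perceive longer contexts.",
+    "C-MTEB is a benchmark for Chinese text embeddings.",
+]
+
+with torch.no_grad():
+    batch = tokenizer([query] + passages, padding=True, truncation=True,
+                      max_length=512, return_tensors="pt")
+    hidden = model(**batch).last_hidden_state
+    emb = F.normalize(last_token_pool(hidden, batch["attention_mask"]), dim=-1)
+
+scores = emb[0] @ emb[1:].T  # cosine similarities: query vs. candidate passages
+print(scores.tolist())
+```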
+
+We will soon release a technical report on **BGE-EN-Mistral** with more details.
+
+[English](README.md) | [中文](https://github.com/FlagOpen/FlagEmbedding/blob/master/README_zh.md)
+
+FlagEmbedding focuses on retrieval-augmented LLMs and currently consists of the following projects:
+
+- **LLM-based Dense Retrieval**: BGE-EN-Mistral, BGE-Multilingual-Gemma2
+- **Long-Context LLM**: [Activation Beacon](https://github.com/FlagOpen/FlagEmbedding/tree/master/Long_LLM/activation_beacon)
+- **Fine-tuning of LM**: [LM-Cocktail](https://github.com/FlagOpen/FlagEmbedding/tree/master/LM_Cocktail)
+- **Dense Retrieval**: [BGE-M3](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/BGE_M3), [LLM Embedder](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/llm_embedder), [BGE Embedding](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/baai_general_embedding)
+- **Reranker Model**: [BGE Reranker](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/reranker)
+- **Benchmark**: [C-MTEB](https://github.com/FlagOpen/FlagEmbedding/tree/master/C_MTEB)
+
+## News
+- 7/26/2024: Release **BGE-EN-Mistral**, a Mistral-7B-based dense retriever. By integrating in-context learning abilities into the embedding model, it achieves new state-of-the-art results on both MTEB and AIR-Bench.
+- 1/30/2024: Release **BGE-M3**, a new member of the BGE model series! M3 stands for **M**ulti-linguality (100+ languages), **M**ulti-granularity (input length up to 8192), and **M**ulti-functionality (unification of dense, lexical, and multi-vector/ColBERT retrieval).
+It is the first embedding model that supports all three retrieval methods, achieving new SOTA on multi-lingual (MIRACL) and cross-lingual (MKQA) benchmarks.
+[Technical Report](https://github.com/FlagOpen/FlagEmbedding/blob/master/FlagEmbedding/BGE_M3/BGE_M3.pdf) and [Code](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/BGE_M3). :fire:
+- 1/9/2024: Release [Activation-Beacon](https://github.com/FlagOpen/FlagEmbedding/tree/master/Long_LLM/activation_beacon), an effective, efficient, compatible, and low-cost (training) method to extend the context length of LLM. [Technical Report](https://arxiv.org/abs/2401.03462) :fire:
+- 12/24/2023: Release **LLaRA**, a LLaMA-7B-based dense retriever, achieving state-of-the-art performance on MS MARCO and BEIR. The model and code will be open-sourced; please stay tuned. [Technical Report](https://arxiv.org/abs/2312.15503) :fire:
+- 11/23/2023: Release [LM-Cocktail](https://github.com/FlagOpen/FlagEmbedding/tree/master/LM_Cocktail), a method to maintain general capabilities during fine-tuning by merging multiple language models. [Technical Report](https://arxiv.org/abs/2311.13534) :fire:
+- 10/12/2023: Release [LLM-Embedder](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/llm_embedder), a unified embedding model to support diverse retrieval augmentation needs for LLMs. [Technical Report](https://arxiv.org/pdf/2310.07554.pdf)
+- 09/15/2023: The [technical report](https://arxiv.org/pdf/2309.07597.pdf) and [massive training data](https://data.baai.ac.cn/details/BAAI-MTP) of BGE have been released.
+- 09/12/2023: New models:
+ - **New reranker models**: release the cross-encoder models `BAAI/bge-reranker-base` and `BAAI/bge-reranker-large`, which are more powerful than the embedding models. We recommend using or fine-tuning them to re-rank the top-k documents returned by embedding models (see the re-ranking sketch after this list).
+ - **Updated embedding models**: release the `bge-*-v1.5` embedding models to alleviate the issue of the similarity distribution and enhance retrieval ability without instructions.
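+
+As a rough illustration of the recommended retrieve-then-rerank workflow, the sketch below scores query-passage pairs with the `BAAI/bge-reranker-large` cross-encoder via Hugging Face `transformers` and re-orders the candidates. The query and passages are made up for illustration; in practice the candidates would come from an embedding model such as BGE.
+
+```python
+# Illustrative sketch: re-rank top-k candidates from an embedding retriever
+# with the BAAI/bge-reranker-large cross-encoder (example texts are made up).
+import torch
+from transformers import AutoModelForSequenceClassification, AutoTokenizer
+
+reranker_id = "BAAI/bge-reranker-large"
+tokenizer = AutoTokenizer.from_pretrained(reranker_id)
+reranker = AutoModelForSequenceClassification.from_pretrained(reranker_id).eval()
+
+query = "how to extend the context length of an LLM"
+top_k_passages = [
+    "Activation Beacon condenses raw activations so the LLM can perceive longer contexts.",
+    "BGE-M3 unifies dense, lexical, and multi-vector retrieval in a single model.",
+]
+
+with torch.no_grad():
+    inputs = tokenizer([query] * len(top_k_passages), top_k_passages,
+                       padding=True, truncation=True, max_length=512, return_tensors="pt")
+    scores = reranker(**inputs).logits.view(-1).float()
+
+# Higher score = more relevant; sort the candidates accordingly.
+reranked = [p for _, p in sorted(zip(scores.tolist(), top_k_passages), reverse=True)]
+print(reranked)
+```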
+
+<details>
+  <summary>More</summary>
+
+- 09/07/2023: Update the [fine-tuning code](https://github.com/FlagOpen/FlagEmbedding/blob/master/FlagEmbedding/baai_general_embedding/README.md): add a script to mine hard negatives and support adding instructions during fine-tuning.
+- 08/09/2023: BGE models are integrated into **LangChain**; you can use them like [this](#using-langchain). The C-MTEB **leaderboard** is [available](https://huggingface.co/spaces/mteb/leaderboard).
+- 08/05/2023: Release base-scale and small-scale models with the **best performance among models of the same size 🤗**
+- 08/02/2023: Release the `bge-large-*` models (BGE is short for BAAI General Embedding), which **rank 1st on the MTEB and C-MTEB benchmarks!** :tada: :tada:
+- 08/01/2023: We release the [Chinese Massive Text Embedding Benchmark](https://github.com/FlagOpen/FlagEmbedding/blob/master/C_MTEB) (**C-MTEB**), consisting of 31 test datasets.
+
+</details>
+