---
model-index:
- name: karsar/paraphrase-multilingual-MiniLM-L12-hu-v2
  results:
  - dataset:
      config: hun_Latn-hun_Latn
      name: MTEB BelebeleRetrieval (hun_Latn-hun_Latn)
      revision: 75b399394a9803252cfec289d103de462763db7c
      split: test
      type: facebook/belebele
    metrics:
    - type: main_score
      value: 80.204
    - type: map_at_1
      value: 69.111
    - type: map_at_10
      value: 76.773
    - type: map_at_100
      value: 77.169
    - type: map_at_1000
      value: 77.173
    - type: map_at_20
      value: 77.033
    - type: map_at_3
      value: 75.333
    - type: map_at_5
      value: 76.19399999999999
    - type: mrr_at_1
      value: 69.11111111111111
    - type: mrr_at_10
      value: 76.77345679012352
    - type: mrr_at_100
      value: 77.16929744881674
    - type: mrr_at_1000
      value: 77.17269244765126
    - type: mrr_at_20
      value: 77.03286768605402
    - type: mrr_at_3
      value: 75.33333333333334
    - type: mrr_at_5
      value: 76.19444444444449
    - type: nauc_map_at_1000_diff1
      value: 80.43265248651925
    - type: nauc_map_at_1000_max
      value: 71.870230668987
    - type: nauc_map_at_1000_std
      value: -3.0084092423300604
    - type: nauc_map_at_100_diff1
      value: 80.42911054177607
    - type: nauc_map_at_100_max
      value: 71.86888714337594
    - type: nauc_map_at_100_std
      value: -3.0086379837670716
    - type: nauc_map_at_10_diff1
      value: 80.36522921472617
    - type: nauc_map_at_10_max
      value: 71.97959119190223
    - type: nauc_map_at_10_std
      value: -2.7429598598137104
    - type: nauc_map_at_1_diff1
      value: 83.07496179427446
    - type: nauc_map_at_1_max
      value: 70.1472835630915
    - type: nauc_map_at_1_std
      value: -4.892100745257178
    - type: nauc_map_at_20_diff1
      value: 80.4010557171958
    - type: nauc_map_at_20_max
      value: 71.9262402987486
    - type: nauc_map_at_20_std
      value: -2.855142719268829
    - type: nauc_map_at_3_diff1
      value: 80.21618957663902
    - type: nauc_map_at_3_max
      value: 72.32078865805673
    - type: nauc_map_at_3_std
      value: -3.307117509227628
    - type: nauc_map_at_5_diff1
      value: 80.25726339569668
    - type: nauc_map_at_5_max
      value: 71.96694381406756
    - type: nauc_map_at_5_std
      value: -2.835991564758579
    - type: nauc_mrr_at_1000_diff1
      value: 80.43265248651925
    - type: nauc_mrr_at_1000_max
      value: 71.870230668987
    - type: nauc_mrr_at_1000_std
      value: -3.0084092423300604
    - type: nauc_mrr_at_100_diff1
      value: 80.42911054177607
    - type: nauc_mrr_at_100_max
      value: 71.86888714337594
    - type: nauc_mrr_at_100_std
      value: -3.0086379837670716
    - type: nauc_mrr_at_10_diff1
      value: 80.36522921472617
    - type: nauc_mrr_at_10_max
      value: 71.97959119190223
    - type: nauc_mrr_at_10_std
      value: -2.7429598598137104
    - type: nauc_mrr_at_1_diff1
      value: 83.07496179427446
    - type: nauc_mrr_at_1_max
      value: 70.1472835630915
    - type: nauc_mrr_at_1_std
      value: -4.892100745257178
    - type: nauc_mrr_at_20_diff1
      value: 80.4010557171958
    - type: nauc_mrr_at_20_max
      value: 71.9262402987486
    - type: nauc_mrr_at_20_std
      value: -2.855142719268829
    - type: nauc_mrr_at_3_diff1
      value: 80.21618957663902
    - type: nauc_mrr_at_3_max
      value: 72.32078865805673
    - type: nauc_mrr_at_3_std
      value: -3.307117509227628
    - type: nauc_mrr_at_5_diff1
      value: 80.25726339569668
    - type: nauc_mrr_at_5_max
      value: 71.96694381406756
    - type: nauc_mrr_at_5_std
      value: -2.835991564758579
    - type: nauc_ndcg_at_1000_diff1
      value: 79.98494037296896
    - type: nauc_ndcg_at_1000_max
      value: 72.09578054274171
    - type: nauc_ndcg_at_1000_std
      value: -2.480464992138408
    - type: nauc_ndcg_at_100_diff1
      value: 79.80423727797705
    - type: nauc_ndcg_at_100_max
      value: 72.0536867142539
    - type: nauc_ndcg_at_100_std
      value: -2.344303480460221
    - type: nauc_ndcg_at_10_diff1
      value: 79.4824234416871
    - type: nauc_ndcg_at_10_max
      value: 72.68066855765318
    - type: nauc_ndcg_at_10_std
      value: -1.0802283735752285
    - type: nauc_ndcg_at_1_diff1
      value: 83.07496179427446
    - type: nauc_ndcg_at_1_max
      value: 70.1472835630915
    - type: nauc_ndcg_at_1_std
      value: -4.892100745257178
    - type: nauc_ndcg_at_20_diff1
      value: 79.57286963155312
    - type: nauc_ndcg_at_20_max
      value: 72.45565146275474
    - type: nauc_ndcg_at_20_std
      value: -1.5388256709848513
    - type: nauc_ndcg_at_3_diff1
      value: 79.27965557528921
    - type: nauc_ndcg_at_3_max
      value: 73.21665805867235
    - type: nauc_ndcg_at_3_std
      value: -2.325102213384337
    - type: nauc_ndcg_at_5_diff1
      value: 79.24430450556383
    - type: nauc_ndcg_at_5_max
      value: 72.55798047361041
    - type: nauc_ndcg_at_5_std
      value: -1.346397266164686
    - type: nauc_precision_at_1000_diff1
      value: .nan
    - type: nauc_precision_at_1000_max
      value: .nan
    - type: nauc_precision_at_1000_std
      value: .nan
    - type: nauc_precision_at_100_diff1
      value: 48.65946378551628
    - type: nauc_precision_at_100_max
      value: 65.39282379618602
    - type: nauc_precision_at_100_std
      value: 24.616513271977254
    - type: nauc_precision_at_10_diff1
      value: 74.02416251053245
    - type: nauc_precision_at_10_max
      value: 77.2101523536241
    - type: nauc_precision_at_10_std
      value: 10.31518298376219
    - type: nauc_precision_at_1_diff1
      value: 83.07496179427446
    - type: nauc_precision_at_1_max
      value: 70.1472835630915
    - type: nauc_precision_at_1_std
      value: -4.892100745257178
    - type: nauc_precision_at_20_diff1
      value: 71.5107376283843
    - type: nauc_precision_at_20_max
      value: 77.21568627450955
    - type: nauc_precision_at_20_std
      value: 11.723622782445986
    - type: nauc_precision_at_3_diff1
      value: 75.76535137737912
    - type: nauc_precision_at_3_max
      value: 76.6685040168277
    - type: nauc_precision_at_3_std
      value: 1.5867736131175436
    - type: nauc_precision_at_5_diff1
      value: 74.58750059437934
    - type: nauc_precision_at_5_max
      value: 75.14502860946868
    - type: nauc_precision_at_5_std
      value: 5.935474156377063
    - type: nauc_recall_at_1000_diff1
      value: .nan
    - type: nauc_recall_at_1000_max
      value: .nan
    - type: nauc_recall_at_1000_std
      value: .nan
    - type: nauc_recall_at_100_diff1
      value: 48.6594637855143
    - type: nauc_recall_at_100_max
      value: 65.39282379618471
    - type: nauc_recall_at_100_std
      value: 24.616513271975943
    - type: nauc_recall_at_10_diff1
      value: 74.0241625105327
    - type: nauc_recall_at_10_max
      value: 77.21015235362442
    - type: nauc_recall_at_10_std
      value: 10.31518298376255
    - type: nauc_recall_at_1_diff1
      value: 83.07496179427446
    - type: nauc_recall_at_1_max
      value: 70.1472835630915
    - type: nauc_recall_at_1_std
      value: -4.892100745257178
    - type: nauc_recall_at_20_diff1
      value: 71.51073762838479
    - type: nauc_recall_at_20_max
      value: 77.21568627450962
    - type: nauc_recall_at_20_std
      value: 11.72362278244639
    - type: nauc_recall_at_3_diff1
      value: 75.7653513773791
    - type: nauc_recall_at_3_max
      value: 76.66850401682761
    - type: nauc_recall_at_3_std
      value: 1.5867736131174044
    - type: nauc_recall_at_5_diff1
      value: 74.58750059437952
    - type: nauc_recall_at_5_max
      value: 75.1450286094688
    - type: nauc_recall_at_5_std
      value: 5.935474156377355
    - type: ndcg_at_1
      value: 69.111
    - type: ndcg_at_10
      value: 80.204
    - type: ndcg_at_100
      value: 82.03399999999999
    - type: ndcg_at_1000
      value: 82.132
    - type: ndcg_at_20
      value: 81.119
    - type: ndcg_at_3
      value: 77.227
    - type: ndcg_at_5
      value: 78.781
    - type: precision_at_1
      value: 69.111
    - type: precision_at_10
      value: 9.089
    - type: precision_at_100
      value: 0.992
    - type: precision_at_1000
      value: 0.1
    - type: precision_at_20
      value: 4.7219999999999995
    - type: precision_at_3
      value: 27.556000000000004
    - type: precision_at_5
      value: 17.288999999999998
    - type: recall_at_1
      value: 69.111
    - type: recall_at_10
      value: 90.889
    - type: recall_at_100
      value: 99.222
    - type: recall_at_1000
      value: 100.0
    - type: recall_at_20
      value: 94.44399999999999
    - type: recall_at_3
      value: 82.667
    - type: recall_at_5
      value: 86.444
    task:
      type: Retrieval
  - dataset:
      config: hun_Latn-eng_Latn
      name: MTEB BelebeleRetrieval (hun_Latn-eng_Latn)
      revision: 75b399394a9803252cfec289d103de462763db7c
      split: test
      type: facebook/belebele
    metrics:
    - type: main_score
      value: 75.395
    - type: map_at_1
      value: 62.666999999999994
    - type: map_at_10
      value: 71.30300000000001
    - type: map_at_100
      value: 71.774
    - type: map_at_1000
      value: 71.782
    - type: map_at_20
      value: 71.584
    - type: map_at_3
      value: 69.352
    - type: map_at_5
      value: 70.53
    - type: mrr_at_1
      value: 62.66666666666667
    - type: mrr_at_10
      value: 71.3027777777778
    - type: mrr_at_100
      value: 71.77425164712943
    - type: mrr_at_1000
      value: 71.78156792911966
    - type: mrr_at_20
      value: 71.58381913064578
    - type: mrr_at_3
      value: 69.35185185185185
    - type: mrr_at_5
      value: 70.52962962962965
    - type: nauc_map_at_1000_diff1
      value: 75.881602960504
    - type: nauc_map_at_1000_max
      value: 68.66296274339753
    - type: nauc_map_at_1000_std
      value: 7.517075184571474
    - type: nauc_map_at_100_diff1
      value: 75.8786843747508
    - type: nauc_map_at_100_max
      value: 68.66828124033619
    - type: nauc_map_at_100_std
      value: 7.525587871036576
    - type: nauc_map_at_10_diff1
      value: 75.5973833371205
    - type: nauc_map_at_10_max
      value: 68.65021557056664
    - type: nauc_map_at_10_std
      value: 7.562660323790659
    - type: nauc_map_at_1_diff1
      value: 79.25984371863814
    - type: nauc_map_at_1_max
      value: 66.56457853173036
    - type: nauc_map_at_1_std
      value: 3.9501186990857162
    - type: nauc_map_at_20_diff1
      value: 75.85810159356491
    - type: nauc_map_at_20_max
      value: 68.76976086961005
    - type: nauc_map_at_20_std
      value: 7.819971956110064
    - type: nauc_map_at_3_diff1
      value: 75.89565847594535
    - type: nauc_map_at_3_max
      value: 68.86426509148927
    - type: nauc_map_at_3_std
      value: 6.916006381683043
    - type: nauc_map_at_5_diff1
      value: 75.61832788795184
    - type: nauc_map_at_5_max
      value: 68.66734871116772
    - type: nauc_map_at_5_std
      value: 7.108445006055354
    - type: nauc_mrr_at_1000_diff1
      value: 75.881602960504
    - type: nauc_mrr_at_1000_max
      value: 68.66296274339753
    - type: nauc_mrr_at_1000_std
      value: 7.517075184571474
    - type: nauc_mrr_at_100_diff1
      value: 75.8786843747508
    - type: nauc_mrr_at_100_max
      value: 68.66828124033619
    - type: nauc_mrr_at_100_std
      value: 7.525587871036576
    - type: nauc_mrr_at_10_diff1
      value: 75.5973833371205
    - type: nauc_mrr_at_10_max
      value: 68.65021557056664
    - type: nauc_mrr_at_10_std
      value: 7.562660323790659
    - type: nauc_mrr_at_1_diff1
      value: 79.25984371863814
    - type: nauc_mrr_at_1_max
      value: 66.56457853173036
    - type: nauc_mrr_at_1_std
      value: 3.9501186990857162
    - type: nauc_mrr_at_20_diff1
      value: 75.85810159356491
    - type: nauc_mrr_at_20_max
      value: 68.76976086961005
    - type: nauc_mrr_at_20_std
      value: 7.819971956110064
    - type: nauc_mrr_at_3_diff1
      value: 75.89565847594535
    - type: nauc_mrr_at_3_max
      value: 68.86426509148927
    - type: nauc_mrr_at_3_std
      value: 6.916006381683043
    - type: nauc_mrr_at_5_diff1
      value: 75.61832788795184
    - type: nauc_mrr_at_5_max
      value: 68.66734871116772
    - type: nauc_mrr_at_5_std
      value: 7.108445006055354
    - type: nauc_ndcg_at_1000_diff1
      value: 75.2994418691362
    - type: nauc_ndcg_at_1000_max
      value: 69.06426768849241
    - type: nauc_ndcg_at_1000_std
      value: 8.535785357759078
    - type: nauc_ndcg_at_100_diff1
      value: 75.24120510322648
    - type: nauc_ndcg_at_100_max
      value: 69.20598137031494
    - type: nauc_ndcg_at_100_std
      value: 8.809082971368174
    - type: nauc_ndcg_at_10_diff1
      value: 73.85929786184265
    - type: nauc_ndcg_at_10_max
      value: 69.35906735202224
    - type: nauc_ndcg_at_10_std
      value: 9.803390649271314
    - type: nauc_ndcg_at_1_diff1
      value: 79.25984371863814
    - type: nauc_ndcg_at_1_max
      value: 66.56457853173036
    - type: nauc_ndcg_at_1_std
      value: 3.9501186990857162
    - type: nauc_ndcg_at_20_diff1
      value: 74.908346673254
    - type: nauc_ndcg_at_20_max
      value: 69.94089128246969
    - type: nauc_ndcg_at_20_std
      value: 11.040261082698441
    - type: nauc_ndcg_at_3_diff1
      value: 74.63723173221176
    - type: nauc_ndcg_at_3_max
      value: 69.66882097579499
    - type: nauc_ndcg_at_3_std
      value: 8.070938288986905
    - type: nauc_ndcg_at_5_diff1
      value: 74.03823148610475
    - type: nauc_ndcg_at_5_max
      value: 69.35847081273427
    - type: nauc_ndcg_at_5_std
      value: 8.544629619697409
    - type: nauc_precision_at_1000_diff1
      value: .nan
    - type: nauc_precision_at_1000_max
      value: .nan
    - type: nauc_precision_at_1000_std
      value: .nan
    - type: nauc_precision_at_100_diff1
      value: 68.52007469654558
    - type: nauc_precision_at_100_max
      value: 86.99813258636793
    - type: nauc_precision_at_100_std
      value: 44.83193277310911
    - type: nauc_precision_at_10_diff1
      value: 63.121147314127334
    - type: nauc_precision_at_10_max
      value: 73.48453534137256
    - type: nauc_precision_at_10_std
      value: 24.29146523372198
    - type: nauc_precision_at_1_diff1
      value: 79.25984371863814
    - type: nauc_precision_at_1_max
      value: 66.56457853173036
    - type: nauc_precision_at_1_std
      value: 3.9501186990857162
    - type: nauc_precision_at_20_diff1
      value: 68.71215152727741
    - type: nauc_precision_at_20_max
      value: 81.60730959050274
    - type: nauc_precision_at_20_std
      value: 44.10897692410284
    - type: nauc_precision_at_3_diff1
      value: 70.11884635511014
    - type: nauc_precision_at_3_max
      value: 72.53602624481753
    - type: nauc_precision_at_3_std
      value: 12.235337922151093
    - type: nauc_precision_at_5_diff1
      value: 67.13640746909785
    - type: nauc_precision_at_5_max
      value: 72.25763628849465
    - type: nauc_precision_at_5_std
      value: 14.868376560758348
    - type: nauc_recall_at_1000_diff1
      value: .nan
    - type: nauc_recall_at_1000_max
      value: .nan
    - type: nauc_recall_at_1000_std
      value: .nan
    - type: nauc_recall_at_100_diff1
      value: 68.52007469654454
    - type: nauc_recall_at_100_max
      value: 86.99813258636755
    - type: nauc_recall_at_100_std
      value: 44.83193277310874
    - type: nauc_recall_at_10_diff1
      value: 63.121147314127604
    - type: nauc_recall_at_10_max
      value: 73.48453534137263
    - type: nauc_recall_at_10_std
      value: 24.291465233722057
    - type: nauc_recall_at_1_diff1
      value: 79.25984371863814
    - type: nauc_recall_at_1_max
      value: 66.56457853173036
    - type: nauc_recall_at_1_std
      value: 3.9501186990857162
    - type: nauc_recall_at_20_diff1
      value: 68.71215152727751
    - type: nauc_recall_at_20_max
      value: 81.60730959050267
    - type: nauc_recall_at_20_std
      value: 44.10897692410295
    - type: nauc_recall_at_3_diff1
      value: 70.11884635511011
    - type: nauc_recall_at_3_max
      value: 72.5360262448175
    - type: nauc_recall_at_3_std
      value: 12.23533792215117
    - type: nauc_recall_at_5_diff1
      value: 67.13640746909788
    - type: nauc_recall_at_5_max
      value: 72.25763628849458
    - type: nauc_recall_at_5_std
      value: 14.868376560758476
    - type: ndcg_at_1
      value: 62.666999999999994
    - type: ndcg_at_10
      value: 75.395
    - type: ndcg_at_100
      value: 77.684
    - type: ndcg_at_1000
      value: 77.836
    - type: ndcg_at_20
      value: 76.41
    - type: ndcg_at_3
      value: 71.411
    - type: ndcg_at_5
      value: 73.52499999999999
    - type: precision_at_1
      value: 62.666999999999994
    - type: precision_at_10
      value: 8.822000000000001
    - type: precision_at_100
      value: 0.989
    - type: precision_at_1000
      value: 0.1
    - type: precision_at_20
      value: 4.611
    - type: precision_at_3
      value: 25.778000000000002
    - type: precision_at_5
      value: 16.489
    - type: recall_at_1
      value: 62.666999999999994
    - type: recall_at_10
      value: 88.222
    - type: recall_at_100
      value: 98.88900000000001
    - type: recall_at_1000
      value: 100.0
    - type: recall_at_20
      value: 92.22200000000001
    - type: recall_at_3
      value: 77.333
    - type: recall_at_5
      value: 82.44399999999999
    task:
      type: Retrieval
  - dataset:
      config: eng_Latn-hun_Latn
      name: MTEB BelebeleRetrieval (eng_Latn-hun_Latn)
      revision: 75b399394a9803252cfec289d103de462763db7c
      split: test
      type: facebook/belebele
    metrics:
    - type: main_score
      value: 76.872
    - type: map_at_1
      value: 65.0
    - type: map_at_10
      value: 72.896
    - type: map_at_100
      value: 73.358
    - type: map_at_1000
      value: 73.36500000000001
    - type: map_at_20
      value: 73.2
    - type: map_at_3
      value: 70.907
    - type: map_at_5
      value: 72.002
    - type: mrr_at_1
      value: 65.0
    - type: mrr_at_10
      value: 72.89603174603175
    - type: mrr_at_100
      value: 73.3579205518051
    - type: mrr_at_1000
      value: 73.3654112460061
    - type: mrr_at_20
      value: 73.19952956877624
    - type: mrr_at_3
      value: 70.90740740740742
    - type: mrr_at_5
      value: 72.00185185185185
    - type: nauc_map_at_1000_diff1
      value: 77.77369357560062
    - type: nauc_map_at_1000_max
      value: 70.94830494912476
    - type: nauc_map_at_1000_std
      value: 6.522974403262641
    - type: nauc_map_at_100_diff1
      value: 77.77362905601957
    - type: nauc_map_at_100_max
      value: 70.95095989526841
    - type: nauc_map_at_100_std
      value: 6.5352551972569435
    - type: nauc_map_at_10_diff1
      value: 77.5247904322094
    - type: nauc_map_at_10_max
      value: 71.02603340796348
    - type: nauc_map_at_10_std
      value: 6.757278192437519
    - type: nauc_map_at_1_diff1
      value: 80.6553286136653
    - type: nauc_map_at_1_max
      value: 68.35724812614716
    - type: nauc_map_at_1_std
      value: 4.038661494923131
    - type: nauc_map_at_20_diff1
      value: 77.68304529150893
    - type: nauc_map_at_20_max
      value: 70.95365196926124
    - type: nauc_map_at_20_std
      value: 6.459235608230074
    - type: nauc_map_at_3_diff1
      value: 77.65108311925263
    - type: nauc_map_at_3_max
      value: 71.27300229679268
    - type: nauc_map_at_3_std
      value: 6.421413249698873
    - type: nauc_map_at_5_diff1
      value: 77.62058612073584
    - type: nauc_map_at_5_max
      value: 71.28166308466814
    - type: nauc_map_at_5_std
      value: 7.148832281239676
    - type: nauc_mrr_at_1000_diff1
      value: 77.77369357560062
    - type: nauc_mrr_at_1000_max
      value: 70.94830494912476
    - type: nauc_mrr_at_1000_std
      value: 6.522974403262641
    - type: nauc_mrr_at_100_diff1
      value: 77.77362905601957
    - type: nauc_mrr_at_100_max
      value: 70.95095989526841
    - type: nauc_mrr_at_100_std
      value: 6.5352551972569435
    - type: nauc_mrr_at_10_diff1
      value: 77.5247904322094
    - type: nauc_mrr_at_10_max
      value: 71.02603340796348
    - type: nauc_mrr_at_10_std
      value: 6.757278192437519
    - type: nauc_mrr_at_1_diff1
      value: 80.6553286136653
    - type: nauc_mrr_at_1_max
      value: 68.35724812614716
    - type: nauc_mrr_at_1_std
      value: 4.038661494923131
    - type: nauc_mrr_at_20_diff1
      value: 77.68304529150893
    - type: nauc_mrr_at_20_max
      value: 70.95365196926124
    - type: nauc_mrr_at_20_std
      value: 6.459235608230074
    - type: nauc_mrr_at_3_diff1
      value: 77.65108311925263
    - type: nauc_mrr_at_3_max
      value: 71.27300229679268
    - type: nauc_mrr_at_3_std
      value: 6.421413249698873
    - type: nauc_mrr_at_5_diff1
      value: 77.62058612073584
    - type: nauc_mrr_at_5_max
      value: 71.28166308466814
    - type: nauc_mrr_at_5_std
      value: 7.148832281239676
    - type: nauc_ndcg_at_1000_diff1
      value: 77.32834118213609
    - type: nauc_ndcg_at_1000_max
      value: 71.28407034639005
    - type: nauc_ndcg_at_1000_std
      value: 7.054791737753761
    - type: nauc_ndcg_at_100_diff1
      value: 77.3138740535263
    - type: nauc_ndcg_at_100_max
      value: 71.38841430408482
    - type: nauc_ndcg_at_100_std
      value: 7.495181794448738
    - type: nauc_ndcg_at_10_diff1
      value: 76.09808428652988
    - type: nauc_ndcg_at_10_max
      value: 71.69225339870586
    - type: nauc_ndcg_at_10_std
      value: 8.049899262534995
    - type: nauc_ndcg_at_1_diff1
      value: 80.6553286136653
    - type: nauc_ndcg_at_1_max
      value: 68.35724812614716
    - type: nauc_ndcg_at_1_std
      value: 4.038661494923131
    - type: nauc_ndcg_at_20_diff1
      value: 76.72021561109376
    - type: nauc_ndcg_at_20_max
      value: 71.44696555289187
    - type: nauc_ndcg_at_20_std
      value: 6.921724399287313
    - type: nauc_ndcg_at_3_diff1
      value: 76.56243231944167
    - type: nauc_ndcg_at_3_max
      value: 72.19254115417164
    - type: nauc_ndcg_at_3_std
      value: 7.41142651827797
    - type: nauc_ndcg_at_5_diff1
      value: 76.42995455832103
    - type: nauc_ndcg_at_5_max
      value: 72.29448332833202
    - type: nauc_ndcg_at_5_std
      value: 8.945757639249557
    - type: nauc_precision_at_1000_diff1
      value: .nan
    - type: nauc_precision_at_1000_max
      value: .nan
    - type: nauc_precision_at_1000_std
      value: .nan
    - type: nauc_precision_at_100_diff1
      value: 75.5835667600378
    - type: nauc_precision_at_100_max
      value: 81.91840838899772
    - type: nauc_precision_at_100_std
      value: 50.65000359118282
    - type: nauc_precision_at_10_diff1
      value: 66.80290959473481
    - type: nauc_precision_at_10_max
      value: 75.3540501756643
    - type: nauc_precision_at_10_std
      value: 16.157652531050392
    - type: nauc_precision_at_1_diff1
      value: 80.6553286136653
    - type: nauc_precision_at_1_max
      value: 68.35724812614716
    - type: nauc_precision_at_1_std
      value: 4.038661494923131
    - type: nauc_precision_at_20_diff1
      value: 68.47710835746726
    - type: nauc_precision_at_20_max
      value: 74.90069474117318
    - type: nauc_precision_at_20_std
      value: 8.589311430786644
    - type: nauc_precision_at_3_diff1
      value: 72.69790694125835
    - type: nauc_precision_at_3_max
      value: 75.40287436733661
    - type: nauc_precision_at_3_std
      value: 10.97556244649672
    - type: nauc_precision_at_5_diff1
      value: 71.35323836023953
    - type: nauc_precision_at_5_max
      value: 76.56660725505522
    - type: nauc_precision_at_5_std
      value: 17.011489094336106
    - type: nauc_recall_at_1000_diff1
      value: .nan
    - type: nauc_recall_at_1000_max
      value: .nan
    - type: nauc_recall_at_1000_std
      value: .nan
    - type: nauc_recall_at_100_diff1
      value: 75.58356676003798
    - type: nauc_recall_at_100_max
      value: 81.9184083889972
    - type: nauc_recall_at_100_std
      value: 50.65000359118012
    - type: nauc_recall_at_10_diff1
      value: 66.80290959473508
    - type: nauc_recall_at_10_max
      value: 75.35405017566428
    - type: nauc_recall_at_10_std
      value: 16.157652531050555
    - type: nauc_recall_at_1_diff1
      value: 80.6553286136653
    - type: nauc_recall_at_1_max
      value: 68.35724812614716
    - type: nauc_recall_at_1_std
      value: 4.038661494923131
    - type: nauc_recall_at_20_diff1
      value: 68.47710835746722
    - type: nauc_recall_at_20_max
      value: 74.90069474117321
    - type: nauc_recall_at_20_std
      value: 8.589311430787212
    - type: nauc_recall_at_3_diff1
      value: 72.69790694125835
    - type: nauc_recall_at_3_max
      value: 75.40287436733665
    - type: nauc_recall_at_3_std
      value: 10.97556244649683
    - type: nauc_recall_at_5_diff1
      value: 71.35323836023952
    - type: nauc_recall_at_5_max
      value: 76.56660725505535
    - type: nauc_recall_at_5_std
      value: 17.011489094336287
    - type: ndcg_at_1
      value: 65.0
    - type: ndcg_at_10
      value: 76.872
    - type: ndcg_at_100
      value: 78.914
    - type: ndcg_at_1000
      value: 79.103
    - type: ndcg_at_20
      value: 77.916
    - type: ndcg_at_3
      value: 72.763
    - type: ndcg_at_5
      value: 74.733
    - type: precision_at_1
      value: 65.0
    - type: precision_at_10
      value: 8.944
    - type: precision_at_100
      value: 0.9860000000000001
    - type: precision_at_1000
      value: 0.1
    - type: precision_at_20
      value: 4.672
    - type: precision_at_3
      value: 26.037
    - type: precision_at_5
      value: 16.578
    - type: recall_at_1
      value: 65.0
    - type: recall_at_10
      value: 89.444
    - type: recall_at_100
      value: 98.556
    - type: recall_at_1000
      value: 100.0
    - type: recall_at_20
      value: 93.444
    - type: recall_at_3
      value: 78.11099999999999
    - type: recall_at_5
      value: 82.889
    task:
      type: Retrieval
  - dataset:
      config: eng_Latn-hun_Latn
      name: MTEB BibleNLPBitextMining (eng_Latn-hun_Latn)
      revision: 264a18480c529d9e922483839b4b9758e690b762
      split: train
      type: davidstap/biblenlp-corpus-mmteb
    metrics:
    - type: accuracy
      value: 90.234375
    - type: f1
      value: 87.39583333333334
    - type: main_score
      value: 87.39583333333334
    - type: precision
      value: 86.16536458333334
    - type: recall
      value: 90.234375
    task:
      type: BitextMining
  - dataset:
      config: hun_Latn-eng_Latn
      name: MTEB BibleNLPBitextMining (hun_Latn-eng_Latn)
      revision: 264a18480c529d9e922483839b4b9758e690b762
      split: train
      type: davidstap/biblenlp-corpus-mmteb
    metrics:
    - type: accuracy
      value: 94.140625
    - type: f1
      value: 92.31770833333333
    - type: main_score
      value: 92.31770833333333
    - type: precision
      value: 91.47135416666667
    - type: recall
      value: 94.140625
    task:
      type: BitextMining
  - dataset:
      config: default
      name: MTEB HunSum2AbstractiveRetrieval (default)
      revision: 24e1445c8180d937f0a16f8ae8a62e77cc952e56
      split: test
      type: SZTAKI-HLT/HunSum-2-abstractive
    metrics:
    - type: main_score
      value: 65.616
    - type: map_at_1
      value: 65.616
    - type: map_at_10
      value: 72.17
    - type: map_at_100
      value: 72.596
    - type: map_at_1000
      value: 72.615
    - type: map_at_20
      value: 72.418
    - type: map_at_3
      value: 70.596
    - type: map_at_5
      value: 71.532
    - type: mrr_at_1
      value: 65.61561561561562
    - type: mrr_at_10
      value: 72.17006689228913
    - type: mrr_at_100
      value: 72.59630726413003
    - type: mrr_at_1000
      value: 72.61533408042457
    - type: mrr_at_20
      value: 72.41848803381308
    - type: mrr_at_3
      value: 70.59559559559558
    - type: mrr_at_5
      value: 71.53153153153156
    - type: nauc_map_at_1000_diff1
      value: 82.11477551036097
    - type: nauc_map_at_1000_max
      value: 69.93216235961877
    - type: nauc_map_at_1000_std
      value: -4.901120373521347
    - type: nauc_map_at_100_diff1
      value: 82.10806987112343
    - type: nauc_map_at_100_max
      value: 69.93576246377116
    - type: nauc_map_at_100_std
      value: -4.888276482937281
    - type: nauc_map_at_10_diff1
      value: 82.04302562091283
    - type: nauc_map_at_10_max
      value: 69.98073646418275
    - type: nauc_map_at_10_std
      value: -4.939406021960221
    - type: nauc_map_at_1_diff1
      value: 85.20668775288361
    - type: nauc_map_at_1_max
      value: 69.09789335497814
    - type: nauc_map_at_1_std
      value: -6.442884331049151
    - type: nauc_map_at_20_diff1
      value: 82.0225558509114
    - type: nauc_map_at_20_max
      value: 69.91856798991726
    - type: nauc_map_at_20_std
      value: -4.865775113195285
    - type: nauc_map_at_3_diff1
      value: 82.27637852190405
    - type: nauc_map_at_3_max
      value: 70.15220150711396
    - type: nauc_map_at_3_std
      value: -5.459983423558685
    - type: nauc_map_at_5_diff1
      value: 81.99868387570363
    - type: nauc_map_at_5_max
      value: 69.96808732325626
    - type: nauc_map_at_5_std
      value: -5.066978528413986
    - type: nauc_mrr_at_1000_diff1
      value: 82.11477551036097
    - type: nauc_mrr_at_1000_max
      value: 69.93216235961877
    - type: nauc_mrr_at_1000_std
      value: -4.901120373521347
    - type: nauc_mrr_at_100_diff1
      value: 82.10806987112343
    - type: nauc_mrr_at_100_max
      value: 69.93576246377116
    - type: nauc_mrr_at_100_std
      value: -4.888276482937281
    - type: nauc_mrr_at_10_diff1
      value: 82.04302562091283
    - type: nauc_mrr_at_10_max
      value: 69.98073646418275
    - type: nauc_mrr_at_10_std
      value: -4.939406021960221
    - type: nauc_mrr_at_1_diff1
      value: 85.20668775288361
    - type: nauc_mrr_at_1_max
      value: 69.09789335497814
    - type: nauc_mrr_at_1_std
      value: -6.442884331049151
    - type: nauc_mrr_at_20_diff1
      value: 82.0225558509114
    - type: nauc_mrr_at_20_max
      value: 69.91856798991726
    - type: nauc_mrr_at_20_std
      value: -4.865775113195285
    - type: nauc_mrr_at_3_diff1
      value: 82.27637852190405
    - type: nauc_mrr_at_3_max
      value: 70.15220150711396
    - type: nauc_mrr_at_3_std
      value: -5.459983423558685
    - type: nauc_mrr_at_5_diff1
      value: 81.99868387570363
    - type: nauc_mrr_at_5_max
      value: 69.96808732325626
    - type: nauc_mrr_at_5_std
      value: -5.066978528413986
    - type: nauc_ndcg_at_1000_diff1
      value: 81.5126840771622
    - type: nauc_ndcg_at_1000_max
      value: 70.12763018849093
    - type: nauc_ndcg_at_1000_std
      value: -3.990763331803255
    - type: nauc_ndcg_at_100_diff1
      value: 81.35314690542201
    - type: nauc_ndcg_at_100_max
      value: 70.29954500310211
    - type: nauc_ndcg_at_100_std
      value: -3.4200000144945704
    - type: nauc_ndcg_at_10_diff1
      value: 80.79088866851619
    - type: nauc_ndcg_at_10_max
      value: 70.32243683355195
    - type: nauc_ndcg_at_10_std
      value: -3.661632655363061
    - type: nauc_ndcg_at_1_diff1
      value: 85.20668775288361
    - type: nauc_ndcg_at_1_max
      value: 69.09789335497814
    - type: nauc_ndcg_at_1_std
      value: -6.442884331049151
    - type: nauc_ndcg_at_20_diff1
      value: 80.65723696292129
    - type: nauc_ndcg_at_20_max
      value: 70.0781627958487
    - type: nauc_ndcg_at_20_std
      value: -3.3268850467427455
    - type: nauc_ndcg_at_3_diff1
      value: 81.30422620216359
    - type: nauc_ndcg_at_3_max
      value: 70.57201377939089
    - type: nauc_ndcg_at_3_std
      value: -4.867226820935226
    - type: nauc_ndcg_at_5_diff1
      value: 80.73570523236309
    - type: nauc_ndcg_at_5_max
      value: 70.26219056465638
    - type: nauc_ndcg_at_5_std
      value: -4.058787558085215
    - type: nauc_precision_at_1000_diff1
      value: 74.43924751465183
    - type: nauc_precision_at_1000_max
      value: 86.51759703882918
    - type: nauc_precision_at_1000_std
      value: 82.31711770484912
    - type: nauc_precision_at_100_diff1
      value: 75.70004441716591
    - type: nauc_precision_at_100_max
      value: 77.61085332471666
    - type: nauc_precision_at_100_std
      value: 22.502459762868824
    - type: nauc_precision_at_10_diff1
      value: 74.62622879461453
    - type: nauc_precision_at_10_max
      value: 72.02341591543264
    - type: nauc_precision_at_10_std
      value: 3.227096017453475
    - type: nauc_precision_at_1_diff1
      value: 85.20668775288361
    - type: nauc_precision_at_1_max
      value: 69.09789335497814
    - type: nauc_precision_at_1_std
      value: -6.442884331049151
    - type: nauc_precision_at_20_diff1
      value: 71.82812396378273
    - type: nauc_precision_at_20_max
      value: 70.67250776468195
    - type: nauc_precision_at_20_std
      value: 7.67961919934869
    - type: nauc_precision_at_3_diff1
      value: 78.00819461913801
    - type: nauc_precision_at_3_max
      value: 72.02361600228878
    - type: nauc_precision_at_3_std
      value: -2.749950579759329
    - type: nauc_precision_at_5_diff1
      value: 75.76784513448565
    - type: nauc_precision_at_5_max
      value: 71.38070257500367
    - type: nauc_precision_at_5_std
      value: 0.14048678160511355
    - type: nauc_recall_at_1000_diff1
      value: 74.43924751465049
    - type: nauc_recall_at_1000_max
      value: 86.51759703883047
    - type: nauc_recall_at_1000_std
      value: 82.31711770484824
    - type: nauc_recall_at_100_diff1
      value: 75.70004441716617
    - type: nauc_recall_at_100_max
      value: 77.61085332471673
    - type: nauc_recall_at_100_std
      value: 22.502459762868686
    - type: nauc_recall_at_10_diff1
      value: 74.62622879461455
    - type: nauc_recall_at_10_max
      value: 72.02341591543261
    - type: nauc_recall_at_10_std
      value: 3.2270960174534973
    - type: nauc_recall_at_1_diff1
      value: 85.20668775288361
    - type: nauc_recall_at_1_max
      value: 69.09789335497814
    - type: nauc_recall_at_1_std
      value: -6.442884331049151
    - type: nauc_recall_at_20_diff1
      value: 71.82812396378263
    - type: nauc_recall_at_20_max
      value: 70.6725077646819
    - type: nauc_recall_at_20_std
      value: 7.679619199348971
    - type: nauc_recall_at_3_diff1
      value: 78.00819461913807
    - type: nauc_recall_at_3_max
      value: 72.02361600228886
    - type: nauc_recall_at_3_std
      value: -2.749950579759317
    - type: nauc_recall_at_5_diff1
      value: 75.76784513448574
    - type: nauc_recall_at_5_max
      value: 71.38070257500377
    - type: nauc_recall_at_5_std
      value: 0.1404867816050954
    - type: ndcg_at_1
      value: 65.616
    - type: ndcg_at_10
      value: 75.372
    - type: ndcg_at_100
      value: 77.536
    - type: ndcg_at_1000
      value: 78.051
    - type: ndcg_at_20
      value: 76.281
    - type: ndcg_at_3
      value: 72.162
    - type: ndcg_at_5
      value: 73.83999999999999
    - type: precision_at_1
      value: 65.616
    - type: precision_at_10
      value: 8.544
    - type: precision_at_100
      value: 0.9570000000000001
    - type: precision_at_1000
      value: 0.1
    - type: precision_at_20
      value: 4.452
    - type: precision_at_3
      value: 25.558999999999997
    - type: precision_at_5
      value: 16.146
    - type: recall_at_1
      value: 65.616
    - type: recall_at_10
      value: 85.435
    - type: recall_at_100
      value: 95.746
    - type: recall_at_1000
      value: 99.8
    - type: recall_at_20
      value: 89.039
    - type: recall_at_3
      value: 76.67699999999999
    - type: recall_at_5
      value: 80.731
    task:
      type: Retrieval
  - dataset:
      config: hu
      name: MTEB MassiveIntentClassification (hu)
      revision: 4672e20407010da34463acc759c162ca9734bca6
      split: test
      type: mteb/amazon_massive_intent
    metrics:
    - type: accuracy
      value: 61.93678547410896
    - type: f1
      value: 59.18089758951288
    - type: f1_weighted
      value: 62.33480431880768
    - type: main_score
      value: 61.93678547410896
    task:
      type: Classification
  - dataset:
      config: hu
      name: MTEB MassiveIntentClassification (hu)
      revision: 4672e20407010da34463acc759c162ca9734bca6
      split: validation
      type: mteb/amazon_massive_intent
    metrics:
    - type: accuracy
      value: 61.65272995573046
    - type: f1
      value: 59.300294731108615
    - type: f1_weighted
      value: 61.95329485924452
    - type: main_score
      value: 61.65272995573046
    task:
      type: Classification
  - dataset:
      config: hu
      name: MTEB MassiveScenarioClassification (hu)
      revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8
      split: test
      type: mteb/amazon_massive_scenario
    metrics:
    - type: accuracy
      value: 66.93342299932749
    - type: f1
      value: 66.09393745126239
    - type: f1_weighted
      value: 67.11013732647363
    - type: main_score
      value: 66.93342299932749
    task:
      type: Classification
  - dataset:
      config: hu
      name: MTEB MassiveScenarioClassification (hu)
      revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8
      split: validation
      type: mteb/amazon_massive_scenario
    metrics:
    - type: accuracy
      value: 66.27643876045252
    - type: f1
      value: 65.84263838771432
    - type: f1_weighted
      value: 66.48633782928637
    - type: main_score
      value: 66.27643876045252
    task:
      type: Classification
  - dataset:
      config: hu
      name: MTEB MultiEURLEXMultilabelClassification (hu)
      revision: 2aea5a6dc8fdcfeca41d0fb963c0a338930bde5c
      split: test
      type: mteb/eurlex-multilingual
    metrics:
    - type: accuracy
      value: 2.6879999999999997
    - type: f1
      value: 25.112198433514166
    - type: lrap
      value: 41.790686190475135
    - type: main_score
      value: 2.6879999999999997
    task:
      type: MultilabelClassification
  - dataset:
      config: arb_Arab-hun_Latn
      name: MTEB NTREXBitextMining (arb_Arab-hun_Latn)
      revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33
      split: test
      type: mteb/NTREX
    metrics:
    - type: accuracy
      value: 86.07911867801702
    - type: f1
      value: 82.34184610248707
    - type: main_score
      value: 82.34184610248707
    - type: precision
      value: 80.65598397596395
    - type: recall
      value: 86.07911867801702
    task:
      type: BitextMining
  - dataset:
      config: ben_Beng-hun_Latn
      name: MTEB NTREXBitextMining (ben_Beng-hun_Latn)
      revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33
      split: test
      type: mteb/NTREX
    metrics:
    - type: accuracy
      value: 40.91136705057586
    - type: f1
      value: 36.01175728956383
    - type: main_score
      value: 36.01175728956383
    - type: precision
      value: 34.36916434339978
    - type: recall
      value: 40.91136705057586
    task:
      type: BitextMining
  - dataset:
      config: deu_Latn-hun_Latn
      name: MTEB NTREXBitextMining (deu_Latn-hun_Latn)
      revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33
      split: test
      type: mteb/NTREX
    metrics:
    - type: accuracy
      value: 93.54031046569855
    - type: f1
      value: 91.73760640961443
    - type: main_score
      value: 91.73760640961443
    - type: precision
      value: 90.87130696044066
    - type: recall
      value: 93.54031046569855
    task:
      type: BitextMining
  - dataset:
      config: ell_Grek-hun_Latn
      name: MTEB NTREXBitextMining (ell_Grek-hun_Latn)
      revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33
      split: test
      type: mteb/NTREX
    metrics:
    - type: accuracy
      value: 91.3870806209314
    - type: f1
      value: 88.87998664663662
    - type: main_score
      value: 88.87998664663662
    - type: precision
      value: 87.69821398764815
    - type: recall
      value: 91.3870806209314
    task:
      type: BitextMining
  - dataset:
      config: eng_Latn-hun_Latn
      name: MTEB NTREXBitextMining (eng_Latn-hun_Latn)
      revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33
      split: test
      type: mteb/NTREX
    metrics:
    - type: accuracy
      value: 94.69203805708563
    - type: f1
      value: 93.04790519112001
    - type: main_score
      value: 93.04790519112001
    - type: precision
      value: 92.24670338841595
    - type: recall
      value: 94.69203805708563
    task:
      type: BitextMining
  - dataset:
      config: fas_Arab-hun_Latn
      name: MTEB NTREXBitextMining (fas_Arab-hun_Latn)
      revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33
      split: test
      type: mteb/NTREX
    metrics:
    - type: accuracy
      value: 89.43415122684027
    - type: f1
      value: 86.48138874979135
    - type: main_score
      value: 86.48138874979135
    - type: precision
      value: 85.1235186112502
    - type: recall
      value: 89.43415122684027
    task:
      type: BitextMining
  - dataset:
      config: fin_Latn-hun_Latn
      name: MTEB NTREXBitextMining (fin_Latn-hun_Latn)
      revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33
      split: test
      type: mteb/NTREX
    metrics:
    - type: accuracy
      value: 90.73610415623435
    - type: f1
      value: 88.10716074111167
    - type: main_score
      value: 88.10716074111167
    - type: precision
      value: 86.84860624269739
    - type: recall
      value: 90.73610415623435
    task:
      type: BitextMining
  - dataset:
      config: fra_Latn-hun_Latn
      name: MTEB NTREXBitextMining (fra_Latn-hun_Latn)
      revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33
      split: test
      type: mteb/NTREX
    metrics:
    - type: accuracy
      value: 93.03955933900852
    - type: f1
      value: 90.97312635620098
    - type: main_score
      value: 90.97312635620098
    - type: precision
      value: 89.97245868803205
    - type: recall
      value: 93.03955933900852
    task:
      type: BitextMining
  - dataset:
      config: heb_Hebr-hun_Latn
      name: MTEB NTREXBitextMining (heb_Hebr-hun_Latn)
      revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33
      split: test
      type: mteb/NTREX
    metrics:
    - type: accuracy
      value: 88.03204807210815
    - type: f1
      value: 84.71540644299783
    - type: main_score
      value: 84.71540644299783
    - type: precision
      value: 83.14972458688032
    - type: recall
      value: 88.03204807210815
    task:
      type: BitextMining
  - dataset:
      config: hin_Deva-hun_Latn
      name: MTEB NTREXBitextMining (hin_Deva-hun_Latn)
      revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33
      split: test
      type: mteb/NTREX
    metrics:
    - type: accuracy
      value: 86.9804707060591
    - type: f1
      value: 83.51527290936404
    - type: main_score
      value: 83.51527290936404
    - type: precision
      value: 81.92038057085628
    - type: recall
      value: 86.9804707060591
    task:
      type: BitextMining
  - dataset:
      config: hun_Latn-arb_Arab
      name:
MTEB NTREXBitextMining (hun_Latn-arb_Arab) revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33 split: test type: mteb/NTREX metrics: - type: accuracy value: 86.47971957936905 - type: f1 value: 82.83592054748789 - type: main_score value: 82.83592054748789 - type: precision value: 81.18260724419963 - type: recall value: 86.47971957936905 task: type: BitextMining - dataset: config: hun_Latn-ben_Beng name: MTEB NTREXBitextMining (hun_Latn-ben_Beng) revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33 split: test type: mteb/NTREX metrics: - type: accuracy value: 41.86279419128693 - type: f1 value: 33.232896964494365 - type: main_score value: 33.232896964494365 - type: precision value: 30.249043850094402 - type: recall value: 41.86279419128693 task: type: BitextMining - dataset: config: hun_Latn-deu_Latn name: MTEB NTREXBitextMining (hun_Latn-deu_Latn) revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33 split: test type: mteb/NTREX metrics: - type: accuracy value: 93.94091136705057 - type: f1 value: 92.14989150392255 - type: main_score value: 92.14989150392255 - type: precision value: 91.28275746953764 - type: recall value: 93.94091136705057 task: type: BitextMining - dataset: config: hun_Latn-ell_Grek name: MTEB NTREXBitextMining (hun_Latn-ell_Grek) revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33 split: test type: mteb/NTREX metrics: - type: accuracy value: 92.8392588883325 - type: f1 value: 90.86296110832916 - type: main_score value: 90.86296110832916 - type: precision value: 89.93072942747456 - type: recall value: 92.8392588883325 task: type: BitextMining - dataset: config: hun_Latn-eng_Latn name: MTEB NTREXBitextMining (hun_Latn-eng_Latn) revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33 split: test type: mteb/NTREX metrics: - type: accuracy value: 95.54331497245869 - type: f1 value: 94.2330161909531 - type: main_score value: 94.2330161909531 - type: precision value: 93.59873143047905 - type: recall value: 95.54331497245869 task: type: BitextMining - dataset: config: 
hun_Latn-fas_Arab name: MTEB NTREXBitextMining (hun_Latn-fas_Arab) revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33 split: test type: mteb/NTREX metrics: - type: accuracy value: 89.43415122684027 - type: f1 value: 86.54481722583876 - type: main_score value: 86.54481722583876 - type: precision value: 85.20447337673176 - type: recall value: 89.43415122684027 task: type: BitextMining - dataset: config: hun_Latn-fin_Latn name: MTEB NTREXBitextMining (hun_Latn-fin_Latn) revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33 split: test type: mteb/NTREX metrics: - type: accuracy value: 89.58437656484726 - type: f1 value: 86.70839592722417 - type: main_score value: 86.70839592722417 - type: precision value: 85.37389417459522 - type: recall value: 89.58437656484726 task: type: BitextMining - dataset: config: hun_Latn-fra_Latn name: MTEB NTREXBitextMining (hun_Latn-fra_Latn) revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33 split: test type: mteb/NTREX metrics: - type: accuracy value: 92.13820731096645 - type: f1 value: 89.883158070439 - type: main_score value: 89.883158070439 - type: precision value: 88.81822734101151 - type: recall value: 92.13820731096645 task: type: BitextMining - dataset: config: hun_Latn-heb_Hebr name: MTEB NTREXBitextMining (hun_Latn-heb_Hebr) revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33 split: test type: mteb/NTREX metrics: - type: accuracy value: 86.93039559339009 - type: f1 value: 83.32336166587544 - type: main_score value: 83.32336166587544 - type: precision value: 81.67334334835587 - type: recall value: 86.93039559339009 task: type: BitextMining - dataset: config: hun_Latn-hin_Deva name: MTEB NTREXBitextMining (hun_Latn-hin_Deva) revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33 split: test type: mteb/NTREX metrics: - type: accuracy value: 85.97896845267901 - type: f1 value: 82.34685361375396 - type: main_score value: 82.34685361375396 - type: precision value: 80.72859288933401 - type: recall value: 85.97896845267901 task: type: 
BitextMining - dataset: config: hun_Latn-ind_Latn name: MTEB NTREXBitextMining (hun_Latn-ind_Latn) revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33 split: test type: mteb/NTREX metrics: - type: accuracy value: 92.33850776164246 - type: f1 value: 90.06843598731432 - type: main_score value: 90.06843598731432 - type: precision value: 88.97512936070773 - type: recall value: 92.33850776164246 task: type: BitextMining - dataset: config: hun_Latn-jpn_Jpan name: MTEB NTREXBitextMining (hun_Latn-jpn_Jpan) revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33 split: test type: mteb/NTREX metrics: - type: accuracy value: 87.48122183274913 - type: f1 value: 84.08779836421299 - type: main_score value: 84.08779836421299 - type: precision value: 82.53380070105159 - type: recall value: 87.48122183274913 task: type: BitextMining - dataset: config: hun_Latn-kor_Hang name: MTEB NTREXBitextMining (hun_Latn-kor_Hang) revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33 split: test type: mteb/NTREX metrics: - type: accuracy value: 84.82724086129194 - type: f1 value: 80.77859213062017 - type: main_score value: 80.77859213062017 - type: precision value: 78.98931730929726 - type: recall value: 84.82724086129194 task: type: BitextMining - dataset: config: hun_Latn-lav_Latn name: MTEB NTREXBitextMining (hun_Latn-lav_Latn) revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33 split: test type: mteb/NTREX metrics: - type: accuracy value: 89.9849774661993 - type: f1 value: 87.0422300116842 - type: main_score value: 87.0422300116842 - type: precision value: 85.65932231680856 - type: recall value: 89.9849774661993 task: type: BitextMining - dataset: config: hun_Latn-lit_Latn name: MTEB NTREXBitextMining (hun_Latn-lit_Latn) revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33 split: test type: mteb/NTREX metrics: - type: accuracy value: 90.38557836755132 - type: f1 value: 87.60474044399933 - type: main_score value: 87.60474044399933 - type: precision value: 86.28776498080455 - type: recall value: 
90.38557836755132 task: type: BitextMining - dataset: config: hun_Latn-nld_Latn name: MTEB NTREXBitextMining (hun_Latn-nld_Latn) revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33 split: test type: mteb/NTREX metrics: - type: accuracy value: 93.64046069103655 - type: f1 value: 91.81271907861792 - type: main_score value: 91.81271907861792 - type: precision value: 90.93807377733266 - type: recall value: 93.64046069103655 task: type: BitextMining - dataset: config: hun_Latn-pol_Latn name: MTEB NTREXBitextMining (hun_Latn-pol_Latn) revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33 split: test type: mteb/NTREX metrics: - type: accuracy value: 91.2368552829244 - type: f1 value: 88.85924124281661 - type: main_score value: 88.85924124281661 - type: precision value: 87.7524620263729 - type: recall value: 91.2368552829244 task: type: BitextMining - dataset: config: hun_Latn-por_Latn name: MTEB NTREXBitextMining (hun_Latn-por_Latn) revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33 split: test type: mteb/NTREX metrics: - type: accuracy value: 93.18978467701552 - type: f1 value: 91.15172759138709 - type: main_score value: 91.15172759138709 - type: precision value: 90.19362376898682 - type: recall value: 93.18978467701552 task: type: BitextMining - dataset: config: hun_Latn-rus_Cyrl name: MTEB NTREXBitextMining (hun_Latn-rus_Cyrl) revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33 split: test type: mteb/NTREX metrics: - type: accuracy value: 92.23835753630446 - type: f1 value: 89.9382406943749 - type: main_score value: 89.9382406943749 - type: precision value: 88.85411450509096 - type: recall value: 92.23835753630446 task: type: BitextMining - dataset: config: hun_Latn-spa_Latn name: MTEB NTREXBitextMining (hun_Latn-spa_Latn) revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33 split: test type: mteb/NTREX metrics: - type: accuracy value: 93.34001001502253 - type: f1 value: 91.47888499415792 - type: main_score value: 91.47888499415792 - type: precision value: 90.58587881822734 - 
type: recall value: 93.34001001502253 task: type: BitextMining - dataset: config: hun_Latn-swa_Latn name: MTEB NTREXBitextMining (hun_Latn-swa_Latn) revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33 split: test type: mteb/NTREX metrics: - type: accuracy value: 40.76114171256886 - type: f1 value: 32.341475401874824 - type: main_score value: 32.341475401874824 - type: precision value: 29.515621549076144 - type: recall value: 40.76114171256886 task: type: BitextMining - dataset: config: hun_Latn-swe_Latn name: MTEB NTREXBitextMining (hun_Latn-swe_Latn) revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33 split: test type: mteb/NTREX metrics: - type: accuracy value: 93.44016024036054 - type: f1 value: 91.490569187114 - type: main_score value: 91.490569187114 - type: precision value: 90.56501418794859 - type: recall value: 93.44016024036054 task: type: BitextMining - dataset: config: hun_Latn-tam_Taml name: MTEB NTREXBitextMining (hun_Latn-tam_Taml) revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33 split: test type: mteb/NTREX metrics: - type: accuracy value: 27.591387080620933 - type: f1 value: 18.875023187991868 - type: main_score value: 18.875023187991868 - type: precision value: 16.43982939607956 - type: recall value: 27.591387080620933 task: type: BitextMining - dataset: config: hun_Latn-tur_Latn name: MTEB NTREXBitextMining (hun_Latn-tur_Latn) revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33 split: test type: mteb/NTREX metrics: - type: accuracy value: 91.3870806209314 - type: f1 value: 88.90836254381573 - type: main_score value: 88.90836254381573 - type: precision value: 87.72325154398266 - type: recall value: 91.3870806209314 task: type: BitextMining - dataset: config: hun_Latn-vie_Latn name: MTEB NTREXBitextMining (hun_Latn-vie_Latn) revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33 split: test type: mteb/NTREX metrics: - type: accuracy value: 91.13670505758637 - type: f1 value: 88.62054987242769 - type: main_score value: 88.62054987242769 - type: precision 
value: 87.41445501585711 - type: recall value: 91.13670505758637 task: type: BitextMining - dataset: config: hun_Latn-zho_Hant name: MTEB NTREXBitextMining (hun_Latn-zho_Hant) revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33 split: test type: mteb/NTREX metrics: - type: accuracy value: 90.33550325488233 - type: f1 value: 87.71574027708229 - type: main_score value: 87.71574027708229 - type: precision value: 86.53861744998451 - type: recall value: 90.33550325488233 task: type: BitextMining - dataset: config: hun_Latn-zul_Latn name: MTEB NTREXBitextMining (hun_Latn-zul_Latn) revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33 split: test type: mteb/NTREX metrics: - type: accuracy value: 17.626439659489236 - type: f1 value: 11.826546194507252 - type: main_score value: 11.826546194507252 - type: precision value: 10.340822386979896 - type: recall value: 17.626439659489236 task: type: BitextMining - dataset: config: ind_Latn-hun_Latn name: MTEB NTREXBitextMining (ind_Latn-hun_Latn) revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33 split: test type: mteb/NTREX metrics: - type: accuracy value: 92.93940911367051 - type: f1 value: 90.91470539142045 - type: main_score value: 90.91470539142045 - type: precision value: 89.96411283592055 - type: recall value: 92.93940911367051 task: type: BitextMining - dataset: config: jpn_Jpan-hun_Latn name: MTEB NTREXBitextMining (jpn_Jpan-hun_Latn) revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33 split: test type: mteb/NTREX metrics: - type: accuracy value: 88.33249874812218 - type: f1 value: 85.07260891337006 - type: main_score value: 85.07260891337006 - type: precision value: 83.54114505090969 - type: recall value: 88.33249874812218 task: type: BitextMining - dataset: config: kor_Hang-hun_Latn name: MTEB NTREXBitextMining (kor_Hang-hun_Latn) revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33 split: test type: mteb/NTREX metrics: - type: accuracy value: 86.07911867801702 - type: f1 value: 82.32348522784176 - type: main_score value: 
82.32348522784176 - type: precision value: 80.59339008512768 - type: recall value: 86.07911867801702 task: type: BitextMining - dataset: config: lav_Latn-hun_Latn name: MTEB NTREXBitextMining (lav_Latn-hun_Latn) revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33 split: test type: mteb/NTREX metrics: - type: accuracy value: 90.73610415623435 - type: f1 value: 88.25833989078856 - type: main_score value: 88.25833989078856 - type: precision value: 87.09480887998664 - type: recall value: 90.73610415623435 task: type: BitextMining - dataset: config: lit_Latn-hun_Latn name: MTEB NTREXBitextMining (lit_Latn-hun_Latn) revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33 split: test type: mteb/NTREX metrics: - type: accuracy value: 91.88783174762143 - type: f1 value: 89.59105324653646 - type: main_score value: 89.59105324653646 - type: precision value: 88.49106993824068 - type: recall value: 91.88783174762143 task: type: BitextMining - dataset: config: nld_Latn-hun_Latn name: MTEB NTREXBitextMining (nld_Latn-hun_Latn) revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33 split: test type: mteb/NTREX metrics: - type: accuracy value: 92.98948422633951 - type: f1 value: 90.93139709564348 - type: main_score value: 90.93139709564348 - type: precision value: 89.93072942747456 - type: recall value: 92.98948422633951 task: type: BitextMining - dataset: config: pol_Latn-hun_Latn name: MTEB NTREXBitextMining (pol_Latn-hun_Latn) revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33 split: test type: mteb/NTREX metrics: - type: accuracy value: 91.4371557336004 - type: f1 value: 89.10699382406943 - type: main_score value: 89.10699382406943 - type: precision value: 88.00701051577366 - type: recall value: 91.4371557336004 task: type: BitextMining - dataset: config: por_Latn-hun_Latn name: MTEB NTREXBitextMining (por_Latn-hun_Latn) revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33 split: test type: mteb/NTREX metrics: - type: accuracy value: 92.98948422633951 - type: f1 value: 91.02320146886997 - 
type: main_score value: 91.02320146886997 - type: precision value: 90.09764646970456 - type: recall value: 92.98948422633951 task: type: BitextMining - dataset: config: rus_Cyrl-hun_Latn name: MTEB NTREXBitextMining (rus_Cyrl-hun_Latn) revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33 split: test type: mteb/NTREX metrics: - type: accuracy value: 90.98647971957938 - type: f1 value: 88.3942580537473 - type: main_score value: 88.3942580537473 - type: precision value: 87.16992154899015 - type: recall value: 90.98647971957938 task: type: BitextMining - dataset: config: spa_Latn-hun_Latn name: MTEB NTREXBitextMining (spa_Latn-hun_Latn) revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33 split: test type: mteb/NTREX metrics: - type: accuracy value: 93.13970956434652 - type: f1 value: 91.19846436321149 - type: main_score value: 91.19846436321149 - type: precision value: 90.26456351193457 - type: recall value: 93.13970956434652 task: type: BitextMining - dataset: config: swa_Latn-hun_Latn name: MTEB NTREXBitextMining (swa_Latn-hun_Latn) revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33 split: test type: mteb/NTREX metrics: - type: accuracy value: 39.05858788182273 - type: f1 value: 33.98323169908456 - type: main_score value: 33.98323169908456 - type: precision value: 32.41376425186998 - type: recall value: 39.05858788182273 task: type: BitextMining - dataset: config: swe_Latn-hun_Latn name: MTEB NTREXBitextMining (swe_Latn-hun_Latn) revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33 split: test type: mteb/NTREX metrics: - type: accuracy value: 93.03955933900852 - type: f1 value: 91.01485561675847 - type: main_score value: 91.01485561675847 - type: precision value: 90.04757135703555 - type: recall value: 93.03955933900852 task: type: BitextMining - dataset: config: tam_Taml-hun_Latn name: MTEB NTREXBitextMining (tam_Taml-hun_Latn) revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33 split: test type: mteb/NTREX metrics: - type: accuracy value: 27.341011517275916 - type: f1 
value: 24.114490363365103 - type: main_score value: 24.114490363365103 - type: precision value: 23.01465131730559 - type: recall value: 27.341011517275916 task: type: BitextMining - dataset: config: tur_Latn-hun_Latn name: MTEB NTREXBitextMining (tur_Latn-hun_Latn) revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33 split: test type: mteb/NTREX metrics: - type: accuracy value: 91.03655483224837 - type: f1 value: 88.4843932565515 - type: main_score value: 88.4843932565515 - type: precision value: 87.31180103488568 - type: recall value: 91.03655483224837 task: type: BitextMining - dataset: config: vie_Latn-hun_Latn name: MTEB NTREXBitextMining (vie_Latn-hun_Latn) revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33 split: test type: mteb/NTREX metrics: - type: accuracy value: 90.38557836755132 - type: f1 value: 87.73493573693874 - type: main_score value: 87.73493573693874 - type: precision value: 86.5005842096478 - type: recall value: 90.38557836755132 task: type: BitextMining - dataset: config: zho_Hant-hun_Latn name: MTEB NTREXBitextMining (zho_Hant-hun_Latn) revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33 split: test type: mteb/NTREX metrics: - type: accuracy value: 90.33550325488233 - type: f1 value: 87.59806376231013 - type: main_score value: 87.59806376231013 - type: precision value: 86.3253213153063 - type: recall value: 90.33550325488233 task: type: BitextMining - dataset: config: zul_Latn-hun_Latn name: MTEB NTREXBitextMining (zul_Latn-hun_Latn) revision: ed9a4403ed4adbfaf4aab56d5b2709e9f6c3ba33 split: test type: mteb/NTREX metrics: - type: accuracy value: 17.676514772158235 - type: f1 value: 13.907186347256669 - type: main_score value: 13.907186347256669 - type: precision value: 12.923210518264245 - type: recall value: 17.676514772158235 task: type: BitextMining - dataset: config: rom-hun name: MTEB RomaTalesBitextMining (rom-hun) revision: f4394dbca6845743cd33eba77431767b232ef489 split: test type: kardosdrur/roma-tales metrics: - type: accuracy value: 
5.116279069767442 - type: f1 value: 1.8488798023681745 - type: main_score value: 1.8488798023681745 - type: precision value: 1.472523686477175 - type: recall value: 5.116279069767442 task: type: BitextMining - dataset: config: hun_Latn name: MTEB SIB200Classification (hun_Latn) revision: a74d7350ea12af010cfb1c21e34f1f81fd2e615b split: test type: mteb/sib200 metrics: - type: accuracy value: 68.43137254901961 - type: f1 value: 67.64424216338097 - type: f1_weighted value: 68.34815340541722 - type: main_score value: 68.43137254901961 task: type: Classification - dataset: config: hun_Latn name: MTEB SIB200Classification (hun_Latn) revision: a74d7350ea12af010cfb1c21e34f1f81fd2e615b split: train type: mteb/sib200 metrics: - type: accuracy value: 69.04422253922966 - type: f1 value: 67.9515950437183 - type: f1_weighted value: 69.07832158763667 - type: main_score value: 69.04422253922966 task: type: Classification - dataset: config: hun_Latn name: MTEB SIB200Classification (hun_Latn) revision: a74d7350ea12af010cfb1c21e34f1f81fd2e615b split: validation type: mteb/sib200 metrics: - type: accuracy value: 64.54545454545453 - type: f1 value: 63.78373491440388 - type: f1_weighted value: 64.98788954233397 - type: main_score value: 64.54545454545453 task: type: Classification - dataset: config: hun_Latn name: MTEB SIB200ClusteringS2S (hun_Latn) revision: a74d7350ea12af010cfb1c21e34f1f81fd2e615b split: test type: mteb/sib200 metrics: - type: main_score value: 34.91858402487903 - type: v_measure value: 34.91858402487903 - type: v_measure_std value: 3.377463869658173 task: type: Clustering - dataset: config: hun-eng name: MTEB Tatoeba (hun-eng) revision: 69e8f12da6e31d59addadda9a9c8a2e601a0e282 split: test type: mteb/tatoeba-bitext-mining metrics: - type: accuracy value: 91.5 - type: f1 value: 89.06666666666666 - type: main_score value: 89.06666666666666 - type: precision value: 87.9 - type: recall value: 91.5 task: type: BitextMining tags: - mteb --- # 
paraphrase-multilingual-MiniLM-L12-hu-v2

This is a [sentence-transformers](https://www.SBERT.net) model fine-tuned from [sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2) on the train dataset. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description

- **Model Type:** Sentence Transformer
- **Base model:** [sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2)
- **Maximum Sequence Length:** 128 tokens
- **Output Dimensionality:** 384 dimensions
- **Similarity Function:** Cosine Similarity
- **Training Dataset:**
    - train
- **Language:** hu
- **License:** apache-2.0

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("karsar/paraphrase-multilingual-MiniLM-L12-hu-v2")

# Run inference
sentences = [
    'Az emberek alszanak.',
    'Egy apa és a fia ölelgeti alvás közben.',
    'Egy csoport ember ül egy nyitott, térszerű területen, mögötte nagy bokrok és egy sor viktoriánus stílusú épület, melyek közül sokat a kép jobb oldalán lévő erős elmosódás tesz kivehetetlenné.',
]
embeddings = model.encode(sentences)  # NumPy array, one row per sentence
print(embeddings.shape)
# (3, 384)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)  # torch.Tensor
print(similarities.shape)
# torch.Size([3, 3])
```

## Evaluation

### Metrics

#### Triplet

* Dataset: `all-nli-dev`
* Evaluated with [TripletEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.TripletEvaluator)

| Metric             | Value      |
|:-------------------|:-----------|
| cosine_accuracy    | 0.9918     |
| dot_accuracy       | 0.0102     |
| manhattan_accuracy | 0.99       |
| euclidean_accuracy | 0.99       |
| **max_accuracy**   | **0.9918** |

#### Triplet

* Dataset: `all-nli-test`
* Evaluated with [TripletEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.TripletEvaluator)

| Metric             | Value      |
|:-------------------|:-----------|
| cosine_accuracy    | 0.9938     |
| dot_accuracy       | 0.008      |
| manhattan_accuracy | 0.9929     |
| euclidean_accuracy | 0.9924     |
| **max_accuracy**   | **0.9938** |

## Training Details

### Training Dataset

#### train

* Dataset: train
* Size: 1,044,013 training samples
* Columns: anchor, positive, and negative
* Approximate statistics based on the first 1000 samples:

  |         | anchor | positive | negative |
  |:--------|:-------|:---------|:---------|
  | type    | string | string   | string   |
  | details |        |          |          |

* Samples:

  | anchor | positive | negative |
  |:-------|:---------|:---------|
  | Egy lóháton ülő ember átugrik egy lerombolt repülőgép felett. | Egy ember a szabadban, lóháton. | Egy ember egy étteremben van, és omlettet rendel. |
  | Gyerekek mosolyogva és integetett a kamera | Gyermekek vannak jelen | A gyerekek homlokot rántanak |
  | Egy fiú ugrál a gördeszkát a közepén egy piros híd. | A fiú gördeszkás trükköt csinál. | A fiú korcsolyázik a járdán. |

* Loss: [MultipleNegativesRankingLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:

  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim"
  }
  ```

### Evaluation Dataset

#### train

* Dataset: train
* Size: 5,000 evaluation samples
* Columns: anchor, positive, and negative
* Approximate statistics based on the first 1000 samples:

  |         | anchor | positive | negative |
  |:--------|:-------|:---------|:---------|
  | type    | string | string   | string   |
  | details |        |          |          |

* Samples:

  | anchor | positive | negative |
  |:-------|:---------|:---------|
  | Egy lóháton ülő ember átugrik egy lerombolt repülőgép felett. | Egy ember a szabadban, lóháton. | Egy ember egy étteremben van, és omlettet rendel. |
  | Gyerekek mosolyogva és integetett a kamera | Gyermekek vannak jelen | A gyerekek homlokot rántanak |
  | Egy fiú ugrál a gördeszkát a közepén egy piros híd. | A fiú gördeszkás trükköt csinál. | A fiú korcsolyázik a járdán. |

* Loss: [MultipleNegativesRankingLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:

  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim"
  }
  ```

### Training Hyperparameters

#### Non-Default Hyperparameters

- `eval_strategy`: steps
- `per_device_train_batch_size`: 128
- `per_device_eval_batch_size`: 128
- `num_train_epochs`: 1
- `warmup_ratio`: 0.1
- `bf16`: True
- `batch_sampler`: no_duplicates

### Framework Versions

- Python: 3.11.8
- Sentence Transformers: 3.1.1
- Transformers: 4.44.0
- PyTorch: 2.3.0.post101
- Accelerate: 0.33.0
- Datasets: 3.0.2
- Tokenizers: 0.19.0

## Citation

### BibTeX

#### Sentence Transformers

```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MultipleNegativesRankingLoss

```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```
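
For readers who want a feel for what the loss configuration listed under Training Details (`scale` 20.0, `similarity_fct` `cos_sim`) amounts to, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the Sentence Transformers implementation: the function names `cos_sim` and `mnr_loss` and the random toy vectors are hypothetical, and the sketch assumes each anchor is scored against every positive and every explicit negative in the batch, with cross-entropy pushing it toward its own positive.

```python
import numpy as np


def cos_sim(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T


def mnr_loss(anchors, positives, negatives, scale=20.0):
    """Illustrative in-batch ranking loss (hypothetical helper, not the library API).

    Row i of `anchors` should rank row i of `positives` above all other
    positives and all explicit negatives in the batch.
    """
    candidates = np.concatenate([positives, negatives], axis=0)  # (2B, D)
    logits = scale * cos_sim(anchors, candidates)                # (B, 2B)
    logits = logits - logits.max(axis=1, keepdims=True)          # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    idx = np.arange(len(anchors))                                # correct column for row i is i
    return float(-log_probs[idx, idx].mean())


# Toy data standing in for 384-dimensional embeddings (random, for illustration only)
rng = np.random.default_rng(0)
anchors = rng.normal(size=(4, 384))
positives = anchors + 0.1 * rng.normal(size=(4, 384))  # close to their anchors
negatives = rng.normal(size=(4, 384))                  # unrelated vectors

loss_good = mnr_loss(anchors, positives, negatives)
loss_bad = mnr_loss(anchors, negatives, positives)     # roles swapped on purpose
print(loss_good, loss_bad)
```

When positives sit close to their anchors the loss is small, and swapping the positive and negative roles makes it large. The scale factor sharpens the softmax over cosine similarities, which are otherwise confined to the range -1 to 1.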