## Evaluation

The results are based on the text classification tasks presented in [Niklaus et al. (2023)](https://arxiv.org/abs/2306.09237), which are part of [LEXTREME](https://huggingface.co/datasets/joelito/lextreme).

We report the arithmetic mean over three seeds (1, 2, 3) of the macro-F1 score on the test set.

We compare joelito/legal-swiss-longformer-base with five multilingual models: [microsoft/Multilingual-MiniLM-L12-H384](https://huggingface.co/microsoft/Multilingual-MiniLM-L12-H384), [distilbert-base-multilingual-cased](https://huggingface.co/distilbert-base-multilingual-cased), [microsoft/mdeberta-v3-base](https://huggingface.co/microsoft/mdeberta-v3-base), [xlm-roberta-base](https://huggingface.co/xlm-roberta-base), and [xlm-roberta-large](https://huggingface.co/xlm-roberta-large).
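
To make the metric concrete, the sketch below computes a macro-F1 score (the unweighted mean of per-class F1 scores) and averages it over three seeds, mirroring how the numbers in the table are aggregated. The labels and per-seed predictions here are made up for illustration; they are not the actual LEXTREME data.

```python
from statistics import mean

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro-F1)."""
    labels = sorted(set(y_true) | set(y_pred))
    f1s = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * precision * recall / (precision + recall)
                   if precision + recall else 0.0)
    return mean(f1s)

# Hypothetical gold labels and per-seed predictions for one task's test set.
y_true = [0, 1, 2, 2, 1, 0]
preds_per_seed = {
    1: [0, 1, 2, 2, 1, 1],
    2: [0, 1, 2, 1, 1, 0],
    3: [0, 2, 2, 2, 1, 0],
}

# Arithmetic mean of macro-F1 over the three seeds, reported as a percentage.
score = 100 * mean(macro_f1(y_true, p) for p in preds_per_seed.values())
print(f"macro-F1, mean over seeds: {score:.2f}")
```

Macro averaging gives each class equal weight regardless of its frequency, which matters for the heavily imbalanced label distributions typical of legal classification tasks.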

The highest values are in bold.

| Model                                  | SCP-BC    | SCP-BF    | SCP-CC    | SCP-CF    | SJPXL-C   | SJPXL-F   | SLAP-SC   | SLAP-SF   |
|:---------------------------------------|:----------|:----------|:----------|:----------|:----------|:----------|:----------|:----------|
| microsoft/Multilingual-MiniLM-L12-H384 | 67.29     | 56.56     | 24.23     | 14.90     | 79.52     | 58.29     | 63.03     | 67.57     |
| distilbert-base-multilingual-cased     | 66.56     | 56.58     | 22.67     | 21.31     | 77.26     | 60.79     | 73.54     | 72.24     |
| microsoft/mdeberta-v3-base             | 72.01     | 57.59     | 22.93     | **25.18** | 79.41     | 60.89     | 67.64     | 74.13     |
| xlm-roberta-base                       | 68.55     | 58.48     | 25.66     | 21.52     | 80.98     | 61.45     | 79.30     | 74.47     |
| xlm-roberta-large                      | 69.50     | 58.15     | 27.90     | 22.05     | 82.19     | 61.24     | 81.09     | 71.82     |
| joelito/legal-swiss-longformer-base    | **73.25** | **60.06** | **28.68** | 24.39     | **87.46** | **65.23** | **83.84** | **77.96** |

For more detailed insights into performance on downstream tasks such as [LEXTREME](https://huggingface.co/datasets/joelito/lextreme) ([Niklaus et al. 2023](https://arxiv.org/abs/2301.13126)) or [LexGLUE](https://huggingface.co/datasets/lex_glue) ([Chalkidis et al. 2021](https://arxiv.org/abs/2110.00976)), we refer to the results reported in Niklaus et al. (2023) [1](https://arxiv.org/abs/2306.02069), [2](https://arxiv.org/abs/2306.09237).

### Model Architecture and Objective