Update README.md
README.md (changed)
@@ -120,7 +120,7 @@ If you plan to fine-tune this model on some downstream tasks, you can follow the

 #### Task-Specific Learning Rates

-##### Sequence Classification
+##### Sequence Classification:

 | Dataset | EuroBERT-210m | EuroBERT-610m | EuroBERT-2.1B |
 |--------------------------------------|----------------|----------------|----------------|
@@ -133,7 +133,7 @@ If you plan to fine-tune this model on some downstream tasks, you can follow the
 | CodeComplexity | 3.6e-05 | 3.6e-05 | 1.0e-05 |
 | MathShepherd | 7.7e-05 | 2.8e-05 | 1.7e-05 |

-##### Sequence Regression
+##### Sequence Regression:

 | Dataset | EuroBERT-210m | EuroBERT-610m | EuroBERT-2.1B |
 |--------------------------|----------------|----------------|----------------|
@@ -141,8 +141,7 @@ If you plan to fine-tune this model on some downstream tasks, you can follow the
 | SummevalMultilingual | 3.6e-05 | 2.8e-05 | 3.6e-05 |
 | WMT | 2.8e-05 | 2.8e-05 | 1.3e-05 |

-##### Retrieval
-
+##### Retrieval:
 | Dataset | EuroBERT-210m | EuroBERT-610m | EuroBERT-2.1B |
 |-----------------------------------------|----------------|----------------|----------------|
 | MIRACL | 4.6e-05 | 3.6e-05 | 2.8e-05 |
@@ -153,8 +152,6 @@ If you plan to fine-tune this model on some downstream tasks, you can follow the
 | CqaDupStackMath | 4.6e-05 | 2.8e-05 | 3.6e-05 |
 | MathFormula | 1.7e-05 | 3.6e-05 | 3.6e-05 |

-**Disclaimer**: These are suggested hyperparameters based on our experiments. We recommend conducting your own grid search for best results on your specific downstream task.
-
 ## License

 We release the EuroBERT model architectures, model weights, and training codebase under the Apache 2.0 license.
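For reference, here is a minimal sketch of how one of the learning rates from these tables would be plugged into a standard Hugging Face fine-tuning run. It is illustrative, not part of the README change: it assumes the public `EuroBERT/EuroBERT-210m` checkpoint and the `transformers` and `datasets` libraries; the two-example dataset and `num_labels=2` are placeholders, and `3.6e-05` is the Sequence Classification value listed for EuroBERT-210m on CodeComplexity.

```python
# Illustrative fine-tuning sketch (assumptions: transformers + datasets
# installed; the toy dataset stands in for a real classification corpus).
from datasets import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

model_id = "EuroBERT/EuroBERT-210m"

tokenizer = AutoTokenizer.from_pretrained(model_id)
if tokenizer.pad_token is None:  # guard: ensure a pad token exists for batching
    tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForSequenceClassification.from_pretrained(
    model_id,
    num_labels=2,  # placeholder: match your task's label count
    trust_remote_code=True,  # EuroBERT ships custom modeling code
)

# Toy two-example dataset standing in for a real task.
raw = Dataset.from_dict(
    {
        "text": ["sorting a list is O(n log n)", "nested loops are O(n^2)"],
        "label": [0, 1],
    }
)

def tokenize(batch):
    return tokenizer(
        batch["text"], truncation=True, padding="max_length", max_length=128
    )

train_dataset = raw.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="eurobert-210m-cls",
    learning_rate=3.6e-05,  # Sequence Classification table: EuroBERT-210m, CodeComplexity
    num_train_epochs=3,
    per_device_train_batch_size=2,
)

Trainer(model=model, args=args, train_dataset=train_dataset).train()
```

The `Trainer` sets up the optimizer and schedule itself, so switching models or tasks only requires swapping `model_id` and `learning_rate` per the tables above.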