Update README.md
README.md
CHANGED
@@ -13,7 +13,7 @@ license: mit
 
 Please check the [official repository](https://github.com/microsoft/DeBERTa) for more details and updates.
 
-In DeBERTa V3 we replaced MLM objective with RTD(Replaced Token Detection) objective
+In DeBERTa V3 we replaced MLM objective with RTD(Replaced Token Detection) objective which was first introduced by ELECTRA for pre-training. The new objective significantly improves the model performance. Please check appendix A11 in our paper [DeBERTa](https://arxiv.org/abs/2006.03654) for more details.
 
 This is the DeBERTa V3 small model with 6 layers, 768 hidden size. Total parameters is 143M while Embedding layer take about 98M due to the usage of 128k vocabulary. It's trained with 160GB data.
 
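
The parameter figures in the unchanged context line (143M total, about 98M in the embedding layer) can be sanity-checked with simple arithmetic. The sketch below is only a rough check, not the model's actual configuration. It assumes that "128k vocabulary" means exactly 128,000 tokens (the real tokenizer size may differ slightly) and that the embedding is a single vocabulary-by-hidden-size matrix. The names `VOCAB_SIZE`, `HIDDEN_SIZE`, `TOTAL_PARAMS`, and `embedding_params` are illustrative.

```python
# Rough check of the parameter counts quoted in the README text.
# Assumption: "128k vocabulary" is taken as exactly 128_000 tokens.
VOCAB_SIZE = 128_000         # assumed from "128k vocabulary"
HIDDEN_SIZE = 768            # "768 hidden size" from the README
TOTAL_PARAMS = 143_000_000   # "143M" from the README


def embedding_params(vocab_size: int, hidden_size: int) -> int:
    """Parameter count of a token embedding matrix (vocab_size x hidden_size)."""
    return vocab_size * hidden_size


emb = embedding_params(VOCAB_SIZE, HIDDEN_SIZE)
remaining = TOTAL_PARAMS - emb  # everything that is not the token embedding

print(f"embedding: {emb / 1e6:.1f}M parameters")
print(f"remaining: {remaining / 1e6:.1f}M parameters")
print(f"embedding share of total: {emb / TOTAL_PARAMS:.1%}")
```

Under these assumptions the embedding matrix comes to roughly 98M parameters, which agrees with the README's figure and shows that the large vocabulary accounts for most of the model's size.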