microsoft/deberta-v3-small

DeBERTa committed on Oct 20, 2021

Commit f439f2e • 1 Parent(s): 35addd0

Update README.md

Files changed (1)

README.md +1 -1

@@ -13,7 +13,7 @@ license: mit
 Please check the [official repository](https://github.com/microsoft/DeBERTa) for more details and updates.
-In DeBERTa V3 we replaced MLM objective with RTD(Replaced Token Detection) objective during pre-training, which significantly improves the model performance. Please check appendix A11 in our paper [DeBERTa](https://arxiv.org/abs/2006.03654) for more details.
+In DeBERTa V3 we replaced MLM objective with RTD(Replaced Token Detection) objective which was first introduced by ELECTRA for pre-training. The new objective significantly improves the model performance. Please check appendix A11 in our paper [DeBERTa](https://arxiv.org/abs/2006.03654) for more details.
 This is the DeBERTa V3 small model with 6 layers, 768 hidden size. Total parameters is 143M while Embedding layer take about 98M due to the usage of 128k vocabulary. It's trained with 160GB data.
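
The parameter figures in the README context line can be sanity-checked with a little arithmetic. The sketch below is only an illustration, not part of the commit: it assumes "128k" means exactly 128,000 tokens and "143M" means exactly 143,000,000 parameters (the real tokenizer size and parameter count may differ slightly), and the names `VOCAB_SIZE`, `HIDDEN_SIZE`, `TOTAL_PARAMS`, and `embedding_params` are hypothetical.

```python
# Rough check of the README's parameter figures.
# Assumptions (not from the commit): "128k" is read as 128,000 tokens and
# "143M" as 143,000,000 parameters; the exact values may differ.
VOCAB_SIZE = 128_000
HIDDEN_SIZE = 768            # "768 hidden size" from the README
TOTAL_PARAMS = 143_000_000   # "Total parameters is 143M" from the README


def embedding_params(vocab_size: int, hidden_size: int) -> int:
    """Size of a token embedding matrix: one hidden-size vector per token."""
    return vocab_size * hidden_size


emb = embedding_params(VOCAB_SIZE, HIDDEN_SIZE)
rest = TOTAL_PARAMS - emb  # everything that is not the token embedding

print(f"embedding parameters: {emb / 1e6:.1f}M")
print(f"remaining parameters: {rest / 1e6:.1f}M")
```

Under these round-number assumptions the embedding matrix alone accounts for roughly 98M parameters, which matches the README's statement that the large vocabulary dominates the model size.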