osiria committed
Commit: 55e62ec
Parent: 35b9bf5

Update README.md

Files changed (1): README.md (+2, -2)
README.md CHANGED

@@ -50,13 +50,13 @@ pipeline_tag: question-answering
 
 <h3>Model description</h3>
 
-This is a <b>DeBERTa</b> <b>[1]</b> model for the <b>Italian</b> language, fine-tuned for <b>Extractive Question Answering</b> on the [SQuAD-IT](https://huggingface.co/datasets/squad_it) dataset <b>[2]</b>, using <b>DeBERTa-ITALIAN</b> ([deberta-base-italian](https://huggingface.co/osiria/deberta-base-italian)) as a pre-trained model.
+This is a <b>DeBERTa</b> <b>[1]</b> model for the <b>Italian</b> language, fine-tuned for <b>Extractive Question Answering</b> on the [SQuAD-IT](https://huggingface.co/datasets/squad_it) dataset <b>[2]</b>.
 
 <b>update: version 2.0</b>
 
 Version 2.0 further improves performance by exploiting a two-phase fine-tuning strategy: the model is first fine-tuned on the English SQuAD v2 (1 epoch, 20% warmup ratio, max learning rate of 3e-5), then further fine-tuned on the Italian SQuAD-IT (2 epochs, no warmup, initial learning rate of 3e-5).
 
-In order to maximize the benefits of the procedure, [mdeberta-v3-base](https://huggingface.co/microsoft/mdeberta-v3-base) is now directly used as a pre-trained model. Once the double fine-tuning is completed, the embedding layer is compressed as in [deberta-base-italian](https://huggingface.co/osiria/deberta-base-italian) to obtain a monolingual model size.
+In order to maximize the benefits of the multilingual procedure, [mdeberta-v3-base](https://huggingface.co/microsoft/mdeberta-v3-base) is used as a pre-trained model. Once the double fine-tuning is completed, the embedding layer is compressed as in [deberta-base-italian](https://huggingface.co/osiria/deberta-base-italian) to obtain a monolingual model size.
 
 <h3>Training and Performances</h3>
 
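
Since the README describes an extractive question-answering model, a minimal inference sketch with the transformers pipeline may help illustrate the intended use. The repository id below is a placeholder, not taken from this commit; substitute the actual model id.

```python
# Minimal inference sketch for an extractive QA model.
# NOTE: the model id below is hypothetical; substitute the actual repository id.
from transformers import pipeline

qa = pipeline(
    "question-answering",
    model="osiria/deberta-italian-question-answering",  # hypothetical id
)

result = qa(
    question="Dove si trova la Torre di Pisa?",
    context="La Torre di Pisa si trova in Piazza dei Miracoli, a Pisa, in Toscana.",
)
print(result["answer"], result["score"])
```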
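The two-phase fine-tuning described in the diff can be sketched with the Trainer API. This is a minimal sketch, not the author's training script: the learning-rate, warmup, and epoch settings come from the README, while the batch size, maximum sequence length, and the simplified span alignment in `prepare_features` are illustrative assumptions.

```python
# Sketch of the two-phase fine-tuning strategy described above.
from datasets import load_dataset
from transformers import (
    AutoModelForQuestionAnswering,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("microsoft/mdeberta-v3-base")
model = AutoModelForQuestionAnswering.from_pretrained("microsoft/mdeberta-v3-base")

def prepare_features(examples):
    # Tokenize question/context pairs and align answer spans to token indices.
    # Simplified: no overflow/stride handling; answers lost to truncation and
    # unanswerable SQuAD v2 questions are mapped to position 0.
    enc = tokenizer(
        examples["question"],
        examples["context"],
        truncation="only_second",
        max_length=384,
        padding="max_length",
        return_offsets_mapping=True,
    )
    starts, ends = [], []
    for i, offsets in enumerate(enc["offset_mapping"]):
        answers = examples["answers"][i]
        if not answers["answer_start"]:
            starts.append(0)
            ends.append(0)
            continue
        s_char = answers["answer_start"][0]
        e_char = s_char + len(answers["text"][0])
        tok_s = tok_e = 0
        for t, (span, sid) in enumerate(zip(offsets, enc.sequence_ids(i))):
            if sid != 1:  # skip question and special tokens
                continue
            if span[0] <= s_char < span[1]:
                tok_s = t
            if span[0] < e_char <= span[1]:
                tok_e = t
        starts.append(tok_s)
        ends.append(tok_e)
    enc["start_positions"] = starts
    enc["end_positions"] = ends
    enc.pop("offset_mapping")
    return enc

# Phase 1: English SQuAD v2 (1 epoch, 20% warmup ratio, peak LR 3e-5).
squad_en = load_dataset("squad_v2", split="train")
Trainer(
    model=model,
    args=TrainingArguments(output_dir="qa-phase1-en", num_train_epochs=1,
                           warmup_ratio=0.2, learning_rate=3e-5),
    train_dataset=squad_en.map(prepare_features, batched=True,
                               remove_columns=squad_en.column_names),
).train()

# Phase 2: Italian SQuAD-IT (2 epochs, no warmup, initial LR 3e-5).
squad_it = load_dataset("squad_it", split="train")
Trainer(
    model=model,
    args=TrainingArguments(output_dir="qa-phase2-it", num_train_epochs=2,
                           warmup_ratio=0.0, learning_rate=3e-5),
    train_dataset=squad_it.map(prepare_features, batched=True,
                               remove_columns=squad_it.column_names),
).train()
```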
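The second changed line refers to compressing the embedding layer to a monolingual size. mDeBERTa-v3's roughly 250k-token embedding matrix accounts for a large share of its parameters, so keeping only the embedding rows for tokens that actually occur in Italian text shrinks the checkpoint substantially. The sketch below shows the general vocabulary-pruning technique under that assumption; the exact procedure used for deberta-base-italian is not specified in this commit, and the choice of kept token ids here is a dummy stand-in.

```python
# Minimal sketch of embedding-matrix pruning (general technique; not
# necessarily the author's exact procedure).
import torch
from transformers import AutoModelForQuestionAnswering, AutoTokenizer

model = AutoModelForQuestionAnswering.from_pretrained("microsoft/mdeberta-v3-base")
tokenizer = AutoTokenizer.from_pretrained("microsoft/mdeberta-v3-base")

# ASSUMPTION: in practice, kept_ids would be the token ids observed over a
# large Italian corpus; here a single sentence stands in as a dummy example.
kept_ids = sorted(set(tokenizer("Questo è solo un esempio.")["input_ids"]))

# Copy the selected rows of the old embedding matrix into a smaller one.
old_emb = model.get_input_embeddings()
new_emb = torch.nn.Embedding(len(kept_ids), old_emb.embedding_dim)
new_emb.weight.data = old_emb.weight.data[kept_ids].clone()
model.set_input_embeddings(new_emb)
model.config.vocab_size = len(kept_ids)
# The tokenizer's vocabulary must then be rebuilt so that its token ids map
# onto the new, smaller set of embedding rows.
```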