---
license: mit
language:
- it
datasets:
- squad_it
widget:
- text: Quale libro fu scritto da Alessandro Manzoni?
  context: Alessandro Manzoni pubblicò la prima versione dei Promessi Sposi nel 1827
- text: In quali competizioni gareggia la Ferrari?
  context: La Scuderia Ferrari è una squadra corse italiana di Formula 1 con sede a Maranello
- text: Quale sport è riferito alla Serie A?
  context: Il campionato di Serie A è la massima divisione professionistica del campionato italiano di calcio maschile
model-index:
- name: osiria/deberta-italian-question-answering
  results:
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: squad_it
      type: squad_it
    metrics:
    - type: exact-match
      value: 0.7004
      name: Exact Match
    - type: f1
      value: 0.8097
      name: F1
pipeline_tag: question-answering
---

--------------------------------------------------------------------------------------------------

<body>
<span class="vertical-text" style="background-color:lightgreen;border-radius: 3px;padding: 3px;"> </span>
<br>
<span class="vertical-text" style="background-color:orange;border-radius: 3px;padding: 3px;">    Task: Question Answering</span>
<br>
<span class="vertical-text" style="background-color:lightblue;border-radius: 3px;padding: 3px;">    Model: DeBERTa</span>
<br>
<span class="vertical-text" style="background-color:tomato;border-radius: 3px;padding: 3px;">    Lang: IT</span>
<br>
<span class="vertical-text" style="background-color:lightgrey;border-radius: 3px;padding: 3px;">  </span>
<br>
<span class="vertical-text" style="background-color:#CF9FFF;border-radius: 3px;padding: 3px;"> </span>
</body>

--------------------------------------------------------------------------------------------------

<h3>Model description</h3>

This is a <b>DeBERTa</b> <b>[1]</b> model for the <b>Italian</b> language, fine-tuned for <b>Extractive Question Answering</b> on the [SQuAD-IT](https://huggingface.co/datasets/squad_it) dataset <b>[2]</b>.
The model is trained with a multi-phase fine-tuning procedure, described in the sections below. The latest upgrade, code-named <b>LITEQA</b>, offers increased robustness and maintains strong results even in uncased settings.

<h3>Training and Performance</h3>

The model is trained to perform question answering given a context and a question, under the assumption that the context contains the answer to the question. It has been fine-tuned for Extractive Question Answering on the SQuAD-IT dataset for 2 epochs, with a linearly decaying learning rate starting from 3e-5, a maximum sequence length of 384 and a document stride of 128.
<br>The dataset includes 54,159 training instances and 7,609 test instances.
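
To illustrate how a maximum sequence length of 384 and a document stride of 128 interact, here is a minimal sketch that splits a long context into overlapping windows. It assumes that the stride is the number of tokens shared by two consecutive windows (as with the `stride` argument of Hugging Face tokenizers), and it ignores the question and special tokens, which in practice also take up part of the 384 positions. The function name and the toy tokens are hypothetical.

```python
from typing import List


def split_into_windows(tokens: List[str], max_len: int = 384, stride: int = 128) -> List[List[str]]:
    """Split a token list into windows of at most `max_len` tokens.

    Assumption: `stride` is the overlap (in tokens) between consecutive windows.
    """
    if not 0 <= stride < max_len:
        raise ValueError("stride must be non-negative and smaller than max_len")
    step = max_len - stride  # how far the window start advances each time
    windows = []
    start = 0
    while True:
        windows.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):
            break  # the current window already reaches the end of the text
        start += step
    return windows


# Toy context of 1000 "tokens" (hypothetical data)
context_tokens = [f"tok{i}" for i in range(1000)]
windows = split_into_windows(context_tokens)
print([len(w) for w in windows])  # [384, 384, 384, 232]
```

With this overlap, an answer that would be cut at the edge of one window can still appear whole in the next one.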

<b>update: version 2.0</b>

Version 2.0 further improves performance through a two-phase fine-tuning strategy: the model is first fine-tuned on the English SQuAD v2 (1 epoch, 20% warmup ratio, maximum learning rate of 3e-5), then further fine-tuned on the Italian SQuAD (2 epochs, no warmup, initial learning rate of 3e-5).
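
The two learning-rate settings above can be sketched with a single function. This is an illustration only: it assumes a linear warmup followed by a linear decay to zero (the shape produced by `get_linear_schedule_with_warmup` in `transformers`), and the total number of steps is a made-up value, since the card does not state it.

```python
def linear_schedule(step: int, total_steps: int, peak_lr: float = 3e-5, warmup_ratio: float = 0.0) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay to zero."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup: grow linearly from 0 to peak_lr
        return peak_lr * step / warmup_steps
    # Decay: shrink linearly from peak_lr to 0 over the remaining steps
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


TOTAL_STEPS = 1000  # hypothetical value, not taken from the card

# Phase 1 (English SQuAD v2): 20% warmup, peak 3e-5
phase1 = [linear_schedule(s, TOTAL_STEPS, warmup_ratio=0.2) for s in (0, 100, 200, 600, 1000)]
# Phase 2 (Italian SQuAD): no warmup, starts directly at 3e-5
phase2 = [linear_schedule(s, TOTAL_STEPS) for s in (0, 500, 1000)]
print(phase1)
print(phase2)
```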

To maximize the benefits of this multilingual procedure, [mdeberta-v3-base](https://huggingface.co/microsoft/mdeberta-v3-base) is used as the pre-trained model. Once the double fine-tuning is complete, the embedding layer is compressed as in [deberta-base-italian](https://huggingface.co/osiria/deberta-base-italian) to obtain a model of monolingual size.
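
As a rough illustration of what compressing an embedding layer can mean, the sketch below keeps only the embedding rows of a chosen subset of tokens and renumbers the vocabulary accordingly. This is an assumption about the general idea, not the procedure actually used for deberta-base-italian (which is documented in that model's own card); the function name, the toy vocabulary and the token selection are all hypothetical.

```python
from typing import Dict, List, Tuple


def prune_embeddings(
    vocab: Dict[str, int],
    embeddings: List[List[float]],
    keep_tokens: List[str],
) -> Tuple[Dict[str, int], List[List[float]]]:
    """Keep only the rows of `embeddings` for `keep_tokens` and rebuild the vocabulary."""
    new_vocab: Dict[str, int] = {}
    new_embeddings: List[List[float]] = []
    for token in keep_tokens:
        if token in vocab and token not in new_vocab:
            new_vocab[token] = len(new_embeddings)  # new, compact token id
            new_embeddings.append(embeddings[vocab[token]])
    return new_vocab, new_embeddings


# Toy multilingual vocabulary with 2-dimensional embeddings (hypothetical data)
vocab = {"[PAD]": 0, "ciao": 1, "hello": 2, "mondo": 3, "world": 4}
embeddings = [[0.0, 0.0], [0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]

small_vocab, small_embeddings = prune_embeddings(vocab, embeddings, ["[PAD]", "ciao", "mondo"])
print(small_vocab)       # {'[PAD]': 0, 'ciao': 1, 'mondo': 2}
print(small_embeddings)  # [[0.0, 0.0], [0.1, 0.2], [0.5, 0.6]]
```

The transformer layers are untouched in such a scheme; only the embedding matrix and the tokenizer vocabulary shrink.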

The performance on the test set is reported in the following tables:

(<b>version 2.0</b> performance)

<br>

<b>Cased setting:</b>

| EM    | F1    |
| ----- | ----- |
| 70.04 | 80.97 |

<b>Uncased setting:</b>

| EM    | F1    |
| ----- | ----- |
| 68.55 | 80.11 |

Testing notebook: https://huggingface.co/osiria/deberta-italian-question-answering/blob/main/osiria_deberta_italian_qa_evaluation.ipynb
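
For readers unfamiliar with the two metrics, here is a minimal sketch of SQuAD-style Exact Match and token-level F1 for a single prediction. The normalization (lowercasing, punctuation removal, whitespace collapsing) is an assumption: the official English SQuAD script also strips English articles, and the testing notebook may normalize differently. The reported scores are such values averaged over the test set and expressed as percentages.

```python
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop ASCII punctuation and collapse whitespace (assumed normalization)."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    return " ".join(text.split())


def exact_match(prediction: str, truth: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(truth))


def f1_score(prediction: str, truth: str) -> float:
    """Harmonic mean of token precision and recall between prediction and truth."""
    pred_tokens = normalize(prediction).split()
    truth_tokens = normalize(truth).split()
    if not pred_tokens or not truth_tokens:
        return float(pred_tokens == truth_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(truth_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)


print(exact_match("Milano.", "milano"))         # 1.0
print(exact_match("nel 1827", "1827"))          # 0.0
print(round(f1_score("nel 1827", "1827"), 4))   # 0.6667
```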

<b>update: version 3.0 (LITEQA)</b>

Version 3.0, nicknamed LITEQA, further improves performance through a three-phase fine-tuning strategy: the model is first fine-tuned on the English SQuAD v2 (1 epoch, 20% warmup ratio, maximum learning rate of 3e-5), then on the Italian SQuAD (2 epochs, no warmup, initial learning rate of 3e-5), and lastly on the lowercased Italian SQuAD (1 epoch, no warmup, initial learning rate of 3e-5).
This makes the model more robust in general, and particularly in uncased settings.
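
A lowercased copy of a SQuAD-style example could be produced along these lines. This is only a sketch: it assumes the usual `context` / `question` / `answers` layout of SQuAD-like datasets on the Hugging Face Hub, and the function name is hypothetical. The offset check matters because lowercasing can, for a few Unicode characters, change the length of a string.

```python
from typing import Any, Dict


def lowercase_example(example: Dict[str, Any]) -> Dict[str, Any]:
    """Return a lowercased copy of a SQuAD-style example, checking answer offsets."""
    context = example["context"].lower()
    texts = [text.lower() for text in example["answers"]["text"]]
    starts = list(example["answers"]["answer_start"])
    # Make sure every answer is still found at its original character offset
    for text, start in zip(texts, starts):
        if context[start:start + len(text)] != text:
            raise ValueError("lowercasing shifted an answer offset")
    return {
        "context": context,
        "question": example["question"].lower(),
        "answers": {"text": texts, "answer_start": starts},
    }


example = {
    "context": "Alessandro Manzoni è nato a Milano nel 1785",
    "question": "Dove è nato Manzoni?",
    "answers": {"text": ["Milano"], "answer_start": [28]},
}
lowered = lowercase_example(example)
print(lowered["context"])          # alessandro manzoni è nato a milano nel 1785
print(lowered["answers"]["text"])  # ['milano']
```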

Version 3.0 can be downloaded from the <b>liteqa</b> branch of this repo.
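
One way to load that branch, assuming the `transformers` library is installed, is to pass the branch name through the `revision` argument of `from_pretrained`. The helper function below is hypothetical and only wraps the two calls; it needs network access when invoked.

```python
MODEL_ID = "osiria/deberta-italian-question-answering"
REVISION = "liteqa"  # branch that hosts version 3.0, as stated above


def load_liteqa():
    """Download the tokenizer and model from the liteqa branch."""
    # Imported lazily so that defining this helper does not require transformers
    from transformers import DebertaV2ForQuestionAnswering, DebertaV2TokenizerFast

    tokenizer = DebertaV2TokenizerFast.from_pretrained(MODEL_ID, revision=REVISION)
    model = DebertaV2ForQuestionAnswering.from_pretrained(MODEL_ID, revision=REVISION)
    return tokenizer, model
```

Calling `tokenizer, model = load_liteqa()` then yields objects that can be used wherever the default-branch model would be.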

The performance on the test set is reported in the following tables:

(<b>version 3.0</b> performance)

<br>

<b>Cased setting:</b>

| EM    | F1    |
| ----- | ----- |
| 70.19 | 81.01 |

<b>Uncased setting:</b>

| EM    | F1    |
| ----- | ----- |
| 69.60 | 80.74 |

Testing notebook: https://huggingface.co/osiria/deberta-italian-question-answering/blob/liteqa/osiria_liteqa_evaluation.ipynb

<h3>Quick usage</h3>

For best results, it is recommended to use the following pipeline, which strips leading and trailing whitespace and punctuation from the predicted answer and adjusts its character offsets accordingly:

```python
import re

from transformers import DebertaV2ForQuestionAnswering, DebertaV2TokenizerFast
from transformers.pipelines import QuestionAnsweringPipeline

tokenizer = DebertaV2TokenizerFast.from_pretrained("osiria/deberta-italian-question-answering")
model = DebertaV2ForQuestionAnswering.from_pretrained("osiria/deberta-italian-question-answering")


class OsiriaQA(QuestionAnsweringPipeline):

    def __init__(self, punctuation=r',;.:!?()[\]{}', **kwargs):
        QuestionAnsweringPipeline.__init__(self, **kwargs)
        # Regexes matching leading / trailing runs of whitespace and punctuation
        self.post_regex_left = r"^[\s" + punctuation + "]+"
        self.post_regex_right = r"[\s" + punctuation + "]+$"

    def postprocess(self, output):
        output = QuestionAnsweringPipeline.postprocess(self, model_outputs=output)
        # Strip the left side and move "start" forward by the number of removed characters
        output_length = len(output["answer"])
        output["answer"] = re.sub(self.post_regex_left, "", output["answer"])
        output["start"] = output["start"] + (output_length - len(output["answer"]))
        # Strip the right side and move "end" backward by the number of removed characters
        output_length = len(output["answer"])
        output["answer"] = re.sub(self.post_regex_right, "", output["answer"])
        output["end"] = output["end"] - (output_length - len(output["answer"]))
        return output


pipeline_qa = OsiriaQA(model=model, tokenizer=tokenizer)
pipeline_qa(context="Alessandro Manzoni è nato a Milano nel 1785",
            question="Dove è nato Manzoni?")

# {'score': 0.9899800419807434, 'start': 28, 'end': 34, 'answer': 'Milano'}
```
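
The offset bookkeeping done in `postprocess` can be tried in isolation, without downloading the model. The sketch below reuses the same punctuation set and regular expressions on a hand-written span; the `trim_span` helper and the example values are hypothetical.

```python
import re
from typing import Tuple

PUNCTUATION = r",;.:!?()[\]{}"
LEFT = re.compile(r"^[\s" + PUNCTUATION + "]+")
RIGHT = re.compile(r"[\s" + PUNCTUATION + "]+$")


def trim_span(answer: str, start: int, end: int) -> Tuple[str, int, int]:
    """Strip whitespace/punctuation from both ends and shift the offsets to match."""
    stripped_left = LEFT.sub("", answer)
    start += len(answer) - len(stripped_left)  # characters removed on the left
    stripped = RIGHT.sub("", stripped_left)
    end -= len(stripped_left) - len(stripped)  # characters removed on the right
    return stripped, start, end


# A raw 10-character span covering offsets 26..36 of some context (made-up values)
print(trim_span(" (Milano),", 26, 36))  # ('Milano', 28, 34)
```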

You can also try the model online using this web app: https://huggingface.co/spaces/osiria/deberta-italian-question-answering

<h3>References</h3>

[1] https://arxiv.org/abs/2111.09543

[2] https://link.springer.com/chapter/10.1007/978-3-030-03840-3_29

<h3>Limitations</h3>

This model was trained on the English SQuAD v2 and on SQuAD-IT, which is mainly a machine-translated version of the original SQuAD v1.1. The quality of the training set is therefore limited by the quality of the machine translation.
Moreover, the model is meant to answer questions under the assumption that the required information is actually contained in the given context (the underlying assumption of SQuAD v1.1).
If this assumption is violated, the model will still try to return an answer, which will be incorrect.
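
One possible mitigation, offered here only as a sketch, is to discard predictions whose confidence score is low. The helper name and the 0.5 threshold are arbitrary assumptions, not part of the model or of its evaluation, and the score is not guaranteed to be well calibrated, so a suitable threshold would have to be tuned on held-out data.

```python
from typing import Any, Dict, Optional


def answer_or_none(prediction: Dict[str, Any], min_score: float = 0.5) -> Optional[str]:
    """Return the predicted answer, or None when the score is below `min_score`."""
    if prediction["score"] < min_score:
        return None
    return prediction["answer"]


# Dictionaries shaped like the pipeline output; the second one is made up
confident = {"score": 0.9899800419807434, "start": 28, "end": 34, "answer": "Milano"}
doubtful = {"score": 0.02, "start": 0, "end": 10, "answer": "Alessandro"}

print(answer_or_none(confident))  # Milano
print(answer_or_none(doubtful))   # None
```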

<h3>License</h3>

The model is released under the <b>MIT</b> license.