metadata

base_model: google/mt5-xxl
license: apache-2.0
language:
  - it

modafact-ita

modafact-ita is a sequence-to-sequence model for joint event Factuality and Modality detection in Italian. It was fine-tuned from mT5-xxl on ModaFact, a dataset manually annotated with Factuality and Modality values.

Model Details

Model Description

Developed by: DH Group @ FBK
Model type: Sequence-to-sequence
Language(s) (NLP): Italian
License: apache-2.0
Finetuned from model: google/mt5-xxl

Model Sources

Inference script: to run inference with the model, please refer to our GitHub repo.
Paper: ModaFact: Multi-paradigm Evaluation for Joint Event Modality and Factuality Detection

Uses

The model detects event Factuality and Modality values. To tag your own text, please refer to the inference script in our GitHub repo. The model takes one sentence at a time as input, for example:

Per chiarire la questione la Santa Sede autorizzò il prelievo di campioni del legno che vennero datati attraverso l'utilizzo del metodo del carbonio-14.

and outputs a sequence of span=label pairs separated by " | ", in this format:

chiarire=POSSIBLE-POS-FUTURE-FINAL | autorizzò=CERTAIN-POS-PRESENT/PAST | prelievo=UNDERSPECIFIED-POS-FUTURE-CONCESSIVE | datati=CERTAIN-POS-PRESENT/PAST | utilizzo=CERTAIN-POS-PRESENT/PAST
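For downstream processing, the output string can be split back into (span, label) pairs. The following is a minimal sketch, not part of the official inference script: the function name `parse_modafact_output` is hypothetical, and it assumes only what the example above shows, namely that pairs are separated by `|` and that each pair has the form `span=LABEL`. It does not attempt to interpret the internal structure of the labels.

```python
from typing import List, Tuple


def parse_modafact_output(output: str) -> List[Tuple[str, str]]:
    """Split a 'span=LABEL | span=LABEL' string into (span, label) pairs.

    Hypothetical helper: assumes the separator and pair format shown in
    the example output, nothing more.
    """
    pairs = []
    for chunk in output.split("|"):
        chunk = chunk.strip()
        if not chunk:
            continue  # tolerate empty output or a trailing separator
        # Split on the last '=' so a span that itself contains '=' stays intact.
        span, sep, label = chunk.rpartition("=")
        if not sep:
            continue  # skip malformed chunks that have no '=' at all
        pairs.append((span.strip(), label.strip()))
    return pairs


example = (
    "chiarire=POSSIBLE-POS-FUTURE-FINAL | autorizzò=CERTAIN-POS-PRESENT/PAST | "
    "prelievo=UNDERSPECIFIED-POS-FUTURE-CONCESSIVE | datati=CERTAIN-POS-PRESENT/PAST | "
    "utilizzo=CERTAIN-POS-PRESENT/PAST"
)

for span, label in parse_modafact_output(example):
    print(f"{span}\t{label}")
```

Because generated text is not guaranteed to be well-formed, the sketch silently skips chunks without an `=` rather than raising an error; stricter handling may be preferable when evaluating the model.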

Training Details

Training Data

The model was fine-tuned on the ModaFact dataset: https://huggingface.co/datasets/dhfbk/modafact-ita

Citation

If you use or refer to ModaFact, please consider citing this paper:

@inproceedings{rovera-etal-2025-modafact,
    title = "{M}oda{F}act: Multi-paradigm Evaluation for Joint Event Modality and Factuality Detection",
    author = "Rovera, Marco  and
      Cristoforetti, Serena  and
      Tonelli, Sara",
    editor = "Rambow, Owen  and
      Wanner, Leo  and
      Apidianaki, Marianna  and
      Al-Khalifa, Hend  and
      Eugenio, Barbara Di  and
      Schockaert, Steven",
    booktitle = "Proceedings of the 31st International Conference on Computational Linguistics",
    month = jan,
    year = "2025",
    address = "Abu Dhabi, UAE",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.coling-main.425/",
    pages = "6378--6396",
    abstract = "Factuality and modality are two crucial aspects concerning events, since they convey the speaker's commitment to a situation in discourse as well as how this event is supposed to occur in terms of norms, wishes, necessity, duty and so on. Capturing them both is necessary to truly understand an utterance meaning and the speaker's perspective with respect to a mentioned event. Yet, NLP studies have mostly dealt with these two aspects separately, mainly devoting past efforts to the development of English datasets. In this work, we propose ModaFact, a novel resource with joint factuality and modality information for event-denoting expressions in Italian. We propose a novel annotation scheme, which however is consistent with existing ones, and compare different classification systems trained on ModaFact, as a preliminary step to the use of factuality and modality information in downstream tasks. The dataset and the best-performing model are publicly released and available under an open license."
}