license: cc-by-nc-sa-4.0

Inclusively Classification Model

This is a text classification model, fine-tuned from the Italian BERT model, that detects inclusive language in Italian sentences.

It has been trained to detect three classes:

inclusive: the sentence is inclusive (e.g. "Il personale docente e non docente")
not_inclusive: the sentence is not inclusive (e.g. "I professori")
not_pertinent: the sentence is not pertinent to the task (e.g. "La scuola è chiusa")
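A three-class classifier of this kind typically emits one raw score (logit) per class, which is turned into a label by taking the highest softmax probability. The sketch below illustrates that step in plain Python. The label order in `ID2LABEL` and the example logits are assumptions made for illustration; the actual mapping is defined by the model's own configuration and may differ.

```python
import math

# Assumed label order, for illustration only; check the model's config.
ID2LABEL = {0: "inclusive", 1: "not_inclusive", 2: "not_pertinent"}


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits):
    """Return the most likely label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


# Hypothetical logits for a single sentence.
label, score = decode([0.2, 2.1, -1.0])
print(label, round(score, 3))
```

The returned probability can serve as a rough confidence value, for example to route low-confidence sentences to manual review.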

Training data

The model has been trained on a dataset containing:

The data was manually annotated by experts in inclusive language. The dataset is not publicly available yet.

The model has been fine-tuned from the Italian BERT model using the following hyperparameters:

The model was evaluated on the test set, alongside several baselines, with the following results:

Model	Accuracy	Inclusive F1	Not inclusive F1	Not pertinent F1
TF-IDF + MLP	0.68	0.63	0.69	0.66
TF-IDF + SVM	0.61	0.53	0.60	0.78
TF-IDF + GB	0.74	0.74	0.76	0.72
Multilingual model	0.86	0.88	0.89	0.83
This model	0.89	0.88	0.92	0.85

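Compared with a multilingual model trained on the same data, this model reaches higher accuracy (0.89 vs. 0.86) and higher F1 on the not inclusive (0.92 vs. 0.89) and not pertinent (0.85 vs. 0.83) classes, and matches it on the inclusive class (0.88).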
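For readers who want to reproduce this kind of report on their own labelled data, the sketch below computes accuracy and a per-class F1 score in plain Python. It assumes the standard definition of F1 (harmonic mean of precision and recall, computed one class at a time); the card does not state exactly how its scores were computed, and the toy labels here are invented purely for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_for_class(y_true, y_pred, label):
    """F1 for one class, treating that class as the positive one."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy gold labels and predictions (not taken from the real test set).
y_true = ["inclusive", "inclusive", "not_inclusive",
          "not_inclusive", "not_pertinent", "not_pertinent"]
y_pred = ["inclusive", "not_inclusive", "not_inclusive",
          "not_inclusive", "not_pertinent", "inclusive"]

acc = accuracy(y_true, y_pred)
scores = {c: f1_for_class(y_true, y_pred, c)
          for c in ("inclusive", "not_inclusive", "not_pertinent")}
print(round(acc, 2), {c: round(s, 2) for c, s in scores.items()})
```

Reporting F1 per class, as the table does, makes it visible when a model does well overall but is weaker on one class.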

If you use this model, please make sure to cite the following papers:

Demo paper:

Main paper: