---
license: cc-by-nc-sa-4.0
---

# Inclusively Classification Model

This model classifies inclusive language in Italian. It is fine-tuned from the [Italian BERT model](https://huggingface.co/dbmdz/bert-base-italian-xxl-cased).

It has been trained to detect three classes:

- `inclusive`: the sentence is inclusive (e.g. "Il personale docente e non docente")
- `not_inclusive`: the sentence is not inclusive (e.g. "I professori")
- `not_pertinent`: the sentence is not pertinent to the task (e.g. "La scuola è chiusa")
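
As a minimal sketch of how the three classes can be read off a classifier's raw scores, the snippet below applies a softmax and picks the most probable label. The index order in `ID2LABEL` and the example logits are assumptions made for illustration; the actual mapping is whatever the checkpoint's configuration defines.

```python
import math

# Assumed label order; the real mapping lives in the checkpoint's config.
ID2LABEL = {0: "inclusive", 1: "not_inclusive", 2: "not_pertinent"}


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


# Illustrative logits, not real model output.
label, prob = decode([0.2, 2.9, -1.1])
print(label, round(prob, 3))
```

Reporting the probability alongside the label makes it possible to apply a confidence threshold before acting on a prediction.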

## Training data

The model has been trained on a dataset containing:

- 8580 training sentences
- 1073 validation sentences
- 1072 test sentences

The data have been manually annotated by experts in the field of inclusive language. The dataset is not publicly available yet.
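
The split sizes above correspond to roughly an 80/10/10 partition. The short check below only does the arithmetic on the reported counts; the variable names are illustrative.

```python
# Split sizes as reported in this card.
splits = {"train": 8580, "validation": 1073, "test": 1072}

total = sum(splits.values())
shares = {name: count / total for name, count in splits.items()}

print(f"total sentences: {total}")
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```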

## Training procedure

The model has been fine-tuned from the [Italian BERT model](https://huggingface.co/dbmdz/bert-base-italian-xxl-cased) using the following hyperparameters:

- `max_length`: 128
- `batch_size`: 128
- `learning_rate`: 5e-5
- `warmup_steps`: 500
- `epochs`: 10 (best model is selected based on validation accuracy)
- `optimizer`: AdamW
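
To put the hyperparameters in context, the sketch below estimates the number of optimizer steps and the share of training spent in warmup. It assumes one optimizer step per batch (single device, no gradient accumulation), that the last partial batch is kept, and that warmup is linear. None of these details are stated in this card, and the schedule after warmup is unknown, so the helper simply holds the peak rate.

```python
import math

# Values reported in this card.
TRAIN_SENTENCES = 8580
BATCH_SIZE = 128
EPOCHS = 10
WARMUP_STEPS = 500
LEARNING_RATE = 5e-5

# Assumption: one optimizer step per batch, last partial batch kept.
steps_per_epoch = math.ceil(TRAIN_SENTENCES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS
warmup_share = WARMUP_STEPS / total_steps


def warmup_lr(step, peak_lr=LEARNING_RATE, warmup_steps=WARMUP_STEPS):
    """Linear warmup from 0 to peak_lr (assumed shape).

    The post-warmup schedule is not stated in the card, so the rate is
    held at peak_lr here.
    """
    return peak_lr * min(1.0, step / warmup_steps)


print(f"steps per epoch: {steps_per_epoch}")
print(f"total steps: {total_steps}")
print(f"warmup share of training: {warmup_share:.1%}")
```

Under these assumptions the warmup phase covers a large part of the run, which is worth keeping in mind when reusing the settings with a different batch size or dataset size.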

## Evaluation results

The model has been evaluated on the test set and compared with several baselines:

| Model | Accuracy | Inclusive F1 | Not inclusive F1 | Not pertinent F1 |
|-------|----------|--------------|------------------|------------------|
| TF-IDF + MLP | 0.68 | 0.63 | 0.69 | 0.66 |
| TF-IDF + SVM | 0.61 | 0.53 | 0.60 | 0.78 |
| TF-IDF + GB | 0.74 | 0.74 | 0.76 | 0.72 |
| multilingual | 0.86 | 0.88 | 0.89 | 0.83 |
| **This model** | 0.89 | 0.88 | 0.92 | 0.85 |

Compared with a multilingual model trained on the same data, this model reaches higher accuracy (0.89 vs. 0.86) and an equal or higher F1 score on every class.
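
For reference, accuracy and per-class F1 can be computed from gold and predicted labels as sketched below. The helper names and the six toy labels are illustrative only; they are not the evaluation script or data behind the table.

```python
def per_class_f1(y_true, y_pred, labels):
    """F1 per label, using F1 = 2*TP / (2*TP + FP + FN)."""
    pairs = list(zip(y_true, y_pred))
    scores = {}
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


LABELS = ["inclusive", "not_inclusive", "not_pertinent"]

# Toy labels for illustration only, not taken from the test set.
y_true = ["inclusive", "inclusive", "not_inclusive",
          "not_inclusive", "not_pertinent", "not_pertinent"]
y_pred = ["inclusive", "not_inclusive", "not_inclusive",
          "not_inclusive", "not_pertinent", "inclusive"]

f1 = per_class_f1(y_true, y_pred, LABELS)
acc = accuracy(y_true, y_pred)
print(acc, f1)
```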

## Citation

If you use this model, please make sure to cite the following papers:

**Demo paper**:

```bibtex

```

**Main paper**:

```bibtex

```