---
license: apache-2.0
metrics:
- accuracy
- f1
pipeline_tag: text-classification
tags:
- irony
language:
- it
---

# Irony at aequa-tech

## Model Description

- **Developed by:** [aequa-tech](https://aequa-tech.com/)
- **Funded by:** [NGI-Search](https://www.ngi.eu/ngi-projects/ngi-search/)
- **Language(s) (NLP):** Italian
- **License:** apache-2.0
- **Finetuned from model:** [AlBERTo](https://huggingface.co/m-polignano-uniba/bert_uncased_L-12_H-768_A-12_italian_alberto)

This model is a fine-tuned version of the Italian [AlBERTo](https://huggingface.co/m-polignano-uniba/bert_uncased_L-12_H-768_A-12_italian_alberto) model for **irony detection**.

# Training Details

## Training Data

- [IronITA 2018](https://live.european-language-grid.eu/catalogue/corpus/7372)
- [Sarcastic Hate Speech dataset](https://github.com/simonasnow/Sarcastic-Hate-Speech)
- SENTIPOLC [2014](https://live.european-language-grid.eu/catalogue/corpus/7480)/[2016](https://live.european-language-grid.eu/catalogue/corpus/7479)
- [Debunker-Assistant corpus](https://github.com/AequaTech/DebunkerAssistant/tree/main/evaluation/training_datasets)

## Training Hyperparameters

- learning_rate: 2e-5
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam

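
As a minimal sketch, the hyperparameters listed above could be collected in a small config object and mapped onto keyword names accepted by `transformers.TrainingArguments`. The names `IronyTrainingConfig` and `to_training_kwargs` are hypothetical, and the card does not state whether the Hugging Face `Trainer` was used. Values the card does not report, such as the number of epochs, are deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IronyTrainingConfig:
    """Hyperparameters reported in the model card (hypothetical container)."""
    learning_rate: float = 2e-5
    train_batch_size: int = 16
    eval_batch_size: int = 16
    seed: int = 42
    optimizer: str = "Adam"


def to_training_kwargs(cfg: IronyTrainingConfig) -> dict:
    """Map the reported values onto TrainingArguments keyword names.

    Assumption: the batch sizes are per device. The optimizer is not mapped,
    because the Trainer default is AdamW rather than plain Adam.
    """
    return {
        "learning_rate": cfg.learning_rate,
        "per_device_train_batch_size": cfg.train_batch_size,
        "per_device_eval_batch_size": cfg.eval_batch_size,
        "seed": cfg.seed,
    }


kwargs = to_training_kwargs(IronyTrainingConfig())
print(kwargs)
```

The resulting dictionary could then be unpacked into `TrainingArguments` together with an output directory of your choice.
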
# Evaluation

## Testing Data

The model was evaluated on the IronITA test set, with the results reported in the next section.

## Metrics and Results

- macro F1: 0.79
- accuracy: 0.79
- precision of positive class: 0.77
- recall of positive class: 0.84
- F1 of positive class: 0.80

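
The positive-class figures above are mutually consistent: F1 is the harmonic mean of precision and recall. The short check below recomputes it from the reported, rounded values. The helper name `f1_score` is chosen for illustration and is not taken from the evaluation code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the card for the positive class.
precision_pos = 0.77
recall_pos = 0.84

f1_pos = f1_score(precision_pos, recall_pos)
print(round(f1_pos, 2))  # 0.8
```
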
# Framework versions

- Transformers 4.30.2
- PyTorch 2.1.2
- Datasets 2.19.0
- Accelerate 0.30.0

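
To compare a local environment against the versions listed above, a sketch along these lines may help. It uses only the standard library; the function names are hypothetical, and the PyPI distribution names (`torch` for PyTorch, for example) are assumed to match a standard `pip` installation.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this card, keyed by PyPI distribution name.
CARD_VERSIONS = {
    "transformers": "4.30.2",
    "torch": "2.1.2",
    "datasets": "2.19.0",
    "accelerate": "0.30.0",
}


def installed_version(package: str):
    """Return the installed version string, or None if the package is missing."""
    try:
        # Drop local build suffixes such as "+cu121" before comparing.
        return version(package).split("+")[0]
    except PackageNotFoundError:
        return None


def compare_with_card(expected: dict) -> dict:
    """Map each differing package to a pair (version in card, version installed)."""
    report = {}
    for package, wanted in expected.items():
        found = installed_version(package)
        if found != wanted:
            report[package] = (wanted, found)
    return report


for package, (wanted, found) in compare_with_card(CARD_VERSIONS).items():
    print(f"{package}: card lists {wanted}, installed {found}")
```

A mismatch does not necessarily mean the model will fail to load; the list only records the environment reported in this card.
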
# How to use this model

```Python
from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

# Load the fine-tuned classifier and the AlBERTo tokenizer
model = AutoModelForSequenceClassification.from_pretrained("aequa-tech/irony-it", num_labels=2)
tokenizer = AutoTokenizer.from_pretrained("m-polignano-uniba/bert_uncased_L-12_H-768_A-12_italian_alb3rt0")

# Build a text-classification pipeline and classify an Italian sentence
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
classifier("Prendi una gioia. Ora posala, che non è tua.")
```
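
A `text-classification` pipeline returns a list of dictionaries with `label` and `score` keys. The sketch below turns such output into a boolean flag. It assumes the positive (ironic) class is exposed as `LABEL_1`, the default name when a model config defines no custom labels; check the model's `config.id2label` before relying on this. The helper `is_ironic` and the sample predictions are illustrative stand-ins, not real model output.

```python
def is_ironic(prediction: dict, positive_label: str = "LABEL_1", threshold: float = 0.5) -> bool:
    """Return True when the predicted label is the positive one and its score clears the threshold.

    Assumption: `positive_label` matches the label name used by the model config.
    """
    return prediction["label"] == positive_label and prediction["score"] >= threshold


# Hand-written stand-in for pipeline output, not real predictions.
example_output = [
    {"label": "LABEL_1", "score": 0.93},
    {"label": "LABEL_0", "score": 0.88},
]

flags = [is_ironic(p) for p in example_output]
print(flags)  # [True, False]
```
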