|
--- |
|
language: da |
|
tags: |
|
- danish |
|
- bert |
|
- sentiment |
|
- text-classification |
|
- Maltehb/danish-bert-botxo |
|
- Helsinki-NLP/opus-mt-en-da |
|
- go-emotion |
|
- Certainly |
|
license: cc-by-4.0 |
|
datasets: |
|
- go_emotions |
|
metrics: |
|
- Accuracy |
|
widget: |
|
- text: "Det er så sødt af dig at tænke på andre på den måde ved du det?" |
|
- text: "Jeg vil gerne have en playstation." |
|
- text: "Jeg elsker dig" |
|
- text: "Hvordan håndterer jeg min irriterende nabo?" |
|
--- |
|
|
|
# Danish-Bert-GoÆmotion |
|
|
|
Danish Go-Emotions classifier. [Maltehb/danish-bert-botxo](https://huggingface.co/Maltehb/danish-bert-botxo) (uncased) finetuned on a translation of the [go_emotions](https://huggingface.co/datasets/go_emotions) dataset using [Helsinki-NLP/opus-mt-en-da](https://huggingface.co/Helsinki-NLP/opus-mt-de-en). Thus, performance is obviousely dependent on the translation model. |
|
|
|
## Training |
|
- Translating the training data with MT: [Notebook](https://colab.research.google.com/github/RJuro/Da-HyggeBERT-finetuning/blob/main/HyggeBERT_translation_en_da.ipynb) |
|
- Fine-tuning danish-bert-botxo: coming soon... |
|
|
|
## Training Parameters: |
|
|
|
``` |
|
Num examples = 189900 |
|
Num Epochs = 3 |
|
Train batch = 8 |
|
Eval batch = 8 |
|
Learning Rate = 3e-5 |
|
Warmup steps = 4273 |
|
Total optimization steps = 71125 |
|
``` |
|
|
|
## Loss |
|
### Training loss |
|
![](wb_loss.png) |
|
|
|
### Eval. loss |
|
``` |
|
0.1178 (21100 examples) |
|
``` |
|
|
|
|
|
## Using the model with `transformers` |
|
Easiest use with `transformers` and `pipeline`: |
|
|
|
```python |
|
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline |
|
|
|
model = AutoModelForSequenceClassification.from_pretrained('RJuro/danish-bert-go-aemotion') |
|
tokenizer = AutoTokenizer.from_pretrained('RJuro/danish-bert-go-aemotion') |
|
|
|
classifier = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer) |
|
|
|
classifier('jeg elsker dig') |
|
``` |
|
|
|
`[{'label': 'kærlighed', 'score': 0.9634820818901062}]` |
|
|
|
## Using the model with `simpletransformers` |
|
|
|
```python |
|
from simpletransformers.classification import MultiLabelClassificationModel |
|
|
|
model = MultiLabelClassificationModel('bert', 'RJuro/danish-bert-go-aemotion') |
|
|
|
predictions, raw_outputs = model.predict(df['text']) |
|
``` |