---
language: da
tags:
- danish
- bert
- sentiment
- text-classification
- Maltehb/danish-bert-botxo
- Helsinki-NLP/opus-mt-en-da
- go-emotion
- Certainly
license: cc-by-4.0
datasets:
- go_emotions
metrics:
- Accuracy
widget:
- text: "Det er så sødt af dig at tænke på andre på den måde ved du det?"
- text: "Jeg vil gerne have en playstation."
- text: "Jeg elsker dig"
---

# Danish-Bert-GoÆmotion
Danish Go-Emotion classifier. [Maltehb/danish-bert-botxo](https://huggingface.co/Maltehb/danish-bert-botxo) (uncased) finetuned on a translation of the [go_emotions](https://huggingface.co/datasets/go_emotions) dataset produced with [Helsinki-NLP/opus-mt-en-da](https://huggingface.co/Helsinki-NLP/opus-mt-en-da). Performance is therefore only as good as the translation model.
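The translation step itself can be sketched with the `transformers` translation pipeline. The exact script used for this model is not published, so the helper below (including the batch size) is an illustrative assumption, not the actual preprocessing code:

```python
# from transformers import pipeline  # needed for the real translation model

def translate_corpus(texts, translate, batch_size=32):
    """Translate a list of texts in batches.

    `translate` is any callable that maps a list of strings to a list
    of dicts with a 'translation_text' key, such as the transformers
    translation pipeline.
    """
    translated = []
    for i in range(0, len(texts), batch_size):
        batch = list(texts[i:i + batch_size])
        translated.extend(r["translation_text"] for r in translate(batch))
    return translated

# en_da = pipeline("translation", model="Helsinki-NLP/opus-mt-en-da")
# danish_texts = translate_corpus(english_texts, en_da)
```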

## Training Parameters

```
Num examples             = 189900
Num epochs               = 3
Train batch size         = 8
Eval batch size          = 8
Learning rate            = 3e-5
Warmup steps             = 4273
Total optimization steps = 71125
```
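The warmup length above works out to roughly 6% of the total optimization steps:

```python
warmup_steps = 4273
total_steps = 71125

warmup_ratio = warmup_steps / total_steps
print(f"warmup ratio: {warmup_ratio:.1%}")  # warmup ratio: 6.0%
```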

## Loss

### Training loss

![Training loss](wb_loss.png)

### Evaluation loss

```
0.1178 (21100 examples)
```

## Using the model with `transformers`

The easiest way to use the model is via `transformers` and the `pipeline` API:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

model = AutoModelForSequenceClassification.from_pretrained('RJuro/danish-bert-go-aemotion')
tokenizer = AutoTokenizer.from_pretrained('RJuro/danish-bert-go-aemotion')

classifier = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer)

classifier('jeg elsker dig')
```

`[{'label': 'kærlighed', 'score': 0.9634820818901062}]` (`kærlighed` is Danish for "love")
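The `score` is a softmax probability over the emotion labels. A minimal sketch of that conversion, using hypothetical logits rather than actual model output:

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [4.1, 0.3, -1.2]  # hypothetical per-label scores
probs = softmax(logits)
print(probs)  # the highest probability goes to the first label
```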

## Using the model with `simpletransformers`

```python
from simpletransformers.classification import MultiLabelClassificationModel

model = MultiLabelClassificationModel('bert', 'RJuro/danish-bert-go-aemotion')

# `df` is assumed to be a pandas DataFrame with a 'text' column;
# predict expects a plain list of strings
predictions, raw_outputs = model.predict(df['text'].tolist())
```
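Here `predictions` holds the thresholded multi-label output and `raw_outputs` the per-label probabilities. A sketch of the thresholding step on made-up values (the label names and the 0.5 cut-off are assumptions for illustration):

```python
def threshold_labels(raw_outputs, labels, threshold=0.5):
    """Keep every label whose probability clears the threshold."""
    return [
        [label for label, p in zip(labels, row) if p >= threshold]
        for row in raw_outputs
    ]

labels = ["glæde", "kærlighed", "vrede"]        # hypothetical label names
raw = [[0.10, 0.96, 0.20], [0.70, 0.10, 0.60]]  # hypothetical raw outputs
print(threshold_labels(raw, labels))  # [['kærlighed'], ['glæde', 'vrede']]
```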