This is a RuBERT model fine-tuned for emotion classification of short Russian texts. The task is multi-label classification with the following labels:

0: admiration
1: amusement
2: anger
3: annoyance
4: approval
5: caring
6: confusion
7: curiosity
8: desire
9: disappointment
10: disapproval
11: disgust
12: embarrassment
13: excitement
14: fear
15: gratitude
16: grief
17: joy
18: love
19: nervousness
20: optimism
21: pride
22: realization
23: relief
24: remorse
25: sadness
26: surprise
27: neutral

Label to Russian label:

admiration: восхищение
amusement: веселье
anger: злость
annoyance: раздражение
approval: одобрение
caring: забота
confusion: непонимание
curiosity: любопытство
desire: желание
disappointment: разочарование
disapproval: неодобрение
disgust: отвращение
embarrassment: смущение
excitement: возбуждение
fear: страх
gratitude: признательность
grief: горе
joy: радость
love: любовь
nervousness: нервозность
optimism: оптимизм
pride: гордость
realization: осознание
relief: облегчение
remorse: раскаяние
sadness: грусть
surprise: удивление
neutral: нейтральность
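
The two lists above can be kept together in code. The following is a minimal sketch that copies them verbatim; the names `LABELS`, `EN_TO_RU`, and `to_russian` are illustrative and are not part of the model repository.

```python
from typing import Dict, List

# English labels in index order (0..27), copied from the list above.
LABELS: List[str] = [
    "admiration", "amusement", "anger", "annoyance", "approval", "caring",
    "confusion", "curiosity", "desire", "disappointment", "disapproval",
    "disgust", "embarrassment", "excitement", "fear", "gratitude", "grief",
    "joy", "love", "nervousness", "optimism", "pride", "realization",
    "relief", "remorse", "sadness", "surprise", "neutral",
]

# Russian names in the same order, copied from the mapping above.
RUSSIAN: List[str] = [
    "восхищение", "веселье", "злость", "раздражение", "одобрение", "забота",
    "непонимание", "любопытство", "желание", "разочарование", "неодобрение",
    "отвращение", "смущение", "возбуждение", "страх", "признательность",
    "горе", "радость", "любовь", "нервозность", "оптимизм", "гордость",
    "осознание", "облегчение", "раскаяние", "грусть", "удивление",
    "нейтральность",
]

# English label -> Russian label.
EN_TO_RU: Dict[str, str] = dict(zip(LABELS, RUSSIAN))


def to_russian(labels: List[str]) -> List[str]:
    """Translate predicted English labels to their Russian names."""
    return [EN_TO_RU[label] for label in labels]
```

For example, `to_russian(["love", "joy"])` returns `["любовь", "радость"]`.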

Usage

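from transformers import pipeline

# Load the fine-tuned classifier from the Hugging Face Hub
model = pipeline(model="seara/rubert-base-cased-ru-go-emotions")
# With default settings the pipeline returns only the highest-scoring label
model("Привет, ты мне нравишься!")  # "Hi, I like you!"
# [{'label': 'love', 'score': 0.5456761717796326}]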
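
Because the task is multi-label, it can be useful to inspect every label's score instead of only the top one. The sketch below shows one way to do that with the `top_k=None` and `function_to_apply="sigmoid"` parameters of the transformers text-classification pipeline. The 0.5 threshold and the helper names `select_labels` and `predict_emotions` are assumptions made for illustration, not settings published with this model.

```python
from typing import Dict, List, Union

# Model id taken from the usage example above.
MODEL_ID = "seara/rubert-base-cased-ru-go-emotions"


def select_labels(
    scores: List[Dict[str, Union[str, float]]], threshold: float = 0.5
) -> List[str]:
    """Keep every label whose score reaches the threshold.

    The 0.5 default is an assumed cut-off, not a tuned value.
    """
    return [item["label"] for item in scores if item["score"] >= threshold]


def predict_emotions(text: str, threshold: float = 0.5) -> List[str]:
    """Return all labels scoring at or above the threshold for one text."""
    # Imported here so select_labels can be used without transformers installed.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=MODEL_ID)
    # top_k=None returns a score for every label; sigmoid scores each label
    # independently, which fits a multi-label task.
    all_scores = classifier([text], top_k=None, function_to_apply="sigmoid")[0]
    return select_labels(all_scores, threshold)
```

Calling `predict_emotions("Привет, ты мне нравишься!")` downloads the model on first use. A lower threshold trades precision for recall, which may matter for the labels with low recall in the evaluation table.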

Dataset

This model was trained on ru_go_emotions, a Russian translation of the GoEmotions dataset.

An overview of the training data can be found on the dataset's Hugging Face card and in the GitHub repository.

Training

Training was done in this project with the following parameters:

tokenizer.max_length: null
batch_size: 32
optimizer: adam
lr: 0.00001
weight_decay: 0
num_epochs: 5

Eval results (on test split)

| label | precision | recall | f1-score | auc-roc | support |
|---|---|---|---|---|---|
| admiration | 0.66 | 0.66 | 0.66 | 0.93 | 504 |
| amusement | 0.79 | 0.81 | 0.8 | 0.97 | 264 |
| anger | 0.53 | 0.3 | 0.39 | 0.91 | 198 |
| annoyance | 0.0 | 0.0 | 0.0 | 0.82 | 320 |
| approval | 0.62 | 0.25 | 0.36 | 0.82 | 351 |
| caring | 0.69 | 0.13 | 0.22 | 0.86 | 135 |
| confusion | 0.56 | 0.18 | 0.28 | 0.92 | 153 |
| curiosity | 0.52 | 0.4 | 0.45 | 0.95 | 284 |
| desire | 0.67 | 0.24 | 0.35 | 0.89 | 83 |
| disappointment | 0.88 | 0.05 | 0.09 | 0.82 | 151 |
| disapproval | 0.56 | 0.17 | 0.26 | 0.88 | 267 |
| disgust | 0.83 | 0.2 | 0.33 | 0.92 | 123 |
| embarrassment | 0.0 | 0.0 | 0.0 | 0.88 | 37 |
| excitement | 0.78 | 0.14 | 0.23 | 0.9 | 103 |
| fear | 0.83 | 0.37 | 0.51 | 0.92 | 78 |
| gratitude | 0.94 | 0.9 | 0.92 | 0.99 | 352 |
| grief | 0.0 | 0.0 | 0.0 | 0.72 | 6 |
| joy | 0.7 | 0.4 | 0.51 | 0.94 | 161 |
| love | 0.77 | 0.81 | 0.79 | 0.97 | 238 |
| nervousness | 0.0 | 0.0 | 0.0 | 0.85 | 23 |
| optimism | 0.66 | 0.52 | 0.58 | 0.92 | 186 |
| pride | 0.0 | 0.0 | 0.0 | 0.76 | 16 |
| realization | 0.0 | 0.0 | 0.0 | 0.74 | 145 |
| relief | 0.0 | 0.0 | 0.0 | 0.72 | 11 |
| remorse | 0.58 | 0.68 | 0.63 | 0.99 | 56 |
| sadness | 0.58 | 0.44 | 0.5 | 0.92 | 156 |
| surprise | 0.62 | 0.45 | 0.52 | 0.91 | 141 |
| neutral | 0.72 | 0.47 | 0.57 | 0.84 | 1787 |
| micro avg | 0.7 | 0.42 | 0.53 | 0.94 | 6329 |
| macro avg | 0.52 | 0.31 | 0.36 | 0.88 | 6329 |
| weighted avg | 0.63 | 0.42 | 0.49 | 0.88 | 6329 |
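
To make the averaged rows easier to interpret, the sketch below recomputes the macro and weighted f1-score from the per-label rows. It assumes the usual definitions (macro: unweighted mean over labels; weighted: mean weighted by support) and uses the rounded values from the table, so it is a consistency check rather than the original evaluation code. The names `ROWS`, `macro_f1`, and `weighted_f1` are illustrative.

```python
from typing import Dict, Tuple

# label -> (f1-score, support), copied from the per-label rows above.
ROWS: Dict[str, Tuple[float, int]] = {
    "admiration": (0.66, 504), "amusement": (0.8, 264), "anger": (0.39, 198),
    "annoyance": (0.0, 320), "approval": (0.36, 351), "caring": (0.22, 135),
    "confusion": (0.28, 153), "curiosity": (0.45, 284), "desire": (0.35, 83),
    "disappointment": (0.09, 151), "disapproval": (0.26, 267),
    "disgust": (0.33, 123), "embarrassment": (0.0, 37),
    "excitement": (0.23, 103), "fear": (0.51, 78), "gratitude": (0.92, 352),
    "grief": (0.0, 6), "joy": (0.51, 161), "love": (0.79, 238),
    "nervousness": (0.0, 23), "optimism": (0.58, 186), "pride": (0.0, 16),
    "realization": (0.0, 145), "relief": (0.0, 11), "remorse": (0.63, 56),
    "sadness": (0.5, 156), "surprise": (0.52, 141), "neutral": (0.57, 1787),
}


def macro_f1(rows: Dict[str, Tuple[float, int]]) -> float:
    """Unweighted mean of per-label f1: every label counts equally."""
    return sum(f1 for f1, _ in rows.values()) / len(rows)


def weighted_f1(rows: Dict[str, Tuple[float, int]]) -> float:
    """Mean of per-label f1 weighted by support: frequent labels dominate."""
    total = sum(support for _, support in rows.values())
    return sum(f1 * support for f1, support in rows.values()) / total


print(round(macro_f1(ROWS), 2))     # 0.36, matches the "macro avg" row
print(round(weighted_f1(ROWS), 2))  # 0.49, matches the "weighted avg" row
```

The macro average is lower than the weighted average because the eight labels with an f1-score of 0.0 count as much as the frequent labels such as neutral and admiration.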

Model size: 178M params (Safetensors; tensor types: I64, F32)