File size: 4,131 Bytes
e29eabb
1611116
f664777
1611116
 
 
 
 
 
 
 
 
 
 
 
 
 
 
5009d15
 
 
 
 
 
 
 
 
 
 
 
 
46b6413
5009d15
 
 
 
 
 
 
 
 
 
 
 
 
16b199f
3a5a5c0
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
16b199f
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
---

license: mit
language: ["ru"]
tags:
- russian
- classification
- emotion
- emotion-detection
- emotion-recognition
- multiclass
widget:
- text: "Как дела?"
- text: "Дурак твой дед"
- text: "Только попробуй!!!"
- text: "Не хочу в школу("
- text: "Сейчас ровно час дня"
- text: "А ты уверен, что эти полоски снизу не врут? Точно уверен? Вот прям 100 процентов?"
datasets:
- Aniemore/cedr-m7
model-index:
- name: RuBERT tiny2 For Russian Text Emotion Detection by Ilya Lubenets
  results:
  - task:
      name: Multilabel Text Classification
      type: multilabel-text-classification
    dataset:
      name: CEDR M7
      type: Aniemore/cedr-m7
      args: ru
    metrics:
    - name: multilabel accuracy
      type: accuracy
      value: 85%
  - task:
      name: Text Classification
      type: text-classification
    dataset:
      name: CEDR M7
      type: Aniemore/cedr-m7
      args: ru
    metrics:
    - name: accuracy
      type: accuracy
      value: 76%

---

# First - you should prepare few functions to talk to model

```python
import torch
from transformers import BertForSequenceClassification, AutoTokenizer

LABELS = ['neutral', 'happiness', 'sadness', 'enthusiasm', 'fear', 'anger', 'disgust']
tokenizer = AutoTokenizer.from_pretrained('Aniemore/rubert-tiny2-russian-emotion-detection')
model = BertForSequenceClassification.from_pretrained('Aniemore/rubert-tiny2-russian-emotion-detection')

@torch.no_grad()
def predict_emotion(text: str) -> str:
    """
        We take the input text, tokenize it, pass it through the model, and then return the predicted label
        :param text: The text to be classified
        :type text: str
        :return: The predicted emotion
    """
    inputs = tokenizer(text, max_length=512, padding=True, truncation=True, return_tensors='pt')
    outputs = model(**inputs)
    predicted = torch.nn.functional.softmax(outputs.logits, dim=1)
    predicted = torch.argmax(predicted, dim=1).numpy()
        
    return LABELS[predicted[0]]

@torch.no_grad()    
def predict_emotions(text: str) -> list:
    """
        It takes a string of text, tokenizes it, feeds it to the model, and returns a dictionary of emotions and their
        probabilities
        :param text: The text you want to classify
        :type text: str
        :return: A dictionary of emotions and their probabilities.
    """
    inputs = tokenizer(text, max_length=512, padding=True, truncation=True, return_tensors='pt')
    outputs = model(**inputs)
    predicted = torch.nn.functional.softmax(outputs.logits, dim=1)
    emotions_list = {}
    for i in range(len(predicted.numpy()[0].tolist())):
        emotions_list[LABELS[i]] = predicted.numpy()[0].tolist()[i]
    return emotions_list
```

# And then - just gently ask a model to predict your emotion

```python
simple_prediction = predict_emotion("Какой же сегодня прекрасный день, братья")
not_simple_prediction = predict_emotions("Какой же сегодня прекрасный день, братья")

print(simple_prediction)
print(not_simple_prediction)
# happiness
# {'neutral': 0.0004941817605867982, 'happiness': 0.9979524612426758, 'sadness': 0.0002536600804887712, 'enthusiasm': 0.0005498139653354883, 'fear': 0.00025326196919195354, 'anger': 0.0003583927755244076, 'disgust': 0.00013807788491249084}
```

# Or, just simply use [our package (GitHub)](https://github.com/aniemore/Aniemore), that can do whatever you want (or maybe not)
🤗

# Citations
```
@misc{Aniemore,
  author = {Артем Аментес, Илья Лубенец, Никита Давидчук},
  title = {Открытая библиотека искусственного интеллекта для анализа и выявления эмоциональных оттенков речи человека},
  year = {2022},
  publisher = {Hugging Face},
  journal = {Hugging Face Hub},
  howpublished = {\url{https://huggingface.com/aniemore/Aniemore}},
  email = {[email protected]}
}
```