---
language: en
tags:
- text-classification
- tensorflow
- roberta
datasets:
- go_emotions
license: mit
---

Contributors:

- Rohan Kamath [linkedin.com/in/rohanrkamath](https://www.linkedin.com/in/rohanrkamath/)
- Arpan Ghoshal [linkedin.com/in/arpanghoshal](https://www.linkedin.com/in/arpanghoshal)

## What is GoEmotions

GoEmotions is a dataset of ~58,000 Reddit comments, each labelled with one or more of 28 emotion categories:

- admiration, amusement, anger, annoyance, approval, caring, confusion, curiosity, desire, disappointment, disapproval, disgust, embarrassment, excitement, fear, gratitude, grief, joy, love, nervousness, optimism, pride, realization, relief, remorse, sadness, surprise + neutral

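For convenience, the taxonomy above can be written down as an id-to-name mapping. This is a minimal sketch based on the label order in the public GoEmotions release (27 emotions plus neutral); the order is taken from the published dataset, not from this model's config, so treat the id assignment as an assumption.

```python
# Label order as published in the public GoEmotions release (assumed here;
# verify against the model's own id2label config before relying on it).
GO_EMOTIONS_LABELS = [
    "admiration", "amusement", "anger", "annoyance", "approval", "caring",
    "confusion", "curiosity", "desire", "disappointment", "disapproval",
    "disgust", "embarrassment", "excitement", "fear", "gratitude", "grief",
    "joy", "love", "nervousness", "optimism", "pride", "realization",
    "relief", "remorse", "sadness", "surprise", "neutral",
]

def label_name(label_id: int) -> str:
    """Map a class id (0-27) to its emotion name."""
    return GO_EMOTIONS_LABELS[label_id]

print(len(GO_EMOTIONS_LABELS))  # 28
```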
## What is RoBERTa

RoBERTa builds on BERT’s language masking strategy and modifies key parts of BERT’s training recipe: it removes the next-sentence pretraining objective and trains with much larger mini-batches and learning rates. RoBERTa was also trained on an order of magnitude more data than BERT, for a longer amount of time. This allows RoBERTa representations to generalize even better than BERT’s to downstream tasks.

## Hyperparameters

| Parameter         | Value |
| ----------------- | :---: |
| Learning rate     | 5e-5  |
| Epochs            | 10    |
| Max Seq Length    | 50    |
| Batch size        | 16    |
| Warmup Proportion | 0.1   |
| Epsilon           | 1e-8  |

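To make the warmup proportion concrete, the sketch below converts the hyperparameters above into schedule step counts. The training-set size is an assumption (~58,000 comments, the full GoEmotions corpus); the actual split used for fine-tuning may differ, so these numbers are illustrative, not the authors' exact schedule.

```python
# Hedged sketch: translating the listed hyperparameters into optimizer
# schedule numbers. num_examples is an assumption (full corpus size).
EPOCHS = 10
BATCH_SIZE = 16
WARMUP_PROPORTION = 0.1

num_examples = 58_000
steps_per_epoch = num_examples // BATCH_SIZE         # optimizer steps per epoch
total_steps = steps_per_epoch * EPOCHS               # steps over the whole run
warmup_steps = int(total_steps * WARMUP_PROPORTION)  # linear-warmup steps

print(steps_per_epoch, total_steps, warmup_steps)
```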
## Results

Best `Macro F1` score: 49.30%

## Usage

```python
from transformers import RobertaTokenizerFast, TFRobertaForSequenceClassification, pipeline

tokenizer = RobertaTokenizerFast.from_pretrained("arpanghoshal/EmoRoBERTa")
model = TFRobertaForSequenceClassification.from_pretrained("arpanghoshal/EmoRoBERTa")

# Reuse the objects loaded above so the pipeline does not download them again.
emotion = pipeline('sentiment-analysis', model=model, tokenizer=tokenizer)

emotion_labels = emotion("Thanks for using it.")
print(emotion_labels)
```

Output

```
[{'label': 'gratitude', 'score': 0.9964383244514465}]
```