---
language:
- pl
tags:
- text
- sentiment
- politics
metrics:
- accuracy
- f1
pipeline_tag: text-classification
widget:
- text: Przykro patrzeć, a słuchać się nie da.
  example_title: example 1
- text: Oczywiście ze Pan Prezydent to nasza duma narodowa!!
  example_title: example 2
base_model: dkleczek/bert-base-polish-cased-v1
model-index:
- name: PaReS-sentimenTw-political-PL
  results:
  - task:
      type: sentiment-classification
      name: Text Classification
    dataset:
      name: tweets_2020_electionsPL
      type: tweets
    metrics:
    - type: f1
      value: 94.4
---

# PaReS-sentimenTw-political-PL

This model is a fine-tuned version of [dkleczek/bert-base-polish-cased-v1](https://huggingface.co/dkleczek/bert-base-polish-cased-v1) that predicts 3-class sentiment.
It was fine-tuned on a sample of 1,000 manually annotated Polish tweets.

The model was developed as part of the ComPathos project: https://www.ncn.gov.pl/sites/default/files/listy-rankingowe/2020-09-30apsv2/streszczenia/497124-en.pdf

```python
from transformers import pipeline

model_path = "eevvgg/PaReS-sentimenTw-political-PL"
sentiment_task = pipeline(task="sentiment-analysis", model=model_path, tokenizer=model_path)

sequence = ["Cała ta śmieszna debata była próbą ukrycia problemów gospodarczych jakie są i nadejdą, pytania w większości o mało istotnych sprawach",
            "Brawo panie ministrze!"]

# Each prediction is a dict with a 'label' and a 'score' for the top class.
result = sentiment_task(sequence)
labels = [pred["label"] for pred in result]  # ['Negative', 'Positive']
```
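
The pipeline returns only the top label for each input. If you need the full probability distribution over the three classes, the model can also be loaded directly. Below is a minimal sketch; it reads class names from the model's `id2label` config rather than assuming any particular label mapping.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_path = "eevvgg/PaReS-sentimenTw-political-PL"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForSequenceClassification.from_pretrained(model_path)

texts = ["Brawo panie ministrze!"]
inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

probs = torch.softmax(logits, dim=-1)  # shape: (batch_size, 3)
for text, row in zip(texts, probs):
    # Map each class index to its configured name and rounded probability.
    scores = {model.config.id2label[i]: round(p.item(), 3) for i, p in enumerate(row)}
    print(text, scores)
```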


## Intended uses & limitations

Sentiment detection in Polish-language text (fine-tuned on tweets from the political domain).


## Training and evaluation data

- Trained for 3 epochs with a mini-batch size of 8 (see the sketch below).
- Final training loss: 0.1359
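
The original training script is not included in this card. A minimal sketch of an equivalent fine-tuning setup with the Hugging Face `Trainer`, using the reported epochs and batch size, might look like the following; the placeholder rows, the `text`/`label` column names, and `max_length=128` are assumptions, not documented settings.

```python
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

base = "dkleczek/bert-base-polish-cased-v1"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForSequenceClassification.from_pretrained(base, num_labels=3)

# Placeholder rows standing in for the (non-public) 1k annotated tweet sample;
# the real data is assumed to expose a "text" and an integer "label" column.
train_ds = Dataset.from_dict({"text": ["przykładowy tweet 1"], "label": [0]})
eval_ds = Dataset.from_dict({"text": ["przykładowy tweet 2"], "label": [1]})

def tokenize(batch):
    # max_length=128 is an assumption, not a documented setting of this model.
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)

train_ds = train_ds.map(tokenize, batched=True)
eval_ds = eval_ds.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="sentimentw-pl",
    num_train_epochs=3,              # matches the reported 3 epochs
    per_device_train_batch_size=8,   # matches the reported mini-batch size of 8
    per_device_eval_batch_size=8,
)

trainer = Trainer(model=model, args=args, train_dataset=train_ds, eval_dataset=eval_ds)
trainer.train()
```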



It achieves the following results on the held-out test set (10% of the data):

- No. of test examples: 100
- Mini-batch size: 8
- Accuracy: 0.950
- Macro F1: 0.944

Per-class results:

| Label | Precision | Recall | F1-score | Support |
|------:|----------:|-------:|---------:|--------:|
| 0 | 0.960 | 0.980 | 0.970 | 49 |
| 1 | 0.958 | 0.885 | 0.920 | 26 |
| 2 | 0.923 | 0.960 | 0.941 | 25 |
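
Metrics of this form can be reproduced with scikit-learn once gold labels and model predictions for the test split are available; `y_true` and `y_pred` below are placeholder lists, not the actual test data.

```python
from sklearn.metrics import accuracy_score, classification_report, f1_score

# Placeholder label ids (0, 1, 2); substitute the gold labels and the model's
# predictions for the held-out 10% test split.
y_true = [0, 0, 1, 2, 2]
y_pred = [0, 0, 1, 2, 1]

print("accuracy:", round(accuracy_score(y_true, y_pred), 3))
print("macro F1:", round(f1_score(y_true, y_pred, average="macro"), 3))
print(classification_report(y_true, y_pred, digits=3))
```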