---
language: 
  - pl

pipeline_tag: text-classification

widget:
- text: "Przykro patrzeć, a słuchać się nie da."
  example_title: "example 1"
- text: "Oczywiście ze Pan Prezydent to nasza duma narodowa!!"
  example_title: "example 2"
  
tags:
  - text
  - sentiment
  - politics

metrics:
  - accuracy 
  - f1

model-index:
- name: PaReS-sentimenTw-political-PL
  results:
  - task:
      type: sentiment-classification
      name: Text Classification
    dataset:
      type: tweets
      name: tweets_2020_electionsPL
    metrics:
      - type: f1
        value: 94.4

---

# PaReS-sentimenTw-political-PL

This model is a fine-tuned version of [dkleczek/bert-base-polish-cased-v1](https://huggingface.co/dkleczek/bert-base-polish-cased-v1) that predicts three-class sentiment.
It was fine-tuned on a sample of 1k manually annotated tweets.

The model was developed as part of the ComPathos project: https://www.ncn.gov.pl/sites/default/files/listy-rankingowe/2020-09-30apsv2/streszczenia/497124-en.pdf

```python
from transformers import pipeline

model_path = "eevvgg/PaReS-sentimenTw-political-PL"
# Load the fine-tuned model and its tokenizer from the Hugging Face Hub.
sentiment_task = pipeline(task="sentiment-analysis", model=model_path, tokenizer=model_path)

sequence = [
    "Cała ta śmieszna debata była próbą ukrycia problemów gospodarczych jakie są i nadejdą, pytania w większości o mało istotnych sprawach",
    "Brawo panie ministrze!",
]

# The pipeline returns one dict per input, with "label" and "score" keys.
result = sentiment_task(sequence)
labels = [i["label"] for i in result]  # ['Negative', 'Positive']
```
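
The pipeline returns one dictionary per input, with `label` and `score` keys. As a minimal sketch, the hypothetical helper below (`confident_labels`, not part of this model's code) keeps a label only when its score clears a threshold. The 0.8 cutoff and the sample scores are illustrative assumptions, not values produced by the model.

```python
from typing import Optional


def confident_labels(results: list[dict], threshold: float = 0.8) -> list[Optional[str]]:
    """Return each prediction's label, or None when its score is below `threshold`."""
    return [r["label"] if r["score"] >= threshold else None for r in results]


# Hand-written stand-in for pipeline output; the scores are made up for illustration.
example_result = [
    {"label": "Negative", "score": 0.97},
    {"label": "Positive", "score": 0.55},
]

filtered = confident_labels(example_result)
print(filtered)  # ['Negative', None]
```

In practice, `example_result` would be replaced by the list returned from the pipeline call, and a suitable threshold would need to be chosen on held-out data.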


## Model Sources 
- **BibTeX citation:**
```
@misc{SentimenTwPLGK2023,
  author={Gajewska, Ewelina and Konat, Barbara},
  title={PaReSTw: BERT for Sentiment Detection in Polish Language},
  year={2023},
  howpublished = {\url{https://huggingface.co/eevvgg/PaReS-sentimenTw-political-PL}},
}
```




## Intended uses & limitations

Sentiment detection in Polish text. The model was fine-tuned on tweets from the political domain.


## Training and evaluation data

- Trained for 3 epochs with a mini-batch size of 8.
- Training loss: 0.1358926964368792

The model achieves the following results on the test set (10% of the data):

- Number of examples: 100
- Mini-batch size: 8
- Accuracy: 0.950
- Macro F1: 0.944

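| class | precision | recall | f1-score | support |
|-------|-----------|--------|----------|---------|
| 0     | 0.960     | 0.980  | 0.970    | 49      |
| 1     | 0.958     | 0.885  | 0.920    | 26      |
| 2     | 0.923     | 0.960  | 0.941    | 25      |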
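
The summary figures can be recomputed from the per-class rows of the report. The sketch below copies those rows by hand and derives the macro F1 and accuracy with plain arithmetic. It is a consistency check on the reported numbers, not the evaluation script used for the model.

```python
# Per-class rows copied from the report: (precision, recall, f1, support).
report = {
    0: (0.960, 0.980, 0.970, 49),
    1: (0.958, 0.885, 0.920, 26),
    2: (0.923, 0.960, 0.941, 25),
}

# Total number of test examples is the sum of the per-class supports.
total = sum(support for *_, support in report.values())

# Macro F1 is the unweighted mean of the per-class F1 scores.
macro_f1 = sum(f1 for _, _, f1, _ in report.values()) / len(report)

# recall * support recovers the count of correctly classified examples per class
# (rounded, because the reported recall has only three decimals).
correct = sum(round(recall * support) for _, recall, _, support in report.values())
accuracy = correct / total

print(f"examples={total} macro_f1={macro_f1:.3f} accuracy={accuracy:.3f}")
# examples=100 macro_f1=0.944 accuracy=0.950
```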