hplisiecki committed on
Commit 0976528 · verified · 1 Parent(s): 556698c

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -14,7 +14,7 @@ Our research utilizes a comprehensive database of Polish political texts from so
  - YouTube: 42,252 comments
  - Facebook: 414,595 posts
 
- The texts were processed to fit transformer models' length constraints. Facebook texts were split into sentences, and all texts longer than 280 characters were removed. Non-Polish texts were filtered out using the `langdetect` software, and all online links and usernames were replaced with placeholders. We focused on texts with higher emotional content for training, resulting in a final dataset of 10,000 texts, annotated by 20 expert annotators.
+ The texts were processed to fit transformer models' length constraints. Facebook texts were split into sentences, and all texts longer than 280 characters were removed. Non-Polish texts were filtered out using the `langdetect` software, and all online links and usernames were replaced with placeholders. We focused on texts with higher emotional content for training, which we have filtered using a lexicon approach, resulting in a final dataset of 10,000 texts, annotated by 20 expert annotators.
 
  ### Annotation Process
 
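
The preprocessing described in the changed paragraph can be sketched roughly as below. This is a minimal illustration, not the authors' pipeline. The order of the steps, the regular expressions, the placeholder tokens `_link_` and `_user_`, and the function names are all assumptions. The language check is passed in as a callable so the sketch runs without third-party packages. Sentence splitting and the lexicon-based emotional-content filter are not shown, since the README gives no details on either.

```python
import re
from typing import Callable, Iterable, List

MAX_CHARS = 280  # length cap stated in the README paragraph

# Hypothetical placeholder tokens; the README does not name the ones used.
LINK_PLACEHOLDER = "_link_"
USER_PLACEHOLDER = "_user_"

# Assumed patterns: http(s)/www links, and @handles not preceded by a word
# character (so e-mail addresses are left alone).
URL_RE = re.compile(r"(?:https?://|www\.)\S+")
USER_RE = re.compile(r"(?<!\w)@\w+")


def mask_links_and_users(text: str) -> str:
    """Replace links and @usernames with placeholder tokens."""
    text = URL_RE.sub(LINK_PLACEHOLDER, text)
    return USER_RE.sub(USER_PLACEHOLDER, text)


def preprocess(texts: Iterable[str], is_polish: Callable[[str], bool]) -> List[str]:
    """Drop empty, over-length, and non-Polish texts, then mask the rest."""
    kept = []
    for text in texts:
        text = text.strip()
        # Assumption: the length check is applied before masking.
        if not text or len(text) > MAX_CHARS:
            continue
        if not is_polish(text):
            continue
        kept.append(mask_links_and_users(text))
    return kept
```

With the `langdetect` package installed, the language check could be supplied as `lambda t: detect(t) == "pl"` after `from langdetect import detect`. Note that `detect` can raise an exception on text with no detectable features, so a real pipeline would likely wrap that call.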