Nikola299 committed
Commit da173ea · verified · 1 Parent(s): 4f103f1

Update README.md

  1. README.md +17 -2
README.md CHANGED
@@ -56,6 +56,9 @@ To be used as a multilabel classifier to identify if the sample text contains on

  ### Example

  First install direct dependencies:
  ```
  pip install transformers torch accelerate
@@ -65,8 +68,8 @@ Then the model can be downloaded and used for inference:
  ```py
  from transformers import AutoModelForSequenceClassification, AutoTokenizer

- model = AutoModelForSequenceClassification.from_pretrained("identrics/EN_propaganda_detector", num_labels=2)
- tokenizer = AutoTokenizer.from_pretrained("identrics/BG_propaganda_detector")

  tokens = tokenizer("Our country is the most powerful country in the world!", return_tensors="pt")
  output = model(**tokens)
@@ -74,6 +77,18 @@ print(output.logits)
  ```


  ## Training Details

  The training datasets for the model consist of a balanced set totaling 734 Bulgarian examples that include both propaganda and non-propaganda content. These examples are collected from a variety of traditional media and social media sources, ensuring a diverse range of content. Additionally, the training dataset is enriched with AI-generated samples. The total distribution of the training data is shown in the table below:
 

  ### Example

+
+
+
  First install direct dependencies:
  ```
  pip install transformers torch accelerate

  ```py
  from transformers import AutoModelForSequenceClassification, AutoTokenizer

+ model = AutoModelForSequenceClassification.from_pretrained("identrics/BG_propaganda_classifier", num_labels=5)
+ tokenizer = AutoTokenizer.from_pretrained("identrics/BG_propaganda_classifier")

  tokens = tokenizer("Our country is the most powerful country in the world!", return_tensors="pt")
  output = model(**tokens)

  ```
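
The README example stops at printing raw logits. Since the model is described as a multilabel classifier and is loaded with `num_labels=5`, one plausible way to read those logits is to apply a sigmoid to each one independently and keep the labels that reach a cutoff. The sketch below illustrates that idea under stated assumptions and is not the authors' documented post-processing: the names `label_0` to `label_4`, the 0.5 threshold, and the example logits are hypothetical placeholders.

```python
import math

# Hypothetical placeholder names; the real label names are not given here.
# The count of 5 matches num_labels=5 in the README example.
LABELS = [f"label_{i}" for i in range(5)]


def sigmoid(x: float) -> float:
    """Map a raw logit to a probability in (0, 1), avoiding overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def logits_to_labels(logits, threshold=0.5):
    """Return (label, probability) pairs whose probability reaches the threshold.

    Each label is scored independently, so zero, one, or several labels
    can be returned for the same text.
    """
    probs = [sigmoid(x) for x in logits]
    return [(name, p) for name, p in zip(LABELS, probs) if p >= threshold]


# Made-up logits for illustration only. With the real model, the values
# could be taken from output.logits[0].tolist() (assumption).
example_logits = [2.0, -1.5, 0.0, -3.0, 0.7]
for name, prob in logits_to_labels(example_logits):
    print(f"{name}: {prob:.2f}")
```

The threshold is a tunable choice: raising it trades recall for precision on each label.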


+
+
+
+
+
+
+
+
+
+
+
+
  ## Training Details

  The training datasets for the model consist of a balanced set totaling 734 Bulgarian examples that include both propaganda and non-propaganda content. These examples are collected from a variety of traditional media and social media sources, ensuring a diverse range of content. Additionally, the training dataset is enriched with AI-generated samples. The total distribution of the training data is shown in the table below: