Files changed (1)
  1. README.md +101 -1
README.md CHANGED
@@ -10,6 +10,106 @@ language:
10
  - en
11
  metrics:
12
  - accuracy
 
 
13
  library_name: transformers
14
  pipeline_tag: text-classification
15
- ---
10
  - en
11
  metrics:
12
  - accuracy
13
+ - sparse_val accuracy
14
+ - sparse_val categorical accuracy
15
  library_name: transformers
16
  pipeline_tag: text-classification
17
+ tags:
18
+ - textclassification
19
+ - roberta
20
+ - robertabase
21
+ - sentimentanalysis
22
+ - nlp
23
+ - tweetanalysis
24
+ - tweet
25
+ - analysis
26
+ - sentiment
27
+ - positive
28
+ - newsanalysis
29
+ ---
30
+
31
+ ---
32
+ <b>BYRD'S I - ROBERTA BASED TWEET/REVIEW/TEXT ANALYSIS</b>
33
+ ---
34
+
35
+ This is a ro<b>BERT</b>a-base model fine-tuned on 8 datasets containing ~20 M tweets. It is suited to English text, though it can also do a reasonable job on other languages.
36
+
37
+ <b>Git Repo:</b><a href = "https://github.com/Caffeine-Coders/Sentiment-Analysis-Project"> SENTIMENTANALYSIS-PROJECT</a>
38
+
39
+ <b>Demo:</b><a href = "https://byrdi.netlify.app/"> BYRD'S I</a>
40
+
41
+ <b>labels: </b>
42
+ 0 -> Negative;
43
+ 1 -> Neutral;
44
+ 2 -> Positive;
45
+
46
+ <b>Model Metrics</b><br/>
47
+ <b>Accuracy: </b> ~96% <br/>
48
+ <b>Sparse Categorical Accuracy: </b> 0.9597 <br/>
49
+ <b>Loss: </b> 0.1144 <br/>
50
+ <b>val_loss -- [onLast_train] : </b> 0.1482 <br/>
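For readers unfamiliar with the metric, sparse categorical accuracy is the fraction of examples whose highest-scoring class matches an integer label. The sketch below illustrates that definition in NumPy with made-up scores and labels, not the model's evaluation data; the function name `sparse_categorical_accuracy` is a hypothetical helper, not part of this repository.

```python
import numpy as np

def sparse_categorical_accuracy(y_true, y_pred):
    """Fraction of rows where argmax(y_pred) equals the integer label."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return float(np.mean(np.argmax(y_pred, axis=-1) == y_true))

# Made-up scores for four texts over the labels 0=Negative, 1=Neutral, 2=Positive
scores = np.array([
    [2.1, 0.3, -1.0],
    [-0.5, 1.7, 0.2],
    [-1.2, 0.1, 3.0],
    [0.9, 1.1, -0.4],
])
labels = np.array([0, 1, 2, 0])  # integer ("sparse") labels, not one-hot vectors

print(sparse_categorical_accuracy(labels, scores))
```

Here three of the four rows have their largest score at the labelled index, so the function returns 0.75.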
51
+ <b>Note: </b>
52
+ Because of discrepancies in the datasets' Neutral data, we published a second model, <a href = "https://huggingface.co/AK776161/birdseye_roberta-base-18">
53
+ Byrd's I positive/negative-only model</a>, to help single out neutral data, and we combine the two models with the
54
+ <b>AdaBoost</b> method to get more accurate output.
55
+ # Example of Classification:
56
+ ```python
+ from transformers import AutoTokenizer, AutoModelForSequenceClassification
+ import numpy as np
+
+ # Model 0: the positive/negative-only model.
+ # from_tf=True loads TensorFlow weights, so TensorFlow must be installed.
+ tokenizer = AutoTokenizer.from_pretrained("AK776161/birdseye_roberta-base-18", use_fast=True)
+ model = AutoModelForSequenceClassification.from_pretrained("AK776161/birdseye_roberta-base-18", from_tf=True)
+ # Model 1 (its tokenizer is loaded here, but the example below tokenizes with `tokenizer` only)
+ tokenizer1 = AutoTokenizer.from_pretrained("AK776161/birdseye_roberta-base-tweet-eval", use_fast=True)
+ model1 = AutoModelForSequenceClassification.from_pretrained("AK776161/birdseye_roberta-base-tweet-eval", from_tf=True)
+
+ # ------- Ensemble step ("AdaBoost" technique): average both models' logits -------
+ def nparraymeancalc(arr1, arr2):
+     """Average the logits of both models, row by row."""
+     returner = []
+     for i in range(len(arr1)):
+         # Reset model 0's logit for label 1 (Neutral) to 0 when it falls below -7
+         if arr1[i][1] < -7:
+             arr1[i][1] = 0
+         returner.append(np.mean([arr1[i], arr2[i]], axis=0))
+     return np.array(returner)
+
+ def predictions(tokenizedtext):
+     """Run both models on the same tokenized batch and combine their logits."""
+     logits1 = model(**tokenizedtext).logits.detach().numpy()
+     logits2 = model1(**tokenizedtext).logits.detach().numpy()
+     return nparraymeancalc(logits1, logits2)
+
+ def labelassign(predictionresult):
+     """Return the highest-scoring label id for each row."""
+     return [int(row.argmax()) for row in predictionresult]
+
+ tokenizeddata = tokenizer("----YOUR_TEXT---", return_tensors="pt", padding=True, truncation=True)
+ result = predictions(tokenizeddata)
+
+ # Prints a list of label ids (0 = Negative, 1 = Neutral, 2 = Positive)
+ print(labelassign(result))
106
+ ```
107
+ Output for "I LOVE YOU":
108
+ ```
109
+ 1) Positive: 0.994
110
+ 2) Negative: 0.000
111
+ 3) Neutral: 0.006
112
+ ```
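The example code prints label ids, while the listing above shows a score per label. The README does not say how those scores were produced; one plausible route, sketched here as an assumption, is a softmax over the averaged logits followed by sorting. `ID2LABEL` follows the label table in this README, while `softmax`, `rank_labels`, and the sample logits are hypothetical and chosen only for illustration.

```python
import numpy as np

# Label ids as documented in this README
ID2LABEL = {0: "Negative", 1: "Neutral", 2: "Positive"}

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    logits = np.asarray(logits, dtype=float)
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def rank_labels(logits_row):
    """Return (label, probability) pairs sorted from most to least likely."""
    probs = softmax(logits_row)
    order = np.argsort(probs)[::-1]
    return [(ID2LABEL[int(i)], float(probs[i])) for i in order]

# Hypothetical averaged logits for a single text (not real model output)
sample_logits = np.array([-3.0, 0.5, 4.0])

for rank, (label, prob) in enumerate(rank_labels(sample_logits), start=1):
    print(f"{rank}) {label}: {prob:.3f}")
```

With real inputs, each row of the array returned by `predictions` could be passed to `rank_labels` in place of `sample_logits`.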
113
+
114
+
115
+