baohuynhbk14 committed
Commit 23877fb (verified) · 1 Parent(s): 26acd70

Update README.md

  1. README.md +25 -27
README.md CHANGED
@@ -14,7 +14,17 @@ library_name: transformers
14
 
15
  This model is based on our pretrained [5CD-AI/visobert-14gb-corpus](https://huggingface.co/5CD-AI/visobert-14gb-corpus), which has been continuously trained on a 14GB dataset of Vietnamese social content.
16
 
17
- Our model is fine-tuned on <b>120K</b> Vietnamese sentiment datasets, including comments and reviews from e-commerce platforms, social media, and forums.
 
 
 
 
 
 
 
 
 
 
18
 
19
  The model will give softmax outputs for three labels.
20
 
@@ -27,7 +37,7 @@ The model will give softmax outputs for three labels.
27
  ```
28
 
29
  ## Dataset
30
-
31
  <table border="2">
32
  <tr align="center">
33
  <th rowspan="2">Dataset</th>
@@ -83,7 +93,7 @@ The model will give softmax outputs for three labels.
83
  <td>-</td>
84
  </tr>
85
  <tr align="center">
86
- <td align="left">UIT-VSMEC</td>
87
  <td>3219</td>
88
  <td>1665</td>
89
  <td>594</td>
@@ -108,7 +118,7 @@ The model will give softmax outputs for three labels.
108
  <td>-</td>
109
  </tr>
110
  <tr align="center">
111
- <td align="left">UIT-ViCTSD</td>
112
  <td>3370</td>
113
  <td>2615</td>
114
  <td>933</td>
@@ -156,7 +166,7 @@ The model will give softmax outputs for three labels.
156
  <td>-</td>
157
  </tr>
158
  <tr align="center">
159
- <td align="left">Ecommerce-reviews</td>
160
  <td>20093</td>
161
  <td>6669</td>
162
  <td>4698</td>
@@ -168,16 +178,16 @@ The model will give softmax outputs for three labels.
168
  <td>-</td>
169
  </tr>
170
  <tr align="center">
171
- <td align="left">VOZ-HSD</td>
172
  <td>2676</td>
173
  <td>1213</td>
174
  <td>1071</td>
175
- <td>1068</td>
176
- <td>495</td>
177
- <td>420</td>
178
- <td>511</td>
179
- <td>224</td>
180
- <td>199</td>
181
  </tr>
182
  <tr align="center">
183
  <td align="left">Vietnamese-amazon-polarity</td>
@@ -200,8 +210,8 @@ The model will give softmax outputs for three labels.
200
  <td colspan=4><b>SA-VLSP2016</td>
201
  <td colspan=4><b>AIVIVN-2019</td>
202
  <td colspan=4><b>UIT-VSFC</td>
203
- <td colspan=4><b>UIT-VSMEC</td>
204
- <td colspan=4><b>UIT-ViCTSD</td>
205
  </tr>
206
  <tr align="center">
207
  <td><b>Acc</td>
@@ -281,7 +291,6 @@ The model will give softmax outputs for three labels.
281
  <td rowspan=2><b>Model</td>
282
  <td colspan=4><b>UIT-ViOCD</td>
283
  <td colspan=4><b>UIT-ViSFD</td>
284
- <td colspan=4><b>VOZ-HSD</td>
285
  <td colspan=4><b>Vi-amazon-polar</td>
286
  </tr>
287
  <tr align="center">
@@ -297,14 +306,11 @@ The model will give softmax outputs for three labels.
297
  <td><b>Prec</td>
298
  <td><b>Recall</td>
299
  <td><b>WF1</td>
300
- <td><b>Acc</td>
301
- <td><b>Prec</td>
302
- <td><b>Recall</td>
303
- <td><b>WF1</td>
304
  </tr>
305
  <tr align="center">
306
  <tr align="center">
307
  <td align="left">wonrax/phobert-base-vietnamese-sentiment</td>
 
308
  <td>87.14</td>
309
  <td>74.68</td>
310
  <td>78.13</td>
@@ -312,10 +318,6 @@ The model will give softmax outputs for three labels.
312
  <td>67.95</td>
313
  <td>67.90</td>
314
  <td>66.98</td>
315
- <td>51.89</td>
316
- <td>60.18</td>
317
- <td>51.89</td>
318
- <td>53.61</td>
319
  <td>61.40</td>
320
  <td>76.53</td>
321
  <td>61.40</td>
@@ -331,10 +333,6 @@ The model will give softmax outputs for three labels.
331
  <td><b>93.20</td>
332
  <td><b>93.26</td>
333
  <td><b>93.21</td>
334
- <td><b>67.78</td>
335
- <td><b>69.82</td>
336
- <td><b>67.78</td>
337
- <td><b>68.39</td>
338
  <td><b>89.90</td>
339
  <td><b>90.13</td>
340
  <td><b>89.90</td>
 
14
 
15
  This model is based on our pretrained [5CD-AI/visobert-14gb-corpus](https://huggingface.co/5CD-AI/visobert-14gb-corpus), which has been continuously trained on a 14GB dataset of Vietnamese social content.
16
 
17
+ Our model is fine-tuned on <b>120K</b> Vietnamese sentiment datasets, including comments and reviews from e-commerce platforms, social media, and forums.
18
+
19
+ Our model achieves better performance on the following datasets:
20
+ - SA-VLSP2016
21
+ - AIVIVN-2019
22
+ - UIT-VSFC
23
+ - UIT-VSMEC
24
+ - UIT-ViCTSD
25
+ - UIT-ViOCD
26
+ - UIT-ViSFD
27
+ - Vi-amazon-polar
28
 
29
  The model will give softmax outputs for three labels.
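To make the "softmax outputs for three labels" statement concrete, here is a minimal sketch of how three raw scores turn into probabilities and a predicted label. It uses only the Python standard library. The label names and their order (`NEG`, `POS`, `NEU`) and the example logits are assumptions for illustration, not taken from this commit; check the model's own config for the real mapping.

```python
import math

# Assumed label order, for illustration only; the real id-to-label
# mapping should be read from the model's configuration.
LABELS = ["NEG", "POS", "NEU"]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    # Subtract the maximum first so math.exp cannot overflow.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_label(logits):
    """Return the most probable label and the full distribution."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs


# Made-up logits standing in for a classifier's output on one sentence.
label, probs = predict_label([0.2, 2.5, -1.0])
print(label, [round(p, 4) for p in probs])
```

The three probabilities always sum to 1, so the largest one can be read directly as the model's confidence in the predicted label.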
30
 
 
37
  ```
38
 
39
  ## Dataset
40
+ Our training datasets are listed below. For UIT-VSMEC, UIT-ViCTSD, and VOZ-HSD, we re-labeled the data with the Gemini 1.5 Flash API to follow the 3 labels.
41
  <table border="2">
42
  <tr align="center">
43
  <th rowspan="2">Dataset</th>
 
93
  <td>-</td>
94
  </tr>
95
  <tr align="center">
96
+ <td align="left">UIT-VSMEC (Gemini-label)</td>
97
  <td>3219</td>
98
  <td>1665</td>
99
  <td>594</td>
 
118
  <td>-</td>
119
  </tr>
120
  <tr align="center">
121
+ <td align="left">UIT-ViCTSD (Gemini-label)</td>
122
  <td>3370</td>
123
  <td>2615</td>
124
  <td>933</td>
 
166
  <td>-</td>
167
  </tr>
168
  <tr align="center">
169
+ <td align="left">Tiki-reviews</td>
170
  <td>20093</td>
171
  <td>6669</td>
172
  <td>4698</td>
 
178
  <td>-</td>
179
  </tr>
180
  <tr align="center">
181
+ <td align="left">VOZ-HSD (Gemini-label)</td>
182
  <td>2676</td>
183
  <td>1213</td>
184
  <td>1071</td>
185
+ <td>-</td>
186
+ <td>-</td>
187
+ <td>-</td>
188
+ <td>-</td>
189
+ <td>-</td>
190
+ <td>-</td>
191
  </tr>
192
  <tr align="center">
193
  <td align="left">Vietnamese-amazon-polarity</td>
 
210
  <td colspan=4><b>SA-VLSP2016</td>
211
  <td colspan=4><b>AIVIVN-2019</td>
212
  <td colspan=4><b>UIT-VSFC</td>
213
+ <td colspan=4><b>UIT-VSMEC (Gemini-label)</td>
214
+ <td colspan=4><b>UIT-ViCTSD (Gemini-label)</td>
215
  </tr>
216
  <tr align="center">
217
  <td><b>Acc</td>
 
291
  <td rowspan=2><b>Model</td>
292
  <td colspan=4><b>UIT-ViOCD</td>
293
  <td colspan=4><b>UIT-ViSFD</td>
 
294
  <td colspan=4><b>Vi-amazon-polar</td>
295
  </tr>
296
  <tr align="center">
 
306
  <td><b>Prec</td>
307
  <td><b>Recall</td>
308
  <td><b>WF1</td>
 
 
 
 
309
  </tr>
310
  <tr align="center">
311
  <tr align="center">
312
  <td align="left">wonrax/phobert-base-vietnamese-sentiment</td>
313
+ <td>74.68</td>
314
  <td>87.14</td>
315
  <td>74.68</td>
316
  <td>78.13</td>
 
318
  <td>67.95</td>
319
  <td>67.90</td>
320
  <td>66.98</td>
 
 
 
 
321
  <td>61.40</td>
322
  <td>76.53</td>
323
  <td>61.40</td>
 
333
  <td><b>93.20</td>
334
  <td><b>93.26</td>
335
  <td><b>93.21</td>
 
 
 
 
336
  <td><b>89.90</td>
337
  <td><b>90.13</td>
338
  <td><b>89.90</td>