# tweet_instruct_detect
This model is a fine-tuned version of [microsoft/Multilingual-MiniLM-L12-H384](https://huggingface.co/microsoft/Multilingual-MiniLM-L12-H384) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.1452
- Accuracy: 0.9680

## Model description

More information needed

## Intended uses & limitations
The model is biased towards English data, and may also be biased towards certain ways of phrasing "instructions". Instructions in this context may also be questions.

The current version of the model is very basic and can be confused by superficial changes. For example, appending a `?` character to an otherwise identical sentence biases it heavily towards the instruction class, so it is highly sensitive to certain characters and phrasings. Better training data or model tuning should improve this.
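The sensitivity described above can be checked with a small probe harness. The sketch below uses a hypothetical `classify` stand-in that simply mimics the failure mode; to test the real model, replace it with an actual inference call (e.g. a `transformers` text-classification pipeline for this checkpoint).

```python
# Sketch of a punctuation-sensitivity probe. `classify` is a hypothetical
# stand-in for the fine-tuned model, hard-coded to show the failure mode:
# a trailing "?" flips the prediction regardless of content.
def classify(text: str) -> str:
    return "instruction" if text.strip().endswith("?") else "spam"

def question_mark_flips(texts, classify_fn):
    """Return the inputs whose predicted label changes when a '?' is appended."""
    flipped = []
    for text in texts:
        if classify_fn(text) != classify_fn(text + "?"):
            flipped.append(text)
    return flipped

probes = ["nice weather today", "write me a poem"]
print(question_mark_flips(probes, classify))
# → ['nice weather today', 'write me a poem']  (both flip with the toy stand-in)
```

Swapping in the real model and finding many flips on declarative probe sentences would confirm the over-reliance on `?` noted above.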
## Training and evaluation data

- Train data: 749 examples
- Test data: 251 examples

Out of the 1,000 total examples, 526 were manually labelled tweets, most of which were spam due to the high noise ratio in tweets. Spam in this case can refer to actual spam, gibberish, or statements that are generally fine but not useful as an instruction or question.

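The counts above imply a roughly 75/25 train/test split, which can be sanity-checked directly:

```python
# Sanity-check the split sizes quoted above.
train_size = 749
test_size = 251
manual_size = 526  # manually labelled tweets

total = train_size + test_size
train_frac = train_size / total
manual_frac = manual_size / total

print(f"total={total}, train={train_frac:.1%}, manually labelled={manual_frac:.1%}")
# → total=1000, train=74.9%, manually labelled=52.6%
```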
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20

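For reference, the hyperparameters listed above map onto a `transformers` `TrainingArguments` roughly as sketched below. This is an assumption, not the exact training script: `output_dir` is a placeholder, the learning rate and batch sizes are not stated in this card so library defaults are left in place, and the Adam betas/epsilon shown are themselves the defaults.

```python
from transformers import TrainingArguments

# Sketch only: output_dir is a placeholder; learning_rate and batch
# sizes are not given in the card, so defaults are used.
training_args = TrainingArguments(
    output_dir="tweet_instruct_detect",  # placeholder name
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=20,
    adam_beta1=0.9,     # Adam beta1 (library default)
    adam_beta2=0.999,   # Adam beta2 (library default)
    adam_epsilon=1e-8,  # Adam epsilon (library default)
)
```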
### Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log        | 1.0   | 53   | 0.3291          | 0.9537   |
| No log        | 2.0   | 106  | 0.1896          | 0.9537   |
| No log        | 3.0   | 159  | 0.1724          | 0.9573   |
| No log        | 4.0   | 212  | 0.1102          | 0.9751   |
| No log        | 5.0   | 265  | 0.1450          | 0.9644   |
| No log        | 6.0   | 318  | 0.1223          | 0.9715   |
| No log        | 7.0   | 371  | 0.1434          | 0.9680   |
| No log        | 8.0   | 424  | 0.1400          | 0.9680   |
| No log        | 9.0   | 477  | 0.1349          | 0.9715   |
| 0.1523        | 10.0  | 530  | 0.1370          | 0.9715   |
| 0.1523        | 11.0  | 583  | 0.1376          | 0.9715   |
| 0.1523        | 12.0  | 636  | 0.1385          | 0.9715   |
| 0.1523        | 13.0  | 689  | 0.1392          | 0.9715   |
| 0.1523        | 14.0  | 742  | 0.1399          | 0.9715   |
| 0.1523        | 15.0  | 795  | 0.1395          | 0.9715   |
| 0.1523        | 16.0  | 848  | 0.1402          | 0.9715   |
| 0.1523        | 17.0  | 901  | 0.1462          | 0.9680   |
| 0.1523        | 18.0  | 954  | 0.1533          | 0.9680   |
| 0.0492        | 19.0  | 1007 | 0.1472          | 0.9680   |
| 0.0492        | 20.0  | 1060 | 0.1452          | 0.9680   |

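Note that the final epoch is not the best checkpoint: validation loss bottoms out at epoch 4 and accuracy peaks there too. A small sketch of scanning the results for the best row (only a few rows copied here for brevity):

```python
# Pick the best checkpoint by validation loss from the training results.
# (epoch, step, val_loss, accuracy) tuples copied from the table above.
results = [
    (1, 53, 0.3291, 0.9537),
    (4, 212, 0.1102, 0.9751),
    (10, 530, 0.1370, 0.9715),
    (20, 1060, 0.1452, 0.9680),
]

best = min(results, key=lambda row: row[2])  # lowest validation loss
print(f"best epoch: {best[0]} (val loss {best[2]}, accuracy {best[3]})")
# → best epoch: 4 (val loss 0.1102, accuracy 0.9751)
```

Loading the epoch-4 checkpoint (if it was saved) would give slightly better evaluation numbers than the final-epoch figures quoted at the top of this card.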
### Framework versions