scenario-KD-PR-MSV-D2_data-cl-massive_all_1_144
This model is a fine-tuned version of haryoaw/scenario-MDBT-TCR_data-cl-massive_all_1_1 on the massive dataset. It achieves the following results on the evaluation set (these match the last logged checkpoint, step 265000, in the training results table):
- Loss: 2.5342
- Accuracy: 0.6242
- F1: 0.5945
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 32
- eval_batch_size: 32
- seed: 44
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 30
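As a rough illustration of what the linear scheduler does with these settings, the sketch below computes the learning rate at a given optimizer step in plain Python. It is not the training code. Two values are assumptions rather than facts from this card: the warmup length (no warmup setting is listed, so zero is assumed) and the steps per epoch (9000, inferred from the results table, which logs step 90000 at epoch 10.0). The helper name `linear_lr` is hypothetical.

```python
# Sketch of a linear learning-rate decay, under the assumptions stated above.
BASE_LR = 5e-05          # learning_rate from the card
NUM_EPOCHS = 30          # num_epochs from the card
STEPS_PER_EPOCH = 9_000  # assumption: inferred from step 90000 at epoch 10.0
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 270000 under that assumption
WARMUP_STEPS = 0         # assumption: the card lists no warmup setting


def linear_lr(step: int) -> float:
    """Learning rate at `step`: linear warmup (if any), then linear decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * (step / max(1, WARMUP_STEPS))
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * (remaining / max(1, TOTAL_STEPS - WARMUP_STEPS))


# Print the rate at the start, the midpoint, the last logged step, and the end.
for step in (0, 135_000, 265_000, TOTAL_STEPS):
    print(step, linear_lr(step))
```

Under these assumptions the rate falls to half its initial value midway through training and is already close to zero by the last logged step, 265000.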
Training results
Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 |
---|---|---|---|---|---|
1.3572 | 0.56 | 5000 | 2.3093 | 0.6266 | 0.5756 |
1.1368 | 1.11 | 10000 | 2.3610 | 0.6232 | 0.5831 |
1.0884 | 1.67 | 15000 | 2.3177 | 0.6323 | 0.5864 |
0.9993 | 2.22 | 20000 | 2.3903 | 0.6208 | 0.5754 |
0.9971 | 2.78 | 25000 | 2.3353 | 0.6305 | 0.5819 |
0.9487 | 3.33 | 30000 | 2.3824 | 0.6281 | 0.5838 |
0.9282 | 3.89 | 35000 | 2.4776 | 0.6199 | 0.5797 |
0.9133 | 4.45 | 40000 | 2.4677 | 0.6207 | 0.5774 |
0.921 | 5.0 | 45000 | 2.4799 | 0.6195 | 0.5889 |
0.8781 | 5.56 | 50000 | 2.5276 | 0.6145 | 0.5823 |
0.8679 | 6.11 | 55000 | 2.5109 | 0.6219 | 0.5837 |
0.8667 | 6.67 | 60000 | 2.4956 | 0.6257 | 0.5863 |
0.8541 | 7.23 | 65000 | 2.5621 | 0.6089 | 0.5827 |
0.858 | 7.78 | 70000 | 2.4640 | 0.6258 | 0.5818 |
0.8341 | 8.34 | 75000 | 2.5379 | 0.6147 | 0.5771 |
0.8431 | 8.89 | 80000 | 2.5004 | 0.6307 | 0.5894 |
0.8409 | 9.45 | 85000 | 2.6250 | 0.6013 | 0.5683 |
0.8317 | 10.0 | 90000 | 2.5641 | 0.6205 | 0.5883 |
0.8264 | 10.56 | 95000 | 2.5379 | 0.6169 | 0.5787 |
0.8159 | 11.12 | 100000 | 2.4846 | 0.6287 | 0.5892 |
0.8211 | 11.67 | 105000 | 2.4920 | 0.6260 | 0.5834 |
0.8127 | 12.23 | 110000 | 2.5126 | 0.6268 | 0.5904 |
0.8176 | 12.78 | 115000 | 2.4977 | 0.6298 | 0.5907 |
0.8135 | 13.34 | 120000 | 2.6144 | 0.6130 | 0.5816 |
0.8092 | 13.9 | 125000 | 2.6534 | 0.6015 | 0.5770 |
0.8057 | 14.45 | 130000 | 2.6538 | 0.5992 | 0.5661 |
0.8106 | 15.01 | 135000 | 2.5595 | 0.6138 | 0.5671 |
0.8041 | 15.56 | 140000 | 2.6846 | 0.6044 | 0.5753 |
0.8008 | 16.12 | 145000 | 2.6878 | 0.6045 | 0.5876 |
0.7928 | 16.67 | 150000 | 2.6002 | 0.6144 | 0.5883 |
0.792 | 17.23 | 155000 | 2.5880 | 0.6171 | 0.5801 |
0.7953 | 17.79 | 160000 | 2.5090 | 0.6269 | 0.5887 |
0.7888 | 18.34 | 165000 | 2.5957 | 0.6162 | 0.5906 |
0.7938 | 18.9 | 170000 | 2.5766 | 0.6192 | 0.5725 |
0.7909 | 19.45 | 175000 | 2.5189 | 0.6237 | 0.5904 |
0.7864 | 20.01 | 180000 | 2.5648 | 0.6172 | 0.5840 |
0.7869 | 20.56 | 185000 | 2.5519 | 0.6210 | 0.5910 |
0.7833 | 21.12 | 190000 | 2.6989 | 0.5995 | 0.5803 |
0.7835 | 21.68 | 195000 | 2.5599 | 0.6151 | 0.5815 |
0.7803 | 22.23 | 200000 | 2.5249 | 0.6231 | 0.5893 |
0.7837 | 22.79 | 205000 | 2.5989 | 0.6135 | 0.5859 |
0.7793 | 23.34 | 210000 | 2.5693 | 0.6210 | 0.5930 |
0.7851 | 23.9 | 215000 | 2.5545 | 0.6208 | 0.5893 |
0.7799 | 24.46 | 220000 | 2.5639 | 0.6178 | 0.5884 |
0.7791 | 25.01 | 225000 | 2.5613 | 0.6200 | 0.5934 |
0.775 | 25.57 | 230000 | 2.5726 | 0.6199 | 0.5958 |
0.7747 | 26.12 | 235000 | 2.5184 | 0.6237 | 0.5914 |
0.775 | 26.68 | 240000 | 2.5404 | 0.6226 | 0.5961 |
0.7749 | 27.23 | 245000 | 2.5392 | 0.6202 | 0.5874 |
0.7767 | 27.79 | 250000 | 2.5781 | 0.6194 | 0.5953 |
0.7762 | 28.35 | 255000 | 2.4980 | 0.6290 | 0.5953 |
0.7737 | 28.9 | 260000 | 2.5199 | 0.6239 | 0.5938 |
0.7708 | 29.46 | 265000 | 2.5342 | 0.6242 | 0.5945 |
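The reported final metrics come from the last row, but other checkpoints score better on individual metrics. A minimal sketch of how the table could be parsed to find them follows. The names `EvalRow` and `parse_rows` are hypothetical, and `SAMPLE` holds only four rows copied from the table above, not the full log. Those four rows include the table-wide extremes: the highest accuracy (0.6323, step 15000), the highest F1 (0.5961, step 240000), and the lowest validation loss (2.3093, step 5000).

```python
from typing import List, NamedTuple


class EvalRow(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float
    accuracy: float
    f1: float


def parse_rows(table_text: str) -> List[EvalRow]:
    """Parse pipe-delimited rows, skipping the header and separator lines."""
    rows = []
    for line in table_text.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != 6:
            continue
        try:
            tl, ep, st, vl, acc, f1 = cells
            rows.append(EvalRow(float(tl), float(ep), int(st),
                                float(vl), float(acc), float(f1)))
        except ValueError:
            continue  # non-numeric line (header or separator)
    return rows


# A subset of rows copied verbatim from the table above.
SAMPLE = """
Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 |
---|---|---|---|---|---|
1.3572 | 0.56 | 5000 | 2.3093 | 0.6266 | 0.5756 |
1.0884 | 1.67 | 15000 | 2.3177 | 0.6323 | 0.5864 |
0.775 | 26.68 | 240000 | 2.5404 | 0.6226 | 0.5961 |
0.7708 | 29.46 | 265000 | 2.5342 | 0.6242 | 0.5945 |
"""

rows = parse_rows(SAMPLE)
best_accuracy = max(rows, key=lambda r: r.accuracy)
best_f1 = max(rows, key=lambda r: r.f1)
lowest_val_loss = min(rows, key=lambda r: r.val_loss)

print("best accuracy:", best_accuracy.step, best_accuracy.accuracy)
print("best F1:", best_f1.step, best_f1.f1)
print("lowest validation loss:", lowest_val_loss.step, lowest_val_loss.val_loss)
```

Training loss keeps falling while validation loss drifts up from about 2.31 to about 2.53, so selecting a checkpoint by a validation metric may be preferable to taking the last one. Which metric to use depends on the intended application.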
Framework versions
- Transformers 4.33.3
- Pytorch 2.1.1+cu121
- Datasets 2.14.5
- Tokenizers 0.13.3
Model tree for haryoaw/scenario-KD-PR-MSV-D2_data-cl-massive_all_1_144
- Base model: microsoft/mdeberta-v3-base
- Finetuned: haryoaw/scenario-MDBT-TCR-MSV-CL