# scenario-NON-KD-PO-COPY-D2_data-cl-massive_all_1_166
This model is a fine-tuned version of [haryoaw/scenario-MDBT-TCR-MSV-CL](https://huggingface.co/haryoaw/scenario-MDBT-TCR-MSV-CL) on the MASSIVE dataset. It achieves the following results on the evaluation set:
- Loss: 3.4255
- Accuracy: 0.6379
- F1: 0.5920
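The checkpoint can be loaded with the standard Transformers classes. The snippet below is a minimal, hedged sketch: it assumes the model is published under the repository id in the title and exposes a sequence-classification head over the MASSIVE intent labels (the example utterance is only an illustration).

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Assumption: the checkpoint is hosted under this repository id.
model_id = "haryoaw/scenario-NON-KD-PO-COPY-D2_data-cl-massive_all_1_166"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# MASSIVE is an intent-classification dataset, so we classify a single utterance.
inputs = tokenizer("wake me up at nine am on friday", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

predicted_id = int(logits.argmax(dim=-1))
# id2label may contain generic names (e.g. "LABEL_42") if label names were not stored in the config.
print(predicted_id, model.config.id2label.get(predicted_id, f"LABEL_{predicted_id}"))
```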
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a hedged mapping onto `TrainingArguments` is sketched after this list):
- learning_rate: 5e-05
- train_batch_size: 32
- eval_batch_size: 64
- seed: 66
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10
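The original training script is not included in this card. As a hedged reference, the hyperparameters above map onto the Transformers `TrainingArguments` API roughly as follows; the `output_dir` and anything not listed above are placeholders, not values taken from the card.

```python
from transformers import TrainingArguments

# Hypothetical mapping of the listed hyperparameters onto TrainingArguments;
# values not given in the card (e.g. output_dir) are placeholders.
training_args = TrainingArguments(
    output_dir="scenario-NON-KD-PO-COPY-D2_data-cl-massive_all_1_166",
    learning_rate=5e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=64,
    seed=66,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=10,
)
```

Arguments left at their defaults (warmup, weight decay, and so on) are not documented in the card.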
### Training results
| Training Loss | Epoch  | Step  | Validation Loss | Accuracy | F1     |
|:-------------:|:------:|:-----:|:---------------:|:--------:|:------:|
| 0.7212        | 0.5558 | 5000  | 1.5835          | 0.6111   | 0.5437 |
| 0.4554        | 1.1116 | 10000 | 1.6185          | 0.6276   | 0.5700 |
| 0.4132        | 1.6674 | 15000 | 1.6466          | 0.6333   | 0.5784 |
| 0.2821        | 2.2232 | 20000 | 1.7639          | 0.6374   | 0.5855 |
| 0.2714        | 2.7790 | 25000 | 1.7789          | 0.6393   | 0.5865 |
| 0.2005        | 3.3348 | 30000 | 2.1278          | 0.6238   | 0.5738 |
| 0.1893        | 3.8906 | 35000 | 2.1050          | 0.6324   | 0.5876 |
| 0.1542        | 4.4464 | 40000 | 2.2392          | 0.6360   | 0.5896 |
| 0.1397        | 5.0022 | 45000 | 2.3114          | 0.6287   | 0.5875 |
| 0.1171        | 5.5580 | 50000 | 2.5310          | 0.6291   | 0.5886 |
| 0.0794        | 6.1138 | 55000 | 2.7319          | 0.6340   | 0.5865 |
| 0.0871        | 6.6696 | 60000 | 2.8662          | 0.6369   | 0.5904 |
| 0.0569        | 7.2254 | 65000 | 3.1081          | 0.6357   | 0.5858 |
| 0.0685        | 7.7812 | 70000 | 3.0348          | 0.6374   | 0.5915 |
| 0.0518        | 8.3370 | 75000 | 3.3951          | 0.6326   | 0.5923 |
| 0.0472        | 8.8928 | 80000 | 3.4227          | 0.6362   | 0.5919 |
| 0.0337        | 9.4486 | 85000 | 3.4255          | 0.6379   | 0.5920 |
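The card does not state how Accuracy and F1 are aggregated. A plausible `compute_metrics` sketch using the Evaluate library is shown below; the choice of macro-averaged F1 is an assumption, not something documented here.

```python
import numpy as np
import evaluate

accuracy_metric = evaluate.load("accuracy")
f1_metric = evaluate.load("f1")

def compute_metrics(eval_pred):
    # The Trainer passes (logits, labels) for the evaluation set.
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return {
        "accuracy": accuracy_metric.compute(predictions=predictions, references=labels)["accuracy"],
        # Assumption: F1 in the table is macro-averaged over the MASSIVE intent labels.
        "f1": f1_metric.compute(predictions=predictions, references=labels, average="macro")["f1"],
    }
```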
### Framework versions
- Transformers 4.44.2
- Pytorch 2.1.1+cu121
- Datasets 2.14.5
- Tokenizers 0.19.1