# RobertaLr6.906e-08Wd0.0207E30

This model is a fine-tuned version of [deepset/roberta-base-squad2](https://huggingface.co/deepset/roberta-base-squad2) on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 1.1220
## Model description
More information needed
## Intended uses & limitations
More information needed
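Since the base model is deepset/roberta-base-squad2, this checkpoint is presumably an extractive question-answering model. A minimal usage sketch with the `transformers` pipeline API; the model identifier below is an assumption taken from the card title, so substitute the actual repository id:

```python
from transformers import pipeline

# Placeholder repo id assumed from the card title; replace with the real one.
qa = pipeline("question-answering", model="RobertaLr6.906e-08Wd0.0207E30")

result = qa(
    question="What was the base model?",
    context="This checkpoint was fine-tuned from deepset/roberta-base-squad2.",
)
print(result)  # {'score': ..., 'start': ..., 'end': ..., 'answer': ...}
```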
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 6.906e-08
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 30
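These settings map directly onto `transformers.TrainingArguments`. A minimal sketch, assuming the standard Trainer workflow; `output_dir` is a placeholder, and the `weight_decay` of 0.0207 is inferred from the model name rather than the list above:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="RobertaLr6.906e-08Wd0.0207E30",  # placeholder
    learning_rate=6.906e-08,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    num_train_epochs=30,
    weight_decay=0.0207,  # assumption: taken from "Wd0.0207" in the model name
    evaluation_strategy="epoch",  # validation loss was logged once per epoch
    logging_strategy="epoch",
)
```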
### Training results
| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 5.0029        | 1.0   | 1124  | 3.8479          |
| 2.853         | 2.0   | 2248  | 2.7678          |
| 1.5129        | 3.0   | 3372  | 2.2366          |
| 1.9825        | 4.0   | 4496  | 1.9843          |
| 1.1715        | 5.0   | 5620  | 1.8204          |
| 0.8653        | 6.0   | 6744  | 1.6976          |
| 2.3326        | 7.0   | 7868  | 1.6087          |
| 1.3648        | 8.0   | 8992  | 1.5312          |
| 2.1233        | 9.0   | 10116 | 1.4714          |
| 1.6096        | 10.0  | 11240 | 1.4175          |
| 1.573         | 11.0  | 12364 | 1.3730          |
| 2.0344        | 12.0  | 13488 | 1.3369          |
| 1.3262        | 13.0  | 14612 | 1.3031          |
| 0.6578        | 14.0  | 15736 | 1.2779          |
| 1.9317        | 15.0  | 16860 | 1.2524          |
| 2.3933        | 16.0  | 17984 | 1.2323          |
| 1.2294        | 17.0  | 19108 | 1.2146          |
| 1.3428        | 18.0  | 20232 | 1.1984          |
| 0.6418        | 19.0  | 21356 | 1.1850          |
| 2.2177        | 20.0  | 22480 | 1.1729          |
| 1.3615        | 21.0  | 23604 | 1.1604          |
| 0.9407        | 22.0  | 24728 | 1.1519          |
| 1.5497        | 23.0  | 25852 | 1.1449          |
| 0.9906        | 24.0  | 26976 | 1.1379          |
| 1.2726        | 25.0  | 28100 | 1.1320          |
| 1.0019        | 26.0  | 29224 | 1.1278          |
| 0.5763        | 27.0  | 30348 | 1.1249          |
| 0.9258        | 28.0  | 31472 | 1.1231          |
| 0.5245        | 29.0  | 32596 | 1.1222          |
| 0.6106        | 30.0  | 33720 | 1.1220          |
### Framework versions
- Transformers 4.41.2
- Pytorch 2.5.0
- Datasets 2.19.1
- Tokenizers 0.19.1