---
license: apache-2.0
base_model: bert-base-cased
tags:
- generated_from_trainer
model-index:
- name: bert-finetuned-squad
  results: []
---
# bert-finetuned-squad
This model is a fine-tuned version of [bert-base-cased](https://huggingface.co/bert-base-cased) on the SQuAD dataset.
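As a minimal usage sketch (assuming the checkpoint is published under a repo id such as `bert-finetuned-squad`; substitute the actual model path), the fine-tuned weights can be loaded into a question-answering pipeline:

```python
from transformers import pipeline

# Hypothetical repo id; replace with the actual checkpoint path.
qa = pipeline("question-answering", model="bert-finetuned-squad")

result = qa(
    question="What was the model fine-tuned on?",
    context="This model is a fine-tuned version of bert-base-cased on the SQuAD dataset.",
)
print(result["answer"], result["score"])
```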
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
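A sketch of how the dataset named above can be loaded (assuming the standard `squad` dataset on the Hugging Face Hub was used; the exact preprocessing is not documented here):

```python
from datasets import load_dataset

# SQuAD v1.1; the Hub provides train and validation splits.
raw_datasets = load_dataset("squad")
print(raw_datasets)
```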
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a `TrainingArguments` sketch reproducing them follows the list):
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3
- mixed_precision_training: Native AMP
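A minimal training sketch mirroring the hyperparameters above (preprocessing of the SQuAD examples into tokenized features is omitted; `train_dataset` and `tokenizer` are assumed to already exist):

```python
from transformers import (
    AutoModelForQuestionAnswering,
    Trainer,
    TrainingArguments,
)

model = AutoModelForQuestionAnswering.from_pretrained("bert-base-cased")

# The Trainer's default AdamW optimizer already uses betas=(0.9, 0.999)
# and epsilon=1e-08, matching the values listed above.
args = TrainingArguments(
    output_dir="bert-finetuned-squad",
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=3,
    fp16=True,  # Native AMP mixed-precision training
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,  # assumed: tokenized SQuAD training features
    tokenizer=tokenizer,          # assumed: AutoTokenizer for bert-base-cased
)
trainer.train()
```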
### Training results
| Step | Training Loss |
|------|---------------|
500 | 2.6355 |
1000 | 1.6559 |
1500 | 1.4608 |
2000 | 1.3781 |
2500 | 1.3286 |
3000 | 1.2879 |
3500 | 1.2369 |
4000 | 1.1795 |
4500 | 1.1303 |
5000 | 1.1637 |
5500 | 1.1227 |
6000 | 1.1406 |
6500 | 1.1413 |
7000 | 1.0821 |
7500 | 1.0964 |
8000 | 1.1083 |
8500 | 1.0583 |
9000 | 1.0825 |
9500 | 1.0264 |
10000 | 1.0407 |
10500 | 1.0352 |
11000 | 1.0107 |
11500 | 0.8077 |
12000 | 0.7105 |
12500 | 0.7843 |
13000 | 0.7401 |
13500 | 0.7716 |
14000 | 0.7772 |
14500 | 0.749 |
15000 | 0.7348 |
15500 | 0.7495 |
16000 | 0.7756 |
16500 | 0.7243 |
17000 | 0.7683 |
17500 | 0.7536 |
18000 | 0.7329 |
18500 | 0.7342 |
19000 | 0.6998 |
19500 | 0.7326 |
20000 | 0.7646 |
20500 | 0.7729 |
21000 | 0.734 |
21500 | 0.734 |
22000 | 0.691 |
22500 | 0.5887 |
23000 | 0.5148 |
23500 | 0.539 |
24000 | 0.5159 |
24500 | 0.4908 |
25000 | 0.5242 |
25500 | 0.5162 |
26000 | 0.4862 |
26500 | 0.526 |
27000 | 0.4953 |
27500 | 0.5276 |
28000 | 0.4848 |
28500 | 0.4863 |
29000 | 0.5222 |
29500 | 0.5192 |
30000 | 0.5088 |
30500 | 0.5167 |
31000 | 0.4906 |
31500 | 0.5161 |
32000 | 0.4995 |
32500 | 0.4961 |
33000 | 0.4653 |
### Framework versions
- Transformers 4.38.2
- PyTorch 2.1.0+cu118
- Datasets 2.18.0
- Tokenizers 0.15.2