---
license: apache-2.0
base_model: bert-base-cased
tags:
- generated_from_trainer
model-index:
- name: bert-finetuned-squad
  results: []
---
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->
# bert-finetuned-squad
This model is a fine-tuned version of [bert-base-cased](https://huggingface.co/bert-base-cased) on the [SQuAD dataset](https://huggingface.co/datasets/rajpurkar/squad).
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3
- mixed_precision_training: Native AMP
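
As a rough illustration of what `lr_scheduler_type: linear` implies for the learning rate listed above, the sketch below computes a linearly decaying rate in plain Python. It is not the Trainer's own code. The function name `linear_lr`, the zero warmup, and the value of `TOTAL_STEPS` are assumptions made for the example; the card states neither the warmup setting nor the exact total number of optimizer steps.

```python
def linear_lr(step: int, base_lr: float, total_steps: int, warmup_steps: int = 0) -> float:
    """Learning rate at `step`: linear warmup from 0, then linear decay to 0."""
    if total_steps <= warmup_steps:
        raise ValueError("total_steps must be greater than warmup_steps")
    if step < warmup_steps:
        # Warmup phase: scale up linearly from 0 to base_lr.
        return base_lr * (step / warmup_steps)
    # Decay phase: scale down linearly, never going below 0.
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / (total_steps - warmup_steps))


BASE_LR = 2e-05       # learning_rate from the hyperparameter list
TOTAL_STEPS = 33_000  # ASSUMPTION: placeholder; the real total step count is not stated

if __name__ == "__main__":
    # Print the rate at the start, the midpoint, and the end of the assumed run.
    for step in (0, TOTAL_STEPS // 2, TOTAL_STEPS):
        print(step, linear_lr(step, BASE_LR, TOTAL_STEPS))
```

Under these assumptions the rate starts at `learning_rate` and reaches zero at the final step; a non-zero `warmup_steps` would add a linear ramp-up before the decay.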
### Training results
Step|Training Loss
---|---
500|2.6355
1000|1.6559
1500|1.4608
2000|1.3781
2500|1.3286
3000|1.2879
3500|1.2369
4000|1.1795
4500|1.1303
5000|1.1637
5500|1.1227
6000|1.1406
6500|1.1413
7000|1.0821
7500|1.0964
8000|1.1083
8500|1.0583
9000|1.0825
9500|1.0264
10000|1.0407
10500|1.0352
11000|1.0107
11500|0.8077
12000|0.7105
12500|0.7843
13000|0.7401
13500|0.7716
14000|0.7772
14500|0.749
15000|0.7348
15500|0.7495
16000|0.7756
16500|0.7243
17000|0.7683
17500|0.7536
18000|0.7329
18500|0.7342
19000|0.6998
19500|0.7326
20000|0.7646
20500|0.7729
21000|0.734
21500|0.734
22000|0.691
22500|0.5887
23000|0.5148
23500|0.539
24000|0.5159
24500|0.4908
25000|0.5242
25500|0.5162
26000|0.4862
26500|0.526
27000|0.4953
27500|0.5276
28000|0.4848
28500|0.4863
29000|0.5222
29500|0.5192
30000|0.5088
30500|0.5167
31000|0.4906
31500|0.5161
32000|0.4995
32500|0.4961
33000|0.4653
### Framework versions
- Transformers 4.38.2
- PyTorch 2.1.0+cu118
- Datasets 2.18.0
- Tokenizers 0.15.2