|
--- |
|
license: apache-2.0 |
|
base_model: bert-base-cased |
|
tags: |
|
- generated_from_trainer |
|
model-index: |
|
- name: bert-finetuned-squad |
|
  results: []
|
--- |
|
|
|
|
|
|
# bert-finetuned-squad |
|
|
|
This model is a fine-tuned version of [bert-base-cased](https://huggingface.co/bert-base-cased) on the [SQuAD dataset](https://huggingface.co/datasets/squad).
|
|
|
## Model description |
|
|
|
This model is [bert-base-cased](https://huggingface.co/bert-base-cased) with a span-classification head on top, fine-tuned for extractive question answering: given a question and a context passage, it predicts the start and end token positions of the answer span within the context.
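
Below is a minimal inference sketch; `"bert-finetuned-squad"` is a placeholder for the actual Hub repo id once the checkpoint is pushed:

```python
import torch
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

model_id = "bert-finetuned-squad"  # placeholder: replace with the actual Hub repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForQuestionAnswering.from_pretrained(model_id)

question = "What does BERT stand for?"
context = "BERT stands for Bidirectional Encoder Representations from Transformers."

inputs = tokenizer(question, context, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# The QA head scores every token as a potential answer start/end;
# the simplest decoding takes the argmax of each.
start = outputs.start_logits.argmax()
end = outputs.end_logits.argmax()
print(tokenizer.decode(inputs["input_ids"][0][start : end + 1]))
```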
|
|
|
## Intended uses & limitations |
|
|
|
The model is intended for extractive question answering over English text: the answer it returns is always a contiguous span of the supplied context, so it cannot generate free-form answers or draw on knowledge outside the passage. As with the base model, predictions may reflect biases present in the pretraining corpus.
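
For quick experiments, the `question-answering` pipeline wraps tokenization and span decoding; the model id below is again a placeholder:

```python
from transformers import pipeline

qa = pipeline("question-answering", model="bert-finetuned-squad")  # placeholder repo id
result = qa(
    question="Where is the Eiffel Tower located?",
    context="The Eiffel Tower is located in Paris, France.",
)
# result is a dict with 'score', 'start', 'end', and 'answer' keys
print(result["answer"])
```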
|
|
|
## Training and evaluation data |
|
|
|
The model was fine-tuned on [SQuAD](https://huggingface.co/datasets/squad) (the Stanford Question Answering Dataset), a reading-comprehension benchmark of crowd-sourced questions posed on Wikipedia articles, where the answer to every question is a span of text from the corresponding passage. The split used for evaluation is not recorded in this card.
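
A sketch of loading the data with the `datasets` library, assuming SQuAD v1.1 as hosted on the Hub:

```python
from datasets import load_dataset

raw_datasets = load_dataset("squad")  # assumes SQuAD v1.1: ~87.6k train / 10.6k validation examples
# Each example has 'id', 'title', 'context', 'question', and 'answers' fields
print(raw_datasets["train"][0]["question"])
```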
|
|
|
## Training procedure |
|
|
|
|
|
### Training hyperparameters |
|
|
|
The following hyperparameters were used during training (a `TrainingArguments` sketch reproducing them follows the list):
|
- learning_rate: 2e-05 |
|
- train_batch_size: 8 |
|
- eval_batch_size: 8 |
|
- seed: 42 |
|
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08 |
|
- lr_scheduler_type: linear |
|
- num_epochs: 3 |
|
- mixed_precision_training: Native AMP |
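
A `TrainingArguments` sketch matching the values above; argument names follow the Transformers 4.38 API, and the Adam betas and epsilon listed are the optimizer defaults:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="bert-finetuned-squad",
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=3,
    fp16=True,  # "Native AMP" mixed-precision training
)
```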
|
|
|
### Training results |
|
| Step  | Training Loss |
|:-----:|:-------------:|
| 500   | 2.635500      |
| 1000  | 1.655900      |
| 1500  | 1.460800      |
| 2000  | 1.378100      |
| 2500  | 1.328600      |
| 3000  | 1.287900      |
| 3500  | 1.236900      |
| 4000  | 1.179500      |
| 4500  | 1.130300      |
| 5000  | 1.163700      |
| 5500  | 1.122700      |
| 6000  | 1.140600      |
| 6500  | 1.141300      |
| 7000  | 1.082100      |
| 7500  | 1.096400      |
| 8000  | 1.108300      |
| 8500  | 1.058300      |
| 9000  | 1.082500      |
| 9500  | 1.026400      |
| 10000 | 1.040700      |
| 10500 | 1.035200      |
| 11000 | 1.010700      |
| 11500 | 0.807700      |
| 12000 | 0.710500      |
| 12500 | 0.784300      |
| 13000 | 0.740100      |
| 13500 | 0.771600      |
| 14000 | 0.777200      |
| 14500 | 0.749000      |
| 15000 | 0.734800      |
| 15500 | 0.749500      |
| 16000 | 0.775600      |
| 16500 | 0.724300      |
| 17000 | 0.768300      |
| 17500 | 0.753600      |
| 18000 | 0.732900      |
| 18500 | 0.734200      |
| 19000 | 0.699800      |
| 19500 | 0.732600      |
| 20000 | 0.764600      |
| 20500 | 0.772900      |
| 21000 | 0.734000      |
| 21500 | 0.734000      |
| 22000 | 0.691000      |
| 22500 | 0.588700      |
| 23000 | 0.514800      |
| 23500 | 0.539000      |
| 24000 | 0.515900      |
| 24500 | 0.490800      |
| 25000 | 0.524200      |
| 25500 | 0.516200      |
| 26000 | 0.486200      |
| 26500 | 0.526000      |
| 27000 | 0.495300      |
| 27500 | 0.527600      |
| 28000 | 0.484800      |
| 28500 | 0.486300      |
| 29000 | 0.522200      |
| 29500 | 0.519200      |
| 30000 | 0.508800      |
| 30500 | 0.516700      |
| 31000 | 0.490600      |
| 31500 | 0.516100      |
| 32000 | 0.499500      |
| 32500 | 0.496100      |
| 33000 | 0.465300      |
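
Only the training loss was logged; exact-match and F1 on the validation set were not recorded. A sketch of how they could be computed with the `evaluate` library (the prediction shown is a stand-in, not a real model output):

```python
import evaluate

squad_metric = evaluate.load("squad")
# predictions: [{'id', 'prediction_text'}]; references: [{'id', 'answers'}]
results = squad_metric.compute(
    predictions=[{"id": "001", "prediction_text": "Paris"}],
    references=[{"id": "001", "answers": {"text": ["Paris"], "answer_start": [31]}}],
)
print(results)  # {'exact_match': ..., 'f1': ...}
```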
|
|
|
|
|
### Framework versions |
|
|
|
- Transformers 4.38.2 |
|
- Pytorch 2.1.0+cu118 |
|
- Datasets 2.18.0 |
|
- Tokenizers 0.15.2 |
|
|