---
license: apache-2.0
base_model: bert-base-cased
tags:
- generated_from_trainer
model-index:
- name: bert-finetuned-squad
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# bert-finetuned-squad

This model is a fine-tuned version of [bert-base-cased](https://huggingface.co/bert-base-cased) on the [SQuAD dataset](https://huggingface.co/datasets/rajpurkar/squad).

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3
- mixed_precision_training: Native AMP
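
The hyperparameters above could be expressed as keyword arguments for `transformers.TrainingArguments`, roughly as in the sketch below. It is an illustration and not the original training script. The dictionary keys are existing `TrainingArguments` parameter names, but mapping `train_batch_size` to a per-device value assumes single-device training. The `linear_lr` helper is a hypothetical stand-alone function that mirrors a linear decay schedule. Its zero warmup and the total step count of 33,000 (the last logged step) are assumptions, since the card states neither.

```python
# Keyword arguments mirroring the hyperparameters listed in this card.
# Assumption: the card's batch sizes are per-device values (single device).
training_kwargs = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 3,
    "fp16": True,  # "Native AMP" mixed precision; requires a CUDA device
}

# With transformers installed, these could be passed along as
#   TrainingArguments(output_dir="bert-finetuned-squad", **training_kwargs)
# where output_dir is a placeholder chosen here, not taken from the card.


def linear_lr(step: int, total_steps: int, base_lr: float = 2e-5,
              warmup_steps: int = 0) -> float:
    """Learning rate at `step` under a linear warmup/decay schedule.

    warmup_steps=0 is an assumption; the card does not list a warmup setting.
    """
    if warmup_steps > 0 and step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / warmup_steps
    # Linear decay from base_lr down to 0 at total_steps, clamped at 0.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    total = 33_000  # assumed: last logged step, not the true step count
    for step in (0, 11_000, 22_000, 33_000):
        print(step, linear_lr(step, total))
```

Under these assumptions the learning rate starts at 2e-05 and falls linearly to zero by the end of the third epoch.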

### Training results

| Step | Training Loss |
|---|---|
| 500 | 2.6355 |
| 1000 | 1.6559 |
| 1500 | 1.4608 |
| 2000 | 1.3781 |
| 2500 | 1.3286 |
| 3000 | 1.2879 |
| 3500 | 1.2369 |
| 4000 | 1.1795 |
| 4500 | 1.1303 |
| 5000 | 1.1637 |
| 5500 | 1.1227 |
| 6000 | 1.1406 |
| 6500 | 1.1413 |
| 7000 | 1.0821 |
| 7500 | 1.0964 |
| 8000 | 1.1083 |
| 8500 | 1.0583 |
| 9000 | 1.0825 |
| 9500 | 1.0264 |
| 10000 | 1.0407 |
| 10500 | 1.0352 |
| 11000 | 1.0107 |
| 11500 | 0.8077 |
| 12000 | 0.7105 |
| 12500 | 0.7843 |
| 13000 | 0.7401 |
| 13500 | 0.7716 |
| 14000 | 0.7772 |
| 14500 | 0.749 |
| 15000 | 0.7348 |
| 15500 | 0.7495 |
| 16000 | 0.7756 |
| 16500 | 0.7243 |
| 17000 | 0.7683 |
| 17500 | 0.7536 |
| 18000 | 0.7329 |
| 18500 | 0.7342 |
| 19000 | 0.6998 |
| 19500 | 0.7326 |
| 20000 | 0.7646 |
| 20500 | 0.7729 |
| 21000 | 0.734 |
| 21500 | 0.734 |
| 22000 | 0.691 |
| 22500 | 0.5887 |
| 23000 | 0.5148 |
| 23500 | 0.539 |
| 24000 | 0.5159 |
| 24500 | 0.4908 |
| 25000 | 0.5242 |
| 25500 | 0.5162 |
| 26000 | 0.4862 |
| 26500 | 0.526 |
| 27000 | 0.4953 |
| 27500 | 0.5276 |
| 28000 | 0.4848 |
| 28500 | 0.4863 |
| 29000 | 0.5222 |
| 29500 | 0.5192 |
| 30000 | 0.5088 |
| 30500 | 0.5167 |
| 31000 | 0.4906 |
| 31500 | 0.5161 |
| 32000 | 0.4995 |
| 32500 | 0.4961 |
| 33000 | 0.4653 |

### Framework versions

- Transformers 4.38.2
- PyTorch 2.1.0+cu118
- Datasets 2.18.0
- Tokenizers 0.15.2