bert-finetuned-squad

This model is a fine-tuned version of bert-base-cased on the SQuAD (Stanford Question Answering Dataset) dataset.

Model description

More information needed

Intended uses & limitations

More information needed
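
Until the usage section is filled in, here is a minimal sketch of querying the model for extractive question answering with the transformers pipeline API. The hub id Skier8402/bert-finetuned-squad is assumed from this model page; the question and context strings are illustrative only.

```python
from transformers import pipeline

# Load the fine-tuned checkpoint (hub id assumed from this model page).
qa = pipeline("question-answering", model="Skier8402/bert-finetuned-squad")

# SQuAD-style extractive QA: the answer is a span copied out of the context.
result = qa(
    question="What does BERT stand for?",
    context="BERT stands for Bidirectional Encoder Representations from Transformers.",
)
print(result["answer"], result["score"])
```

The pipeline returns a dict with the answer span, its character offsets in the context, and a confidence score.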

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 3
  • mixed_precision_training: Native AMP
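
The "linear" lr_scheduler_type above means the learning rate decays linearly from its initial value to zero over the course of training (after any warmup). A minimal pure-Python sketch of that schedule, assuming zero warmup steps and the roughly 33,000 total steps shown in the results table below:

```python
def linear_lr(step, total_steps, base_lr=2e-5, warmup_steps=0):
    """Linear learning-rate schedule: ramp up over warmup_steps,
    then decay linearly from base_lr down to 0 at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    return base_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

# Example: at the halfway point the learning rate has halved.
print(linear_lr(16500, 33000))  # 1e-05
```

This mirrors the shape of the schedule the Trainer applies per optimizer step; the exact step count per epoch depends on dataset size and batch size.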

Training results

Step Training Loss
500 2.6355
1000 1.6559
1500 1.4608
2000 1.3781
2500 1.3286
3000 1.2879
3500 1.2369
4000 1.1795
4500 1.1303
5000 1.1637
5500 1.1227
6000 1.1406
6500 1.1413
7000 1.0821
7500 1.0964
8000 1.1083
8500 1.0583
9000 1.0825
9500 1.0264
10000 1.0407
10500 1.0352
11000 1.0107
11500 0.8077
12000 0.7105
12500 0.7843
13000 0.7401
13500 0.7716
14000 0.7772
14500 0.7490
15000 0.7348
15500 0.7495
16000 0.7756
16500 0.7243
17000 0.7683
17500 0.7536
18000 0.7329
18500 0.7342
19000 0.6998
19500 0.7326
20000 0.7646
20500 0.7729
21000 0.7340
21500 0.7340
22000 0.6910
22500 0.5887
23000 0.5148
23500 0.5390
24000 0.5159
24500 0.4908
25000 0.5242
25500 0.5162
26000 0.4862
26500 0.5260
27000 0.4953
27500 0.5276
28000 0.4848
28500 0.4863
29000 0.5222
29500 0.5192
30000 0.5088
30500 0.5167
31000 0.4906
31500 0.5161
32000 0.4995
32500 0.4961
33000 0.4653

Framework versions

  • Transformers 4.38.2
  • Pytorch 2.1.0+cu118
  • Datasets 2.18.0
  • Tokenizers 0.15.2

Model tree for Skier8402/bert-finetuned-squad
