---
license: mit
base_model: unicamp-dl/ptt5-base-portuguese-vocab
tags:
  - generated_from_trainer
datasets:
  - tiagoblima/du-qg-squadv1_pt
model-index:
  - name: t5_base-qg-ap-test
    results: []
---

# t5_base-qg-ap-test

This model is a fine-tuned version of [unicamp-dl/ptt5-base-portuguese-vocab](https://huggingface.co/unicamp-dl/ptt5-base-portuguese-vocab) on the [tiagoblima/du-qg-squadv1_pt](https://huggingface.co/datasets/tiagoblima/du-qg-squadv1_pt) dataset. It achieves the following results on the evaluation set:

- Loss: 0.0163
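
The inference recipe is not documented in this card, but the checkpoint is a standard T5 seq2seq model and can be loaded with the `transformers` Auto classes. The sketch below is illustrative only: the Hub id is inferred from the card name, and the answer/context prompt format is an assumption, not the documented training format.

```python
# Minimal inference sketch (assumptions: the model is hosted at
# "tiagoblima/t5_base-qg-ap-test" and expects an answer-aware prompt;
# the exact input format used during fine-tuning is not documented here).
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "tiagoblima/t5_base-qg-ap-test"  # assumed Hub id, inferred from the card name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Hypothetical answer-aware question-generation input (Portuguese, SQuAD-style).
text = ("resposta: Dom Pedro I  contexto: A independência do Brasil foi "
        "proclamada por Dom Pedro I em 1822.")
inputs = tokenizer(text, return_tensors="pt", truncation=True)
output_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```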

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.0001
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-06
- lr_scheduler_type: linear
- num_epochs: 100.0
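
For reproduction, these values map directly onto the Hugging Face `Seq2SeqTrainingArguments` (the card is tagged `generated_from_trainer`). The sketch below only mirrors the hyperparameters above; the output directory and evaluation strategy are assumptions, and the data pipeline and `Seq2SeqTrainer` wiring are omitted.

```python
# Sketch of training arguments matching the hyperparameters listed above.
# Assumptions: the run used the Hugging Face Seq2SeqTrainer; output_dir is a
# placeholder, and per-epoch evaluation is inferred from the results table below.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="t5_base-qg-ap-test",   # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-6,
    lr_scheduler_type="linear",
    num_train_epochs=100.0,
    evaluation_strategy="epoch",       # assumed; the table reports one eval per epoch
)
```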

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log | 1.0 | 1 | 12.8054 |
| No log | 2.0 | 2 | 10.7880 |
| No log | 3.0 | 3 | 8.8731 |
| No log | 4.0 | 4 | 7.4068 |
| No log | 5.0 | 5 | 6.4581 |
| No log | 6.0 | 6 | 5.6475 |
| No log | 7.0 | 7 | 4.9596 |
| No log | 8.0 | 8 | 4.5058 |
| No log | 9.0 | 9 | 4.0768 |
| No log | 10.0 | 10 | 3.7047 |
| No log | 11.0 | 11 | 3.4143 |
| No log | 12.0 | 12 | 3.1360 |
| No log | 13.0 | 13 | 2.8866 |
| No log | 14.0 | 14 | 2.6325 |
| No log | 15.0 | 15 | 2.3889 |
| No log | 16.0 | 16 | 2.1914 |
| No log | 17.0 | 17 | 2.0424 |
| No log | 18.0 | 18 | 1.9111 |
| No log | 19.0 | 19 | 1.7763 |
| No log | 20.0 | 20 | 1.6505 |
| No log | 21.0 | 21 | 1.5257 |
| No log | 22.0 | 22 | 1.4126 |
| No log | 23.0 | 23 | 1.3109 |
| No log | 24.0 | 24 | 1.2189 |
| No log | 25.0 | 25 | 1.1338 |
| No log | 26.0 | 26 | 1.0486 |
| No log | 27.0 | 27 | 0.9640 |
| No log | 28.0 | 28 | 0.8828 |
| No log | 29.0 | 29 | 0.8060 |
| No log | 30.0 | 30 | 0.7329 |
| No log | 31.0 | 31 | 0.6639 |
| No log | 32.0 | 32 | 0.6010 |
| No log | 33.0 | 33 | 0.5439 |
| No log | 34.0 | 34 | 0.4925 |
| No log | 35.0 | 35 | 0.4471 |
| No log | 36.0 | 36 | 0.4066 |
| No log | 37.0 | 37 | 0.3690 |
| No log | 38.0 | 38 | 0.3341 |
| No log | 39.0 | 39 | 0.3023 |
| No log | 40.0 | 40 | 0.2746 |
| No log | 41.0 | 41 | 0.2470 |
| No log | 42.0 | 42 | 0.2205 |
| No log | 43.0 | 43 | 0.1968 |
| No log | 44.0 | 44 | 0.1771 |
| No log | 45.0 | 45 | 0.1593 |
| No log | 46.0 | 46 | 0.1424 |
| No log | 47.0 | 47 | 0.1288 |
| No log | 48.0 | 48 | 0.1170 |
| No log | 49.0 | 49 | 0.1070 |
| No log | 50.0 | 50 | 0.0996 |
| No log | 51.0 | 51 | 0.0939 |
| No log | 52.0 | 52 | 0.0888 |
| No log | 53.0 | 53 | 0.0845 |
| No log | 54.0 | 54 | 0.0818 |
| No log | 55.0 | 55 | 0.0790 |
| No log | 56.0 | 56 | 0.0763 |
| No log | 57.0 | 57 | 0.0732 |
| No log | 58.0 | 58 | 0.0697 |
| No log | 59.0 | 59 | 0.0666 |
| No log | 60.0 | 60 | 0.0642 |
| No log | 61.0 | 61 | 0.0611 |
| No log | 62.0 | 62 | 0.0583 |
| No log | 63.0 | 63 | 0.0560 |
| No log | 64.0 | 64 | 0.0532 |
| No log | 65.0 | 65 | 0.0512 |
| No log | 66.0 | 66 | 0.0487 |
| No log | 67.0 | 67 | 0.0464 |
| No log | 68.0 | 68 | 0.0431 |
| No log | 69.0 | 69 | 0.0399 |
| No log | 70.0 | 70 | 0.0381 |
| No log | 71.0 | 71 | 0.0364 |
| No log | 72.0 | 72 | 0.0348 |
| No log | 73.0 | 73 | 0.0333 |
| No log | 74.0 | 74 | 0.0316 |
| No log | 75.0 | 75 | 0.0299 |
| No log | 76.0 | 76 | 0.0285 |
| No log | 77.0 | 77 | 0.0274 |
| No log | 78.0 | 78 | 0.0264 |
| No log | 79.0 | 79 | 0.0253 |
| No log | 80.0 | 80 | 0.0242 |
| No log | 81.0 | 81 | 0.0236 |
| No log | 82.0 | 82 | 0.0231 |
| No log | 83.0 | 83 | 0.0229 |
| No log | 84.0 | 84 | 0.0226 |
| No log | 85.0 | 85 | 0.0223 |
| No log | 86.0 | 86 | 0.0218 |
| No log | 87.0 | 87 | 0.0212 |
| No log | 88.0 | 88 | 0.0205 |
| No log | 89.0 | 89 | 0.0198 |
| No log | 90.0 | 90 | 0.0192 |
| No log | 91.0 | 91 | 0.0186 |
| No log | 92.0 | 92 | 0.0181 |
| No log | 93.0 | 93 | 0.0177 |
| No log | 94.0 | 94 | 0.0173 |
| No log | 95.0 | 95 | 0.0170 |
| No log | 96.0 | 96 | 0.0168 |
| No log | 97.0 | 97 | 0.0166 |
| No log | 98.0 | 98 | 0.0165 |
| No log | 99.0 | 99 | 0.0164 |
| 1.4009 | 100.0 | 100 | 0.0163 |

### Framework versions

- Transformers 4.35.2
- Pytorch 2.0.0
- Datasets 2.15.0
- Tokenizers 0.15.0