
e500_lr2e-05

This model is a fine-tuned version of adalbertojunior/distilbert-portuguese-cased (a 66.4M-parameter DistilBERT for Portuguese) on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7396
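
Since the card does not document the downstream task, the sketch below assumes the checkpoint keeps the masked-language-modeling head of its DistilBERT base model (consistent with the base model and the loss-only metric). The example sentence and its predictions are purely illustrative.

```python
# A minimal inference sketch, assuming this checkpoint has a
# masked-language-modeling head; the card does not state the task,
# so treat this as illustrative only.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="zemaia/e500_lr2e-05")

# DistilBERT-style tokenizers use [MASK] as the mask token.
for pred in fill_mask("O tempo hoje está muito [MASK]."):
    print(f"{pred['token_str']!r}: {pred['score']:.3f}")
```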

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 200
  • eval_batch_size: 400
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 500
  • mixed_precision_training: Native AMP
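
For reference, these values map onto transformers.TrainingArguments roughly as follows. This is a hedged reconstruction, not the author's training script: only the values listed above come from the card, while output_dir and everything about the data pipeline are assumptions.

```python
# A sketch of how the listed hyperparameters map onto the transformers
# Trainer API. Only the values listed above are taken from the card;
# output_dir is an illustrative assumption.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="e500_lr2e-05",   # assumed; not stated on the card
    learning_rate=2e-5,
    per_device_train_batch_size=200,
    per_device_eval_batch_size=400,
    seed=42,
    adam_beta1=0.9,              # Adam with betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=500,
    fp16=True,                   # mixed_precision_training: Native AMP
)
```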

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:---:|:---:|:---:|:---:|
| 6.7563 | 1.6949 | 100 | 5.4137 |
| 5.0553 | 3.3898 | 200 | 4.4824 |
| 4.3687 | 5.0847 | 300 | 3.9332 |
| 3.9319 | 6.7797 | 400 | 3.5644 |
| 3.6101 | 8.4746 | 500 | 3.2889 |
| 3.3843 | 10.1695 | 600 | 3.0760 |
| 3.1869 | 11.8644 | 700 | 2.9195 |
| 3.0395 | 13.5593 | 800 | 2.7842 |
| 2.9038 | 15.2542 | 900 | 2.6563 |
| 2.7768 | 16.9492 | 1000 | 2.5554 |
| 2.6835 | 18.6441 | 1100 | 2.4614 |
| 2.5903 | 20.3390 | 1200 | 2.3882 |
| 2.5214 | 22.0339 | 1300 | 2.3210 |
| 2.4401 | 23.7288 | 1400 | 2.2352 |
| 2.373 | 25.4237 | 1500 | 2.2145 |
| 2.3147 | 27.1186 | 1600 | 2.1609 |
| 2.2606 | 28.8136 | 1700 | 2.0704 |
| 2.2064 | 30.5085 | 1800 | 2.0260 |
| 2.1572 | 32.2034 | 1900 | 2.0259 |
| 2.1258 | 33.8983 | 2000 | 1.9498 |
| 2.0683 | 35.5932 | 2100 | 1.9212 |
| 2.0374 | 37.2881 | 2200 | 1.8884 |
| 1.9998 | 38.9831 | 2300 | 1.8543 |
| 1.9582 | 40.6780 | 2400 | 1.8106 |
| 1.932 | 42.3729 | 2500 | 1.7822 |
| 1.8862 | 44.0678 | 2600 | 1.7673 |
| 1.8677 | 45.7627 | 2700 | 1.7280 |
| 1.8375 | 47.4576 | 2800 | 1.7147 |
| 1.8128 | 49.1525 | 2900 | 1.6882 |
| 1.7874 | 50.8475 | 3000 | 1.6357 |
| 1.7628 | 52.5424 | 3100 | 1.6502 |
| 1.7391 | 54.2373 | 3200 | 1.6312 |
| 1.709 | 55.9322 | 3300 | 1.5989 |
| 1.6878 | 57.6271 | 3400 | 1.5503 |
| 1.6605 | 59.3220 | 3500 | 1.5602 |
| 1.6331 | 61.0169 | 3600 | 1.5486 |
| 1.6206 | 62.7119 | 3700 | 1.5046 |
| 1.6057 | 64.4068 | 3800 | 1.5098 |
| 1.5877 | 66.1017 | 3900 | 1.4885 |
| 1.5576 | 67.7966 | 4000 | 1.4747 |
| 1.5413 | 69.4915 | 4100 | 1.4500 |
| 1.5142 | 71.1864 | 4200 | 1.3917 |
| 1.4847 | 72.8814 | 4300 | 1.3771 |
| 1.4665 | 74.5763 | 4400 | 1.3737 |
| 1.4562 | 76.2712 | 4500 | 1.3560 |
| 1.4422 | 77.9661 | 4600 | 1.3394 |
| 1.4148 | 79.6610 | 4700 | 1.3453 |
| 1.4108 | 81.3559 | 4800 | 1.3261 |
| 1.3992 | 83.0508 | 4900 | 1.3111 |
| 1.3784 | 84.7458 | 5000 | 1.3083 |
| 1.3607 | 86.4407 | 5100 | 1.2982 |
| 1.352 | 88.1356 | 5200 | 1.2758 |
| 1.3353 | 89.8305 | 5300 | 1.2818 |
| 1.3173 | 91.5254 | 5400 | 1.2697 |
| 1.3085 | 93.2203 | 5500 | 1.2440 |
| 1.2955 | 94.9153 | 5600 | 1.2099 |
| 1.2933 | 96.6102 | 5700 | 1.2337 |
| 1.2757 | 98.3051 | 5800 | 1.2056 |
| 1.262 | 100.0 | 5900 | 1.1993 |
| 1.2509 | 101.6949 | 6000 | 1.1933 |
| 1.2418 | 103.3898 | 6100 | 1.1645 |
| 1.2275 | 105.0847 | 6200 | 1.1820 |
| 1.2219 | 106.7797 | 6300 | 1.1452 |
| 1.216 | 108.4746 | 6400 | 1.1709 |
| 1.1954 | 110.1695 | 6500 | 1.1386 |
| 1.1858 | 111.8644 | 6600 | 1.1336 |
| 1.1799 | 113.5593 | 6700 | 1.1217 |
| 1.1707 | 115.2542 | 6800 | 1.1102 |
| 1.1653 | 116.9492 | 6900 | 1.1093 |
| 1.1476 | 118.6441 | 7000 | 1.1032 |
| 1.1406 | 120.3390 | 7100 | 1.1004 |
| 1.1364 | 122.0339 | 7200 | 1.0698 |
| 1.1173 | 123.7288 | 7300 | 1.0817 |
| 1.1129 | 125.4237 | 7400 | 1.0825 |
| 1.1077 | 127.1186 | 7500 | 1.0728 |
| 1.0943 | 128.8136 | 7600 | 1.0496 |
| 1.0881 | 130.5085 | 7700 | 1.0443 |
| 1.0774 | 132.2034 | 7800 | 1.0392 |
| 1.0789 | 133.8983 | 7900 | 1.0470 |
| 1.0608 | 135.5932 | 8000 | 1.0248 |
| 1.0516 | 137.2881 | 8100 | 1.0144 |
| 1.0533 | 138.9831 | 8200 | 1.0246 |
| 1.0401 | 140.6780 | 8300 | 1.0180 |
| 1.0347 | 142.3729 | 8400 | 0.9903 |
| 1.0268 | 144.0678 | 8500 | 0.9809 |
| 1.016 | 145.7627 | 8600 | 0.9839 |
| 1.003 | 147.4576 | 8700 | 0.9870 |
| 1.0066 | 149.1525 | 8800 | 0.9610 |
| 1.004 | 150.8475 | 8900 | 0.9488 |
| 0.9918 | 152.5424 | 9000 | 0.9601 |
| 0.996 | 154.2373 | 9100 | 0.9660 |
| 0.9835 | 155.9322 | 9200 | 0.9376 |
| 0.9801 | 157.6271 | 9300 | 0.9504 |
| 0.9606 | 159.3220 | 9400 | 0.9482 |
| 0.9646 | 161.0169 | 9500 | 0.9312 |
| 0.9637 | 162.7119 | 9600 | 0.9304 |
| 0.9528 | 164.4068 | 9700 | 0.9270 |
| 0.9432 | 166.1017 | 9800 | 0.9205 |
| 0.9398 | 167.7966 | 9900 | 0.9202 |
| 0.9377 | 169.4915 | 10000 | 0.9167 |
| 0.9282 | 171.1864 | 10100 | 0.9122 |
| 0.9118 | 172.8814 | 10200 | 0.9034 |
| 0.907 | 174.5763 | 10300 | 0.8839 |
| 0.9152 | 176.2712 | 10400 | 0.8879 |
| 0.9124 | 177.9661 | 10500 | 0.8885 |
| 0.9005 | 179.6610 | 10600 | 0.8832 |
| 0.8979 | 181.3559 | 10700 | 0.8767 |
| 0.8836 | 183.0508 | 10800 | 0.8886 |
| 0.882 | 184.7458 | 10900 | 0.8601 |
| 0.8818 | 186.4407 | 11000 | 0.8713 |
| 0.8724 | 188.1356 | 11100 | 0.8602 |
| 0.8688 | 189.8305 | 11200 | 0.8510 |
| 0.8677 | 191.5254 | 11300 | 0.8401 |
| 0.8643 | 193.2203 | 11400 | 0.8453 |
| 0.8638 | 194.9153 | 11500 | 0.8351 |
| 0.8539 | 196.6102 | 11600 | 0.8460 |
| 0.852 | 198.3051 | 11700 | 0.8474 |
| 0.8433 | 200.0 | 11800 | 0.8249 |
| 0.8394 | 201.6949 | 11900 | 0.8326 |
| 0.8339 | 203.3898 | 12000 | 0.8331 |
| 0.8284 | 205.0847 | 12100 | 0.8216 |
| 0.8284 | 206.7797 | 12200 | 0.8148 |
| 0.8261 | 208.4746 | 12300 | 0.8020 |
| 0.8158 | 210.1695 | 12400 | 0.8112 |
| 0.8148 | 211.8644 | 12500 | 0.8154 |
| 0.8118 | 213.5593 | 12600 | 0.8058 |
| 0.8067 | 215.2542 | 12700 | 0.8005 |
| 0.8022 | 216.9492 | 12800 | 0.8021 |
| 0.793 | 218.6441 | 12900 | 0.8000 |
| 0.8003 | 220.3390 | 13000 | 0.7924 |
| 0.7891 | 222.0339 | 13100 | 0.7891 |
| 0.7802 | 223.7288 | 13200 | 0.7678 |
| 0.7906 | 225.4237 | 13300 | 0.7902 |
| 0.7756 | 227.1186 | 13400 | 0.7774 |
| 0.7788 | 228.8136 | 13500 | 0.7639 |
| 0.7654 | 230.5085 | 13600 | 0.7767 |
| 0.7686 | 232.2034 | 13700 | 0.7831 |
| 0.7691 | 233.8983 | 13800 | 0.7735 |
| 0.7656 | 235.5932 | 13900 | 0.7632 |
| 0.7597 | 237.2881 | 14000 | 0.7694 |
| 0.7562 | 238.9831 | 14100 | 0.7475 |
| 0.754 | 240.6780 | 14200 | 0.7585 |
| 0.7461 | 242.3729 | 14300 | 0.7502 |
| 0.749 | 244.0678 | 14400 | 0.7533 |
| 0.7482 | 245.7627 | 14500 | 0.7308 |
| 0.7436 | 247.4576 | 14600 | 0.7581 |
| 0.7395 | 249.1525 | 14700 | 0.7118 |
| 0.7339 | 250.8475 | 14800 | 0.7458 |
| 0.7337 | 252.5424 | 14900 | 0.7232 |
| 0.7262 | 254.2373 | 15000 | 0.7421 |
| 0.7313 | 255.9322 | 15100 | 0.7097 |
| 0.7223 | 257.6271 | 15200 | 0.7235 |
| 0.7189 | 259.3220 | 15300 | 0.7222 |
| 0.7228 | 261.0169 | 15400 | 0.7373 |
| 0.7163 | 262.7119 | 15500 | 0.7247 |
| 0.7102 | 264.4068 | 15600 | 0.7255 |

Framework versions

  • Transformers 4.44.2
  • PyTorch 2.4.1+cu121
  • Datasets 3.0.1
  • Tokenizers 0.19.1
