
e500_lr2e-05

This model is a fine-tuned version of adalbertojunior/distilbert-portuguese-cased (a 66.4M-parameter DistilBERT for Portuguese) on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7396
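
Since the card does not document the downstream task, the sketch below assumes the checkpoint keeps the masked-language-modeling head of its DistilBERT base model (consistent with the base model and the loss-only metric). The example sentence and its predictions are purely illustrative.

```python
# A minimal inference sketch, assuming this checkpoint has a
# masked-language-modeling head; the card does not state the task,
# so treat this as illustrative only.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="zemaia/e500_lr2e-05")

# DistilBERT-style tokenizers use [MASK] as the mask token.
for pred in fill_mask("O tempo hoje está muito [MASK]."):
    print(f"{pred['token_str']!r}: {pred['score']:.3f}")
```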

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 200
  • eval_batch_size: 400
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 500
  • mixed_precision_training: Native AMP
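
For reference, these values map onto transformers.TrainingArguments roughly as follows. This is a hedged reconstruction, not the author's training script: only the values listed above come from the card, while output_dir and everything about the data pipeline are assumptions.

```python
# A sketch of how the listed hyperparameters map onto the transformers
# Trainer API. Only the values listed above are taken from the card;
# output_dir is an illustrative assumption.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="e500_lr2e-05",   # assumed; not stated on the card
    learning_rate=2e-5,
    per_device_train_batch_size=200,
    per_device_eval_batch_size=400,
    seed=42,
    adam_beta1=0.9,              # Adam with betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=500,
    fp16=True,                   # mixed_precision_training: Native AMP
)
```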

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:---:|:---:|:---:|:---:|
| 6.7563 | 1.6949 | 100 | 5.4137 |
| 5.0553 | 3.3898 | 200 | 4.4824 |
| 4.3687 | 5.0847 | 300 | 3.9332 |
| 3.9319 | 6.7797 | 400 | 3.5644 |
| 3.6101 | 8.4746 | 500 | 3.2889 |
| 3.3843 | 10.1695 | 600 | 3.0760 |
| 3.1869 | 11.8644 | 700 | 2.9195 |
| 3.0395 | 13.5593 | 800 | 2.7842 |
| 2.9038 | 15.2542 | 900 | 2.6563 |
| 2.7768 | 16.9492 | 1000 | 2.5554 |
| 2.6835 | 18.6441 | 1100 | 2.4614 |
| 2.5903 | 20.3390 | 1200 | 2.3882 |
| 2.5214 | 22.0339 | 1300 | 2.3210 |
| 2.4401 | 23.7288 | 1400 | 2.2352 |
| 2.373 | 25.4237 | 1500 | 2.2145 |
| 2.3147 | 27.1186 | 1600 | 2.1609 |
| 2.2606 | 28.8136 | 1700 | 2.0704 |
| 2.2064 | 30.5085 | 1800 | 2.0260 |
| 2.1572 | 32.2034 | 1900 | 2.0259 |
| 2.1258 | 33.8983 | 2000 | 1.9498 |
| 2.0683 | 35.5932 | 2100 | 1.9212 |
| 2.0374 | 37.2881 | 2200 | 1.8884 |
| 1.9998 | 38.9831 | 2300 | 1.8543 |
| 1.9582 | 40.6780 | 2400 | 1.8106 |
| 1.932 | 42.3729 | 2500 | 1.7822 |
| 1.8862 | 44.0678 | 2600 | 1.7673 |
| 1.8677 | 45.7627 | 2700 | 1.7280 |
| 1.8375 | 47.4576 | 2800 | 1.7147 |
| 1.8128 | 49.1525 | 2900 | 1.6882 |
| 1.7874 | 50.8475 | 3000 | 1.6357 |
| 1.7628 | 52.5424 | 3100 | 1.6502 |
| 1.7391 | 54.2373 | 3200 | 1.6312 |
| 1.709 | 55.9322 | 3300 | 1.5989 |
| 1.6878 | 57.6271 | 3400 | 1.5503 |
| 1.6605 | 59.3220 | 3500 | 1.5602 |
| 1.6331 | 61.0169 | 3600 | 1.5486 |
| 1.6206 | 62.7119 | 3700 | 1.5046 |
| 1.6057 | 64.4068 | 3800 | 1.5098 |
| 1.5877 | 66.1017 | 3900 | 1.4885 |
| 1.5576 | 67.7966 | 4000 | 1.4747 |
| 1.5413 | 69.4915 | 4100 | 1.4500 |
| 1.5142 | 71.1864 | 4200 | 1.3917 |
| 1.4847 | 72.8814 | 4300 | 1.3771 |
| 1.4665 | 74.5763 | 4400 | 1.3737 |
| 1.4562 | 76.2712 | 4500 | 1.3560 |
| 1.4422 | 77.9661 | 4600 | 1.3394 |
| 1.4148 | 79.6610 | 4700 | 1.3453 |
| 1.4108 | 81.3559 | 4800 | 1.3261 |
| 1.3992 | 83.0508 | 4900 | 1.3111 |
| 1.3784 | 84.7458 | 5000 | 1.3083 |
| 1.3607 | 86.4407 | 5100 | 1.2982 |
| 1.352 | 88.1356 | 5200 | 1.2758 |
| 1.3353 | 89.8305 | 5300 | 1.2818 |
| 1.3173 | 91.5254 | 5400 | 1.2697 |
| 1.3085 | 93.2203 | 5500 | 1.2440 |
| 1.2955 | 94.9153 | 5600 | 1.2099 |
| 1.2933 | 96.6102 | 5700 | 1.2337 |
| 1.2757 | 98.3051 | 5800 | 1.2056 |
| 1.262 | 100.0 | 5900 | 1.1993 |
| 1.2509 | 101.6949 | 6000 | 1.1933 |
| 1.2418 | 103.3898 | 6100 | 1.1645 |
| 1.2275 | 105.0847 | 6200 | 1.1820 |
| 1.2219 | 106.7797 | 6300 | 1.1452 |
| 1.216 | 108.4746 | 6400 | 1.1709 |
| 1.1954 | 110.1695 | 6500 | 1.1386 |
| 1.1858 | 111.8644 | 6600 | 1.1336 |
| 1.1799 | 113.5593 | 6700 | 1.1217 |
| 1.1707 | 115.2542 | 6800 | 1.1102 |
| 1.1653 | 116.9492 | 6900 | 1.1093 |
| 1.1476 | 118.6441 | 7000 | 1.1032 |
| 1.1406 | 120.3390 | 7100 | 1.1004 |
| 1.1364 | 122.0339 | 7200 | 1.0698 |
| 1.1173 | 123.7288 | 7300 | 1.0817 |
| 1.1129 | 125.4237 | 7400 | 1.0825 |
| 1.1077 | 127.1186 | 7500 | 1.0728 |
| 1.0943 | 128.8136 | 7600 | 1.0496 |
| 1.0881 | 130.5085 | 7700 | 1.0443 |
| 1.0774 | 132.2034 | 7800 | 1.0392 |
| 1.0789 | 133.8983 | 7900 | 1.0470 |
| 1.0608 | 135.5932 | 8000 | 1.0248 |
| 1.0516 | 137.2881 | 8100 | 1.0144 |
| 1.0533 | 138.9831 | 8200 | 1.0246 |
| 1.0401 | 140.6780 | 8300 | 1.0180 |
| 1.0347 | 142.3729 | 8400 | 0.9903 |
| 1.0268 | 144.0678 | 8500 | 0.9809 |
| 1.016 | 145.7627 | 8600 | 0.9839 |
| 1.003 | 147.4576 | 8700 | 0.9870 |
| 1.0066 | 149.1525 | 8800 | 0.9610 |
| 1.004 | 150.8475 | 8900 | 0.9488 |
| 0.9918 | 152.5424 | 9000 | 0.9601 |
| 0.996 | 154.2373 | 9100 | 0.9660 |
| 0.9835 | 155.9322 | 9200 | 0.9376 |
| 0.9801 | 157.6271 | 9300 | 0.9504 |
| 0.9606 | 159.3220 | 9400 | 0.9482 |
| 0.9646 | 161.0169 | 9500 | 0.9312 |
| 0.9637 | 162.7119 | 9600 | 0.9304 |
| 0.9528 | 164.4068 | 9700 | 0.9270 |
| 0.9432 | 166.1017 | 9800 | 0.9205 |
| 0.9398 | 167.7966 | 9900 | 0.9202 |
| 0.9377 | 169.4915 | 10000 | 0.9167 |
| 0.9282 | 171.1864 | 10100 | 0.9122 |
| 0.9118 | 172.8814 | 10200 | 0.9034 |
| 0.907 | 174.5763 | 10300 | 0.8839 |
| 0.9152 | 176.2712 | 10400 | 0.8879 |
| 0.9124 | 177.9661 | 10500 | 0.8885 |
| 0.9005 | 179.6610 | 10600 | 0.8832 |
| 0.8979 | 181.3559 | 10700 | 0.8767 |
| 0.8836 | 183.0508 | 10800 | 0.8886 |
| 0.882 | 184.7458 | 10900 | 0.8601 |
| 0.8818 | 186.4407 | 11000 | 0.8713 |
| 0.8724 | 188.1356 | 11100 | 0.8602 |
| 0.8688 | 189.8305 | 11200 | 0.8510 |
| 0.8677 | 191.5254 | 11300 | 0.8401 |
| 0.8643 | 193.2203 | 11400 | 0.8453 |
| 0.8638 | 194.9153 | 11500 | 0.8351 |
| 0.8539 | 196.6102 | 11600 | 0.8460 |
| 0.852 | 198.3051 | 11700 | 0.8474 |
| 0.8433 | 200.0 | 11800 | 0.8249 |
| 0.8394 | 201.6949 | 11900 | 0.8326 |
| 0.8339 | 203.3898 | 12000 | 0.8331 |
| 0.8284 | 205.0847 | 12100 | 0.8216 |
| 0.8284 | 206.7797 | 12200 | 0.8148 |
| 0.8261 | 208.4746 | 12300 | 0.8020 |
| 0.8158 | 210.1695 | 12400 | 0.8112 |
| 0.8148 | 211.8644 | 12500 | 0.8154 |
| 0.8118 | 213.5593 | 12600 | 0.8058 |
| 0.8067 | 215.2542 | 12700 | 0.8005 |
| 0.8022 | 216.9492 | 12800 | 0.8021 |
| 0.793 | 218.6441 | 12900 | 0.8000 |
| 0.8003 | 220.3390 | 13000 | 0.7924 |
| 0.7891 | 222.0339 | 13100 | 0.7891 |
| 0.7802 | 223.7288 | 13200 | 0.7678 |
| 0.7906 | 225.4237 | 13300 | 0.7902 |
| 0.7756 | 227.1186 | 13400 | 0.7774 |
| 0.7788 | 228.8136 | 13500 | 0.7639 |
| 0.7654 | 230.5085 | 13600 | 0.7767 |
| 0.7686 | 232.2034 | 13700 | 0.7831 |
| 0.7691 | 233.8983 | 13800 | 0.7735 |
| 0.7656 | 235.5932 | 13900 | 0.7632 |
| 0.7597 | 237.2881 | 14000 | 0.7694 |
| 0.7562 | 238.9831 | 14100 | 0.7475 |
| 0.754 | 240.6780 | 14200 | 0.7585 |
| 0.7461 | 242.3729 | 14300 | 0.7502 |
| 0.749 | 244.0678 | 14400 | 0.7533 |
| 0.7482 | 245.7627 | 14500 | 0.7308 |
| 0.7436 | 247.4576 | 14600 | 0.7581 |
| 0.7395 | 249.1525 | 14700 | 0.7118 |
| 0.7339 | 250.8475 | 14800 | 0.7458 |
| 0.7337 | 252.5424 | 14900 | 0.7232 |
| 0.7262 | 254.2373 | 15000 | 0.7421 |
| 0.7313 | 255.9322 | 15100 | 0.7097 |
| 0.7223 | 257.6271 | 15200 | 0.7235 |
| 0.7189 | 259.3220 | 15300 | 0.7222 |
| 0.7228 | 261.0169 | 15400 | 0.7373 |
| 0.7163 | 262.7119 | 15500 | 0.7247 |
| 0.7102 | 264.4068 | 15600 | 0.7255 |

Framework versions

  • Transformers 4.44.2
  • PyTorch 2.4.1+cu121
  • Datasets 3.0.1
  • Tokenizers 0.19.1
