t5-base-p-l-akk-en-20240922-080244

This model was trained from scratch on an unspecified dataset (the card records the dataset name as None). It achieves the following results on the evaluation set:

Loss: 0.8507
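
As a rough sanity check, the reported loss can be converted to perplexity. This assumes the loss is the mean per-token cross-entropy in nats, which is the usual convention for sequence-to-sequence models trained with Transformers but is not stated on this card. The helper name is illustrative.

```python
import math


def loss_to_perplexity(loss: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(loss)


# Final evaluation loss reported on this card.
final_eval_loss = 0.8507
print(f"perplexity ~ {loss_to_perplexity(final_eval_loss):.3f}")
```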

Model description

More information needed

Intended uses & limitations

More information needed
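
Until usage is documented here, the following is a minimal sketch of how inference might look. It assumes the checkpoint is published on the Hugging Face Hub under the id `Thalesian/t5-base-p-l-akk-en-20240922-080244` and loads as a standard T5-style sequence-to-sequence model. The `translate` helper, the `max_new_tokens` default, and the absence of a task prefix are all assumptions; the expected input format and language direction are not stated on this card.

```python
# Assumed Hub id; adjust if the checkpoint lives elsewhere.
MODEL_ID = "Thalesian/t5-base-p-l-akk-en-20240922-080244"


def translate(text: str, model_id: str = MODEL_ID, max_new_tokens: int = 128) -> str:
    """Generate an output sequence for `text` with the seq2seq checkpoint."""
    # Imported lazily so this module can be read without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # No task prefix is prepended: the card does not document one.
    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `translate("...")` downloads the weights on first use, so it needs network access and enough memory for a T5-base-sized model.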

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 3.152142797506865e-05
train_batch_size: 12
eval_batch_size: 12
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 100
mixed_precision_training: Native AMP
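
For anyone trying to reproduce the run, the values above might map onto `Seq2SeqTrainingArguments` roughly as follows. This is a sketch under assumptions, not the author's training script: the card does not say which trainer class was used, the batch sizes are treated as per-device values on a single device, "Native AMP" is taken to mean `fp16=True`, and `output_dir` is a hypothetical path.

```python
# Values copied from the hyperparameter list on this card.
TRAINING_KWARGS = {
    "learning_rate": 3.152142797506865e-05,
    "per_device_train_batch_size": 12,  # card: train_batch_size (assumes one device)
    "per_device_eval_batch_size": 12,   # card: eval_batch_size (assumes one device)
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 100,
    "fp16": True,  # card: mixed_precision_training: Native AMP (assumed fp16)
}


def build_training_args(output_dir: str = "t5-akk-en-run"):
    """Build Seq2SeqTrainingArguments; output_dir is a hypothetical path."""
    # Imported lazily so the dictionary above can be inspected on its own.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)
```

Constructing the arguments with `fp16=True` generally requires a CUDA-capable device, so `build_training_args()` is best called on the training machine.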

Training results

Training Loss	Epoch	Step	Validation Loss
0.9444	1.1384	2500	0.8638
0.8431	2.2769	5000	0.8085
0.7912	3.4153	7500	0.7750
0.7434	4.5537	10000	0.7531
0.7171	5.6922	12500	0.7395
0.692	6.8306	15000	0.7278
0.6596	7.9690	17500	0.7165
0.6155	9.1075	20000	0.7231
0.61	10.2459	22500	0.7129
0.5886	11.3843	25000	0.7068
0.5718	12.5228	27500	0.7084
0.5519	13.6612	30000	0.7029
0.5412	14.7996	32500	0.7007
0.5241	15.9381	35000	0.7017
0.5026	17.0765	37500	0.7134
0.4733	18.2149	40000	0.7038
0.489	19.3534	42500	0.7067
0.4666	20.4918	45000	0.7083
0.4494	21.6302	47500	0.7061
0.4545	22.7687	50000	0.7092
0.4357	23.9071	52500	0.7116
0.4332	25.0455	55000	0.7189
0.4152	26.1840	57500	0.7207
0.3995	27.3224	60000	0.7196
0.3976	28.4608	62500	0.7184
0.3879	29.5993	65000	0.7210
0.3812	30.7377	67500	0.7243
0.3749	31.8761	70000	0.7241
0.3663	33.0146	72500	0.7320
0.3612	34.1530	75000	0.7344
0.3469	35.2914	77500	0.7377
0.3407	36.4299	80000	0.7388
0.3309	37.5683	82500	0.7411
0.3354	38.7067	85000	0.7354
0.3252	39.8452	87500	0.7407
0.3167	40.9836	90000	0.7435
0.3182	42.1220	92500	0.7502
0.2994	43.2605	95000	0.7547
0.3064	44.3989	97500	0.7561
0.2923	45.5373	100000	0.7529
0.2848	46.6758	102500	0.7593
0.2843	47.8142	105000	0.7600
0.279	48.9526	107500	0.7650
0.2781	50.0911	110000	0.7706
0.2629	51.2295	112500	0.7730
0.2639	52.3679	115000	0.7726
0.2624	53.5064	117500	0.7791
0.2547	54.6448	120000	0.7776
0.2567	55.7832	122500	0.7747
0.2484	56.9217	125000	0.7792
0.2454	58.0601	127500	0.7893
0.2398	59.1985	130000	0.7864
0.2313	60.3370	132500	0.7973
0.2362	61.4754	135000	0.7964
0.2359	62.6138	137500	0.7962
0.226	63.7523	140000	0.8009
0.2271	64.8907	142500	0.8027
0.2249	66.0291	145000	0.8014
0.2212	67.1676	147500	0.8077
0.2129	68.3060	150000	0.8088
0.2131	69.4444	152500	0.8108
0.2106	70.5829	155000	0.8144
0.2078	71.7213	157500	0.8163
0.2103	72.8597	160000	0.8148
0.2025	73.9982	162500	0.8215
0.2023	75.1366	165000	0.8250
0.197	76.2750	167500	0.8267
0.1945	77.4135	170000	0.8274
0.1919	78.5519	172500	0.8289
0.187	79.6903	175000	0.8308
0.1948	80.8288	177500	0.8339
0.1857	81.9672	180000	0.8346
0.191	83.1056	182500	0.8380
0.1796	84.2441	185000	0.8387
0.1862	85.3825	187500	0.8414
0.185	86.5209	190000	0.8409
0.1778	87.6594	192500	0.8434
0.1824	88.7978	195000	0.8426
0.1735	89.9362	197500	0.8443
0.1737	91.0747	200000	0.8474
0.1787	92.2131	202500	0.8462
0.1759	93.3515	205000	0.8484
0.1744	94.4900	207500	0.8487
0.1778	95.6284	210000	0.8502
0.1767	96.7668	212500	0.8507
0.175	97.9053	215000	0.8499
0.1723	99.0437	217500	0.8507
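
In the table above, validation loss bottoms out at 0.7007 (step 32500, epoch 14.80) and then climbs to 0.8507 by the last logged step while training loss keeps falling, a pattern commonly read as overfitting. The sketch below shows one way to locate the best logged checkpoint and estimate the number of steps per epoch. It uses a hand-picked subset of rows to stay short, and the helper names are illustrative.

```python
# Subset of rows from the table: training loss, epoch, step, validation loss.
LOG_ROWS = """
0.9444 1.1384 2500 0.8638
0.6596 7.9690 17500 0.7165
0.5412 14.7996 32500 0.7007
0.5241 15.9381 35000 0.7017
0.4545 22.7687 50000 0.7092
0.2923 45.5373 100000 0.7529
0.2129 68.3060 150000 0.8088
0.1723 99.0437 217500 0.8507
"""


def parse_rows(text):
    """Parse whitespace-separated log rows into a list of dicts."""
    rows = []
    for line in text.strip().splitlines():
        train_loss, epoch, step, val_loss = line.split()
        rows.append({
            "train_loss": float(train_loss),
            "epoch": float(epoch),
            "step": int(step),
            "val_loss": float(val_loss),
        })
    return rows


def best_checkpoint(rows):
    """Return the logged row with the lowest validation loss."""
    return min(rows, key=lambda r: r["val_loss"])


def steps_per_epoch(rows):
    """Estimate optimizer steps per epoch from the last logged row."""
    last = rows[-1]
    return round(last["step"] / last["epoch"])


rows = parse_rows(LOG_ROWS)
best = best_checkpoint(rows)
final = rows[-1]
print(f"best step: {best['step']} (val loss {best['val_loss']})")
print(f"final step: {final['step']} (val loss {final['val_loss']})")
print(f"estimated steps per epoch: {steps_per_epoch(rows)}")
```

The card does not say which checkpoint was published; the headline loss of 0.8507 matches the final logged row, not the minimum. If training ran on a single device without gradient accumulation (an assumption), about 2196 steps per epoch at batch size 12 would correspond to roughly 26,000 training examples.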

Framework versions

Transformers 4.41.2
Pytorch 2.3.1+cu121
Datasets 2.19.1
Tokenizers 0.19.1

Thalesian/t5-base-p-l-akk-en-20240922-080244
