scenario-KD-PO-MSV-CL-D2_data-cl-massive_all_1_166

This model is a fine-tuned version of haryoaw/scenario-MDBT-TCR_data-cl-massive_all_1_1 on the massive dataset. On the evaluation set it achieves the following results, which match the last logged evaluation (step 265000, epoch 29.46):

Loss: 6.0186
Accuracy: 0.6461
F1: 0.6134

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 32
eval_batch_size: 32
seed: 66
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 30
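
To illustrate what `lr_scheduler_type: linear` implies for the learning rate, the sketch below computes a linearly decaying rate in plain Python. It assumes no warmup and roughly 270,000 optimizer steps in total (about 9,000 steps per epoch over 30 epochs, estimated from the training log); neither value is stated in this card, and `linear_lr` is a hypothetical helper rather than a Transformers API.

```python
def linear_lr(step, base_lr=5e-05, total_steps=270_000, warmup_steps=0):
    """Learning rate at `step`: linear warmup, then linear decay to zero.

    total_steps and warmup_steps are assumptions, not values from the card.
    """
    if step < warmup_steps:
        # Ramp up from 0 to base_lr during warmup (skipped when warmup_steps=0).
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 135_000, 265_000):
    print(step, linear_lr(step))
```

Under these assumptions the rate starts at 5e-05, is halved midway through training, and is close to zero by the last logged step.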

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy	F1
2.2524	0.56	5000	5.6187	0.6299	0.5728
1.3325	1.11	10000	5.4671	0.6450	0.5924
1.2156	1.67	15000	6.0747	0.6250	0.5912
0.8855	2.22	20000	5.8471	0.6355	0.5857
0.8518	2.78	25000	6.2545	0.6303	0.5845
0.6853	3.33	30000	6.0057	0.6408	0.6017
0.6658	3.89	35000	6.0161	0.6423	0.6002
0.5544	4.45	40000	6.0854	0.6392	0.6006
0.5357	5.0	45000	6.2732	0.6283	0.5888
0.4924	5.56	50000	6.4624	0.6277	0.5952
0.4369	6.11	55000	6.2119	0.6354	0.5944
0.4276	6.67	60000	6.2395	0.6425	0.6006
0.3974	7.23	65000	6.6542	0.6264	0.5893
0.404	7.78	70000	6.4174	0.6295	0.5975
0.3763	8.34	75000	6.1405	0.6426	0.6025
0.3719	8.89	80000	6.4745	0.6346	0.6024
0.3428	9.45	85000	5.9964	0.6389	0.6030
0.3288	10.0	90000	6.3213	0.6335	0.5988
0.3192	10.56	95000	6.4269	0.6321	0.5937
0.2934	11.12	100000	6.3224	0.6392	0.6039
0.3054	11.67	105000	6.4531	0.6326	0.5989
0.2841	12.23	110000	6.2824	0.6360	0.6075
0.2915	12.78	115000	6.1928	0.6391	0.6039
0.274	13.34	120000	6.1931	0.6401	0.6030
0.2776	13.9	125000	6.2524	0.6384	0.6045
0.2724	14.45	130000	5.9260	0.6456	0.6090
0.2602	15.01	135000	6.3508	0.6347	0.6052
0.2627	15.56	140000	6.1761	0.6421	0.6074
0.2496	16.12	145000	6.1398	0.6391	0.6111
0.253	16.67	150000	6.2431	0.6328	0.6014
0.2451	17.23	155000	6.1746	0.6378	0.6048
0.2369	17.79	160000	6.0915	0.6435	0.6103
0.2332	18.34	165000	6.2138	0.6376	0.6071
0.2325	18.9	170000	6.1176	0.6433	0.6073
0.2239	19.45	175000	5.9650	0.6419	0.6068
0.2229	20.01	180000	6.2025	0.6395	0.6072
0.2241	20.56	185000	6.0510	0.6418	0.6088
0.212	21.12	190000	5.9952	0.6438	0.6100
0.218	21.68	195000	6.2810	0.6376	0.6073
0.212	22.23	200000	5.9274	0.6454	0.6076
0.2091	22.79	205000	6.1958	0.6367	0.6071
0.2091	23.34	210000	5.9633	0.6463	0.6153
0.2065	23.9	215000	6.0132	0.6458	0.6116
0.2048	24.46	220000	5.9809	0.6451	0.6132
0.1996	25.01	225000	6.1021	0.6389	0.6063
0.1966	25.57	230000	5.9612	0.6448	0.6140
0.1964	26.12	235000	6.0715	0.6434	0.6134
0.1971	26.68	240000	6.0237	0.6442	0.6127
0.1893	27.23	245000	6.0213	0.6418	0.6086
0.1891	27.79	250000	6.0386	0.6445	0.6127
0.1942	28.35	255000	6.0043	0.6428	0.6099
0.1966	28.9	260000	5.9983	0.6440	0.6130
0.1883	29.46	265000	6.0186	0.6461	0.6134
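
The final checkpoint is not necessarily the best one, so it can help to scan the log for the highest validation F1. The sketch below does this in plain Python over a handful of rows copied from the table above; the `rows` subset and the `best_by_f1` helper are illustrative choices, not part of the original training code.

```python
# (step, validation_loss, accuracy, f1) for a few rows copied from the table above
rows = [
    (5000, 5.6187, 0.6299, 0.5728),
    (130000, 5.9260, 0.6456, 0.6090),
    (210000, 5.9633, 0.6463, 0.6153),
    (230000, 5.9612, 0.6448, 0.6140),
    (265000, 6.0186, 0.6461, 0.6134),
]


def best_by_f1(log_rows):
    """Return the logged row with the highest validation F1."""
    return max(log_rows, key=lambda row: row[3])


best = best_by_f1(rows)
print(f"best step: {best[0]}, accuracy: {best[2]}, F1: {best[3]}")
```

Among these rows, step 210000 has the highest F1 (0.6153), slightly above the 0.6134 reported for the final step.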

Framework versions

Transformers 4.33.3
PyTorch 2.1.1+cu121
Datasets 2.14.5
Tokenizers 0.13.3
