scenario-KD-PR-MSV-D2_data-cl-massive_all_1_144

This model is a fine-tuned version of haryoaw/scenario-MDBT-TCR_data-cl-massive_all_1_1 on the massive dataset. At the final logged evaluation (step 265000, epoch 29.46), it achieves the following results on the evaluation set:

Loss: 2.5342
Accuracy: 0.6242
F1: 0.5945

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 32
eval_batch_size: 32
seed: 44
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 30
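
As a rough illustration of what `lr_scheduler_type: linear` implies for the learning rate listed above, the sketch below computes a linear decay in plain Python. It assumes no warmup, since the card does not list a warmup setting, and it estimates the total step count from the training results table at roughly 9,000 optimizer steps per epoch (45000 steps at epoch 5.0). The names `linear_lr`, `STEPS_PER_EPOCH`, and `TOTAL_STEPS` are illustrative and not taken from the training code.

```python
BASE_LR = 5e-5            # learning_rate from the card
NUM_EPOCHS = 30           # num_epochs from the card
STEPS_PER_EPOCH = 9000    # ASSUMPTION: rounded estimate from the results table
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH


def linear_lr(step: int, total_steps: int = TOTAL_STEPS,
              base_lr: float = BASE_LR, warmup_steps: int = 0) -> float:
    """Linear warmup to base_lr (if any), then linear decay to zero."""
    if step < warmup_steps:
        # Warmup phase: scale up from 0 to base_lr.
        return base_lr * (step / max(1, warmup_steps))
    # Decay phase: fraction of the post-warmup steps still remaining.
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


if __name__ == "__main__":
    for step in (0, 5000, 135000, 265000, TOTAL_STEPS):
        print(f"step {step:>6}: lr = {linear_lr(step):.3e}")
```

Under these assumptions the learning rate is already close to zero by the last logged step, which may be one reason the metrics change little across the final rows of the table.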

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy	F1
1.3572	0.56	5000	2.3093	0.6266	0.5756
1.1368	1.11	10000	2.3610	0.6232	0.5831
1.0884	1.67	15000	2.3177	0.6323	0.5864
0.9993	2.22	20000	2.3903	0.6208	0.5754
0.9971	2.78	25000	2.3353	0.6305	0.5819
0.9487	3.33	30000	2.3824	0.6281	0.5838
0.9282	3.89	35000	2.4776	0.6199	0.5797
0.9133	4.45	40000	2.4677	0.6207	0.5774
0.921	5.0	45000	2.4799	0.6195	0.5889
0.8781	5.56	50000	2.5276	0.6145	0.5823
0.8679	6.11	55000	2.5109	0.6219	0.5837
0.8667	6.67	60000	2.4956	0.6257	0.5863
0.8541	7.23	65000	2.5621	0.6089	0.5827
0.858	7.78	70000	2.4640	0.6258	0.5818
0.8341	8.34	75000	2.5379	0.6147	0.5771
0.8431	8.89	80000	2.5004	0.6307	0.5894
0.8409	9.45	85000	2.6250	0.6013	0.5683
0.8317	10.0	90000	2.5641	0.6205	0.5883
0.8264	10.56	95000	2.5379	0.6169	0.5787
0.8159	11.12	100000	2.4846	0.6287	0.5892
0.8211	11.67	105000	2.4920	0.6260	0.5834
0.8127	12.23	110000	2.5126	0.6268	0.5904
0.8176	12.78	115000	2.4977	0.6298	0.5907
0.8135	13.34	120000	2.6144	0.6130	0.5816
0.8092	13.9	125000	2.6534	0.6015	0.5770
0.8057	14.45	130000	2.6538	0.5992	0.5661
0.8106	15.01	135000	2.5595	0.6138	0.5671
0.8041	15.56	140000	2.6846	0.6044	0.5753
0.8008	16.12	145000	2.6878	0.6045	0.5876
0.7928	16.67	150000	2.6002	0.6144	0.5883
0.792	17.23	155000	2.5880	0.6171	0.5801
0.7953	17.79	160000	2.5090	0.6269	0.5887
0.7888	18.34	165000	2.5957	0.6162	0.5906
0.7938	18.9	170000	2.5766	0.6192	0.5725
0.7909	19.45	175000	2.5189	0.6237	0.5904
0.7864	20.01	180000	2.5648	0.6172	0.5840
0.7869	20.56	185000	2.5519	0.6210	0.5910
0.7833	21.12	190000	2.6989	0.5995	0.5803
0.7835	21.68	195000	2.5599	0.6151	0.5815
0.7803	22.23	200000	2.5249	0.6231	0.5893
0.7837	22.79	205000	2.5989	0.6135	0.5859
0.7793	23.34	210000	2.5693	0.6210	0.5930
0.7851	23.9	215000	2.5545	0.6208	0.5893
0.7799	24.46	220000	2.5639	0.6178	0.5884
0.7791	25.01	225000	2.5613	0.6200	0.5934
0.775	25.57	230000	2.5726	0.6199	0.5958
0.7747	26.12	235000	2.5184	0.6237	0.5914
0.775	26.68	240000	2.5404	0.6226	0.5961
0.7749	27.23	245000	2.5392	0.6202	0.5874
0.7767	27.79	250000	2.5781	0.6194	0.5953
0.7762	28.35	255000	2.4980	0.6290	0.5953
0.7737	28.9	260000	2.5199	0.6239	0.5938
0.7708	29.46	265000	2.5342	0.6242	0.5945
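
To compare checkpoints in the table above, a small script can parse the logged rows and pick the best one per metric. This is a minimal sketch over five rows copied from the table; `EvalRow` and `parse_rows` are hypothetical helper names, and the column order is assumed to match the table header.

```python
from typing import List, NamedTuple


class EvalRow(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float
    accuracy: float
    f1: float


def parse_rows(text: str) -> List[EvalRow]:
    """Parse whitespace- or tab-separated rows in the table's column order."""
    rows = []
    for line in text.strip().splitlines():
        t_loss, epoch, step, v_loss, acc, f1 = line.split()
        rows.append(EvalRow(float(t_loss), float(epoch), int(step),
                            float(v_loss), float(acc), float(f1)))
    return rows


# Five rows copied verbatim from the training results table.
EXCERPT = """
1.3572 0.56 5000 2.3093 0.6266 0.5756
1.0884 1.67 15000 2.3177 0.6323 0.5864
0.8409 9.45 85000 2.6250 0.6013 0.5683
0.775 26.68 240000 2.5404 0.6226 0.5961
0.7708 29.46 265000 2.5342 0.6242 0.5945
"""

rows = parse_rows(EXCERPT)
lowest_val_loss = min(rows, key=lambda r: r.val_loss)
best_accuracy = max(rows, key=lambda r: r.accuracy)
best_f1 = max(rows, key=lambda r: r.f1)

print("lowest validation loss at step", lowest_val_loss.step)
print("highest accuracy at step", best_accuracy.step)
print("highest F1 at step", best_f1.step)
```

In the full table these same rows hold the extremes: the lowest validation loss (2.3093) is at step 5000, the highest accuracy (0.6323) at step 15000, and the highest F1 (0.5961) at step 240000. The final checkpoint reported at the top of the card is therefore not the best on any of the three metrics.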

Framework versions

Transformers 4.33.3
PyTorch 2.1.1+cu121
Datasets 2.14.5
Tokenizers 0.13.3
