scenario-KD-PR-MSV-D2_data-cl-massive_all_1_166

This model is a fine-tuned version of haryoaw/scenario-MDBT-TCR_data-cl-massive_all_1_1 on the massive dataset. At the final logged evaluation (step 265000, epoch 29.46), it achieves the following results on the evaluation set:

Loss: 2.5375
Accuracy: 0.6228
F1: 0.5882

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 32
eval_batch_size: 32
seed: 66
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 30
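
As a rough illustration, the hyperparameters listed above could be written as keyword arguments in the style of `transformers.TrainingArguments`. This is a sketch under stated assumptions, not the author's training script: `output_dir` is a hypothetical path, mapping the batch sizes to the `per_device_*` arguments assumes a single device, the schedule assumes no warmup because the card lists none, and the total of 270000 steps is only an approximation (30 epochs at roughly 9000 steps per epoch, read off the results table). The helper `linear_lr` is likewise hypothetical and only mirrors the shape of a linear decay schedule.

```python
# Hyperparameters from the card, keyed by TrainingArguments-style names.
training_kwargs = {
    "output_dir": "./kd-massive-run",      # hypothetical path, not from the card
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 32,     # assumes a single device
    "per_device_eval_batch_size": 32,      # assumes a single device
    "seed": 66,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 30,
}


def linear_lr(step, total_steps, base_lr=training_kwargs["learning_rate"], warmup_steps=0):
    """Learning rate under a linear schedule: optional linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Approximate total step count (an assumption, see the note above).
APPROX_TOTAL_STEPS = 270_000

for step in (0, 135_000, 270_000):
    print(step, linear_lr(step, APPROX_TOTAL_STEPS))
```

Under these assumptions the learning rate starts at 5e-05, is about half that value midway through training, and reaches zero at the last step.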

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy	F1
1.3672	0.56	5000	2.2863	0.6347	0.5695
1.1254	1.11	10000	2.2766	0.6389	0.5787
1.096	1.67	15000	2.3272	0.6320	0.5950
1.0005	2.22	20000	2.3817	0.6288	0.5832
1.0016	2.78	25000	2.3657	0.6298	0.5844
0.9507	3.33	30000	2.3644	0.6336	0.5859
0.9575	3.89	35000	2.3989	0.6260	0.5880
0.9037	4.45	40000	2.4691	0.6229	0.5865
0.9175	5.0	45000	2.4481	0.6209	0.5752
0.8991	5.56	50000	2.5361	0.6132	0.5801
0.8665	6.11	55000	2.5003	0.6167	0.5735
0.8734	6.67	60000	2.4807	0.6249	0.5832
0.8588	7.23	65000	2.5712	0.6115	0.5672
0.8629	7.78	70000	2.5958	0.6076	0.5746
0.8508	8.34	75000	2.5262	0.6229	0.5849
0.8543	8.89	80000	2.5397	0.6171	0.5799
0.8426	9.45	85000	2.5143	0.6119	0.5634
0.8377	10.0	90000	2.5661	0.6131	0.5808
0.8317	10.56	95000	2.5662	0.6168	0.5770
0.8231	11.12	100000	2.5272	0.6207	0.5775
0.8231	11.67	105000	2.5792	0.6047	0.5625
0.8198	12.23	110000	2.5869	0.6144	0.5783
0.8219	12.78	115000	2.5868	0.6126	0.5745
0.8131	13.34	120000	2.6226	0.6043	0.5658
0.8113	13.9	125000	2.5777	0.6174	0.5807
0.8122	14.45	130000	2.6451	0.6022	0.5787
0.8124	15.01	135000	2.5426	0.6215	0.5847
0.8106	15.56	140000	2.6562	0.6031	0.5774
0.8046	16.12	145000	2.6410	0.6059	0.5703
0.8031	16.67	150000	2.6155	0.6088	0.5794
0.7949	17.23	155000	2.6978	0.5997	0.5698
0.799	17.79	160000	2.6272	0.6102	0.5783
0.7964	18.34	165000	2.5934	0.6161	0.5765
0.7943	18.9	170000	2.5863	0.6142	0.5722
0.793	19.45	175000	2.5353	0.6224	0.5762
0.7919	20.01	180000	2.6723	0.6057	0.5759
0.7893	20.56	185000	2.6377	0.6098	0.5820
0.7864	21.12	190000	2.6707	0.6057	0.5824
0.79	21.68	195000	2.7768	0.5904	0.5802
0.7871	22.23	200000	2.6895	0.6001	0.5734
0.786	22.79	205000	2.6505	0.6063	0.5827
0.7862	23.34	210000	2.5607	0.6200	0.5876
0.7863	23.9	215000	2.6414	0.6082	0.5828
0.7839	24.46	220000	2.5978	0.6125	0.5883
0.7828	25.01	225000	2.6076	0.6125	0.5804
0.7838	25.57	230000	2.6193	0.6097	0.5778
0.7825	26.12	235000	2.5599	0.6184	0.5860
0.7832	26.68	240000	2.5363	0.6227	0.5857
0.7788	27.23	245000	2.5842	0.6199	0.5930
0.7768	27.79	250000	2.5907	0.6170	0.5889
0.7802	28.35	255000	2.5625	0.6196	0.5895
0.7808	28.9	260000	2.5512	0.6220	0.5903
0.7776	29.46	265000	2.5375	0.6228	0.5882
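
Reading the table above, the lowest validation loss (2.2766) and the highest accuracy (0.6389) both occur at step 10000, and the highest F1 (0.5950) at step 15000, while the training loss keeps falling through step 265000. The sketch below shows one way such a comparison could be automated. It uses only an excerpt of the table (the first three rows and the last row), and the names `EvalRow`, `parse_rows`, and `EXCERPT` are hypothetical helpers introduced for illustration.

```python
from typing import List, NamedTuple


class EvalRow(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float
    accuracy: float
    f1: float


def parse_rows(text: str) -> List[EvalRow]:
    """Parse whitespace-separated rows in the column order of the results table."""
    rows = []
    for line in text.strip().splitlines():
        train_loss, epoch, step, val_loss, accuracy, f1 = line.split()
        rows.append(
            EvalRow(float(train_loss), float(epoch), int(step),
                    float(val_loss), float(accuracy), float(f1))
        )
    return rows


# Excerpt only: first three rows and the final row of the table.
EXCERPT = """
1.3672 0.56 5000 2.2863 0.6347 0.5695
1.1254 1.11 10000 2.2766 0.6389 0.5787
1.096 1.67 15000 2.3272 0.6320 0.5950
0.7776 29.46 265000 2.5375 0.6228 0.5882
"""

rows = parse_rows(EXCERPT)
best_loss = min(rows, key=lambda r: r.val_loss)
best_acc = max(rows, key=lambda r: r.accuracy)
best_f1 = max(rows, key=lambda r: r.f1)

print("lowest validation loss at step", best_loss.step)   # 10000
print("highest accuracy at step", best_acc.step)          # 10000
print("highest F1 at step", best_f1.step)                 # 15000
```

Pasting the full table into `EXCERPT` gives the same three steps, since no later row improves on these values.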

Framework versions

Transformers 4.33.3
Pytorch 2.1.1+cu121
Datasets 2.14.5
Tokenizers 0.13.3
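
To compare a local environment against the versions listed above, something along the following lines may help. It is a minimal sketch: the distribution names (`transformers`, `torch`, `datasets`, `tokenizers`) are assumed to be the PyPI names of the listed frameworks, and `version_mismatches` is a hypothetical helper, not part of any of these libraries.

```python
from importlib import metadata

# Versions as listed on the card; keys are assumed PyPI distribution names.
PINNED = {
    "transformers": "4.33.3",
    "torch": "2.1.1+cu121",
    "datasets": "2.14.5",
    "tokenizers": "0.13.3",
}


def version_mismatches(pinned, lookup=metadata.version):
    """Return {package: (wanted, found)} for packages whose installed version differs.

    `found` is None when the package is not installed.
    """
    mismatches = {}
    for package, wanted in pinned.items():
        try:
            found = lookup(package)
        except metadata.PackageNotFoundError:
            found = None
        if found != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


for package, (wanted, found) in version_mismatches(PINNED).items():
    print(f"{package}: card lists {wanted}, environment has {found or 'nothing installed'}")
```

An exact match is a strict criterion. Nearby patch releases may well behave the same, so a mismatch reported here is a prompt to check, not proof of incompatibility.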
