# ModernBERT_SimCSE_v02
This model is a fine-tuned version of x2bee/KoModernBERT-base-v02 on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.0370
- Pearson Cosine: 0.7760
- Spearman Cosine: 0.7753
- Pearson Manhattan: 0.7337
- Spearman Manhattan: 0.7389
- Pearson Euclidean: 0.7316
- Spearman Euclidean: 0.7371
- Pearson Dot: 0.7343
- Spearman Dot: 0.7356
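The Pearson/Spearman metrics above measure how well the model's predicted pair similarities (cosine, Manhattan, Euclidean, or dot-product distances between sentence embeddings) correlate with gold similarity labels. A minimal sketch of how the cosine variants are computed, using random toy embeddings and labels as stand-ins for the actual evaluation set:

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy stand-ins: 8 sentence pairs with 4-dim embeddings and gold scores on a 0-5 scale.
rng = np.random.default_rng(0)
emb_a = rng.normal(size=(8, 4))
emb_b = rng.normal(size=(8, 4))
gold = rng.uniform(0.0, 5.0, size=8)

# Predicted similarity per pair, then correlation with the gold labels.
preds = [cosine(x, y) for x, y in zip(emb_a, emb_b)]
pearson_cosine = pearsonr(preds, gold)[0]
spearman_cosine = spearmanr(preds, gold)[0]
print(pearson_cosine, spearman_cosine)
```

The Manhattan/Euclidean/dot variants replace `cosine` with the respective (negated) distance or inner product before correlating.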
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 2
- eval_batch_size: 1
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- gradient_accumulation_steps: 16
- total_train_batch_size: 256
- total_eval_batch_size: 8
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 10.0
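The listed `total_train_batch_size` of 256 follows from the per-device batch size, the number of GPUs, and gradient accumulation:

```python
# Effective (total) train batch size from the hyperparameters above.
train_batch_size = 2            # per-device batch size
num_devices = 8                 # multi-GPU setup
gradient_accumulation_steps = 16

total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
print(total_train_batch_size)  # 256
```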
### Training results
Training Loss | Epoch | Step | Validation Loss | Pearson Cosine | Spearman Cosine | Pearson Manhattan | Spearman Manhattan | Pearson Euclidean | Spearman Euclidean | Pearson Dot | Spearman Dot |
---|---|---|---|---|---|---|---|---|---|---|---|
0.3877 | 0.2343 | 250 | 0.1542 | 0.7471 | 0.7499 | 0.7393 | 0.7393 | 0.7395 | 0.7397 | 0.6414 | 0.6347 |
0.2805 | 0.4686 | 500 | 0.1142 | 0.7578 | 0.7643 | 0.7619 | 0.7652 | 0.7619 | 0.7654 | 0.6366 | 0.6341 |
0.2331 | 0.7029 | 750 | 0.0950 | 0.7674 | 0.7772 | 0.7685 | 0.7747 | 0.7682 | 0.7741 | 0.6584 | 0.6570 |
0.2455 | 0.9372 | 1000 | 0.0924 | 0.7677 | 0.7781 | 0.7714 | 0.7778 | 0.7712 | 0.7776 | 0.6569 | 0.6558 |
0.1933 | 1.1715 | 1250 | 0.0802 | 0.7704 | 0.7790 | 0.7678 | 0.7742 | 0.7676 | 0.7740 | 0.6808 | 0.6797 |
0.1872 | 1.4058 | 1500 | 0.0790 | 0.7685 | 0.7777 | 0.7693 | 0.7755 | 0.7690 | 0.7752 | 0.6580 | 0.6569 |
0.1628 | 1.6401 | 1750 | 0.0719 | 0.7652 | 0.7734 | 0.7619 | 0.7685 | 0.7616 | 0.7679 | 0.6584 | 0.6574 |
0.1983 | 1.8744 | 2000 | 0.0737 | 0.7772 | 0.7864 | 0.7654 | 0.7748 | 0.7649 | 0.7741 | 0.6604 | 0.6608 |
0.1448 | 2.1087 | 2250 | 0.0637 | 0.7666 | 0.7737 | 0.7644 | 0.7706 | 0.7639 | 0.7702 | 0.6530 | 0.6506 |
0.1449 | 2.3430 | 2500 | 0.0579 | 0.7641 | 0.7698 | 0.7590 | 0.7654 | 0.7584 | 0.7652 | 0.6659 | 0.6637 |
0.1443 | 2.5773 | 2750 | 0.0596 | 0.7583 | 0.7659 | 0.7599 | 0.7656 | 0.7594 | 0.7652 | 0.6585 | 0.6551 |
0.1363 | 2.8116 | 3000 | 0.0575 | 0.7671 | 0.7727 | 0.7570 | 0.7629 | 0.7564 | 0.7624 | 0.6769 | 0.6756 |
0.1227 | 3.0459 | 3250 | 0.0517 | 0.7637 | 0.7670 | 0.7567 | 0.7616 | 0.7560 | 0.7612 | 0.6736 | 0.6714 |
0.103 | 3.2802 | 3500 | 0.0464 | 0.7603 | 0.7643 | 0.7484 | 0.7535 | 0.7475 | 0.7527 | 0.6813 | 0.6796 |
0.0982 | 3.5145 | 3750 | 0.0451 | 0.7657 | 0.7695 | 0.7452 | 0.7527 | 0.7441 | 0.7516 | 0.6821 | 0.6822 |
0.0987 | 3.7488 | 4000 | 0.0467 | 0.7577 | 0.7607 | 0.7397 | 0.7446 | 0.7385 | 0.7434 | 0.6644 | 0.6623 |
0.1111 | 3.9831 | 4250 | 0.0406 | 0.7691 | 0.7703 | 0.7471 | 0.7525 | 0.7457 | 0.7510 | 0.6998 | 0.7006 |
0.0888 | 4.2174 | 4500 | 0.0421 | 0.7580 | 0.7598 | 0.7412 | 0.7468 | 0.7401 | 0.7457 | 0.6874 | 0.6866 |
0.0756 | 4.4517 | 4750 | 0.0395 | 0.7664 | 0.7674 | 0.7432 | 0.7480 | 0.7419 | 0.7465 | 0.7008 | 0.7012 |
0.0871 | 4.6860 | 5000 | 0.0411 | 0.7588 | 0.7604 | 0.7405 | 0.7456 | 0.7389 | 0.7441 | 0.6872 | 0.6867 |
0.0839 | 4.9203 | 5250 | 0.0400 | 0.7643 | 0.7659 | 0.7311 | 0.7367 | 0.7297 | 0.7351 | 0.6955 | 0.6969 |
0.0499 | 5.1546 | 5500 | 0.0392 | 0.7609 | 0.7616 | 0.7335 | 0.7393 | 0.7321 | 0.7379 | 0.6993 | 0.6999 |
0.0542 | 5.3889 | 5750 | 0.0385 | 0.7664 | 0.7669 | 0.7399 | 0.7454 | 0.7386 | 0.7445 | 0.7061 | 0.7065 |
0.0555 | 5.6232 | 6000 | 0.0396 | 0.7571 | 0.7579 | 0.7293 | 0.7344 | 0.7279 | 0.7331 | 0.7004 | 0.6993 |
0.0547 | 5.8575 | 6250 | 0.0384 | 0.7664 | 0.7667 | 0.7382 | 0.7432 | 0.7370 | 0.7420 | 0.7110 | 0.7119 |
0.0476 | 6.0918 | 6500 | 0.0388 | 0.7638 | 0.7642 | 0.7338 | 0.7392 | 0.7323 | 0.7378 | 0.7008 | 0.7013 |
0.043 | 6.3261 | 6750 | 0.0376 | 0.7692 | 0.7696 | 0.7357 | 0.7409 | 0.7343 | 0.7396 | 0.7138 | 0.7152 |
0.0436 | 6.5604 | 7000 | 0.0381 | 0.7662 | 0.7662 | 0.7351 | 0.7398 | 0.7334 | 0.7384 | 0.7105 | 0.7116 |
0.032 | 6.7948 | 7250 | 0.0377 | 0.7692 | 0.7695 | 0.7333 | 0.7375 | 0.7316 | 0.7357 | 0.7224 | 0.7242 |
0.0342 | 7.0291 | 7500 | 0.0378 | 0.7685 | 0.7678 | 0.7333 | 0.7376 | 0.7320 | 0.7365 | 0.7184 | 0.7187 |
0.0341 | 7.2634 | 7750 | 0.0377 | 0.7699 | 0.7695 | 0.7336 | 0.7378 | 0.7317 | 0.7362 | 0.7237 | 0.7244 |
0.0329 | 7.4977 | 8000 | 0.0375 | 0.7706 | 0.7697 | 0.7364 | 0.7409 | 0.7346 | 0.7395 | 0.7248 | 0.7250 |
0.035 | 7.7320 | 8250 | 0.0380 | 0.7700 | 0.7691 | 0.7308 | 0.7352 | 0.7288 | 0.7335 | 0.7271 | 0.7276 |
0.0361 | 7.9663 | 8500 | 0.0377 | 0.7717 | 0.7709 | 0.7276 | 0.7318 | 0.7254 | 0.7297 | 0.7309 | 0.7317 |
0.0224 | 8.2006 | 8750 | 0.0377 | 0.7711 | 0.7703 | 0.7328 | 0.7369 | 0.7310 | 0.7356 | 0.7244 | 0.7254 |
0.0256 | 8.4349 | 9000 | 0.0386 | 0.7652 | 0.7647 | 0.7274 | 0.7319 | 0.7254 | 0.7303 | 0.7186 | 0.7191 |
0.0283 | 8.6692 | 9250 | 0.0370 | 0.7740 | 0.7732 | 0.7294 | 0.7331 | 0.7272 | 0.7312 | 0.7285 | 0.7298 |
0.0274 | 8.9035 | 9500 | 0.0372 | 0.7742 | 0.7739 | 0.7288 | 0.7346 | 0.7266 | 0.7328 | 0.7298 | 0.7317 |
0.025 | 9.1378 | 9750 | 0.0377 | 0.7719 | 0.7718 | 0.7334 | 0.7389 | 0.7313 | 0.7372 | 0.7295 | 0.7309 |
0.031 | 9.3721 | 10000 | 0.0372 | 0.7734 | 0.7735 | 0.7373 | 0.7421 | 0.7357 | 0.7407 | 0.7253 | 0.7266 |
0.0243 | 9.6064 | 10250 | 0.0374 | 0.7731 | 0.7727 | 0.7321 | 0.7364 | 0.7300 | 0.7346 | 0.7303 | 0.7306 |
0.0233 | 9.8407 | 10500 | 0.0370 | 0.7760 | 0.7753 | 0.7337 | 0.7389 | 0.7316 | 0.7371 | 0.7343 | 0.7356 |
### Framework versions
- Transformers 4.48.3
- Pytorch 2.5.1+cu124
- Datasets 3.2.0
- Tokenizers 0.21.0
## Model tree

- Base model: answerdotai/ModernBERT-base
- Fine-tuned from: x2bee/KoModernBERT-base-mlm_v02