ModernBERT_SimCSE_v02

This model (185M parameters, F32 safetensors) is a fine-tuned version of x2bee/KoModernBERT-base-v02 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0370
  • Pearson Cosine: 0.7760
  • Spearman Cosine: 0.7753
  • Pearson Manhattan: 0.7337
  • Spearman Manhattan: 0.7389
  • Pearson Euclidean: 0.7316
  • Spearman Euclidean: 0.7371
  • Pearson Dot: 0.7343
  • Spearman Dot: 0.7356
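
Since this is a SimCSE-style sentence-embedding model, the cosine metrics above are the headline numbers. Below is a minimal usage sketch, assuming the checkpoint is stored in sentence-transformers format (the repo id CocoRoF/ModernBERT_SimCSE_v02 is taken from the model page); adapt the loading code if the checkpoint is plain transformers format.

```python
# Minimal usage sketch, assuming the checkpoint loads with sentence-transformers.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("CocoRoF/ModernBERT_SimCSE_v02")

sentences = [
    "날씨가 참 좋다.",      # "The weather is really nice."
    "오늘은 하늘이 맑다.",  # "The sky is clear today."
]
embeddings = model.encode(sentences, convert_to_tensor=True)

# Cosine similarity between the two sentence embeddings; this is the score
# the "Pearson/Spearman Cosine" metrics above are computed against.
print(float(util.cos_sim(embeddings[0], embeddings[1])))
```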

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 2
  • eval_batch_size: 1
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 8
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 256
  • total_eval_batch_size: 8
  • optimizer: adamw_torch (AdamW) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 10.0
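
The effective train batch size follows from the per-device settings: 2 (per-device batch) × 8 (GPUs) × 16 (gradient-accumulation steps) = 256. As a rough sketch, the listed values map onto transformers.TrainingArguments as follows; the output directory and any flags not listed above are assumptions, and the 8-GPU launch (e.g. via torchrun) happens outside this object.

```python
# Hedged reconstruction of the training configuration from the list above.
# output_dir is an assumed name; multi-GPU launching (8 devices) is handled
# by the launcher (e.g. torchrun), not by these arguments.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="ModernBERT_SimCSE_v02",  # assumption
    learning_rate=1e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=16,
    num_train_epochs=10.0,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=42,
)
```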

Training results

| Training Loss | Epoch | Step | Validation Loss | Pearson Cosine | Spearman Cosine | Pearson Manhattan | Spearman Manhattan | Pearson Euclidean | Spearman Euclidean | Pearson Dot | Spearman Dot |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.3877 | 0.2343 | 250 | 0.1542 | 0.7471 | 0.7499 | 0.7393 | 0.7393 | 0.7395 | 0.7397 | 0.6414 | 0.6347 |
| 0.2805 | 0.4686 | 500 | 0.1142 | 0.7578 | 0.7643 | 0.7619 | 0.7652 | 0.7619 | 0.7654 | 0.6366 | 0.6341 |
| 0.2331 | 0.7029 | 750 | 0.0950 | 0.7674 | 0.7772 | 0.7685 | 0.7747 | 0.7682 | 0.7741 | 0.6584 | 0.6570 |
| 0.2455 | 0.9372 | 1000 | 0.0924 | 0.7677 | 0.7781 | 0.7714 | 0.7778 | 0.7712 | 0.7776 | 0.6569 | 0.6558 |
| 0.1933 | 1.1715 | 1250 | 0.0802 | 0.7704 | 0.7790 | 0.7678 | 0.7742 | 0.7676 | 0.7740 | 0.6808 | 0.6797 |
| 0.1872 | 1.4058 | 1500 | 0.0790 | 0.7685 | 0.7777 | 0.7693 | 0.7755 | 0.7690 | 0.7752 | 0.6580 | 0.6569 |
| 0.1628 | 1.6401 | 1750 | 0.0719 | 0.7652 | 0.7734 | 0.7619 | 0.7685 | 0.7616 | 0.7679 | 0.6584 | 0.6574 |
| 0.1983 | 1.8744 | 2000 | 0.0737 | 0.7772 | 0.7864 | 0.7654 | 0.7748 | 0.7649 | 0.7741 | 0.6604 | 0.6608 |
| 0.1448 | 2.1087 | 2250 | 0.0637 | 0.7666 | 0.7737 | 0.7644 | 0.7706 | 0.7639 | 0.7702 | 0.6530 | 0.6506 |
| 0.1449 | 2.3430 | 2500 | 0.0579 | 0.7641 | 0.7698 | 0.7590 | 0.7654 | 0.7584 | 0.7652 | 0.6659 | 0.6637 |
| 0.1443 | 2.5773 | 2750 | 0.0596 | 0.7583 | 0.7659 | 0.7599 | 0.7656 | 0.7594 | 0.7652 | 0.6585 | 0.6551 |
| 0.1363 | 2.8116 | 3000 | 0.0575 | 0.7671 | 0.7727 | 0.7570 | 0.7629 | 0.7564 | 0.7624 | 0.6769 | 0.6756 |
| 0.1227 | 3.0459 | 3250 | 0.0517 | 0.7637 | 0.7670 | 0.7567 | 0.7616 | 0.7560 | 0.7612 | 0.6736 | 0.6714 |
| 0.103 | 3.2802 | 3500 | 0.0464 | 0.7603 | 0.7643 | 0.7484 | 0.7535 | 0.7475 | 0.7527 | 0.6813 | 0.6796 |
| 0.0982 | 3.5145 | 3750 | 0.0451 | 0.7657 | 0.7695 | 0.7452 | 0.7527 | 0.7441 | 0.7516 | 0.6821 | 0.6822 |
| 0.0987 | 3.7488 | 4000 | 0.0467 | 0.7577 | 0.7607 | 0.7397 | 0.7446 | 0.7385 | 0.7434 | 0.6644 | 0.6623 |
| 0.1111 | 3.9831 | 4250 | 0.0406 | 0.7691 | 0.7703 | 0.7471 | 0.7525 | 0.7457 | 0.7510 | 0.6998 | 0.7006 |
| 0.0888 | 4.2174 | 4500 | 0.0421 | 0.7580 | 0.7598 | 0.7412 | 0.7468 | 0.7401 | 0.7457 | 0.6874 | 0.6866 |
| 0.0756 | 4.4517 | 4750 | 0.0395 | 0.7664 | 0.7674 | 0.7432 | 0.7480 | 0.7419 | 0.7465 | 0.7008 | 0.7012 |
| 0.0871 | 4.6860 | 5000 | 0.0411 | 0.7588 | 0.7604 | 0.7405 | 0.7456 | 0.7389 | 0.7441 | 0.6872 | 0.6867 |
| 0.0839 | 4.9203 | 5250 | 0.0400 | 0.7643 | 0.7659 | 0.7311 | 0.7367 | 0.7297 | 0.7351 | 0.6955 | 0.6969 |
| 0.0499 | 5.1546 | 5500 | 0.0392 | 0.7609 | 0.7616 | 0.7335 | 0.7393 | 0.7321 | 0.7379 | 0.6993 | 0.6999 |
| 0.0542 | 5.3889 | 5750 | 0.0385 | 0.7664 | 0.7669 | 0.7399 | 0.7454 | 0.7386 | 0.7445 | 0.7061 | 0.7065 |
| 0.0555 | 5.6232 | 6000 | 0.0396 | 0.7571 | 0.7579 | 0.7293 | 0.7344 | 0.7279 | 0.7331 | 0.7004 | 0.6993 |
| 0.0547 | 5.8575 | 6250 | 0.0384 | 0.7664 | 0.7667 | 0.7382 | 0.7432 | 0.7370 | 0.7420 | 0.7110 | 0.7119 |
| 0.0476 | 6.0918 | 6500 | 0.0388 | 0.7638 | 0.7642 | 0.7338 | 0.7392 | 0.7323 | 0.7378 | 0.7008 | 0.7013 |
| 0.043 | 6.3261 | 6750 | 0.0376 | 0.7692 | 0.7696 | 0.7357 | 0.7409 | 0.7343 | 0.7396 | 0.7138 | 0.7152 |
| 0.0436 | 6.5604 | 7000 | 0.0381 | 0.7662 | 0.7662 | 0.7351 | 0.7398 | 0.7334 | 0.7384 | 0.7105 | 0.7116 |
| 0.032 | 6.7948 | 7250 | 0.0377 | 0.7692 | 0.7695 | 0.7333 | 0.7375 | 0.7316 | 0.7357 | 0.7224 | 0.7242 |
| 0.0342 | 7.0291 | 7500 | 0.0378 | 0.7685 | 0.7678 | 0.7333 | 0.7376 | 0.7320 | 0.7365 | 0.7184 | 0.7187 |
| 0.0341 | 7.2634 | 7750 | 0.0377 | 0.7699 | 0.7695 | 0.7336 | 0.7378 | 0.7317 | 0.7362 | 0.7237 | 0.7244 |
| 0.0329 | 7.4977 | 8000 | 0.0375 | 0.7706 | 0.7697 | 0.7364 | 0.7409 | 0.7346 | 0.7395 | 0.7248 | 0.7250 |
| 0.035 | 7.7320 | 8250 | 0.0380 | 0.7700 | 0.7691 | 0.7308 | 0.7352 | 0.7288 | 0.7335 | 0.7271 | 0.7276 |
| 0.0361 | 7.9663 | 8500 | 0.0377 | 0.7717 | 0.7709 | 0.7276 | 0.7318 | 0.7254 | 0.7297 | 0.7309 | 0.7317 |
| 0.0224 | 8.2006 | 8750 | 0.0377 | 0.7711 | 0.7703 | 0.7328 | 0.7369 | 0.7310 | 0.7356 | 0.7244 | 0.7254 |
| 0.0256 | 8.4349 | 9000 | 0.0386 | 0.7652 | 0.7647 | 0.7274 | 0.7319 | 0.7254 | 0.7303 | 0.7186 | 0.7191 |
| 0.0283 | 8.6692 | 9250 | 0.0370 | 0.7740 | 0.7732 | 0.7294 | 0.7331 | 0.7272 | 0.7312 | 0.7285 | 0.7298 |
| 0.0274 | 8.9035 | 9500 | 0.0372 | 0.7742 | 0.7739 | 0.7288 | 0.7346 | 0.7266 | 0.7328 | 0.7298 | 0.7317 |
| 0.025 | 9.1378 | 9750 | 0.0377 | 0.7719 | 0.7718 | 0.7334 | 0.7389 | 0.7313 | 0.7372 | 0.7295 | 0.7309 |
| 0.031 | 9.3721 | 10000 | 0.0372 | 0.7734 | 0.7735 | 0.7373 | 0.7421 | 0.7357 | 0.7407 | 0.7253 | 0.7266 |
| 0.0243 | 9.6064 | 10250 | 0.0374 | 0.7731 | 0.7727 | 0.7321 | 0.7364 | 0.7300 | 0.7346 | 0.7303 | 0.7306 |
| 0.0233 | 9.8407 | 10500 | 0.0370 | 0.7760 | 0.7753 | 0.7337 | 0.7389 | 0.7316 | 0.7371 | 0.7343 | 0.7356 |
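
For reference, the Pearson/Spearman "Cosine" columns are correlations between the model's cosine similarities and gold similarity scores, as is typical for an STS-style evaluation set. A hedged sketch of that computation follows; the actual evaluation data and pipeline for this run are not documented here, so the sentence pairs and gold scores are placeholders.

```python
# Sketch of how the Pearson/Spearman "Cosine" columns are typically computed
# for an STS-style evaluation set. The pairs and gold scores are placeholders.
import numpy as np
from scipy.stats import pearsonr, spearmanr
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("CocoRoF/ModernBERT_SimCSE_v02")

pairs = [
    ("오늘 날씨가 좋다.", "하늘이 맑고 화창하다."),
    ("그는 책을 읽는다.", "고양이가 잠을 잔다."),
    ("버스가 늦게 왔다.", "버스 도착이 지연되었다."),
]
gold = np.array([0.8, 0.1, 0.9])  # placeholder gold similarity scores

emb1 = model.encode([a for a, _ in pairs])
emb2 = model.encode([b for _, b in pairs])

# Row-wise cosine similarity between paired sentence embeddings.
cos = np.sum(emb1 * emb2, axis=1) / (
    np.linalg.norm(emb1, axis=1) * np.linalg.norm(emb2, axis=1)
)

print("Pearson Cosine:", pearsonr(cos, gold)[0])
print("Spearman Cosine:", spearmanr(cos, gold)[0])
```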

Framework versions

  • Transformers 4.48.3
  • PyTorch 2.5.1+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0