---
library_name: transformers
license: apache-2.0
base_model: x2bee/KoModernBERT-base-v02
tags:
  - generated_from_trainer
model-index:
  - name: ModernBERT_SimCSE_v02
    results: []
---

# ModernBERT_SimCSE_v02

This model is a fine-tuned version of [x2bee/KoModernBERT-base-v02](https://huggingface.co/x2bee/KoModernBERT-base-v02) on an unknown dataset. It achieves the following results on the evaluation set (a usage sketch follows the metrics):

- Loss: 0.0370
- Pearson Cosine: 0.7760
- Spearman Cosine: 0.7753
- Pearson Manhattan: 0.7337
- Spearman Manhattan: 0.7389
- Pearson Euclidean: 0.7316
- Spearman Euclidean: 0.7371
- Pearson Dot: 0.7343
- Spearman Dot: 0.7356
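
The card does not include a usage snippet, so here is a minimal inference sketch. The repo id is a placeholder, and mean pooling is an assumption: SimCSE-style checkpoints sometimes use the `[CLS]` embedding instead, so check the checkpoint's own pooling configuration if one is published.

```python
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

# Placeholder repo id -- substitute the actual Hub path of this checkpoint.
model_id = "ModernBERT_SimCSE_v02"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)
model.eval()

sentences = [
    "한국어 문장 임베딩 예시입니다.",          # "An example Korean sentence embedding."
    "이것은 한국어 문장 벡터의 예시입니다.",    # "This is an example of a Korean sentence vector."
]
inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    token_embeddings = model(**inputs).last_hidden_state

# Mean pooling over non-padding tokens (assumption; see note above).
mask = inputs["attention_mask"].unsqueeze(-1).float()
embeddings = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1)

similarity = F.cosine_similarity(embeddings[0], embeddings[1], dim=0)
print(f"cosine similarity: {similarity.item():.4f}")
```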

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):

- learning_rate: 1e-05
- train_batch_size: 2
- eval_batch_size: 1
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- gradient_accumulation_steps: 16
- total_train_batch_size: 256
- total_eval_batch_size: 8
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 10.0
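
Note that the effective train batch size of 256 follows from the settings above: 2 per device × 8 GPUs × 16 gradient-accumulation steps. A sketch of these settings as `transformers.TrainingArguments`, assuming the stock `Trainer`; the model, dataset, and `Trainer` wiring are omitted, and `output_dir` is a placeholder:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="ModernBERT_SimCSE_v02",   # placeholder
    learning_rate=1e-5,
    per_device_train_batch_size=2,        # x 8 GPUs x 16 accum. steps = 256
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=16,
    num_train_epochs=10.0,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    optim="adamw_torch",                  # betas=(0.9, 0.999), eps=1e-8 are the defaults
    seed=42,
)
```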

### Training results

| Training Loss | Epoch | Step | Validation Loss | Pearson Cosine | Spearman Cosine | Pearson Manhattan | Spearman Manhattan | Pearson Euclidean | Spearman Euclidean | Pearson Dot | Spearman Dot |
|:-------------:|:------:|:-----:|:---------------:|:--------------:|:---------------:|:-----------------:|:------------------:|:-----------------:|:------------------:|:-----------:|:------------:|
| 0.3877        | 0.2343 | 250   | 0.1542          | 0.7471         | 0.7499          | 0.7393            | 0.7393             | 0.7395            | 0.7397             | 0.6414      | 0.6347       |
| 0.2805        | 0.4686 | 500   | 0.1142          | 0.7578         | 0.7643          | 0.7619            | 0.7652             | 0.7619            | 0.7654             | 0.6366      | 0.6341       |
| 0.2331        | 0.7029 | 750   | 0.0950          | 0.7674         | 0.7772          | 0.7685            | 0.7747             | 0.7682            | 0.7741             | 0.6584      | 0.6570       |
| 0.2455        | 0.9372 | 1000  | 0.0924          | 0.7677         | 0.7781          | 0.7714            | 0.7778             | 0.7712            | 0.7776             | 0.6569      | 0.6558       |
| 0.1933        | 1.1715 | 1250  | 0.0802          | 0.7704         | 0.7790          | 0.7678            | 0.7742             | 0.7676            | 0.7740             | 0.6808      | 0.6797       |
| 0.1872        | 1.4058 | 1500  | 0.0790          | 0.7685         | 0.7777          | 0.7693            | 0.7755             | 0.7690            | 0.7752             | 0.6580      | 0.6569       |
| 0.1628        | 1.6401 | 1750  | 0.0719          | 0.7652         | 0.7734          | 0.7619            | 0.7685             | 0.7616            | 0.7679             | 0.6584      | 0.6574       |
| 0.1983        | 1.8744 | 2000  | 0.0737          | 0.7772         | 0.7864          | 0.7654            | 0.7748             | 0.7649            | 0.7741             | 0.6604      | 0.6608       |
| 0.1448        | 2.1087 | 2250  | 0.0637          | 0.7666         | 0.7737          | 0.7644            | 0.7706             | 0.7639            | 0.7702             | 0.6530      | 0.6506       |
| 0.1449        | 2.3430 | 2500  | 0.0579          | 0.7641         | 0.7698          | 0.7590            | 0.7654             | 0.7584            | 0.7652             | 0.6659      | 0.6637       |
| 0.1443        | 2.5773 | 2750  | 0.0596          | 0.7583         | 0.7659          | 0.7599            | 0.7656             | 0.7594            | 0.7652             | 0.6585      | 0.6551       |
| 0.1363        | 2.8116 | 3000  | 0.0575          | 0.7671         | 0.7727          | 0.7570            | 0.7629             | 0.7564            | 0.7624             | 0.6769      | 0.6756       |
| 0.1227        | 3.0459 | 3250  | 0.0517          | 0.7637         | 0.7670          | 0.7567            | 0.7616             | 0.7560            | 0.7612             | 0.6736      | 0.6714       |
| 0.103         | 3.2802 | 3500  | 0.0464          | 0.7603         | 0.7643          | 0.7484            | 0.7535             | 0.7475            | 0.7527             | 0.6813      | 0.6796       |
| 0.0982        | 3.5145 | 3750  | 0.0451          | 0.7657         | 0.7695          | 0.7452            | 0.7527             | 0.7441            | 0.7516             | 0.6821      | 0.6822       |
| 0.0987        | 3.7488 | 4000  | 0.0467          | 0.7577         | 0.7607          | 0.7397            | 0.7446             | 0.7385            | 0.7434             | 0.6644      | 0.6623       |
| 0.1111        | 3.9831 | 4250  | 0.0406          | 0.7691         | 0.7703          | 0.7471            | 0.7525             | 0.7457            | 0.7510             | 0.6998      | 0.7006       |
| 0.0888        | 4.2174 | 4500  | 0.0421          | 0.7580         | 0.7598          | 0.7412            | 0.7468             | 0.7401            | 0.7457             | 0.6874      | 0.6866       |
| 0.0756        | 4.4517 | 4750  | 0.0395          | 0.7664         | 0.7674          | 0.7432            | 0.7480             | 0.7419            | 0.7465             | 0.7008      | 0.7012       |
| 0.0871        | 4.6860 | 5000  | 0.0411          | 0.7588         | 0.7604          | 0.7405            | 0.7456             | 0.7389            | 0.7441             | 0.6872      | 0.6867       |
| 0.0839        | 4.9203 | 5250  | 0.0400          | 0.7643         | 0.7659          | 0.7311            | 0.7367             | 0.7297            | 0.7351             | 0.6955      | 0.6969       |
| 0.0499        | 5.1546 | 5500  | 0.0392          | 0.7609         | 0.7616          | 0.7335            | 0.7393             | 0.7321            | 0.7379             | 0.6993      | 0.6999       |
| 0.0542        | 5.3889 | 5750  | 0.0385          | 0.7664         | 0.7669          | 0.7399            | 0.7454             | 0.7386            | 0.7445             | 0.7061      | 0.7065       |
| 0.0555        | 5.6232 | 6000  | 0.0396          | 0.7571         | 0.7579          | 0.7293            | 0.7344             | 0.7279            | 0.7331             | 0.7004      | 0.6993       |
| 0.0547        | 5.8575 | 6250  | 0.0384          | 0.7664         | 0.7667          | 0.7382            | 0.7432             | 0.7370            | 0.7420             | 0.7110      | 0.7119       |
| 0.0476        | 6.0918 | 6500  | 0.0388          | 0.7638         | 0.7642          | 0.7338            | 0.7392             | 0.7323            | 0.7378             | 0.7008      | 0.7013       |
| 0.043         | 6.3261 | 6750  | 0.0376          | 0.7692         | 0.7696          | 0.7357            | 0.7409             | 0.7343            | 0.7396             | 0.7138      | 0.7152       |
| 0.0436        | 6.5604 | 7000  | 0.0381          | 0.7662         | 0.7662          | 0.7351            | 0.7398             | 0.7334            | 0.7384             | 0.7105      | 0.7116       |
| 0.032         | 6.7948 | 7250  | 0.0377          | 0.7692         | 0.7695          | 0.7333            | 0.7375             | 0.7316            | 0.7357             | 0.7224      | 0.7242       |
| 0.0342        | 7.0291 | 7500  | 0.0378          | 0.7685         | 0.7678          | 0.7333            | 0.7376             | 0.7320            | 0.7365             | 0.7184      | 0.7187       |
| 0.0341        | 7.2634 | 7750  | 0.0377          | 0.7699         | 0.7695          | 0.7336            | 0.7378             | 0.7317            | 0.7362             | 0.7237      | 0.7244       |
| 0.0329        | 7.4977 | 8000  | 0.0375          | 0.7706         | 0.7697          | 0.7364            | 0.7409             | 0.7346            | 0.7395             | 0.7248      | 0.7250       |
| 0.035         | 7.7320 | 8250  | 0.0380          | 0.7700         | 0.7691          | 0.7308            | 0.7352             | 0.7288            | 0.7335             | 0.7271      | 0.7276       |
| 0.0361        | 7.9663 | 8500  | 0.0377          | 0.7717         | 0.7709          | 0.7276            | 0.7318             | 0.7254            | 0.7297             | 0.7309      | 0.7317       |
| 0.0224        | 8.2006 | 8750  | 0.0377          | 0.7711         | 0.7703          | 0.7328            | 0.7369             | 0.7310            | 0.7356             | 0.7244      | 0.7254       |
| 0.0256        | 8.4349 | 9000  | 0.0386          | 0.7652         | 0.7647          | 0.7274            | 0.7319             | 0.7254            | 0.7303             | 0.7186      | 0.7191       |
| 0.0283        | 8.6692 | 9250  | 0.0370          | 0.7740         | 0.7732          | 0.7294            | 0.7331             | 0.7272            | 0.7312             | 0.7285      | 0.7298       |
| 0.0274        | 8.9035 | 9500  | 0.0372          | 0.7742         | 0.7739          | 0.7288            | 0.7346             | 0.7266            | 0.7328             | 0.7298      | 0.7317       |
| 0.025         | 9.1378 | 9750  | 0.0377          | 0.7719         | 0.7718          | 0.7334            | 0.7389             | 0.7313            | 0.7372             | 0.7295      | 0.7309       |
| 0.031         | 9.3721 | 10000 | 0.0372          | 0.7734         | 0.7735          | 0.7373            | 0.7421             | 0.7357            | 0.7407             | 0.7253      | 0.7266       |
| 0.0243        | 9.6064 | 10250 | 0.0374          | 0.7731         | 0.7727          | 0.7321            | 0.7364             | 0.7300            | 0.7346             | 0.7303      | 0.7306       |
| 0.0233        | 9.8407 | 10500 | 0.0370          | 0.7760         | 0.7753          | 0.7337            | 0.7389             | 0.7316            | 0.7371             | 0.7343      | 0.7356       |
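
Each correlation column reports the Pearson or Spearman correlation between a similarity measure over sentence-pair embeddings (cosine, Manhattan, Euclidean, or dot product) and the gold similarity scores of an STS evaluation set. A minimal sketch of the cosine variant, assuming NumPy arrays of precomputed embeddings and SciPy for the statistics (the function name is illustrative, not from this repo):

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

def sts_cosine_correlations(emb_a, emb_b, gold_scores):
    """Correlate cosine similarities of sentence pairs with gold STS scores,
    mirroring the 'Pearson Cosine' / 'Spearman Cosine' columns above.

    emb_a, emb_b: (n, d) arrays holding the two sides of each sentence pair.
    gold_scores: (n,) array of human similarity judgments.
    """
    a = emb_a / np.linalg.norm(emb_a, axis=1, keepdims=True)
    b = emb_b / np.linalg.norm(emb_b, axis=1, keepdims=True)
    cos = (a * b).sum(axis=1)
    return pearsonr(cos, gold_scores)[0], spearmanr(cos, gold_scores)[0]
```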

### Framework versions

- Transformers 4.48.3
- Pytorch 2.5.1+cu124
- Datasets 3.2.0
- Tokenizers 0.21.0