# ModernBERT_SimCSE_v02
This model is a fine-tuned version of x2bee/KoModernBERT-base-v02 on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.0370
- Pearson Cosine: 0.7760
- Spearman Cosine: 0.7753
- Pearson Manhattan: 0.7337
- Spearman Manhattan: 0.7389
- Pearson Euclidean: 0.7316
- Spearman Euclidean: 0.7371
- Pearson Dot: 0.7343
- Spearman Dot: 0.7356
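The Pearson/Spearman metrics above measure how well the model's predicted pair similarities (cosine, Manhattan, Euclidean, or dot-product distances between sentence embeddings) correlate with gold similarity labels. A minimal sketch of how the cosine variants are computed, using random toy embeddings and labels as stand-ins for the actual evaluation set:

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy stand-ins: 8 sentence pairs with 4-dim embeddings and gold scores on a 0-5 scale.
rng = np.random.default_rng(0)
emb_a = rng.normal(size=(8, 4))
emb_b = rng.normal(size=(8, 4))
gold = rng.uniform(0.0, 5.0, size=8)

# Predicted similarity per pair, then correlation with the gold labels.
preds = [cosine(x, y) for x, y in zip(emb_a, emb_b)]
pearson_cosine = pearsonr(preds, gold)[0]
spearman_cosine = spearmanr(preds, gold)[0]
print(pearson_cosine, spearman_cosine)
```

The Manhattan/Euclidean/dot variants replace `cosine` with the respective (negated) distance or inner product before correlating.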
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 2
- eval_batch_size: 1
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- gradient_accumulation_steps: 16
- total_train_batch_size: 256
- total_eval_batch_size: 8
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 10.0
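The listed `total_train_batch_size` of 256 follows from the per-device batch size, the number of GPUs, and gradient accumulation:

```python
# Effective (total) train batch size from the hyperparameters above.
train_batch_size = 2            # per-device batch size
num_devices = 8                 # multi-GPU setup
gradient_accumulation_steps = 16

total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
print(total_train_batch_size)  # 256
```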
### Training results
Training Loss | Epoch | Step | Validation Loss | Pearson Cosine | Spearman Cosine | Pearson Manhattan | Spearman Manhattan | Pearson Euclidean | Spearman Euclidean | Pearson Dot | Spearman Dot |
---|---|---|---|---|---|---|---|---|---|---|---|
0.3877 | 0.2343 | 250 | 0.1542 | 0.7471 | 0.7499 | 0.7393 | 0.7393 | 0.7395 | 0.7397 | 0.6414 | 0.6347 |
0.2805 | 0.4686 | 500 | 0.1142 | 0.7578 | 0.7643 | 0.7619 | 0.7652 | 0.7619 | 0.7654 | 0.6366 | 0.6341 |
0.2331 | 0.7029 | 750 | 0.0950 | 0.7674 | 0.7772 | 0.7685 | 0.7747 | 0.7682 | 0.7741 | 0.6584 | 0.6570 |
0.2455 | 0.9372 | 1000 | 0.0924 | 0.7677 | 0.7781 | 0.7714 | 0.7778 | 0.7712 | 0.7776 | 0.6569 | 0.6558 |
0.1933 | 1.1715 | 1250 | 0.0802 | 0.7704 | 0.7790 | 0.7678 | 0.7742 | 0.7676 | 0.7740 | 0.6808 | 0.6797 |
0.1872 | 1.4058 | 1500 | 0.0790 | 0.7685 | 0.7777 | 0.7693 | 0.7755 | 0.7690 | 0.7752 | 0.6580 | 0.6569 |
0.1628 | 1.6401 | 1750 | 0.0719 | 0.7652 | 0.7734 | 0.7619 | 0.7685 | 0.7616 | 0.7679 | 0.6584 | 0.6574 |
0.1983 | 1.8744 | 2000 | 0.0737 | 0.7772 | 0.7864 | 0.7654 | 0.7748 | 0.7649 | 0.7741 | 0.6604 | 0.6608 |
0.1448 | 2.1087 | 2250 | 0.0637 | 0.7666 | 0.7737 | 0.7644 | 0.7706 | 0.7639 | 0.7702 | 0.6530 | 0.6506 |
0.1449 | 2.3430 | 2500 | 0.0579 | 0.7641 | 0.7698 | 0.7590 | 0.7654 | 0.7584 | 0.7652 | 0.6659 | 0.6637 |
0.1443 | 2.5773 | 2750 | 0.0596 | 0.7583 | 0.7659 | 0.7599 | 0.7656 | 0.7594 | 0.7652 | 0.6585 | 0.6551 |
0.1363 | 2.8116 | 3000 | 0.0575 | 0.7671 | 0.7727 | 0.7570 | 0.7629 | 0.7564 | 0.7624 | 0.6769 | 0.6756 |
0.1227 | 3.0459 | 3250 | 0.0517 | 0.7637 | 0.7670 | 0.7567 | 0.7616 | 0.7560 | 0.7612 | 0.6736 | 0.6714 |
0.103 | 3.2802 | 3500 | 0.0464 | 0.7603 | 0.7643 | 0.7484 | 0.7535 | 0.7475 | 0.7527 | 0.6813 | 0.6796 |
0.0982 | 3.5145 | 3750 | 0.0451 | 0.7657 | 0.7695 | 0.7452 | 0.7527 | 0.7441 | 0.7516 | 0.6821 | 0.6822 |
0.0987 | 3.7488 | 4000 | 0.0467 | 0.7577 | 0.7607 | 0.7397 | 0.7446 | 0.7385 | 0.7434 | 0.6644 | 0.6623 |
0.1111 | 3.9831 | 4250 | 0.0406 | 0.7691 | 0.7703 | 0.7471 | 0.7525 | 0.7457 | 0.7510 | 0.6998 | 0.7006 |
0.0888 | 4.2174 | 4500 | 0.0421 | 0.7580 | 0.7598 | 0.7412 | 0.7468 | 0.7401 | 0.7457 | 0.6874 | 0.6866 |
0.0756 | 4.4517 | 4750 | 0.0395 | 0.7664 | 0.7674 | 0.7432 | 0.7480 | 0.7419 | 0.7465 | 0.7008 | 0.7012 |
0.0871 | 4.6860 | 5000 | 0.0411 | 0.7588 | 0.7604 | 0.7405 | 0.7456 | 0.7389 | 0.7441 | 0.6872 | 0.6867 |
0.0839 | 4.9203 | 5250 | 0.0400 | 0.7643 | 0.7659 | 0.7311 | 0.7367 | 0.7297 | 0.7351 | 0.6955 | 0.6969 |
0.0499 | 5.1546 | 5500 | 0.0392 | 0.7609 | 0.7616 | 0.7335 | 0.7393 | 0.7321 | 0.7379 | 0.6993 | 0.6999 |
0.0542 | 5.3889 | 5750 | 0.0385 | 0.7664 | 0.7669 | 0.7399 | 0.7454 | 0.7386 | 0.7445 | 0.7061 | 0.7065 |
0.0555 | 5.6232 | 6000 | 0.0396 | 0.7571 | 0.7579 | 0.7293 | 0.7344 | 0.7279 | 0.7331 | 0.7004 | 0.6993 |
0.0547 | 5.8575 | 6250 | 0.0384 | 0.7664 | 0.7667 | 0.7382 | 0.7432 | 0.7370 | 0.7420 | 0.7110 | 0.7119 |
0.0476 | 6.0918 | 6500 | 0.0388 | 0.7638 | 0.7642 | 0.7338 | 0.7392 | 0.7323 | 0.7378 | 0.7008 | 0.7013 |
0.043 | 6.3261 | 6750 | 0.0376 | 0.7692 | 0.7696 | 0.7357 | 0.7409 | 0.7343 | 0.7396 | 0.7138 | 0.7152 |
0.0436 | 6.5604 | 7000 | 0.0381 | 0.7662 | 0.7662 | 0.7351 | 0.7398 | 0.7334 | 0.7384 | 0.7105 | 0.7116 |
0.032 | 6.7948 | 7250 | 0.0377 | 0.7692 | 0.7695 | 0.7333 | 0.7375 | 0.7316 | 0.7357 | 0.7224 | 0.7242 |
0.0342 | 7.0291 | 7500 | 0.0378 | 0.7685 | 0.7678 | 0.7333 | 0.7376 | 0.7320 | 0.7365 | 0.7184 | 0.7187 |
0.0341 | 7.2634 | 7750 | 0.0377 | 0.7699 | 0.7695 | 0.7336 | 0.7378 | 0.7317 | 0.7362 | 0.7237 | 0.7244 |
0.0329 | 7.4977 | 8000 | 0.0375 | 0.7706 | 0.7697 | 0.7364 | 0.7409 | 0.7346 | 0.7395 | 0.7248 | 0.7250 |
0.035 | 7.7320 | 8250 | 0.0380 | 0.7700 | 0.7691 | 0.7308 | 0.7352 | 0.7288 | 0.7335 | 0.7271 | 0.7276 |
0.0361 | 7.9663 | 8500 | 0.0377 | 0.7717 | 0.7709 | 0.7276 | 0.7318 | 0.7254 | 0.7297 | 0.7309 | 0.7317 |
0.0224 | 8.2006 | 8750 | 0.0377 | 0.7711 | 0.7703 | 0.7328 | 0.7369 | 0.7310 | 0.7356 | 0.7244 | 0.7254 |
0.0256 | 8.4349 | 9000 | 0.0386 | 0.7652 | 0.7647 | 0.7274 | 0.7319 | 0.7254 | 0.7303 | 0.7186 | 0.7191 |
0.0283 | 8.6692 | 9250 | 0.0370 | 0.7740 | 0.7732 | 0.7294 | 0.7331 | 0.7272 | 0.7312 | 0.7285 | 0.7298 |
0.0274 | 8.9035 | 9500 | 0.0372 | 0.7742 | 0.7739 | 0.7288 | 0.7346 | 0.7266 | 0.7328 | 0.7298 | 0.7317 |
0.025 | 9.1378 | 9750 | 0.0377 | 0.7719 | 0.7718 | 0.7334 | 0.7389 | 0.7313 | 0.7372 | 0.7295 | 0.7309 |
0.031 | 9.3721 | 10000 | 0.0372 | 0.7734 | 0.7735 | 0.7373 | 0.7421 | 0.7357 | 0.7407 | 0.7253 | 0.7266 |
0.0243 | 9.6064 | 10250 | 0.0374 | 0.7731 | 0.7727 | 0.7321 | 0.7364 | 0.7300 | 0.7346 | 0.7303 | 0.7306 |
0.0233 | 9.8407 | 10500 | 0.0370 | 0.7760 | 0.7753 | 0.7337 | 0.7389 | 0.7316 | 0.7371 | 0.7343 | 0.7356 |
### Framework versions
- Transformers 4.48.3
- Pytorch 2.5.1+cu124
- Datasets 3.2.0
- Tokenizers 0.21.0
## Model tree

- Base model: answerdotai/ModernBERT-base
- Fine-tuned from: x2bee/KoModernBERT-base-mlm_v02