---
library_name: transformers
license: apache-2.0
base_model: x2bee/KoModernBERT-base-v02
tags:
- generated_from_trainer
model-index:
- name: ModernBERT_SimCSE_v02
  results: []
---


# ModernBERT_SimCSE_v02

This model is a fine-tuned version of [x2bee/KoModernBERT-base-v02](https://huggingface.co/x2bee/KoModernBERT-base-v02) on an unknown dataset.
It achieves the following results on the evaluation set (a sketch of the metric conventions follows the list):
- Loss: 0.0370
- Pearson Cosine: 0.7760
- Spearman Cosine: 0.7753
- Pearson Manhattan: 0.7337
- Spearman Manhattan: 0.7389
- Pearson Euclidean: 0.7316
- Spearman Euclidean: 0.7371
- Pearson Dot: 0.7343
- Spearman Dot: 0.7356
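
The metric names above follow the common sentence-embedding evaluation convention: each similarity function (cosine, negative Manhattan distance, negative Euclidean distance, dot product) is applied to sentence-pair embeddings, and Pearson/Spearman correlations are computed against gold similarity scores. A minimal illustration with placeholder data (the arrays and gold scores below are stand-ins, not the actual evaluation set):

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

# Placeholder data standing in for real sentence-pair embeddings and labels.
rng = np.random.default_rng(0)
emb1 = rng.normal(size=(32, 768))   # embeddings of first sentences
emb2 = rng.normal(size=(32, 768))   # embeddings of second sentences
gold = rng.uniform(0, 5, size=32)   # gold similarity scores

scores = {
    "cosine": (emb1 * emb2).sum(-1)
    / (np.linalg.norm(emb1, axis=-1) * np.linalg.norm(emb2, axis=-1)),
    "manhattan": -np.abs(emb1 - emb2).sum(-1),           # negative distance
    "euclidean": -np.linalg.norm(emb1 - emb2, axis=-1),  # negative distance
    "dot": (emb1 * emb2).sum(-1),
}

for name, score in scores.items():
    print(f"{name}: pearson={pearsonr(gold, score)[0]:.4f} "
          f"spearman={spearmanr(gold, score)[0]:.4f}")
```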

## Model description

The training script did not record a description. Judging from the model name and the similarity-correlation metrics above, this appears to be a SimCSE-style sentence-embedding fine-tune of the Korean ModernBERT base encoder: the model is trained contrastively so that embeddings of semantically similar sentences score high under cosine similarity.

## Intended uses & limitations

Not documented by the author. Given the SimCSE-style objective, the model is presumably intended for producing Korean sentence embeddings for semantic textual similarity, retrieval, and clustering. Because the training data is unknown, domain coverage and potential biases cannot be assessed. A minimal usage sketch follows.
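
A hedged usage sketch, assuming the checkpoint is published as `x2bee/ModernBERT_SimCSE_v02` (inferred from the card name, not confirmed) and that, as is common for SimCSE models, the `[CLS]` (first-token) hidden state serves as the sentence embedding:

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Repo id and [CLS] pooling are assumptions; adjust to the actual checkpoint.
model_id = "x2bee/ModernBERT_SimCSE_v02"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)
model.eval()

# "The weather is really nice today." / "The weather is very sunny."
sentences = ["오늘 날씨가 정말 좋다.", "날씨가 매우 화창하다."]
inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# SimCSE-style pooling: take the [CLS] (first-token) hidden state.
embeddings = outputs.last_hidden_state[:, 0]

# Cosine similarity between the two sentence embeddings.
similarity = torch.nn.functional.cosine_similarity(
    embeddings[0], embeddings[1], dim=0
)
print(f"cosine similarity: {similarity.item():.4f}")
```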

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):
- learning_rate: 1e-05
- train_batch_size: 2
- eval_batch_size: 1
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- gradient_accumulation_steps: 16
- total_train_batch_size: 256
- total_eval_batch_size: 8
- optimizer: AdamW (torch implementation) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 10.0
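
A hedged reconstruction of the corresponding `transformers.TrainingArguments`; only the values listed above come from this card, and `output_dir` is a placeholder:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="ModernBERT_SimCSE_v02",  # placeholder output path
    learning_rate=1e-5,
    per_device_train_batch_size=2,       # x 8 GPUs x 16 accumulation = 256
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=16,
    num_train_epochs=10.0,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```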

### Training results

| Training Loss | Epoch  | Step  | Validation Loss | Pearson Cosine | Spearman Cosine | Pearson Manhattan | Spearman Manhattan | Pearson Euclidean | Spearman Euclidean | Pearson Dot | Spearman Dot |
|:-------------:|:------:|:-----:|:---------------:|:--------------:|:---------------:|:-----------------:|:------------------:|:-----------------:|:------------------:|:-----------:|:------------:|
| 0.3877        | 0.2343 | 250   | 0.1542          | 0.7471         | 0.7499          | 0.7393            | 0.7393             | 0.7395            | 0.7397             | 0.6414      | 0.6347       |
| 0.2805        | 0.4686 | 500   | 0.1142          | 0.7578         | 0.7643          | 0.7619            | 0.7652             | 0.7619            | 0.7654             | 0.6366      | 0.6341       |
| 0.2331        | 0.7029 | 750   | 0.0950          | 0.7674         | 0.7772          | 0.7685            | 0.7747             | 0.7682            | 0.7741             | 0.6584      | 0.6570       |
| 0.2455        | 0.9372 | 1000  | 0.0924          | 0.7677         | 0.7781          | 0.7714            | 0.7778             | 0.7712            | 0.7776             | 0.6569      | 0.6558       |
| 0.1933        | 1.1715 | 1250  | 0.0802          | 0.7704         | 0.7790          | 0.7678            | 0.7742             | 0.7676            | 0.7740             | 0.6808      | 0.6797       |
| 0.1872        | 1.4058 | 1500  | 0.0790          | 0.7685         | 0.7777          | 0.7693            | 0.7755             | 0.7690            | 0.7752             | 0.6580      | 0.6569       |
| 0.1628        | 1.6401 | 1750  | 0.0719          | 0.7652         | 0.7734          | 0.7619            | 0.7685             | 0.7616            | 0.7679             | 0.6584      | 0.6574       |
| 0.1983        | 1.8744 | 2000  | 0.0737          | 0.7772         | 0.7864          | 0.7654            | 0.7748             | 0.7649            | 0.7741             | 0.6604      | 0.6608       |
| 0.1448        | 2.1087 | 2250  | 0.0637          | 0.7666         | 0.7737          | 0.7644            | 0.7706             | 0.7639            | 0.7702             | 0.6530      | 0.6506       |
| 0.1449        | 2.3430 | 2500  | 0.0579          | 0.7641         | 0.7698          | 0.7590            | 0.7654             | 0.7584            | 0.7652             | 0.6659      | 0.6637       |
| 0.1443        | 2.5773 | 2750  | 0.0596          | 0.7583         | 0.7659          | 0.7599            | 0.7656             | 0.7594            | 0.7652             | 0.6585      | 0.6551       |
| 0.1363        | 2.8116 | 3000  | 0.0575          | 0.7671         | 0.7727          | 0.7570            | 0.7629             | 0.7564            | 0.7624             | 0.6769      | 0.6756       |
| 0.1227        | 3.0459 | 3250  | 0.0517          | 0.7637         | 0.7670          | 0.7567            | 0.7616             | 0.7560            | 0.7612             | 0.6736      | 0.6714       |
| 0.103         | 3.2802 | 3500  | 0.0464          | 0.7603         | 0.7643          | 0.7484            | 0.7535             | 0.7475            | 0.7527             | 0.6813      | 0.6796       |
| 0.0982        | 3.5145 | 3750  | 0.0451          | 0.7657         | 0.7695          | 0.7452            | 0.7527             | 0.7441            | 0.7516             | 0.6821      | 0.6822       |
| 0.0987        | 3.7488 | 4000  | 0.0467          | 0.7577         | 0.7607          | 0.7397            | 0.7446             | 0.7385            | 0.7434             | 0.6644      | 0.6623       |
| 0.1111        | 3.9831 | 4250  | 0.0406          | 0.7691         | 0.7703          | 0.7471            | 0.7525             | 0.7457            | 0.7510             | 0.6998      | 0.7006       |
| 0.0888        | 4.2174 | 4500  | 0.0421          | 0.7580         | 0.7598          | 0.7412            | 0.7468             | 0.7401            | 0.7457             | 0.6874      | 0.6866       |
| 0.0756        | 4.4517 | 4750  | 0.0395          | 0.7664         | 0.7674          | 0.7432            | 0.7480             | 0.7419            | 0.7465             | 0.7008      | 0.7012       |
| 0.0871        | 4.6860 | 5000  | 0.0411          | 0.7588         | 0.7604          | 0.7405            | 0.7456             | 0.7389            | 0.7441             | 0.6872      | 0.6867       |
| 0.0839        | 4.9203 | 5250  | 0.0400          | 0.7643         | 0.7659          | 0.7311            | 0.7367             | 0.7297            | 0.7351             | 0.6955      | 0.6969       |
| 0.0499        | 5.1546 | 5500  | 0.0392          | 0.7609         | 0.7616          | 0.7335            | 0.7393             | 0.7321            | 0.7379             | 0.6993      | 0.6999       |
| 0.0542        | 5.3889 | 5750  | 0.0385          | 0.7664         | 0.7669          | 0.7399            | 0.7454             | 0.7386            | 0.7445             | 0.7061      | 0.7065       |
| 0.0555        | 5.6232 | 6000  | 0.0396          | 0.7571         | 0.7579          | 0.7293            | 0.7344             | 0.7279            | 0.7331             | 0.7004      | 0.6993       |
| 0.0547        | 5.8575 | 6250  | 0.0384          | 0.7664         | 0.7667          | 0.7382            | 0.7432             | 0.7370            | 0.7420             | 0.7110      | 0.7119       |
| 0.0476        | 6.0918 | 6500  | 0.0388          | 0.7638         | 0.7642          | 0.7338            | 0.7392             | 0.7323            | 0.7378             | 0.7008      | 0.7013       |
| 0.043         | 6.3261 | 6750  | 0.0376          | 0.7692         | 0.7696          | 0.7357            | 0.7409             | 0.7343            | 0.7396             | 0.7138      | 0.7152       |
| 0.0436        | 6.5604 | 7000  | 0.0381          | 0.7662         | 0.7662          | 0.7351            | 0.7398             | 0.7334            | 0.7384             | 0.7105      | 0.7116       |
| 0.032         | 6.7948 | 7250  | 0.0377          | 0.7692         | 0.7695          | 0.7333            | 0.7375             | 0.7316            | 0.7357             | 0.7224      | 0.7242       |
| 0.0342        | 7.0291 | 7500  | 0.0378          | 0.7685         | 0.7678          | 0.7333            | 0.7376             | 0.7320            | 0.7365             | 0.7184      | 0.7187       |
| 0.0341        | 7.2634 | 7750  | 0.0377          | 0.7699         | 0.7695          | 0.7336            | 0.7378             | 0.7317            | 0.7362             | 0.7237      | 0.7244       |
| 0.0329        | 7.4977 | 8000  | 0.0375          | 0.7706         | 0.7697          | 0.7364            | 0.7409             | 0.7346            | 0.7395             | 0.7248      | 0.7250       |
| 0.035         | 7.7320 | 8250  | 0.0380          | 0.7700         | 0.7691          | 0.7308            | 0.7352             | 0.7288            | 0.7335             | 0.7271      | 0.7276       |
| 0.0361        | 7.9663 | 8500  | 0.0377          | 0.7717         | 0.7709          | 0.7276            | 0.7318             | 0.7254            | 0.7297             | 0.7309      | 0.7317       |
| 0.0224        | 8.2006 | 8750  | 0.0377          | 0.7711         | 0.7703          | 0.7328            | 0.7369             | 0.7310            | 0.7356             | 0.7244      | 0.7254       |
| 0.0256        | 8.4349 | 9000  | 0.0386          | 0.7652         | 0.7647          | 0.7274            | 0.7319             | 0.7254            | 0.7303             | 0.7186      | 0.7191       |
| 0.0283        | 8.6692 | 9250  | 0.0370          | 0.7740         | 0.7732          | 0.7294            | 0.7331             | 0.7272            | 0.7312             | 0.7285      | 0.7298       |
| 0.0274        | 8.9035 | 9500  | 0.0372          | 0.7742         | 0.7739          | 0.7288            | 0.7346             | 0.7266            | 0.7328             | 0.7298      | 0.7317       |
| 0.025         | 9.1378 | 9750  | 0.0377          | 0.7719         | 0.7718          | 0.7334            | 0.7389             | 0.7313            | 0.7372             | 0.7295      | 0.7309       |
| 0.031         | 9.3721 | 10000 | 0.0372          | 0.7734         | 0.7735          | 0.7373            | 0.7421             | 0.7357            | 0.7407             | 0.7253      | 0.7266       |
| 0.0243        | 9.6064 | 10250 | 0.0374          | 0.7731         | 0.7727          | 0.7321            | 0.7364             | 0.7300            | 0.7346             | 0.7303      | 0.7306       |
| 0.0233        | 9.8407 | 10500 | 0.0370          | 0.7760         | 0.7753          | 0.7337            | 0.7389             | 0.7316            | 0.7371             | 0.7343      | 0.7356       |


### Framework versions

- Transformers 4.48.3
- Pytorch 2.5.1+cu124
- Datasets 3.2.0
- Tokenizers 0.21.0