Evaluation on the test set completed on 2024_10_24.
- README.md +176 -0
- all_results.json +16 -0
- logs/events.out.tfevents.1729668200.datavisu2 +2 -2
- logs/events.out.tfevents.1729797250.datavisu2 +3 -0
- model.safetensors +1 -1
- test_results.json +11 -0
- train_results.json +9 -0
- trainer_state.json +0 -0
README.md
ADDED
@@ -0,0 +1,176 @@
+---
+library_name: transformers
+license: apache-2.0
+base_model: facebook/dinov2-large
+tags:
+- generated_from_trainer
+metrics:
+- accuracy
+model-index:
+- name: Aina-large-2024_10_23-batch-size32_freeze_monolabel
+  results: []
+---
+
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+
+# Aina-large-2024_10_23-batch-size32_freeze_monolabel
+
+This model is a fine-tuned version of [facebook/dinov2-large](https://huggingface.co/facebook/dinov2-large) on the None dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.6806
+- F1 Micro: 0.7614
+- F1 Macro: 0.4269
+- Accuracy: 0.7614
+- Learning Rate: 0.0000
+
+## Model description
+
+More information needed
+
+## Intended uses & limitations
+
+More information needed
+
+## Training and evaluation data
+
+More information needed
+
+## Training procedure
+
+### Training hyperparameters
+
+The following hyperparameters were used during training:
+- learning_rate: 0.001
+- train_batch_size: 32
+- eval_batch_size: 32
+- seed: 42
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- num_epochs: 150
+- mixed_precision_training: Native AMP
+
+### Training results
+
+| Training Loss | Epoch | Step | Validation Loss | F1 Micro | F1 Macro | Accuracy | Rate |
+|:-------------:|:-----:|:------:|:---------------:|:--------:|:--------:|:--------:|:------:|
+| 0.9658 | 1.0 | 3312 | 0.8468 | 0.7179 | 0.2217 | 0.7179 | 0.001 |
+| 0.9257 | 2.0 | 6624 | 0.8172 | 0.7247 | 0.3043 | 0.7247 | 0.001 |
+| 0.9202 | 3.0 | 9936 | 0.8048 | 0.7260 | 0.3035 | 0.7260 | 0.001 |
+| 0.8905 | 4.0 | 13248 | 0.7947 | 0.7285 | 0.3109 | 0.7285 | 0.001 |
+| 0.907 | 5.0 | 16560 | 0.7822 | 0.7309 | 0.3046 | 0.7309 | 0.001 |
+| 0.8925 | 6.0 | 19872 | 0.7838 | 0.7345 | 0.3159 | 0.7345 | 0.001 |
+| 0.8922 | 7.0 | 23184 | 0.7931 | 0.7357 | 0.3244 | 0.7357 | 0.001 |
+| 0.883 | 8.0 | 26496 | 0.7688 | 0.7354 | 0.3241 | 0.7354 | 0.001 |
+| 0.8697 | 9.0 | 29808 | 0.7635 | 0.7377 | 0.3242 | 0.7377 | 0.001 |
+| 0.8782 | 10.0 | 33120 | 0.7689 | 0.7373 | 0.3327 | 0.7373 | 0.001 |
+| 0.8869 | 11.0 | 36432 | 0.7676 | 0.7350 | 0.3337 | 0.7350 | 0.001 |
+| 0.8791 | 12.0 | 39744 | 0.7640 | 0.7369 | 0.3409 | 0.7369 | 0.001 |
+| 0.9017 | 13.0 | 43056 | 0.7674 | 0.7337 | 0.3400 | 0.7337 | 0.001 |
+| 0.8753 | 14.0 | 46368 | 0.7586 | 0.7381 | 0.3271 | 0.7381 | 0.001 |
+| 0.872 | 15.0 | 49680 | 0.7658 | 0.7373 | 0.3229 | 0.7373 | 0.001 |
+| 0.8672 | 16.0 | 52992 | 0.8086 | 0.7389 | 0.3353 | 0.7389 | 0.001 |
+| 0.8678 | 17.0 | 56304 | 0.7629 | 0.7390 | 0.3359 | 0.7390 | 0.001 |
+| 0.8875 | 18.0 | 59616 | 0.7615 | 0.7365 | 0.3353 | 0.7365 | 0.001 |
+| 0.8645 | 19.0 | 62928 | 0.7682 | 0.7387 | 0.3450 | 0.7387 | 0.001 |
+| 0.881 | 20.0 | 66240 | 0.7559 | 0.7406 | 0.3411 | 0.7406 | 0.001 |
+| 0.8927 | 21.0 | 69552 | 0.7755 | 0.7349 | 0.3408 | 0.7349 | 0.001 |
+| 0.8704 | 22.0 | 72864 | 0.7674 | 0.7344 | 0.3233 | 0.7344 | 0.001 |
+| 0.8711 | 23.0 | 76176 | 0.7695 | 0.7340 | 0.3139 | 0.7340 | 0.001 |
+| 0.8722 | 24.0 | 79488 | 0.7538 | 0.7400 | 0.3338 | 0.7400 | 0.001 |
+| 0.884 | 25.0 | 82800 | 0.7643 | 0.7352 | 0.3480 | 0.7352 | 0.001 |
+| 0.8661 | 26.0 | 86112 | 0.7568 | 0.7388 | 0.3272 | 0.7388 | 0.001 |
+| 0.8847 | 27.0 | 89424 | 0.7665 | 0.7371 | 0.3427 | 0.7371 | 0.001 |
+| 0.8749 | 28.0 | 92736 | 0.7592 | 0.7385 | 0.3129 | 0.7385 | 0.001 |
+| 0.8782 | 29.0 | 96048 | 0.7544 | 0.7402 | 0.3420 | 0.7402 | 0.001 |
+| 0.882 | 30.0 | 99360 | 0.7549 | 0.7412 | 0.3503 | 0.7412 | 0.001 |
+| 0.8481 | 31.0 | 102672 | 0.7332 | 0.7457 | 0.3602 | 0.7457 | 0.0001 |
+| 0.8329 | 32.0 | 105984 | 0.7296 | 0.7456 | 0.3696 | 0.7456 | 0.0001 |
+| 0.817 | 33.0 | 109296 | 0.7270 | 0.7467 | 0.3749 | 0.7467 | 0.0001 |
+| 0.8173 | 34.0 | 112608 | 0.7234 | 0.7471 | 0.3683 | 0.7471 | 0.0001 |
+| 0.8221 | 35.0 | 115920 | 0.7187 | 0.7492 | 0.3795 | 0.7492 | 0.0001 |
+| 0.8085 | 36.0 | 119232 | 0.7215 | 0.7484 | 0.3758 | 0.7484 | 0.0001 |
+| 0.8113 | 37.0 | 122544 | 0.7180 | 0.7505 | 0.3767 | 0.7505 | 0.0001 |
+| 0.802 | 38.0 | 125856 | 0.7137 | 0.7502 | 0.3861 | 0.7502 | 0.0001 |
+| 0.8042 | 39.0 | 129168 | 0.7125 | 0.7514 | 0.3868 | 0.7514 | 0.0001 |
+| 0.7976 | 40.0 | 132480 | 0.7126 | 0.7499 | 0.3844 | 0.7499 | 0.0001 |
+| 0.7963 | 41.0 | 135792 | 0.7112 | 0.7516 | 0.3905 | 0.7516 | 0.0001 |
+| 0.8054 | 42.0 | 139104 | 0.7116 | 0.7511 | 0.3926 | 0.7511 | 0.0001 |
+| 0.8119 | 43.0 | 142416 | 0.7098 | 0.7516 | 0.3901 | 0.7516 | 0.0001 |
+| 0.8009 | 44.0 | 145728 | 0.7102 | 0.7507 | 0.3897 | 0.7507 | 0.0001 |
+| 0.7929 | 45.0 | 149040 | 0.7100 | 0.7517 | 0.3883 | 0.7517 | 0.0001 |
+| 0.8079 | 46.0 | 152352 | 0.7068 | 0.7510 | 0.3912 | 0.7510 | 0.0001 |
+| 0.8053 | 47.0 | 155664 | 0.7074 | 0.7510 | 0.3888 | 0.7510 | 0.0001 |
+| 0.7965 | 48.0 | 158976 | 0.7095 | 0.7508 | 0.3890 | 0.7508 | 0.0001 |
+| 0.8043 | 49.0 | 162288 | 0.7090 | 0.7509 | 0.3935 | 0.7509 | 0.0001 |
+| 0.7861 | 50.0 | 165600 | 0.7080 | 0.7512 | 0.4026 | 0.7512 | 0.0001 |
+| 0.7917 | 51.0 | 168912 | 0.7062 | 0.7514 | 0.3942 | 0.7514 | 0.0001 |
+| 0.7909 | 52.0 | 172224 | 0.7049 | 0.7526 | 0.3971 | 0.7526 | 0.0001 |
+| 0.7886 | 53.0 | 175536 | 0.7044 | 0.7526 | 0.4017 | 0.7526 | 0.0001 |
+| 0.7834 | 54.0 | 178848 | 0.7028 | 0.7524 | 0.3992 | 0.7524 | 0.0001 |
+| 0.7991 | 55.0 | 182160 | 0.7029 | 0.7527 | 0.3966 | 0.7527 | 0.0001 |
+| 0.7875 | 56.0 | 185472 | 0.7026 | 0.7533 | 0.4011 | 0.7533 | 0.0001 |
+| 0.7868 | 57.0 | 188784 | 0.7029 | 0.7525 | 0.4056 | 0.7525 | 0.0001 |
+| 0.7837 | 58.0 | 192096 | 0.7021 | 0.7536 | 0.4020 | 0.7536 | 0.0001 |
+| 0.7834 | 59.0 | 195408 | 0.7011 | 0.7534 | 0.4049 | 0.7534 | 0.0001 |
+| 0.7893 | 60.0 | 198720 | 0.7019 | 0.7530 | 0.4029 | 0.7530 | 0.0001 |
+| 0.7824 | 61.0 | 202032 | 0.7023 | 0.7519 | 0.3995 | 0.7519 | 0.0001 |
+| 0.789 | 62.0 | 205344 | 0.7038 | 0.7525 | 0.4041 | 0.7525 | 0.0001 |
+| 0.7778 | 63.0 | 208656 | 0.7003 | 0.7535 | 0.4038 | 0.7535 | 0.0001 |
+| 0.7719 | 64.0 | 211968 | 0.6997 | 0.7526 | 0.3982 | 0.7526 | 0.0001 |
+| 0.7909 | 65.0 | 215280 | 0.7074 | 0.7515 | 0.3997 | 0.7515 | 0.0001 |
+| 0.7854 | 66.0 | 218592 | 0.7018 | 0.7526 | 0.3940 | 0.7526 | 0.0001 |
+| 0.7746 | 67.0 | 221904 | 0.7023 | 0.7543 | 0.4000 | 0.7543 | 0.0001 |
+| 0.7905 | 68.0 | 225216 | 0.6975 | 0.7541 | 0.4063 | 0.7541 | 0.0001 |
+| 0.7824 | 69.0 | 228528 | 0.6994 | 0.7538 | 0.4072 | 0.7538 | 0.0001 |
+| 0.7795 | 70.0 | 231840 | 0.6969 | 0.7557 | 0.4094 | 0.7557 | 0.0001 |
+| 0.7763 | 71.0 | 235152 | 0.6969 | 0.7564 | 0.4085 | 0.7564 | 0.0001 |
+| 0.7723 | 72.0 | 238464 | 0.6987 | 0.7531 | 0.4090 | 0.7531 | 0.0001 |
+| 0.7914 | 73.0 | 241776 | 0.6945 | 0.7556 | 0.4203 | 0.7556 | 0.0001 |
+| 0.7658 | 74.0 | 245088 | 0.6951 | 0.7544 | 0.4117 | 0.7544 | 0.0001 |
+| 0.7803 | 75.0 | 248400 | 0.6989 | 0.7548 | 0.4104 | 0.7548 | 0.0001 |
+| 0.7772 | 76.0 | 251712 | 0.6997 | 0.7536 | 0.4037 | 0.7536 | 0.0001 |
+| 0.7813 | 77.0 | 255024 | 0.6986 | 0.7535 | 0.4092 | 0.7535 | 0.0001 |
+| 0.7938 | 78.0 | 258336 | 0.6982 | 0.7530 | 0.4084 | 0.7530 | 0.0001 |
+| 0.776 | 79.0 | 261648 | 0.6958 | 0.7545 | 0.4055 | 0.7545 | 0.0001 |
+| 0.7613 | 80.0 | 264960 | 0.6934 | 0.7548 | 0.4061 | 0.7548 | 1e-05 |
+| 0.7647 | 81.0 | 268272 | 0.6922 | 0.7560 | 0.4108 | 0.7560 | 1e-05 |
+| 0.7842 | 82.0 | 271584 | 0.6933 | 0.7543 | 0.4069 | 0.7543 | 1e-05 |
+| 0.7689 | 83.0 | 274896 | 0.6953 | 0.7535 | 0.4068 | 0.7535 | 1e-05 |
+| 0.7674 | 84.0 | 278208 | 0.6913 | 0.7570 | 0.4140 | 0.7570 | 1e-05 |
+| 0.7607 | 85.0 | 281520 | 0.6911 | 0.7564 | 0.4117 | 0.7564 | 1e-05 |
+| 0.7744 | 86.0 | 284832 | 0.6916 | 0.7563 | 0.4128 | 0.7563 | 1e-05 |
+| 0.7639 | 87.0 | 288144 | 0.6929 | 0.7550 | 0.4089 | 0.7550 | 1e-05 |
+| 0.7515 | 88.0 | 291456 | 0.6904 | 0.7565 | 0.4210 | 0.7565 | 1e-05 |
+| 0.7529 | 89.0 | 294768 | 0.6912 | 0.7554 | 0.4082 | 0.7554 | 1e-05 |
+| 0.7575 | 90.0 | 298080 | 0.6931 | 0.7557 | 0.4102 | 0.7557 | 1e-05 |
+| 0.7715 | 91.0 | 301392 | 0.6912 | 0.7555 | 0.4130 | 0.7555 | 1e-05 |
+| 0.7512 | 92.0 | 304704 | 0.6950 | 0.7534 | 0.4113 | 0.7534 | 1e-05 |
+| 0.7514 | 93.0 | 308016 | 0.6945 | 0.7539 | 0.4075 | 0.7539 | 1e-05 |
+| 0.7529 | 94.0 | 311328 | 0.6904 | 0.7564 | 0.4140 | 0.7564 | 1e-05 |
+| 0.7731 | 95.0 | 314640 | 0.6919 | 0.7555 | 0.4121 | 0.7555 | 0.0000 |
+| 0.7561 | 96.0 | 317952 | 0.6894 | 0.7563 | 0.4092 | 0.7563 | 0.0000 |
+| 0.7702 | 97.0 | 321264 | 0.6900 | 0.7565 | 0.4131 | 0.7565 | 0.0000 |
+| 0.7506 | 98.0 | 324576 | 0.6900 | 0.7566 | 0.4136 | 0.7566 | 0.0000 |
+| 0.7512 | 99.0 | 327888 | 0.6909 | 0.7564 | 0.4168 | 0.7564 | 0.0000 |
+| 0.7694 | 100.0 | 331200 | 0.6912 | 0.7562 | 0.4155 | 0.7562 | 0.0000 |
+| 0.7487 | 101.0 | 334512 | 0.6904 | 0.7550 | 0.4158 | 0.7550 | 0.0000 |
+| 0.7543 | 102.0 | 337824 | 0.6890 | 0.7570 | 0.4175 | 0.7570 | 0.0000 |
+| 0.7743 | 103.0 | 341136 | 0.6923 | 0.7546 | 0.4137 | 0.7546 | 0.0000 |
+| 0.757 | 104.0 | 344448 | 0.6912 | 0.7560 | 0.4183 | 0.7560 | 0.0000 |
+| 0.7631 | 105.0 | 347760 | 0.6899 | 0.7561 | 0.4088 | 0.7561 | 0.0000 |
+| 0.755 | 106.0 | 351072 | 0.6912 | 0.7556 | 0.4102 | 0.7556 | 0.0000 |
+| 0.7545 | 107.0 | 354384 | 0.6898 | 0.7573 | 0.4107 | 0.7573 | 0.0000 |
+| 0.7533 | 108.0 | 357696 | 0.6910 | 0.7538 | 0.4114 | 0.7538 | 0.0000 |
+| 0.7725 | 109.0 | 361008 | 0.6899 | 0.7565 | 0.4134 | 0.7565 | 0.0000 |
+| 0.7544 | 110.0 | 364320 | 0.6922 | 0.7555 | 0.4110 | 0.7555 | 0.0000 |
+| 0.758 | 111.0 | 367632 | 0.6901 | 0.7559 | 0.4141 | 0.7559 | 0.0000 |
+| 0.7674 | 112.0 | 370944 | 0.6903 | 0.7560 | 0.4127 | 0.7560 | 0.0000 |
+
+
+### Framework versions
+
+- Transformers 4.44.2
+- Pytorch 2.4.1+cu121
+- Datasets 3.0.0
+- Tokenizers 0.19.1
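The Rate column of the training results table does not decay smoothly. It drops tenfold at a few epochs, which is worth noticing because the card lists `lr_scheduler_type: linear`. The sketch below is a minimal illustration, not part of the commit: it scans six rows copied from the table (the rows on either side of each change) and reports where the rate changes. The function name `rate_change_points` and the choice of rows are assumptions made for this example.

```python
# Rows copied from the training results table: (epoch, validation_loss, rate).
# Only the rows on either side of each change in the Rate column are listed.
ROWS = [
    (30.0, 0.7549, "0.001"),
    (31.0, 0.7332, "0.0001"),
    (79.0, 0.6958, "0.0001"),
    (80.0, 0.6934, "1e-05"),
    (94.0, 0.6904, "1e-05"),
    (95.0, 0.6919, "0.0000"),
]


def rate_change_points(rows):
    """Return (epoch, old_rate, new_rate) for each row whose rate differs from the previous row."""
    changes = []
    previous = None
    for epoch, _val_loss, rate in rows:
        if previous is not None and rate != previous:
            changes.append((epoch, previous, rate))
        previous = rate
    return changes


for epoch, old, new in rate_change_points(ROWS):
    print(f"epoch {epoch:g}: {old} -> {new}")
```

The rate changes at epochs 31, 80, and 95. The `0.0000` entries are a display rounding: the JSON metrics in this commit report a final `learning_rate` of `1.0000000000000002e-07`. Stepwise tenfold drops like these would be consistent with a plateau-based scheduler, but the card does not say which scheduler was actually used.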
all_results.json
ADDED
@@ -0,0 +1,16 @@
+{
+    "epoch": 112.0,
+    "eval_accuracy": 0.7613533408833522,
+    "eval_f1_macro": 0.4268685036536852,
+    "eval_f1_micro": 0.7613533408833522,
+    "eval_loss": 0.6806153655052185,
+    "eval_runtime": 235.8653,
+    "eval_samples_per_second": 149.746,
+    "eval_steps_per_second": 4.681,
+    "learning_rate": 1.0000000000000002e-07,
+    "total_flos": 6.926864611971372e+20,
+    "train_loss": 0.8085850866355923,
+    "train_runtime": 128547.5283,
+    "train_samples_per_second": 123.643,
+    "train_steps_per_second": 3.865
+}
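`eval_accuracy` and `eval_f1_micro` carry the same value (0.7613533408833522). That is expected for single-label classification, which the `monolabel` suffix in the model name suggests: every wrong prediction counts once as a false positive for the predicted class and once as a false negative for the true class, so micro-averaged F1 equals accuracy. The sketch below checks this on invented toy labels. The helper names and the labels are illustrative and are not taken from the training code.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_counts(y_true, y_pred):
    """Return {label: (tp, fp, fn)} for every label seen in either list."""
    counts = {}
    for label in set(y_true) | set(y_pred):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        counts[label] = (tp, fp, fn)
    return counts


def f1_micro(y_true, y_pred):
    """F1 computed from counts pooled over all classes."""
    counts = per_class_counts(y_true, y_pred).values()
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    fn = sum(c[2] for c in counts)
    return 2 * tp / (2 * tp + fp + fn)


def f1_macro(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    counts = per_class_counts(y_true, y_pred).values()
    return sum(2 * tp / (2 * tp + fp + fn) for tp, fp, fn in counts) / len(counts)


# Invented, imbalanced toy data: class 0 dominates and class 2 is never predicted.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 0, 0, 0, 1, 0, 0, 0]

print("accuracy:", accuracy(y_true, y_pred))
print("f1_micro:", f1_micro(y_true, y_pred))
print("f1_macro:", round(f1_macro(y_true, y_pred), 4))
```

On the toy data, accuracy and micro F1 are both 0.7, while macro F1 is lower because the rare classes are predicted poorly. The gap between `eval_f1_micro` (0.7614) and `eval_f1_macro` (0.4269) in this commit has the same shape, which may point to class imbalance, although the card does not describe the dataset.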
logs/events.out.tfevents.1729668200.datavisu2
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:a142d606941d15907764ce3a76ba8cb8dbd60df6562de1a9bc16be208dd463a9
+size 221197
logs/events.out.tfevents.1729797250.datavisu2
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:39a33aa522a0dc53450543baef97d2cc9cd14b8d236b3d5154e6962487521098
+size 40
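The numeric field in each TensorBoard event file name is a Unix timestamp in seconds, followed by the host name. Decoding it is a quick way to relate the two log files to the run. The sketch below assumes the name layout `events.out.tfevents.<unix_seconds>.<hostname>` seen in this commit. The function name `tfevents_start_time` is made up for the example.

```python
from datetime import datetime, timezone


def tfevents_start_time(filename):
    """Return the UTC datetime encoded in a TensorBoard event file name.

    Assumes the layout events.out.tfevents.<unix_seconds>.<hostname>.
    """
    parts = filename.split(".")
    seconds = int(parts[3])  # fourth dot-separated field holds the timestamp
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


for name in (
    "events.out.tfevents.1729668200.datavisu2",
    "events.out.tfevents.1729797250.datavisu2",
):
    print(name, "->", tfevents_start_time(name).isoformat())
```

The two files decode to 2024-10-23 07:23:20 UTC and 2024-10-24 19:14:10 UTC. These dates match the `2024_10_23` in the model name and the `2024_10_24` in the commit message.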
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:07252dbdb9651a94f83dfdada813b65cd871954214e2a520f9266eda636c6416
 size 1222680260
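The diff for `model.safetensors` shows only a Git LFS pointer, not the weights: three `key value` lines giving the spec version, the SHA-256 object ID, and the size in bytes. Only the `oid` changes here and the size stays the same, so the weights were replaced by a file of identical length. A minimal sketch of reading such a pointer follows. The function name `parse_lfs_pointer` and the returned dictionary layout are assumptions for illustration.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into a dict with version, oid_algo, oid, and size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")  # each line is "<key> <value>"
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")  # e.g. "sha256:<hex digest>"
    return {
        "version": fields["version"],
        "oid_algo": algo,
        "oid": digest,
        "size": int(fields["size"]),
    }


# New side of the model.safetensors pointer from this commit.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:07252dbdb9651a94f83dfdada813b65cd871954214e2a520f9266eda636c6416
size 1222680260
"""

info = parse_lfs_pointer(POINTER)
print(info["oid_algo"], len(info["oid"]), f"{info['size'] / 1e9:.2f} GB")
```

The pointer describes a file of about 1.22 GB identified by a 64-character hexadecimal digest.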
test_results.json
ADDED
@@ -0,0 +1,11 @@
+{
+    "epoch": 112.0,
+    "eval_accuracy": 0.7613533408833522,
+    "eval_f1_macro": 0.4268685036536852,
+    "eval_f1_micro": 0.7613533408833522,
+    "eval_loss": 0.6806153655052185,
+    "eval_runtime": 235.8653,
+    "eval_samples_per_second": 149.746,
+    "eval_steps_per_second": 4.681,
+    "learning_rate": 1.0000000000000002e-07
+}
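The throughput fields allow a rough estimate of how many samples were evaluated. Multiplying the runtime by the per-second rates gives approximate sample and batch counts, and their ratio should land near the `eval_batch_size: 32` from the model card. The sketch below does that arithmetic. The results are approximations, because the logged rates are rounded to three decimals.

```python
import math

# Values copied from test_results.json.
EVAL_RUNTIME_S = 235.8653
EVAL_SAMPLES_PER_S = 149.746
EVAL_STEPS_PER_S = 4.681
EVAL_BATCH_SIZE = 32  # eval_batch_size from the model card


def implied_counts(runtime_s, samples_per_s, steps_per_s):
    """Return the (samples, steps) implied by a runtime and the per-second rates."""
    return round(runtime_s * samples_per_s), round(runtime_s * steps_per_s)


samples, steps = implied_counts(EVAL_RUNTIME_S, EVAL_SAMPLES_PER_S, EVAL_STEPS_PER_S)
print(f"about {samples} samples in {steps} batches")
print("batches needed at batch size 32:", math.ceil(samples / EVAL_BATCH_SIZE))
```

Both routes give 1104 batches for roughly 35,320 samples, so the reported rates are internally consistent with a batch size of 32.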
train_results.json
ADDED
@@ -0,0 +1,9 @@
+{
+    "epoch": 112.0,
+    "learning_rate": 1.0000000000000002e-07,
+    "total_flos": 6.926864611971372e+20,
+    "train_loss": 0.8085850866355923,
+    "train_runtime": 128547.5283,
+    "train_samples_per_second": 123.643,
+    "train_steps_per_second": 3.865
+}
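Two quick checks help when reading these numbers: converting `train_runtime` into hours, and comparing the step count implied by `train_steps_per_second` with the steps actually logged. The sketch below uses 3312 steps per epoch (the step count after epoch 1.0 in the training results table), `num_epochs: 150` from the model card, and the 112 epochs reported here. The helper name `split_runtime` is illustrative.

```python
# Values copied from train_results.json and the model card.
TRAIN_RUNTIME_S = 128547.5283   # train_runtime
TRAIN_STEPS_PER_S = 3.865       # train_steps_per_second
STEPS_PER_EPOCH = 3312          # step count after epoch 1.0 in the results table
PLANNED_EPOCHS = 150            # num_epochs
COMPLETED_EPOCHS = 112          # epoch


def split_runtime(seconds):
    """Split a duration in seconds into whole (hours, minutes, seconds)."""
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    return hours, minutes, secs


implied_steps = TRAIN_STEPS_PER_S * TRAIN_RUNTIME_S
planned_steps = PLANNED_EPOCHS * STEPS_PER_EPOCH
completed_steps = COMPLETED_EPOCHS * STEPS_PER_EPOCH

print("runtime (h, m, s):", split_runtime(TRAIN_RUNTIME_S))
print("steps implied by train_steps_per_second:", round(implied_steps))
print("planned steps:", planned_steps, "| completed steps:", completed_steps)
print("steps/s from completed steps:", round(completed_steps / TRAIN_RUNTIME_S, 3))
```

The run lasted about 35 h 42 min. The implied step count (about 496,836) is close to the 496,800 steps of the planned 150 epochs and well above the 370,944 steps completed. This suggests the reported throughput was computed against the planned budget and not the 112 epochs actually run, which would put the effective rate nearer 2.9 steps per second. That reading is an inference from the numbers in this commit, not something the files confirm.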
trainer_state.json
ADDED
The diff for this file is too large to render.