End of training
README.md
CHANGED
@@ -18,15 +18,15 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [AI-Sweden-Models/gpt-sw3-1.3b](https://huggingface.co/AI-Sweden-Models/gpt-sw3-1.3b) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.
-- Rewards/chosen: 0.
-- Rewards/rejected: -0.
 - Rewards/accuracies: 0.9933
-- Rewards/margins: 0.
-- Logps/rejected: -226.
-- Logps/chosen: -132.
-- Logits/rejected: -3.
-- Logits/chosen: -3.

 ## Model description

@@ -59,21 +59,21 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
 |:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.


 ### Framework versions

@@ -18,15 +18,15 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [AI-Sweden-Models/gpt-sw3-1.3b](https://huggingface.co/AI-Sweden-Models/gpt-sw3-1.3b) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.4632
+- Rewards/chosen: 0.0871
+- Rewards/rejected: -0.4538
 - Rewards/accuracies: 0.9933
+- Rewards/margins: 0.5408
+- Logps/rejected: -226.9819
+- Logps/chosen: -132.0129
+- Logits/rejected: -3.2476
+- Logits/chosen: -3.3966

 ## Model description

@@ -59,21 +59,21 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
 |:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
+| 0.6689 | 0.2 | 50 | 0.6705 | 0.0068 | -0.0395 | 0.8633 | 0.0463 | -222.8389 | -132.8152 | -3.2702 | -3.4124 |
+| 0.6349 | 0.4 | 100 | 0.6364 | 0.0198 | -0.0980 | 0.9933 | 0.1178 | -223.4240 | -132.6853 | -3.2678 | -3.4111 |
+| 0.5959 | 0.6 | 150 | 0.5973 | 0.0365 | -0.1672 | 0.9933 | 0.2037 | -224.1166 | -132.5191 | -3.2657 | -3.4100 |
+| 0.551 | 0.79 | 200 | 0.5571 | 0.0538 | -0.2436 | 0.9900 | 0.2974 | -224.8797 | -132.3453 | -3.2629 | -3.4084 |
+| 0.4962 | 0.99 | 250 | 0.5159 | 0.0700 | -0.3294 | 0.9933 | 0.3994 | -225.7385 | -132.1836 | -3.2570 | -3.4041 |
+| 0.478 | 1.19 | 300 | 0.4843 | 0.0827 | -0.3994 | 0.9933 | 0.4820 | -226.4377 | -132.0566 | -3.2528 | -3.4007 |
+| 0.4614 | 1.39 | 350 | 0.4722 | 0.0835 | -0.4321 | 0.9933 | 0.5156 | -226.7651 | -132.0486 | -3.2504 | -3.3989 |
+| 0.4412 | 1.59 | 400 | 0.4670 | 0.0858 | -0.4442 | 0.9933 | 0.5300 | -226.8861 | -132.0253 | -3.2496 | -3.3983 |
+| 0.4541 | 1.79 | 450 | 0.4636 | 0.0877 | -0.4519 | 0.9933 | 0.5396 | -226.9627 | -132.0064 | -3.2484 | -3.3972 |
+| 0.4468 | 1.99 | 500 | 0.4630 | 0.0842 | -0.4572 | 0.9933 | 0.5414 | -227.0158 | -132.0411 | -3.2483 | -3.3972 |
+| 0.4416 | 2.19 | 550 | 0.4629 | 0.0861 | -0.4557 | 0.9933 | 0.5418 | -227.0012 | -132.0224 | -3.2481 | -3.3970 |
+| 0.4601 | 2.38 | 600 | 0.4616 | 0.0857 | -0.4595 | 0.9933 | 0.5452 | -227.0392 | -132.0268 | -3.2482 | -3.3971 |
+| 0.4521 | 2.58 | 650 | 0.4623 | 0.0857 | -0.4580 | 0.9933 | 0.5437 | -227.0243 | -132.0269 | -3.2481 | -3.3970 |
+| 0.4375 | 2.78 | 700 | 0.4623 | 0.0860 | -0.4577 | 0.9933 | 0.5437 | -227.0213 | -132.0234 | -3.2480 | -3.3969 |
+| 0.4442 | 2.98 | 750 | 0.4632 | 0.0871 | -0.4538 | 0.9933 | 0.5408 | -226.9819 | -132.0129 | -3.2476 | -3.3966 |


 ### Framework versions