thorirhrafn committed on
Commit 8761f33 · verified · 1 Parent(s): b13aa82

End of training

Files changed (1)
  1. README.md +23 -23
README.md CHANGED
@@ -18,15 +18,15 @@ should probably proofread and complete it, then remove this comment. -->

  This model is a fine-tuned version of [AI-Sweden-Models/gpt-sw3-1.3b](https://huggingface.co/AI-Sweden-Models/gpt-sw3-1.3b) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Loss: 0.4639
- - Rewards/chosen: 0.0853
- - Rewards/rejected: -0.4537
+ - Loss: 0.4632
+ - Rewards/chosen: 0.0871
+ - Rewards/rejected: -0.4538
  - Rewards/accuracies: 0.9933
- - Rewards/margins: 0.5390
- - Logps/rejected: -226.9816
- - Logps/chosen: -132.0306
- - Logits/rejected: -3.2482
- - Logits/chosen: -3.3971
+ - Rewards/margins: 0.5408
+ - Logps/rejected: -226.9819
+ - Logps/chosen: -132.0129
+ - Logits/rejected: -3.2476
+ - Logits/chosen: -3.3966

  ## Model description

@@ -59,21 +59,21 @@ The following hyperparameters were used during training:

  | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
  |:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
- | 0.6692 | 0.2 | 50 | 0.6705 | 0.0068 | -0.0395 | 0.8600 | 0.0463 | -222.8391 | -132.8156 | -3.2705 | -3.4127 |
- | 0.6348 | 0.4 | 100 | 0.6356 | 0.0214 | -0.0982 | 0.9833 | 0.1195 | -223.4260 | -132.6699 | -3.2676 | -3.4109 |
- | 0.5963 | 0.6 | 150 | 0.5978 | 0.0351 | -0.1675 | 0.9867 | 0.2026 | -224.1192 | -132.5324 | -3.2663 | -3.4105 |
- | 0.5528 | 0.79 | 200 | 0.5577 | 0.0555 | -0.2405 | 0.9933 | 0.2959 | -224.8489 | -132.3290 | -3.2632 | -3.4086 |
- | 0.4959 | 0.99 | 250 | 0.5175 | 0.0694 | -0.3264 | 0.9933 | 0.3958 | -225.7085 | -132.1899 | -3.2575 | -3.4044 |
- | 0.4811 | 1.19 | 300 | 0.4873 | 0.0823 | -0.3919 | 0.9933 | 0.4742 | -226.3629 | -132.0607 | -3.2526 | -3.4004 |
- | 0.4644 | 1.39 | 350 | 0.4759 | 0.0827 | -0.4233 | 0.9933 | 0.5061 | -226.6773 | -132.0561 | -3.2501 | -3.3986 |
- | 0.4421 | 1.59 | 400 | 0.4693 | 0.0848 | -0.4392 | 0.9933 | 0.5240 | -226.8362 | -132.0353 | -3.2493 | -3.3980 |
- | 0.4561 | 1.79 | 450 | 0.4659 | 0.0858 | -0.4475 | 0.9933 | 0.5333 | -226.9191 | -132.0255 | -3.2484 | -3.3974 |
- | 0.449 | 1.99 | 500 | 0.4655 | 0.0851 | -0.4498 | 0.9933 | 0.5349 | -226.9426 | -132.0327 | -3.2483 | -3.3973 |
- | 0.4456 | 2.19 | 550 | 0.4646 | 0.0876 | -0.4492 | 0.9933 | 0.5367 | -226.9357 | -132.0079 | -3.2481 | -3.3970 |
- | 0.4622 | 2.38 | 600 | 0.4641 | 0.0853 | -0.4527 | 0.9967 | 0.5379 | -226.9708 | -132.0308 | -3.2480 | -3.3969 |
- | 0.4527 | 2.58 | 650 | 0.4637 | 0.0884 | -0.4510 | 0.9933 | 0.5394 | -226.9547 | -132.0001 | -3.2478 | -3.3968 |
- | 0.4415 | 2.78 | 700 | 0.4647 | 0.0856 | -0.4511 | 0.9967 | 0.5367 | -226.9554 | -132.0279 | -3.2478 | -3.3968 |
- | 0.4432 | 2.98 | 750 | 0.4639 | 0.0853 | -0.4537 | 0.9933 | 0.5390 | -226.9816 | -132.0306 | -3.2482 | -3.3971 |
+ | 0.6689 | 0.2 | 50 | 0.6705 | 0.0068 | -0.0395 | 0.8633 | 0.0463 | -222.8389 | -132.8152 | -3.2702 | -3.4124 |
+ | 0.6349 | 0.4 | 100 | 0.6364 | 0.0198 | -0.0980 | 0.9933 | 0.1178 | -223.4240 | -132.6853 | -3.2678 | -3.4111 |
+ | 0.5959 | 0.6 | 150 | 0.5973 | 0.0365 | -0.1672 | 0.9933 | 0.2037 | -224.1166 | -132.5191 | -3.2657 | -3.4100 |
+ | 0.551 | 0.79 | 200 | 0.5571 | 0.0538 | -0.2436 | 0.9900 | 0.2974 | -224.8797 | -132.3453 | -3.2629 | -3.4084 |
+ | 0.4962 | 0.99 | 250 | 0.5159 | 0.0700 | -0.3294 | 0.9933 | 0.3994 | -225.7385 | -132.1836 | -3.2570 | -3.4041 |
+ | 0.478 | 1.19 | 300 | 0.4843 | 0.0827 | -0.3994 | 0.9933 | 0.4820 | -226.4377 | -132.0566 | -3.2528 | -3.4007 |
+ | 0.4614 | 1.39 | 350 | 0.4722 | 0.0835 | -0.4321 | 0.9933 | 0.5156 | -226.7651 | -132.0486 | -3.2504 | -3.3989 |
+ | 0.4412 | 1.59 | 400 | 0.4670 | 0.0858 | -0.4442 | 0.9933 | 0.5300 | -226.8861 | -132.0253 | -3.2496 | -3.3983 |
+ | 0.4541 | 1.79 | 450 | 0.4636 | 0.0877 | -0.4519 | 0.9933 | 0.5396 | -226.9627 | -132.0064 | -3.2484 | -3.3972 |
+ | 0.4468 | 1.99 | 500 | 0.4630 | 0.0842 | -0.4572 | 0.9933 | 0.5414 | -227.0158 | -132.0411 | -3.2483 | -3.3972 |
+ | 0.4416 | 2.19 | 550 | 0.4629 | 0.0861 | -0.4557 | 0.9933 | 0.5418 | -227.0012 | -132.0224 | -3.2481 | -3.3970 |
+ | 0.4601 | 2.38 | 600 | 0.4616 | 0.0857 | -0.4595 | 0.9933 | 0.5452 | -227.0392 | -132.0268 | -3.2482 | -3.3971 |
+ | 0.4521 | 2.58 | 650 | 0.4623 | 0.0857 | -0.4580 | 0.9933 | 0.5437 | -227.0243 | -132.0269 | -3.2481 | -3.3970 |
+ | 0.4375 | 2.78 | 700 | 0.4623 | 0.0860 | -0.4577 | 0.9933 | 0.5437 | -227.0213 | -132.0234 | -3.2480 | -3.3969 |
+ | 0.4442 | 2.98 | 750 | 0.4632 | 0.0871 | -0.4538 | 0.9933 | 0.5408 | -226.9819 | -132.0129 | -3.2476 | -3.3966 |


  ### Framework versions