tsavage68 committed
Commit dd4668f · verified · 1 Parent(s): 4a5c684

End of training
README.md CHANGED
@@ -17,15 +17,15 @@ should probably proofread and complete it, then remove this comment. -->
 
  This model is a fine-tuned version of [tsavage68/UTI_M2_1000steps_1e7rate_SFT](https://huggingface.co/tsavage68/UTI_M2_1000steps_1e7rate_SFT) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Loss: 0.6931
- - Rewards/chosen: 0.0
- - Rewards/rejected: 0.0
- - Rewards/accuracies: 0.0
- - Rewards/margins: 0.0
- - Logps/rejected: 0.0
- - Logps/chosen: 0.0
- - Logits/rejected: -2.7147
- - Logits/chosen: -2.7147
+ - Loss: 0.5476
+ - Rewards/chosen: 0.0699
+ - Rewards/rejected: -3.0830
+ - Rewards/accuracies: 0.2100
+ - Rewards/margins: 3.1530
+ - Logps/rejected: -15.5400
+ - Logps/chosen: -4.4026
+ - Logits/rejected: -2.6403
+ - Logits/chosen: -2.6398
 
  ## Model description
 
@@ -59,46 +59,46 @@ The following hyperparameters were used during training:
 
  | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
  |:-------------:|:-------:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
- | 0.6931 | 0.3333 | 25 | 0.6931 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -2.7147 | -2.7147 |
- | 0.6931 | 0.6667 | 50 | 0.6931 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -2.7147 | -2.7147 |
- | 0.6931 | 1.0 | 75 | 0.6931 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -2.7147 | -2.7147 |
- | 0.6931 | 1.3333 | 100 | 0.6931 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -2.7147 | -2.7147 |
- | 0.6931 | 1.6667 | 125 | 0.6931 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -2.7147 | -2.7147 |
- | 0.6931 | 2.0 | 150 | 0.6931 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -2.7147 | -2.7147 |
- | 0.6931 | 2.3333 | 175 | 0.6931 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -2.7147 | -2.7147 |
- | 0.6931 | 2.6667 | 200 | 0.6931 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -2.7147 | -2.7147 |
- | 0.6931 | 3.0 | 225 | 0.6931 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -2.7147 | -2.7147 |
- | 0.6931 | 3.3333 | 250 | 0.6931 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -2.7147 | -2.7147 |
- | 0.6931 | 3.6667 | 275 | 0.6931 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -2.7147 | -2.7147 |
- | 0.6931 | 4.0 | 300 | 0.6931 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -2.7147 | -2.7147 |
- | 0.6931 | 4.3333 | 325 | 0.6931 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -2.7147 | -2.7147 |
- | 0.6931 | 4.6667 | 350 | 0.6931 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -2.7147 | -2.7147 |
- | 0.6931 | 5.0 | 375 | 0.6931 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -2.7147 | -2.7147 |
- | 0.6931 | 5.3333 | 400 | 0.6931 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -2.7147 | -2.7147 |
- | 0.6931 | 5.6667 | 425 | 0.6931 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -2.7147 | -2.7147 |
- | 0.6931 | 6.0 | 450 | 0.6931 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -2.7147 | -2.7147 |
- | 0.6931 | 6.3333 | 475 | 0.6931 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -2.7147 | -2.7147 |
- | 0.6931 | 6.6667 | 500 | 0.6931 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -2.7147 | -2.7147 |
- | 0.6931 | 7.0 | 525 | 0.6931 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -2.7147 | -2.7147 |
- | 0.6931 | 7.3333 | 550 | 0.6931 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -2.7147 | -2.7147 |
- | 0.6931 | 7.6667 | 575 | 0.6931 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -2.7147 | -2.7147 |
- | 0.6931 | 8.0 | 600 | 0.6931 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -2.7147 | -2.7147 |
- | 0.6931 | 8.3333 | 625 | 0.6931 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -2.7147 | -2.7147 |
- | 0.6931 | 8.6667 | 650 | 0.6931 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -2.7147 | -2.7147 |
- | 0.6931 | 9.0 | 675 | 0.6931 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -2.7147 | -2.7147 |
- | 0.6931 | 9.3333 | 700 | 0.6931 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -2.7147 | -2.7147 |
- | 0.6931 | 9.6667 | 725 | 0.6931 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -2.7147 | -2.7147 |
- | 0.6931 | 10.0 | 750 | 0.6931 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -2.7147 | -2.7147 |
- | 0.6931 | 10.3333 | 775 | 0.6931 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -2.7147 | -2.7147 |
- | 0.6931 | 10.6667 | 800 | 0.6931 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -2.7147 | -2.7147 |
- | 0.6931 | 11.0 | 825 | 0.6931 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -2.7147 | -2.7147 |
- | 0.6931 | 11.3333 | 850 | 0.6931 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -2.7147 | -2.7147 |
- | 0.6931 | 11.6667 | 875 | 0.6931 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -2.7147 | -2.7147 |
- | 0.6931 | 12.0 | 900 | 0.6931 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -2.7147 | -2.7147 |
- | 0.6931 | 12.3333 | 925 | 0.6931 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -2.7147 | -2.7147 |
- | 0.6931 | 12.6667 | 950 | 0.6931 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -2.7147 | -2.7147 |
- | 0.6931 | 13.0 | 975 | 0.6931 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -2.7147 | -2.7147 |
- | 0.6931 | 13.3333 | 1000 | 0.6931 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -2.7147 | -2.7147 |
+ | 0.7009 | 0.3333 | 25 | 0.6667 | 0.0016 | -0.0631 | 0.1900 | 0.0647 | -9.5002 | -4.5393 | -2.7063 | -2.7055 |
+ | 0.5794 | 0.6667 | 50 | 0.5690 | 0.0605 | -0.5674 | 0.2100 | 0.6279 | -10.5088 | -4.4215 | -2.6950 | -2.6942 |
+ | 0.5772 | 1.0 | 75 | 0.5638 | -0.0374 | -1.8778 | 0.2000 | 1.8404 | -13.1295 | -4.6173 | -2.6653 | -2.6647 |
+ | 0.5715 | 1.3333 | 100 | 0.5485 | 0.0321 | -2.2707 | 0.2100 | 2.3028 | -13.9154 | -4.4783 | -2.6560 | -2.6555 |
+ | 0.5545 | 1.6667 | 125 | 0.5476 | 0.1013 | -2.5349 | 0.2100 | 2.6363 | -14.4438 | -4.3398 | -2.6499 | -2.6494 |
+ | 0.5545 | 2.0 | 150 | 0.5476 | 0.0902 | -2.9376 | 0.2100 | 3.0278 | -15.2492 | -4.3621 | -2.6442 | -2.6437 |
+ | 0.5545 | 2.3333 | 175 | 0.5476 | 0.0846 | -2.9244 | 0.2100 | 3.0090 | -15.2229 | -4.3733 | -2.6424 | -2.6419 |
+ | 0.4852 | 2.6667 | 200 | 0.5476 | 0.0848 | -2.9648 | 0.2100 | 3.0495 | -15.3035 | -4.3729 | -2.6423 | -2.6417 |
+ | 0.6412 | 3.0 | 225 | 0.5476 | 0.0853 | -2.9694 | 0.2100 | 3.0547 | -15.3127 | -4.3718 | -2.6421 | -2.6415 |
+ | 0.5545 | 3.3333 | 250 | 0.5476 | 0.0892 | -2.9671 | 0.2100 | 3.0563 | -15.3081 | -4.3640 | -2.6429 | -2.6424 |
+ | 0.5372 | 3.6667 | 275 | 0.5476 | 0.0803 | -2.9507 | 0.2100 | 3.0310 | -15.2754 | -4.3819 | -2.6416 | -2.6410 |
+ | 0.5892 | 4.0 | 300 | 0.5476 | 0.0791 | -3.0080 | 0.2100 | 3.0871 | -15.3899 | -4.3842 | -2.6421 | -2.6415 |
+ | 0.4679 | 4.3333 | 325 | 0.5476 | 0.0770 | -3.0043 | 0.2100 | 3.0814 | -15.3826 | -4.3884 | -2.6420 | -2.6415 |
+ | 0.5718 | 4.6667 | 350 | 0.5476 | 0.0767 | -3.0040 | 0.2100 | 3.0808 | -15.3820 | -4.3890 | -2.6414 | -2.6409 |
+ | 0.5199 | 5.0 | 375 | 0.5476 | 0.0830 | -3.0444 | 0.2100 | 3.1274 | -15.4628 | -4.3765 | -2.6415 | -2.6410 |
+ | 0.5025 | 5.3333 | 400 | 0.5476 | 0.0784 | -3.0520 | 0.2100 | 3.1304 | -15.4779 | -4.3857 | -2.6406 | -2.6401 |
+ | 0.5199 | 5.6667 | 425 | 0.5476 | 0.0772 | -3.0417 | 0.2100 | 3.1189 | -15.4575 | -4.3882 | -2.6418 | -2.6412 |
+ | 0.5025 | 6.0 | 450 | 0.5476 | 0.0775 | -3.0690 | 0.2100 | 3.1465 | -15.5119 | -4.3875 | -2.6403 | -2.6398 |
+ | 0.5718 | 6.3333 | 475 | 0.5476 | 0.0722 | -3.0608 | 0.2100 | 3.1330 | -15.4956 | -4.3980 | -2.6403 | -2.6398 |
+ | 0.5718 | 6.6667 | 500 | 0.5476 | 0.0733 | -3.0661 | 0.2100 | 3.1394 | -15.5061 | -4.3958 | -2.6403 | -2.6397 |
+ | 0.5025 | 7.0 | 525 | 0.5476 | 0.0687 | -3.0692 | 0.2100 | 3.1379 | -15.5123 | -4.4051 | -2.6407 | -2.6402 |
+ | 0.5199 | 7.3333 | 550 | 0.5476 | 0.0691 | -3.0762 | 0.2100 | 3.1454 | -15.5265 | -4.4042 | -2.6401 | -2.6396 |
+ | 0.5372 | 7.6667 | 575 | 0.5476 | 0.0728 | -3.0945 | 0.2100 | 3.1672 | -15.5629 | -4.3970 | -2.6414 | -2.6409 |
+ | 0.5718 | 8.0 | 600 | 0.5476 | 0.0736 | -3.0806 | 0.2100 | 3.1541 | -15.5351 | -4.3953 | -2.6405 | -2.6400 |
+ | 0.5372 | 8.3333 | 625 | 0.5476 | 0.0806 | -3.0954 | 0.2100 | 3.1759 | -15.5647 | -4.3813 | -2.6410 | -2.6405 |
+ | 0.4332 | 8.6667 | 650 | 0.5476 | 0.0762 | -3.0922 | 0.2100 | 3.1684 | -15.5583 | -4.3900 | -2.6412 | -2.6407 |
+ | 0.5372 | 9.0 | 675 | 0.5476 | 0.0738 | -3.0924 | 0.2100 | 3.1662 | -15.5587 | -4.3948 | -2.6408 | -2.6403 |
+ | 0.5025 | 9.3333 | 700 | 0.5476 | 0.0702 | -3.0892 | 0.2100 | 3.1594 | -15.5524 | -4.4020 | -2.6405 | -2.6400 |
+ | 0.5025 | 9.6667 | 725 | 0.5476 | 0.0641 | -3.0956 | 0.2100 | 3.1597 | -15.5651 | -4.4142 | -2.6410 | -2.6405 |
+ | 0.5892 | 10.0 | 750 | 0.5476 | 0.0696 | -3.0933 | 0.2100 | 3.1630 | -15.5606 | -4.4032 | -2.6403 | -2.6398 |
+ | 0.5199 | 10.3333 | 775 | 0.5476 | 0.0764 | -3.0810 | 0.2100 | 3.1574 | -15.5361 | -4.3897 | -2.6404 | -2.6399 |
+ | 0.5199 | 10.6667 | 800 | 0.5476 | 0.0750 | -3.0945 | 0.2100 | 3.1695 | -15.5629 | -4.3925 | -2.6399 | -2.6394 |
+ | 0.5372 | 11.0 | 825 | 0.5477 | 0.0727 | -3.0777 | 0.2100 | 3.1504 | -15.5293 | -4.3970 | -2.6405 | -2.6399 |
+ | 0.5199 | 11.3333 | 850 | 0.5477 | 0.0760 | -3.0775 | 0.2100 | 3.1534 | -15.5289 | -4.3905 | -2.6402 | -2.6397 |
+ | 0.6065 | 11.6667 | 875 | 0.5476 | 0.0737 | -3.0877 | 0.2100 | 3.1615 | -15.5495 | -4.3950 | -2.6404 | -2.6398 |
+ | 0.5718 | 12.0 | 900 | 0.5476 | 0.0713 | -3.0915 | 0.2100 | 3.1628 | -15.5570 | -4.3999 | -2.6403 | -2.6398 |
+ | 0.4159 | 12.3333 | 925 | 0.5476 | 0.0687 | -3.0820 | 0.2100 | 3.1507 | -15.5379 | -4.4051 | -2.6403 | -2.6398 |
+ | 0.6238 | 12.6667 | 950 | 0.5476 | 0.0699 | -3.0830 | 0.2100 | 3.1530 | -15.5400 | -4.4026 | -2.6403 | -2.6398 |
+ | 0.6065 | 13.0 | 975 | 0.5476 | 0.0699 | -3.0830 | 0.2100 | 3.1530 | -15.5400 | -4.4026 | -2.6403 | -2.6398 |
+ | 0.5025 | 13.3333 | 1000 | 0.5476 | 0.0699 | -3.0830 | 0.2100 | 3.1530 | -15.5400 | -4.4026 | -2.6403 | -2.6398 |
 
 
  ### Framework versions
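The metric columns in the README diff are the standard DPO preference-training quantities: rewards are β-scaled log-probability ratios against the SFT reference model, and the loss is the negative log-sigmoid of the chosen-minus-rejected margin. A minimal sketch of that relationship (the formula is the generic DPO objective, not code from this repository, and the β used in this run is not stated in the diff):

```python
import math

def dpo_loss(chosen_reward: float, rejected_reward: float) -> float:
    """Per-example DPO loss: -log(sigmoid(margin)).

    Rewards are assumed to already include the beta scaling,
    as in the README's Rewards/chosen and Rewards/rejected columns.
    """
    margin = chosen_reward - rejected_reward
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Old README (before this commit): all rewards are 0.0, so the margin is 0
# and the loss is pinned at -log(0.5) = ln 2 ~ 0.6931 -- the signature of a
# DPO model that has not yet moved away from its reference.
print(round(dpo_loss(0.0, 0.0), 4))  # 0.6931
```

Note that the new evaluation loss of 0.5476 is an average over examples, not the loss at the mean margin of 3.1530; with only 0.21 reward accuracy, a minority of large positive margins can coexist with many small or negative ones.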
final_checkpoint/model-00001-of-00003.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:9aa2e9687a5e5d24a999a996e9fe4c2bc1cf34ad347da5dc5c7e0adffcb14982
+ oid sha256:055596a77af9659da7d9951fb3f75af45721eb5b2733ae8b926220c9486f6f4e
  size 4943162240
final_checkpoint/model-00002-of-00003.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:268bb18cc8bbff53c912fa3961a6281dd5c163edd1b8e5c85c9b12e87e4e3a63
+ oid sha256:6eab27f229973937f2de1d94236003e583e3eeb20ef32700ef02e9c8b0bb1703
  size 4999819232
final_checkpoint/model-00003-of-00003.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:bbc021dcf68d9e7ddaab0ead255721e73b7f652e3bfd34985bba6c029e0b729c
+ oid sha256:7fb19c1138598d76cc73ff424bdd75abc5ded37d3446ab091e303c3a1e22d2f7
  size 4540516256
model-00001-of-00003.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:9aa2e9687a5e5d24a999a996e9fe4c2bc1cf34ad347da5dc5c7e0adffcb14982
+ oid sha256:055596a77af9659da7d9951fb3f75af45721eb5b2733ae8b926220c9486f6f4e
  size 4943162240
model-00002-of-00003.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:268bb18cc8bbff53c912fa3961a6281dd5c163edd1b8e5c85c9b12e87e4e3a63
+ oid sha256:6eab27f229973937f2de1d94236003e583e3eeb20ef32700ef02e9c8b0bb1703
  size 4999819232
model-00003-of-00003.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:bbc021dcf68d9e7ddaab0ead255721e73b7f652e3bfd34985bba6c029e0b729c
+ oid sha256:7fb19c1138598d76cc73ff424bdd75abc5ded37d3446ab091e303c3a1e22d2f7
  size 4540516256
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:91c299c1029aeb4a9610760559d7581036fc79df156c10b8ddf53908122495f9
+ oid sha256:e53d741e157840027466f7d5de111d0f3d8a1350974d7796488b54e8c05605c8
  size 4667
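Every safetensors and training_args.bin change above touches only the Git LFS pointer stored in the repository: the `oid` moves to the new checkpoint's hash while the `size` stays the same (the retrained weights happen to serialize to identical byte counts). Per the git-lfs spec a pointer is just three `key value` lines, so it can be inspected trivially; `parse_lfs_pointer` below is a hypothetical helper, not a git-lfs API:

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its key/value fields.

    A pointer has the form:
        version https://git-lfs.github.com/spec/v1
        oid sha256:<hex digest>
        size <bytes>
    """
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")  # split on the first space only
        fields[key] = value
    return fields

pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:055596a77af9659da7d9951fb3f75af45721eb5b2733ae8b926220c9486f6f4e
size 4943162240"""

info = parse_lfs_pointer(pointer)
print(info["oid"][:7], info["size"])  # sha256: 4943162240
```

Comparing the `oid` field of the pointer before and after a commit is enough to tell whether the large file itself changed, without downloading it.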