tinyllama-1.1b-sum-dpo-full_LR1e-7_3epochs_old

This model is a fine-tuned version of martimfasantos/tinyllama-1.1b-sum-sft-full_old on the openai/summarize_from_feedback dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6382
  • Rewards/chosen: -0.8614
  • Rewards/rejected: -1.0551
  • Rewards/accuracies: 0.6341
  • Rewards/margins: 0.1937
  • Logps/rejected: -168.6898
  • Logps/chosen: -144.8481
  • Logits/rejected: -2.0951
  • Logits/chosen: -2.1077
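
The reward figures above are related by simple arithmetic: the margin is the chosen reward minus the rejected reward. The sketch below checks that against the reported numbers. It also evaluates a sigmoid-style DPO loss at a margin of zero, which gives ln 2 ≈ 0.6931, the value the loss starts from in the training log. The `dpo_loss` helper is illustrative and assumes the standard sigmoid DPO objective with the logged rewards already scaled by beta; the card itself does not state which loss variant or beta was used.

```python
import math

# Final evaluation metrics copied from the list above.
rewards_chosen = -0.8614
rewards_rejected = -1.0551

# Rewards/margins is the gap between the chosen and rejected rewards.
margin = rewards_chosen - rewards_rejected
print(f"margin = {margin:.4f}")  # 0.1937, matching Rewards/margins


def dpo_loss(reward_margin: float) -> float:
    """Per-example sigmoid DPO loss, -log(sigmoid(margin)).

    Assumption: standard sigmoid DPO loss; not confirmed by the card.
    """
    return math.log1p(math.exp(-reward_margin))


# With no preference signal (margin 0) the loss is ln 2.
print(f"loss at zero margin = {dpo_loss(0.0):.4f}")  # 0.6931
```

The reported evaluation loss is an average of per-example losses, so it does not have to equal `dpo_loss` applied to the average margin.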

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-07
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 3
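
To make the learning-rate settings concrete, here is a minimal sketch of a linear warmup followed by cosine decay, using the learning rate and warmup ratio listed above. The total step count of 17412 is an estimate derived from the training log (step 17400 corresponds to epoch 2.9979), and the function `lr_at` is a hypothetical helper, not code from the training run. The exact scheduler implementation used in training may differ in details.

```python
import math


def lr_at(step: int,
          base_lr: float = 1e-7,      # learning_rate from the list above
          total_steps: int = 17412,   # assumption: estimated from the log
          warmup_ratio: float = 0.1   # lr_scheduler_warmup_ratio
          ) -> float:
    """Learning rate at a given optimizer step: linear warmup, then cosine decay to 0."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Half-cosine decay from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


for s in (0, 1000, 1742, 9000, 17412):
    print(s, lr_at(s))
```

Under these assumptions the rate peaks at 1e-07 after roughly the first 1,742 steps, which lines up with the log: the validation loss barely moves from 0.6932 during the first thousand steps.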

Training results

Training Loss Epoch Step Validation Loss Rewards/chosen Rewards/rejected Rewards/accuracies Rewards/margins Logps/rejected Logps/chosen Logits/rejected Logits/chosen
0.6931 0.0172 100 0.6932 -0.0000 0.0000 0.4993 -0.0000 -63.1760 -58.7121 -3.1570 -3.1626
0.6932 0.0345 200 0.6932 -0.0000 0.0000 0.4902 -0.0001 -63.1777 -58.7161 -3.1578 -3.1634
0.6932 0.0517 300 0.6932 0.0001 0.0001 0.4847 -0.0001 -63.1684 -58.7055 -3.1576 -3.1633
0.6932 0.0689 400 0.6932 0.0001 0.0001 0.4814 -0.0001 -63.1658 -58.7068 -3.1575 -3.1631
0.6931 0.0861 500 0.6932 0.0001 0.0001 0.4847 -0.0000 -63.1715 -58.7052 -3.1577 -3.1633
0.6929 0.1034 600 0.6931 0.0002 0.0002 0.5037 0.0000 -63.1560 -58.6876 -3.1571 -3.1628
0.693 0.1206 700 0.6931 0.0003 0.0001 0.5214 0.0002 -63.1660 -58.6822 -3.1562 -3.1619
0.6927 0.1378 800 0.6931 0.0006 0.0005 0.5204 0.0001 -63.1322 -58.6491 -3.1561 -3.1618
0.6927 0.1551 900 0.6930 0.0008 0.0005 0.5300 0.0003 -63.1317 -58.6345 -3.1554 -3.1610
0.6928 0.1723 1000 0.6930 0.0011 0.0007 0.5258 0.0003 -63.1075 -58.6060 -3.1540 -3.1596
0.6922 0.1895 1100 0.6929 0.0013 0.0007 0.5455 0.0006 -63.1103 -58.5820 -3.1523 -3.1579
0.6921 0.2068 1200 0.6927 0.0017 0.0008 0.5574 0.0009 -63.1011 -58.5416 -3.1500 -3.1556
0.692 0.2240 1300 0.6925 0.0020 0.0007 0.5599 0.0013 -63.1123 -58.5097 -3.1479 -3.1535
0.6898 0.2412 1400 0.6923 0.0021 0.0002 0.5743 0.0018 -63.1581 -58.5058 -3.1443 -3.1500
0.6889 0.2584 1500 0.6920 0.0017 -0.0007 0.5827 0.0024 -63.2512 -58.5426 -3.1406 -3.1462
0.69 0.2757 1600 0.6917 0.0011 -0.0018 0.5785 0.0030 -63.3644 -58.5982 -3.1355 -3.1411
0.6897 0.2929 1700 0.6913 0.0001 -0.0037 0.5727 0.0038 -63.5467 -58.6985 -3.1294 -3.1351
0.6857 0.3101 1800 0.6910 -0.0016 -0.0061 0.5734 0.0045 -63.7882 -58.8688 -3.1244 -3.1301
0.6866 0.3274 1900 0.6907 -0.0038 -0.0090 0.5843 0.0052 -64.0830 -59.0939 -3.1188 -3.1245
0.6872 0.3446 2000 0.6903 -0.0075 -0.0134 0.5862 0.0060 -64.5228 -59.4572 -3.1120 -3.1176
0.6854 0.3618 2100 0.6899 -0.0124 -0.0194 0.5813 0.0070 -65.1230 -59.9534 -3.1057 -3.1113
0.6786 0.3790 2200 0.6894 -0.0185 -0.0267 0.5836 0.0082 -65.8538 -60.5638 -3.0978 -3.1035
0.6801 0.3963 2300 0.6889 -0.0230 -0.0323 0.5915 0.0093 -66.4100 -61.0095 -3.0912 -3.0969
0.683 0.4135 2400 0.6882 -0.0304 -0.0413 0.5867 0.0108 -67.3051 -61.7559 -3.0824 -3.0881
0.6853 0.4307 2500 0.6876 -0.0392 -0.0515 0.5841 0.0123 -68.3329 -62.6367 -3.0733 -3.0790
0.6775 0.4480 2600 0.6870 -0.0464 -0.0600 0.5834 0.0136 -69.1773 -63.3517 -3.0671 -3.0728
0.6788 0.4652 2700 0.6864 -0.0532 -0.0681 0.5895 0.0150 -69.9938 -64.0275 -3.0610 -3.0668
0.6781 0.4824 2800 0.6860 -0.0581 -0.0740 0.5876 0.0159 -70.5769 -64.5225 -3.0538 -3.0595
0.6796 0.4997 2900 0.6857 -0.0610 -0.0777 0.5892 0.0166 -70.9456 -64.8128 -3.0460 -3.0517
0.6805 0.5169 3000 0.6853 -0.0658 -0.0834 0.5994 0.0176 -71.5177 -65.2877 -3.0368 -3.0425
0.673 0.5341 3100 0.6849 -0.0663 -0.0847 0.5987 0.0184 -71.6468 -65.3387 -3.0324 -3.0381
0.6747 0.5513 3200 0.6842 -0.0780 -0.0982 0.6027 0.0202 -72.9963 -66.5094 -3.0209 -3.0267
0.6743 0.5686 3300 0.6836 -0.0836 -0.1053 0.6022 0.0216 -73.7081 -67.0762 -3.0078 -3.0136
0.6653 0.5858 3400 0.6833 -0.0846 -0.1069 0.6011 0.0222 -73.8674 -67.1758 -2.9991 -3.0049
0.6764 0.6030 3500 0.6827 -0.0900 -0.1136 0.5999 0.0236 -74.5369 -67.7069 -2.9912 -2.9971
0.6737 0.6203 3600 0.6823 -0.0962 -0.1207 0.6104 0.0245 -75.2502 -68.3295 -2.9812 -2.9871
0.6664 0.6375 3700 0.6816 -0.1051 -0.1313 0.6080 0.0263 -76.3151 -69.2178 -2.9692 -2.9751
0.6667 0.6547 3800 0.6807 -0.1172 -0.1456 0.6085 0.0284 -77.7401 -70.4287 -2.9595 -2.9654
0.6678 0.6720 3900 0.6799 -0.1299 -0.1602 0.6092 0.0304 -79.2047 -71.6971 -2.9499 -2.9558
0.6671 0.6892 4000 0.6792 -0.1408 -0.1729 0.6078 0.0321 -80.4742 -72.7925 -2.9368 -2.9426
0.6554 0.7064 4100 0.6787 -0.1458 -0.1791 0.6120 0.0333 -81.0925 -73.2962 -2.9179 -2.9238
0.6742 0.7236 4200 0.6780 -0.1580 -0.1932 0.6127 0.0352 -82.5005 -74.5101 -2.9044 -2.9103
0.6632 0.7409 4300 0.6774 -0.1672 -0.2038 0.6078 0.0366 -83.5592 -75.4285 -2.8933 -2.8992
0.6639 0.7581 4400 0.6765 -0.1825 -0.2215 0.6064 0.0390 -85.3312 -76.9653 -2.8808 -2.8867
0.6617 0.7753 4500 0.6753 -0.2011 -0.2431 0.6078 0.0421 -87.4948 -78.8183 -2.8704 -2.8763
0.6446 0.7926 4600 0.6742 -0.2184 -0.2634 0.6080 0.0450 -89.5165 -80.5508 -2.8604 -2.8664
0.6536 0.8098 4700 0.6733 -0.2347 -0.2821 0.6078 0.0474 -91.3895 -82.1787 -2.8507 -2.8567
0.661 0.8270 4800 0.6723 -0.2469 -0.2967 0.6071 0.0498 -92.8502 -83.4062 -2.8410 -2.8470
0.6655 0.8442 4900 0.6714 -0.2622 -0.3144 0.6059 0.0522 -94.6208 -84.9348 -2.8302 -2.8362
0.65 0.8615 5000 0.6706 -0.2730 -0.3273 0.5957 0.0544 -95.9136 -86.0080 -2.8112 -2.8172
0.6625 0.8787 5100 0.6695 -0.2893 -0.3467 0.5997 0.0574 -97.8500 -87.6453 -2.8012 -2.8071
0.6509 0.8959 5200 0.6690 -0.2924 -0.3512 0.5985 0.0588 -98.3012 -87.9499 -2.7931 -2.7991
0.6469 0.9132 5300 0.6686 -0.2979 -0.3577 0.5978 0.0598 -98.9499 -88.5002 -2.7822 -2.7882
0.6482 0.9304 5400 0.6680 -0.3024 -0.3637 0.6039 0.0613 -99.5495 -88.9507 -2.7739 -2.7799
0.639 0.9476 5500 0.6673 -0.3146 -0.3781 0.6066 0.0635 -100.9877 -90.1737 -2.7615 -2.7675
0.6515 0.9649 5600 0.6668 -0.3113 -0.3759 0.6080 0.0647 -100.7733 -89.8396 -2.7543 -2.7603
0.6512 0.9821 5700 0.6657 -0.3303 -0.3982 0.6094 0.0680 -103.0038 -91.7385 -2.7432 -2.7493
0.6323 0.9993 5800 0.6645 -0.3552 -0.4268 0.6078 0.0716 -105.8584 -94.2304 -2.7257 -2.7318
0.632 1.0165 5900 0.6629 -0.3911 -0.4682 0.6085 0.0771 -109.9998 -97.8232 -2.7023 -2.7085
0.654 1.0338 6000 0.6632 -0.3807 -0.4571 0.6076 0.0764 -108.8926 -96.7834 -2.6907 -2.6969
0.6293 1.0510 6100 0.6624 -0.3916 -0.4703 0.6111 0.0787 -110.2114 -97.8768 -2.6768 -2.6831
0.6314 1.0682 6200 0.6611 -0.4228 -0.5060 0.6120 0.0832 -113.7813 -100.9947 -2.6635 -2.6697
0.6526 1.0855 6300 0.6599 -0.4394 -0.5262 0.6145 0.0869 -115.8035 -102.6482 -2.6530 -2.6593
0.6347 1.1027 6400 0.6593 -0.4394 -0.5278 0.6180 0.0884 -115.9650 -102.6523 -2.6435 -2.6499
0.6393 1.1199 6500 0.6588 -0.4468 -0.5370 0.6238 0.0901 -116.8754 -103.3932 -2.6289 -2.6354
0.6374 1.1371 6600 0.6590 -0.4501 -0.5403 0.6166 0.0901 -117.2051 -103.7237 -2.6225 -2.6289
0.6359 1.1544 6700 0.6581 -0.4668 -0.5605 0.6190 0.0936 -119.2262 -105.3939 -2.6058 -2.6123
0.6146 1.1716 6800 0.6567 -0.4994 -0.5980 0.6173 0.0987 -122.9848 -108.6496 -2.5870 -2.5937
0.6367 1.1888 6900 0.6561 -0.5093 -0.6101 0.6227 0.1008 -124.1880 -109.6397 -2.5753 -2.5820
0.6185 1.2061 7000 0.6549 -0.5406 -0.6465 0.6159 0.1059 -127.8333 -112.7735 -2.5638 -2.5706
0.6226 1.2233 7100 0.6558 -0.5185 -0.6213 0.6180 0.1028 -125.3109 -110.5579 -2.5582 -2.5651
0.6173 1.2405 7200 0.6550 -0.5301 -0.6358 0.6162 0.1057 -126.7555 -111.7189 -2.5488 -2.5557
0.6472 1.2578 7300 0.6553 -0.5020 -0.6054 0.6197 0.1034 -123.7222 -108.9138 -2.5474 -2.5543
0.6388 1.2750 7400 0.6552 -0.4984 -0.6021 0.6206 0.1037 -123.3937 -108.5536 -2.5418 -2.5489
0.641 1.2922 7500 0.6543 -0.5020 -0.6078 0.6227 0.1058 -123.9613 -108.9147 -2.5332 -2.5404
0.6721 1.3094 7600 0.6531 -0.5286 -0.6388 0.6229 0.1102 -127.0605 -111.5723 -2.5152 -2.5224
0.6262 1.3267 7700 0.6528 -0.5440 -0.6568 0.6199 0.1127 -128.8555 -113.1147 -2.4986 -2.5058
0.6077 1.3439 7800 0.6520 -0.5730 -0.6901 0.6231 0.1172 -132.1913 -116.0070 -2.4824 -2.4898
0.6293 1.3611 7900 0.6511 -0.5869 -0.7073 0.6234 0.1204 -133.9143 -117.4017 -2.4749 -2.4824
0.6065 1.3784 8000 0.6502 -0.5931 -0.7166 0.6236 0.1235 -134.8416 -118.0241 -2.4667 -2.4743
0.6328 1.3956 8100 0.6499 -0.6051 -0.7307 0.6255 0.1256 -136.2457 -119.2178 -2.4558 -2.4635
0.646 1.4128 8200 0.6494 -0.6002 -0.7264 0.6231 0.1262 -135.8235 -118.7345 -2.4523 -2.4600
0.6384 1.4300 8300 0.6500 -0.5815 -0.7052 0.6234 0.1237 -133.6977 -116.8619 -2.4491 -2.4568
0.6173 1.4473 8400 0.6504 -0.5677 -0.6897 0.6217 0.1219 -132.1456 -115.4836 -2.4449 -2.4526
0.6041 1.4645 8500 0.6501 -0.5732 -0.6969 0.6271 0.1237 -132.8701 -116.0278 -2.4292 -2.4370
0.6635 1.4817 8600 0.6490 -0.6018 -0.7304 0.6252 0.1286 -136.2163 -118.8894 -2.4140 -2.4220
0.6377 1.4990 8700 0.6499 -0.5709 -0.6951 0.6255 0.1243 -132.6951 -115.7986 -2.4168 -2.4247
0.6376 1.5162 8800 0.6488 -0.5866 -0.7147 0.6301 0.1281 -134.6506 -117.3752 -2.4074 -2.4155
0.6174 1.5334 8900 0.6478 -0.6255 -0.7594 0.6336 0.1339 -139.1249 -121.2650 -2.3887 -2.3969
0.6228 1.5507 9000 0.6478 -0.6245 -0.7587 0.6292 0.1342 -139.0503 -121.1639 -2.3815 -2.3898
0.6372 1.5679 9100 0.6480 -0.6203 -0.7539 0.6336 0.1335 -138.5676 -120.7465 -2.3769 -2.3852
0.6 1.5851 9200 0.6474 -0.6400 -0.7768 0.6329 0.1368 -140.8612 -122.7150 -2.3665 -2.3751
0.5989 1.6023 9300 0.6468 -0.6474 -0.7867 0.6341 0.1394 -141.8543 -123.4491 -2.3576 -2.3662
0.614 1.6196 9400 0.6459 -0.6825 -0.8279 0.6368 0.1454 -145.9700 -126.9618 -2.3413 -2.3500
0.596 1.6368 9500 0.6456 -0.6809 -0.8268 0.6368 0.1459 -145.8628 -126.8059 -2.3333 -2.3420
0.6174 1.6540 9600 0.6448 -0.7214 -0.8733 0.6364 0.1519 -150.5126 -130.8547 -2.3123 -2.3212
0.6332 1.6713 9700 0.6452 -0.6900 -0.8381 0.6357 0.1480 -146.9875 -127.7156 -2.3143 -2.3232
0.6115 1.6885 9800 0.6452 -0.6884 -0.8368 0.6341 0.1484 -146.8605 -127.5543 -2.3134 -2.3225
0.5539 1.7057 9900 0.6446 -0.6932 -0.8433 0.6322 0.1501 -147.5115 -128.0289 -2.3106 -2.3197
0.5881 1.7229 10000 0.6446 -0.6998 -0.8514 0.6357 0.1516 -148.3202 -128.6942 -2.3004 -2.3096
0.6197 1.7402 10100 0.6450 -0.6864 -0.8362 0.6343 0.1498 -146.7977 -127.3522 -2.2940 -2.3033
0.6029 1.7574 10200 0.6433 -0.7383 -0.8977 0.6336 0.1593 -152.9491 -132.5467 -2.2721 -2.2816
0.6441 1.7746 10300 0.6435 -0.7404 -0.8998 0.6324 0.1594 -153.1610 -132.7534 -2.2664 -2.2760
0.5718 1.7919 10400 0.6444 -0.7047 -0.8588 0.6341 0.1541 -149.0603 -129.1777 -2.2712 -2.2807
0.5866 1.8091 10500 0.6437 -0.7266 -0.8854 0.6343 0.1588 -151.7161 -131.3703 -2.2598 -2.2695
0.6278 1.8263 10600 0.6437 -0.7187 -0.8763 0.6348 0.1576 -150.8070 -130.5783 -2.2553 -2.2651
0.6083 1.8436 10700 0.6428 -0.7398 -0.9018 0.6306 0.1621 -153.3647 -132.6900 -2.2435 -2.2534
0.5999 1.8608 10800 0.6425 -0.7467 -0.9104 0.6324 0.1637 -154.2222 -133.3793 -2.2412 -2.2513
0.6016 1.8780 10900 0.6423 -0.7546 -0.9199 0.6343 0.1654 -155.1725 -134.1676 -2.2317 -2.2420
0.6056 1.8952 11000 0.6424 -0.7430 -0.9074 0.6303 0.1644 -153.9158 -133.0090 -2.2336 -2.2438
0.6068 1.9125 11100 0.6415 -0.7764 -0.9467 0.6315 0.1703 -157.8523 -136.3506 -2.2170 -2.2275
0.5907 1.9297 11200 0.6416 -0.7643 -0.9335 0.6324 0.1692 -156.5323 -135.1456 -2.2154 -2.2259
0.6504 1.9469 11300 0.6420 -0.7478 -0.9145 0.6289 0.1667 -154.6342 -133.4948 -2.2172 -2.2276
0.6037 1.9642 11400 0.6413 -0.7627 -0.9329 0.6296 0.1702 -156.4750 -134.9861 -2.2093 -2.2199
0.6435 1.9814 11500 0.6415 -0.7615 -0.9315 0.6301 0.1700 -156.3274 -134.8601 -2.2078 -2.2184
0.6037 1.9986 11600 0.6418 -0.7425 -0.9097 0.6294 0.1671 -154.1468 -132.9645 -2.2119 -2.2224
0.6036 2.0159 11700 0.6414 -0.7444 -0.9128 0.6289 0.1684 -154.4553 -133.1498 -2.2068 -2.2174
0.6111 2.0331 11800 0.6408 -0.7710 -0.9439 0.6285 0.1729 -157.5724 -135.8124 -2.1917 -2.2026
0.5739 2.0503 11900 0.6401 -0.8062 -0.9851 0.6283 0.1788 -161.6872 -139.3363 -2.1752 -2.1862
0.5807 2.0675 12000 0.6400 -0.8128 -0.9929 0.6327 0.1801 -162.4718 -139.9921 -2.1663 -2.1776
0.5904 2.0848 12100 0.6396 -0.8183 -0.9996 0.6317 0.1814 -163.1447 -140.5391 -2.1626 -2.1739
0.5722 2.1020 12200 0.6397 -0.8246 -1.0067 0.6327 0.1821 -163.8479 -141.1671 -2.1591 -2.1704
0.5874 2.1192 12300 0.6397 -0.8221 -1.0035 0.6343 0.1814 -163.5287 -140.9182 -2.1576 -2.1690
0.5575 2.1365 12400 0.6391 -0.8641 -1.0517 0.6341 0.1876 -168.3473 -145.1188 -2.1426 -2.1543
0.59 2.1537 12500 0.6392 -0.8708 -1.0586 0.6341 0.1878 -169.0439 -145.7953 -2.1364 -2.1481
0.6028 2.1709 12600 0.6394 -0.8507 -1.0363 0.6336 0.1856 -166.8094 -143.7794 -2.1403 -2.1519
0.5745 2.1881 12700 0.6394 -0.8476 -1.0328 0.6331 0.1852 -166.4608 -143.4725 -2.1395 -2.1511
0.6037 2.2054 12800 0.6395 -0.8490 -1.0347 0.6317 0.1857 -166.6464 -143.6127 -2.1340 -2.1457
0.5773 2.2226 12900 0.6393 -0.8462 -1.0320 0.6315 0.1858 -166.3826 -143.3317 -2.1329 -2.1446
0.5747 2.2398 13000 0.6391 -0.8618 -1.0498 0.6320 0.1880 -168.1579 -144.8899 -2.1262 -2.1381
0.5788 2.2571 13100 0.6392 -0.8607 -1.0489 0.6331 0.1882 -168.0727 -144.7845 -2.1216 -2.1335
0.6091 2.2743 13200 0.6390 -0.8603 -1.0494 0.6327 0.1891 -168.1196 -144.7427 -2.1177 -2.1296
0.6213 2.2915 13300 0.6393 -0.8616 -1.0503 0.6301 0.1886 -168.2058 -144.8738 -2.1141 -2.1261
0.5545 2.3088 13400 0.6397 -0.8361 -1.0209 0.6310 0.1848 -165.2700 -142.3214 -2.1231 -2.1350
0.5633 2.3260 13500 0.6392 -0.8526 -1.0406 0.6336 0.1879 -167.2357 -143.9755 -2.1181 -2.1301
0.5982 2.3432 13600 0.6391 -0.8544 -1.0431 0.6320 0.1886 -167.4862 -144.1549 -2.1134 -2.1255
0.6165 2.3604 13700 0.6390 -0.8581 -1.0475 0.6336 0.1894 -167.9277 -144.5217 -2.1098 -2.1221
0.5863 2.3777 13800 0.6393 -0.8480 -1.0361 0.6322 0.1881 -166.7901 -143.5142 -2.1112 -2.1233
0.6023 2.3949 13900 0.6395 -0.8345 -1.0207 0.6322 0.1862 -165.2497 -142.1660 -2.1148 -2.1269
0.551 2.4121 14000 0.6389 -0.8440 -1.0328 0.6331 0.1888 -166.4650 -143.1130 -2.1104 -2.1226
0.565 2.4294 14100 0.6394 -0.8393 -1.0266 0.6322 0.1874 -165.8436 -142.6391 -2.1116 -2.1238
0.555 2.4466 14200 0.6396 -0.8346 -1.0211 0.6317 0.1865 -165.2906 -142.1683 -2.1129 -2.1251
0.5303 2.4638 14300 0.6392 -0.8468 -1.0356 0.6313 0.1888 -166.7382 -143.3939 -2.1079 -2.1202
0.5998 2.4810 14400 0.6390 -0.8530 -1.0429 0.6350 0.1899 -167.4716 -144.0141 -2.1038 -2.1161
0.5688 2.4983 14500 0.6387 -0.8590 -1.0506 0.6338 0.1916 -168.2381 -144.6089 -2.1014 -2.1137
0.5601 2.5155 14600 0.6386 -0.8520 -1.0429 0.6341 0.1909 -167.4715 -143.9122 -2.1035 -2.1158
0.5694 2.5327 14700 0.6385 -0.8549 -1.0466 0.6336 0.1917 -167.8379 -144.2034 -2.1025 -2.1148
0.5762 2.5500 14800 0.6388 -0.8514 -1.0423 0.6327 0.1909 -167.4103 -143.8544 -2.1027 -2.1151
0.5944 2.5672 14900 0.6388 -0.8497 -1.0403 0.6322 0.1906 -167.2102 -143.6825 -2.1028 -2.1151
0.5766 2.5844 15000 0.6386 -0.8528 -1.0444 0.6327 0.1916 -167.6185 -143.9918 -2.1007 -2.1131
0.6066 2.6017 15100 0.6387 -0.8545 -1.0460 0.6334 0.1915 -167.7836 -144.1632 -2.1001 -2.1125
0.557 2.6189 15200 0.6385 -0.8591 -1.0515 0.6331 0.1924 -168.3309 -144.6236 -2.0980 -2.1104
0.5819 2.6361 15300 0.6384 -0.8621 -1.0552 0.6329 0.1931 -168.6976 -144.9198 -2.0966 -2.1092
0.6353 2.6533 15400 0.6384 -0.8617 -1.0548 0.6331 0.1931 -168.6601 -144.8850 -2.0966 -2.1091
0.6352 2.6706 15500 0.6385 -0.8591 -1.0515 0.6341 0.1924 -168.3342 -144.6245 -2.0974 -2.1098
0.5882 2.6878 15600 0.6384 -0.8581 -1.0511 0.6329 0.1930 -168.2865 -144.5229 -2.0972 -2.1097
0.5698 2.7050 15700 0.6384 -0.8579 -1.0506 0.6334 0.1928 -168.2427 -144.4972 -2.0972 -2.1098
0.5774 2.7223 15800 0.6383 -0.8576 -1.0507 0.6317 0.1931 -168.2498 -144.4737 -2.0970 -2.1095
0.5948 2.7395 15900 0.6385 -0.8583 -1.0511 0.6329 0.1928 -168.2885 -144.5436 -2.0963 -2.1088
0.5977 2.7567 16000 0.6382 -0.8592 -1.0527 0.6343 0.1935 -168.4506 -144.6316 -2.0959 -2.1084
0.5412 2.7739 16100 0.6385 -0.8607 -1.0535 0.6341 0.1927 -168.5258 -144.7848 -2.0957 -2.1081
0.6015 2.7912 16200 0.6385 -0.8599 -1.0527 0.6320 0.1927 -168.4485 -144.7054 -2.0961 -2.1086
0.5921 2.8084 16300 0.6382 -0.8602 -1.0537 0.6338 0.1935 -168.5526 -144.7336 -2.0959 -2.1084
0.5958 2.8256 16400 0.6384 -0.8602 -1.0534 0.6322 0.1932 -168.5213 -144.7309 -2.0953 -2.1078
0.5977 2.8429 16500 0.6384 -0.8601 -1.0531 0.6334 0.1931 -168.4950 -144.7180 -2.0952 -2.1077
0.6289 2.8601 16600 0.6382 -0.8611 -1.0549 0.6338 0.1937 -168.6687 -144.8262 -2.0951 -2.1076
0.6271 2.8773 16700 0.6385 -0.8602 -1.0531 0.6336 0.1929 -168.4876 -144.7302 -2.0954 -2.1080
0.5918 2.8946 16800 0.6384 -0.8615 -1.0546 0.6331 0.1931 -168.6371 -144.8581 -2.0953 -2.1078
0.5885 2.9118 16900 0.6383 -0.8598 -1.0533 0.6331 0.1935 -168.5110 -144.6941 -2.0954 -2.1080
0.6058 2.9290 17000 0.6384 -0.8615 -1.0547 0.6331 0.1933 -168.6532 -144.8587 -2.0949 -2.1075
0.5841 2.9462 17100 0.6384 -0.8599 -1.0531 0.6322 0.1932 -168.4870 -144.7006 -2.0956 -2.1082
0.6214 2.9635 17200 0.6385 -0.8609 -1.0538 0.6341 0.1930 -168.5645 -144.7976 -2.0955 -2.1081
0.5905 2.9807 17300 0.6385 -0.8611 -1.0541 0.6327 0.1931 -168.5945 -144.8186 -2.0951 -2.1076
0.5878 2.9979 17400 0.6382 -0.8614 -1.0551 0.6341 0.1937 -168.6898 -144.8481 -2.0951 -2.1077
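
Each row of the log above is whitespace-delimited with twelve numeric values, so it can be parsed without a table library. The sketch below parses four rows copied from the log and picks the one with the lowest validation loss. The snake_case column names are illustrative choices, not names taken from the training code.

```python
# Column names in log order (illustrative snake_case versions of the header).
COLUMNS = [
    "training_loss", "epoch", "step", "validation_loss",
    "rewards_chosen", "rewards_rejected", "rewards_accuracies",
    "rewards_margins", "logps_rejected", "logps_chosen",
    "logits_rejected", "logits_chosen",
]

# A few rows copied verbatim from the training log above.
SAMPLE_ROWS = """
0.6931 0.0172 100 0.6932 -0.0000 0.0000 0.4993 -0.0000 -63.1760 -58.7121 -3.1570 -3.1626
0.6323 0.9993 5800 0.6645 -0.3552 -0.4268 0.6078 0.0716 -105.8584 -94.2304 -2.7257 -2.7318
0.6037 1.9986 11600 0.6418 -0.7425 -0.9097 0.6294 0.1671 -154.1468 -132.9645 -2.2119 -2.2224
0.5878 2.9979 17400 0.6382 -0.8614 -1.0551 0.6341 0.1937 -168.6898 -144.8481 -2.0951 -2.1077
"""


def parse_row(line: str) -> dict:
    """Turn one whitespace-delimited log row into a dict keyed by column name."""
    values = line.split()
    if len(values) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} values, got {len(values)}")
    row = dict(zip(COLUMNS, map(float, values)))
    row["step"] = int(row["step"])
    return row


rows = [parse_row(line) for line in SAMPLE_ROWS.strip().splitlines()]

# Row with the lowest validation loss among the sampled checkpoints.
best = min(rows, key=lambda r: r["validation_loss"])
print(best["step"], best["validation_loss"])  # 17400 0.6382
```

Pasting the full log into `SAMPLE_ROWS` would apply the same selection to every evaluation step.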

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.1.2
  • Datasets 2.19.2
  • Tokenizers 0.19.1

Safetensors

  • Model size: 1.1B params
  • Tensor type: F32