OpenELM-1_1B-DPO-full-2-5

This model was fine-tuned with Direct Preference Optimization (DPO) from an OpenELM-1.1B base on an unspecified preference dataset. It achieves the following results on the evaluation set:

  • Loss: 1.1888
  • Rewards/chosen: -13.5625
  • Rewards/rejected: -17.0
  • Rewards/accuracies: 0.7070
  • Rewards/margins: 3.4062
  • Logps/rejected: -1984.0
  • Logps/chosen: -1672.0
  • Logits/rejected: 6.2188
  • Logits/chosen: 4.5312
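
For context on these metrics: in a typical DPO setup (e.g. TRL's DPOTrainer), the reward columns are implicit rewards, computed as β times the gap between policy and reference log-probabilities, and accuracy is the fraction of pairs where the chosen reward beats the rejected one. A minimal sketch assuming TRL-style conventions; β and the reference log-probabilities below are illustrative, not recorded in this card:

```python
import torch

def implicit_reward(policy_logps: torch.Tensor, ref_logps: torch.Tensor, beta: float = 0.1) -> torch.Tensor:
    # DPO's implicit reward: beta * (log pi_theta(y|x) - log pi_ref(y|x)).
    # beta=0.1 is TRL's default; the value used for this run is not recorded.
    return beta * (policy_logps - ref_logps)

# Illustrative log-probabilities for a batch of two preference pairs (not from this run).
policy_chosen   = torch.tensor([-1672.0, -1600.0])
policy_rejected = torch.tensor([-1984.0, -1900.0])
ref_chosen      = torch.tensor([-1540.0, -1520.0])
ref_rejected    = torch.tensor([-1815.0, -1790.0])

r_chosen   = implicit_reward(policy_chosen, ref_chosen)      # -> Rewards/chosen (batch mean)
r_rejected = implicit_reward(policy_rejected, ref_rejected)  # -> Rewards/rejected
margins    = r_chosen - r_rejected                           # -> Rewards/margins
accuracy   = (r_chosen > r_rejected).float().mean()          # -> Rewards/accuracies
print(margins.mean().item(), accuracy.item())
```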

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 16
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 64
  • total_eval_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 5
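
These settings imply an effective train batch size of 8 per device × 4 GPUs × 2 accumulation steps = 64, matching total_train_batch_size, and the optimizer line matches the AdamW defaults. A minimal sketch of how they might map onto TRL's DPOConfig/DPOTrainer; the base model, tokenizer, and dataset names are assumptions, and the TRL version used is not recorded in this card:

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

# Assumed base model; OpenELM repos ship custom modeling code, hence trust_remote_code=True.
model = AutoModelForCausalLM.from_pretrained("apple/OpenELM-1_1B", trust_remote_code=True)
# OpenELM reuses the Llama 2 tokenizer (assumption for this run).
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

# Placeholder preference dataset with "prompt"/"chosen"/"rejected" columns.
dataset = load_dataset("some/preference-dataset")

args = DPOConfig(
    output_dir="OpenELM-1_1B-DPO-full-2-5",
    learning_rate=5e-5,
    per_device_train_batch_size=8,   # x 4 GPUs x 2 accumulation steps = 64 effective
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=2,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=5,
    seed=42,
    bf16=True,
    # Defaults already give Adam betas=(0.9, 0.999) and epsilon=1e-08, as listed above.
)

trainer = DPOTrainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    tokenizer=tokenizer,
)
trainer.train()
```

To reproduce the 4-device setup, such a script would be launched with something like `accelerate launch --num_processes 4 train_dpo.py`.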

Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.615 | 0.1047 | 100 | 0.6275 | -0.7383 | -0.9961 | 0.6719 | 0.2578 | -386.0 | -388.0 | -9.625 | -9.8125 |
| 0.5897 | 0.2093 | 200 | 0.6029 | -1.6641 | -2.0938 | 0.6934 | 0.4336 | -496.0 | -480.0 | -9.375 | -9.75 |
| 0.6457 | 0.3140 | 300 | 0.5886 | -1.3828 | -1.8281 | 0.6895 | 0.4473 | -470.0 | -454.0 | -13.5625 | -13.75 |
| 0.6271 | 0.4186 | 400 | 0.5936 | -1.7031 | -2.25 | 0.6992 | 0.5430 | -510.0 | -484.0 | -8.4375 | -8.8125 |
| 0.5746 | 0.5233 | 500 | 0.5886 | -2.0156 | -2.5625 | 0.6816 | 0.5430 | -540.0 | -516.0 | -6.6562 | -7.4062 |
| 0.5484 | 0.6279 | 600 | 0.5710 | -3.9531 | -4.6875 | 0.6973 | 0.7422 | -756.0 | -708.0 | -5.75 | -6.4375 |
| 0.5747 | 0.7326 | 700 | 0.5820 | -2.75 | -3.4844 | 0.6953 | 0.7227 | -632.0 | -592.0 | -6.5312 | -7.5938 |
| 0.5591 | 0.8373 | 800 | 0.5662 | -2.8594 | -3.5156 | 0.7090 | 0.6523 | -636.0 | -600.0 | -3.375 | -4.7812 |
| 0.5892 | 0.9419 | 900 | 0.5821 | -2.625 | -3.2344 | 0.7012 | 0.5977 | -608.0 | -576.0 | -4.8438 | -6.125 |
| 0.261 | 1.0466 | 1000 | 0.5852 | -3.9375 | -4.9688 | 0.7324 | 1.0078 | -780.0 | -708.0 | -0.8672 | -2.1094 |
| 0.2407 | 1.1512 | 1100 | 0.5943 | -4.0625 | -5.0 | 0.6895 | 0.9336 | -784.0 | -720.0 | -0.3672 | -1.9688 |
| 0.2348 | 1.2559 | 1200 | 0.6151 | -4.9375 | -5.9688 | 0.6777 | 1.0547 | -884.0 | -808.0 | 1.5312 | 0.2227 |
| 0.257 | 1.3605 | 1300 | 0.6005 | -4.4688 | -5.4688 | 0.6973 | 0.9883 | -832.0 | -760.0 | 1.5312 | -0.1445 |
| 0.2416 | 1.4652 | 1400 | 0.6023 | -5.1875 | -6.125 | 0.6855 | 0.9258 | -900.0 | -836.0 | 1.9141 | 0.2715 |
| 0.215 | 1.5699 | 1500 | 0.6062 | -5.5938 | -6.7188 | 0.6934 | 1.1328 | -960.0 | -872.0 | 1.9219 | 0.2637 |
| 0.2534 | 1.6745 | 1600 | 0.6013 | -4.6562 | -5.7188 | 0.7129 | 1.0391 | -856.0 | -780.0 | 2.7969 | 1.1406 |
| 0.2463 | 1.7792 | 1700 | 0.6173 | -5.2812 | -6.4375 | 0.6914 | 1.1484 | -928.0 | -844.0 | 1.9688 | 0.0977 |
| 0.23 | 1.8838 | 1800 | 0.6153 | -5.8438 | -7.0625 | 0.7090 | 1.2266 | -992.0 | -896.0 | 2.9062 | 1.0156 |
| 0.2092 | 1.9885 | 1900 | 0.6082 | -5.5625 | -6.7188 | 0.7051 | 1.1641 | -956.0 | -868.0 | 2.9375 | 1.0781 |
| 0.0271 | 2.0931 | 2000 | 0.7202 | -7.625 | -9.375 | 0.7207 | 1.7734 | -1224.0 | -1080.0 | 3.5781 | 1.8516 |
| 0.0367 | 2.1978 | 2100 | 0.8323 | -9.3125 | -11.5 | 0.7168 | 2.1406 | -1432.0 | -1248.0 | 4.7188 | 2.9219 |
| 0.0443 | 2.3025 | 2200 | 0.7840 | -8.0 | -10.0625 | 0.7324 | 2.0625 | -1296.0 | -1112.0 | 3.9375 | 2.0312 |
| 0.0302 | 2.4071 | 2300 | 0.7981 | -8.375 | -10.375 | 0.7070 | 2.0 | -1328.0 | -1152.0 | 4.625 | 2.8125 |
| 0.031 | 2.5118 | 2400 | 0.7786 | -7.9062 | -9.875 | 0.7129 | 1.9922 | -1280.0 | -1104.0 | 4.875 | 3.0156 |
| 0.018 | 2.6164 | 2500 | 0.8584 | -9.9375 | -12.125 | 0.6914 | 2.2031 | -1496.0 | -1312.0 | 5.4688 | 3.6719 |
| 0.0248 | 2.7211 | 2600 | 0.8079 | -8.625 | -10.6875 | 0.7012 | 2.0469 | -1352.0 | -1176.0 | 5.0312 | 3.0938 |
| 0.0263 | 2.8257 | 2700 | 0.8371 | -9.3125 | -11.375 | 0.6914 | 2.0156 | -1424.0 | -1248.0 | 5.2812 | 3.4531 |
| 0.033 | 2.9304 | 2800 | 0.8799 | -9.8125 | -12.1875 | 0.7207 | 2.4062 | -1504.0 | -1296.0 | 5.2188 | 3.3281 |
| 0.0118 | 3.0351 | 2900 | 0.8372 | -9.625 | -11.875 | 0.7246 | 2.2969 | -1472.0 | -1280.0 | 5.6562 | 3.7812 |
| 0.0094 | 3.1397 | 3000 | 0.9555 | -11.0 | -13.6875 | 0.7090 | 2.6875 | -1656.0 | -1416.0 | 6.0938 | 4.3125 |
| 0.0073 | 3.2444 | 3100 | 0.9687 | -11.375 | -14.125 | 0.7129 | 2.7344 | -1696.0 | -1456.0 | 5.9062 | 4.1875 |
| 0.0104 | 3.3490 | 3200 | 1.0111 | -11.75 | -14.5625 | 0.7070 | 2.8438 | -1744.0 | -1488.0 | 6.1875 | 4.4688 |
| 0.01 | 3.4537 | 3300 | 1.0564 | -12.125 | -15.0625 | 0.7051 | 2.9375 | -1792.0 | -1528.0 | 5.9375 | 4.2188 |
| 0.0089 | 3.5583 | 3400 | 0.9822 | -11.375 | -14.0625 | 0.7051 | 2.7031 | -1696.0 | -1448.0 | 5.875 | 4.2188 |
| 0.0106 | 3.6630 | 3500 | 1.0239 | -11.5625 | -14.375 | 0.7070 | 2.8125 | -1720.0 | -1472.0 | 5.9688 | 4.25 |
| 0.0099 | 3.7677 | 3600 | 1.0668 | -11.9375 | -14.9375 | 0.6973 | 3.0 | -1784.0 | -1512.0 | 6.125 | 4.375 |
| 0.0066 | 3.8723 | 3700 | 1.0938 | -12.75 | -15.875 | 0.7070 | 3.1406 | -1872.0 | -1592.0 | 6.2188 | 4.5312 |
| 0.0081 | 3.9770 | 3800 | 1.0255 | -11.6875 | -14.5625 | 0.7129 | 2.8906 | -1744.0 | -1488.0 | 5.9688 | 4.2812 |
| 0.0035 | 4.0816 | 3900 | 1.1112 | -12.75 | -15.875 | 0.7031 | 3.1406 | -1872.0 | -1592.0 | 6.2188 | 4.5312 |
| 0.002 | 4.1863 | 4000 | 1.1127 | -12.8125 | -16.0 | 0.7051 | 3.1562 | -1888.0 | -1600.0 | 6.1875 | 4.5 |
| 0.0036 | 4.2909 | 4100 | 1.1368 | -13.0 | -16.25 | 0.7031 | 3.25 | -1912.0 | -1616.0 | 6.1875 | 4.4688 |
| 0.0069 | 4.3956 | 4200 | 1.1589 | -13.25 | -16.625 | 0.7070 | 3.3125 | -1944.0 | -1640.0 | 6.2188 | 4.5312 |
| 0.0043 | 4.5003 | 4300 | 1.1756 | -13.4375 | -16.75 | 0.7031 | 3.375 | -1968.0 | -1656.0 | 6.2188 | 4.5312 |
| 0.0091 | 4.6049 | 4400 | 1.1842 | -13.5 | -16.875 | 0.7031 | 3.3906 | -1976.0 | -1664.0 | 6.2188 | 4.5312 |
| 0.0058 | 4.7096 | 4500 | 1.1865 | -13.5 | -16.875 | 0.7051 | 3.3906 | -1976.0 | -1664.0 | 6.2188 | 4.5312 |
| 0.0034 | 4.8142 | 4600 | 1.1880 | -13.5625 | -17.0 | 0.7051 | 3.3906 | -1984.0 | -1672.0 | 6.2188 | 4.5312 |
| 0.006 | 4.9189 | 4700 | 1.1888 | -13.5625 | -17.0 | 0.7070 | 3.4062 | -1984.0 | -1672.0 | 6.2188 | 4.5312 |

Framework versions

  • Transformers 4.44.2
  • PyTorch 2.1.2
  • Datasets 2.18.0
  • Tokenizers 0.19.1
Model details

  • Model size: 1.08B params
  • Tensor type: BF16 (Safetensors)
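
Because the serverless Inference API does not support repos containing custom code, the model has to be loaded locally with trust_remote_code=True. A minimal loading sketch; the repo id is assumed from the card title, and the tokenizer choice follows the OpenELM convention of reusing the Llama 2 tokenizer (an assumption for this checkpoint):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repo id assumed from the card title; OpenELM checkpoints require trust_remote_code=True.
repo = "OpenELM-1_1B-DPO-full-2-5"
model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype=torch.bfloat16, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")  # assumed tokenizer

inputs = tokenizer("Once upon a time", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```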