
zephyr-7b-beta-qlora-generation-3.0

This model is a fine-tuned version of HuggingFaceH4/zephyr-7b-beta on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: nan (validation loss was logged as NaN at every evaluation step; see the training results table)
  • Exact match (%): 35.1001
  • Wrong, but certain (%): 56.1553
  • Wrong, but not sure (%): 8.7445
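
The three percentages sum to roughly 100, which suggests every prediction is bucketed into exactly one category. The card does not define the categories, so the sketch below is an assumption: an incorrect answer counts as "wrong, but not sure" when it contains an explicit uncertainty phrase; the `abstain_markers` list and both function names are hypothetical.

```python
def categorize(prediction: str, reference: str,
               abstain_markers=("not sure", "i don't know", "cannot answer")) -> str:
    """Bucket one model answer into the card's three categories.

    Assumption: the card does not define the metrics; this treats an
    uncertainty phrase in a wrong answer as "wrong, but not sure".
    """
    if prediction.strip().lower() == reference.strip().lower():
        return "exact_match"
    if any(marker in prediction.lower() for marker in abstain_markers):
        return "wrong_not_sure"
    return "wrong_certain"


def metric_percentages(pairs):
    """Return {category: percent} over (prediction, reference) pairs."""
    counts = {"exact_match": 0, "wrong_certain": 0, "wrong_not_sure": 0}
    for pred, ref in pairs:
        counts[categorize(pred, ref)] += 1
    total = len(pairs)
    return {k: 100.0 * v / total for k, v in counts.items()}
```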

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 4
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 150
  • mixed_precision_training: Native AMP
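
The listed total train batch size follows from the per-device batch size and gradient accumulation. A quick sanity check of the figures above (assuming a single device, which the listed total is consistent with):

```python
# Hyperparameters taken from the list above.
train_batch_size = 4             # per-device batch size
gradient_accumulation_steps = 4
num_devices = 1                  # assumption: single GPU, consistent with the total below

# Effective batch size: examples contributing to one optimizer step.
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
print(total_train_batch_size)    # matches the card's total_train_batch_size of 16
```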

Training results

Training Loss Epoch Step Validation Loss Exact match (%) Wrong, but certain (%) Wrong, but not sure (%)
3.31 0.9929 35 nan 37.0298 26.6243 36.3459
2.9673 1.9858 70 nan 38.7640 22.3742 38.8617
2.6967 2.9787 105 nan 39.6922 23.8886 36.4191
2.3707 4.0 141 nan 40.2540 22.7894 36.9565
2.1356 4.9929 176 nan 39.3014 24.2062 36.4924
1.7579 5.9858 211 nan 38.2511 27.1373 34.6116
1.394 6.9787 246 nan 37.1764 30.6546 32.1690
1.0519 8.0 282 nan 36.8588 32.9995 30.1417
0.8315 8.9929 317 nan 35.6131 36.4191 27.9678
0.6824 9.9858 352 nan 35.1734 41.7440 23.0826
0.5828 10.9787 387 nan 34.9047 39.4235 25.6717
0.5147 12.0 423 nan 35.8329 42.3302 21.8368
0.5027 12.9929 458 nan 34.8559 43.8447 21.2995
0.4697 13.9858 493 nan 35.1979 43.5515 21.2506
0.4571 14.9787 528 nan 35.2711 45.0660 19.6629
0.4346 16.0 564 nan 35.1246 47.5818 17.2936
0.4441 16.9929 599 nan 35.9551 46.3605 17.6844
0.4208 17.9858 634 nan 35.5398 48.1436 16.3166
0.4163 18.9787 669 nan 35.5887 47.5574 16.8539
0.4005 20.0 705 nan 36.1260 49.6092 14.2648
0.3931 20.9929 740 nan 35.5642 49.9267 14.5090
0.3744 21.9858 775 nan 35.7841 50.4397 13.7763
0.3637 22.9787 810 nan 35.1979 51.2457 13.5564
0.3516 24.0 846 nan 35.6131 51.6854 12.7015
0.3548 24.9929 881 nan 35.2956 51.8808 12.8236
0.3537 25.9858 916 nan 35.2956 51.5877 13.1168
0.3469 26.9787 951 nan 35.3688 52.4915 12.1397
0.3283 28.0 987 nan 35.2956 52.2716 12.4328
0.3368 28.9929 1022 nan 35.7841 52.4670 11.7489
0.3394 29.9858 1057 nan 35.3444 52.2472 12.4084
0.3339 30.9787 1092 nan 35.4665 52.1251 12.4084
0.3238 32.0 1128 nan 35.4177 52.9311 11.6512
0.3337 32.9929 1163 nan 35.3444 53.6150 11.0405
0.3341 33.9858 1198 nan 35.9306 53.1510 10.9184
0.3285 34.9787 1233 nan 35.6864 53.2975 11.0161
0.3204 36.0 1269 nan 35.6375 53.0532 11.3092
0.3324 36.9929 1304 nan 35.3933 53.6150 10.9917
0.324 37.9858 1339 nan 35.7108 53.8593 10.4299
0.3292 38.9787 1374 nan 35.3688 53.4929 11.1383
0.3206 40.0 1410 nan 35.2956 54.2501 10.4543
0.325 40.9929 1445 nan 35.4910 54.2257 10.2833
0.3284 41.9858 1480 nan 35.3200 54.3478 10.3322
0.3228 42.9787 1515 nan 35.4421 53.4685 11.0894
0.316 44.0 1551 nan 35.0024 54.2745 10.7230
0.3242 44.9929 1586 nan 35.3933 54.2257 10.3810
0.321 45.9858 1621 nan 35.3688 54.3478 10.2833
0.3209 46.9787 1656 nan 35.4910 53.9814 10.5276
0.3125 48.0 1692 nan 35.8329 54.3234 9.8437
0.3275 48.9929 1727 nan 35.3444 55.0562 9.5994
0.3198 49.9858 1762 nan 35.4665 55.0562 9.4773
0.3281 50.9787 1797 nan 35.3200 54.8363 9.8437
0.3091 52.0 1833 nan 35.6864 54.4700 9.8437
0.3242 52.9929 1868 nan 35.4421 54.7142 9.8437
0.3222 53.9858 1903 nan 36.2237 53.5662 10.2101
0.3223 54.9787 1938 nan 36.0772 54.3478 9.5750
0.3147 56.0 1974 nan 35.5887 55.1539 9.2574
0.3157 56.9929 2009 nan 35.4177 55.0562 9.5261
0.3183 57.9858 2044 nan 35.2223 55.4470 9.3307
0.3132 58.9787 2079 nan 35.6864 54.8852 9.4284
0.3052 60.0 2115 nan 35.5887 55.0073 9.4040
0.3139 60.9929 2150 nan 35.3200 55.1295 9.5506
0.3186 61.9858 2185 nan 35.6619 54.7875 9.5506
0.3172 62.9787 2220 nan 34.8559 55.1050 10.0391
0.3114 64.0 2256 nan 35.1979 55.2760 9.5261
0.3159 64.9929 2291 nan 35.1734 55.2272 9.5994
0.3179 65.9858 2326 nan 35.2711 55.4958 9.2330
0.3264 66.9787 2361 nan 35.3200 54.9829 9.6971
0.3061 68.0 2397 nan 35.0269 55.2516 9.7215
0.3175 68.9929 2432 nan 34.9536 55.5203 9.5261
0.3179 69.9858 2467 nan 34.8803 55.3004 9.8192
0.3174 70.9787 2502 nan 35.2223 54.8852 9.8925
0.3033 72.0 2538 nan 34.6116 55.5936 9.7948
0.3127 72.9929 2573 nan 35.0757 55.3249 9.5994
0.3134 73.9858 2608 nan 34.9047 55.5691 9.5261
0.3155 74.9787 2643 nan 35.2956 55.3981 9.3063
0.3088 76.0 2679 nan 35.1246 55.5936 9.2819
0.3162 76.9929 2714 nan 35.1001 55.6668 9.2330
0.315 77.9858 2749 nan 35.1979 55.4958 9.3063
0.315 78.9787 2784 nan 35.2467 55.5936 9.1597
0.3035 80.0 2820 nan 34.9780 55.7157 9.3063
0.3127 80.9929 2855 nan 34.9536 55.6424 9.4040
0.3105 81.9858 2890 nan 34.9292 55.6180 9.4529
0.3137 82.9787 2925 nan 34.9292 55.8134 9.2574
0.3027 84.0 2961 nan 34.7338 56.0821 9.1842
0.3137 84.9929 2996 nan 35.0513 55.8622 9.0865
0.3137 85.9858 3031 nan 35.0513 55.7157 9.2330
0.3125 86.9787 3066 nan 35.1490 55.7401 9.1109
0.3022 88.0 3102 nan 34.8803 55.9355 9.1842
0.3104 88.9929 3137 nan 35.2711 55.8378 8.8911
0.3164 89.9858 3172 nan 35.0757 55.9844 8.9399
0.3159 90.9787 3207 nan 35.2223 55.7157 9.0620
0.3071 92.0 3243 nan 35.5642 55.4226 9.0132
0.3114 92.9929 3278 nan 35.3200 55.3004 9.3796
0.309 93.9858 3313 nan 35.5642 55.4226 9.0132
0.3152 94.9787 3348 nan 35.1979 55.6913 9.1109
0.3054 96.0 3384 nan 35.2711 55.6913 9.0376
0.3086 96.9929 3419 nan 35.3444 55.7890 8.8666
0.3109 97.9858 3454 nan 35.3444 55.5936 9.0620
0.3086 98.9787 3489 nan 35.2223 55.8622 8.9155
0.3044 100.0 3525 nan 35.2223 55.8134 8.9643
0.309 100.9929 3560 nan 35.1734 56.1309 8.6957
0.309 101.9858 3595 nan 35.1490 55.9355 8.9155
0.3099 102.9787 3630 nan 35.1979 56.0088 8.7934
0.3041 104.0 3666 nan 35.1246 56.0088 8.8666
0.3098 104.9929 3701 nan 35.0757 56.1553 8.7689
0.3102 105.9858 3736 nan 35.2711 55.8867 8.8422
0.3103 106.9787 3771 nan 35.1246 56.1065 8.7689
0.3002 108.0 3807 nan 35.1734 56.1065 8.7201
0.3107 108.9929 3842 nan 35.1490 56.1065 8.7445
0.3089 109.9858 3877 nan 35.4177 55.8867 8.6957
0.307 110.9787 3912 nan 35.2711 55.8867 8.8422
0.3022 112.0 3948 nan 35.2956 56.1065 8.5979
0.3084 112.9929 3983 nan 35.2223 56.1065 8.6712
0.3097 113.9858 4018 nan 35.2223 56.2531 8.5247
0.3097 114.9787 4053 nan 35.2956 56.2531 8.4514
0.3012 116.0 4089 nan 35.2711 56.3263 8.4025
0.3097 116.9929 4124 nan 35.3200 56.3508 8.3293
0.3095 117.9858 4159 nan 35.1734 56.3263 8.5002
0.3093 118.9787 4194 nan 35.2711 56.2286 8.5002
0.2987 120.0 4230 nan 35.1734 56.3996 8.4270
0.3075 120.9929 4265 nan 35.2467 56.3263 8.4270
0.3071 121.9858 4300 nan 35.1490 56.3752 8.4758
0.3078 122.9787 4335 nan 35.2223 56.2531 8.5247
0.2999 124.0 4371 nan 35.2467 56.3752 8.3781
0.307 124.9929 4406 nan 35.1246 56.1309 8.7445
0.3069 125.9858 4441 nan 35.2223 56.2531 8.5247
0.3081 126.9787 4476 nan 35.1001 56.3263 8.5735
0.2958 128.0 4512 nan 35.1979 56.0332 8.7689
0.3078 128.9929 4547 nan 35.0757 56.3263 8.5979
0.3032 129.9858 4582 nan 35.0513 56.4485 8.5002
0.311 130.9787 4617 nan 35.1734 56.3019 8.5247
0.2955 132.0 4653 nan 35.0757 56.2775 8.6468
0.3069 132.9929 4688 nan 35.0513 56.2042 8.7445
0.3055 133.9858 4723 nan 35.0269 56.3019 8.6712
0.3074 134.9787 4758 nan 34.9536 56.3996 8.6468
0.3004 136.0 4794 nan 35.0513 56.3263 8.6224
0.3059 136.9929 4829 nan 35.0024 56.4240 8.5735
0.3034 137.9858 4864 nan 35.0513 56.3508 8.5979
0.3055 138.9787 4899 nan 35.0269 56.4729 8.5002
0.2959 140.0 4935 nan 35.0024 56.3263 8.6712
0.3088 140.9929 4970 nan 35.0269 56.3019 8.6712
0.3021 141.9858 5005 nan 35.0024 56.4240 8.5735
0.3024 142.9787 5040 nan 35.1490 56.3019 8.5491
0.2982 144.0 5076 nan 35.0757 56.3263 8.5979
0.3038 144.9929 5111 nan 35.0513 56.1798 8.7689
0.3063 145.9858 5146 nan 35.0513 56.1553 8.7934
0.3059 146.9787 5181 nan 35.1001 56.1553 8.7445
0.2986 148.0 5217 nan 35.0513 56.2042 8.7445
0.3014 148.9362 5250 nan 35.1001 56.1553 8.7445
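
Exact match peaks early (about 40% near epoch 4) while training loss keeps falling and "wrong, but certain" climbs, a pattern consistent with overfitting under long training. A minimal sketch for picking the best checkpoint by exact match, with a handful of rows transcribed from the table above (not the full run):

```python
# (epoch, step, exact_match_pct) for a few checkpoints from the table above.
rows = [
    (0.9929, 35, 37.0298),
    (1.9858, 70, 38.7640),
    (2.9787, 105, 39.6922),
    (4.0, 141, 40.2540),
    (8.0, 282, 36.8588),
    (148.9362, 5250, 35.1001),
]

# Checkpoint with the highest exact match among the transcribed rows.
best = max(rows, key=lambda r: r[2])
print(best)  # (4.0, 141, 40.254)
```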

Framework versions

  • PEFT 0.13.2
  • Transformers 4.44.2
  • PyTorch 2.4.1+cu121
  • Datasets 3.0.1
  • Tokenizers 0.19.1
Model tree for clrksnbot/zephyr-7b-beta-qlora-generation-3.0
