zephyr-7b-beta-qlora-generation-3.0
This model is a fine-tuned version of HuggingFaceH4/zephyr-7b-beta on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: nan
- Exact match (%): 35.1001
- Wrong, but certain (%): 56.1553
- Wrong, but not sure (%): 8.7445
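The three percentages above sum to 100 up to rounding, which suggests they partition the evaluation set. The card does not define the categories, so the sketch below assumes they are mutually exclusive outcomes; the variable names are illustrative only.

```python
# Final evaluation metrics as reported in this card (percent of evaluation examples).
final_metrics = {
    "exact_match": 35.1001,
    "wrong_but_certain": 56.1553,
    "wrong_but_not_sure": 8.7445,
}

# If the categories partition the evaluation set, they should sum to ~100.
total = sum(final_metrics.values())

# Share of the incorrect answers that fall in the "certain" bucket.
wrong_total = final_metrics["wrong_but_certain"] + final_metrics["wrong_but_not_sure"]
certain_share_of_wrong = final_metrics["wrong_but_certain"] / wrong_total

print(f"sum of categories: {total:.4f}")
print(f"share of wrong answers marked certain: {certain_share_of_wrong:.3f}")
```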
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 4
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 150
- mixed_precision_training: Native AMP
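The reported total_train_batch_size follows from the per-device batch size and the gradient accumulation steps. The sketch below reproduces that arithmetic and a linear learning-rate decay. It assumes a single device and zero warmup steps (the card states neither), and takes the 5250 total optimizer steps from the last row of the training results table. `linear_lr` is a hypothetical helper written for illustration, not a library function.

```python
train_batch_size = 4
gradient_accumulation_steps = 4
num_devices = 1  # assumption: the card does not state the device count

# Effective number of examples per optimizer step.
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices


def linear_lr(step: int, total_steps: int, base_lr: float = 1e-5) -> float:
    """Linear decay from base_lr to 0, assuming zero warmup steps."""
    return base_lr * max(0.0, 1.0 - step / total_steps)


total_steps = 5250  # final step in the training results table
print(total_train_batch_size)
for step in (0, 2625, 5250):
    print(step, linear_lr(step, total_steps))
```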
Training results
Training Loss | Epoch | Step | Validation Loss | Exact match (%) | Wrong, but certain (%) | Wrong, but not sure (%) |
---|---|---|---|---|---|---|
3.31 | 0.9929 | 35 | nan | 37.0298 | 26.6243 | 36.3459 |
2.9673 | 1.9858 | 70 | nan | 38.7640 | 22.3742 | 38.8617 |
2.6967 | 2.9787 | 105 | nan | 39.6922 | 23.8886 | 36.4191 |
2.3707 | 4.0 | 141 | nan | 40.2540 | 22.7894 | 36.9565 |
2.1356 | 4.9929 | 176 | nan | 39.3014 | 24.2062 | 36.4924 |
1.7579 | 5.9858 | 211 | nan | 38.2511 | 27.1373 | 34.6116 |
1.394 | 6.9787 | 246 | nan | 37.1764 | 30.6546 | 32.1690 |
1.0519 | 8.0 | 282 | nan | 36.8588 | 32.9995 | 30.1417 |
0.8315 | 8.9929 | 317 | nan | 35.6131 | 36.4191 | 27.9678 |
0.6824 | 9.9858 | 352 | nan | 35.1734 | 41.7440 | 23.0826 |
0.5828 | 10.9787 | 387 | nan | 34.9047 | 39.4235 | 25.6717 |
0.5147 | 12.0 | 423 | nan | 35.8329 | 42.3302 | 21.8368 |
0.5027 | 12.9929 | 458 | nan | 34.8559 | 43.8447 | 21.2995 |
0.4697 | 13.9858 | 493 | nan | 35.1979 | 43.5515 | 21.2506 |
0.4571 | 14.9787 | 528 | nan | 35.2711 | 45.0660 | 19.6629 |
0.4346 | 16.0 | 564 | nan | 35.1246 | 47.5818 | 17.2936 |
0.4441 | 16.9929 | 599 | nan | 35.9551 | 46.3605 | 17.6844 |
0.4208 | 17.9858 | 634 | nan | 35.5398 | 48.1436 | 16.3166 |
0.4163 | 18.9787 | 669 | nan | 35.5887 | 47.5574 | 16.8539 |
0.4005 | 20.0 | 705 | nan | 36.1260 | 49.6092 | 14.2648 |
0.3931 | 20.9929 | 740 | nan | 35.5642 | 49.9267 | 14.5090 |
0.3744 | 21.9858 | 775 | nan | 35.7841 | 50.4397 | 13.7763 |
0.3637 | 22.9787 | 810 | nan | 35.1979 | 51.2457 | 13.5564 |
0.3516 | 24.0 | 846 | nan | 35.6131 | 51.6854 | 12.7015 |
0.3548 | 24.9929 | 881 | nan | 35.2956 | 51.8808 | 12.8236 |
0.3537 | 25.9858 | 916 | nan | 35.2956 | 51.5877 | 13.1168 |
0.3469 | 26.9787 | 951 | nan | 35.3688 | 52.4915 | 12.1397 |
0.3283 | 28.0 | 987 | nan | 35.2956 | 52.2716 | 12.4328 |
0.3368 | 28.9929 | 1022 | nan | 35.7841 | 52.4670 | 11.7489 |
0.3394 | 29.9858 | 1057 | nan | 35.3444 | 52.2472 | 12.4084 |
0.3339 | 30.9787 | 1092 | nan | 35.4665 | 52.1251 | 12.4084 |
0.3238 | 32.0 | 1128 | nan | 35.4177 | 52.9311 | 11.6512 |
0.3337 | 32.9929 | 1163 | nan | 35.3444 | 53.6150 | 11.0405 |
0.3341 | 33.9858 | 1198 | nan | 35.9306 | 53.1510 | 10.9184 |
0.3285 | 34.9787 | 1233 | nan | 35.6864 | 53.2975 | 11.0161 |
0.3204 | 36.0 | 1269 | nan | 35.6375 | 53.0532 | 11.3092 |
0.3324 | 36.9929 | 1304 | nan | 35.3933 | 53.6150 | 10.9917 |
0.324 | 37.9858 | 1339 | nan | 35.7108 | 53.8593 | 10.4299 |
0.3292 | 38.9787 | 1374 | nan | 35.3688 | 53.4929 | 11.1383 |
0.3206 | 40.0 | 1410 | nan | 35.2956 | 54.2501 | 10.4543 |
0.325 | 40.9929 | 1445 | nan | 35.4910 | 54.2257 | 10.2833 |
0.3284 | 41.9858 | 1480 | nan | 35.3200 | 54.3478 | 10.3322 |
0.3228 | 42.9787 | 1515 | nan | 35.4421 | 53.4685 | 11.0894 |
0.316 | 44.0 | 1551 | nan | 35.0024 | 54.2745 | 10.7230 |
0.3242 | 44.9929 | 1586 | nan | 35.3933 | 54.2257 | 10.3810 |
0.321 | 45.9858 | 1621 | nan | 35.3688 | 54.3478 | 10.2833 |
0.3209 | 46.9787 | 1656 | nan | 35.4910 | 53.9814 | 10.5276 |
0.3125 | 48.0 | 1692 | nan | 35.8329 | 54.3234 | 9.8437 |
0.3275 | 48.9929 | 1727 | nan | 35.3444 | 55.0562 | 9.5994 |
0.3198 | 49.9858 | 1762 | nan | 35.4665 | 55.0562 | 9.4773 |
0.3281 | 50.9787 | 1797 | nan | 35.3200 | 54.8363 | 9.8437 |
0.3091 | 52.0 | 1833 | nan | 35.6864 | 54.4700 | 9.8437 |
0.3242 | 52.9929 | 1868 | nan | 35.4421 | 54.7142 | 9.8437 |
0.3222 | 53.9858 | 1903 | nan | 36.2237 | 53.5662 | 10.2101 |
0.3223 | 54.9787 | 1938 | nan | 36.0772 | 54.3478 | 9.5750 |
0.3147 | 56.0 | 1974 | nan | 35.5887 | 55.1539 | 9.2574 |
0.3157 | 56.9929 | 2009 | nan | 35.4177 | 55.0562 | 9.5261 |
0.3183 | 57.9858 | 2044 | nan | 35.2223 | 55.4470 | 9.3307 |
0.3132 | 58.9787 | 2079 | nan | 35.6864 | 54.8852 | 9.4284 |
0.3052 | 60.0 | 2115 | nan | 35.5887 | 55.0073 | 9.4040 |
0.3139 | 60.9929 | 2150 | nan | 35.3200 | 55.1295 | 9.5506 |
0.3186 | 61.9858 | 2185 | nan | 35.6619 | 54.7875 | 9.5506 |
0.3172 | 62.9787 | 2220 | nan | 34.8559 | 55.1050 | 10.0391 |
0.3114 | 64.0 | 2256 | nan | 35.1979 | 55.2760 | 9.5261 |
0.3159 | 64.9929 | 2291 | nan | 35.1734 | 55.2272 | 9.5994 |
0.3179 | 65.9858 | 2326 | nan | 35.2711 | 55.4958 | 9.2330 |
0.3264 | 66.9787 | 2361 | nan | 35.3200 | 54.9829 | 9.6971 |
0.3061 | 68.0 | 2397 | nan | 35.0269 | 55.2516 | 9.7215 |
0.3175 | 68.9929 | 2432 | nan | 34.9536 | 55.5203 | 9.5261 |
0.3179 | 69.9858 | 2467 | nan | 34.8803 | 55.3004 | 9.8192 |
0.3174 | 70.9787 | 2502 | nan | 35.2223 | 54.8852 | 9.8925 |
0.3033 | 72.0 | 2538 | nan | 34.6116 | 55.5936 | 9.7948 |
0.3127 | 72.9929 | 2573 | nan | 35.0757 | 55.3249 | 9.5994 |
0.3134 | 73.9858 | 2608 | nan | 34.9047 | 55.5691 | 9.5261 |
0.3155 | 74.9787 | 2643 | nan | 35.2956 | 55.3981 | 9.3063 |
0.3088 | 76.0 | 2679 | nan | 35.1246 | 55.5936 | 9.2819 |
0.3162 | 76.9929 | 2714 | nan | 35.1001 | 55.6668 | 9.2330 |
0.315 | 77.9858 | 2749 | nan | 35.1979 | 55.4958 | 9.3063 |
0.315 | 78.9787 | 2784 | nan | 35.2467 | 55.5936 | 9.1597 |
0.3035 | 80.0 | 2820 | nan | 34.9780 | 55.7157 | 9.3063 |
0.3127 | 80.9929 | 2855 | nan | 34.9536 | 55.6424 | 9.4040 |
0.3105 | 81.9858 | 2890 | nan | 34.9292 | 55.6180 | 9.4529 |
0.3137 | 82.9787 | 2925 | nan | 34.9292 | 55.8134 | 9.2574 |
0.3027 | 84.0 | 2961 | nan | 34.7338 | 56.0821 | 9.1842 |
0.3137 | 84.9929 | 2996 | nan | 35.0513 | 55.8622 | 9.0865 |
0.3137 | 85.9858 | 3031 | nan | 35.0513 | 55.7157 | 9.2330 |
0.3125 | 86.9787 | 3066 | nan | 35.1490 | 55.7401 | 9.1109 |
0.3022 | 88.0 | 3102 | nan | 34.8803 | 55.9355 | 9.1842 |
0.3104 | 88.9929 | 3137 | nan | 35.2711 | 55.8378 | 8.8911 |
0.3164 | 89.9858 | 3172 | nan | 35.0757 | 55.9844 | 8.9399 |
0.3159 | 90.9787 | 3207 | nan | 35.2223 | 55.7157 | 9.0620 |
0.3071 | 92.0 | 3243 | nan | 35.5642 | 55.4226 | 9.0132 |
0.3114 | 92.9929 | 3278 | nan | 35.3200 | 55.3004 | 9.3796 |
0.309 | 93.9858 | 3313 | nan | 35.5642 | 55.4226 | 9.0132 |
0.3152 | 94.9787 | 3348 | nan | 35.1979 | 55.6913 | 9.1109 |
0.3054 | 96.0 | 3384 | nan | 35.2711 | 55.6913 | 9.0376 |
0.3086 | 96.9929 | 3419 | nan | 35.3444 | 55.7890 | 8.8666 |
0.3109 | 97.9858 | 3454 | nan | 35.3444 | 55.5936 | 9.0620 |
0.3086 | 98.9787 | 3489 | nan | 35.2223 | 55.8622 | 8.9155 |
0.3044 | 100.0 | 3525 | nan | 35.2223 | 55.8134 | 8.9643 |
0.309 | 100.9929 | 3560 | nan | 35.1734 | 56.1309 | 8.6957 |
0.309 | 101.9858 | 3595 | nan | 35.1490 | 55.9355 | 8.9155 |
0.3099 | 102.9787 | 3630 | nan | 35.1979 | 56.0088 | 8.7934 |
0.3041 | 104.0 | 3666 | nan | 35.1246 | 56.0088 | 8.8666 |
0.3098 | 104.9929 | 3701 | nan | 35.0757 | 56.1553 | 8.7689 |
0.3102 | 105.9858 | 3736 | nan | 35.2711 | 55.8867 | 8.8422 |
0.3103 | 106.9787 | 3771 | nan | 35.1246 | 56.1065 | 8.7689 |
0.3002 | 108.0 | 3807 | nan | 35.1734 | 56.1065 | 8.7201 |
0.3107 | 108.9929 | 3842 | nan | 35.1490 | 56.1065 | 8.7445 |
0.3089 | 109.9858 | 3877 | nan | 35.4177 | 55.8867 | 8.6957 |
0.307 | 110.9787 | 3912 | nan | 35.2711 | 55.8867 | 8.8422 |
0.3022 | 112.0 | 3948 | nan | 35.2956 | 56.1065 | 8.5979 |
0.3084 | 112.9929 | 3983 | nan | 35.2223 | 56.1065 | 8.6712 |
0.3097 | 113.9858 | 4018 | nan | 35.2223 | 56.2531 | 8.5247 |
0.3097 | 114.9787 | 4053 | nan | 35.2956 | 56.2531 | 8.4514 |
0.3012 | 116.0 | 4089 | nan | 35.2711 | 56.3263 | 8.4025 |
0.3097 | 116.9929 | 4124 | nan | 35.3200 | 56.3508 | 8.3293 |
0.3095 | 117.9858 | 4159 | nan | 35.1734 | 56.3263 | 8.5002 |
0.3093 | 118.9787 | 4194 | nan | 35.2711 | 56.2286 | 8.5002 |
0.2987 | 120.0 | 4230 | nan | 35.1734 | 56.3996 | 8.4270 |
0.3075 | 120.9929 | 4265 | nan | 35.2467 | 56.3263 | 8.4270 |
0.3071 | 121.9858 | 4300 | nan | 35.1490 | 56.3752 | 8.4758 |
0.3078 | 122.9787 | 4335 | nan | 35.2223 | 56.2531 | 8.5247 |
0.2999 | 124.0 | 4371 | nan | 35.2467 | 56.3752 | 8.3781 |
0.307 | 124.9929 | 4406 | nan | 35.1246 | 56.1309 | 8.7445 |
0.3069 | 125.9858 | 4441 | nan | 35.2223 | 56.2531 | 8.5247 |
0.3081 | 126.9787 | 4476 | nan | 35.1001 | 56.3263 | 8.5735 |
0.2958 | 128.0 | 4512 | nan | 35.1979 | 56.0332 | 8.7689 |
0.3078 | 128.9929 | 4547 | nan | 35.0757 | 56.3263 | 8.5979 |
0.3032 | 129.9858 | 4582 | nan | 35.0513 | 56.4485 | 8.5002 |
0.311 | 130.9787 | 4617 | nan | 35.1734 | 56.3019 | 8.5247 |
0.2955 | 132.0 | 4653 | nan | 35.0757 | 56.2775 | 8.6468 |
0.3069 | 132.9929 | 4688 | nan | 35.0513 | 56.2042 | 8.7445 |
0.3055 | 133.9858 | 4723 | nan | 35.0269 | 56.3019 | 8.6712 |
0.3074 | 134.9787 | 4758 | nan | 34.9536 | 56.3996 | 8.6468 |
0.3004 | 136.0 | 4794 | nan | 35.0513 | 56.3263 | 8.6224 |
0.3059 | 136.9929 | 4829 | nan | 35.0024 | 56.4240 | 8.5735 |
0.3034 | 137.9858 | 4864 | nan | 35.0513 | 56.3508 | 8.5979 |
0.3055 | 138.9787 | 4899 | nan | 35.0269 | 56.4729 | 8.5002 |
0.2959 | 140.0 | 4935 | nan | 35.0024 | 56.3263 | 8.6712 |
0.3088 | 140.9929 | 4970 | nan | 35.0269 | 56.3019 | 8.6712 |
0.3021 | 141.9858 | 5005 | nan | 35.0024 | 56.4240 | 8.5735 |
0.3024 | 142.9787 | 5040 | nan | 35.1490 | 56.3019 | 8.5491 |
0.2982 | 144.0 | 5076 | nan | 35.0757 | 56.3263 | 8.5979 |
0.3038 | 144.9929 | 5111 | nan | 35.0513 | 56.1798 | 8.7689 |
0.3063 | 145.9858 | 5146 | nan | 35.0513 | 56.1553 | 8.7934 |
0.3059 | 146.9787 | 5181 | nan | 35.1001 | 56.1553 | 8.7445 |
0.2986 | 148.0 | 5217 | nan | 35.0513 | 56.2042 | 8.7445 |
0.3014 | 148.9362 | 5250 | nan | 35.1001 | 56.1553 | 8.7445 |
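To compare checkpoints, the table can be scanned for the row with the highest exact match. The sketch below uses a handful of rows copied from the table rather than the full log. It also checks the step-to-epoch relationship: the logged epochs are consistent with 35.25 optimizer steps per epoch (141 steps at epoch 4.0), which is an inference from the table and not something the card states.

```python
# Columns: training_loss, epoch, step, exact_match, wrong_certain, wrong_not_sure
rows = [
    (3.31,   0.9929,   35,   37.0298, 26.6243, 36.3459),
    (2.3707, 4.0,      141,  40.2540, 22.7894, 36.9565),
    (0.6824, 9.9858,   352,  35.1734, 41.7440, 23.0826),
    (0.3044, 100.0,    3525, 35.2223, 55.8134, 8.9643),
    (0.3014, 148.9362, 5250, 35.1001, 56.1553, 8.7445),
]

# Row with the highest exact match among the sampled checkpoints.
best = max(rows, key=lambda r: r[3])
print(f"best exact match in sample: {best[3]:.4f}% at step {best[2]} (epoch {best[1]})")

# Inferred from the table: 141 optimizer steps correspond to epoch 4.0.
steps_per_epoch = 141 / 4.0
for _, epoch, step, *_ in rows:
    # Logged epochs are rounded to four decimals, hence the tolerance.
    assert abs(step / steps_per_epoch - epoch) < 1e-3
```

In these rows, exact match peaks at step 141, which is also the highest value in the full table, while the "Wrong, but certain" share keeps growing in later epochs. The same ratio accounts for the final logged epoch of 148.9362 rather than 150, since 5250 / 35.25 is about 148.936.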
Framework versions
- PEFT 0.13.2
- Transformers 4.44.2
- PyTorch 2.4.1+cu121
- Datasets 3.0.1
- Tokenizers 0.19.1
Model tree for clrksnbot/zephyr-7b-beta-qlora-generation-3.0
Base model
mistralai/Mistral-7B-v0.1
Finetuned
HuggingFaceH4/zephyr-7b-beta