genz_model

This model was trained from scratch on an unspecified dataset (the dataset field of this card was left empty). After the final epoch (epoch 50, step 2050) it achieves the following results on the evaluation set:

  • Loss: 1.2536
  • Bleu: 40.0734
  • Gen Len: 15.8667

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 50
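
The settings above can be sketched as keyword arguments for the Hugging Face `Seq2SeqTrainingArguments` class. This is a minimal sketch and not the original training script: the mapping of `train_batch_size` and `eval_batch_size` to per-device values, the `output_dir` name, per-epoch evaluation, and `predict_with_generate=True` are assumptions inferred from the card. The `linear_lr` helper is likewise illustrative and assumes zero warmup steps, since none are listed.

```python
# Hyperparameters copied from the list above.
HYPERPARAMS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,  # assumed mapping of train_batch_size
    "per_device_eval_batch_size": 16,   # assumed mapping of eval_batch_size
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 50,
}


def build_training_args(output_dir="genz_model"):
    """Build training arguments; requires the `transformers` package."""
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(
        output_dir=output_dir,         # hypothetical output directory
        evaluation_strategy="epoch",   # assumption; name used in Transformers 4.31
        predict_with_generate=True,    # assumption; needed for BLEU and Gen Len
        **HYPERPARAMS,
    )


def linear_lr(step, total_steps=2050, base_lr=2e-5, warmup_steps=0):
    """Learning rate under a linear schedule (warmup_steps=0 is an assumption)."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)
```

Under these assumptions the learning rate falls from 2e-05 at step 0 to roughly 1e-05 at step 1025 (end of epoch 25) and reaches 0 at step 2050.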

Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
|---|---|---|---|---|---|
| No log | 1.0 | 41 | 1.9667 | 16.4087 | 16.3333 |
| No log | 2.0 | 82 | 1.8242 | 30.3437 | 15.4788 |
| No log | 3.0 | 123 | 1.7376 | 35.0542 | 15.6545 |
| No log | 4.0 | 164 | 1.6830 | 36.3815 | 15.9091 |
| No log | 5.0 | 205 | 1.6438 | 37.3325 | 15.9212 |
| No log | 6.0 | 246 | 1.6052 | 37.5162 | 16.0364 |
| No log | 7.0 | 287 | 1.5723 | 37.5334 | 16.097 |
| No log | 8.0 | 328 | 1.5484 | 38.2319 | 16.1152 |
| No log | 9.0 | 369 | 1.5249 | 38.3884 | 16.1455 |
| No log | 10.0 | 410 | 1.5040 | 38.4443 | 16.1394 |
| No log | 11.0 | 451 | 1.4852 | 38.8279 | 16.1879 |
| No log | 12.0 | 492 | 1.4706 | 39.4717 | 16.0424 |
| 1.7321 | 13.0 | 533 | 1.4525 | 39.6365 | 16.103 |
| 1.7321 | 14.0 | 574 | 1.4361 | 39.7667 | 16.0545 |
| 1.7321 | 15.0 | 615 | 1.4237 | 39.934 | 16.0182 |
| 1.7321 | 16.0 | 656 | 1.4084 | 39.8808 | 16.0606 |
| 1.7321 | 17.0 | 697 | 1.4013 | 39.958 | 16.0606 |
| 1.7321 | 18.0 | 738 | 1.3875 | 39.4972 | 16.0788 |
| 1.7321 | 19.0 | 779 | 1.3770 | 39.4976 | 15.9394 |
| 1.7321 | 20.0 | 820 | 1.3681 | 39.4927 | 15.9818 |
| 1.7321 | 21.0 | 861 | 1.3592 | 39.8584 | 15.9818 |
| 1.7321 | 22.0 | 902 | 1.3512 | 39.9409 | 15.9515 |
| 1.7321 | 23.0 | 943 | 1.3414 | 39.8891 | 15.9576 |
| 1.7321 | 24.0 | 984 | 1.3367 | 40.0053 | 15.9576 |
| 1.3831 | 25.0 | 1025 | 1.3298 | 39.9729 | 15.9636 |
| 1.3831 | 26.0 | 1066 | 1.3231 | 40.0029 | 15.9333 |
| 1.3831 | 27.0 | 1107 | 1.3157 | 39.9874 | 15.9394 |
| 1.3831 | 28.0 | 1148 | 1.3093 | 39.8156 | 15.9152 |
| 1.3831 | 29.0 | 1189 | 1.3051 | 40.1371 | 15.9152 |
| 1.3831 | 30.0 | 1230 | 1.3006 | 40.0601 | 15.897 |
| 1.3831 | 31.0 | 1271 | 1.2950 | 40.2356 | 15.8727 |
| 1.3831 | 32.0 | 1312 | 1.2899 | 40.3369 | 15.8848 |
| 1.3831 | 33.0 | 1353 | 1.2871 | 40.452 | 15.8667 |
| 1.3831 | 34.0 | 1394 | 1.2836 | 40.5232 | 15.8364 |
| 1.3831 | 35.0 | 1435 | 1.2804 | 40.455 | 15.8485 |
| 1.3831 | 36.0 | 1476 | 1.2768 | 40.4874 | 15.8485 |
| 1.2414 | 37.0 | 1517 | 1.2728 | 40.5694 | 15.8424 |
| 1.2414 | 38.0 | 1558 | 1.2692 | 40.4767 | 15.8424 |
| 1.2414 | 39.0 | 1599 | 1.2679 | 40.5449 | 15.8424 |
| 1.2414 | 40.0 | 1640 | 1.2650 | 40.5121 | 15.8667 |
| 1.2414 | 41.0 | 1681 | 1.2625 | 40.0705 | 15.8545 |
| 1.2414 | 42.0 | 1722 | 1.2604 | 40.056 | 15.8545 |
| 1.2414 | 43.0 | 1763 | 1.2597 | 40.1238 | 15.8667 |
| 1.2414 | 44.0 | 1804 | 1.2579 | 40.0473 | 15.8606 |
| 1.2414 | 45.0 | 1845 | 1.2565 | 40.0792 | 15.8667 |
| 1.2414 | 46.0 | 1886 | 1.2553 | 40.0734 | 15.8667 |
| 1.2414 | 47.0 | 1927 | 1.2545 | 40.0734 | 15.8667 |
| 1.2414 | 48.0 | 1968 | 1.2539 | 40.0734 | 15.8667 |
| 1.179 | 49.0 | 2009 | 1.2537 | 40.0734 | 15.8667 |
| 1.179 | 50.0 | 2050 | 1.2536 | 40.0734 | 15.8667 |
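
A few things can be read off the results above with simple arithmetic, sketched here. The assumptions are a single device, no gradient accumulation, no dropped last batch, and the Trainer's default logging interval of 500 steps; none of these are confirmed by the card. The helper names are illustrative, and the BLEU values are copied from the training results in epoch order.

```python
BATCH_SIZE = 16        # train_batch_size from the hyperparameter list
STEPS_PER_EPOCH = 41   # step count after epoch 1
LOGGING_STEPS = 500    # Trainer default; assumed unchanged

# BLEU per epoch (epochs 1..50), copied from the training results.
BLEU = [
    16.4087, 30.3437, 35.0542, 36.3815, 37.3325, 37.5162, 37.5334, 38.2319,
    38.3884, 38.4443, 38.8279, 39.4717, 39.6365, 39.7667, 39.934, 39.8808,
    39.958, 39.4972, 39.4976, 39.4927, 39.8584, 39.9409, 39.8891, 40.0053,
    39.9729, 40.0029, 39.9874, 39.8156, 40.1371, 40.0601, 40.2356, 40.3369,
    40.452, 40.5232, 40.455, 40.4874, 40.5694, 40.4767, 40.5449, 40.5121,
    40.0705, 40.056, 40.1238, 40.0473, 40.0792, 40.0734, 40.0734, 40.0734,
    40.0734, 40.0734,
]


def train_set_size_bounds(steps_per_epoch, batch_size):
    """Smallest and largest example counts that give this many steps."""
    return (steps_per_epoch - 1) * batch_size + 1, steps_per_epoch * batch_size


def last_logged_step(eval_step, logging_steps=LOGGING_STEPS):
    """Most recent step at which a training loss was logged, or None."""
    logged = (eval_step // logging_steps) * logging_steps
    return logged if logged > 0 else None


best_epoch = max(range(len(BLEU)), key=BLEU.__getitem__) + 1

print(train_set_size_bounds(STEPS_PER_EPOCH, BATCH_SIZE))  # (641, 656)
print(last_logged_step(492), last_logged_step(533))        # None 500
print(best_epoch, BLEU[best_epoch - 1])                    # 37 40.5694
```

Under these assumptions the training set holds between 641 and 656 examples. The "No log" entries cover the epochs before step 500, when no training loss had been logged yet. BLEU peaks at epoch 37 (40.5694), slightly above the final-epoch value of 40.0734, while validation loss keeps decreasing through epoch 50.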

Framework versions

  • Transformers 4.31.0
  • PyTorch 2.0.1+cu118
  • Datasets 2.14.2
  • Tokenizers 0.13.3