	---
	tags:
	- generated_from_trainer
	metrics:
	- bleu
	model-index:
	- name: genz_model
	results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# genz_model

	This model was trained from scratch on an unspecified dataset.
	It achieves the following results on the evaluation set:
	- Loss: 1.2536
	- Bleu: 40.0734
	- Gen Len: 15.8667
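As a rough reading aid, the evaluation loss can be converted to a perplexity. The sketch below assumes the reported loss is a mean per-token cross-entropy in nats, which the card does not state; if label smoothing or a different reduction was used, the number is only indicative. The function name is illustrative.

```python
import math

EVAL_LOSS = 1.2536  # final evaluation loss reported above


def perplexity(mean_cross_entropy_nats: float) -> float:
    """Perplexity is the exponential of the mean per-token cross-entropy (in nats)."""
    return math.exp(mean_cross_entropy_nats)


# Assumption: EVAL_LOSS is a per-token average in nats.
print(f"perplexity ~ {perplexity(EVAL_LOSS):.2f}")  # roughly 3.5
```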

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2e-05
	- train_batch_size: 16
	- eval_batch_size: 16
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 50
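To make the `linear` scheduler setting concrete, here is a minimal sketch of a linear decay from the listed learning rate to zero. The total of 2050 steps is taken from the last row of the training results table; zero warmup steps is an assumption, since the card lists no warmup setting. The function is a plain-Python illustration, not the Trainer's own code.

```python
BASE_LR = 2e-05      # learning_rate from the list above
TOTAL_STEPS = 2050   # final Step value in the training results table
WARMUP_STEPS = 0     # assumption: no warmup is listed in the card


def linear_lr(step: int,
              base_lr: float = BASE_LR,
              total_steps: int = TOTAL_STEPS,
              warmup_steps: int = WARMUP_STEPS) -> float:
    """Learning rate at `step`: linear warmup (if any), then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 1025, 2050):
    print(step, linear_lr(step))
```

Under these assumptions the rate starts at 2e-05, is about half that at the midpoint (step 1025), and reaches zero at step 2050.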

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
	|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|
	| No log | 1.0 | 41 | 1.9667 | 16.4087 | 16.3333 |
	| No log | 2.0 | 82 | 1.8242 | 30.3437 | 15.4788 |
	| No log | 3.0 | 123 | 1.7376 | 35.0542 | 15.6545 |
	| No log | 4.0 | 164 | 1.6830 | 36.3815 | 15.9091 |
	| No log | 5.0 | 205 | 1.6438 | 37.3325 | 15.9212 |
	| No log | 6.0 | 246 | 1.6052 | 37.5162 | 16.0364 |
	| No log | 7.0 | 287 | 1.5723 | 37.5334 | 16.097 |
	| No log | 8.0 | 328 | 1.5484 | 38.2319 | 16.1152 |
	| No log | 9.0 | 369 | 1.5249 | 38.3884 | 16.1455 |
	| No log | 10.0 | 410 | 1.5040 | 38.4443 | 16.1394 |
	| No log | 11.0 | 451 | 1.4852 | 38.8279 | 16.1879 |
	| No log | 12.0 | 492 | 1.4706 | 39.4717 | 16.0424 |
	| 1.7321 | 13.0 | 533 | 1.4525 | 39.6365 | 16.103 |
	| 1.7321 | 14.0 | 574 | 1.4361 | 39.7667 | 16.0545 |
	| 1.7321 | 15.0 | 615 | 1.4237 | 39.934 | 16.0182 |
	| 1.7321 | 16.0 | 656 | 1.4084 | 39.8808 | 16.0606 |
	| 1.7321 | 17.0 | 697 | 1.4013 | 39.958 | 16.0606 |
	| 1.7321 | 18.0 | 738 | 1.3875 | 39.4972 | 16.0788 |
	| 1.7321 | 19.0 | 779 | 1.3770 | 39.4976 | 15.9394 |
	| 1.7321 | 20.0 | 820 | 1.3681 | 39.4927 | 15.9818 |
	| 1.7321 | 21.0 | 861 | 1.3592 | 39.8584 | 15.9818 |
	| 1.7321 | 22.0 | 902 | 1.3512 | 39.9409 | 15.9515 |
	| 1.7321 | 23.0 | 943 | 1.3414 | 39.8891 | 15.9576 |
	| 1.7321 | 24.0 | 984 | 1.3367 | 40.0053 | 15.9576 |
	| 1.3831 | 25.0 | 1025 | 1.3298 | 39.9729 | 15.9636 |
	| 1.3831 | 26.0 | 1066 | 1.3231 | 40.0029 | 15.9333 |
	| 1.3831 | 27.0 | 1107 | 1.3157 | 39.9874 | 15.9394 |
	| 1.3831 | 28.0 | 1148 | 1.3093 | 39.8156 | 15.9152 |
	| 1.3831 | 29.0 | 1189 | 1.3051 | 40.1371 | 15.9152 |
	| 1.3831 | 30.0 | 1230 | 1.3006 | 40.0601 | 15.897 |
	| 1.3831 | 31.0 | 1271 | 1.2950 | 40.2356 | 15.8727 |
	| 1.3831 | 32.0 | 1312 | 1.2899 | 40.3369 | 15.8848 |
	| 1.3831 | 33.0 | 1353 | 1.2871 | 40.452 | 15.8667 |
	| 1.3831 | 34.0 | 1394 | 1.2836 | 40.5232 | 15.8364 |
	| 1.3831 | 35.0 | 1435 | 1.2804 | 40.455 | 15.8485 |
	| 1.3831 | 36.0 | 1476 | 1.2768 | 40.4874 | 15.8485 |
	| 1.2414 | 37.0 | 1517 | 1.2728 | 40.5694 | 15.8424 |
	| 1.2414 | 38.0 | 1558 | 1.2692 | 40.4767 | 15.8424 |
	| 1.2414 | 39.0 | 1599 | 1.2679 | 40.5449 | 15.8424 |
	| 1.2414 | 40.0 | 1640 | 1.2650 | 40.5121 | 15.8667 |
	| 1.2414 | 41.0 | 1681 | 1.2625 | 40.0705 | 15.8545 |
	| 1.2414 | 42.0 | 1722 | 1.2604 | 40.056 | 15.8545 |
	| 1.2414 | 43.0 | 1763 | 1.2597 | 40.1238 | 15.8667 |
	| 1.2414 | 44.0 | 1804 | 1.2579 | 40.0473 | 15.8606 |
	| 1.2414 | 45.0 | 1845 | 1.2565 | 40.0792 | 15.8667 |
	| 1.2414 | 46.0 | 1886 | 1.2553 | 40.0734 | 15.8667 |
	| 1.2414 | 47.0 | 1927 | 1.2545 | 40.0734 | 15.8667 |
	| 1.2414 | 48.0 | 1968 | 1.2539 | 40.0734 | 15.8667 |
	| 1.179 | 49.0 | 2009 | 1.2537 | 40.0734 | 15.8667 |
	| 1.179 | 50.0 | 2050 | 1.2536 | 40.0734 | 15.8667 |
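The "No log" entries and the repeated training-loss values are consistent with the training loss being logged only every 500 optimizer steps while evaluation runs every epoch. The 500-step interval is an assumption (the card does not list a logging interval). The sketch below checks that assumption against the table and also bounds the training-set size implied by 41 steps per epoch at batch size 16, assuming a single device, no gradient accumulation, and a kept final partial batch. The helper names are illustrative.

```python
import math

STEPS_PER_EPOCH = 41    # from the Step column: 41, 82, 123, ...
TRAIN_BATCH_SIZE = 16   # train_batch_size from the hyperparameter list
LOGGING_STEPS = 500     # assumption: not stated in the card


def first_epoch_showing_log(k: int) -> int:
    """First epoch whose evaluation row comes after the k-th training-loss log."""
    return math.ceil(k * LOGGING_STEPS / STEPS_PER_EPOCH)


def train_set_size_bounds(steps_per_epoch: int = STEPS_PER_EPOCH,
                          batch_size: int = TRAIN_BATCH_SIZE) -> tuple:
    """Smallest and largest example counts that yield this many batches per epoch."""
    return (steps_per_epoch - 1) * batch_size + 1, steps_per_epoch * batch_size


print([first_epoch_showing_log(k) for k in range(1, 5)])  # [13, 25, 37, 49]
print(train_set_size_bounds())  # (641, 656)
```

Epochs 13, 25, 37 and 49 are exactly where the Training Loss column changes value in the table, and the bounds suggest a training set of roughly 641 to 656 examples under the stated assumptions.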


	### Framework versions

	- Transformers 4.31.0
	- PyTorch 2.0.1+cu118
	- Datasets 2.14.2
	- Tokenizers 0.13.3
