	---
	license: apache-2.0
	tags:
	- generated_from_keras_callback
	base_model: serhii-korobchenko/mt5-small_poetry_test-2024-02-23-11-37-04
	model-index:
	- name: mt5-small_poetry_test-2024-02-23-15-43-02
	results: []
	---

	<!-- This model card has been generated automatically according to the information Keras had access to. You should
	probably proofread and complete it, then remove this comment. -->

	# mt5-small_poetry_test-2024-02-23-15-43-02

	This model is a fine-tuned version of [serhii-korobchenko/mt5-small_poetry_test-2024-02-23-11-37-04](https://huggingface.co/serhii-korobchenko/mt5-small_poetry_test-2024-02-23-11-37-04) on an unknown dataset.
	At the final epoch it reports the following training and validation losses:
	- Train Loss: 1.0719
	- Validation Loss: 8.8056
	- Epoch: 49

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'module': 'keras.optimizers.schedules', 'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 0.0056, 'decay_steps': 750, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, 'registered_name': None}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False, 'weight_decay_rate': 1e-06}
	- training_precision: mixed_float16
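The optimizer entry above describes a `PolynomialDecay` schedule with `power` 1.0 and `cycle` False, which amounts to a linear ramp from 0.0056 down to 0.0 over 750 steps. The following is a minimal pure-Python sketch of that formula for illustration. The function name `polynomial_decay` is hypothetical, and this is not the Keras implementation itself, only the same arithmetic with the values listed above as defaults.

```python
def polynomial_decay(step, initial_lr=0.0056, decay_steps=750, end_lr=0.0, power=1.0):
    """Learning rate at `step` for a non-cycling polynomial decay schedule.

    Defaults are copied from the optimizer config in this model card;
    the function name itself is illustrative only.
    """
    # With cycle=False the step is clamped, so the rate stays at end_lr
    # once decay_steps has been reached.
    step = min(step, decay_steps)
    fraction_left = 1.0 - step / decay_steps
    return (initial_lr - end_lr) * fraction_left ** power + end_lr


if __name__ == "__main__":
    # Print the rate at the start, the midpoint, the end, and past the end.
    for step in (0, 375, 750, 1000):
        print(step, polynomial_decay(step))
```

Because `power` is 1.0, the rate halves at the midpoint of the schedule and stays at zero for any step beyond 750.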

	### Training results

	| Train Loss | Validation Loss | Epoch |
	|:----------:|:---------------:|:-----:|
	| 10.6051 | 8.0186 | 0 |
	| 7.2838 | 7.5097 | 1 |
	| 6.1079 | 7.5227 | 2 |
	| 5.8520 | 7.6991 | 3 |
	| 5.7403 | 7.6098 | 4 |
	| 5.7371 | 7.6277 | 5 |
	| 5.6356 | 7.6047 | 6 |
	| 5.6156 | 7.7156 | 7 |
	| 5.5722 | 7.5735 | 8 |
	| 5.5118 | 7.5908 | 9 |
	| 5.4539 | 7.6017 | 10 |
	| 5.3815 | 7.6174 | 11 |
	| 5.3763 | 7.6022 | 12 |
	| 5.2895 | 7.4830 | 13 |
	| 5.2140 | 7.5455 | 14 |
	| 5.1843 | 7.4243 | 15 |
	| 5.1056 | 7.3897 | 16 |
	| 4.9740 | 7.2854 | 17 |
	| 4.9361 | 7.2887 | 18 |
	| 4.8234 | 7.4561 | 19 |
	| 4.9462 | 7.3764 | 20 |
	| 4.8029 | 7.2209 | 21 |
	| 4.6122 | 7.1327 | 22 |
	| 4.4010 | 7.3362 | 23 |
	| 4.2291 | 7.0549 | 24 |
	| 4.0323 | 7.2076 | 25 |
	| 3.8655 | 7.2932 | 26 |
	| 3.6406 | 7.3575 | 27 |
	| 3.4665 | 7.2689 | 28 |
	| 3.4070 | 7.1520 | 29 |
	| 3.3049 | 7.4382 | 30 |
	| 3.0354 | 7.5552 | 31 |
	| 2.7136 | 7.2149 | 32 |
	| 2.5568 | 7.8140 | 33 |
	| 2.2594 | 7.7701 | 34 |
	| 2.1743 | 7.9400 | 35 |
	| 2.0776 | 8.1060 | 36 |
	| 1.8686 | 7.7733 | 37 |
	| 1.8453 | 8.1850 | 38 |
	| 1.7281 | 7.8816 | 39 |
	| 1.5912 | 7.8918 | 40 |
	| 1.4447 | 8.4160 | 41 |
	| 1.4090 | 8.5857 | 42 |
	| 1.2143 | 8.5367 | 43 |
	| 1.2254 | 8.3491 | 44 |
	| 1.0937 | 8.8601 | 45 |
	| 1.1357 | 8.4994 | 46 |
	| 1.0708 | 8.9421 | 47 |
	| 1.0830 | 8.9817 | 48 |
	| 1.0719 | 8.8056 | 49 |
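The table can also be read as data. The short sketch below copies the Validation Loss column into a list (index equals epoch) and locates the epoch with the lowest value. It is only a reading aid: the variable names are illustrative, and whether a checkpoint from that epoch was saved is not stated in this card.

```python
# Validation loss per epoch, copied from the training results table
# (list index == epoch number, 0 through 49).
val_loss = [
    8.0186, 7.5097, 7.5227, 7.6991, 7.6098, 7.6277, 7.6047, 7.7156, 7.5735, 7.5908,
    7.6017, 7.6174, 7.6022, 7.4830, 7.5455, 7.4243, 7.3897, 7.2854, 7.2887, 7.4561,
    7.3764, 7.2209, 7.1327, 7.3362, 7.0549, 7.2076, 7.2932, 7.3575, 7.2689, 7.1520,
    7.4382, 7.5552, 7.2149, 7.8140, 7.7701, 7.9400, 8.1060, 7.7733, 8.1850, 7.8816,
    7.8918, 8.4160, 8.5857, 8.5367, 8.3491, 8.8601, 8.4994, 8.9421, 8.9817, 8.8056,
]

# Epoch whose validation loss is smallest.
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__)
best_loss = val_loss[best_epoch]
final_loss = val_loss[-1]

print(f"lowest validation loss {best_loss:.4f} at epoch {best_epoch}")
print(f"final validation loss  {final_loss:.4f} (+{final_loss - best_loss:.4f})")
```

The lowest validation loss is 7.0549 at epoch 24, while the final epoch reports 8.8056 even though the training loss keeps falling to 1.0719. That pattern is consistent with overfitting in the second half of training.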


	### Framework versions

	- Transformers 4.37.2
	- TensorFlow 2.15.0
	- Datasets 2.17.1
	- Tokenizers 0.15.2
