	---
	license: apache-2.0
	library_name: peft
	tags:
	- trl
	- sft
	- generated_from_trainer
	base_model: petals-team/falcon-rw-1b
	model-index:
	- name: GenAI-task-2-ModelD-DS
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# GenAI-task-2-ModelD-DS

	This model is a fine-tuned version of [petals-team/falcon-rw-1b](https://huggingface.co/petals-team/falcon-rw-1b) on an unknown dataset.
	It achieves the following results on the evaluation set:
	- Loss: 0.8551
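
For readers who prefer perplexity, the reported loss can be converted with a one-line calculation. This is a minimal sketch that assumes the loss is the mean per-token cross-entropy in nats (the usual Trainer convention for causal language models); the card itself does not state this.

```python
import math

# Final evaluation loss reported in this card.
EVAL_LOSS = 0.8551


def perplexity(mean_cross_entropy: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(mean_cross_entropy)


# Assumption: EVAL_LOSS is a mean token-level cross-entropy in nats.
eval_perplexity = perplexity(EVAL_LOSS)
print(f"perplexity ~ {eval_perplexity:.2f}")
```

Under that assumption the evaluation perplexity comes out at roughly 2.35.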

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2e-05
	- train_batch_size: 2
	- eval_batch_size: 8
	- seed: 42
	- gradient_accumulation_steps: 2
	- total_train_batch_size: 4
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: cosine
	- lr_scheduler_warmup_ratio: 0.01
	- num_epochs: 2
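
To make the batch-size and schedule settings concrete, the sketch below recomputes the effective batch size and traces a cosine schedule with linear warmup. Two values are assumptions, not facts from the card: a single training device (`NUM_DEVICES = 1`, which is consistent with 2 x 2 = 4) and a total of about 1267 optimizer steps (`TOTAL_STEPS`, estimated from the step and epoch columns of the results table). The schedule formula is the standard half-cosine decay, not necessarily the exact code path the Trainer executed.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 2e-5
TRAIN_BATCH_SIZE = 2
GRAD_ACCUM_STEPS = 2
WARMUP_RATIO = 0.01

# Assumptions (not stated in the card): one device, ~1267 total steps.
NUM_DEVICES = 1
TOTAL_STEPS = 1267


def effective_batch_size(per_device: int, accum_steps: int, devices: int) -> int:
    """Examples consumed per optimizer step."""
    return per_device * accum_steps * devices


def cosine_lr(step: int,
              total_steps: int = TOTAL_STEPS,
              base_lr: float = LEARNING_RATE,
              warmup_ratio: float = WARMUP_RATIO) -> float:
    """Linear warmup followed by a half-cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


print(effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS, NUM_DEVICES))
for s in (0, 13, 640, TOTAL_STEPS):
    print(s, cosine_lr(s))
```

With a warmup ratio of 0.01 the warmup lasts only about 13 steps under these assumptions, so almost the entire run is spent on the decaying part of the cosine curve.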

	### Training results

	| Training Loss | Epoch  | Step | Validation Loss |
	|:-------------:|:------:|:----:|:---------------:|
	| 1.538         | 0.0316 | 20   | 1.4898          |
	| 2.1944        | 0.0631 | 40   | 1.4676          |
	| 2.2479        | 0.0947 | 60   | 1.4396          |
	| 1.7654        | 0.1263 | 80   | 1.3892          |
	| 2.1763        | 0.1579 | 100  | 1.3834          |
	| 1.3644        | 0.1894 | 120  | 1.3282          |
	| 1.6781        | 0.2210 | 140  | 1.3058          |
	| 1.7429        | 0.2526 | 160  | 1.2880          |
	| 1.37          | 0.2841 | 180  | 1.2483          |
	| 1.8196        | 0.3157 | 200  | 1.2511          |
	| 1.223         | 0.3473 | 220  | 1.2120          |
	| 1.5357        | 0.3788 | 240  | 1.2171          |
	| 1.6471        | 0.4104 | 260  | 1.1906          |
	| 1.271         | 0.4420 | 280  | 1.1818          |
	| 1.7222        | 0.4736 | 300  | 1.1788          |
	| 1.2022        | 0.5051 | 320  | 1.1170          |
	| 1.4455        | 0.5367 | 340  | 1.1633          |
	| 1.7014        | 0.5683 | 360  | 1.1011          |
	| 1.1309        | 0.5998 | 380  | 1.0815          |
	| 1.6978        | 0.6314 | 400  | 1.0966          |
	| 1.0796        | 0.6630 | 420  | 1.0325          |
	| 1.4504        | 0.6946 | 440  | 1.0429          |
	| 1.4698        | 0.7261 | 460  | 1.0216          |
	| 1.0858        | 0.7577 | 480  | 1.0031          |
	| 1.4275        | 0.7893 | 500  | 1.0115          |
	| 0.9607        | 0.8208 | 520  | 0.9771          |
	| 1.2579        | 0.8524 | 540  | 0.9792          |
	| 1.3363        | 0.8840 | 560  | 0.9608          |
	| 1.0551        | 0.9155 | 580  | 0.9471          |
	| 1.531         | 0.9471 | 600  | 0.9530          |
	| 0.9776        | 0.9787 | 620  | 0.9321          |
	| 1.374         | 1.0103 | 640  | 0.9257          |
	| 0.9688        | 1.0418 | 660  | 0.9217          |
	| 1.464         | 1.0734 | 680  | 0.9278          |
	| 1.0608        | 1.1050 | 700  | 0.9040          |
	| 1.0711        | 1.1365 | 720  | 0.9017          |
	| 1.2806        | 1.1681 | 740  | 0.8954          |
	| 0.9129        | 1.1997 | 760  | 0.8877          |
	| 1.2161        | 1.2313 | 780  | 0.8907          |
	| 1.0221        | 1.2628 | 800  | 0.8794          |
	| 1.1306        | 1.2944 | 820  | 0.8782          |
	| 1.3235        | 1.3260 | 840  | 0.8768          |
	| 0.9663        | 1.3575 | 860  | 0.8711          |
	| 1.3124        | 1.3891 | 880  | 0.8716          |
	| 1.0169        | 1.4207 | 900  | 0.8663          |
	| 1.1686        | 1.4522 | 920  | 0.8658          |
	| 1.2976        | 1.4838 | 940  | 0.8656          |
	| 0.8896        | 1.5154 | 960  | 0.8620          |
	| 1.3252        | 1.5470 | 980  | 0.8623          |
	| 1.0821        | 1.5785 | 1000 | 0.8601          |
	| 1.1595        | 1.6101 | 1020 | 0.8594          |
	| 1.4023        | 1.6417 | 1040 | 0.8591          |
	| 0.8901        | 1.6732 | 1060 | 0.8574          |
	| 1.2387        | 1.7048 | 1080 | 0.8575          |
	| 0.9921        | 1.7364 | 1100 | 0.8564          |
	| 1.0593        | 1.7680 | 1120 | 0.8558          |
	| 1.3434        | 1.7995 | 1140 | 0.8558          |
	| 0.8345        | 1.8311 | 1160 | 0.8554          |
	| 1.3537        | 1.8627 | 1180 | 0.8554          |
	| 1.0417        | 1.8942 | 1200 | 0.8552          |
	| 1.0643        | 1.9258 | 1220 | 0.8551          |
	| 1.2218        | 1.9574 | 1240 | 0.8551          |
	| 1.1633        | 1.9890 | 1260 | 0.8551          |
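
The step and epoch columns allow a rough back-of-the-envelope estimate of the run's size, which the card otherwise leaves unstated. The sketch below uses only the first and last logged rows plus `total_train_batch_size: 4`; the derived figures are estimates, and "examples" may mean packed sequences if the SFT run used packing (an assumption, since the card does not say).

```python
# First and last logged rows of the results table: (epoch, step, validation loss).
FIRST_ROW = (0.0316, 20, 1.4898)
LAST_ROW = (1.9890, 1260, 0.8551)
TOTAL_TRAIN_BATCH_SIZE = 4  # from the hyperparameter list

last_epoch, last_step, last_val_loss = LAST_ROW
first_val_loss = FIRST_ROW[2]

# Optimizer steps per epoch, estimated from the last logged row.
steps_per_epoch = last_step / last_epoch

# Approximate number of training examples seen per epoch.
approx_train_examples = steps_per_epoch * TOTAL_TRAIN_BATCH_SIZE

# Relative drop in validation loss between the first and last evaluations.
relative_drop = (first_val_loss - last_val_loss) / first_val_loss

print(f"steps per epoch ~ {steps_per_epoch:.0f}")
print(f"training examples per epoch ~ {approx_train_examples:.0f}")
print(f"validation loss reduced by ~ {relative_drop:.1%}")
```

On these numbers the run covers roughly 630 optimizer steps per epoch, or about 2,500 training examples, and the validation loss falls by a little over 40% between the first and last evaluations, with almost no change after step 1200.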


	### Framework versions

	- PEFT 0.10.0
	- Transformers 4.40.0
	- PyTorch 2.2.1+cu121
	- Datasets 2.19.0
	- Tokenizers 0.19.1
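
Since this is a PEFT adapter on top of `petals-team/falcon-rw-1b`, loading it for inference could look roughly like the sketch below. The adapter location `ADAPTER_PATH` and the helper `generate_with_adapter` are hypothetical names introduced for illustration; the card does not give the repository ID or a recommended prompt format, so treat this as a starting point, not the author's procedure.

```python
BASE_MODEL_ID = "petals-team/falcon-rw-1b"  # base model named in this card
# Hypothetical placeholder: replace with the real adapter repo ID or local path.
ADAPTER_PATH = "path/to/GenAI-task-2-ModelD-DS"


def generate_with_adapter(prompt: str,
                          adapter_path: str = ADAPTER_PATH,
                          max_new_tokens: int = 50) -> str:
    """Load the base model, attach the adapter, and generate a continuation."""
    # Imported lazily so this file can be imported without the heavy dependencies.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID)
    base_model = AutoModelForCausalLM.from_pretrained(BASE_MODEL_ID)

    # Wrap the base model with the fine-tuned adapter weights.
    model = PeftModel.from_pretrained(base_model, adapter_path)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (downloads the base model on first use):
# print(generate_with_adapter("Once upon a time"))
```

Keeping the base model and adapter separate mirrors how the card declares `base_model` and `library_name: peft`; merging the adapter into the base weights is an optional later step.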
