	---
	base_model: EleutherAI/gpt-neo-125m
	library_name: peft
	license: mit
	tags:
	- trl
	- sft
	- generated_from_trainer
	model-index:
	- name: gpt-neoMedChatbot
	results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# gpt-neoMedChatbot

	This model is a fine-tuned version of [EleutherAI/gpt-neo-125m](https://huggingface.co/EleutherAI/gpt-neo-125m) on an unspecified dataset.
	It achieves the following results on the evaluation set:
	- Loss: 2.4059
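
The reported loss can be easier to interpret as a perplexity. The sketch below assumes the evaluation loss is the mean per-token cross-entropy in nats, which is what a causal language-modeling objective usually reports; the card itself does not confirm this.

```python
import math

# Final evaluation loss reported in this card.
eval_loss = 2.4059

# Assumption: the loss is the mean per-token cross-entropy in nats,
# so perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)

print(f"perplexity = {perplexity:.2f}")
```

Under that assumption the perplexity comes out at roughly 11.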

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.0001
	- train_batch_size: 4
	- eval_batch_size: 4
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: constant
	- lr_scheduler_warmup_ratio: 0.03
	- num_epochs: 3
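
As a rough guide to reproducing this setup, the hyperparameters above can be written as keyword arguments for `transformers.TrainingArguments`. This is a sketch, not the author's training script: the mapping of `train_batch_size` to `per_device_train_batch_size` assumes a single device with no gradient accumulation, and `build_training_arguments` and `output_dir` are hypothetical names. Note also that, as far as we can tell, the plain `constant` schedule applies no warmup (that is `constant_with_warmup`), so the 0.03 warmup ratio likely had no effect.

```python
# Hyperparameters from this card, expressed as TrainingArguments keyword names.
# Assumption: single device, no gradient accumulation, so the reported
# train_batch_size equals per_device_train_batch_size.
TRAINING_KWARGS = {
    "learning_rate": 1e-4,
    "per_device_train_batch_size": 4,
    "per_device_eval_batch_size": 4,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "constant",
    "warmup_ratio": 0.03,  # listed in the card; likely unused by "constant"
    "num_train_epochs": 3,
}


def build_training_arguments(output_dir: str):
    """Hypothetical helper: build TrainingArguments from the values above."""
    # Imported lazily so the dictionary can be inspected without transformers.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)
```

Since the card is tagged `trl` and `sft`, the original run probably used TRL's SFT trainer, whose config class accepts the same arguments.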

	### Training results

	| Training Loss | Epoch | Step | Validation Loss |
	|:-------------:|:------:|:----:|:---------------:|
	| 3.1874 | 0.0709 | 100 | 3.0118 |
	| 2.8756 | 0.1417 | 200 | 2.8228 |
	| 2.7134 | 0.2126 | 300 | 2.7358 |
	| 2.6948 | 0.2835 | 400 | 2.6833 |
	| 2.6386 | 0.3544 | 500 | 2.6441 |
	| 2.6525 | 0.4252 | 600 | 2.6150 |
	| 2.6242 | 0.4961 | 700 | 2.5856 |
	| 2.6444 | 0.5670 | 800 | 2.5701 |
	| 2.6007 | 0.6378 | 900 | 2.5540 |
	| 2.462 | 0.7087 | 1000 | 2.5418 |
	| 2.5641 | 0.7796 | 1100 | 2.5315 |
	| 2.4672 | 0.8505 | 1200 | 2.5238 |
	| 2.5017 | 0.9213 | 1300 | 2.5146 |
	| 2.6389 | 0.9922 | 1400 | 2.5083 |
	| 2.4869 | 1.0631 | 1500 | 2.5021 |
	| 2.5302 | 1.1339 | 1600 | 2.4942 |
	| 2.497 | 1.2048 | 1700 | 2.4886 |
	| 2.4965 | 1.2757 | 1800 | 2.4846 |
	| 2.5535 | 1.3466 | 1900 | 2.4783 |
	| 2.5747 | 1.4174 | 2000 | 2.4732 |
	| 2.4534 | 1.4883 | 2100 | 2.4679 |
	| 2.4909 | 1.5592 | 2200 | 2.4657 |
	| 2.5192 | 1.6300 | 2300 | 2.4617 |
	| 2.4271 | 1.7009 | 2400 | 2.4573 |
	| 2.4855 | 1.7718 | 2500 | 2.4542 |
	| 2.4599 | 1.8427 | 2600 | 2.4530 |
	| 2.4482 | 1.9135 | 2700 | 2.4444 |
	| 2.493 | 1.9844 | 2800 | 2.4446 |
	| 2.3527 | 2.0553 | 2900 | 2.4414 |
	| 2.5243 | 2.1262 | 3000 | 2.4376 |
	| 2.4644 | 2.1970 | 3100 | 2.4330 |
	| 2.386 | 2.2679 | 3200 | 2.4308 |
	| 2.3762 | 2.3388 | 3300 | 2.4281 |
	| 2.3827 | 2.4096 | 3400 | 2.4245 |
	| 2.3487 | 2.4805 | 3500 | 2.4221 |
	| 2.4737 | 2.5514 | 3600 | 2.4192 |
	| 2.4907 | 2.6223 | 3700 | 2.4171 |
	| 2.3967 | 2.6931 | 3800 | 2.4159 |
	| 2.4772 | 2.7640 | 3900 | 2.4146 |
	| 2.4114 | 2.8349 | 4000 | 2.4106 |
	| 2.4017 | 2.9057 | 4100 | 2.4065 |
	| 2.3477 | 2.9766 | 4200 | 2.4059 |
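
The Epoch and Step columns make it possible to estimate the number of optimizer steps per epoch, and from that the approximate size of the training set, which the card does not state. The sketch below assumes one optimizer step per batch of 4 on a single device (no gradient accumulation); the resulting example count is an estimate, not a reported figure.

```python
# Values taken from the last row of the training results table.
final_step = 4200
final_epoch = 2.9766

# Values taken from the hyperparameter list.
train_batch_size = 4
num_epochs = 3

# Steps per epoch implied by the step/epoch ratio.
steps_per_epoch = round(final_step / final_epoch)

# Assumption: one optimizer step per batch, single device, so the training
# set holds at most steps_per_epoch * train_batch_size examples (the final
# batch of an epoch may be partial).
approx_train_examples = steps_per_epoch * train_batch_size

# Total optimizer steps for the full run.
total_steps = steps_per_epoch * num_epochs

# Cross-check against the first table row (step 100 at epoch 0.0709).
first_row_epoch = round(100 / steps_per_epoch, 4)

print(steps_per_epoch, approx_train_examples, total_steps, first_row_epoch)
```

The estimate is consistent with the first table row, which suggests the last logged evaluation at step 4200 came shortly before the end of the third epoch.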


	### Framework versions

	- PEFT 0.12.0
	- Transformers 4.42.4
	- Pytorch 2.3.1+cu121
	- Datasets 2.21.0
	- Tokenizers 0.19.1
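
Because the card lists `peft` as the library and `EleutherAI/gpt-neo-125m` as the base model, the weights are presumably stored as a PEFT adapter to be applied on top of the base model. A minimal loading sketch follows; `ADAPTER_ID` is a hypothetical placeholder because the card does not give the full repository path, and `load_adapter` is an illustrative helper name.

```python
BASE_MODEL_ID = "EleutherAI/gpt-neo-125m"

# Hypothetical placeholder: replace with the actual local path or Hub
# repository ID of the gpt-neoMedChatbot adapter.
ADAPTER_ID = "path/to/gpt-neoMedChatbot"


def load_adapter(adapter_id: str = ADAPTER_ID):
    """Load the base model and apply the PEFT adapter on top of it."""
    # Imported lazily so this module can be read without the libraries installed.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID)
    base_model = AutoModelForCausalLM.from_pretrained(BASE_MODEL_ID)

    # Wrap the base model with the fine-tuned adapter weights.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()
    return tokenizer, model
```

Calling `load_adapter()` downloads the base model on first use, so it needs network access or a populated local cache.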
