	---
	library_name: transformers
	license: mit
	base_model: gpt2
	tags:
	- generated_from_trainer
	model-index:
	- name: age_transcript_conv1
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# age_transcript_conv1

	This model is a fine-tuned version of [gpt2](https://huggingface.co/gpt2) on an unknown dataset.
	It achieves the following results on the evaluation set:
	- Loss: 3.2240
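
For a causal language model such as GPT-2, an evaluation loss is often easier to interpret as perplexity. The sketch below assumes the reported loss is the mean per-token cross-entropy in nats; the card does not state this, so treat the result as an estimate rather than a reported metric.

```python
import math

# Final evaluation loss reported in this card.
eval_loss = 3.2240

# Assumption: the loss is mean per-token cross-entropy in nats,
# so perplexity is exp(loss).
perplexity = math.exp(eval_loss)
print(f"perplexity ~ {perplexity:.2f}")
```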

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.0001
	- train_batch_size: 32
	- eval_batch_size: 32
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: reduce_lr_on_plateau
	- lr_scheduler_warmup_steps: 500
	- num_epochs: 1
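
The hyperparameters above could be expressed as `TrainingArguments` keyword arguments roughly as follows. This is a sketch, not the original training script: the evaluation and logging intervals are inferred from the 1000-step spacing of the results table, the output directory name is hypothetical, and the mapping of `train_batch_size` to a per-device value assumes a single device with no gradient accumulation.

```python
# Keyword arguments mirroring the hyperparameters listed in this card.
training_kwargs = {
    "learning_rate": 1e-4,
    "per_device_train_batch_size": 32,  # assumes a single device
    "per_device_eval_batch_size": 32,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "reduce_lr_on_plateau",
    "warmup_steps": 500,
    "num_train_epochs": 1,
    # Assumptions not stated in the card: the plateau scheduler needs
    # periodic evaluation, and the table reports results every 1000 steps.
    "eval_strategy": "steps",
    "eval_steps": 1000,
    "logging_steps": 1000,
}


def build_training_args(output_dir="age_transcript_conv1"):
    """Build TrainingArguments; output_dir is a hypothetical default."""
    # Imported lazily so the dictionary above can be inspected
    # without transformers installed.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **training_kwargs)
```

Calling `build_training_args()` requires the Transformers version listed under Framework versions or a compatible one; the `eval_strategy` argument name in particular differs in older releases.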

	### Training results

	| Training Loss | Epoch  | Step  | Validation Loss |
	|:-------------:|:------:|:-----:|:---------------:|
	| 4.7007        | 0.0254 | 1000  | 4.4922          |
	| 4.2819        | 0.0508 | 2000  | 4.2123          |
	| 4.1092        | 0.0762 | 3000  | 4.0324          |
	| 3.9861        | 0.1016 | 4000  | 3.9267          |
	| 3.9221        | 0.1270 | 5000  | 3.8582          |
	| 3.8904        | 0.1524 | 6000  | 3.7809          |
	| 3.7526        | 0.1778 | 7000  | 3.7252          |
	| 3.7724        | 0.2032 | 8000  | 3.6846          |
	| 3.6967        | 0.2285 | 9000  | 3.6293          |
	| 3.5701        | 0.2539 | 10000 | 3.5902          |
	| 3.676         | 0.2793 | 11000 | 3.5787          |
	| 3.6092        | 0.3047 | 12000 | 3.5333          |
	| 3.5105        | 0.3301 | 13000 | 3.5061          |
	| 3.5298        | 0.3555 | 14000 | 3.4776          |
	| 3.4857        | 0.3809 | 15000 | 3.4537          |
	| 3.4688        | 0.4063 | 16000 | 3.4490          |
	| 3.4914        | 0.4317 | 17000 | 3.4141          |
	| 3.3866        | 0.4571 | 18000 | 3.3970          |
	| 3.484         | 0.4825 | 19000 | 3.3963          |
	| 3.4187        | 0.5079 | 20000 | 3.3733          |
	| 3.2706        | 0.5333 | 21000 | 3.3546          |
	| 3.4344        | 0.5587 | 22000 | 3.3640          |
	| 3.3577        | 0.5841 | 23000 | 3.3337          |
	| 3.3058        | 0.6095 | 24000 | 3.3364          |
	| 3.3558        | 0.6349 | 25000 | 3.3195          |
	| 3.2865        | 0.6603 | 26000 | 3.2988          |
	| 3.3295        | 0.6856 | 27000 | 3.3054          |
	| 3.3024        | 0.7110 | 28000 | 3.2867          |
	| 3.1984        | 0.7364 | 29000 | 3.2751          |
	| 3.3467        | 0.7618 | 30000 | 3.2792          |
	| 3.3066        | 0.7872 | 31000 | 3.2647          |
	| 3.1441        | 0.8126 | 32000 | 3.2606          |
	| 3.3292        | 0.8380 | 33000 | 3.2656          |
	| 3.2561        | 0.8634 | 34000 | 3.2444          |
	| 3.2027        | 0.8888 | 35000 | 3.2549          |
	| 3.264         | 0.9142 | 36000 | 3.2401          |
	| 3.1928        | 0.9396 | 37000 | 3.2293          |
	| 3.2385        | 0.9650 | 38000 | 3.2376          |
	| 3.2274        | 0.9904 | 39000 | 3.2240          |
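
Because the dataset is not documented, the Epoch and Step columns are the only hint at its size. The sketch below estimates the number of optimizer steps per epoch and, from that, an approximate count of training sequences. It assumes a single device and no gradient accumulation, so that each step consumes `train_batch_size = 32` sequences; neither is confirmed by the card.

```python
# (epoch, step) pairs copied from the first and last rows of the table.
first_epoch, first_step = 0.0254, 1000
last_epoch, last_step = 0.9904, 39000

# Epoch fractions are rounded to four decimals, so the last row gives
# the more precise estimate; the first row serves as a rough cross-check.
steps_per_epoch = last_step / last_epoch
rough_check = first_step / first_epoch

# Assumption: one device and no gradient accumulation, so each step
# consumes train_batch_size = 32 sequences.
train_batch_size = 32
approx_train_sequences = steps_per_epoch * train_batch_size

print(f"steps per epoch ~ {steps_per_epoch:.0f} (cross-check ~ {rough_check:.0f})")
print(f"training sequences ~ {approx_train_sequences:,.0f}")
```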


	### Framework versions

	- Transformers 4.45.2
	- Pytorch 2.4.1+cu121
	- Datasets 3.0.1
	- Tokenizers 0.20.1
