	---
	library_name: transformers
	tags:
	- trl
	- dpo
	- alignment-handbook
	- generated_from_trainer
	model-index:
	- name: OpenELM-1_1B-SLiC
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# OpenELM-1_1B-SLiC

	This model was trained from scratch on an unknown dataset.
	It achieves the following results on the evaluation set:
	- Logits/chosen: -10.0625
	- Logits/rejected: -8.75
	- Logps/chosen: -752.0
	- Logps/rejected: -824.0
	- Loss: 0.6883
	- Rewards/accuracies: 0.7344
	- Rewards/chosen: -4.3438
	- Rewards/margins: 0.9922
	- Rewards/rejected: -5.3438
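
The reward metrics above are linked by simple arithmetic: the margin is the chosen reward minus the rejected reward. The sketch below checks this against the reported values and shows a SLiC-style hinge loss on a single margin. It assumes the run used a hinge loss of the form `max(0, delta - margin)`, which the model name suggests but this card does not confirm; `reward_margin`, `hinge_loss`, and `delta` are illustrative names, not part of any library.

```python
def reward_margin(chosen_reward: float, rejected_reward: float) -> float:
    """Margin between the implicit rewards of the chosen and rejected responses."""
    return chosen_reward - rejected_reward


def hinge_loss(margin: float, delta: float = 1.0) -> float:
    """Per-example hinge loss (assumed form): zero once the margin reaches delta."""
    return max(0.0, delta - margin)


# Final evaluation values reported in this card.
margin = reward_margin(-4.3438, -5.3438)
print(f"recomputed margin: {margin:.4f} (reported: 0.9922)")

# Illustrative per-example margins (hypothetical, not taken from the run).
for m in (0.25, 1.0, 2.0):
    print(f"margin={m:.2f} -> hinge loss={hinge_loss(m):.2f}")
```

The recomputed margin differs slightly from the logged 0.9922, presumably because the logged metrics are averaged and rounded at reduced precision. The reported loss is also an average of per-example losses, so it cannot be recovered from the mean margin alone.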

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 5e-05
	- train_batch_size: 8
	- eval_batch_size: 16
	- seed: 42
	- distributed_type: multi-GPU
	- num_devices: 4
	- gradient_accumulation_steps: 2
	- total_train_batch_size: 64
	- total_eval_batch_size: 64
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: cosine
	- lr_scheduler_warmup_ratio: 0.1
	- num_epochs: 3
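
The batch-size totals and the learning-rate schedule listed above can be reproduced with a few lines of arithmetic. The sketch below is a plain-Python approximation of a linear-warmup, cosine-decay schedule, not the trainer's own code; `total_batch_size`, `cosine_lr`, and the `total_steps=1000` value are hypothetical and chosen only for illustration.

```python
import math


def total_batch_size(per_device: int, num_devices: int, grad_accum: int = 1) -> int:
    """Examples consumed per optimizer step (or per evaluation step)."""
    return per_device * num_devices * grad_accum


def cosine_lr(step: int, total_steps: int, base_lr: float = 5e-5,
              warmup_ratio: float = 0.1) -> float:
    """Linear warmup to base_lr, then cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Hyperparameters from this card: 8 per device, 4 devices, 2 accumulation steps.
print(total_batch_size(8, 4, 2))   # total_train_batch_size
print(total_batch_size(16, 4))     # total_eval_batch_size

# Hypothetical run length, used only to show the shape of the schedule.
for step in (0, 50, 100, 550, 1000):
    print(step, cosine_lr(step, total_steps=1000))
```

With a warmup ratio of 0.1, the learning rate peaks at 5e-05 after the first tenth of the run and decays toward zero by the final step.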

	### Training results

	| Training Loss | Epoch | Step | Logits/chosen | Logits/rejected | Logps/chosen | Logps/rejected | Validation Loss | Rewards/accuracies | Rewards/chosen | Rewards/margins | Rewards/rejected |
	|:-------------:|:------:|:----:|:-------------:|:---------------:|:------------:|:--------------:|:---------------:|:------------------:|:--------------:|:---------------:|:----------------:|
	| 0.7634 | 0.1047 | 100 | -13.0625 | -12.9375 | -392.0 | -392.0 | 0.7878 | 0.6406 | -0.7461 | 0.2832 | -1.0312 |
	| 0.7498 | 0.2093 | 200 | -12.75 | -12.4375 | -436.0 | -444.0 | 0.7468 | 0.6719 | -1.1719 | 0.3809 | -1.5547 |
	| 0.8142 | 0.3140 | 300 | -14.8125 | -14.75 | -504.0 | -516.0 | 0.7466 | 0.6914 | -1.8594 | 0.4141 | -2.2812 |
	| 0.7764 | 0.4186 | 400 | -14.5625 | -14.4375 | -516.0 | -528.0 | 0.7499 | 0.6699 | -1.9688 | 0.4316 | -2.4062 |
	| 0.731 | 0.5233 | 500 | -11.0 | -10.5 | -560.0 | -576.0 | 0.7240 | 0.6914 | -2.4219 | 0.4375 | -2.8594 |
	| 0.665 | 0.6279 | 600 | -10.75 | -10.0625 | -660.0 | -696.0 | 0.7045 | 0.6973 | -3.4062 | 0.6680 | -4.0625 |
	| 0.6806 | 0.7326 | 700 | -13.875 | -13.4375 | -568.0 | -604.0 | 0.6912 | 0.7070 | -2.5156 | 0.6523 | -3.1562 |
	| 0.6597 | 0.8373 | 800 | -13.5 | -13.3125 | -548.0 | -576.0 | 0.7087 | 0.6777 | -2.2969 | 0.5664 | -2.8594 |
	| 0.7325 | 0.9419 | 900 | -14.0 | -13.25 | -588.0 | -624.0 | 0.6838 | 0.7090 | -2.6875 | 0.6602 | -3.3594 |
	| 0.2677 | 1.0466 | 1000 | -12.1875 | -11.0625 | -640.0 | -688.0 | 0.6726 | 0.7070 | -3.2344 | 0.7734 | -4.0 |
	| 0.2256 | 1.1512 | 1100 | -11.125 | -10.0625 | -676.0 | -728.0 | 0.6992 | 0.7090 | -3.5938 | 0.7969 | -4.375 |
	| 0.1954 | 1.2559 | 1200 | -11.3125 | -10.125 | -664.0 | -720.0 | 0.7033 | 0.7051 | -3.4688 | 0.8477 | -4.3125 |
	| 0.2289 | 1.3605 | 1300 | -11.0 | -9.9375 | -692.0 | -740.0 | 0.6722 | 0.7344 | -3.7344 | 0.7852 | -4.5 |
	| 0.2227 | 1.4652 | 1400 | -12.5 | -11.8125 | -676.0 | -720.0 | 0.6925 | 0.6953 | -3.5781 | 0.7383 | -4.3125 |
	| 0.1902 | 1.5699 | 1500 | -12.0625 | -11.125 | -736.0 | -792.0 | 0.6758 | 0.7148 | -4.1875 | 0.8320 | -5.0312 |
	| 0.2192 | 1.6745 | 1600 | -13.625 | -12.875 | -704.0 | -748.0 | 0.6833 | 0.7148 | -3.8438 | 0.7695 | -4.625 |
	| 0.2137 | 1.7792 | 1700 | -11.9375 | -11.0 | -716.0 | -764.0 | 0.6734 | 0.7207 | -3.9688 | 0.8008 | -4.7812 |
	| 0.2001 | 1.8838 | 1800 | -12.125 | -11.3125 | -692.0 | -740.0 | 0.6734 | 0.7207 | -3.7344 | 0.7617 | -4.5 |
	| 0.1713 | 1.9885 | 1900 | -10.4375 | -9.25 | -712.0 | -768.0 | 0.6680 | 0.7383 | -3.9375 | 0.8789 | -4.8125 |
	| 0.0184 | 2.0931 | 2000 | -11.0625 | -9.875 | -704.0 | -768.0 | 0.6845 | 0.7305 | -3.8594 | 0.9453 | -4.8125 |
	| 0.0313 | 2.1978 | 2100 | -11.25 | -10.125 | -720.0 | -784.0 | 0.6798 | 0.7402 | -4.0 | 0.9570 | -4.9688 |
	| 0.0401 | 2.3025 | 2200 | -10.6875 | -9.375 | -732.0 | -800.0 | 0.6865 | 0.7363 | -4.1562 | 0.9492 | -5.0938 |
	| 0.0211 | 2.4071 | 2300 | -10.125 | -8.75 | -740.0 | -812.0 | 0.6874 | 0.7383 | -4.2188 | 1.0078 | -5.2188 |
	| 0.0239 | 2.5118 | 2400 | -10.1875 | -8.875 | -736.0 | -800.0 | 0.6858 | 0.7383 | -4.1562 | 0.9766 | -5.125 |
	| 0.0188 | 2.6164 | 2500 | -10.125 | -8.8125 | -744.0 | -816.0 | 0.6902 | 0.7324 | -4.2812 | 0.9883 | -5.25 |
	| 0.0145 | 2.7211 | 2600 | -10.125 | -8.8125 | -748.0 | -816.0 | 0.6874 | 0.7383 | -4.2812 | 0.9844 | -5.2812 |
	| 0.0229 | 2.8257 | 2700 | -10.0625 | -8.75 | -752.0 | -824.0 | 0.6883 | 0.7344 | -4.3438 | 0.9922 | -5.3438 |
	| 0.0298 | 2.9304 | 2800 | -10.0625 | -8.75 | -752.0 | -824.0 | 0.6883 | 0.7344 | -4.3438 | 0.9922 | -5.3438 |
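
One way to read the table is to look for the evaluation step with the lowest validation loss and compare it with the final step. The sketch below copies the Step, Training Loss, and Validation Loss columns into a list; the variable names are illustrative, and the card does not state which checkpoint was actually saved.

```python
# (step, training_loss, validation_loss), copied from the table above.
rows = [
    (100, 0.7634, 0.7878), (200, 0.7498, 0.7468), (300, 0.8142, 0.7466),
    (400, 0.7764, 0.7499), (500, 0.731, 0.7240), (600, 0.665, 0.7045),
    (700, 0.6806, 0.6912), (800, 0.6597, 0.7087), (900, 0.7325, 0.6838),
    (1000, 0.2677, 0.6726), (1100, 0.2256, 0.6992), (1200, 0.1954, 0.7033),
    (1300, 0.2289, 0.6722), (1400, 0.2227, 0.6925), (1500, 0.1902, 0.6758),
    (1600, 0.2192, 0.6833), (1700, 0.2137, 0.6734), (1800, 0.2001, 0.6734),
    (1900, 0.1713, 0.6680), (2000, 0.0184, 0.6845), (2100, 0.0313, 0.6798),
    (2200, 0.0401, 0.6865), (2300, 0.0211, 0.6874), (2400, 0.0239, 0.6858),
    (2500, 0.0188, 0.6902), (2600, 0.0145, 0.6874), (2700, 0.0229, 0.6883),
    (2800, 0.0298, 0.6883),
]

# Evaluation step with the lowest validation loss.
best = min(rows, key=lambda row: row[2])
last = rows[-1]

print(f"lowest validation loss: {best[2]:.4f} at step {best[0]}")
print(f"final validation loss:  {last[2]:.4f} at step {last[0]}")
print(f"final train/validation gap: {last[2] - last[1]:.4f}")
```

Training loss drops sharply at each epoch boundary while validation loss stays near 0.68 to 0.69 through the third epoch, which may indicate overfitting to the training pairs; the reward accuracy and margin columns should be weighed alongside the loss before drawing conclusions.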


	### Framework versions

	- Transformers 4.44.2
	- PyTorch 2.3.0
	- Datasets 3.0.0
	- Tokenizers 0.19.1