Platyporoni-7B / README.md

	---
	license: cc-by-nc-4.0
	base_model: AIDC-ai-business/Marcoroni-7B
	tags:
	- generated_from_trainer
	model-index:
	- name: results
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# Platyporoni-7B

	This model is a fine-tuned version of [AIDC-ai-business/Marcoroni-7B](https://huggingface.co/AIDC-ai-business/Marcoroni-7B) on an unknown dataset.
	It achieves the following results on the evaluation set:
	- Loss: 2.7324
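As a rough sanity check, the evaluation loss can be converted to perplexity. This sketch assumes the reported loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language modeling but is not stated in this card. The helper name `perplexity` is illustrative.

```python
import math


def perplexity(mean_cross_entropy: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(mean_cross_entropy)


# Final evaluation loss reported in this card.
eval_loss = 2.7324
print(f"perplexity ~ {perplexity(eval_loss):.2f}")
```

Under that assumption, a loss of 2.7324 corresponds to a perplexity of roughly 15.4 on the evaluation set.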

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 8e-06
	- train_batch_size: 48
	- eval_batch_size: 6
	- seed: 42
	- gradient_accumulation_steps: 2
	- total_train_batch_size: 96
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 1
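The hyperparameters above can be checked with a little arithmetic. The sketch below shows how the total batch size follows from the per-device batch size and gradient accumulation, and how a linear schedule decays the learning rate. It assumes a single device (consistent with 48 × 2 = 96, though the card does not say so) and no warmup (none is listed). `TOTAL_STEPS` is a hypothetical value chosen for illustration, not a figure from the card.

```python
def effective_batch_size(per_device: int, grad_accum: int, num_devices: int = 1) -> int:
    """Number of samples that contribute to one optimizer step."""
    return per_device * grad_accum * num_devices


def linear_lr(step: int, total_steps: int, base_lr: float = 8e-6,
              warmup_steps: int = 0) -> float:
    """Linear warmup (optional) followed by linear decay to zero."""
    if warmup_steps > 0 and step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Assumption: one device, so 48 * 2 * 1 matches total_train_batch_size.
print(effective_batch_size(per_device=48, grad_accum=2, num_devices=1))

# Assumption: hypothetical total number of optimizer steps in the epoch.
TOTAL_STEPS = 258
for step in (0, 64, 128, 256):
    print(step, linear_lr(step, TOTAL_STEPS))
```

With a linear schedule and a single epoch, the learning rate is already close to zero by the last logged steps, which may partly explain why the validation loss barely moves late in training.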

	### Training results

	| Training Loss | Epoch | Step | Validation Loss |
	|:-------------:|:-----:|:----:|:---------------:|
	| 1.6335 | 0.02 | 4 | 2.8691 |
	| 1.5666 | 0.03 | 8 | 2.8173 |
	| 1.4985 | 0.05 | 12 | 2.8003 |
	| 1.4244 | 0.06 | 16 | 2.7800 |
	| 1.4245 | 0.08 | 20 | 2.7674 |
	| 1.3865 | 0.09 | 24 | 2.7675 |
	| 1.3887 | 0.11 | 28 | 2.7687 |
	| 1.3794 | 0.12 | 32 | 2.7641 |
	| 1.3581 | 0.14 | 36 | 2.7628 |
	| 1.3712 | 0.15 | 40 | 2.7578 |
	| 1.328 | 0.17 | 44 | 2.7535 |
	| 1.3937 | 0.19 | 48 | 2.7494 |
	| 1.3843 | 0.2 | 52 | 2.7387 |
	| 1.2925 | 0.22 | 56 | 2.7368 |
	| 1.3135 | 0.23 | 60 | 2.7375 |
	| 1.3633 | 0.25 | 64 | 2.7335 |
	| 1.326 | 0.26 | 68 | 2.7365 |
	| 1.3392 | 0.28 | 72 | 2.7361 |
	| 1.2583 | 0.29 | 76 | 2.7316 |
	| 1.2652 | 0.31 | 80 | 2.7353 |
	| 1.2756 | 0.32 | 84 | 2.7394 |
	| 1.2966 | 0.34 | 88 | 2.7400 |
	| 1.359 | 0.36 | 92 | 2.7348 |
	| 1.3704 | 0.37 | 96 | 2.7342 |
	| 1.3389 | 0.39 | 100 | 2.7330 |
	| 1.3471 | 0.4 | 104 | 2.7336 |
	| 1.3288 | 0.42 | 108 | 2.7380 |
	| 1.2856 | 0.43 | 112 | 2.7382 |
	| 1.3277 | 0.45 | 116 | 2.7380 |
	| 1.2779 | 0.46 | 120 | 2.7414 |
	| 1.2967 | 0.48 | 124 | 2.7403 |
	| 1.2586 | 0.5 | 128 | 2.7433 |
	| 1.2652 | 0.51 | 132 | 2.7407 |
	| 1.3011 | 0.53 | 136 | 2.7399 |
	| 1.3377 | 0.54 | 140 | 2.7415 |
	| 1.295 | 0.56 | 144 | 2.7384 |
	| 1.3342 | 0.57 | 148 | 2.7344 |
	| 1.3309 | 0.59 | 152 | 2.7409 |
	| 1.3463 | 0.6 | 156 | 2.7394 |
	| 1.3104 | 0.62 | 160 | 2.7353 |
	| 1.2692 | 0.63 | 164 | 2.7380 |
	| 1.364 | 0.65 | 168 | 2.7386 |
	| 1.2888 | 0.67 | 172 | 2.7370 |
	| 1.3238 | 0.68 | 176 | 2.7380 |
	| 1.2687 | 0.7 | 180 | 2.7371 |
	| 1.2405 | 0.71 | 184 | 2.7396 |
	| 1.3065 | 0.73 | 188 | 2.7388 |
	| 1.2774 | 0.74 | 192 | 2.7424 |
	| 1.3195 | 0.76 | 196 | 2.7382 |
	| 1.2521 | 0.77 | 200 | 2.7413 |
	| 1.2922 | 0.79 | 204 | 2.7393 |
	| 1.3293 | 0.8 | 208 | 2.7394 |
	| 1.3062 | 0.82 | 212 | 2.7362 |
	| 1.2978 | 0.84 | 216 | 2.7394 |
	| 1.3054 | 0.85 | 220 | 2.7359 |
	| 1.3377 | 0.87 | 224 | 2.7383 |
	| 1.3088 | 0.88 | 228 | 2.7363 |
	| 1.296 | 0.9 | 232 | 2.7347 |
	| 1.3099 | 0.91 | 236 | 2.7394 |
	| 1.3008 | 0.93 | 240 | 2.7358 |
	| 1.2943 | 0.94 | 244 | 2.7417 |
	| 1.3035 | 0.96 | 248 | 2.7398 |
	| 1.3877 | 0.97 | 252 | 2.7390 |
	| 1.3324 | 0.99 | 256 | 2.7324 |
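In the table above, the validation loss falls quickly over the first 50 or so steps and then stays between about 2.73 and 2.74. The lowest logged value is 2.7316 at step 76, slightly below the final 2.7324. A minimal sketch of picking out that row programmatically follows. `ROWS` holds only an excerpt of the table, and the helper names `parse_rows` and `best_checkpoint` are illustrative.

```python
# Excerpt of rows copied from the table above:
# training loss, epoch, step, validation loss.
ROWS = """
| 1.6335 | 0.02 | 4 | 2.8691 |
| 1.3633 | 0.25 | 64 | 2.7335 |
| 1.2583 | 0.29 | 76 | 2.7316 |
| 1.3389 | 0.39 | 100 | 2.7330 |
| 1.3324 | 0.99 | 256 | 2.7324 |
"""


def parse_rows(table: str) -> list[dict]:
    """Parse Markdown table rows into dictionaries of numbers."""
    records = []
    for line in table.strip().splitlines():
        # Tolerate backslash-escaped pipes, which some exports produce.
        line = line.strip().replace("\\|", "|")
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        train_loss, epoch, step, val_loss = cells
        records.append({
            "train_loss": float(train_loss),
            "epoch": float(epoch),
            "step": int(step),
            "val_loss": float(val_loss),
        })
    return records


def best_checkpoint(records: list[dict]) -> dict:
    """Return the logged row with the lowest validation loss."""
    return min(records, key=lambda row: row["val_loss"])


best = best_checkpoint(parse_rows(ROWS))
print(best["step"], best["val_loss"])
```

Whether an intermediate checkpoint was saved at that step is not stated in this card, so the result is only useful for reading the curve, not for selecting weights.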


	### Framework versions

	- Transformers 4.34.0.dev0
	- PyTorch 2.0.1+cu118
	- Datasets 2.14.6.dev0
	- Tokenizers 0.14.0
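To compare a local environment against the versions listed above, something like the following sketch could be used. It relies only on the standard library's `importlib.metadata`. The distribution names in `EXPECTED` are the usual PyPI names and are an assumption here, as are the helper names. The `.dev0` versions were likely installed from source, so an exact match may not be achievable from PyPI alone.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this card, keyed by assumed distribution name.
EXPECTED = {
    "transformers": "4.34.0.dev0",
    "torch": "2.0.1+cu118",
    "datasets": "2.14.6.dev0",
    "tokenizers": "0.14.0",
}


def installed_version(dist_name: str):
    """Return the installed version string, or None if not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def version_mismatches(expected: dict, lookup=installed_version) -> dict:
    """Map each differing package to a (wanted, found) pair."""
    mismatches = {}
    for name, wanted in expected.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in version_mismatches(EXPECTED).items():
        print(f"{name}: card lists {wanted}, found {found}")
```

The `lookup` parameter exists so the comparison can be exercised without installing anything; by default it queries the current environment.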
