	---
	license: apache-2.0
	base_model: projecte-aina/roberta-base-ca-v2-cased-te
	tags:
	- generated_from_trainer
	metrics:
	- accuracy
	- precision
	- recall
	- f1
	model-index:
	- name: 2504v2
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# 2504v2

	This model is a fine-tuned version of [projecte-aina/roberta-base-ca-v2-cased-te](https://huggingface.co/projecte-aina/roberta-base-ca-v2-cased-te) on an unspecified dataset (the Trainer did not record a dataset name).
	It achieves the following results on the evaluation set:
	- Loss: 0.6769
	- Accuracy: 0.8655
	- Precision: 0.8660
	- Recall: 0.8655
	- F1: 0.8655
	- Ratio: 0.5168
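
The card does not say how precision, recall and F1 were averaged, and it does not define "Ratio". As a rough illustration only, the sketch below computes accuracy together with macro-averaged precision, recall and F1 in plain Python. Macro averaging is an assumption, `classification_metrics` is a hypothetical helper name, and "Ratio" is left out because its definition is unknown.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall and F1.

    Macro averaging is an assumption; the model card does not state
    which averaging mode was used.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    # Per-label counts of true positives, false positives, false negatives.
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1
            fn[t] += 1

    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


if __name__ == "__main__":
    # Toy labels, not taken from the model's evaluation set.
    print(classification_metrics([0, 0, 1, 1], [0, 1, 1, 1]))
```

With macro averaging, recall equals accuracy whenever the classes are equally represented, which would be consistent with the near-identical Accuracy and Recall values reported here, though the card does not confirm this.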

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 1e-05
	- train_batch_size: 16
	- eval_batch_size: 8
	- seed: 42
	- gradient_accumulation_steps: 3
	- total_train_batch_size: 48
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- lr_scheduler_warmup_ratio: 0.06
	- lr_scheduler_warmup_steps: 4
	- num_epochs: 10
	- label_smoothing_factor: 0.2
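
To make the relationship between these values concrete, here is a minimal plain-Python sketch of the effective batch size and of a linear schedule with warmup. It rests on several assumptions: a single device (16 × 3 = 48), a total of 250 optimizer steps (the last step in the training results table), and that the explicit `lr_scheduler_warmup_steps: 4` takes precedence over the warmup ratio. The helper names are hypothetical and this is not the Trainer's own code.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 1e-05
TRAIN_BATCH_SIZE = 16
GRAD_ACCUM_STEPS = 3
WARMUP_STEPS = 4

# Assumption: total optimizer steps, taken from the final row of the results table.
TOTAL_STEPS = 250


def effective_batch_size(per_device, accum_steps, num_devices=1):
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device * accum_steps * num_devices


def linear_lr(step, base_lr=LEARNING_RATE, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Linear warmup from 0 to base_lr, then linear decay to 0 at `total`."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    for step in (0, 2, 4, 127, 250):
        print(step, linear_lr(step))
```

Under these assumptions the learning rate peaks at 1e-05 after step 4 and decays to zero by step 250.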

	### Training results

	| Training Loss | Epoch  | Step | Validation Loss | Accuracy | Precision | Recall | F1     | Ratio  |
	|:-------------:|:------:|:----:|:---------------:|:--------:|:---------:|:------:|:------:|:------:|
	| 4.1824        | 0.3896 | 10   | 2.4179          | 0.5084   | 0.3727    | 0.3389 | 0.3212 | 0.7479 |
	| 1.997         | 0.7792 | 20   | 1.6877          | 0.5462   | 0.5489    | 0.5462 | 0.5398 | 0.3824 |
	| 1.4096        | 1.1688 | 30   | 1.2832          | 0.5924   | 0.5939    | 0.5924 | 0.5908 | 0.5630 |
	| 1.1296        | 1.5584 | 40   | 1.1040          | 0.6176   | 0.6187    | 0.6176 | 0.6168 | 0.5462 |
	| 1.0408        | 1.9481 | 50   | 0.9666          | 0.7227   | 0.7292    | 0.7227 | 0.7207 | 0.5840 |
	| 0.9242        | 2.3377 | 60   | 0.8829          | 0.7815   | 0.7816    | 0.7815 | 0.7815 | 0.4916 |
	| 0.8948        | 2.7273 | 70   | 0.8146          | 0.7899   | 0.7940    | 0.7899 | 0.7892 | 0.4412 |
	| 0.842         | 3.1169 | 80   | 0.7745          | 0.7941   | 0.8101    | 0.7941 | 0.7914 | 0.6134 |
	| 0.7715        | 3.5065 | 90   | 0.7244          | 0.8277   | 0.8279    | 0.8277 | 0.8277 | 0.4874 |
	| 0.7361        | 3.8961 | 100  | 0.7224          | 0.8151   | 0.8243    | 0.8151 | 0.8138 | 0.5840 |
	| 0.7115        | 4.2857 | 110  | 0.7004          | 0.8403   | 0.8407    | 0.8403 | 0.8403 | 0.5168 |
	| 0.7076        | 4.6753 | 120  | 0.6940          | 0.8403   | 0.8407    | 0.8403 | 0.8403 | 0.4832 |
	| 0.7026        | 5.0649 | 130  | 0.6936          | 0.8487   | 0.8491    | 0.8487 | 0.8487 | 0.5168 |
	| 0.6717        | 5.4545 | 140  | 0.6912          | 0.8571   | 0.8581    | 0.8571 | 0.8571 | 0.4748 |
	| 0.7166        | 5.8442 | 150  | 0.6867          | 0.8571   | 0.8575    | 0.8571 | 0.8571 | 0.5168 |
	| 0.6606        | 6.2338 | 160  | 0.6812          | 0.8613   | 0.8616    | 0.8613 | 0.8613 | 0.4874 |
	| 0.6939        | 6.6234 | 170  | 0.6747          | 0.8613   | 0.8614    | 0.8613 | 0.8613 | 0.4958 |
	| 0.6609        | 7.0130 | 180  | 0.6744          | 0.8613   | 0.8616    | 0.8613 | 0.8613 | 0.5126 |
	| 0.6388        | 7.4026 | 190  | 0.6790          | 0.8529   | 0.8532    | 0.8529 | 0.8529 | 0.5126 |
	| 0.6435        | 7.7922 | 200  | 0.6840          | 0.8571   | 0.8572    | 0.8571 | 0.8571 | 0.5084 |
	| 0.6534        | 8.1818 | 210  | 0.6828          | 0.8571   | 0.8571    | 0.8571 | 0.8571 | 0.5    |
	| 0.6552        | 8.5714 | 220  | 0.6818          | 0.8655   | 0.8660    | 0.8655 | 0.8655 | 0.5168 |
	| 0.646         | 8.9610 | 230  | 0.6788          | 0.8655   | 0.8660    | 0.8655 | 0.8655 | 0.5168 |
	| 0.6443        | 9.3506 | 240  | 0.6770          | 0.8655   | 0.8660    | 0.8655 | 0.8655 | 0.5168 |
	| 0.6418        | 9.7403 | 250  | 0.6769          | 0.8655   | 0.8660    | 0.8655 | 0.8655 | 0.5168 |
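
The headline metrics at the top of the card match the final row (step 250), which is not the row with the lowest validation loss. The small sketch below shows how one might compare checkpoints using the values copied from the last nine rows of the table. The tuple layout and variable names are illustrative, and whether intermediate checkpoints were actually saved is not stated in the card.

```python
# (step, validation_loss, accuracy) copied from the last rows of the table.
rows = [
    (170, 0.6747, 0.8613),
    (180, 0.6744, 0.8613),
    (190, 0.6790, 0.8529),
    (200, 0.6840, 0.8571),
    (210, 0.6828, 0.8571),
    (220, 0.6818, 0.8655),
    (230, 0.6788, 0.8655),
    (240, 0.6770, 0.8655),
    (250, 0.6769, 0.8655),
]

# Row with the lowest validation loss.
best_by_loss = min(rows, key=lambda r: r[1])

# First row that reaches the highest accuracy (max keeps the earliest tie).
best_by_accuracy = max(rows, key=lambda r: r[2])

if __name__ == "__main__":
    print("lowest validation loss:", best_by_loss)
    print("first highest accuracy:", best_by_accuracy)
```

The two criteria pick different steps here, so the choice of selection metric matters if earlier checkpoints are available.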


	### Framework versions

	- Transformers 4.40.0
	- Pytorch 2.2.1+cu121
	- Datasets 2.19.0
	- Tokenizers 0.19.1