|
--- |
|
library_name: transformers |
|
license: apache-2.0 |
|
base_model: allenai/longformer-base-4096 |
|
tags: |
|
- generated_from_trainer |
|
datasets: |
|
- stab-gurevych-essays |
|
metrics: |
|
- accuracy |
|
model-index: |
|
- name: longformer-sep_tok |
|
results: |
|
- task: |
|
name: Token Classification |
|
type: token-classification |
|
dataset: |
|
name: stab-gurevych-essays |
|
type: stab-gurevych-essays |
|
config: sep_tok |
|
split: train[0%:20%] |
|
args: sep_tok |
|
metrics: |
|
- name: Accuracy |
|
type: accuracy |
|
value: 0.8978882572968515 |
|
--- |
|
|
|
|
|
|
# longformer-sep_tok |
|
|
|
This model is a fine-tuned version of [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096) on the stab-gurevych-essays dataset. |
|
It achieves the following results on the evaluation set: |
|
- Loss: 0.2532 |
|
| Label        | Precision | Recall | F1-score | Support |
|:-------------|----------:|-------:|---------:|--------:|
| Claim        | 0.6493    | 0.5892 | 0.6178   | 4666    |
| Majorclaim   | 0.8649    | 0.8003 | 0.8313   | 2439    |
| O            | 1.0000    | 0.9993 | 0.9996   | 12277   |
| Premise      | 0.8896    | 0.9277 | 0.9082   | 14571   |
| Macro avg    | 0.8509    | 0.8291 | 0.8392   | 33953   |
| Weighted avg | 0.8947    | 0.8979 | 0.8958   | 33953   |

- Accuracy: 0.8979
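The per-class metric blocks above match the output format of scikit-learn's `classification_report` with `output_dict=True`. A minimal sketch of how such numbers are produced from flattened token labels (toy data below, not the actual evaluation set):

```python
from sklearn.metrics import classification_report

# Toy flattened token labels (hypothetical, not the real evaluation data).
y_true = ["O", "Claim", "Claim", "Premise", "Majorclaim", "O"]
y_pred = ["O", "Claim", "Premise", "Premise", "Majorclaim", "O"]

report = classification_report(y_true, y_pred, output_dict=True, zero_division=0)
print(report["Claim"])     # per-class precision / recall / f1-score / support
print(report["accuracy"])  # overall token accuracy
```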
|
|
|
## Model description |
|
|
|
Based on the metadata and labels above, this appears to be a Longformer model fine-tuned for token-level argument mining: each token in a persuasive essay is classified as part of a `Claim`, `Majorclaim`, or `Premise`, or as non-argumentative (`O`). Further details have not been provided by the model author.
|
|
|
## Intended uses & limitations |
|
|
|
More information needed |
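Pending details from the author, a minimal inference sketch (this assumes the model has been pushed to the Hugging Face Hub; the repo id below is a placeholder, not the actual path):

```python
from transformers import pipeline

# Placeholder repo id -- replace with the actual Hub path of this model.
classifier = pipeline(
    "token-classification",
    model="your-username/longformer-sep_tok",
    aggregation_strategy="simple",  # merge sub-word tokens into labeled spans
)

essay = "Schools should teach programming. It fosters problem-solving skills."
for span in classifier(essay):
    print(span["entity_group"], span["word"], round(span["score"], 3))
```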
|
|
|
## Training and evaluation data |
|
|
|
Per the metadata above, the model was trained on the `sep_tok` configuration of the stab-gurevych-essays dataset, with the reported metrics computed on the `train[0%:20%]` split. Preprocessing details have not been provided.
|
|
|
## Training procedure |
|
|
|
### Training hyperparameters |
|
|
|
The following hyperparameters were used during training: |
|
- learning_rate: 2e-05 |
|
- train_batch_size: 8 |
|
- eval_batch_size: 8 |
|
- seed: 42 |
|
- optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
|
- lr_scheduler_type: linear |
|
- num_epochs: 5 |
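The hyperparameters above can be expressed as a `TrainingArguments` sketch (the `output_dir` and the per-epoch evaluation strategy are assumptions, not recorded in this card):

```python
from transformers import TrainingArguments

# Sketch of the recorded hyperparameters; output_dir and eval_strategy are assumed.
args = TrainingArguments(
    output_dir="longformer-sep_tok",
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=5,
    eval_strategy="epoch",  # assumed; per-epoch validation results are reported below
)
```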
|
|
|
### Training results |
|
|
|
| Training Loss | Epoch | Step | Validation Loss | Claim | Majorclaim | O | Premise | Accuracy | Macro avg | Weighted avg | |
|
|:-------------:|:-----:|:----:|:---------------:|:--------------------------------------------------------------------------------------------------------------------:|:------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|:--------:|:-------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:| |
|
| No log | 1.0 | 41 | 0.3662 | {'precision': 0.5429831006612784, 'recall': 0.15837976853836264, 'f1-score': 0.24522979923676788, 'support': 4666.0} | {'precision': 0.6182052106786748, 'recall': 0.7880278802788028, 'f1-score': 0.6928622927180966, 'support': 2439.0} | {'precision': 0.9983710702068741, 'recall': 0.9984523906491813, 'f1-score': 0.9984117287721441, 'support': 12277.0} | {'precision': 0.81162452775356, 'recall': 0.9583419120170201, 'f1-score': 0.8789023162134945, 'support': 14571.0} | 0.8507 | {'precision': 0.7427959773250968, 'recall': 0.7258004878708417, 'f1-score': 0.7038515342351258, 'support': 33953.0} | {'precision': 0.8283375336305401, 'recall': 0.8506759343798781, 'f1-score': 0.8216687720926173, 'support': 33953.0} | |
|
| No log | 2.0 | 82 | 0.2911 | {'precision': 0.6262440103206782, 'recall': 0.3641234462066009, 'f1-score': 0.4604960021683155, 'support': 4666.0} | {'precision': 0.7612877895563408, 'recall': 0.7949979499794998, 'f1-score': 0.7777777777777778, 'support': 2439.0} | {'precision': 0.9998368545558365, 'recall': 0.9983709375254541, 'f1-score': 0.9991033583306163, 'support': 12277.0} | {'precision': 0.8484239990264086, 'recall': 0.9569006931576419, 'f1-score': 0.8994033220448315, 'support': 14571.0} | 0.8788 | {'precision': 0.808948163364816, 'recall': 0.7785982567172992, 'f1-score': 0.7841951150803853, 'support': 33953.0} | {'precision': 0.8663805444019677, 'recall': 0.878803051276765, 'f1-score': 0.8663997903530639, 'support': 33953.0} | |
|
| No log | 3.0 | 123 | 0.2556 | {'precision': 0.6437802907915994, 'recall': 0.5124303471924561, 'f1-score': 0.5706443914081145, 'support': 4666.0} | {'precision': 0.8149312377210216, 'recall': 0.8503485034850349, 'f1-score': 0.8322632423756019, 'support': 2439.0} | {'precision': 1.0, 'recall': 0.9994298281339089, 'f1-score': 0.9997148327697886, 'support': 12277.0} | {'precision': 0.8788900414937759, 'recall': 0.9303410884633861, 'f1-score': 0.9038839806634439, 'support': 14571.0} | 0.8921 | {'precision': 0.8344003925015993, 'recall': 0.8231374418186965, 'f1-score': 0.8266266118042372, 'support': 33953.0} | {'precision': 0.8857774841763904, 'recall': 0.8921450240037699, 'f1-score': 0.8875948888942388, 'support': 33953.0} | |
|
| No log | 4.0 | 164 | 0.2579 | {'precision': 0.5919405320813772, 'recall': 0.6485212173167595, 'f1-score': 0.6189404786254857, 'support': 4666.0} | {'precision': 0.8280930992241732, 'recall': 0.8314883148831488, 'f1-score': 0.8297872340425532, 'support': 2439.0} | {'precision': 1.0, 'recall': 0.9988596562678179, 'f1-score': 0.9994295028524858, 'support': 12277.0} | {'precision': 0.9086276452685965, 'recall': 0.88106512936655, 'f1-score': 0.8946341463414634, 'support': 14571.0} | 0.8881 | {'precision': 0.8321653191435366, 'recall': 0.839983579458569, 'f1-score': 0.835697840465497, 'support': 33953.0} | {'precision': 0.8923608226344708, 'recall': 0.8881394869378259, 'f1-score': 0.8899813710116259, 'support': 33953.0} | |
|
| No log | 5.0 | 205 | 0.2532 | {'precision': 0.6492678318375059, 'recall': 0.5891555936562366, 'f1-score': 0.6177528089887641, 'support': 4666.0} | {'precision': 0.8648648648648649, 'recall': 0.8003280032800328, 'f1-score': 0.8313458262350937, 'support': 2439.0} | {'precision': 1.0, 'recall': 0.9992669218864544, 'f1-score': 0.9996333265430841, 'support': 12277.0} | {'precision': 0.8896274845333685, 'recall': 0.9276645391531123, 'f1-score': 0.9082479422140097, 'support': 14571.0} | 0.8979 | {'precision': 0.8509400453089349, 'recall': 0.8291037644939591, 'f1-score': 0.839244975995238, 'support': 33953.0} | {'precision': 0.8947265686653585, 'recall': 0.8978882572968515, 'f1-score': 0.8958462048390052, 'support': 33953.0} | |
|
|
|
|
|
### Framework versions |
|
|
|
- Transformers 4.46.0 |
|
- Pytorch 2.5.0+cu124 |
|
- Datasets 3.0.2 |
|
- Tokenizers 0.20.1 |
|
|