---
library_name: transformers
license: apache-2.0
base_model: allenai/longformer-base-4096
tags:
- generated_from_trainer
datasets:
- stab-gurevych-essays
metrics:
- accuracy
model-index:
- name: longformer-simple
  results:
  - task:
      name: Token Classification
      type: token-classification
    dataset:
      name: stab-gurevych-essays
      type: stab-gurevych-essays
      config: simple
      split: train[0%:20%]
      args: simple
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.8751580602166706
---
# longformer-simple

This model is a fine-tuned version of [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096) on the stab-gurevych-essays dataset.
It achieves the following results on the evaluation set:

- Loss: 0.3326
- Accuracy: 0.8752

| Class        | Precision | Recall | F1-score | Support |
|:-------------|----------:|-------:|---------:|--------:|
| Claim        | 0.6375    | 0.6000 | 0.6182   | 4098    |
| Majorclaim   | 0.8534    | 0.7854 | 0.8180   | 2157    |
| O            | 0.9585    | 0.9674 | 0.9629   | 9851    |
| Premise      | 0.8849    | 0.9065 | 0.8956   | 13155   |
| Macro avg    | 0.8336    | 0.8148 | 0.8237   | 29261   |
| Weighted avg | 0.8727    | 0.8752 | 0.8737   | 29261   |
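Downstream of token classification, per-token predictions over these four labels are usually merged into contiguous argument spans. A minimal post-processing sketch; the helper name and the `(label, start, end)` span convention are illustrative assumptions, not part of the training code:

```python
from itertools import groupby

def spans_from_labels(labels):
    """Merge runs of identical non-"O" token labels into
    (label, start, end) spans, with `end` exclusive.
    Illustrative helper; not part of the model card's training code."""
    spans, pos = [], 0
    for label, group in groupby(labels):
        run_length = sum(1 for _ in group)
        if label != "O":
            spans.append((label, pos, pos + run_length))
        pos += run_length
    return spans
```

For example, `spans_from_labels(["O", "Claim", "Claim", "O", "Premise"])` yields `[("Claim", 1, 3), ("Premise", 4, 5)]`.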
## Model description

A token-classification model for argument mining in persuasive essays: each token is labelled `O`, `Claim`, `Majorclaim`, or `Premise`. It fine-tunes [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096), whose sparse sliding-window attention supports inputs of up to 4,096 tokens, so an entire essay can typically be tagged in a single forward pass.
## Intended uses & limitations

The model is intended for token-level tagging of argument components (major claims, claims, and premises) in essays comparable to the training data. Performance on other domains, genres, or languages has not been evaluated. Note also that `Claim` tokens are recognised markedly less reliably (F1 ≈ 0.62) than the other classes.
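As a usage sketch, the checkpoint can be served through a 🤗 Transformers token-classification pipeline. The local checkpoint path below is an assumption (the card does not state a published repo id), and the import is deferred so the snippet can be read without `transformers` installed:

```python
def load_tagger(checkpoint="./longformer-simple"):
    """Build a token-classification pipeline over the fine-tuned
    checkpoint. The path is hypothetical: point it at wherever the
    trained model was saved."""
    from transformers import pipeline  # deferred import
    return pipeline(
        "token-classification",
        model=checkpoint,
        aggregation_strategy="simple",  # group sub-tokens into word-level entities
    )

# tagger = load_tagger()
# tagger("Cars should be banned because they pollute the air.")
```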
## Training and evaluation data

The model was fine-tuned on the `simple` configuration of the stab-gurevych-essays dataset (presumably the Stab & Gurevych corpus of argument-annotated persuasive essays). Per the model-index metadata, the reported metrics were computed on the `train[0%:20%]` slice.
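The `train[0%:20%]` split string uses the 🤗 Datasets percent-slicing syntax. A rough sketch of the index arithmetic it implies; the floor-based rounding here is an approximation (the library's exact rounding rules may differ), and the dataset id in the comment is taken from the YAML header:

```python
# Equivalent library call, for reference:
# from datasets import load_dataset
# eval_ds = load_dataset("stab-gurevych-essays", "simple", split="train[0%:20%]")

def percent_slice(num_rows, start_pct, stop_pct):
    """Approximate the (start, stop) row indices that a
    "train[start%:stop%]" split selects, using floor division."""
    return num_rows * start_pct // 100, num_rows * stop_pct // 100
```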
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5
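A sketch of how the listed hyperparameters map onto `transformers.TrainingArguments`; `output_dir` is an assumption, and the Adam betas/epsilon shown are also the library defaults:

```python
from transformers import TrainingArguments

# Mirrors the hyperparameter list above; output_dir is illustrative.
training_args = TrainingArguments(
    output_dir="longformer-simple",
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=5,
)
```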
### Training results

Per-class and averaged columns report F1-score; class supports are Claim 4098, Majorclaim 2157, O 9851, Premise 13155 (29261 tokens in total).

| Training Loss | Epoch | Step | Validation Loss | Claim F1 | Majorclaim F1 | O F1   | Premise F1 | Accuracy | Macro avg F1 | Weighted avg F1 |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:-------------:|:------:|:----------:|:--------:|:------------:|:---------------:|
| No log        | 1.0   | 41   | 0.5843          | 0.2193   | 0.5429        | 0.8747 | 0.8597     | 0.7834   | 0.6241       | 0.7517          |
| No log        | 2.0   | 82   | 0.4171          | 0.4568   | 0.6554        | 0.9518 | 0.8795     | 0.8389   | 0.7359       | 0.8281          |
| No log        | 3.0   | 123  | 0.3525          | 0.5587   | 0.7832        | 0.9563 | 0.8916     | 0.8637   | 0.7974       | 0.8588          |
| No log        | 4.0   | 164  | 0.3385          | 0.6051   | 0.8007        | 0.9613 | 0.8921     | 0.8694   | 0.8148       | 0.8684          |
| No log        | 5.0   | 205  | 0.3326          | 0.6182   | 0.8180        | 0.9629 | 0.8956     | 0.8752   | 0.8237       | 0.8737          |
### Framework versions

- Transformers 4.45.2
- Pytorch 2.5.0+cu124
- Datasets 2.19.1
- Tokenizers 0.20.1