|
--- |
|
library_name: transformers |
|
license: apache-2.0 |
|
base_model: allenai/longformer-base-4096 |
|
tags: |
|
- generated_from_trainer |
|
datasets: |
|
- stab-gurevych-essays |
|
metrics: |
|
- accuracy |
|
model-index: |
|
- name: longformer-sep_tok |
|
results: |
|
- task: |
|
name: Token Classification |
|
type: token-classification |
|
dataset: |
|
name: stab-gurevych-essays |
|
type: stab-gurevych-essays |
|
config: sep_tok |
|
split: train[0%:20%] |
|
args: sep_tok |
|
metrics: |
|
- name: Accuracy |
|
type: accuracy |
|
value: 0.8978882572968515 |
|
--- |
|
|
|
|
|
|
# longformer-sep_tok |
|
|
|
This model is a fine-tuned version of [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096) on the stab-gurevych-essays dataset. |
|
It achieves the following results on the evaluation set: |
|
- Loss: 0.2532 |
|
| Label        | Precision | Recall | F1-score | Support |
|:-------------|----------:|-------:|---------:|--------:|
| Claim        | 0.6493    | 0.5892 | 0.6178   | 4666    |
| Majorclaim   | 0.8649    | 0.8003 | 0.8313   | 2439    |
| O            | 1.0000    | 0.9993 | 0.9996   | 12277   |
| Premise      | 0.8896    | 0.9277 | 0.9082   | 14571   |
| Macro avg    | 0.8509    | 0.8291 | 0.8392   | 33953   |
| Weighted avg | 0.8947    | 0.8979 | 0.8958   | 33953   |

- Accuracy: 0.8979
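The per-class metric blocks above match the output format of scikit-learn's `classification_report` with `output_dict=True`. A minimal sketch of how such numbers are produced from flattened token labels (toy data below, not the actual evaluation set):

```python
from sklearn.metrics import classification_report

# Toy flattened token labels (hypothetical, not the real evaluation data).
y_true = ["O", "Claim", "Claim", "Premise", "Majorclaim", "O"]
y_pred = ["O", "Claim", "Premise", "Premise", "Majorclaim", "O"]

report = classification_report(y_true, y_pred, output_dict=True, zero_division=0)
print(report["Claim"])     # per-class precision / recall / f1-score / support
print(report["accuracy"])  # overall token accuracy
```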
|
|
|
## Model description |
|
|
|
Based on the metadata and labels above, this appears to be a Longformer model fine-tuned for token-level argument mining: each token in a persuasive essay is classified as part of a `Claim`, `Majorclaim`, or `Premise`, or as non-argumentative (`O`). Further details have not been provided by the model author.
|
|
|
## Intended uses & limitations |
|
|
|
More information needed |
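Pending details from the author, a minimal inference sketch (this assumes the model has been pushed to the Hugging Face Hub; the repo id below is a placeholder, not the actual path):

```python
from transformers import pipeline

# Placeholder repo id -- replace with the actual Hub path of this model.
classifier = pipeline(
    "token-classification",
    model="your-username/longformer-sep_tok",
    aggregation_strategy="simple",  # merge sub-word tokens into labeled spans
)

essay = "Schools should teach programming. It fosters problem-solving skills."
for span in classifier(essay):
    print(span["entity_group"], span["word"], round(span["score"], 3))
```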
|
|
|
## Training and evaluation data |
|
|
|
Per the metadata above, the model was trained on the `sep_tok` configuration of the stab-gurevych-essays dataset, with the reported metrics computed on the `train[0%:20%]` split. Preprocessing details have not been provided.
|
|
|
## Training procedure |
|
|
|
### Training hyperparameters |
|
|
|
The following hyperparameters were used during training: |
|
- learning_rate: 2e-05 |
|
- train_batch_size: 8 |
|
- eval_batch_size: 8 |
|
- seed: 42 |
|
- optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
|
- lr_scheduler_type: linear |
|
- num_epochs: 5 |
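The hyperparameters above can be expressed as a `TrainingArguments` sketch (the `output_dir` and the per-epoch evaluation strategy are assumptions, not recorded in this card):

```python
from transformers import TrainingArguments

# Sketch of the recorded hyperparameters; output_dir and eval_strategy are assumed.
args = TrainingArguments(
    output_dir="longformer-sep_tok",
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=5,
    eval_strategy="epoch",  # assumed; per-epoch validation results are reported below
)
```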
|
|
|
### Training results |
|
|
|
| Training Loss | Epoch | Step | Validation Loss | Claim | Majorclaim | O | Premise | Accuracy | Macro avg | Weighted avg | |
|
|:-------------:|:-----:|:----:|:---------------:|:--------------------------------------------------------------------------------------------------------------------:|:------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|:--------:|:-------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:| |
|
| No log | 1.0 | 41 | 0.3662 | {'precision': 0.5429831006612784, 'recall': 0.15837976853836264, 'f1-score': 0.24522979923676788, 'support': 4666.0} | {'precision': 0.6182052106786748, 'recall': 0.7880278802788028, 'f1-score': 0.6928622927180966, 'support': 2439.0} | {'precision': 0.9983710702068741, 'recall': 0.9984523906491813, 'f1-score': 0.9984117287721441, 'support': 12277.0} | {'precision': 0.81162452775356, 'recall': 0.9583419120170201, 'f1-score': 0.8789023162134945, 'support': 14571.0} | 0.8507 | {'precision': 0.7427959773250968, 'recall': 0.7258004878708417, 'f1-score': 0.7038515342351258, 'support': 33953.0} | {'precision': 0.8283375336305401, 'recall': 0.8506759343798781, 'f1-score': 0.8216687720926173, 'support': 33953.0} | |
|
| No log | 2.0 | 82 | 0.2911 | {'precision': 0.6262440103206782, 'recall': 0.3641234462066009, 'f1-score': 0.4604960021683155, 'support': 4666.0} | {'precision': 0.7612877895563408, 'recall': 0.7949979499794998, 'f1-score': 0.7777777777777778, 'support': 2439.0} | {'precision': 0.9998368545558365, 'recall': 0.9983709375254541, 'f1-score': 0.9991033583306163, 'support': 12277.0} | {'precision': 0.8484239990264086, 'recall': 0.9569006931576419, 'f1-score': 0.8994033220448315, 'support': 14571.0} | 0.8788 | {'precision': 0.808948163364816, 'recall': 0.7785982567172992, 'f1-score': 0.7841951150803853, 'support': 33953.0} | {'precision': 0.8663805444019677, 'recall': 0.878803051276765, 'f1-score': 0.8663997903530639, 'support': 33953.0} | |
|
| No log | 3.0 | 123 | 0.2556 | {'precision': 0.6437802907915994, 'recall': 0.5124303471924561, 'f1-score': 0.5706443914081145, 'support': 4666.0} | {'precision': 0.8149312377210216, 'recall': 0.8503485034850349, 'f1-score': 0.8322632423756019, 'support': 2439.0} | {'precision': 1.0, 'recall': 0.9994298281339089, 'f1-score': 0.9997148327697886, 'support': 12277.0} | {'precision': 0.8788900414937759, 'recall': 0.9303410884633861, 'f1-score': 0.9038839806634439, 'support': 14571.0} | 0.8921 | {'precision': 0.8344003925015993, 'recall': 0.8231374418186965, 'f1-score': 0.8266266118042372, 'support': 33953.0} | {'precision': 0.8857774841763904, 'recall': 0.8921450240037699, 'f1-score': 0.8875948888942388, 'support': 33953.0} | |
|
| No log | 4.0 | 164 | 0.2579 | {'precision': 0.5919405320813772, 'recall': 0.6485212173167595, 'f1-score': 0.6189404786254857, 'support': 4666.0} | {'precision': 0.8280930992241732, 'recall': 0.8314883148831488, 'f1-score': 0.8297872340425532, 'support': 2439.0} | {'precision': 1.0, 'recall': 0.9988596562678179, 'f1-score': 0.9994295028524858, 'support': 12277.0} | {'precision': 0.9086276452685965, 'recall': 0.88106512936655, 'f1-score': 0.8946341463414634, 'support': 14571.0} | 0.8881 | {'precision': 0.8321653191435366, 'recall': 0.839983579458569, 'f1-score': 0.835697840465497, 'support': 33953.0} | {'precision': 0.8923608226344708, 'recall': 0.8881394869378259, 'f1-score': 0.8899813710116259, 'support': 33953.0} | |
|
| No log | 5.0 | 205 | 0.2532 | {'precision': 0.6492678318375059, 'recall': 0.5891555936562366, 'f1-score': 0.6177528089887641, 'support': 4666.0} | {'precision': 0.8648648648648649, 'recall': 0.8003280032800328, 'f1-score': 0.8313458262350937, 'support': 2439.0} | {'precision': 1.0, 'recall': 0.9992669218864544, 'f1-score': 0.9996333265430841, 'support': 12277.0} | {'precision': 0.8896274845333685, 'recall': 0.9276645391531123, 'f1-score': 0.9082479422140097, 'support': 14571.0} | 0.8979 | {'precision': 0.8509400453089349, 'recall': 0.8291037644939591, 'f1-score': 0.839244975995238, 'support': 33953.0} | {'precision': 0.8947265686653585, 'recall': 0.8978882572968515, 'f1-score': 0.8958462048390052, 'support': 33953.0} | |
|
|
|
|
|
### Framework versions |
|
|
|
- Transformers 4.46.0 |
|
- Pytorch 2.5.0+cu124 |
|
- Datasets 3.0.2 |
|
- Tokenizers 0.20.1 |
|
|