memo3_indirect_speech

This model is a fine-tuned version of MiMe-MeMo/MeMo-BERT-03 on an unknown dataset. It achieves the following results on the evaluation set:

  • Accuracy: 0.6579
  • Precision: 0.6594
  • Recall: 0.6579
  • F1: 0.6534
  • Loss: 0.8215
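
The sketch below shows one way to load this checkpoint for inference, assuming it is a sequence-classification head on top of MeMo-BERT-03; the Danish example sentence and the label name in the output comment are hypothetical placeholders, not taken from the training run.

```python
# Hedged sketch: load the published checkpoint with the transformers pipeline.
# Assumes a text-classification (sequence classification) head; the labels are
# whatever the fine-tuning run defined, shown here only as placeholders.
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="yemen2016/memo3_indirect_speech",
)

# Hypothetical Danish example of indirect speech:
# "He said that he would come tomorrow."
print(classifier("Han sagde, at han ville komme i morgen."))
# e.g. [{'label': 'LABEL_1', 'score': 0.87}]
```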

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 8
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 20
  • mixed_precision_training: Native AMP
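
A minimal sketch of how these settings map onto transformers.TrainingArguments; the output_dir is hypothetical, and the AdamW betas and epsilon listed above are the adamw_torch defaults:

```python
from transformers import TrainingArguments

# Sketch only: reproduces the hyperparameters listed above.
training_args = TrainingArguments(
    output_dir="memo3_indirect_speech",  # hypothetical output path
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=2,       # effective train batch size: 4 * 2 = 8
    optim="adamw_torch",                 # AdamW, betas=(0.9, 0.999), epsilon=1e-08
    lr_scheduler_type="linear",
    num_train_epochs=20,
    fp16=True,                           # native AMP mixed-precision training
)
```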

Training results

| Training Loss | Epoch | Step | Accuracy | Precision | Recall | F1     | Validation Loss |
|:-------------:|:-----:|:----:|:--------:|:---------:|:------:|:------:|:---------------:|
| No log        | 1.0   | 13   | 0.5463   | 0.5179    | 0.5463 | 0.4339 | 1.0286          |
| No log        | 2.0   | 26   | 0.4100   | 0.1876    | 0.4100 | 0.2461 | 1.0251          |
| No log        | 3.0   | 39   | 0.5086   | 0.6698    | 0.5086 | 0.4356 | 0.8710          |
| No log        | 4.0   | 52   | 0.4802   | 0.6966    | 0.4802 | 0.3729 | 1.3227          |
| No log        | 5.0   | 65   | 0.4691   | 0.7161    | 0.4691 | 0.3548 | 1.0735          |
| No log        | 6.0   | 78   | 0.5927   | 0.6874    | 0.5927 | 0.5628 | 0.8102          |
| No log        | 7.0   | 91   | 0.5402   | 0.7032    | 0.5402 | 0.4823 | 1.3396          |
| No log        | 8.0   | 104  | 0.6661   | 0.6753    | 0.6661 | 0.6340 | 0.7542          |
| No log        | 9.0   | 117  | 0.6234   | 0.6935    | 0.6234 | 0.6047 | 0.8814          |
| No log        | 10.0  | 130  | 0.6633   | 0.6732    | 0.6633 | 0.6574 | 0.7494          |
| No log        | 11.0  | 143  | 0.6567   | 0.6597    | 0.6567 | 0.6520 | 0.7748          |
| No log        | 12.0  | 156  | 0.6606   | 0.6596    | 0.6606 | 0.6552 | 0.7600          |
| No log        | 13.0  | 169  | 0.6624   | 0.6744    | 0.6624 | 0.6567 | 0.7976          |
| No log        | 14.0  | 182  | 0.6667   | 0.6668    | 0.6667 | 0.6619 | 0.7685          |
| No log        | 15.0  | 195  | 0.6452   | 0.6778    | 0.6452 | 0.6361 | 0.8573          |
| No log        | 16.0  | 208  | 0.6536   | 0.6721    | 0.6536 | 0.6466 | 0.8498          |
| No log        | 17.0  | 221  | 0.6545   | 0.6625    | 0.6545 | 0.6501 | 0.8457          |
| No log        | 18.0  | 234  | 0.6570   | 0.6602    | 0.6570 | 0.6523 | 0.8187          |
| No log        | 18.48 | 240  | 0.6579   | 0.6594    | 0.6579 | 0.6534 | 0.8215          |
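
The metric columns above are consistent with a standard compute_metrics callback using weighted averaging (note that the recall column equals accuracy in every row, as weighted recall does); the following is a sketch under that assumption, not the training script's actual code:

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(eval_pred):
    # Convert logits to hard predictions, then score against the gold labels.
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="weighted"  # assumption: weighted averaging
    )
    return {
        "accuracy": accuracy_score(labels, preds),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```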

Framework versions

  • Transformers 4.48.2
  • Pytorch 2.5.1+cu124
  • Tokenizers 0.21.0