metadata
base_model: microsoft/wavlm-base
tags:
  - generated_from_trainer
task: audio-classification
model-index:
  - name: wavlm_finetuned_emodb
    results: []

wavlm_finetuned_emodb

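This model is a fine-tuned version of microsoft/wavlm-base. The training dataset was not recorded in the auto-generated card (the Trainer lists it as None); the model name suggests EmoDB. It achieves the following results on the evaluation set: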

  • Loss: 0.9254
  • Uar (unweighted average recall): 0.8148
  • Acc (accuracy): 0.8529
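
The two metrics above can disagree when classes are imbalanced. The sketch below assumes that Uar denotes unweighted average recall (the mean of per-class recalls, the usual meaning in speech emotion recognition); the card itself does not define it. The labels are toy data for illustration, not samples from the evaluation set.

```python
from collections import defaultdict


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls; every class counts equally, whatever its size."""
    total = defaultdict(int)  # samples per true class
    hit = defaultdict(int)    # correctly predicted samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            hit[t] += 1
    recalls = [hit[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Toy example (not real evaluation data): six "anger" clips, two "sadness" clips.
y_true = ["anger"] * 6 + ["sadness"] * 2
y_pred = ["anger"] * 6 + ["anger", "sadness"]

acc = accuracy(y_true, y_pred)                   # 7 of 8 correct
uar = unweighted_average_recall(y_true, y_pred)  # mean of recall 1.0 and 0.5
```

Here the minority class is recalled only half the time, so the unweighted average recall is lower than the accuracy.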

Model description

This model classifies a given audio waveform into one of four common emotion categories: anger, happiness, sadness, and neutral.
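
A minimal inference sketch, assuming the checkpoint is loaded through the Transformers `audio-classification` pipeline. The repository ID `your-namespace/wavlm_finetuned_emodb` is a placeholder, not a confirmed path, and the helper names `top_emotion` and `classify_file` are hypothetical. The label strings returned depend on the `id2label` mapping stored in the checkpoint's config, which this card does not list.

```python
def top_emotion(predictions):
    """Return (label, score) of the highest-scoring entry.

    `predictions` is a list of {"label": str, "score": float} dicts,
    the shape returned by the audio-classification pipeline.
    """
    best = max(predictions, key=lambda p: p["score"])
    return best["label"], best["score"]


def classify_file(audio_path, model_id="your-namespace/wavlm_finetuned_emodb"):
    """Classify one audio file. `model_id` is a placeholder repository ID."""
    # Imported lazily so the helper above can be used without Transformers installed.
    from transformers import pipeline

    classifier = pipeline("audio-classification", model=model_id)
    return top_emotion(classifier(audio_path))


# Illustrative scores only, not real model output.
example = [
    {"label": "anger", "score": 0.10},
    {"label": "sadness", "score": 0.70},
    {"label": "neutral", "score": 0.20},
]
label, score = top_emotion(example)
```

Calling `classify_file("clip.wav")` downloads the model on first use. Decoding a file path typically also requires ffmpeg to be available.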

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10
  • mixed_precision_training: Native AMP
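
Two of the values above follow from the others. The sketch below shows the arithmetic, assuming a single device, no warmup (no warmup setting is listed), and 60 scheduler steps (the last step logged in the training results). The function names are illustrative and are not part of the training script.

```python
LEARNING_RATE = 1e-4
TRAIN_BATCH_SIZE = 8
GRAD_ACCUM_STEPS = 4
TOTAL_STEPS = 60   # assumption: last step logged in the training results
WARMUP_STEPS = 0   # assumption: no warmup setting is listed


def effective_batch_size(per_device, accum_steps, num_devices=1):
    """Samples contributing to each optimizer update."""
    return per_device * accum_steps * num_devices


def linear_lr(step, base_lr=LEARNING_RATE, total_steps=TOTAL_STEPS,
              warmup_steps=WARMUP_STEPS):
    """Learning rate under a linear schedule: ramp up during warmup, then decay to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


total_batch = effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS)  # 8 * 4
lr_start = linear_lr(0)
lr_mid = linear_lr(30)
lr_end = linear_lr(60)
```

With these assumptions the effective batch size matches the listed total_train_batch_size of 32, and the learning rate falls from 0.0001 to 0 over the run.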

Training results

| Training Loss | Epoch | Step | Validation Loss | Uar | Acc |
|---|---|---|---|---|---|
| 1.3857 | 0.1538 | 1 | 1.3786 | 0.25 | 0.1985 |
| 1.3322 | 0.3077 | 2 | 1.3549 | 0.2914 | 0.2426 |
| 1.3112 | 0.4615 | 3 | 1.3165 | 0.5375 | 0.6103 |
| 1.2981 | 0.6154 | 4 | 1.2905 | 0.5 | 0.6029 |
| 1.1317 | 0.7692 | 5 | 1.2923 | 0.4907 | 0.5956 |
| 1.2078 | 0.9231 | 6 | 1.2619 | 0.5556 | 0.6471 |
| 0.9237 | 1.0769 | 7 | 1.2254 | 0.5741 | 0.6618 |
| 0.8396 | 1.2308 | 8 | 1.2247 | 0.5556 | 0.6471 |
| 1.0354 | 1.3846 | 9 | 1.2076 | 0.5556 | 0.6471 |
| 0.9205 | 1.5385 | 10 | 1.1891 | 0.5833 | 0.6691 |
| 0.9071 | 1.6923 | 11 | 1.1704 | 0.6481 | 0.7206 |
| 0.8132 | 1.8462 | 12 | 1.1988 | 0.6939 | 0.5735 |
| 0.8994 | 2.0 | 13 | 1.1960 | 0.6574 | 0.5221 |
| 0.7924 | 2.1538 | 14 | 1.1579 | 0.6658 | 0.5662 |
| 0.7386 | 2.3077 | 15 | 1.1401 | 0.6944 | 0.7574 |
| 0.6324 | 2.4615 | 16 | 1.1202 | 0.6111 | 0.6912 |
| 0.7282 | 2.6154 | 17 | 1.1090 | 0.5833 | 0.6691 |
| 0.673 | 2.7692 | 18 | 1.0907 | 0.6111 | 0.6912 |
| 0.623 | 2.9231 | 19 | 1.0578 | 0.7872 | 0.8235 |
| 0.4954 | 3.0769 | 20 | 1.0357 | 0.8475 | 0.8676 |
| 0.5201 | 3.2308 | 21 | 1.0365 | 0.7778 | 0.8235 |
| 0.5608 | 3.3846 | 22 | 1.0346 | 0.75 | 0.8015 |
| 0.6334 | 3.5385 | 23 | 1.0047 | 0.7685 | 0.8162 |
| 0.3737 | 3.6923 | 24 | 0.9585 | 0.8658 | 0.8897 |
| 0.5369 | 3.8462 | 25 | 0.9527 | 0.9178 | 0.8824 |
| 0.3599 | 4.0 | 26 | 0.9682 | 0.8906 | 0.8382 |
| 0.7642 | 4.1538 | 27 | 0.9418 | 0.8951 | 0.8456 |
| 0.4882 | 4.3077 | 28 | 0.9095 | 0.9310 | 0.9265 |
| 0.5011 | 4.4615 | 29 | 0.9378 | 0.8426 | 0.875 |
| 0.3707 | 4.6154 | 30 | 0.9630 | 0.7963 | 0.8382 |
| 0.381 | 4.7692 | 31 | 0.9721 | 0.7870 | 0.8309 |
| 0.2307 | 4.9231 | 32 | 0.9522 | 0.7963 | 0.8382 |
| 0.2829 | 5.0769 | 33 | 0.9598 | 0.7870 | 0.8309 |
| 0.2581 | 5.2308 | 34 | 0.9458 | 0.8056 | 0.8456 |
| 0.4658 | 5.3846 | 35 | 0.9442 | 0.8148 | 0.8529 |
| 0.2133 | 5.5385 | 36 | 0.9524 | 0.7870 | 0.8309 |
| 0.1107 | 5.6923 | 37 | 0.9601 | 0.7870 | 0.8309 |
| 0.3599 | 5.8462 | 38 | 0.9605 | 0.7778 | 0.8235 |
| 0.3085 | 6.0 | 39 | 0.9522 | 0.7918 | 0.8309 |
| 0.2739 | 6.1538 | 40 | 0.9564 | 0.7870 | 0.8309 |
| 0.3279 | 6.3077 | 41 | 0.9582 | 0.7870 | 0.8309 |
| 0.1346 | 6.4615 | 42 | 0.9646 | 0.7685 | 0.8162 |
| 0.1429 | 6.6154 | 43 | 0.9695 | 0.7685 | 0.8162 |
| 0.1 | 6.7692 | 44 | 0.9692 | 0.7685 | 0.8162 |
| 0.1852 | 6.9231 | 45 | 0.9651 | 0.7685 | 0.8162 |
| 0.1028 | 7.0769 | 46 | 0.9378 | 0.8056 | 0.8456 |
| 0.2071 | 7.2308 | 47 | 0.9154 | 0.8195 | 0.8529 |
| 0.1752 | 7.3846 | 48 | 0.8882 | 0.8566 | 0.8824 |
| 0.0907 | 7.5385 | 49 | 0.8704 | 0.8843 | 0.9044 |
| 0.1263 | 7.6923 | 50 | 0.8719 | 0.8798 | 0.8971 |
| 0.068 | 7.8462 | 51 | 0.8738 | 0.8798 | 0.8971 |
| 0.0589 | 8.0 | 52 | 0.8881 | 0.8566 | 0.8824 |
| 0.1494 | 8.1538 | 53 | 0.9001 | 0.8473 | 0.875 |
| 0.1137 | 8.3077 | 54 | 0.9120 | 0.8288 | 0.8603 |
| 0.0522 | 8.4615 | 55 | 0.9212 | 0.8148 | 0.8529 |
| 0.0666 | 8.6154 | 56 | 0.9251 | 0.8148 | 0.8529 |
| 0.0867 | 8.7692 | 57 | 0.9270 | 0.8148 | 0.8529 |
| 0.0764 | 8.9231 | 58 | 0.9264 | 0.8148 | 0.8529 |
| 0.0526 | 9.0769 | 59 | 0.9259 | 0.8148 | 0.8529 |
| 0.2877 | 9.2308 | 60 | 0.9254 | 0.8148 | 0.8529 |
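
The evaluation metrics in this log do not peak at the final step. The sketch below shows one way to pick a step by metric from such a log, using four rows copied from the results above. Whether intermediate checkpoints were saved is not stated in this card, and the helper `best_row` is hypothetical.

```python
# (step, validation_loss, uar, acc): four rows copied from the training results.
rows = [
    (20, 1.0357, 0.8475, 0.8676),
    (28, 0.9095, 0.9310, 0.9265),
    (49, 0.8704, 0.8843, 0.9044),
    (60, 0.9254, 0.8148, 0.8529),
]


def best_row(rows, column, higher_is_better):
    """Return the row with the best value in `column` (index into the tuple)."""
    pick = max if higher_is_better else min
    return pick(rows, key=lambda r: r[column])


best_by_uar = best_row(rows, column=2, higher_is_better=True)
best_by_loss = best_row(rows, column=1, higher_is_better=False)

print("best Uar at step", best_by_uar[0])
print("lowest validation loss at step", best_by_loss[0])
```

Among these rows, Uar is highest at step 28 and validation loss is lowest at step 49, while the headline results correspond to step 60.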

Framework versions

  • Transformers 4.40.1
  • PyTorch 2.3.0+cu121
  • Datasets 2.19.0
  • Tokenizers 0.19.1