metadata
tags:
  - automatic-speech-recognition
  - gary109/AI_Light_Dance
  - generated_from_trainer
datasets:
  - ai_light_dance
model-index:
  - name: ai-light-dance_drums_ft_pretrain_wav2vec2-base-new-13k_onset-drums_fold_1
    results: []

ai-light-dance_drums_ft_pretrain_wav2vec2-base-new-13k_onset-drums_fold_1

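This model is a fine-tuned version of gary109/ai-light-dance_drums_pretrain_wav2vec2-base-new-13k on the GARY109/AI_LIGHT_DANCE - ONSET-DRUMS_FOLD_1 dataset. It achieves the following results on the evaluation set; they match the checkpoint at step 1173 (epoch 16.99), which has the lowest validation loss in the training results: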

  • Loss: 0.4630
  • Wer: 0.2145
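
The card does not say how Wer was computed. As a rough illustration, the usual definition of word error rate is the word-level edit distance between reference and prediction, divided by the number of reference tokens. The sketch below assumes that definition; the function name `word_error_rate` and the token strings in the usage note are hypothetical and not taken from the dataset.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: whitespace-separated tokens; the actual tokenization used
    for this model's evaluation is not documented in the card.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one token")

    # prev[j] = edit distance between the reference prefix processed so far
    # and the first j hypothesis tokens.
    prev = list(range(len(hyp) + 1))
    for i, ref_tok in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_tok in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_tok == hyp_tok else 1)
            deletion = prev[j] + 1      # reference token missing from hypothesis
            insertion = curr[j - 1] + 1  # extra token in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

For example, with hypothetical tokens, `word_error_rate("kick snare hat kick", "kick snare tom kick")` has one substitution out of four reference tokens, and an empty prediction scores 1.0 against any non-empty reference.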

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 50.0
  • mixed_precision_training: Native AMP
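
The batch-size and scheduler entries above can be checked with a little arithmetic. The sketch below is a plain-Python approximation of a linear schedule with warmup, not the trainer's own code; it assumes a single device (consistent with 4 × 4 = 16) and takes 3450, the last step logged in the training results, as the total number of optimizer steps. The helper names are hypothetical.

```python
LEARNING_RATE = 3e-4      # learning_rate
TRAIN_BATCH_SIZE = 4      # train_batch_size
GRAD_ACCUM_STEPS = 4      # gradient_accumulation_steps
WARMUP_STEPS = 100        # lr_scheduler_warmup_steps
TOTAL_STEPS = 3450        # assumption: final step in the training results


def effective_batch_size(per_device: int, accumulation: int, num_devices: int = 1) -> int:
    """Samples contributing to one optimizer update."""
    return per_device * accumulation * num_devices


def linear_schedule_lr(
    step: int,
    base_lr: float = LEARNING_RATE,
    warmup: int = WARMUP_STEPS,
    total: int = TOTAL_STEPS,
) -> float:
    """Linear ramp from 0 to base_lr over `warmup` steps, then linear decay to 0."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    for step in (0, 50, 100, 1775, 3450):
        print(step, f"{linear_schedule_lr(step):.6f}")
```

Under these assumptions the learning rate peaks at 0.0003 at step 100 and falls back to zero at step 3450.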

Training results

| Training Loss | Epoch | Step | Validation Loss | Wer    |
|:-------------:|:-----:|:----:|:---------------:|:------:|
| 3.0614        | 0.99  | 69   | 5.1275          | 1.0    |
| 1.8291        | 1.99  | 138  | 2.2008          | 1.0    |
| 1.4664        | 2.99  | 207  | 1.6821          | 1.0    |
| 1.287         | 3.99  | 276  | 1.5681          | 1.0    |
| 1.2642        | 4.99  | 345  | 1.5074          | 1.0    |
| 1.2702        | 5.99  | 414  | 1.4650          | 1.0    |
| 1.2245        | 6.99  | 483  | 1.3027          | 1.0    |
| 1.3461        | 7.99  | 552  | 1.3109          | 1.0    |
| 1.2903        | 8.99  | 621  | 1.3107          | 1.0    |
| 1.2741        | 9.99  | 690  | 1.1842          | 1.0    |
| 1.1446        | 10.99 | 759  | 1.1754          | 1.0    |
| 1.0746        | 11.99 | 828  | 1.1469          | 0.9999 |
| 0.8203        | 12.99 | 897  | 0.9071          | 0.6202 |
| 0.5996        | 13.99 | 966  | 0.7047          | 0.4234 |
| 0.5672        | 14.99 | 1035 | 0.5369          | 0.2567 |
| 0.4965        | 15.99 | 1104 | 0.4644          | 0.2861 |
| 0.5639        | 16.99 | 1173 | 0.4630          | 0.2145 |
| 0.6272        | 17.99 | 1242 | 0.6848          | 0.2667 |
| 0.6764        | 18.99 | 1311 | 0.6074          | 0.2508 |
| 0.7205        | 19.99 | 1380 | 0.6452          | 0.2184 |
| 0.346         | 20.99 | 1449 | 0.5962          | 0.2457 |
| 0.2212        | 21.99 | 1518 | 0.5236          | 0.2068 |
| 0.1646        | 22.99 | 1587 | 0.6130          | 0.2198 |
| 0.3148        | 23.99 | 1656 | 0.5592          | 0.2620 |
| 0.3061        | 24.99 | 1725 | 0.5577          | 0.2560 |
| 0.3137        | 25.99 | 1794 | 0.5247          | 0.2227 |
| 0.389         | 26.99 | 1863 | 0.5799          | 0.2081 |
| 0.4168        | 27.99 | 1932 | 0.5850          | 0.1818 |
| 0.4403        | 28.99 | 2001 | 0.5687          | 0.2053 |
| 0.4936        | 29.99 | 2070 | 0.5511          | 0.2065 |
| 0.2196        | 30.99 | 2139 | 0.5438          | 0.1706 |
| 0.1683        | 31.99 | 2208 | 0.6066          | 0.1855 |
| 0.1552        | 32.99 | 2277 | 0.5248          | 0.1930 |
| 0.1682        | 33.99 | 2346 | 0.5440          | 0.1783 |
| 0.2162        | 34.99 | 2415 | 0.6079          | 0.1778 |
| 0.3041        | 35.99 | 2484 | 0.5608          | 0.1834 |
| 0.3188        | 36.99 | 2553 | 0.6039          | 0.2007 |
| 0.3692        | 37.99 | 2622 | 0.5437          | 0.1769 |
| 0.4446        | 38.99 | 2691 | 0.6475          | 0.1881 |
| 0.386         | 39.99 | 2760 | 0.6468          | 0.1894 |
| 0.1995        | 40.99 | 2829 | 0.6398          | 0.1906 |
| 0.1174        | 41.99 | 2898 | 0.5987          | 0.1936 |
| 0.1288        | 42.99 | 2967 | 0.6133          | 0.1871 |
| 0.1857        | 43.99 | 3036 | 0.6976          | 0.1995 |
| 0.2025        | 44.99 | 3105 | 0.6356          | 0.1902 |
| 0.2922        | 45.99 | 3174 | 0.6324          | 0.2055 |
| 0.3575        | 46.99 | 3243 | 0.6338          | 0.1862 |
| 0.4019        | 47.99 | 3312 | 0.6113          | 0.1898 |
| 0.4211        | 48.99 | 3381 | 0.6320          | 0.1948 |
| 0.4323        | 49.99 | 3450 | 0.6307          | 0.1917 |
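
The training log shows that the lowest validation loss and the lowest Wer fall on different steps. The sketch below picks the best row under each criterion from a hand-copied excerpt of the log; the `EvalRow` type and variable names are hypothetical, and the card does not state which criterion was actually used to select the published checkpoint.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    epoch: float
    step: int
    val_loss: float
    wer: float


# Excerpt of the training results, copied from the log (not the full 50 rows).
ROWS = [
    EvalRow(15.99, 1104, 0.4644, 0.2861),
    EvalRow(16.99, 1173, 0.4630, 0.2145),
    EvalRow(27.99, 1932, 0.5850, 0.1818),
    EvalRow(30.99, 2139, 0.5438, 0.1706),
    EvalRow(37.99, 2622, 0.5437, 0.1769),
    EvalRow(49.99, 3450, 0.6307, 0.1917),
]

# Rank the same rows under two different selection criteria.
best_by_loss = min(ROWS, key=lambda row: row.val_loss)
best_by_wer = min(ROWS, key=lambda row: row.wer)

if __name__ == "__main__":
    print("lowest validation loss:", best_by_loss)
    print("lowest Wer:", best_by_wer)
```

In this excerpt, the row with the lowest validation loss (step 1173) carries the loss and Wer reported at the top of the card, while the lowest Wer (0.1706) appears later, at step 2139.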

Framework versions

  • Transformers 4.24.0.dev0
  • PyTorch 1.12.1+cu113
  • Datasets 2.6.1
  • Tokenizers 0.13.1