SpeechT5 TTS YOR_v2

This model is a fine-tuned version of microsoft/speecht5_tts on the lagyamfi/yor_bible dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4262

Model description

SpeechT5 TTS YOR_v2 is a Yoruba text-to-speech model (about 144M parameters) obtained by fine-tuning microsoft/speecht5_tts on the lagyamfi/yor_bible dataset. Like the base SpeechT5 TTS model, it takes text plus a speaker embedding as input and produces a log-mel spectrogram, which a vocoder such as microsoft/speecht5_hifigan converts to a waveform; a minimal inference sketch follows.
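
The snippet below is a minimal sketch, assuming the checkpoint is published as Lagyamfi/speecht5_tts_lagyamfi_yor_v2 and following the standard SpeechT5 TTS flow in transformers; the input text and the random speaker embedding are placeholders, not values from this card.

```python
import torch
import soundfile as sf
from transformers import SpeechT5Processor, SpeechT5ForTextToSpeech, SpeechT5HifiGan

model_id = "Lagyamfi/speecht5_tts_lagyamfi_yor_v2"
processor = SpeechT5Processor.from_pretrained(model_id)
model = SpeechT5ForTextToSpeech.from_pretrained(model_id)
vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")

inputs = processor(text="Yoruba text goes here", return_tensors="pt")

# SpeechT5 conditions on a 512-dim speaker embedding (an x-vector).
# A random vector is used here as a placeholder; for natural-sounding
# output, extract an x-vector from a reference recording instead.
speaker_embeddings = torch.randn(1, 512)

speech = model.generate_speech(inputs["input_ids"], speaker_embeddings, vocoder=vocoder)
sf.write("output.wav", speech.numpy(), samplerate=16000)  # SpeechT5 outputs 16 kHz audio
```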

Intended uses & limitations

The model is intended for Yoruba text-to-speech synthesis. Because the training data is drawn from a Bible corpus, outputs are likely biased toward that domain's vocabulary and register; performance on conversational or technical text, on named entities, and on code-switched input has not been evaluated here.

Training and evaluation data

The model was fine-tuned and evaluated on the lagyamfi/yor_bible dataset; the train/evaluation split used is not documented in this card. The dataset can presumably be loaded with the datasets library, as sketched below.
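
A minimal loading sketch, assuming the dataset is publicly available on the Hub (its configs, splits, and column names are not documented here):

```python
from datasets import load_dataset

# Split and config names are assumptions; inspect the returned object
# to see what the dataset actually provides.
dataset = load_dataset("lagyamfi/yor_bible")
print(dataset)
```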

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a matching configuration sketch follows the list):

  • learning_rate: 0.0002
  • train_batch_size: 6
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 20
  • total_train_batch_size: 480
  • total_eval_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 1000
  • num_epochs: 20
  • mixed_precision_training: Native AMP
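
The effective batch sizes follow from the per-device settings: 6 per device × 4 GPUs × 20 gradient-accumulation steps = 480 for training, and 8 × 4 GPUs = 32 for evaluation. Below is a sketch of how these values map onto transformers training arguments; the output_dir and anything not listed above are assumptions, and the multi-GPU launch (e.g. via accelerate or torchrun) is outside the snippet.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="speecht5_tts_lagyamfi_yor_v2",  # assumed name
    learning_rate=2e-4,
    per_device_train_batch_size=6,   # x 4 GPUs x 20 accumulation steps = 480 effective
    per_device_eval_batch_size=8,    # x 4 GPUs = 32 effective
    gradient_accumulation_steps=20,
    num_train_epochs=20,
    lr_scheduler_type="linear",
    warmup_steps=1000,
    seed=42,
    fp16=True,                       # Native AMP mixed precision
)
```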

Training results

Training Loss | Epoch   | Step | Validation Loss
------------- | ------- | ---- | ---------------
No log        | 0.9966  | 58   | 0.4736
No log        | 1.9931  | 116  | 0.4692
No log        | 2.9897  | 174  | 0.4660
No log        | 3.9863  | 232  | 0.4698
No log        | 5.0     | 291  | 0.4874
No log        | 5.9966  | 349  | 0.4755
No log        | 6.9931  | 407  | 0.4630
No log        | 7.9897  | 465  | 0.4591
No log        | 8.9863  | 523  | 0.4646
No log        | 10.0    | 582  | 0.4536
No log        | 10.9966 | 640  | 0.4585
No log        | 11.9931 | 698  | 0.4745
No log        | 12.9897 | 756  | 0.4645
No log        | 13.9863 | 814  | 0.4618
No log        | 15.0    | 873  | 0.4444
No log        | 15.9966 | 931  | 0.4517
No log        | 16.9931 | 989  | 0.4395
0.4803        | 17.9897 | 1047 | 0.4523
0.4803        | 18.9863 | 1105 | 0.4335
0.4803        | 19.9313 | 1160 | 0.4262

"No log" means no training-loss value had been recorded yet at that evaluation step; the Trainer reports training loss only at fixed logging intervals.

Framework versions

  • Transformers 4.40.1
  • PyTorch 2.2.1+cu121
  • Datasets 2.18.0
  • Tokenizers 0.19.1