---
license: cc-by-nc-4.0
base_model: facebook/mms-300m
tags:
- generated_from_trainer
datasets:
- audiofolder
model-index:
- name: wav2vec2-mms-300m-ikk-3
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# wav2vec2-mms-300m-ikk-3

This model is a fine-tuned version of [facebook/mms-300m](https://huggingface.co/facebook/mms-300m) on a local audio dataset loaded with the Hugging Face `audiofolder` loader.

It achieves the following results on the evaluation set:
- eval_loss: 1.3763
- eval_wer: 0.5580
- eval_runtime: 6.8853
- eval_samples_per_second: 14.233
- eval_steps_per_second: 1.888
- epoch: 19.59
- step: 480
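
For reference, `eval_wer` is the word error rate: the word-level edit distance between the predicted and reference transcripts, divided by the number of reference words. A minimal illustrative sketch of the computation (libraries such as `jiwer` compute the same quantity):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length.

    Assumes a non-empty reference transcript.
    """
    ref, hyp = reference.split(), hypothesis.split()
    # d[j] holds the edit distance between ref[:i] and hyp[:j],
    # updated row by row (classic dynamic programming).
    d = list(range(len(hyp) + 1))
    for i in range(1, len(ref) + 1):
        prev, d[0] = d[0], i
        for j in range(1, len(hyp) + 1):
            cur = d[j]
            substitution = prev + (ref[i - 1] != hyp[j - 1])
            d[j] = min(substitution, d[j] + 1, d[j - 1] + 1)
            prev = cur
    return d[-1] / len(ref)

# e.g. wer("a b c", "a x c") → 1/3 (one substitution out of three words)
```

So the 0.5580 above means roughly 56 word-level errors (substitutions, insertions, or deletions) per 100 reference words.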

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

Intermediate results logged during training (every 40 optimizer steps):

| Step | Training Loss | Validation Loss | Wer    |
|:----:|:-------------:|:---------------:|:------:|
| 40   | 9.6482        | 4.7192          | 1.0000 |
| 80   | 3.9534        | 3.4779          | 1.0000 |
| 120  | 3.2897        | 3.0996          | 1.0000 |
| 160  | 3.0384        | 2.9936          | 1.0000 |
| 200  | 2.9945        | 2.9796          | 1.0000 |
| 240  | 2.9590        | 2.9420          | 1.0000 |
| 280  | 2.8021        | 2.5201          | 1.0000 |
| 320  | 1.8621        | 1.4997          | 0.7464 |
| 360  | 1.1918        | 1.3363          | 0.6103 |
| 400  | 0.9513        | 1.3171          | 0.5989 |
| 440  | 0.7739        | 1.3129          | 0.6147 |
| 480  | 0.6247        | 1.3763          | 0.5580 |

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0003
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 30
- mixed_precision_training: Native AMP
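
The derived values above follow from simple arithmetic; the sketch below (plain Python, no training code) shows how the effective batch size and the logged step/epoch numbers relate. The dataset-size figure is only an estimate inferred from the log:

```python
# How the hyperparameters above fit together (illustrative arithmetic only).
train_batch_size = 8
gradient_accumulation_steps = 2

# Effective (total) train batch size, as reported above.
total_train_batch_size = train_batch_size * gradient_accumulation_steps  # 16

# The evaluation block above reports step 480 at epoch 19.59, so roughly:
steps_per_epoch = 480 / 19.59                                    # ~24.5 optimizer steps/epoch
approx_train_samples = steps_per_epoch * total_train_batch_size  # ~392 training samples

# Over num_epochs=30 that would be ~735 optimizer steps in total, so the
# checkpoint logged at step 480 was still inside the 500-step linear warmup
# (the learning rate had not yet reached its 3e-4 peak).
total_steps = 30 * steps_per_epoch                               # ~735
```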

### Framework versions

- Transformers 4.39.3
- Pytorch 2.2.2+cu121
- Datasets 2.18.0
- Tokenizers 0.15.2