metadata
license: mit
tags:
- automatic-speech-recognition
- asr
- pytorch
- wav2vec2
- wolof
- wo
wav2vec2-xls-r-300m-wolof
Wolof is a language spoken in Senegal and neighbouring countries. It is under-represented, with few resources available for text and speech processing. This repository is our contribution toward closing that gap.
This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m, trained on the largest speech dataset available from the ALFFA project.
It achieves the following results on the evaluation set:
- Loss: 0.367826
- Wer: 0.212565
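For readers unfamiliar with the metric, here is a minimal sketch of how a word error rate (WER) is usually defined: the word-level edit distance between the reference transcript and the model output, divided by the number of reference words. This card does not state which tool produced the figure above, so the function name `word_error_rate` and the placeholder tokens are illustrative assumptions, not the evaluation code that was actually used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # consumed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Placeholder tokens (not real Wolof): one substitution and one deletion
# against a four-word reference.
print(word_error_rate("a b c d", "a x c"))
```

Under this definition, a WER of 0.212565 corresponds to roughly 21 word errors per 100 reference words. Evaluation tools typically sum the edits over all utterances before dividing, rather than averaging per-sentence scores.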
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
More information needed
Training results
Step | Training Loss | Validation Loss | Wer |
---|---|---|---|
1500 | 2.854200 | 0.642243 | 0.543964 |
3000 | 0.599200 | 0.468138 | 0.429549 |
4500 | 0.468300 | 0.433436 | 0.405644 |
6000 | 0.427000 | 0.384873 | 0.344150 |
7500 | 0.377000 | 0.374003 | 0.323892 |
9000 | 0.337000 | 0.363674 | 0.306189 |
10500 | 0.302400 | 0.349884 | 0.283908 |
12000 | 0.264100 | 0.344104 | 0.277120 |
13500 | 0.254000 | 0.341820 | 0.271316 |
15000 | 0.208400 | 0.326502 | 0.260695 |
16500 | 0.203500 | 0.326209 | 0.250313 |
18000 | 0.159800 | 0.323539 | 0.239851 |
19500 | 0.158200 | 0.310694 | 0.230028 |
21000 | 0.132800 | 0.338318 | 0.229283 |
22500 | 0.112800 | 0.336765 | 0.224145 |
24000 | 0.103600 | 0.350208 | 0.227073 |
25500 | 0.091400 | 0.353609 | 0.221589 |
27000 | 0.084400 | 0.367826 | 0.212565 |
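The table shows validation loss and WER diverging late in training. The short sketch below makes that explicit by finding the step that minimises each one. The `LOG` list is copied from the table; the names `LOG` and `best_by` are illustrative and not part of the original training code.

```python
# (step, training_loss, validation_loss, wer), copied from the table above
LOG = [
    (1500, 2.854200, 0.642243, 0.543964),
    (3000, 0.599200, 0.468138, 0.429549),
    (4500, 0.468300, 0.433436, 0.405644),
    (6000, 0.427000, 0.384873, 0.344150),
    (7500, 0.377000, 0.374003, 0.323892),
    (9000, 0.337000, 0.363674, 0.306189),
    (10500, 0.302400, 0.349884, 0.283908),
    (12000, 0.264100, 0.344104, 0.277120),
    (13500, 0.254000, 0.341820, 0.271316),
    (15000, 0.208400, 0.326502, 0.260695),
    (16500, 0.203500, 0.326209, 0.250313),
    (18000, 0.159800, 0.323539, 0.239851),
    (19500, 0.158200, 0.310694, 0.230028),
    (21000, 0.132800, 0.338318, 0.229283),
    (22500, 0.112800, 0.336765, 0.224145),
    (24000, 0.103600, 0.350208, 0.227073),
    (25500, 0.091400, 0.353609, 0.221589),
    (27000, 0.084400, 0.367826, 0.212565),
]


def best_by(rows, column):
    """Return the row with the smallest value at the given column index."""
    return min(rows, key=lambda row: row[column])


lowest_loss = best_by(LOG, 2)  # column 2 = validation loss
lowest_wer = best_by(LOG, 3)   # column 3 = WER

print(f"lowest validation loss: step {lowest_loss[0]} ({lowest_loss[2]:.6f})")
print(f"lowest WER:             step {lowest_wer[0]} ({lowest_wer[3]:.6f})")
```

Validation loss is lowest at step 19500 and rises afterwards, while WER keeps improving through step 27000. The evaluation results reported at the top of this card match the step 27000 row.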
Framework versions
- Transformers 4.11
- PyTorch 1.10.0
- Datasets 1.13