metadata

license: mit
tags:
  - automatic-speech-recognition
  - asr
  - pytorch
  - wav2vec2
  - wolof
  - wo

wav2vec2-xls-r-300m-wolof

Wolof is a language spoken in Senegal and neighbouring countries. It is under-represented, and few resources exist for it in the field of speech and text processing. This repository is our contribution toward closing that gap.

This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m, trained on the largest speech dataset available from the ALFFA project.
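The snippet below is a minimal sketch of how a fine-tuned wav2vec2 checkpoint like this one could be used for transcription. Several things are assumptions rather than facts from this card: the Hub path in `MODEL_ID` is a placeholder (the namespace is hypothetical), the repository is assumed to ship a processor (feature extractor plus tokenizer), the model is assumed to have been fine-tuned with a CTC head, and `torchaudio` is assumed to be installed for audio loading.

```python
# Hypothetical Hub path: replace "your-namespace" with the real owner of this repo.
MODEL_ID = "your-namespace/wav2vec2-xls-r-300m-wolof"
TARGET_SAMPLE_RATE = 16_000  # XLS-R checkpoints are pretrained on 16 kHz audio


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Return a greedy (argmax) CTC transcription of a single audio file."""
    # Heavy dependencies are imported here so that merely defining the
    # function does not require them to be installed.
    import torch
    import torchaudio
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    # Assumption: the repo contains both the model weights and a processor.
    processor = Wav2Vec2Processor.from_pretrained(model_id)
    model = Wav2Vec2ForCTC.from_pretrained(model_id)
    model.eval()

    # torchaudio returns a tensor of shape (channels, samples).
    waveform, sample_rate = torchaudio.load(audio_path)
    waveform = waveform.mean(dim=0)  # downmix to mono
    if sample_rate != TARGET_SAMPLE_RATE:
        waveform = torchaudio.functional.resample(
            waveform, sample_rate, TARGET_SAMPLE_RATE
        )

    inputs = processor(
        waveform.numpy(), sampling_rate=TARGET_SAMPLE_RATE, return_tensors="pt"
    )
    with torch.no_grad():
        logits = model(**inputs).logits

    # Greedy decoding: pick the most likely token per frame, then let the
    # tokenizer collapse repeats and remove CTC blanks.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]


if __name__ == "__main__":
    # "sample.wav" is a placeholder file name.
    print(transcribe("sample.wav"))
```

Greedy decoding is used here only because it needs no extra components; nothing in this card indicates whether a language model was used during evaluation.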

It achieves the following results on the evaluation set:

Loss: 0.367826
Wer: 0.212565
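
For readers unfamiliar with the metric, word error rate (WER) is conventionally the number of word substitutions, deletions, and insertions needed to turn the model output into the reference transcript, divided by the number of reference words. A WER of 0.212565 therefore means roughly 21 word errors per 100 reference words. The card does not state which tool produced the figure, so the function below is only an illustrative sketch of the standard definition, with made-up placeholder tokens instead of real Wolof text.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,                      # deletion
                    curr[j - 1] + 1,                  # insertion
                    prev[j - 1] + substitution_cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1] / len(ref_words)


# Placeholder tokens: one substitution (b -> x) and one deletion (e)
# over five reference words.
print(word_error_rate("a b c d e", "a x c d"))  # 0.4
```

Over a whole evaluation set, the usual practice is to sum the errors across all utterances and divide by the total number of reference words, rather than averaging per-utterance rates.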

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

More information needed

Training results

Step	Training Loss	Validation Loss	Wer
1500	2.854200	0.642243	0.543964
3000	0.599200	0.468138	0.429549
4500	0.468300	0.433436	0.405644
6000	0.427000	0.384873	0.344150
7500	0.377000	0.374003	0.323892
9000	0.337000	0.363674	0.306189
10500	0.302400	0.349884	0.283908
12000	0.264100	0.344104	0.277120
13500	0.254000	0.341820	0.271316
15000	0.208400	0.326502	0.260695
16500	0.203500	0.326209	0.250313
18000	0.159800	0.323539	0.239851
19500	0.158200	0.310694	0.230028
21000	0.132800	0.338318	0.229283
22500	0.112800	0.336765	0.224145
24000	0.103600	0.350208	0.227073
25500	0.091400	0.353609	0.221589
27000	0.084400	0.367826	0.212565

Framework versions

Transformers 4.11
PyTorch 1.10.0
Datasets 1.13