---
language: "ar"
pipeline_tag: automatic-speech-recognition
tags:
- CTC
- Attention
- pytorch
- Transformer
license: "cc-by-nc-4.0"
datasets:
- MGB-3
- egyptian-arabic-conversational-speech-corpus
metrics:
- wer
model-index:
- name: omarxadel/hubert-large-arabic-egyptian
  results:
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    metrics:
    - name: Test WER
      type: wer
      value: 29.3755
    - name: Validation WER
      type: wer
      value: 29.1828
---

# Wav2Vec2-XLSR-53 with CTC, fine-tuned on MGB-3 and the Egyptian Arabic Conversational Speech Corpus (No LM)

This model is a fine-tuned version of [Wav2Vec2-XLSR-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53). We fine-tuned it on the MGB-3 and Egyptian Arabic Conversational Speech Corpus datasets, achieving a test WER of `29.3755%`.
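
Since the model is trained with CTC and used without a language model, transcripts can be read off the per-frame predictions directly. The sketch below shows greedy CTC decoding (collapse repeated tokens, then drop blanks), which is a common choice in that setting. It is an illustration only: the vocabulary, the blank index, and the function name are hypothetical, and it is not necessarily the decoder used to produce the reported scores.

```python
from itertools import groupby
from typing import List, Sequence


def ctc_greedy_decode(frame_ids: Sequence[int],
                      id_to_char: List[str],
                      blank_id: int = 0) -> str:
    """Greedy CTC decoding of per-frame argmax token ids.

    Assumption: `blank_id` is the index of the CTC blank token in
    `id_to_char` (0 here, purely for illustration).
    """
    # Step 1: merge consecutive duplicates (one token may span many frames).
    collapsed = [token for token, _ in groupby(frame_ids)]
    # Step 2: remove blanks, which only separate genuine repeats.
    return "".join(id_to_char[t] for t in collapsed if t != blank_id)


# Hypothetical toy vocabulary: index 0 is the blank.
VOCAB = ["<blank>", "a", "b"]
print(ctc_greedy_decode([1, 1, 0, 1, 2, 2, 0], VOCAB))  # aab
```

The blank between the two runs of `1` is what lets the decoder keep a doubled character instead of merging it.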

The model's performance on the validation and test sets is as follows:

| Validation WER (%) | Test WER (%) |
|:------------------:|:------------:|
| 29.1828 | 29.3755 |
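
For readers unfamiliar with the metric, WER (word error rate) is the word-level edit distance between the reference and the hypothesis, divided by the number of reference words. The following is a minimal sketch of that definition with a hypothetical function name; it is not the evaluation script used for this model, and it applies no text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: word-level edit distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# One substitution and one deletion against four reference words.
print(word_error_rate("the cat sat down", "the cat sit"))  # 50.0
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.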

# Acknowledgement

Model fine-tuning and data processing for this work were performed as part of a graduation project at the Faculty of Engineering, Alexandria University (CCE Program).