---
base_model: facebook/wav2vec2-large-xls-r-300m
datasets:
- common_voice
language:
- hi
- mr
library_name: transformers
license: mit
metrics:
- wer
- cer
tags:
- code-switching
- ASR
- multilingual
model-index:
- name: wav2vec2-large-xls-r-300m-hindi_marathi-code-switching-experiment
  results:
  - task:
      type: automatic-speech-recognition
    dataset:
      name: common_voice
      type: audio
    metrics:
    - type: wer
      value: 0.28
      name: Word Error Rate (WER)
    - type: cer
      value: 0.24
      name: Character Error Rate (CER)
    source:
      url: >-
        https://huggingface.co/Hemantrao/wav2vec2-large-xls-r-300m-hindi_marathi-code-switching-experimentx1/
      name: Internal Evaluation
---
# Enhanced Multilingual Code-Switched Speech Recognition for Low-Resource Languages Using Transformer-Based Models and Dynamic Switching Algorithms

## Model description
This model handles code-switched speech in Hindi and Marathi by fine-tuning the wav2vec2-large-xls-r-300m transformer model. It combines the fine-tuned acoustic model with reinforcement-learning techniques (Q-Learning, SARSA, and Deep Q-Networks) for determining optimal switch points in code-switched speech.
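A minimal inference sketch is shown below, assuming the checkpoint follows the standard wav2vec2 CTC layout (a `Wav2Vec2Processor` plus `Wav2Vec2ForCTC`). The repo id is taken from the evaluation URL above; the audio file name is a placeholder.

```python
# Minimal inference sketch (assumption: standard wav2vec2 CTC checkpoint layout).
import torch
import librosa
from transformers import Wav2Vec2Processor, Wav2Vec2ForCTC

model_id = "Hemantrao/wav2vec2-large-xls-r-300m-hindi_marathi-code-switching-experimentx1"
processor = Wav2Vec2Processor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)
model.eval()

# Load a Hindi-Marathi code-switched utterance at 16 kHz (the rate XLS-R expects).
speech, _ = librosa.load("sample_codeswitched.wav", sr=16_000)  # placeholder path
inputs = processor(speech, sampling_rate=16_000, return_tensors="pt", padding=True)

with torch.no_grad():
    logits = model(inputs.input_values).logits

# Greedy CTC decoding.
pred_ids = torch.argmax(logits, dim=-1)
transcription = processor.batch_decode(pred_ids)[0]
print(transcription)
```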
## Intended uses & limitations

### Intended uses
- Automatic speech recognition for multilingual environments involving Hindi and Marathi.
- Research in multilingual ASR and code-switching phenomena.
### Limitations
- The model may exhibit biases inherent in the training data.
- The model may struggle to accurately recognize heavily accented or dialectal speech not covered in the training dataset.
## Training params and experimental info

The model was fine-tuned using the following parameters; a configuration sketch follows the list.
- Attention Dropout: 0.1
- Hidden Dropout: 0.1
- Feature Projection Dropout: 0.1
- Layerdrop: 0.1
- Learning Rate: 3e-4
- Mask Time Probability: 0.05
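The sketch below shows how these hyperparameters map onto a `Wav2Vec2ForCTC` configuration and `TrainingArguments`. The batch size, epoch count, output directory, and tokenizer setup are assumptions and are not part of the card; treat this as an illustration rather than the original training script.

```python
# Hypothetical fine-tuning configuration reflecting the hyperparameters listed above.
from transformers import Wav2Vec2ForCTC, TrainingArguments

model = Wav2Vec2ForCTC.from_pretrained(
    "facebook/wav2vec2-large-xls-r-300m",
    attention_dropout=0.1,
    hidden_dropout=0.1,
    feat_proj_dropout=0.1,
    layerdrop=0.1,
    mask_time_prob=0.05,
    ctc_loss_reduction="mean",
    # vocab_size / pad_token_id must match the tokenizer built for the
    # combined Hindi-Marathi character set (assumption: not documented here).
)

training_args = TrainingArguments(
    output_dir="wav2vec2-xls-r-hindi-marathi-cs",  # placeholder
    learning_rate=3e-4,
    per_device_train_batch_size=16,  # assumption: batch size not listed in the card
    num_train_epochs=30,             # assumption: epoch count not listed in the card
    fp16=True,
    save_steps=400,
    logging_steps=100,
)
```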
## Training dataset
The model was trained on the Common Voice dataset, which includes diverse speech samples in both Hindi and Marathi. The dataset was augmented with synthetically generated code-switched speech to improve the model's robustness in handling code-switching scenarios.
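An illustrative data-loading sketch follows. The card only names "common_voice", so the specific Common Voice release (here `common_voice_11_0`) is an assumption, and the synthetic code-switching augmentation step is not shown.

```python
# Sketch: load and combine the Hindi and Marathi Common Voice training splits,
# then resample to 16 kHz for XLS-R. (Common Voice on the Hub requires
# accepting the dataset terms and authenticating.)
from datasets import load_dataset, concatenate_datasets, Audio

hi = load_dataset("mozilla-foundation/common_voice_11_0", "hi", split="train")
mr = load_dataset("mozilla-foundation/common_voice_11_0", "mr", split="train")

combined = concatenate_datasets([hi, mr]).cast_column("audio", Audio(sampling_rate=16_000))
print(combined)
```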
## Evaluation results
The model achieved the following performance metrics on the test set; a sketch for reproducing them follows the list.
- Word Error Rate (WER): 0.2800
- Character Error Rate (CER): 0.2400
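These metrics can be reproduced on a held-out split with the `evaluate` library, as sketched below; the reference and prediction lists are placeholders standing in for the actual test transcripts and model outputs.

```python
# Sketch: compute WER and CER from reference and predicted transcripts.
# (The CER metric additionally requires the `jiwer` package.)
import evaluate

wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

references = ["..."]   # placeholder: ground-truth transcripts
predictions = ["..."]  # placeholder: model transcriptions

print("WER:", wer_metric.compute(predictions=predictions, references=references))
print("CER:", cer_metric.compute(predictions=predictions, references=references))
```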