---
language:
- hi
license: apache-2.0
base_model: openai/whisper-medium
tags:
- whisper-event
- generated_from_trainer
datasets:
- mozilla-foundation/common_voice_11_0
metrics:
- wer
model-index:
- name: Whisper Medium finetuned Hindi
  results:
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: common_voice_11_0
      type: mozilla-foundation/common_voice_11_0
      config: hi
      split: test
      args: hi
    metrics:
    - name: Wer
      type: wer
      value: 99.8077099166743
---

# iVaani - Fine-tuned ASR Model for the Hindi Language

## Model Description

iVaani is a fine-tuned version of [openai/whisper-medium](https://huggingface.co/openai/whisper-medium), optimized for the Hindi language. Fine-tuning improved transcription accuracy by 2.5% compared to the original Whisper model.

## Performance

After fine-tuning, the model shows a 2.5% increase in transcription accuracy on Hindi-language audio compared to the base Whisper medium model.

## How to Use

You can use this model directly with the 🤗 Transformers library. Because iVaani is a fine-tuned Whisper checkpoint (a sequence-to-sequence model), it is loaded with the Whisper classes rather than the CTC classes. The snippet below assumes `librosa` is installed for audio loading:

```python
import librosa
from transformers import WhisperForConditionalGeneration, WhisperProcessor

model = WhisperForConditionalGeneration.from_pretrained("rukaiyah-indika-ai/iVaani")
processor = WhisperProcessor.from_pretrained("rukaiyah-indika-ai/iVaani")

# Replace 'path_to_audio_file' with the path to your Hindi audio file.
# Whisper expects 16 kHz audio, so resample on load.
audio, sampling_rate = librosa.load("path_to_audio_file", sr=16000)

# Convert the waveform into log-Mel spectrogram features.
input_features = processor(audio, sampling_rate=sampling_rate, return_tensors="pt").input_features

# Generate token ids and decode them into text.
predicted_ids = model.generate(input_features, language="hi", task="transcribe")
transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]
print("Transcription:", transcription)
```

## Additional Language Models

Indika AI has also fine-tuned ASR (Automatic Speech Recognition) models for several other Indic languages, improving accuracy by 2-5% for each language and significantly reducing the word error rate. The table below lists the additional languages and their accuracy before fine-tuning:

| Language  | Accuracy (before fine-tuning) |
|-----------|-------------------------------|
| Bengali   | 88%                           |
| Telugu    | 86%                           |
| Marathi   | 87%                           |
| Tamil     | 88%                           |
| Gujarati  | 90%                           |
| Kannada   | 86.5%                         |
| Malayalam | 87.5%                         |
| Punjabi   | 89%                           |
| Odia      | 88.5%                         |

### BibTeX entry and citation info

If you use this model in your research, please cite it as follows:

```bibtex
@misc{whisper-medium-hindi-fine-tuned,
  author    = {Indika AI},
  title     = {iVaani},
  year      = {2024},
  publisher = {Hugging Face},
  journal   = {Hugging Face Model Hub}
}
```

### Training hyperparameters

The following hyperparameters were used during training; a sketch mapping them onto `Seq2SeqTrainingArguments` follows at the end of this card:
- learning_rate: 1e-05
- train_batch_size: 2
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 4
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- training_steps: 1000
- mixed_precision_training: Native AMP

### Framework versions

- Transformers 4.35.2
- Pytorch 2.1.0+cu121
- Datasets 2.16.0
- Tokenizers 0.15.0
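
For reference, the hyperparameters above map roughly onto the 🤗 Transformers `Seq2SeqTrainingArguments` API as in the following sketch. This is an illustrative reconstruction rather than the exact training script: the output directory is a placeholder, and any option not listed above is left at its default (the reported Adam betas and epsilon are the `Trainer` defaults).

```python
from transformers import Seq2SeqTrainingArguments

# Illustrative reconstruction of the hyperparameters reported above.
# "whisper-medium-hi" is a placeholder output directory, not the real path.
training_args = Seq2SeqTrainingArguments(
    output_dir="whisper-medium-hi",
    learning_rate=1e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=2,  # effective train batch size: 2 * 2 = 4
    lr_scheduler_type="linear",
    warmup_steps=100,
    max_steps=1000,
    fp16=True,  # mixed_precision_training: Native AMP
)
```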
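
The Wer value in the metadata refers to the word error rate on the Common Voice 11.0 Hindi test split. As a minimal sketch of how such a figure can be computed, assuming the 🤗 `evaluate` library (the predictions and references below are placeholders, not real model output):

```python
import evaluate

# Load the word error rate metric (backed by jiwer).
wer_metric = evaluate.load("wer")

# Placeholder strings; in practice, predictions come from the model and
# references from the Common Voice 11.0 Hindi test split.
predictions = ["यह एक उदाहरण वाक्य है"]
references = ["यह एक उदाहरण वाक्य है"]

wer = wer_metric.compute(predictions=predictions, references=references)
print(f"WER: {100 * wer:.2f}%")  # WER is reported as a percentage on this card
```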