iVaani - Fine-tuned ASR Model for the Hindi Language

Model Description

This is the iVaani model, a Whisper medium checkpoint fine-tuned specifically for the Hindi language. Fine-tuning improved transcription accuracy by 2.5% compared to the original Whisper model.

Performance

After fine-tuning, the model shows a 2.5% increase in transcription accuracy for Hindi language audio compared to the base Whisper medium model.

How to Use

You can use this model directly with the Hugging Face transformers library. The snippet below loads the fine-tuned Whisper checkpoint and its processor, then transcribes a Hindi audio file:

import librosa
from transformers import WhisperForConditionalGeneration, WhisperProcessor

model = WhisperForConditionalGeneration.from_pretrained("rukaiyah-indika-ai/iVaani")
processor = WhisperProcessor.from_pretrained("rukaiyah-indika-ai/iVaani")

# Replace 'path_to_audio_file' with the path to your Hindi audio file;
# Whisper expects 16 kHz mono audio, so resample while loading
audio, _ = librosa.load(path_to_audio_file, sr=16000)
inputs = processor(audio, sampling_rate=16000, return_tensors="pt")

# Generate token ids and decode them into text
predicted_ids = model.generate(inputs.input_features)
transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]
print("Transcription:", transcription)

Additional Language Models

Indika AI has also fine-tuned ASR (Automatic Speech Recognition) models for several other Indic languages, improving accuracy by 2-5% for each language and significantly reducing the word error rate (WER).
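
Gains like these are typically verified by computing the word error rate on a held-out test set. The exact evaluation setup is not documented here, but as an illustrative sketch, the Hugging Face evaluate library's wer metric can be used; the reference and prediction strings below are placeholders:

import evaluate

# WER = (substitutions + insertions + deletions) / number of reference words
wer_metric = evaluate.load("wer")

references = ["नमस्ते आप कैसे हैं"]   # ground-truth transcripts (placeholders)
predictions = ["नमस्ते आप कैसे है"]   # model outputs (placeholders)

wer = wer_metric.compute(references=references, predictions=predictions)
print(f"WER: {wer:.2%}")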

The additional languages include:

Language      Original Accuracy
Bengali       88%
Telugu        86%
Marathi       87%
Tamil         88%
Gujarati      90%
Kannada       86.5%
Malayalam     87.5%
Punjabi       89%
Odia          88.5%

BibTeX entry and citation info

If you use this model in your research, please cite it as follows:

@misc{whisper-medium-hindi-fine-tuned,
  author       = {Indika AI},
  title        = {iVaani},
  year         = {2024},
  publisher    = {Hugging Face},
  howpublished = {Hugging Face Model Hub}
}

Training hyperparameters

The following hyperparameters were used during training; a sketch showing how they map onto transformers training arguments follows the list:

  • learning_rate: 1e-05
  • train_batch_size: 2
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 4 (train_batch_size × gradient_accumulation_steps)
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • training_steps: 1000
  • mixed_precision_training: Native AMP
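
For reference, here is a minimal sketch of how the hyperparameters above map onto transformers Seq2SeqTrainingArguments, assuming the model was trained with the Hugging Face Trainer; the output directory name is a placeholder, and the rest of the Trainer setup (model, datasets, data collator) is omitted:

from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./ivaani-whisper-hindi",  # placeholder
    learning_rate=1e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=2,  # effective train batch size: 2 x 2 = 4
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    max_steps=1000,
    fp16=True,  # native AMP mixed-precision training
)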

Framework versions

  • Transformers 4.35.2
  • Pytorch 2.1.0+cu121
  • Datasets 2.16.0
  • Tokenizers 0.15.0