license: mit
datasets:
- mozilla-foundation/common_voice_17_0
language:
- en
- ta
metrics:
- wer
base_model:
- openai/whisper-small
pipeline_tag: automatic-speech-recognition
library_name: transformers
tags:
- language-identification
- speech-to-text
Whisper-small-ta
This model is trained for speech-to-text transcription of the Tamil language.
Model Overview
This model is fine-tuned from openai/whisper-small using the Mozilla Common Voice 17.0 dataset for language identification and transcription in Tamil. The model is designed to accurately transcribe spoken audio into text and identify whether the language is Tamil.
Key Features:
- Languages: Tamil
- Base Model: Whisper-small from OpenAI
- Dataset: Mozilla Common Voice 17.0
Intended Use
The model is designed for automatic speech recognition (ASR) in Tamil, making it suitable for transcription and language identification in real-time applications.
Training Details
This model was fine-tuned using a subset of the Mozilla Common Voice dataset. The dataset contains 53,468 samples.
Fine-tuning Process:
- The fine-tuning was performed on Whisper-small, a smaller version of OpenAI's Whisper model, for reduced latency and higher accuracy for low-resource languages.
- The model was trained for 2 epochs in a Google Colab Pro environment.
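To get a rough sense of the scale of such a run, the number of optimizer steps can be estimated from the sample count and epoch count given above. This is only a back-of-the-envelope sketch: the card does not state the batch size, so the value 16 below is purely hypothetical, and the sketch assumes all 53,468 samples were used for training with no gradient accumulation.

```python
import math


def optimizer_steps(num_samples: int, epochs: int, batch_size: int) -> int:
    """Estimate optimizer steps, counting the final partial batch of each epoch."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    steps_per_epoch = math.ceil(num_samples / batch_size)
    return steps_per_epoch * epochs


# 53,468 samples and 2 epochs come from this card; batch_size=16 is an assumption.
print(optimizer_steps(num_samples=53_468, epochs=2, batch_size=16))
```

A different batch size, or gradient accumulation, would change the result proportionally.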
Performance
The model achieved a Word Error Rate (WER) of 34% on a validation dataset with 8 hours of audio.
We expect further improvements with continued training.
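For readers unfamiliar with the metric, WER is the word-level edit distance (substitutions, deletions, and insertions) between the reference transcript and the model output, divided by the number of reference words. The function below is a minimal, dependency-free sketch of that definition; it is not necessarily how the figure reported above was computed, and the example sentences are illustrative only. Reported WER values also depend on text normalization (punctuation, casing), which this sketch does not perform.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far (one fewer than the current row) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Illustrative example: one of three reference words is transcribed differently.
reference = "நான் வீட்டிற்கு போகிறேன்"
hypothesis = "நான் வீட்டுக்கு போகிறேன்"
print(f"WER: {word_error_rate(reference, hypothesis):.2f}")
```

Under this definition, a WER of 34% means roughly one word-level error for every three reference words.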
Usage
You can use this model with the following code:
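import librosa  # used here only to load audio; any loader that yields a 16 kHz mono waveform works
import torch
from transformers import WhisperForConditionalGeneration, WhisperProcessor

model = WhisperForConditionalGeneration.from_pretrained("Lingalingeswaran/whisper-small-ta")
processor = WhisperProcessor.from_pretrained("Lingalingeswaran/whisper-small-ta")

# Load the audio file as a 16 kHz mono waveform, the sampling rate Whisper expects
audio, sampling_rate = librosa.load("path_to_audio_file", sr=16000)

# Convert the raw waveform into log-Mel input features
inputs = processor(audio, sampling_rate=sampling_rate, return_tensors="pt")

# Generate token IDs without tracking gradients
with torch.no_grad():
    predicted_ids = model.generate(inputs.input_features)

# Decode the token IDs back into text
transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
print(transcription)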
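The snippet above handles a single short clip. Whisper models process audio in windows of at most 30 seconds, so longer recordings need to be split before transcription. The helper below is a minimal sketch of that idea; `split_into_windows` is a hypothetical name introduced for illustration and is not part of transformers. It assumes a 16 kHz waveform and uses plain, non-overlapping windows.

```python
from typing import Iterator, Sequence

SAMPLE_RATE = 16_000   # Hz, the sampling rate Whisper's feature extractor expects
WINDOW_SECONDS = 30    # maximum audio length Whisper processes in one pass


def split_into_windows(
    samples: Sequence[float],
    sample_rate: int = SAMPLE_RATE,
    window_seconds: int = WINDOW_SECONDS,
) -> Iterator[Sequence[float]]:
    """Yield consecutive, non-overlapping windows; the last one may be shorter."""
    window = sample_rate * window_seconds
    if window <= 0:
        raise ValueError("sample_rate and window_seconds must be positive")
    for start in range(0, len(samples), window):
        yield samples[start:start + window]


# Example: 70 seconds of silence split into 30 s, 30 s, and 10 s windows.
waveform = [0.0] * (SAMPLE_RATE * 70)
for index, chunk in enumerate(split_into_windows(waveform)):
    print(index, len(chunk) / SAMPLE_RATE, "seconds")
```

Each window can then be passed through the processor and model as shown earlier. Non-overlapping windows may cut a word at the boundary, so overlapping windows or the chunking support in the transformers automatic-speech-recognition pipeline (the `chunk_length_s` argument) may give better results on long recordings.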