shhossain/whisper-tiny-bn

@@ -1,30 +1,116 @@
 ---
 license: apache-2.0
-base_model: openai/whisper-tiny
-tags:
-- generated_from_trainer
-metrics:
-- wer
-model-index:
-- name: whisper-tiny-bn
-  results: []
 language:
 - bn
 pipeline_tag: automatic-speech-recognition
 ---
-# whisper-tiny-bn
-This model is a fine-tuned version of [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny) on the None dataset.
-It achieves the following results on the evaluation set:
-- Loss: 0.4041
-- Wer: 74.0213
-### Framework versions
-- Transformers 4.33.2
-- Pytorch 2.0.1+cu118
-- Datasets 2.14.5
-- Tokenizers 0.13.3

 ---
 license: apache-2.0
 language:
+- en
 - bn
+metrics:
+- wer
+library_name: transformers
 pipeline_tag: automatic-speech-recognition
 ---
+## Results
+- WER: 74.02 on the evaluation set
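
The card does not say how the WER figure was computed. As a rough illustration, the sketch below implements the standard definition (word-level edit distance divided by the number of reference words) and reports it as a percentage, which is assumed to be the scale used above. The function name `word_error_rate` is hypothetical and is not part of any library mentioned here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage: word-level edit distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return 100.0 * prev[-1] / len(ref)
```

Under this definition a score of 74 would mean roughly three word errors for every four reference words, so a lower value is better.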
+# Use with [BanglaSpeech2Text](https://github.com/shhossain/BanglaSpeech2Text)
+## Test it in Google Colab
+- [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/shhossain/BanglaSpeech2Text/blob/main/BanglaSpeech2Text_in_Colab.ipynb)
+## Installation
+You can install the library using pip:
+```bash
+pip install banglaspeech2text
+```
+## Usage
+### Model Initialization
+To use the library, initialize the `Speech2Text` class with the model you want. The default is the "base" model; the other pre-trained sizes are "tiny", "small", "medium", and "large". The example below passes the Hugging Face Hub ID of this model instead:
+```python
+from banglaspeech2text import Speech2Text
+stt = Speech2Text(model="shhossain/whisper-tiny-bn")
+```
+### Transcribing Audio Files
+To transcribe an audio file, call the `transcribe` method with the path to the file. It returns the transcribed text as a string:
+```python
+transcription = stt.transcribe("audio.wav")
+print(transcription)
+```
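
To transcribe several files in one go, the same call can be applied in a loop. The sketch below is a minimal, hypothetical helper (`transcribe_folder` is not part of BanglaSpeech2Text) that accepts any function mapping a file path to text, such as `stt.transcribe` from the example above. The `*.wav` pattern is an assumption; adjust it to your files.

```python
from pathlib import Path
from typing import Callable, Dict


def transcribe_folder(
    folder: str,
    transcribe: Callable[[str], str],
    pattern: str = "*.wav",
) -> Dict[str, str]:
    """Map each matching file name in `folder` to its transcription."""
    results = {}
    # Sort the paths so the output order is stable across runs.
    for path in sorted(Path(folder).glob(pattern)):
        results[path.name] = transcribe(str(path))
    return results


# Hypothetical usage, assuming `stt` was created as shown earlier:
# texts = transcribe_folder("recordings", stt.transcribe)
```

Passing the transcription function as an argument keeps the helper independent of how the model was loaded.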
+### Use with SpeechRecognition
+You can use the [SpeechRecognition](https://pypi.org/project/SpeechRecognition/) package to capture audio from a microphone and transcribe it:
+```python
+import speech_recognition as sr
+from banglaspeech2text import Speech2Text
+stt = Speech2Text(model="shhossain/whisper-tiny-bn")
+r = sr.Recognizer()
+with sr.Microphone() as source:
+    print("Say something!")
+    audio = r.listen(source)
+    output = stt.recognize(audio)
+print(output)
+```
+### Use GPU
+You can run inference on a GPU to speed it up:
+```python
+stt = Speech2Text(model="shhossain/whisper-tiny-bn", use_gpu=True)
+```
+### Advanced GPU Usage
+For finer control over GPU placement, pass the `device` or `device_map` parameter:
+```python
+stt = Speech2Text(model="shhossain/whisper-tiny-bn", device="cuda:0")
+```
+```python
+stt = Speech2Text(model="shhossain/whisper-tiny-bn", device_map="auto")
+```
+__NOTE__: Read more about [PyTorch devices](https://pytorch.org/docs/stable/tensor_attributes.html#torch.torch.device).
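
A hard-coded `"cuda:0"` fails on machines without a CUDA GPU. One possible way to pick the `device` value at run time is sketched below. The helper name `pick_device` is hypothetical, and the fallback to `"cpu"` when PyTorch is not installed is a choice made for this example.

```python
def pick_device() -> str:
    """Return "cuda:0" when a CUDA GPU is usable, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is no GPU backend to query.
        return "cpu"
    return "cuda:0" if torch.cuda.is_available() else "cpu"


# Hypothetical usage with the `device` parameter shown above:
# stt = Speech2Text(model="shhossain/whisper-tiny-bn", device=pick_device())
```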
+### Instantly Check with Gradio
+You can try the model right away in a Gradio interface:
+```python
+from banglaspeech2text import Speech2Text
+import gradio as gr
+stt = Speech2Text(model="shhossain/whisper-tiny-bn", use_gpu=True)
+# share=True prints a public URL that you can also open on a phone.
+# gr.Audio(source=...) is the Gradio 3.x argument; Gradio 4+ expects sources=["microphone"].
+gr.Interface(
+    fn=stt.transcribe,
+    inputs=gr.Audio(source="microphone", type="filepath"),
+    outputs="text",
+).launch(share=True)
+```
+__Note__: For more use cases and models, see [BanglaSpeech2Text](https://github.com/shhossain/BanglaSpeech2Text).
+# Use with transformers
+## Installation
+```bash
+pip install transformers
+pip install torch
+```
+## Usage
+### Use with file
+```python
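+from transformers import pipeline
+
+# Load the fine-tuned checkpoint from the Hugging Face Hub
+pipe = pipeline("automatic-speech-recognition", model="shhossain/whisper-tiny-bn")
+
+def transcribe(audio_path):
+    """Return the transcription of the audio file at audio_path."""
+    return pipe(audio_path)["text"]
+
+audio_file = "test.wav"
+print(transcribe(audio_file))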
+```
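
Whisper models process audio in 30-second windows, so recordings longer than that may need chunking. The sketch below shows one way to enable it through the `chunk_length_s` argument of the transformers speech recognition pipeline. The helper name `build_asr_pipeline` is hypothetical, and how well chunking works with this particular checkpoint has not been verified here.

```python
def build_asr_pipeline(
    model_id: str = "shhossain/whisper-tiny-bn",
    chunk_length_s: int = 30,
):
    """Create a speech recognition pipeline that splits long audio into chunks."""
    # Imported lazily so that defining this helper does not load transformers.
    from transformers import pipeline

    return pipeline(
        "automatic-speech-recognition",
        model=model_id,
        chunk_length_s=chunk_length_s,  # window length in seconds
    )


# Hypothetical usage (downloads the model on first call):
# pipe = build_asr_pipeline()
# print(pipe("long_recording.wav")["text"])
```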