```markdown
# Whisper Large v2 Uzbek Speech Recognition Model

This project contains a fine-tuned version of the Faster Whisper Large v2 model for Uzbek speech recognition. The model can be used to transcribe Uzbek audio files into text.

## Installation

1. Ensure you have Python 3.7 or higher installed.

2. Install the required libraries:

   ```
   pip install transformers datasets accelerate soundfile librosa torch
   ```

## Usage

You can use the model with the following Python code:

```python
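from transformers import pipeline, WhisperForConditionalGeneration, WhisperProcessor
import torch

# Use the GPU with half precision when available; otherwise fall back to CPU with full precision
device = "cuda:0" if torch.cuda.is_available() else "cpu"
torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32

# Load the model and processor (the dtype must be set here, when the weights are loaded)
model_name = "totetecdev/whisper-large-v2-uzbek-100steps"
model = WhisperForConditionalGeneration.from_pretrained(model_name, torch_dtype=torch_dtype)
processor = WhisperProcessor.from_pretrained(model_name)

# Create the speech recognition pipeline
pipe = pipeline(
    "automatic-speech-recognition",
    model=model,
    tokenizer=processor.tokenizer,
    feature_extractor=processor.feature_extractor,
    torch_dtype=torch_dtype,
    device=device,
)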

# Transcribe an audio file
audio_file = "path/to/your/audio/file.wav"  # Replace with the path to your audio file
result = pipe(audio_file)

print(result["text"])
```
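
Whisper detects the spoken language automatically, which can occasionally go wrong on short or noisy clips. The sketch below shows one way to request Uzbek explicitly through the pipeline's `generate_kwargs` argument. The helper name `transcribe_uzbek` is hypothetical, and the sketch assumes that the checkpoint's generation config accepts a `language` setting, which may not hold for every fine-tuned checkpoint.

```python
def transcribe_uzbek(pipe, audio_file):
    """Transcribe `audio_file`, asking Whisper to decode as Uzbek.

    `pipe` is assumed to be the automatic-speech-recognition pipeline
    created in the code above. This helper is illustrative only.
    """
    result = pipe(
        audio_file,
        # Forwarded to the model's generate() call by the pipeline
        generate_kwargs={"language": "uzbek", "task": "transcribe"},
    )
    return result["text"].strip()
```

With the pipeline from the previous block, `print(transcribe_uzbek(pipe, audio_file))` would replace the final two lines of that script.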

## Example Usage

1. Prepare your audio file (it should be in WAV format).
2. Save the above code in a Python file (e.g., `transcribe.py`).
3. Update the `model_name` and `audio_file` variables in the code with your values.
4. Run the following command in your terminal or command prompt:

   ```
   python transcribe.py
   ```

5. The transcribed text will be displayed on the screen.
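
Before running the script, it can help to confirm that the file from step 1 is a readable WAV file. The following is a minimal sketch using Python's standard `wave` module; the function name `describe_wav` is hypothetical, and it only handles uncompressed PCM WAV files. Whisper models work on 16 kHz audio, so a different sample rate reported here means the audio will be resampled before it reaches the model.

```python
import wave


def describe_wav(path):
    """Return basic properties of a PCM WAV file as a dictionary."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "sample_width_bytes": wav.getsampwidth(),
            # Duration in seconds = number of frames / frames per second
            "duration_s": frames / rate,
        }
```

For example, `describe_wav("path/to/your/audio/file.wav")` returns the channel count, sample rate, sample width, and duration of the file you plan to transcribe.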

## Notes

- This model will perform best with Uzbek audio files.
- Longer audio files may require more processing time.
- GPU usage is recommended, but the model can also run on CPU.
- If you're using Google Colab, you can upload your audio file using:

  ```python
  from google.colab import files
  uploaded = files.upload()
  audio_file = next(iter(uploaded))
  ```
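
For the longer recordings mentioned above, the Transformers speech recognition pipeline can split the audio into fixed-length windows through its `chunk_length_s` argument. The sketch below is one possible wrapper, not part of this project: the name `transcribe_long` and the 30-second default are assumptions, and `pipe` is assumed to be the pipeline built in the Usage section.

```python
def transcribe_long(pipe, audio_file, chunk_length_s=30):
    """Transcribe a long recording by letting the pipeline split it into chunks.

    Returns the full text and a list of (timestamp, text) segments.
    """
    result = pipe(
        audio_file,
        chunk_length_s=chunk_length_s,
        return_timestamps=True,
    )
    # With return_timestamps=True the result also carries a "chunks" list,
    # each entry holding a "timestamp" pair and the "text" for that span.
    segments = [
        (chunk["timestamp"], chunk["text"].strip())
        for chunk in result.get("chunks", [])
    ]
    return result["text"].strip(), segments
```

A call such as `text, segments = transcribe_long(pipe, audio_file)` gives both the complete transcript and per-segment timestamps, which is useful for subtitles or for locating a passage in the recording.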

## Model Details

- Base Model: Faster Whisper Large v2
- Fine-tuned for: Uzbek Speech Recognition

## License

This project is licensed under [LICENSE]. See the LICENSE file for details.

## Contact

For questions or feedback, please contact [KHABIB SALIMOV] at [[email protected]].

## Acknowledgements

- OpenAI for the original Whisper model

```