sivan22/faster-whisper-ivrit-ai-whisper-large-v2-tuned

	---
	language:
	- en
	- zh
	- de
	- es
	- ru
	- ko
	- fr
	- ja
	- pt
	- tr
	- pl
	- ca
	- nl
	- ar
	- sv
	- it
	- id
	- hi
	- fi
	- vi
	- he
	- uk
	- el
	- ms
	- cs
	- ro
	- da
	- hu
	- ta
	- 'no'
	- th
	- ur
	- hr
	- bg
	- lt
	- la
	- mi
	- ml
	- cy
	- sk
	- te
	- fa
	- lv
	- bn
	- sr
	- az
	- sl
	- kn
	- et
	- mk
	- br
	- eu
	- is
	- hy
	- ne
	- mn
	- bs
	- kk
	- sq
	- sw
	- gl
	- mr
	- pa
	- si
	- km
	- sn
	- yo
	- so
	- af
	- oc
	- ka
	- be
	- tg
	- sd
	- gu
	- am
	- yi
	- lo
	- uz
	- fo
	- ht
	- ps
	- tk
	- nn
	- mt
	- sa
	- lb
	- my
	- bo
	- tl
	- mg
	- as
	- tt
	- haw
	- ln
	- ha
	- ba
	- jw
	- su
	tags:
	- audio
	- automatic-speech-recognition
	- hf-asr-leaderboard
	widget:
	- example_title: Librispeech sample 1
	  src: https://cdn-media.huggingface.co/speech_samples/sample1.flac
	- example_title: Librispeech sample 2
	  src: https://cdn-media.huggingface.co/speech_samples/sample2.flac
	pipeline_tag: automatic-speech-recognition
	license: apache-2.0
	datasets:
	- ivrit-ai/whisper-training
	---

	# NOTE: This is a CTranslate2 (faster-whisper) version of the model
	The original model can be found [here](https://huggingface.co/ivrit-ai/whisper-large-v2-tuned).

	# Whisper

	Whisper is a pre-trained model for automatic speech recognition (ASR) and speech translation.
	More details about it are available [here](https://huggingface.co/openai/whisper-large-v2).

	whisper-large-v2-tuned is a version of whisper-large-v2, fine-tuned by [ivrit.ai](https://www.ivrit.ai) to improve Hebrew ASR using crowd-sourced labeling.

	## Model details

	This model comes as a single checkpoint, whisper-large-v2-tuned.
	It is a 1550M-parameter multilingual ASR model.

	# Usage

	```python
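	from faster_whisper import WhisperModel

	# Load the CTranslate2 weights from the Hugging Face Hub.
	model = WhisperModel("sivan22/faster-whisper-ivrit-ai-whisper-large-v2-tuned")

	# transcribe() returns a lazy generator of segments plus metadata about the audio.
	segments, info = model.transcribe("audio.mp3")
	for segment in segments:
	    # Print each segment with its start and end time in seconds.
	    print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))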
	```
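
Because this checkpoint is tuned for Hebrew, it can help to fix the language explicitly rather than rely on auto-detection, and to choose the device and precision up front. The following is a minimal sketch, not the author's recommended configuration: the `Line` type and the `format_line` and `transcribe_hebrew` helpers are hypothetical names introduced here, and the `device="cpu"`, `compute_type="int8"`, and `beam_size=5` values are assumptions you should adjust to your hardware.

```python
from typing import List, NamedTuple

MODEL_ID = "sivan22/faster-whisper-ivrit-ai-whisper-large-v2-tuned"


class Line(NamedTuple):
    """One transcribed segment (hypothetical helper type)."""
    start: float
    end: float
    text: str


def format_line(line: Line) -> str:
    """Format a segment the same way as the basic usage example."""
    return "[%.2fs -> %.2fs] %s" % (line.start, line.end, line.text.strip())


def transcribe_hebrew(audio_path: str,
                      device: str = "cpu",
                      compute_type: str = "int8") -> List[Line]:
    """Transcribe a file with the language fixed to Hebrew.

    The device and compute_type defaults are assumptions for a CPU-only
    machine; on a GPU, device="cuda" with compute_type="float16" is a
    common alternative.
    """
    # Imported lazily so the formatting helper works without the library.
    from faster_whisper import WhisperModel

    model = WhisperModel(MODEL_ID, device=device, compute_type=compute_type)
    segments, _info = model.transcribe(audio_path, language="he", beam_size=5)
    # segments is a generator; iterating over it runs the actual decoding.
    return [Line(s.start, s.end, s.text) for s in segments]


if __name__ == "__main__":
    for line in transcribe_hebrew("audio.mp3"):
        print(format_line(line))
```

Collecting the segments into a list forces the whole file to be decoded before anything is printed; iterate over the generator directly if you prefer streaming output.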

	## Evaluation

	You can use the [evaluate_model.py](https://github.com/yairl/ivrit.ai/blob/master/evaluate_model.py) reference script on GitHub to evaluate the model's quality.
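
For a quick sanity check without the full script, word error rate (WER) is a common ASR quality metric. The sketch below is a plain-Python illustration of WER and is not taken from `evaluate_model.py`, whose exact metric and text normalization are not described here; `word_error_rate` is a hypothetical helper, and it applies no normalization beyond whitespace splitting.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so
    # far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing in hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word and one missing word out of four reference words.
    print(word_error_rate("a b c d", "a x c"))
```

In practice you would join the `segment.text` values from a transcription into one string and compare it against a human-made reference transcript.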


	### BibTeX entry and citation info

	ivrit.ai: A Comprehensive Dataset of Hebrew Speech for AI Research and Development
	```bibtex
	@misc{marmor2023ivritai,
	title={ivrit.ai: A Comprehensive Dataset of Hebrew Speech for AI Research and Development},
	author={Yanir Marmor and Kinneret Misgav and Yair Lifshitz},
	year={2023},
	eprint={2307.08720},
	archivePrefix={arXiv},
	primaryClass={eess.AS}
	}
	```

	Whisper: Robust Speech Recognition via Large-Scale Weak Supervision
	```bibtex
	@misc{radford2022whisper,
	doi = {10.48550/ARXIV.2212.04356},
	url = {https://arxiv.org/abs/2212.04356},
	author = {Radford, Alec and Kim, Jong Wook and Xu, Tao and Brockman, Greg and McLeavey, Christine and Sutskever, Ilya},
	title = {Robust Speech Recognition via Large-Scale Weak Supervision},
	publisher = {arXiv},
	year = {2022},
	copyright = {arXiv.org perpetual, non-exclusive license}
	}
	```