distil-whisper/distil-large-v2

sanchit-gandhi (HF staff) committed on Nov 9, 2023

Commit 0ab491c • 1 parent: dcbf704

whisper cpp

Files changed (1)

README.md +34 -0


@@ -263,6 +263,40 @@ To transcribe a local audio file, simply pass the path to the audio file as the
 pred_out = transcribe(model, audio="audio.mp3")
 ```
+### Whisper.cpp
+Distil-Whisper can be run from the [Whisper.cpp](https://github.com/ggerganov/whisper.cpp) repository with the original
+sequential long-form transcription algorithm. In a [provisional benchmark](https://github.com/ggerganov/whisper.cpp/pull/1424#issuecomment-1793513399)
+on a Mac M1, `distil-large-v2` is 2x faster than `large-v2`, while performing to within 0.1% WER over long-form audio.
+Note that future releases of Distil-Whisper will place more emphasis on faster CPU inference. By distilling smaller encoders, we
+aim to achieve speed-ups on CPU similar to those we obtain on GPU.
+Steps for getting started:
+1. Clone the Whisper.cpp repository:
+```bash
+git clone https://github.com/ggerganov/whisper.cpp.git
+cd whisper.cpp
+```
+2. Download the ggml weights for `distil-large-v2` from the Hugging Face Hub:
+```bash
+python -c "from huggingface_hub import hf_hub_download; hf_hub_download(repo_id='distil-whisper/distil-large-v2', filename='ggml-large-32-2.en.bin', local_dir='./models')"
+```
+If you do not have the `huggingface_hub` package installed, you can instead download the weights with `wget`:
+```bash
+wget https://huggingface.co/distil-whisper/distil-large-v2/resolve/main/ggml-large-32-2.en.bin -P ./models
+```
+3. Run inference using the provided sample audio:
+```bash
+make -j && ./main -m models/ggml-large-32-2.en.bin -f samples/jfk.wav
+```
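The inference step above can also be driven from Python. The following is a minimal sketch, not part of the commit: it assumes the `./main` binary built in step 3 and the weights downloaded in step 2, and the helper names `build_whisper_cmd` and `run_whisper` are hypothetical. Only the `-m` and `-f` flags shown in step 3 are used.

```python
import subprocess
from pathlib import Path


def build_whisper_cmd(model_path, audio_path, binary="./main"):
    """Return the argument list for the whisper.cpp command shown in step 3."""
    # -m selects the ggml weights, -f the input audio file (both flags appear in step 3).
    return [binary, "-m", str(model_path), "-f", str(audio_path)]


def run_whisper(model_path, audio_path, binary="./main"):
    """Run whisper.cpp on one audio file and return whatever it prints to stdout."""
    # Fail early with a clear message if the weights or the audio file are missing.
    for path in (model_path, audio_path):
        if not Path(path).is_file():
            raise FileNotFoundError(f"Missing file: {path}")
    result = subprocess.run(
        build_whisper_cmd(model_path, audio_path, binary),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if whisper.cpp exits with an error
    )
    return result.stdout
```

From inside the `whisper.cpp` directory, calling `run_whisper("models/ggml-large-32-2.en.bin", "samples/jfk.wav")` should be equivalent to the shell command in step 3, minus the `make -j` build.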
 ### Transformers.js
 ```js