qmeeus/whisper-small-nl

Automatic Speech Recognition

Generated from Trainer


qmeeus committed on May 10, 2023

Commit 10c4eba • 1 Parent(s): 4fd0dbf

Update README.md

Files changed (1)

README.md +24 -4


@@ -11,9 +11,6 @@ model-index:
   results: []
 ---
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
 # whisper-small-nl
 This model is a fine-tuned version of [qmeeus/whisper-small-nl](https://huggingface.co/qmeeus/whisper-small-nl) on the None dataset.
@@ -27,7 +24,30 @@ More information needed
 ## Intended uses & limitations
-More information needed
+Transcribe files in Dutch:
+```python
+import math
+import soundfile as sf
+from transformers import pipeline
+whisper_asr = pipeline("automatic-speech-recognition", model="qmeeus/whisper-small-nl", device=0)
+whisper_asr.model.config.forced_decoder_ids = whisper_asr.tokenizer.get_decoder_prompt_ids(
+    task="transcribe", language="nl"
+)
+waveform, sr = sf.read(filename)  # filename: path to a mono audio file sampled at 16 kHz
+def iter_chunks(waveform, sampling_rate=16_000, chunk_length=30.):
+    assert sampling_rate == 16_000
+    n_frames = math.floor(sampling_rate * chunk_length)
+    for start in range(0, len(waveform), n_frames):
+        end = min(len(waveform), start + n_frames)
+        yield waveform[start:end]
+for sentence in whisper_asr(iter_chunks(waveform, sr)):
+    print(sentence["text"])
+```
 ## Training and evaluation data
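
The chunking helper added in this commit can be checked on its own, without downloading the model. The following is a minimal sketch, assuming a plain Python list of zeros as a stand-in for the waveform (the 70-second length is an illustrative choice, not something from the commit). It keeps the 16 kHz requirement from the README snippet but raises a `ValueError` instead of using `assert`, which is an editorial substitution.

```python
import math


def iter_chunks(waveform, sampling_rate=16_000, chunk_length=30.0):
    """Yield consecutive, non-overlapping slices of at most `chunk_length` seconds."""
    # The README snippet asserts 16 kHz input; here the same check raises instead.
    if sampling_rate != 16_000:
        raise ValueError(f"expected 16 kHz audio, got {sampling_rate} Hz")
    n_frames = math.floor(sampling_rate * chunk_length)
    for start in range(0, len(waveform), n_frames):
        # Slicing clamps at the end of the sequence, so the last chunk may be shorter.
        yield waveform[start:start + n_frames]


# Hypothetical input: 70 seconds of silence at 16 kHz (stand-in for real audio).
waveform = [0.0] * (70 * 16_000)
chunk_lengths = [len(chunk) for chunk in iter_chunks(waveform)]
print(chunk_lengths)  # [480000, 480000, 160000]
```

Because the chunks are cut at fixed 30-second boundaries, a word that straddles a boundary may be split between two transcripts. The `transformers` ASR pipeline also accepts a `chunk_length_s` argument that handles long inputs with overlapping strides, which may be worth considering as an alternative to manual chunking.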