Update README.md
README.md
CHANGED
@@ -11,9 +11,6 @@ model-index:
 results: []
 ---
 
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
-
 # whisper-small-nl
 
 This model is a fine-tuned version of [qmeeus/whisper-small-nl](https://huggingface.co/qmeeus/whisper-small-nl) on the None dataset.
@@ -27,7 +24,30 @@ More information needed
 
 ## Intended uses & limitations
 
-
+Transcribe files in Dutch:
+
+```python
+import soundfile as sf
+from transformers import pipeline
+
+whisper_asr = pipeline("automatic-speech-recognition", model="qmeeus/whisper-small-nl", device=0)
+whisper_asr.model.config.forced_decoder_ids = whisper_asr.tokenizer.get_decoder_prompt_ids(
+    task="transcribe", language="nl"
+)
+
+waveform, sr = sf.read(filename)
+
+def iter_chunks(waveform, sampling_rate=16_000, chunk_length=30.):
+    assert sampling_rate == 16_000
+    n_frames = math.floor(sampling_rate * chunk_length)
+    for start in range(0, len(waveform), n_frames):
+        end = min(len(waveform), start + n_frames)
+        yield waveform[start:end]
+
+for sentence in whisper_asr(iter_chunks(waveform, sr)):
+    print(sentence["text"])
+
+```
 
 ## Training and evaluation data
 
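Note on the added snippet: as committed, it calls `math.floor` without importing `math` and reads from a `filename` that is never defined, so it will not run verbatim. The following is a minimal sketch of the chunking helper on its own, with the import added. It is not the author's final code. The 70-second silent input is a hypothetical stand-in for the array returned by `sf.read`, used only so the example runs without a model or an audio file.

```python
import math


def iter_chunks(waveform, sampling_rate=16_000, chunk_length=30.0):
    """Yield consecutive slices of at most `chunk_length` seconds."""
    # Whisper's feature extractor expects 16 kHz audio.
    if sampling_rate != 16_000:
        raise ValueError(f"expected 16 kHz audio, got {sampling_rate} Hz")
    n_frames = math.floor(sampling_rate * chunk_length)
    for start in range(0, len(waveform), n_frames):
        # Slicing past the end is safe: the last chunk is simply shorter.
        yield waveform[start:start + n_frames]


# Hypothetical input: 70 s of silence at 16 kHz (a plain list stands in
# for the NumPy array that soundfile would return).
waveform = [0.0] * (70 * 16_000)
chunk_lengths = [len(chunk) for chunk in iter_chunks(waveform)]
print(chunk_lengths)  # [480000, 480000, 160000]
```

The sketch raises `ValueError` where the committed code uses `assert`, so the sample-rate check survives when Python runs with assertions disabled. Audio at another rate would need resampling to 16 kHz before being chunked.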