KBLab/whisper-tiny-rixvox

Automatic Speech Recognition


marma committed on Mar 16, 2023

Commit 1e40ec9 · 1 Parent(s): 45cee1b

Update README.md

Files changed (1)

README.md +18 -5


@@ -10,14 +10,27 @@ language:
 This is a [Whisper tiny](https://huggingface.co/openai/whisper-tiny) finetuned for Swedish using
 the [RixVox](https://huggingface.co/datasets/KBLab/rixvox) dataset.
 ## Evaluation
-### [Common Voice 11](#):
-* WER: XYZ
-* WER (normalized): XYZ
-* WER: 51.67615433270082
-* WER (normalized): 48.08777429467085
 ## Training

 This is a [Whisper tiny](https://huggingface.co/openai/whisper-tiny) finetuned for Swedish using
 the [RixVox](https://huggingface.co/datasets/KBLab/rixvox) dataset.
+Please note that this model, like every other encoder-decoder speech-to-text model, is prone to
+hallucinating on unexpected inputs and treats the task as translation rather than transcription.
+I.e., your mileage may vary depending on filtering and type of data.
+In this release the entire encoder was frozen. Subsequent releases will not do this **if** the
+generalization to other types of data (i.e., not parliamentary speeches) is kept when not freezing
+the encoder.
 ## Evaluation
+<!--* Common Voice 11 WER: 17.18
+* Common Voice 11 WER (normalized*): 12.24 -->
+* Fleurs WER: 51.68
+* Fleurs WER (normalized*): 48.09
+*) Normalization is done by applying the following to source and generated texts:
+```
+def normalize(s):
+    return ' '.join([ x for x in sub('[^0-9a-zåäöA-ZÅÄÖ ]', ' ', s.lower()).split() ])
+```
 ## Training