carlosdanielhernandezmena committed
Commit 13a4888 · 1 Parent(s): d51a9e6

Adding info to the README file

  1. README.md +108 -0
README.md CHANGED
@@ -1,3 +1,111 @@
  ---
+ language: fo
+ datasets:
+ - carlosdanielhernandezmena/ravnursson_asr
+ tags:
+ - audio
+ - automatic-speech-recognition
+ - faroese
+ - whisper-large
+ - whisper-large-v1
+ - ravnur-project
+ - faroe-islands
  license: cc-by-4.0
+ widget: null
+ model-index:
+ - name: whisper-large-faroese-8k-steps-100h
+   results:
+   - task:
+       name: Automatic Speech Recognition
+       type: automatic-speech-recognition
+     dataset:
+       name: Ravnursson (Test)
+       type: carlosdanielhernandezmena/ravnursson_asr
+       split: test
+       args:
+         language: fo
+     metrics:
+     - name: WER
+       type: wer
+       value: 6.889
+   - task:
+       name: Automatic Speech Recognition
+       type: automatic-speech-recognition
+     dataset:
+       name: Ravnursson (Dev)
+       type: carlosdanielhernandezmena/ravnursson_asr
+       split: validation
+       args:
+         language: fo
+     metrics:
+     - name: WER
+       type: wer
+       value: 5.054
  ---
+ # whisper-large-faroese-8k-steps-100h
+ "whisper-large-faroese-8k-steps-100h" is an acoustic model suitable for Automatic Speech Recognition in Faroese. It is the result of fine-tuning the model "openai/whisper-large" on 100 hours of Faroese data released by the Ravnur Project (https://maltokni.fo/en/) from the Faroe Islands.
+
+ The specific dataset used to create the model is called "Ravnursson Faroese Speech and Transcripts" and is available at http://hdl.handle.net/20.500.12537/276.
+
+ The fine-tuning process was performed in March 2023 on the servers of the Language and Voice Lab (https://lvl.ru.is/) at Reykjavík University (Iceland) by Carlos Daniel Hernández Mena.
+
+ # Evaluation
+ ```python
+ import torch
+ from transformers import WhisperForConditionalGeneration, WhisperProcessor
+
+ # Load the processor and model.
+ MODEL_NAME = "carlosdanielhernandezmena/whisper-large-faroese-8k-steps-100h"
+ processor = WhisperProcessor.from_pretrained(MODEL_NAME)
+ model = WhisperForConditionalGeneration.from_pretrained(MODEL_NAME).to("cuda")
+
+ # Load the dataset.
+ from datasets import load_dataset, Audio
+ ds = load_dataset("carlosdanielhernandezmena/ravnursson_asr", split="test")
+
+ # Resample the audio to 16 kHz, the rate Whisper expects.
+ ds = ds.cast_column("audio", Audio(sampling_rate=16_000))
+
+ # Transcribe one example and normalize both reference and prediction.
+ def map_to_pred(batch):
+     audio = batch["audio"]
+     input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
+     batch["reference"] = processor.tokenizer._normalize(batch["normalized_text"])
+
+     with torch.no_grad():
+         predicted_ids = model.generate(input_features.to("cuda"))[0]
+
+     transcription = processor.decode(predicted_ids)
+     batch["prediction"] = processor.tokenizer._normalize(transcription)
+
+     return batch
+
+ # Run the model over the whole test split.
+ result = ds.map(map_to_pred)
+
+ # Compute the overall WER (as a percentage).
+ from evaluate import load
+
+ wer = load("wer")
+ WER = 100 * wer.compute(references=result["reference"], predictions=result["prediction"])
+ print(WER)
+ ```
+ **Test Result**: 6.88978359335682
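For readers unfamiliar with the metric reported above, word error rate is the word-level edit distance between reference and prediction (substitutions, deletions, insertions) divided by the number of reference words. The following is a minimal pure-Python sketch of that definition, pooled over all utterances; it is not the `evaluate` implementation, and the function name `word_error_rate` and the toy strings are illustrative assumptions rather than anything from the commit.

```python
def word_error_rate(references, predictions):
    """Corpus-level WER: total word edits / total reference words."""
    total_edits = 0
    total_words = 0
    for ref, hyp in zip(references, predictions):
        ref_words = ref.split()
        hyp_words = hyp.split()
        # Levenshtein distance over words, keeping one DP row at a time.
        prev = list(range(len(hyp_words) + 1))
        for i, r in enumerate(ref_words, start=1):
            curr = [i]
            for j, h in enumerate(hyp_words, start=1):
                cost = 0 if r == h else 1
                curr.append(min(prev[j] + 1,          # deletion
                                curr[j - 1] + 1,      # insertion
                                prev[j - 1] + cost))  # substitution or match
            prev = curr
        total_edits += prev[-1]
        total_words += len(ref_words)
    return total_edits / total_words


# Toy example: one substitution (b -> x) and one deletion (d) over 4 words.
print(100 * word_error_rate(["a b c d"], ["a x c"]))  # 50.0
```

Because the errors are pooled before dividing, long utterances weigh more than short ones, which is why a single overall figure is reported instead of an average of per-utterance rates.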
+
+ # BibTeX entry and citation info
+ * When publishing results based on this model, please cite:
+ ```bibtex
+ @misc{mena2023whisperlargefaroese,
+       title={Acoustic Model in Faroese: whisper-large-faroese-8k-steps-100h.},
+       author={Hernandez Mena, Carlos Daniel},
+       year={2023},
+       url={https://huggingface.co/carlosdanielhernandezmena/whisper-large-faroese-8k-steps-100h},
+ }
+ ```
+ # Acknowledgements
+ We want to thank Jón Guðnason, head of the Language and Voice Lab, for providing the computational power that made this model possible. We also want to thank the "Language Technology Programme for Icelandic 2019-2023", which is managed and coordinated by Almannarómur and funded by the Icelandic Ministry of Education, Science and Culture.
+
+ Thanks to Annika Simonsen and to The Ravnur Project for making their "Basic Language Resource Kit" (BLARK 1.0) publicly available through the research paper "Creating a Basic Language Resource Kit for Faroese": https://aclanthology.org/2022.lrec-1.495.pdf
+
+ Special thanks to Björn Ingi Stefánsson for setting up the configuration of the server where this model was trained.
+