Update README.md
README.md
CHANGED

    @@ -23,12 +23,12 @@ model-index:
     metrics:
     - name: Test WER
     type: wer
    -value: 42.
    +value: 42.17
     ---
     
     # Wav2Vec2-Large-XLSR-53-Bemba
     
    -Fine-tuned [facebook/wav2vec2-large-xlsr-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53) on Bemba using the [BembaSpeech](https://csikasote.github.io/BembaSpeech). When using this model, make sure that your speech input is sampled at 16kHz.
    +Fine-tuned [facebook/wav2vec2-large-xlsr-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53) on Bemba language of Zambia using the [BembaSpeech](https://csikasote.github.io/BembaSpeech). When using this model, make sure that your speech input is sampled at 16kHz.
     
     ## Usage
     
    @@ -79,14 +79,14 @@ from datasets import load_dataset, load_metric
     from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor
     import re
     
    -test_dataset = load_dataset("csv", data_files={"test": "/content/test.csv"}, delimiter="
    +test_dataset = load_dataset("csv", data_files={"test": "/content/test.csv"}, delimiter="\\t")["test"]
     wer = load_metric("wer")
     
     processor = Wav2Vec2Processor.from_pretrained("csikasote/wav2vec2-large-xlsr-bemba")
     model = Wav2Vec2ForCTC.from_pretrained("csikasote/wav2vec2-large-xlsr-bemba")
     model.to("cuda")
     
    -chars_to_ignore_regex = '[
    +chars_to_ignore_regex = '[\,\_\?\.\!\;\:\"\“]'
     #resampler = torchaudio.transforms.Resample(48_000, 16_000)
     
     # Preprocessing the datasets.
    @@ -116,8 +116,8 @@ result = test_dataset.map(evaluate, batched=True, batch_size=8)
     print("WER: {:2f}".format(100 * wer.compute(predictions=result["pred_strings"], references=result["sentence"])))
     ```
     
    -**Test Result**: 42.
    +**Test Result**: 42.17 %
     
     ## Training
     
    -The BembaSpeech `train`, `dev` and `test` datasets were used for training, development and evaluation respectively. The script used for
    +The BembaSpeech `train`, `dev` and `test` datasets were used for training, development and evaluation respectively. The script used for evaluating the model on the test dataset can be found [here](https://colab.research.google.com/drive/1aplFHfaXE68HGDwBYV2KqUWPasrk7bXv?usp=sharing).
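
The README's evaluation snippet reports a word error rate (WER) after stripping the characters in `chars_to_ignore_regex`. As a rough illustration of what that number measures, here is a minimal pure-Python sketch: it counts word-level edits (substitutions, deletions, insertions) and divides by the number of reference words. It is not the implementation behind `load_metric("wer")`. The helper names (`normalize`, `word_errors`, `corpus_wer`), the lowercasing step, and the choice to pool errors over all sentences are assumptions made for this example, since the preprocessing part of the script is not shown in the diff.

```python
import re

# Same character set as the README's `chars_to_ignore_regex`,
# written as a raw string without the redundant backslashes.
CHARS_TO_IGNORE = r'[,_?.!;:"“]'


def normalize(text: str) -> str:
    """Strip the ignored punctuation; lowercasing is an assumption of this sketch."""
    return re.sub(CHARS_TO_IGNORE, "", text).lower().strip()


def word_errors(reference: str, hypothesis: str) -> tuple[int, int]:
    """Return (word-level edit distance, number of reference words)."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j]: edits needed to turn the reference words seen so far
    # into the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1], len(ref)


def corpus_wer(references: list[str], predictions: list[str]) -> float:
    """Pool errors over all sentence pairs (an assumed aggregation)."""
    errors = total = 0
    for ref, hyp in zip(references, predictions):
        e, n = word_errors(normalize(ref), normalize(hyp))
        errors += e
        total += n
    return errors / total


if __name__ == "__main__":
    # Toy strings, not BembaSpeech data.
    score = corpus_wer(["a b c d"], ["a x c"])
    print("WER: {:.2f}".format(100 * score))
```

The toy strings differ by one substituted and one deleted word out of four, so the score is 0.5. The sketch prints with `{:.2f}`, which rounds to two decimals; the `{:2f}` spec in the README snippet sets a minimum field width of 2 and keeps the default six decimal places.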