Update README.md
README.md
CHANGED
@@ -6,11 +6,11 @@ metrics:
 library_name: nemo
 pipeline_tag: automatic-speech-recognition
 tags:
-- automatic-speech-recognition
 - speech
 - audio
 - pytorch
 - stt
+- automatic-speech-recognition
 ---
 
 
@@ -24,7 +24,7 @@ NumPy 1.21.6\
 PyTorch 1.21.1\
 NVIDIA NeMo 1.7.0
 
-```
+```bash
 pip3 install nemo_toolkit['all']
 ```
 
@@ -34,14 +34,14 @@ The model is accessible within the NeMo toolkit [1] and can serve as a pre-train
 
 #### How to Import
 
-```
+```python
 import nemo.collections.asr as nemo_asr
 model = nemo_asr.models.ASRModel.restore_from(restore_path="stt_kz_quartznet15x5.nemo")
 ```
 
 #### How to Train
 
-```
+```bash
 python3 train.py --train_manifest path/to/manifest.json --val_manifest path/to/manifest.json \
 --accelerator "gpu" --batch_size BATCH_SIZE --num_epochs NUM_EPOCHS \
 --model_save_path path/to/save/model.nemo
@@ -49,18 +49,18 @@ python3 train.py --train_manifest path/to/manifest.json --val_manifest path/to/m
 
 #### How to Evaluate
 
-```
+```bash
 python3 evaluate.py --model_path /path/to/model.nemo --test_manifest path/to/manifest.json --batch_size BATCH_SIZE
 ```
 
 #### How to Transcribe Audio File
 
 Sample audio to test the model:
-```
+```bash
 wget https://asr-kz-example.s3.us-west-2.amazonaws.com/sample_kz.wav
 ```
 This line is to transcribe the single audio:
-```
+```bash
 python3 transcribe.py --model_path /path/to/model.nemo --audio_file_path path/to/audio/file
 ```
 
@@ -81,7 +81,7 @@ In total, KSC2 contains around 1.2k hours of high-quality transcribed data compr
 
 ## Performance
 The model achieved:\
-Average WER: 13.53
+Average WER: **13.53%**\
 through the applying of **Greedy Decoding**.
 
 ## Limitations