transiteration committed on
Commit 8ea3efa · verified · 1 parent: 011f320

Update README.md

Files changed (1): README.md (+19 −12)
README.md CHANGED
@@ -34,27 +34,33 @@ pip install nemo_toolkit['all']
  The model is accessible within the NeMo toolkit [1] and can serve as a pre-trained checkpoint for either making inferences or for fine-tuning on a different dataset.
 
  #### How to Import
 
  ```
  import nemo.collections.asr as nemo_asr
- asr_model = nemo_asr.models.EncDecCTCModel.restore_from(restore_path="stt_kz_quartznet15x5.nemo")
  ```
- #### How to Transcribe Single Audio File
- We can get a sample audio to test the model:
  ```
- wget https://asr-kz-example.s3.us-west-2.amazonaws.com/sample_kz.wav
  ```
- Then this line of code is to transcribe the single audio:
  ```
- asr_model.transcribe(['sample_kz.wav'])
  ```
- #### How to Transcribe Multiple Audio Files
  ```
- python transcribe_speech.py model_path=stt_kz_quartznet15x5.nemo audio_dir="<DIRECTORY CONTAINING AUDIO FILES>"
  ```
-
- If you have a manifest file about your audio files:
  ```
- python transcribe_speech.py model_path=stt_kz_quartznet15x5.nemo dataset_manifest=manifest.json
  ```
 
  ## Input and Output
@@ -74,8 +80,9 @@ In total, KSC2 contains around 1.2k hours of high-quality transcribed data compr
 
  ## Performance
  The model achieved:\
- Average WER: 15.53%\
  by applying **Greedy Decoding**.
 
  ## Limitations
 
  Because of limited GPU resources, we used a lightweight model architecture for fine-tuning.\
 
  The model is accessible within the NeMo toolkit [1] and can serve as a pre-trained checkpoint for either making inferences or for fine-tuning on a different dataset.
 
  #### How to Import
+
  ```
  import nemo.collections.asr as nemo_asr
+ model = nemo_asr.models.EncDecCTCModel.restore_from(restore_path="stt_kz_quartznet15x5.nemo")
  ```
+
+ #### How to Train
+
  ```
+ python3 train.py --train_manifest path/to/manifest.json --val_manifest path/to/manifest.json --batch_size BATCH_SIZE --num_epochs NUM_EPOCHS --model_save_path path/to/save/model.nemo
  ```
+
+ #### How to Evaluate
+
  ```
+ python3 evaluate.py --model_path=/path/to/stt_kz_quartznet15x5.nemo --test_manifest path/to/manifest.json
  ```
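Both the training and evaluation commands read manifest files. NeMo's standard manifest format is JSON lines: one object per utterance with `audio_filepath`, `duration`, and `text` keys. A sketch with illustrative values (paths, durations, and transcripts are placeholders, not KSC2 data):

```json
{"audio_filepath": "audio/sample_001.wav", "duration": 3.2, "text": "сәлеметсіз бе"}
{"audio_filepath": "audio/sample_002.wav", "duration": 1.7, "text": "рақмет"}
```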
+
+ #### How to Transcribe Audio File
+
+ We can get a sample audio file to test the model:
  ```
+ wget https://asr-kz-example.s3.us-west-2.amazonaws.com/sample_kz.wav
  ```
+ Then run the following command to transcribe it:
 
  ```
+ python3 transcibe.py --model_path /path/to/stt_kz_quartznet15x5.nemo --audio_file_path path/to/audio/file
  ```
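For context on what transcription does with the CTC model's frame-wise predictions: greedy decoding (the strategy cited in the Performance section) takes the most likely token per frame, collapses consecutive repeats, and drops the CTC blank symbol. A minimal sketch in plain Python with a toy vocabulary, not the repository's actual decoder:

```python
# CTC greedy decoding sketch. Assumptions: blank id is 0 and the
# frame-wise argmax ids are already computed; a real model emits
# per-frame log-probabilities over its full character vocabulary.
BLANK = 0
VOCAB = {1: "с", 2: "ә", 3: "л", 4: "е", 5: "м"}  # toy id -> char map

def ctc_greedy_decode(frame_ids):
    """Collapse repeated ids, then remove blanks (standard CTC rule)."""
    out = []
    prev = None
    for t in frame_ids:
        if t != prev and t != BLANK:
            out.append(VOCAB[t])
        prev = t
    return "".join(out)

# Frame-wise ids for a toy utterance; repeats and blanks collapse away.
frames = [1, 1, 1, 0, 2, 2, 0, 3, 0, 4, 4, 0, 5]
print(ctc_greedy_decode(frames))  # -> сәлем
```

Note how a repeated id separated by a blank (`1, 0, 1`) decodes to two characters, while an uninterrupted repeat (`1, 1`) decodes to one.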
 
  ## Input and Output
 
 
  ## Performance
  The model achieved:\
+ Average WER: 13.53%\
  by applying **Greedy Decoding**.
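The reported WER is the word-level edit distance between hypothesis and reference divided by the number of reference words. A minimal self-contained sketch of the metric (the sentences are illustrative, not taken from the KSC2 test set):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length.
    Assumes a non-empty reference."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edits to turn the first i ref words into the first j hyp words
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[len(ref)][len(hyp)] / len(ref)

print(wer("бұл бір мысал", "бұл мысал"))  # one deletion over three words
```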
+
  ## Limitations
 
  Because of limited GPU resources, we used a lightweight model architecture for fine-tuning.\