Automatic Speech Recognition · ESPnet · English · audio

wanchichen committed 91d9731 (verified) · 1 parent: d6afe74

Update README.md

Files changed (1): README.md (+15 −1)
## ESPnet2 ASR model

This is a simple baseline for the ML-SUPERB 2.0 Challenge. It is a self-supervised [MMS 1B](https://huggingface.co/facebook/mms-1b) model fine-tuned on [142 languages of ML-SUPERB](https://huggingface.co/datasets/ftshijt/mlsuperb_8th) using CTC loss.

The MMS model is frozen and used as a feature extractor for a small Transformer encoder during fine-tuning, which took approximately 1 day on a single GPU.
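
The frozen-extractor setup can be sketched roughly as follows. This is a hypothetical illustration in plain PyTorch, not the actual ESPnet implementation: `FrozenSSL` is a toy stand-in for MMS 1B, and the layer sizes and vocabulary size are illustrative assumptions.

```python
import torch
import torch.nn as nn

class FrozenSSL(nn.Module):
    """Toy stand-in for the frozen MMS 1B feature extractor (assumption)."""
    def __init__(self, feat_dim=1280):
        super().__init__()
        self.proj = nn.Linear(1, feat_dim)  # placeholder for the real network

    @torch.no_grad()  # frozen: no gradients flow into the SSL model
    def forward(self, wav):                  # wav: (batch, samples)
        frames = wav.unfold(1, 320, 320)     # crude 20 ms framing at 16 kHz
        return self.proj(frames.mean(-1, keepdim=True))  # (batch, T, feat_dim)

class CTCBaseline(nn.Module):
    """Small trainable Transformer encoder + CTC head on frozen SSL features."""
    def __init__(self, feat_dim=1280, d_model=256, vocab_size=500):
        super().__init__()
        self.ssl = FrozenSSL(feat_dim)
        for p in self.ssl.parameters():
            p.requires_grad = False          # keep the feature extractor frozen
        self.input_proj = nn.Linear(feat_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.ctc_head = nn.Linear(d_model, vocab_size)  # blank + output tokens

    def forward(self, wav):
        feats = self.ssl(wav)                              # frozen features
        h = self.encoder(self.input_proj(feats))           # trainable encoder
        return self.ctc_head(h).log_softmax(-1)            # input to nn.CTCLoss

model = CTCBaseline()
logp = model(torch.randn(2, 16000))  # 1 second of 16 kHz audio
print(logp.shape)                    # torch.Size([2, 50, 500])
```

During training, only `input_proj`, `encoder`, and `ctc_head` receive gradient updates; the log-probabilities would be fed to `nn.CTCLoss` against the multilingual token targets.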

The model was trained using the [ML-SUPERB recipe](https://github.com/espnet/espnet/tree/master/egs2/ml_superb/asr1) in ESPnet. Inference can be performed with the following script:

```python
import soundfile

from espnet2.bin.asr_inference import Speech2Text

# Load the fine-tuned model from the Hugging Face Hub
model = Speech2Text.from_pretrained(
    "espnet/mms_1b_mlsuperb"
)

# Read the waveform and decode the 1-best hypothesis
speech, rate = soundfile.read("speech.wav")
text, *_ = model(speech)[0]
```

### Demo: How to use in ESPnet2