Automatic Speech Recognition · ESPnet · English · audio

wanchichen committed 91d9731 (verified) · 1 parent: d6afe74

Update README.md

Files changed (1): README.md (+15 −1)
## ESPnet2 ASR model

This is a simple baseline for the ML-SUPERB 2.0 Challenge. It is a self-supervised [MMS 1B](https://huggingface.co/facebook/mms-1b) model fine-tuned on [142 languages of ML-SUPERB](https://huggingface.co/datasets/ftshijt/mlsuperb_8th) using CTC loss.

The MMS model is frozen and used as a feature extractor for a small Transformer encoder during fine-tuning, which took approximately 1 day on a single GPU.
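
The frozen-extractor setup can be sketched roughly as follows. This is a hypothetical illustration in plain PyTorch, not the actual ESPnet implementation: `FrozenSSL` is a toy stand-in for MMS 1B, and the layer sizes and vocabulary size are illustrative assumptions.

```python
import torch
import torch.nn as nn

class FrozenSSL(nn.Module):
    """Toy stand-in for the frozen MMS 1B feature extractor (assumption)."""
    def __init__(self, feat_dim=1280):
        super().__init__()
        self.proj = nn.Linear(1, feat_dim)  # placeholder for the real network

    @torch.no_grad()  # frozen: no gradients flow into the SSL model
    def forward(self, wav):                  # wav: (batch, samples)
        frames = wav.unfold(1, 320, 320)     # crude 20 ms framing at 16 kHz
        return self.proj(frames.mean(-1, keepdim=True))  # (batch, T, feat_dim)

class CTCBaseline(nn.Module):
    """Small trainable Transformer encoder + CTC head on frozen SSL features."""
    def __init__(self, feat_dim=1280, d_model=256, vocab_size=500):
        super().__init__()
        self.ssl = FrozenSSL(feat_dim)
        for p in self.ssl.parameters():
            p.requires_grad = False          # keep the feature extractor frozen
        self.input_proj = nn.Linear(feat_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.ctc_head = nn.Linear(d_model, vocab_size)  # blank + output tokens

    def forward(self, wav):
        feats = self.ssl(wav)                              # frozen features
        h = self.encoder(self.input_proj(feats))           # trainable encoder
        return self.ctc_head(h).log_softmax(-1)            # input to nn.CTCLoss

model = CTCBaseline()
logp = model(torch.randn(2, 16000))  # 1 second of 16 kHz audio
print(logp.shape)                    # torch.Size([2, 50, 500])
```

During training, only `input_proj`, `encoder`, and `ctc_head` receive gradient updates; the log-probabilities would be fed to `nn.CTCLoss` against the multilingual token targets.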

The model was trained using the [ML-SUPERB recipe](https://github.com/espnet/espnet/tree/master/egs2/ml_superb/asr1) in ESPnet. Inference can be performed with the following script:

```python
import soundfile

from espnet2.bin.asr_inference import Speech2Text

# Load the fine-tuned model from the Hugging Face Hub
model = Speech2Text.from_pretrained(
    "espnet/mms_1b_mlsuperb"
)

# Read the waveform and decode the 1-best hypothesis
speech, rate = soundfile.read("speech.wav")
text, *_ = model(speech)[0]
```

### Demo: How to use in ESPnet2