nvidia/stt_en_conformer_transducer_xlarge

Automatic Speech Recognition

hf-asr-leaderboard


eharper committed on Jun 17, 2022

Commit a7ebc39 · 1 Parent(s): f1db9f8

Update README.md

Files changed (1)

README.md +4 -8

@@ -156,15 +156,11 @@ pip install nemo_toolkit['all']
 ## NVIDIA Riva: Deployment
-If you like this and other models from NVIDIA (i.e., CTC-based Conformers) check out [NVIDIA Riva](https://developer.nvidia.com/riva), an accelerated speech AI SDK deployable on-prem, in all clouds, multi-cloud, hybrid, on edge, and embedded. This model, as well as other RNNT-based models are currently not supported by Riva. You can find the list of models supported by Riva [here](https://docs.nvidia.com/deeplearning/riva/user-guide/docs/reference/models/index.html).
-Additionally, Riva provides:
-* World-class out-of-the-box accuracy for the most common languages with model checkpoints trained on proprietary data with hundreds of thousands of GPU-compute hours
-* Best in class accuracy via customization with run-time word boosting (e.g., brand and product names), acoustic model training, language model training, and inverse text normalization customizations
-* Streaming speech recognition, Kubernetes compatible scaling, and Enterprise-grade support
-Check out [Riva live demo](https://developer.nvidia.com/riva#demos).
+[CTC-based Conformers](https://huggingface.co/nvidia/stt_en_conformer_ctc_large) are supported by Riva today. This model, as well as other RNNT-based models, will be supported by future versions of [NVIDIA Riva](https://developer.nvidia.com/riva).
+## How to Use this Model
+The model is available for use in the NeMo toolkit [3], and can be used as a pre-trained checkpoint for inference or for fine-tuning on another dataset.
 ### Automatically instantiate the model
@@ -236,7 +232,7 @@ The list of the available models in this collection is shown in the following ta
 | 1.10.0 | SentencePiece Unigram | 1024 | 3.01 | 1.62 | 1.17 | 2.05 | 5.70 | 5.32 | 4.59 | 6.46 | NeMo ASRSET 3.0 |
 ## Limitations
-Since this model was trained on publically available speech datasets, the performance of this model might degrade for speech which includes technical terms, or vernacular that the model has not been trained on. The model might also perform worse for accented speech.
+Since this model was trained on publicly available speech datasets, the performance of this model might degrade for speech which includes technical terms, or vernacular that the model has not been trained on. The model might also perform worse for accented speech.
 ## References
 [1] [Conformer: Convolution-augmented Transformer for Speech Recognition](https://arxiv.org/abs/2005.08100)