Update README.md
README.md
CHANGED
@@ -156,15 +156,11 @@ pip install nemo_toolkit['all']
 
 ## NVIDIA Riva: Deployment
 
-
+[CTC-based Conformers](https://huggingface.co/nvidia/stt_en_conformer_ctc_large) are supported by Riva today. This model, as well as other RNNT-based models, will be supported by future versions of [NVIDIA Riva](https://developer.nvidia.com/riva).
 
-
+## How to Use this Model
 
-
-* Best in class accuracy via customization with run-time word boosting (e.g., brand and product names), acoustic model training, language model training, and inverse text normalization customizations
-* Streaming speech recognition, Kubernetes compatible scaling, and Enterprise-grade support
-
-Check out [Riva live demo](https://developer.nvidia.com/riva#demos).
+The model is available for use in the NeMo toolkit [3], and can be used as a pre-trained checkpoint for inference or for fine-tuning on another dataset.
 
 ### Automatically instantiate the model
 
@@ -236,7 +232,7 @@ The list of the available models in this collection is shown in the following ta
 | 1.10.0 | SentencePiece Unigram | 1024 | 3.01 | 1.62 | 1.17 | 2.05 | 5.70 | 5.32 | 4.59 | 6.46 | NeMo ASRSET 3.0 |
 
 ## Limitations
-Since this model was trained on
+Since this model was trained on publicly available speech datasets, the performance of this model might degrade for speech which includes technical terms, or vernacular that the model has not been trained on. The model might also perform worse for accented speech.
 
 ## References
 [1] [Conformer: Convolution-augmented Transformer for Speech Recognition](https://arxiv.org/abs/2005.08100)