The vocabulary we use contains 28 characters (the 27 Croatian letters plus the space character):

```python
[' ', 'a', 'b', 'c', 'č', 'ć', 'd', 'đ', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'r', 's', 'š', 't', 'u', 'v', 'z', 'ž']
```

Full config can be found inside the `.nemo` files.

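
For illustration only (plain Python, not the models' actual tokenizer, which per the table below is a 128-token SentencePiece Unigram model), a character-to-index mapping over this vocabulary can be sketched as:

```python
# The 28-character vocabulary listed above (space plus 27 Croatian letters).
vocab = [' ', 'a', 'b', 'c', 'č', 'ć', 'd', 'đ', 'e', 'f', 'g', 'h', 'i', 'j',
         'k', 'l', 'm', 'n', 'o', 'p', 'r', 's', 'š', 't', 'u', 'v', 'z', 'ž']
char2idx = {ch: i for i, ch in enumerate(vocab)}

def encode(text: str) -> list[int]:
    # Map each character to its vocabulary index; characters outside
    # the vocabulary raise KeyError, mirroring an out-of-vocabulary error.
    return [char2idx[ch] for ch in text.lower()]

ids = encode('čaša')  # [4, 1, 22, 1]
```
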
### Datasets

All the models in this collection are trained on the ParlaSpeech-HR v1.0 Croatian dataset, which contains around 1665 hours of training data after data cleaning, 2.2 hours of development data, and 2.3 hours of test data.

## Performance

The list of the available models in this collection is shown in the following table:

| Version | Tokenizer             | Vocabulary Size | Dev WER | Test WER | Train Dataset       |
|---------|-----------------------|-----------------|---------|----------|---------------------|
| 1.11.0  | SentencePiece Unigram | 128             | 4.43    | 4.70     | ParlaSpeech-HR v1.0 |
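
WER here is the standard word error rate: the word-level edit distance between hypothesis and reference, divided by the number of reference words. A minimal sketch of the metric (illustrative only, not the evaluation code used to produce the numbers above):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[len(ref)][len(hyp)] / len(ref)
```

For example, `wer("dobar dan svima", "dobar dan")` gives one deletion over three reference words, i.e. roughly 0.33.
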
You may use language models (LMs) and beam search to improve the accuracy of the models.
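
To sketch the idea (a toy word-level beam search with shallow LM fusion; this is not NeMo's actual decoder, and the words and probabilities are made up), an external LM can rescore hypotheses so that a linguistically likely word beats the acoustically highest-scoring one:

```python
import math

def beam_search(step_logprobs, lm_score, beam_width=3, lm_weight=0.5):
    """Toy beam search: acoustic log-prob + lm_weight * LM log-prob per step."""
    beams = [((), 0.0)]  # (prefix tuple, cumulative log score)
    for probs in step_logprobs:
        candidates = []
        for prefix, score in beams:
            for tok, lp in probs.items():
                new_score = score + lp + lm_weight * lm_score(prefix, tok)
                candidates.append((prefix + (tok,), new_score))
        # Keep only the beam_width best partial hypotheses.
        beams = sorted(candidates, key=lambda b: b[1], reverse=True)[:beam_width]
    return beams[0][0]

# Acoustic log-probabilities for two decoding steps (toy values).
steps = [
    {'dobar': math.log(0.6), 'dobro': math.log(0.4)},
    {'dan': math.log(0.45), 'dana': math.log(0.55)},
]

def lm_score(prefix, token):
    # Toy LM: strongly prefers "dan" after "dobar" ("good day" in Croatian).
    if prefix == ('dobar',):
        return math.log(0.9) if token == 'dan' else math.log(0.1)
    return math.log(0.5)
```

With `lm_weight=0.0` the decoder follows the acoustic scores alone and picks `('dobar', 'dana')`; with `lm_weight=1.0` the LM flips the second word to the more plausible `('dobar', 'dan')`.
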
## Limitations
Since the model is trained only on the ParlaSpeech-HR v1.0 dataset, its performance might degrade for speech that contains terms or vernacular the model has not been trained on. The model might also perform worse on accented speech.
## Deployment with NVIDIA Riva