---
license: mit
language:
- ru
library_name: transformers
tags:
- text-generation-inference
---

# text-normalization-ru-new

Normalization for Russian text.

I couldn't find any existing solutions (besides algorithmic ones, which I don't like), so I made this. It was designed for the Silero TTS model, which can't handle English words and numbers in Russian text-to-speech.

This model is a fine-tuned version of [cointegrated/rut5-small](https://huggingface.co/cointegrated/rut5-small) on the [Text Normalization Challenge - Russian Language](https://www.kaggle.com/c/text-normalization-challenge-russian-language) dataset and an additional dataset that I prepared from typical messages.

It achieves the following results on the evaluation set:
- Loss: 0.0177
- Mean Distance: 0
- Max Distance: 15

## Model description

Tiny T5 trained from scratch for normalizing Russian texts:
- translating numbers into words
- expanding abbreviations into phonetic letter combinations
- transliterating English into Russian letters
- whatever else was in the dataset (see below)
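
The Mean Distance and Max Distance figures in the evaluation results are not defined in this card. A minimal sketch of how such numbers could be computed, assuming they refer to the character-level Levenshtein (edit) distance between the model output and the reference normalization. The helper names `levenshtein` and `distance_summary` and the sample strings are hypothetical and are not taken from the training code or the evaluation set.

```python
from statistics import mean


def levenshtein(a: str, b: str) -> int:
    """Count the single-character edits needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # the distance is symmetric; keep the row short
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # drop a character of a
                current[j - 1] + 1,      # add a character of b
                previous[j - 1] + cost,  # substitute or keep
            ))
        previous = current
    return previous[-1]


def distance_summary(predictions, references):
    """Return (mean, max) edit distance over paired strings."""
    if not predictions or len(predictions) != len(references):
        raise ValueError("need two non-empty lists of equal length")
    distances = [levenshtein(p, r) for p, r in zip(predictions, references)]
    return mean(distances), max(distances)


# Hypothetical model outputs and references, for illustration only.
predictions = ["двадцать пять рублей", "ай ти компания"]
references = ["двадцать пять рублей", "ай ти компании"]

mean_distance, max_distance = distance_summary(predictions, references)
print(mean_distance, max_distance)
```

Under this reading, a mean of 0 next to a maximum of 15 would mean that most evaluation examples match the reference exactly (with the mean presumably rounded), while at least one output differs from its reference by 15 characters.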