---
datasets:
- oscar
- hieronymusa/MaCoCu-dataset-250k
language:
- cs
- cr
- hr
- pl
- sl
- sk
---

# Slavic T5 Base

The aim of this model is to achieve the best possible results for Slavic languages written in Latin script. It is suitable for tasks such as:

- summarization,
- extractive question answering,
- machine translation between Slavic languages in Latin script.

The model is trained on selected parts of the OSCAR corpus and the MaCoCu corpus. It supports the following languages: Czech, Croatian, Polish, Slovak, and Slovenian.

The vocabulary has 120,000 tokens and contains capital letters.
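
Because the model targets Latin-script text and its vocabulary contains capital letters, it may help to check the script of the input and leave the casing untouched before tokenization. The following is a minimal sketch, not an official usage recipe. The helper names `is_latin_script` and `load_model` are hypothetical, the repository ID passed to `load_model` is a placeholder you must supply yourself, and loading through `AutoModelForSeq2SeqLM` assumes the checkpoint is published in a standard `transformers` seq2seq format.

```python
import unicodedata

# Languages listed in this model card (ISO 639-1 code -> name).
SUPPORTED_LANGUAGES = {
    "cs": "Czech",
    "hr": "Croatian",
    "pl": "Polish",
    "sk": "Slovak",
    "sl": "Slovenian",
}


def is_latin_script(text: str) -> bool:
    """Return True if the text has letters and all of them are Latin letters.

    Digits, punctuation, and whitespace are ignored. Casing is not changed,
    since the vocabulary contains capital letters.
    """
    letters = [ch for ch in text if ch.isalpha()]
    return bool(letters) and all(
        unicodedata.name(ch, "").startswith("LATIN") for ch in letters
    )


def load_model(model_id: str):
    """Load tokenizer and model from a repository ID (assumption: the
    checkpoint works with the generic seq2seq auto classes)."""
    # Imported lazily so the script check above works without transformers.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    return tokenizer, model


if __name__ == "__main__":
    for sample in ["Příliš žluťoučký kůň", "Zażółć gęślą jaźń", "Привет"]:
        print(sample, "->", is_latin_script(sample))
```

A check like this only filters by script. It does not detect the language, so Latin-script text in an unsupported language would still pass.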