MU-NLPC/CzeGPT-2

@@ -18,7 +18,7 @@ Along, we also provide a tokenizer (vocab and merges) with vocab size of 50257 t
 The model's perplexity on a 250 MB random slice of csTenTen17 dataset is 42.12 but is not directly comparable to any other model, since there is no competition in Czech models yet (and comparison with models for other languages is meaningless, because of different tokenization and test data).
 # Running the predictions
-The repository includes a simple Jupyter Notebook that can help with the first steps when using the model.
+The repository includes a simple Jupyter Notebook that can help with the first steps when using the model. (TODO)
 # How to cite
 @unpublished{hajek_horak2022,