sdadas committed on
Commit a273489
1 Parent(s): 464c73d

Update README.md

Files changed (1):
  1. README.md +2 -0
README.md CHANGED
@@ -49,6 +49,8 @@ Below is a summary of the Qra-1B model:
 
 In this section we compare the perplexity of Qra models on Polish texts with other Polish and English LLMs.
 
+Note that perplexity values computed over different text segmentations are not directly comparable (see the definition of perplexity below). Therefore, we can only draw conclusions from comparisons between models that use the same tokenizer, such as Qra and the original Llama / TinyLlama.
+
 ### PolEval-2018
 
 In 2018, the PolEval competition included a language modeling task, for which training and test sets totaling over 20 million Polish sentences were made available. We used the first 10k sentences from the test set to evaluate modern neural language models. To calculate the perplexity, we used a script from the [HuggingFace Evaluate](https://huggingface.co/spaces/evaluate-metric/perplexity) library.
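For context on the comparability caveat added above: perplexity is the exponential of the average negative log-likelihood per token, so its value depends on how many tokens a tokenizer produces for the same text. A standard formulation (stated here for illustration, not taken from the original README) is:

$$\mathrm{PPL}(x) = \exp\left(-\frac{1}{N}\sum_{i=1}^{N}\log p_\theta(t_i \mid t_{<i})\right)$$

where $t_1, \dots, t_N$ are the tokens into which a given tokenizer splits the text $x$. Because $N$ differs between tokenizers (an English-centric tokenizer typically splits Polish text into more tokens than a Polish-adapted one), per-token perplexities are only comparable between models that share the same tokenizer.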
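A minimal sketch of how such an evaluation can be run with the perplexity metric from the HuggingFace Evaluate library; the input path and model id below are placeholders, not the exact script or settings behind the reported numbers:

```python
# Sketch: perplexity of a causal LM on the first 10k PolEval-2018 test sentences.
# The file path and model_id are placeholders, not the exact values used here.
import evaluate

# Read one sentence per line (hypothetical path) and keep the first 10k.
with open("poleval2018_test.txt", encoding="utf-8") as f:
    sentences = [line.strip() for line in f if line.strip()][:10_000]

perplexity = evaluate.load("perplexity", module_type="metric")
results = perplexity.compute(
    model_id="OPI-PG/Qra-1b",   # any causal LM from the Hub; id shown as an example
    predictions=sentences,
    batch_size=16,
    add_start_token=True,
)
print(results["mean_perplexity"])
```

The metric scores each sentence independently and reports the mean of the per-sentence perplexities, which matches the sentence-level setup of the PolEval-2018 test set.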