Cyrile committed on
Commit
2db7df3
·
1 Parent(s): e5b1faf

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -31,7 +31,7 @@ Sans honneur que précaire, sans liberté que provisoire, [...], et de façon qu
 
  | model | GPT 3.5 | Boris | Flan-T5 | LLaMA | Dolly | MPT | Falcon | Bloomz |
  |:--------------:|:-------:|:-----:|:-------:|:-----:|:-----:|:---:|:------:|:------:|
- | tokens by word | 2.3 | 2.3 | 2 | 1.9 | 1.9 | 1.9 | 1.8 | 1.4 |
+ | tokens per word | 2.3 | 2.3 | 2 | 1.9 | 1.9 | 1.9 | 1.8 | 1.4 |
 
 
  For comparison, with a specialized French tokenizer like [CamemBERT](https://huggingface.co/camembert/camembert-base) or [DistilCamemBERT](cmarkea/distilcamembert-base), we have 1.5 tokens per word. In addition to its positive impact on inference time and resource consumption, there has already been a demonstrated direct relationship between the number of tokens per word required for modeling and the predictive performance of the model [1].
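
The "tokens per word" row in the table is a ratio that can be sketched in a few lines. The README does not say how words were counted, so the whitespace split below is an assumption, and `tokens_per_word` and `toy_tokenize` are hypothetical names used only for illustration, not code from the repository.

```python
from typing import Callable, Iterable, List


def tokens_per_word(texts: Iterable[str], tokenize: Callable[[str], List[str]]) -> float:
    """Average number of tokens produced per whitespace-separated word.

    Assumption: a "word" is any run of non-whitespace characters.
    """
    n_tokens = 0
    n_words = 0
    for text in texts:
        n_tokens += len(tokenize(text))
        n_words += len(text.split())
    if n_words == 0:
        raise ValueError("no words found in the input texts")
    return n_tokens / n_words


def toy_tokenize(text: str) -> List[str]:
    """Hypothetical stand-in tokenizer: cuts each word into chunks of at most 4 characters."""
    return [word[i:i + 4] for word in text.split() for i in range(0, len(word), 4)]


if __name__ == "__main__":
    # "tokens" -> 2 chunks, "per" -> 1, "word" -> 1: 4 tokens for 3 words.
    print(tokens_per_word(["tokens per word"], toy_tokenize))
```

To compare real tokenizers, one could pass the `tokenize` method of a Hugging Face tokenizer (for example one loaded with `AutoTokenizer.from_pretrained("camembert/camembert-base")`) in place of `toy_tokenize`. The resulting figures depend on the evaluation text and on the word-splitting rule, so they may not match the table exactly.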