zolicsaki committed
Commit 5180789 · verified · 1 Parent(s): 2352a2c

Update README.md

Files changed (1):
  1. README.md +1 -3
README.md CHANGED
@@ -52,10 +52,8 @@ All pre-training is done on the [Cultura-X](https://huggingface.co/datasets/uonl
  We extended the vocabulary of the base llama model from 32,000 tokens to 57,000 tokens by adding up to 25,000 non-overlapping tokens from the new language.
 
  ## Evaluation
- || SambaLingo-Turkish-Base | TURNA | bloom-7b1 | xglm-7.5B | mGPT-13B |
+ || SambaLingo-Bulgarian-Base | mGPT-1.3B-bulgarian | bloom-7b1 | xglm-7.5B | mGPT-13B |
  |-------------------------------|---------------------|-----------|-----------|----------|--------|
- | Perplexity (Lower Is Better) | 1.589 | 13.435 | 2.804 | 1.799 | 2.386 |
- | SambaLingo-Bulgarian-Base | mGPT-1.3B-bulgarian | bloom-7b1 | xglm-7.5B | mGPT-13B | |
  | Perplexity (Lower Is Better) | 1.416 | 1.755 | 2.051 | 1.502 | 1.889 |
  | FLORES en->bg (8 shot, CHRF) | 0.558 | 0.143 | 0.211 | 0.484 | 0.136 |
  | FLORES bg->en (8 shot, CHRF) | 0.621 | 0.227 | 0.182 | 0.347 | 0.145 |
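The README line kept by this diff says the base llama vocabulary was extended from 32,000 to 57,000 tokens by adding up to 25,000 non-overlapping tokens from the new language. The source does not show how that selection was done; below is a minimal sketch of the non-overlap-and-cap step only (function and variable names are hypothetical, not from the SambaLingo codebase), which would typically be followed by `tokenizer.add_tokens(...)` and `model.resize_token_embeddings(...)` in the transformers library.

```python
def extend_vocab(base_vocab, candidate_tokens, max_new=25_000):
    """Append candidate tokens that are not already in the base vocabulary,
    preserving candidate order and capping the number of additions at max_new.
    Hypothetical helper illustrating the selection step described in the README."""
    seen = set(base_vocab)
    added = []
    for tok in candidate_tokens:
        if tok in seen:
            continue  # skip tokens the base tokenizer already has (non-overlapping)
        seen.add(tok)
        added.append(tok)
        if len(added) == max_new:
            break  # cap at max_new additions (25,000 in the README)
    return base_vocab + added

# Toy base vocabulary standing in for the 32,000-token llama vocab.
base = [f"tok{i}" for i in range(32_000)]
# Toy candidates from the new language; two overlap with the base vocab.
candidates = ["tok5", "yeni", "kelime", "tok100", "sözcük"]
extended = extend_vocab(base, candidates)
print(len(extended))  # 32_003: only the three non-overlapping candidates were added
```

With a real candidate list of 25,000 or more novel tokens, the same call would yield the 57,000-token vocabulary the README describes.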