Update README.md
All pre-training is done on the [Cultura-X](https://huggingface.co/datasets/uonl
We extended the vocabulary of the base Llama model from 32,000 tokens to 57,000 tokens by adding up to 25,000 non-overlapping tokens from the new language.
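The selection step described above can be sketched in plain Python. Note this is an illustrative sketch, not the actual SambaLingo tokenizer-merging code: the function name, inputs, and toy vocabularies below are all hypothetical.

```python
# Sketch: extend a base vocabulary with up to `cap` tokens from a
# new-language tokenizer, keeping only tokens that do not already
# appear in the base vocabulary ("non-overlapping"). Illustrative only.

def extend_vocab(base_vocab, new_lang_tokens, cap=25_000):
    """Return the extended vocabulary: the base tokens plus up to `cap`
    new-language tokens that are not already present in the base."""
    extended = list(base_vocab)
    seen = set(base_vocab)
    for tok in new_lang_tokens:
        if len(extended) - len(base_vocab) >= cap:
            break                    # reached the 25k budget
        if tok not in seen:          # skip tokens the base already has
            extended.append(tok)
            seen.add(tok)
    return extended

# Toy example: a 4-token "base" vocab plus new-language candidates,
# two of which overlap and are therefore skipped.
base = ["<s>", "the", "при", "ния"]
candidates = ["the", "вет", "при", "мир"]
print(len(extend_vocab(base, candidates)))  # 4 base + 2 new = 6
```

In the real pipeline the new embedding rows for these tokens would also need to be initialized and the model's embedding matrix resized accordingly.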

## Evaluation Results

|                                      | SambaLingo-Russian-Base | saiga_mistral_7b_merged | rugpt3large_based_on_gpt2 | bloom-7b1 | xglm-7.5B | mGPT-13B |
|--------------------------------------|-------------------------|-------------------------|---------------------------|-----------|-----------|----------|
| Holdout Perplexity (Lower is better) | **1.444**               | 1.556                   | 1.611                     | 1.797     | 1.504     | 1.806    |
| FLORES en->ru (8 shot, CHRF)         | **0.472**               | 0.425                   | 0.319                     | 0.204     | 0.263     | 0.211    |
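For readers unfamiliar with the perplexity row: perplexity is the exponential of the mean negative log-likelihood the model assigns to the holdout tokens. The exact normalization used for the numbers above (per token vs. per byte) is not stated here; the sketch below shows the standard per-token definition.

```python
import math

# Sketch: holdout perplexity from per-token log-probabilities.
# Standard per-token definition: exp(mean negative log-likelihood).
# Lower is better (1.0 would mean the model predicts every token perfectly).

def perplexity(token_logprobs):
    """token_logprobs: natural-log probabilities the model assigned
    to each token in the holdout set."""
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

# Toy example: a model that assigns probability 0.5 to every token
# has perplexity ~2.
print(perplexity([math.log(0.5)] * 10))
```

A certain model, a perfect one, assigns probability 1 to every token (perplexity 1.0), which is why values such as 1.444 vs. 1.806 can represent a meaningful gap.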