Update README.md
## Tokenizer Details

We extended the vocabulary of the base Llama model from 32,000 tokens to 57,000 tokens by adding up to 25,000 non-overlapping tokens from the new language.
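The selection step described above can be sketched as follows. This is a minimal illustration, not the released code: the function name `extend_vocab` and the dict-based vocabulary are assumptions; the key points are that already-present tokens are skipped (non-overlapping) and that new ids are appended after the existing ones, so the base model's embeddings keep their positions.

```python
def extend_vocab(base_vocab, candidate_tokens, max_new=25_000):
    """Append candidate tokens that do not overlap the base vocabulary.

    base_vocab: dict mapping token string -> id (e.g. the 32,000-entry
    base Llama vocabulary). candidate_tokens: tokens learned on the new
    language, in priority order. Returns the extended vocabulary.
    """
    vocab = dict(base_vocab)
    added = 0
    for tok in candidate_tokens:
        if added >= max_new:
            break
        if tok not in vocab:          # keep only non-overlapping tokens
            vocab[tok] = len(vocab)   # new ids continue after the old ones
            added += 1
    return vocab

# Toy example: "и" already exists in the base vocabulary, so it is skipped
base = {"<s>": 0, "the": 1, "и": 2}
extended = extend_vocab(base, ["и", "при", "вет"], max_new=2)
```

With Hugging Face `transformers`, the equivalent effect comes from `tokenizer.add_tokens(...)` followed by `model.resize_token_embeddings(len(tokenizer))`, which preserves the original embedding rows and appends newly initialized ones.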
## Evaluation Results

|                                      | sambanovasystems/SambaLingo-Russian-Base | IlyaGusev/saiga_mistral_7b_merged | ai-forever/rugpt3large_based_on_gpt2 | bigscience/bloom-7b1 | facebook/xglm-7.5B | ai-forever/mGPT-13B |
|--------------------------------------|------------------------------------------|-----------------------------------|--------------------------------------|----------------------|--------------------|---------------------|
| Holdout perplexity (lower is better) | 1.444                                    | 1.556                             | 1.611                                | 1.797                | 1.504              | 1.806               |
| FLORES en->ru (8-shot, CHRF)         | 47.19%                                   | 42.46%                            | 31.90%                               | 20.42%               | 26.26%             | 21.12%              |
| FLORES ru->en (8-shot, CHRF)         | 58.74%                                   | 52.72%                            | 31.73%                               | 25.77%               | 42.89%             | 25.06%              |
| FLORES en->ru (8-shot, BLEU)         | 19.41%                                   | 14.50%                            | 7.36%                                | 1.15%                | 4.50%              | 2.14%               |
| FLORES ru->en (8-shot, BLEU)         | 30.05%                                   | 24.93%                            | 6.20%                                | 3.24%                | 15.18%             | 3.91%               |
| Belebele (3-shot)                    | 39.00%                                   | 34.44%                            | 24.33%                               | 29.00%               | 21.89%             | 23.67%              |
| SIB-200 (3-shot)                     | 69.12%                                   | 78.92%                            | 32.84%                               | 46.08%               | 63.73%             | 42.65%              |
| XNLI (0-shot)                        | 35.29%                                   | 49.78%                            | 45.61%                               | 42.61%               | 46.39%             | 45.39%              |
| XStoryCloze (0-shot)                 | 71.67%                                   | 68.96%                            | 60.75%                               | 52.68%               | 63.40%             | 59.43%              |
| XWinograd (0-shot)                   | 69.21%                                   | 66.67%                            | 60.63%                               | 57.14%               | 63.17%             | 60.00%              |
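For readers comparing the perplexity row: perplexity is the exponential of the average negative log-likelihood over the holdout set. The sketch below assumes per-token NLLs as input; the exact normalization behind the reported numbers (token-level vs. byte-level) is not stated in this card.

```python
import math

def perplexity(nlls):
    """Perplexity = exp(mean negative log-likelihood), natural-log base.

    nlls: per-unit negative log-likelihoods on the holdout set.
    Lower is better; 1.0 is the floor (perfect prediction).
    """
    return math.exp(sum(nlls) / len(nlls))

# A model assigning each holdout unit probability 0.5 scores exp(ln 2) = 2.0
ppl = perplexity([math.log(2)] * 4)
```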
## Uses

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->

This model is intended for commercial and research use.

### Out-of-Scope Use

<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->