Update README.md
README.md CHANGED
@@ -12,7 +12,7 @@ language:
 base_model:
 - microsoft/phi-4
 ---
-### Chocolatine-
+### Chocolatine-14B-Instruct-DPO-v1.3
 
 DPO fine-tuning of [microsoft/Phi-4](https://huggingface.co/microsoft/Phi-4) (14B params)
 using the [jpacifico/french-orca-dpo-pairs-revised](https://huggingface.co/datasets/jpacifico/french-orca-dpo-pairs-revised) RLHF dataset.
@@ -21,7 +21,7 @@ Window context = up to 16k tokens
 
 ### OpenLLM Leaderboard
 
-Chocolatine-
+Chocolatine-14B-Instruct-DPO-v1.3 is the best-performing Phi-4-based model on the [OpenLLM Leaderboard](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
 for only 1.70 kg CO2 (versus > 3 kg for other models of comparable size and performance)
 [Updated 2025-02-17]
 
@@ -38,14 +38,14 @@ for only 1.70 kg CO2 (versus > 3 kg for other models of comparable size and performance)
 
 ### MT-Bench-French
 
-Chocolatine-
+Chocolatine-14B-Instruct-DPO-v1.3 outperforms its previous Chocolatine versions and its base model Phi-4 on [MT-Bench-French](https://huggingface.co/datasets/bofenghuang/mt-bench-french), used with [multilingual-mt-bench](https://github.com/Peter-Devine/multilingual_mt_bench) and GPT-4-Turbo as LLM judge.
 
 ```
 ########## First turn ##########
                                             score
 model                               turn
 gpt-4o-mini                         1       9.2875
-Chocolatine-
+Chocolatine-14B-Instruct-DPO-v1.3   1       9.0125
 Chocolatine-14B-Instruct-DPO-v1.2   1       8.6125
 Phi-3.5-mini-instruct               1       8.5250
 Chocolatine-3B-Instruct-DPO-v1.2    1       8.3750
@@ -65,7 +65,7 @@ vigogne-2-7b-chat 1 5.6625
                                             score
 model                               turn
 gpt-4o-mini                         2     8.912500
-Chocolatine-
+Chocolatine-14B-Instruct-DPO-v1.3   2     8.762500
 Chocolatine-14B-Instruct-DPO-v1.2   2     8.337500
 phi-4                               2     8.131250
 Chocolatine-3B-Instruct-DPO-Revised 2     7.937500
@@ -85,7 +85,7 @@ vigogne-2-7b-chat 2 2.775000
                                       score
 model
 gpt-4o-mini                         9.100000
-Chocolatine-
+Chocolatine-14B-Instruct-DPO-v1.3   8.825000
 Chocolatine-14B-Instruct-DPO-v1.2   8.475000
 phi-4                               8.215625
 Chocolatine-3B-Instruct-DPO-v1.2    8.118750
@@ -106,7 +106,7 @@ vigogne-2-7b-chat 4.218750
 
 You can run this model using my [Colab notebook](https://github.com/jpacifico/Chocolatine-LLM/blob/main/Chocolatine_14B_inference_test_colab.ipynb)
 
-You can also run Chocolatine
+You can also run Chocolatine using the following code:
 
 ```python
 import transformers
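
The `import transformers` snippet above is cut off at the hunk boundary, so the card's full example is not visible here. For reference, a minimal inference sketch under stated assumptions: the repo id `jpacifico/Chocolatine-14B-Instruct-DPO-v1.3` is inferred from the model name in this card, and the chat-style `pipeline` call requires a recent transformers release.

```python
import torch
import transformers

# Assumed repo id, inferred from the model name in this card.
model_id = "jpacifico/Chocolatine-14B-Instruct-DPO-v1.3"

# Standard text-generation pipeline; bfloat16 and device_map="auto"
# are common settings for a 14B model on GPU.
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Explique le principe du fine-tuning DPO en une phrase."},
]

# Recent transformers versions accept chat messages directly and return
# the whole conversation; the last message holds the model's reply.
outputs = pipeline(messages, max_new_tokens=256)
print(outputs[0]["generated_text"][-1]["content"])
```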
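The card describes DPO fine-tuning of Phi-4 on the french-orca-dpo-pairs-revised preference dataset, but the training code is not part of this diff. Purely as an illustration of the technique, here is a minimal DPO sketch using TRL's `DPOTrainer`; it is not the author's script, and the column mapping, hyperparameters, and TRL API version are assumptions.

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

# Base model and preference dataset named in the card.
model = AutoModelForCausalLM.from_pretrained("microsoft/phi-4")
tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-4")
dataset = load_dataset("jpacifico/french-orca-dpo-pairs-revised", split="train")

# DPOTrainer expects "prompt"/"chosen"/"rejected" columns; the actual
# dataset schema may need remapping before training.
config = DPOConfig(
    output_dir="chocolatine-dpo",
    beta=0.1,                        # assumed DPO temperature, not from the card
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    num_train_epochs=1,
)
trainer = DPOTrainer(
    model=model,
    args=config,
    train_dataset=dataset,
    processing_class=tokenizer,      # TRL >= 0.12 API; older versions use tokenizer=
)
trainer.train()
```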