Update README.md
README.md CHANGED
@@ -12,7 +12,7 @@ language:
 base_model:
 - microsoft/phi-4
 ---
-### Chocolatine-
+### Chocolatine-14B-Instruct-DPO-v1.3
 
 DPO fine-tuning of [microsoft/Phi-4](https://huggingface.co/microsoft/Phi-4) (14B params)
 using the [jpacifico/french-orca-dpo-pairs-revised](https://huggingface.co/datasets/jpacifico/french-orca-dpo-pairs-revised) RLHF dataset.
@@ -21,7 +21,7 @@ Window context = up to 16k tokens
 
 ### OpenLLM Leaderboard
 
-Chocolatine-
+Chocolatine-14B-Instruct-DPO-v1.3 is the best-performing Phi-4-based model on the [OpenLLM Leaderboard](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
 for only 1.70 kg CO2 (versus > 3 kg for other models of comparable size and performance)
 [Updated 2025-02-17]
 
@@ -38,14 +38,14 @@ for only 1.70 kg CO2 (versus > 3 kg for other models of comparable size and performance)
 
 ### MT-Bench-French
 
-Chocolatine-
+Chocolatine-14B-Instruct-DPO-v1.3 outperforms its previous Chocolatine versions and its base model Phi-4 on [MT-Bench-French](https://huggingface.co/datasets/bofenghuang/mt-bench-french), used with [multilingual-mt-bench](https://github.com/Peter-Devine/multilingual_mt_bench) and GPT-4-Turbo as LLM judge.
 
 ```
 ########## First turn ##########
                                             score
 model                               turn
 gpt-4o-mini                         1       9.2875
-Chocolatine-
+Chocolatine-14B-Instruct-DPO-v1.3   1       9.0125
 Chocolatine-14B-Instruct-DPO-v1.2   1       8.6125
 Phi-3.5-mini-instruct               1       8.5250
 Chocolatine-3B-Instruct-DPO-v1.2    1       8.3750
@@ -65,7 +65,7 @@ vigogne-2-7b-chat 1 5.6625
                                             score
 model                               turn
 gpt-4o-mini                         2     8.912500
-Chocolatine-
+Chocolatine-14B-Instruct-DPO-v1.3   2     8.762500
 Chocolatine-14B-Instruct-DPO-v1.2   2     8.337500
 phi-4                               2     8.131250
 Chocolatine-3B-Instruct-DPO-Revised 2     7.937500
@@ -85,7 +85,7 @@ vigogne-2-7b-chat 2 2.775000
                                       score
 model
 gpt-4o-mini                         9.100000
-Chocolatine-
+Chocolatine-14B-Instruct-DPO-v1.3   8.825000
 Chocolatine-14B-Instruct-DPO-v1.2   8.475000
 phi-4                               8.215625
 Chocolatine-3B-Instruct-DPO-v1.2    8.118750
@@ -106,7 +106,7 @@ vigogne-2-7b-chat 4.218750
 
 You can run this model using my [Colab notebook](https://github.com/jpacifico/Chocolatine-LLM/blob/main/Chocolatine_14B_inference_test_colab.ipynb)
 
-You can also run Chocolatine
+You can also run Chocolatine using the following code:
 
 ```python
 import transformers
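
The `import transformers` snippet above is cut off at the hunk boundary, so the card's full example is not visible here. For reference, a minimal inference sketch under stated assumptions: the repo id `jpacifico/Chocolatine-14B-Instruct-DPO-v1.3` is inferred from the model name in this card, and the chat-style `pipeline` call requires a recent transformers release.

```python
import torch
import transformers

# Assumed repo id, inferred from the model name in this card.
model_id = "jpacifico/Chocolatine-14B-Instruct-DPO-v1.3"

# Standard text-generation pipeline; bfloat16 and device_map="auto"
# are common settings for a 14B model on GPU.
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Explique le principe du fine-tuning DPO en une phrase."},
]

# Recent transformers versions accept chat messages directly and return
# the whole conversation; the last message holds the model's reply.
outputs = pipeline(messages, max_new_tokens=256)
print(outputs[0]["generated_text"][-1]["content"])
```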
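The card describes DPO fine-tuning of Phi-4 on the french-orca-dpo-pairs-revised preference dataset, but the training code is not part of this diff. Purely as an illustration of the technique, here is a minimal DPO sketch using TRL's `DPOTrainer`; it is not the author's script, and the column mapping, hyperparameters, and TRL API version are assumptions.

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

# Base model and preference dataset named in the card.
model = AutoModelForCausalLM.from_pretrained("microsoft/phi-4")
tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-4")
dataset = load_dataset("jpacifico/french-orca-dpo-pairs-revised", split="train")

# DPOTrainer expects "prompt"/"chosen"/"rejected" columns; the actual
# dataset schema may need remapping before training.
config = DPOConfig(
    output_dir="chocolatine-dpo",
    beta=0.1,                        # assumed DPO temperature, not from the card
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    num_train_epochs=1,
)
trainer = DPOTrainer(
    model=model,
    args=config,
    train_dataset=dataset,
    processing_class=tokenizer,      # TRL >= 0.12 API; older versions use tokenizer=
)
trainer.train()
```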