Update README.md
Browse files
README.md
CHANGED
@@ -23,14 +23,23 @@ dataset for this model. This results in a DPO dataset composed by triplets < β
|
|
23 |
| | Egida (test) β | DELPHI β | Alert-Base β | Alert-Adv β |
|
24 |
|------------------------------|:--------------:|:--------:|:------------:|:-----------:|
|
25 |
| Meta-Llama-3.1-8B-Instruct | 0.347 | 0.160 | 0.446 | 0.039 |
|
26 |
-
| Meta-Llama-3.1-8B-Egida-DPO | 0.038 | 0.025 | 0.038 | 0.014 |
|
27 |
|
28 |
### General Purpose Performance
|
29 |
|
30 |
| | OpenLLM Leaderboard (Average) β | MMLU Generative (ROUGE1) β |
|
31 |
|------------------------------|:---------------------:|:---------------:|
|
32 |
| Meta-Llama-3.1-8B-Instruct | 0.453 | 0.646 |
|
33 |
-
| Meta-Llama-3.1-8B-Egida-DPO | 0.453 | 0.643 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
34 |
|
35 |
## Training Details
|
36 |
|
|
|
23 |
| | Egida (test) β | DELPHI β | Alert-Base β | Alert-Adv β |
|
24 |
|------------------------------|:--------------:|:--------:|:------------:|:-----------:|
|
25 |
| Meta-Llama-3.1-8B-Instruct | 0.347 | 0.160 | 0.446 | 0.039 |
|
26 |
+
| Meta-Llama-3.1-8B-Instruct-Egida-DPO | 0.038 | 0.025 | 0.038 | 0.014 |
|
27 |
|
28 |
### General Purpose Performance
|
29 |
|
30 |
| | OpenLLM Leaderboard (Average) β | MMLU Generative (ROUGE1) β |
|
31 |
|------------------------------|:---------------------:|:---------------:|
|
32 |
| Meta-Llama-3.1-8B-Instruct | 0.453 | 0.646 |
|
33 |
+
| Meta-Llama-3.1-8B-Instruct-Egida-DPO | 0.453 | 0.643 |
|
34 |
+
|
35 |
+
### Refusal Ratio
|
36 |
+
|
37 |
+
| | OR Bench 80K (refusal) β | OR Bench Hard (refusal) β |
|
38 |
+
|------------------------------|:---------------------:|:---------------:|
|
39 |
+
| Meta-Llama-3.1-8B-Instruct | 0.035 | 0.324 |
|
40 |
+
| Meta-Llama-3.1-8B-Instruct-Egida-DPO | 0.037 | 0.319 |
|
41 |
+
|
42 |
+
Note that this refusal ratio is computed as keyword matching with a curated list of kewords. For more information, check the paper.
|
43 |
|
44 |
## Training Details
|
45 |
|