Safetensors
English
llama
danihinjos commited on
Commit
f408278
Β·
verified Β·
1 Parent(s): b6bd0c6

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +11 -2
README.md CHANGED
@@ -23,14 +23,23 @@ dataset for this model. This results in a DPO dataset composed by triplets < ”
23
  | | Egida (test) ↓ | DELPHI ↓ | Alert-Base ↓ | Alert-Adv ↓ |
24
  |------------------------------|:--------------:|:--------:|:------------:|:-----------:|
25
  | Meta-Llama-3.1-8B-Instruct | 0.347 | 0.160 | 0.446 | 0.039 |
26
- | Meta-Llama-3.1-8B-Egida-DPO | 0.038 | 0.025 | 0.038 | 0.014 |
27
 
28
  ### General Purpose Performance
29
 
30
  | | OpenLLM Leaderboard (Average) ↑ | MMLU Generative (ROUGE1) ↑ |
31
  |------------------------------|:---------------------:|:---------------:|
32
  | Meta-Llama-3.1-8B-Instruct | 0.453 | 0.646 |
33
- | Meta-Llama-3.1-8B-Egida-DPO | 0.453 | 0.643 |
 
 
 
 
 
 
 
 
 
34
 
35
  ## Training Details
36
 
 
23
  | | Egida (test) ↓ | DELPHI ↓ | Alert-Base ↓ | Alert-Adv ↓ |
24
  |------------------------------|:--------------:|:--------:|:------------:|:-----------:|
25
  | Meta-Llama-3.1-8B-Instruct | 0.347 | 0.160 | 0.446 | 0.039 |
26
+ | Meta-Llama-3.1-8B-Instruct-Egida-DPO | 0.038 | 0.025 | 0.038 | 0.014 |
27
 
28
  ### General Purpose Performance
29
 
30
  | | OpenLLM Leaderboard (Average) ↑ | MMLU Generative (ROUGE1) ↑ |
31
  |------------------------------|:---------------------:|:---------------:|
32
  | Meta-Llama-3.1-8B-Instruct | 0.453 | 0.646 |
33
+ | Meta-Llama-3.1-8B-Instruct-Egida-DPO | 0.453 | 0.643 |
34
+
35
+ ### Refusal Ratio
36
+
37
+ | | OR Bench 80K (refusal) ↓ | OR Bench Hard (refusal) ↓ |
38
+ |------------------------------|:---------------------:|:---------------:|
39
+ | Meta-Llama-3.1-8B-Instruct | 0.035 | 0.324 |
40
+ | Meta-Llama-3.1-8B-Instruct-Egida-DPO | 0.037 | 0.319 |
41
+
42
+ Note that this refusal ratio is computed as keyword matching with a curated list of kewords. For more information, check the paper.
43
 
44
  ## Training Details
45