lbourdois committed
Commit 1192f82 · verified · 1 Parent(s): dd850f5

Improve language tag

Hi! As the model is multilingual, this PR adds languages other than English to the language tag to improve referencing. Note that 29 languages are announced in the README, but only 13 are explicitly listed, so I was only able to add those 13.
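
Concretely, the change expands the `language:` field of the README's YAML front matter from English alone to the 13 explicitly listed languages, using their ISO 639-3 codes (exactly as in the diff below):

```yaml
language:
- zho
- eng
- fra
- spa
- por
- deu
- ita
- rus
- jpn
- kor
- vie
- tha
- ara
```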

Files changed (1)
  1. README.md +128 -116
README.md CHANGED
@@ -1,116 +1,128 @@
- ---
- library_name: transformers
- license: apache-2.0
- license_link: https://huggingface.co/huihui-ai/Qwen2.5-7B-Instruct-abliterated-v3/blob/main/LICENSE
- language:
- - en
- pipeline_tag: text-generation
- base_model: Qwen/Qwen2.5-7B-Instruct
- tags:
- - chat
- - abliterated
- - uncensored
- ---
-
- # huihui-ai/Qwen2.5-7B-Instruct-abliterated-v3
-
-
- This is an uncensored version of [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) created with abliteration (see [remove-refusals-with-transformers](https://github.com/Sumandora/remove-refusals-with-transformers) for details).
- This is a crude, proof-of-concept implementation for removing refusals from an LLM without using TransformerLens.
- The test results are not very good, but there is much less [garbled text](https://huggingface.co/huihui-ai/Qwen2.5-7B-Instruct-abliterated-v2/discussions/2) than in the previous version.
-
- ## Ollama
-
- You can use [huihui_ai/qwen2.5-abliterate](https://ollama.com/huihui_ai/qwen2.5-abliterate) directly:
- ```
- ollama run huihui_ai/qwen2.5-abliterate
- ```
-
- ## Usage
- You can use this model in your applications by loading it with Hugging Face's `transformers` library:
-
-
- ```python
- from transformers import AutoModelForCausalLM, AutoTokenizer
-
- # Load the model and tokenizer
- model_name = "huihui-ai/Qwen2.5-7B-Instruct-abliterated-v3"
- model = AutoModelForCausalLM.from_pretrained(
-     model_name,
-     torch_dtype="auto",
-     device_map="auto"
- )
- tokenizer = AutoTokenizer.from_pretrained(model_name)
-
- # Initialize conversation context
- initial_messages = [
-     {"role": "system", "content": "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."}
- ]
- messages = initial_messages.copy()  # Copy the initial conversation context
-
- # Enter conversation loop
- while True:
-     # Get user input
-     user_input = input("User: ").strip()  # Strip leading and trailing spaces
-
-     # If the user types '/exit', end the conversation
-     if user_input.lower() == "/exit":
-         print("Exiting chat.")
-         break
-
-     # If the user types '/clean', reset the conversation context
-     if user_input.lower() == "/clean":
-         messages = initial_messages.copy()  # Reset conversation context
-         print("Chat history cleared. Starting a new conversation.")
-         continue
-
-     # If input is empty, prompt the user and continue
-     if not user_input:
-         print("Input cannot be empty. Please enter something.")
-         continue
-
-     # Add user input to the conversation
-     messages.append({"role": "user", "content": user_input})
-
-     # Build the chat template
-     text = tokenizer.apply_chat_template(
-         messages,
-         tokenize=False,
-         add_generation_prompt=True
-     )
-
-     # Tokenize input and prepare it for the model
-     model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
-
-     # Generate a response from the model
-     generated_ids = model.generate(
-         **model_inputs,
-         max_new_tokens=8192
-     )
-
-     # Extract model output, removing special tokens
-     generated_ids = [
-         output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
-     ]
-     response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
-
-     # Add the model's response to the conversation
-     messages.append({"role": "assistant", "content": response})
-
-     # Print the model's response
-     print(f"Qwen: {response}")
-
- ```
-
- ## Evaluations
- The following results have been re-evaluated and are reported as the average for each test.
-
- | Benchmark  | Qwen2.5-7B-Instruct | Qwen2.5-7B-Instruct-abliterated-v3 | Qwen2.5-7B-Instruct-abliterated-v2 | Qwen2.5-7B-Instruct-abliterated |
- |------------|---------------------|------------------------------------|------------------------------------|---------------------------------|
- | IF_Eval    | 76.44               | 72.64                              | **77.82**                          | 76.49                           |
- | MMLU Pro   | **43.12**           | 39.14                              | 42.03                              | 41.71                           |
- | TruthfulQA | 62.46               | 57.27                              | 57.81                              | **64.92**                       |
- | BBH        | **53.92**           | 50.67                              | 53.01                              | 52.77                           |
- | GPQA       | 31.91               | 31.65                              | **32.17**                          | 31.97                           |
-
- The evaluation script can be found in this repository at /eval.sh, or click [here](https://huggingface.co/huihui-ai/Qwen2.5-7B-Instruct-abliterated-v3/blob/main/eval.sh).
+ ---
+ library_name: transformers
+ license: apache-2.0
+ license_link: https://huggingface.co/huihui-ai/Qwen2.5-7B-Instruct-abliterated-v3/blob/main/LICENSE
+ language:
+ - zho
+ - eng
+ - fra
+ - spa
+ - por
+ - deu
+ - ita
+ - rus
+ - jpn
+ - kor
+ - vie
+ - tha
+ - ara
+ pipeline_tag: text-generation
+ base_model: Qwen/Qwen2.5-7B-Instruct
+ tags:
+ - chat
+ - abliterated
+ - uncensored
+ ---
+
+ # huihui-ai/Qwen2.5-7B-Instruct-abliterated-v3
+
+
+ This is an uncensored version of [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) created with abliteration (see [remove-refusals-with-transformers](https://github.com/Sumandora/remove-refusals-with-transformers) for details).
+ This is a crude, proof-of-concept implementation for removing refusals from an LLM without using TransformerLens.
+ The test results are not very good, but there is much less [garbled text](https://huggingface.co/huihui-ai/Qwen2.5-7B-Instruct-abliterated-v2/discussions/2) than in the previous version.
+
+ ## Ollama
+
+ You can use [huihui_ai/qwen2.5-abliterate](https://ollama.com/huihui_ai/qwen2.5-abliterate) directly:
+ ```
+ ollama run huihui_ai/qwen2.5-abliterate
+ ```
+
+ ## Usage
+ You can use this model in your applications by loading it with Hugging Face's `transformers` library:
+
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ # Load the model and tokenizer
+ model_name = "huihui-ai/Qwen2.5-7B-Instruct-abliterated-v3"
+ model = AutoModelForCausalLM.from_pretrained(
+     model_name,
+     torch_dtype="auto",
+     device_map="auto"
+ )
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+
+ # Initialize conversation context
+ initial_messages = [
+     {"role": "system", "content": "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."}
+ ]
+ messages = initial_messages.copy()  # Copy the initial conversation context
+
+ # Enter conversation loop
+ while True:
+     # Get user input
+     user_input = input("User: ").strip()  # Strip leading and trailing spaces
+
+     # If the user types '/exit', end the conversation
+     if user_input.lower() == "/exit":
+         print("Exiting chat.")
+         break
+
+     # If the user types '/clean', reset the conversation context
+     if user_input.lower() == "/clean":
+         messages = initial_messages.copy()  # Reset conversation context
+         print("Chat history cleared. Starting a new conversation.")
+         continue
+
+     # If input is empty, prompt the user and continue
+     if not user_input:
+         print("Input cannot be empty. Please enter something.")
+         continue
+
+     # Add user input to the conversation
+     messages.append({"role": "user", "content": user_input})
+
+     # Build the chat template
+     text = tokenizer.apply_chat_template(
+         messages,
+         tokenize=False,
+         add_generation_prompt=True
+     )
+
+     # Tokenize input and prepare it for the model
+     model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
+
+     # Generate a response from the model
+     generated_ids = model.generate(
+         **model_inputs,
+         max_new_tokens=8192
+     )
+
+     # Extract model output, removing special tokens
+     generated_ids = [
+         output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
+     ]
+     response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
+
+     # Add the model's response to the conversation
+     messages.append({"role": "assistant", "content": response})
+
+     # Print the model's response
+     print(f"Qwen: {response}")
+
+ ```
+
+ ## Evaluations
+ The following results have been re-evaluated and are reported as the average for each test.
+
+ | Benchmark  | Qwen2.5-7B-Instruct | Qwen2.5-7B-Instruct-abliterated-v3 | Qwen2.5-7B-Instruct-abliterated-v2 | Qwen2.5-7B-Instruct-abliterated |
+ |------------|---------------------|------------------------------------|------------------------------------|---------------------------------|
+ | IF_Eval    | 76.44               | 72.64                              | **77.82**                          | 76.49                           |
+ | MMLU Pro   | **43.12**           | 39.14                              | 42.03                              | 41.71                           |
+ | TruthfulQA | 62.46               | 57.27                              | 57.81                              | **64.92**                       |
+ | BBH        | **53.92**           | 50.67                              | 53.01                              | 52.77                           |
+ | GPQA       | 31.91               | 31.65                              | **32.17**                          | 31.97                           |
+
+ The evaluation script can be found in this repository at /eval.sh, or click [here](https://huggingface.co/huihui-ai/Qwen2.5-7B-Instruct-abliterated-v3/blob/main/eval.sh).