Add paper abstract to model card

#2
Opened by nielsr (HF Staff)
Files changed (1)
  1. README.md +10 -5
README.md CHANGED
````diff
@@ -1,15 +1,15 @@
 ---
-license: mit
-license_link: https://huggingface.co/microsoft/bitnet-b1.58-2B-4T/blob/main/LICENSE
 language:
 - en
+library_name: transformers
+license: mit
+license_link: https://huggingface.co/microsoft/bitnet-b1.58-2B-4T/blob/main/LICENSE
 pipeline_tag: text-generation
 tags:
 - chat
 - bitnet
 - text-generation
 - large-language-model
-library_name: transformers
 ---
 
 # BitNet b1.58 2B4T - Scaling Native 1-bit LLM
@@ -22,6 +22,10 @@ Trained on a corpus of 4 trillion tokens, this model demonstrates that native 1-
 
 ➡️ **Official Inference Code:** [microsoft/BitNet (bitnet.cpp)](https://github.com/microsoft/BitNet)
 
+# Paper abstract
+
+The abstract of the paper is the following:
+
 ## Model Variants
 
 Several versions of the model weights are available on Hugging Face:
@@ -98,7 +102,8 @@ chat_input = tokenizer(prompt, return_tensors="pt").to(model.device)
 # Generate response
 chat_outputs = model.generate(**chat_input, max_new_tokens=50)
 response = tokenizer.decode(chat_outputs[0][chat_input['input_ids'].shape[-1]:], skip_special_tokens=True) # Decode only the response part
-print("\nAssistant Response:", response)
+print("
+Assistant Response:", response)
 ```
 
 ## How to Use (with `bitnet.cpp`)
@@ -141,4 +146,4 @@ BitNet b1.58 2B4T was evaluated against leading open-weight full-precision LLMs
 The model weights and code are released under the [MIT License](https://huggingface.co/microsoft/bitnet-b1.58-2B-4T/blob/main/LICENSE).
 
 ## Disclaimer
-This model is intended for research and development purposes. While efforts have been made to align it using SFT and DPO, it may still produce outputs that are unexpected, biased, or inaccurate. Please use responsibly.
+This model is intended for research and development purposes. While efforts have been made to align it using SFT and DPO, it may still produce outputs that are unexpected, biased, or inaccurate. Please use responsibly.
````
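The generation snippet touched by the third hunk relies on slicing the output tensor at the prompt length, `chat_outputs[0][chat_input['input_ids'].shape[-1]:]`, so that only the newly generated tokens are decoded rather than the prompt being echoed back. A minimal sketch of that indexing idea, using made-up token IDs in place of real tensors (no model or tokenizer involved):

```python
# Hypothetical token IDs for illustration only; with transformers these
# would be tensors, but the slicing logic is the same.
prompt_ids = [101, 2054, 2003, 102]         # stands in for chat_input['input_ids'][0]
generated = prompt_ids + [2023, 2003, 102]  # generate() returns prompt + new tokens
response_ids = generated[len(prompt_ids):]  # keep only the newly generated part
print(response_ids)  # [2023, 2003, 102]
```

Passing `response_ids` (rather than the full `generated` sequence) to `tokenizer.decode` is what keeps the printed assistant response free of the prompt text.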