---
license: apache-2.0
widget:
- text: "生活的真谛是[MASK]。"
---
# Zhouwenwang-Unified-1.3B model (Chinese), one model of [Fengshenbang-LM](https://github.com/IDEA-CCNL/Fengshenbang-LM)

Zhouwenwang-Unified-1.3B applies a new unified structure and was jointly developed by IDEA-CCNL and Zhuiyi Technology. During pre-training, the model treats the LM (Language Model) and MLM (Masked Language Model) tasks uniformly and adds rotary position coding, so that it can both generate and understand text. Zhouwenwang-Unified-1.3B is the largest model for LM and MLM tasks in the Chinese field. It will continue to be optimized in the directions of model scale, knowledge integration, and supervised-task assistance.
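The rotary position coding mentioned above can be illustrated with a short sketch. This is a generic, minimal pure-Python rendering of the standard RoPE idea, not the exact Fengshenbang-LM implementation; the function name and pairing scheme are illustrative:

```python
import math

def rope(vec, pos, base=10000.0):
    """Rotary position embedding sketch: rotate consecutive feature pairs.

    Each pair (x1, x2) is rotated by the angle pos * base**(-i / d), so the
    dot product between a rotated query and key depends only on their
    relative positions. len(vec) must be even.
    """
    d = len(vec)
    out = []
    for i in range(0, d, 2):
        theta = pos * base ** (-i / d)
        x1, x2 = vec[i], vec[i + 1]
        out.append(x1 * math.cos(theta) - x2 * math.sin(theta))
        out.append(x1 * math.sin(theta) + x2 * math.cos(theta))
    return out
```

Because each step is a pure rotation, vector norms are preserved, and position 0 leaves the vector unchanged.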
## Usage

The Zhouwenwang-Unified-1.3B architecture is not yet available in [Transformers](https://github.com/huggingface/transformers), so run the following to obtain it from [Fengshenbang-LM](https://github.com/IDEA-CCNL/Fengshenbang-LM):

```shell
git clone https://github.com/IDEA-CCNL/Fengshenbang-LM.git
```

```python
from fengshen import RoFormerModel
from fengshen import RoFormerConfig
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("IDEA-CCNL/Zhouwenwang-Unified-1.3B")
config = RoFormerConfig.from_pretrained("IDEA-CCNL/Zhouwenwang-Unified-1.3B")
model = RoFormerModel.from_pretrained("IDEA-CCNL/Zhouwenwang-Unified-1.3B")
```
You can then use the model to continue a prompt, for example:

```python
from fengshen import RoFormerModel
from transformers import AutoTokenizer
import torch
import numpy as np

sentence = '清华大学位于'
max_length = 32

tokenizer = AutoTokenizer.from_pretrained("IDEA-CCNL/Zhouwenwang-Unified-1.3B")
model = RoFormerModel.from_pretrained("IDEA-CCNL/Zhouwenwang-Unified-1.3B")

for i in range(max_length):
    encode = torch.tensor(
        ...)  # remainder of the generation loop elided
print(sentence)
```
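The generation loop above follows a simple greedy scheme: at each step the model's most likely next token is appended until a full stop or the length limit. A minimal, model-free sketch of that loop, where `next_token` is a hypothetical stand-in for the model forward pass plus argmax:

```python
def greedy_generate(next_token, sentence, max_length=32, stop="。"):
    """Greedy decoding sketch: append the most likely next token each step.

    `next_token` stands in for the model forward pass plus argmax; it maps
    the current text to one new token (a string). Decoding stops when the
    stop token appears or after max_length steps.
    """
    for _ in range(max_length):
        token = next_token(sentence)
        sentence += token
        if token == stop:
            break
    return sentence

# Toy stand-in that emits a fixed continuation one character at a time.
stream = iter("北京。")
print(greedy_generate(lambda s: next(stream), "清华大学位于"))  # 清华大学位于北京。
```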
| Model | afqmc | tnews | iflytek | ocnli | cmnli | wsc | csl |
| :--------: | :-----: | :----: | :-----: | :----: | :----: | :----: | :----: |
| roberta-wwm-ext-large | 0.7514 | 0.5872 | 0.6152 | 0.7770 | 0.8140 | 0.8914 | 0.8600 |
| Zhouwenwang-Unified-1.3B | 0.7463 | 0.6036 | 0.6288 | 0.7654 | 0.7741 | 0.8849 | 0.8777 |
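For a rough overall comparison, the macro-average of the seven scores can be computed directly from the table:

```python
scores = {
    "roberta-wwm-ext-large": [0.7514, 0.5872, 0.6152, 0.7770, 0.8140, 0.8914, 0.8600],
    "Zhouwenwang-Unified-1.3B": [0.7463, 0.6036, 0.6288, 0.7654, 0.7741, 0.8849, 0.8777],
}
for name, vals in scores.items():
    # Macro-average: unweighted mean over the seven CLUE tasks.
    print(f"{name}: {sum(vals) / len(vals):.4f}")
```

The two models are close on average, with Zhouwenwang-Unified-1.3B ahead on tnews, iflytek, and csl.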

## Citation

If you find this resource useful, please cite the following website in your paper.