Transformers
PyTorch
Chinese
megatron-bert
suolyer committed on
Commit 9cfe461 · 1 Parent(s): ca8ec1b

Update README.md

Files changed (1)
  1. README.md +9 -9
README.md CHANGED
@@ -5,11 +5,11 @@ license: apache-2.0
 widget:
 - text: "生活的真谛是[MASK]。"
 ---
-# Zhouwenwang-1.3B model (Chinese),one model of [Fengshenbang-LM](https://github.com/IDEA-CCNL/Fengshenbang-LM).
-Zhouwenwang-1.3B apply a new unified structure, and jointly developed by the IDEA-CCNL and Zhuiyi Technology. In the pre-training, the model considers LM (Language Model) and MLM (Mask Language Model) tasks uniformly, and adds rotational position coding, so that the model has the ability to generate and understand. Zhouwenwang-1.3B is the largest model for LM and MLM tasks in the Chinese field. It will continue to be optimized in the direction of model scale, knowledge integration, and supervision task assistance.
+# Zhouwenwang-Unified-1.3B model (Chinese), one model of [Fengshenbang-LM](https://github.com/IDEA-CCNL/Fengshenbang-LM).
+Zhouwenwang-Unified-1.3B applies a new unified structure and was jointly developed by IDEA-CCNL and Zhuiyi Technology. During pre-training, the model treats the LM (Language Model) and MLM (Masked Language Model) tasks uniformly and adds rotary position embeddings, so that it can both generate and understand text. Zhouwenwang-Unified-1.3B is the largest Chinese model for joint LM and MLM tasks. It will continue to be optimized along the directions of model scale, knowledge integration, and auxiliary supervised tasks.
 
 ## Usage
-There is no structure of Zhouwenwang-1.3B in [Transformers](https://github.com/huggingface/transformers), you can run follow code to get structure of Zhouwenwang-1.3B from [Fengshenbang-LM](https://github.com/IDEA-CCNL/Fengshenbang-LM)
+The Zhouwenwang-Unified-1.3B architecture is not included in [Transformers](https://github.com/huggingface/transformers); run the following code to obtain it from [Fengshenbang-LM](https://github.com/IDEA-CCNL/Fengshenbang-LM):
 
 ```shell
 git clone https://github.com/IDEA-CCNL/Fengshenbang-LM.git
@@ -21,9 +21,9 @@ from fengshen import RoFormerModel
 from fengshen import RoFormerConfig
 from transformers import BertTokenizer
 
-tokenizer = BertTokenizer.from_pretrained("IDEA-CCNL/Zhouwenwang-1.3B")
-config = RoFormerConfig.from_pretrained("IDEA-CCNL/Zhouwenwang-1.3B")
-model = RoFormerModel.from_pretrained("IDEA-CCNL/Zhouwenwang-1.3B")
+tokenizer = BertTokenizer.from_pretrained("IDEA-CCNL/Zhouwenwang-Unified-1.3B")
+config = RoFormerConfig.from_pretrained("IDEA-CCNL/Zhouwenwang-Unified-1.3B")
+model = RoFormerModel.from_pretrained("IDEA-CCNL/Zhouwenwang-Unified-1.3B")
 
 
 ```
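For readers of this diff, the loading code that the hunk above edits can be assembled into one self-contained snippet. A minimal sketch, assuming the Fengshenbang-LM checkout from the `git clone` step is made importable via `sys.path` (that setup step is an assumption, not shown in the diff):

```python
import sys

# Assumption: point Python at the cloned repo so `fengshen` is importable;
# adjust the path to wherever you cloned Fengshenbang-LM.
sys.path.append("./Fengshenbang-LM")

from fengshen import RoFormerConfig, RoFormerModel
from transformers import BertTokenizer

# Model ID and classes are taken from the README hunks above.
tokenizer = BertTokenizer.from_pretrained("IDEA-CCNL/Zhouwenwang-Unified-1.3B")
config = RoFormerConfig.from_pretrained("IDEA-CCNL/Zhouwenwang-Unified-1.3B")
model = RoFormerModel.from_pretrained("IDEA-CCNL/Zhouwenwang-Unified-1.3B")
```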
@@ -39,8 +39,8 @@ import numpy as np
 sentence = '清华大学位于'
 max_length = 32
 
-tokenizer = AutoTokenizer.from_pretrained("IDEA-CCNL/Zhouwenwang-1.3B")
-model = RoFormerModel.from_pretrained("IDEA-CCNL/Zhouwenwang-1.3B")
+tokenizer = AutoTokenizer.from_pretrained("IDEA-CCNL/Zhouwenwang-Unified-1.3B")
+model = RoFormerModel.from_pretrained("IDEA-CCNL/Zhouwenwang-Unified-1.3B")
 
 for i in range(max_length):
     encode = torch.tensor(
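The hunk above is cut off mid-statement at `encode = torch.tensor(`. As a reading aid, here is a hedged sketch of how such a greedy decoding loop can be completed; projecting logits through the tied input word embeddings and stopping at a full stop are assumptions, not taken from this diff:

```python
import numpy as np
import torch

# Continues tokenizer, model, sentence, and max_length from the hunk above.
for i in range(max_length):
    # Re-encode the sentence grown so far: [CLS] plus its tokens, no [SEP].
    encode = torch.tensor(
        [[tokenizer.cls_token_id]
         + tokenizer.encode(sentence, add_special_tokens=False)]).long()
    hidden = model(encode)[0]
    # Assumption: RoFormerModel exposes no LM head, so project hidden states
    # onto the vocabulary via the input word-embedding matrix.
    logits = torch.nn.functional.linear(
        hidden, model.embeddings.word_embeddings.weight)
    probs = torch.softmax(logits, dim=-1)[0].detach().numpy()
    # Greedy decoding: append the most probable next token.
    sentence += tokenizer.decode(int(np.argmax(probs[-1])))
    if sentence.endswith('。'):  # assumed stop: first Chinese full stop
        break
print(sentence)
```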
@@ -61,7 +61,7 @@ print(sentence)
 | Model | afqmc | tnews | iflytek | ocnli | cmnli | wsc | csl |
 | :--------: | :-----: | :----: | :-----: | :----: | :----: | :----: | :----: |
 | roberta-wwm-ext-large | 0.7514 | 0.5872 | 0.6152 | 0.777 | 0.814 | 0.8914 | 0.86 |
-| Zhouwenwang-1.3B | 0.7463 | 0.6036 | 0.6288 | 0.7654 | 0.7741 | 0.8849 | 0. 8777 |
+| Zhouwenwang-Unified-1.3B | 0.7463 | 0.6036 | 0.6288 | 0.7654 | 0.7741 | 0.8849 | 0.8777 |
 
 ## Citation
 If you find this resource useful, please cite the following website in your paper.
 