ryan-u committed on
Commit cf03053 · verified · 1 Parent(s): 485fee4

Upload folder using huggingface_hub

.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ assets/performance/flops-vs-mmlu.jpg filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,760 @@
+ ---
+ language:
+ - en
+ - ko
+ library_name: transformers
+ license: cc-by-nc-4.0
+ pipeline_tag: text-generation
+ model_id: kakaocorp/kanana-nano-2.1b-embedding
+ repo: kakaocorp/kanana-nano-2.1b-embedding
+ developers: Kanana LLM
+ training_regime: bf16 mixed precision
+ ---
+
+ # Kanana
+ <p align="center">
+ <br>
+ <picture>
+ <img src="./assets/logo/kanana-logo.png" width="60%" style="margin: 40px auto;">
+ </picture>
+ </p>
+ <p align="center"> 🤗 <a href="https://huggingface.co/collections/kakaocorp/kanana-nano-21b-67a326cda1c449c8d4172259">Models</a> &nbsp; | &nbsp; 📕 <a href="https://tech.kakao.com/posts/689"> Blog </a> &nbsp; | &nbsp; 📜 <a> Technical Report </a> | &nbsp; 💻 <a href="https://github.com/kakao/kanana"> Github </a></p>
+
+ <br>
+
+ <br>
+
+ ## Introduction
+
+ We introduce Kanana, a series of bilingual language models (developed by [Kakao](https://github.com/kakao)) that demonstrate exceptional performance in Korean and competitive performance in English. The computational cost of Kanana is significantly lower than that of state-of-the-art models of similar size. The report details the techniques employed during pre-training to achieve compute-efficient yet competitive models, including high-quality data filtering, staged pre-training, depth up-scaling, and pruning and distillation. Furthermore, the report outlines the methodologies utilized during the post-training of the Kanana models, encompassing supervised fine-tuning and preference optimization, aimed at enhancing their capability for seamless interaction with users. Lastly, the report elaborates on plausible approaches used for language model adaptation to specific scenarios, such as embedding, function calling, and Retrieval Augmented Generation (RAG). The Kanana model series spans from 2.1B to 32.5B parameters, with the 2.1B models (base, instruct, embedding, function call, and RAG) publicly released to promote research on Korean language models.
+
+ > [!Note]
+ > Neither the pre-training nor the post-training data includes Kakao user data.
+
+ <p align="center">
+ <picture>
+ <img src="assets/performance/flops-vs-mmlu.jpg" width="700" style="margin: 40px auto;">
+ </picture>
+ </p>
+
+ <br>
+
+ ## Table of Contents
+
+ - [News](#news)
+ - [Performance](#performance)
+ - [Quickstart](#quickstart)
+ - [License](#license)
+ - [Citation](#citation)
+ - [Contributors](#contributors)
+ - [Contact](#contact)
+
+ <br>
+
+ ## News
+
+ - 📜`2025/02/27`: Released [Technical Report]() and 🤗[HF model weights](https://huggingface.co/collections/kakaocorp/kanana-nano-21b-67a326cda1c449c8d4172259).
+ - 📕`2025/01/10`: Published a blog post about the development of the `Kanana-Nano` model. ([Kanana-Nano](https://tech.kakao.com/posts/682))
+ - 📕`2024/11/14`: Published blog posts about the development of the `Kanana` models. ([Kanana LLM: Pre-training](https://tech.kakao.com/posts/661), [Kanana LLM: Post-training](https://tech.kakao.com/posts/662))
+ - ▶️`2024/11/06`: Published a presentation video about the development of the `Kanana` models. ([if(kakaoAI)2024](https://youtu.be/HTBl142x9GI?si=o_we6t9suYK8DfX3))
+
+ <br>
+
+ ## Performance
+
+ Below is a partial report on the performance of the `Kanana` model series. Please refer to the [Technical Report]() for the full results.
+
+ ### Pre-trained Model Performance
+
+ | Models | MMLU | KMMLU | HAERAE | HumanEval | MBPP | GSM8K |
+ | --- | :---: | :---: | :---: | :---: | :---: | :---: |
+ | **27b+ scale** | | | | | | |
+ | Kanana-Flag-32.5b | 77.68 | 62.10 | **90.47** | **51.22** | 63.40 | 70.05 |
+ | Qwen2.5-32b | **83.10** | **63.15** | 75.16 | 50.00 | 73.40 | **82.41** |
+ | Gemma-2-27b | 75.45 | 51.16 | 69.11 | **51.22** | 64.60 | 74.37 |
+ | EXAONE-3.5-32b | 72.68 | 46.36 | 82.22 | - | - | - |
+ | Aya-Expanse-32b | 74.52 | 49.57 | 80.66 | - | - | - |
+ | **7b+ scale** | | | | | | |
+ | Kanana-Essence-9.8b | 67.61 | 50.57 | **84.98** | 40.24 | 53.60 | 63.61 |
+ | Llama-3.1-8b | 65.18 | 41.02 | 61.78 | 35.37 | 48.60 | 50.87 |
+ | Qwen2.5-7b | **74.19** | **51.68** | 67.46 | **56.71** | **63.20** | **83.85** |
+ | Gemma-2-9b | 70.34 | 48.18 | 66.18 | 37.20 | 53.60 | 68.16 |
+ | EXAONE-3.5-7.8b | 65.36 | 45.30 | 77.54 | - | - | - |
+ | Aya-Expanse-8b | 62.52 | 40.11 | 71.95 | - | - | - |
+ | **2b+ scale** | | | | | | |
+ | Kanana-Nano-2.1b | 54.83 | 44.80 | **77.09** | 31.10 | 46.20 | 46.32 |
+ | Llama-3.2-3b | 56.40 | 35.57 | 47.66 | 25.61 | 39.00 | 27.37 |
+ | Qwen2.5-3b | **65.57** | **45.28** | 61.32 | **37.80** | **55.60** | **69.07** |
+ | Gemma-2-2b | 52.89 | 30.67 | 45.55 | 20.12 | 28.20 | 24.72 |
+ | EXAONE-3.5-2.4b | 59.27 | 43.58 | 69.65 | - | - | - |
+ | **70b+ scale** | | | | | | |
+ | Llama-3.1-70b | 78.93 | 53.00 | 76.35 | 57.32 | 66.60 | 81.73 |
+ | Qwen2.5-72b | 86.12 | 68.57 | 80.84 | 55.49 | 76.40 | 92.04 |
+
+ <br>
+
+ ### Post-trained Model Performance
+
+ #### Instruction-following Benchmarks
+
+ | Models | MT-Bench | LogicKor | KoMT-Bench | WildBench | IFEval |
+ | --- | :---: | :---: | :---: | :---: | :---: |
+ | **27b+ scale** | | | | | |
+ | Kanana-Flag-32.5b | 8.356 | **9.524** | **8.058** | 54.14 | **0.856** |
+ | Qwen2.5-32b | 8.331 | 8.988 | 7.847 | 51.13 | 0.822 |
+ | Gemma-2-27b | 8.088 | 8.869 | 7.373 | 46.46 | 0.817 |
+ | EXAONE-3.5-32b | **8.375** | 9.202 | 7.907 | **54.30** | 0.845 |
+ | Aya-Expanse-32b | 7.788 | 8.941 | 7.626 | 48.36 | 0.735 |
+ | **7b+ scale** | | | | | |
+ | Kanana-Essence-9.8b | 7.769 | 8.964 | 7.706 | 47.27 | 0.799 |
+ | Llama-3.1-8b | 7.500 | 6.512 | 5.336 | 33.20 | 0.772 |
+ | Qwen2.5-7b | 7.625 | 7.952 | 6.808 | 41.31 | 0.760 |
+ | Gemma-2-9b | 7.633 | 8.643 | 7.029 | 40.92 | 0.750 |
+ | EXAONE-3.5-7.8b | **8.213** | **9.357** | **8.013** | **50.98** | **0.826** |
+ | Aya-Expanse-8b | 7.131 | 8.357 | 7.006 | 38.50 | 0.645 |
+ | **2b+ scale** | | | | | |
+ | Kanana-Nano-2.1b | 6.400 | 7.964 | 5.857 | 25.41 | 0.720 |
+ | Llama-3.2-3b | 7.050 | 4.452 | 3.967 | 21.91 | 0.767 |
+ | Qwen2.5-3b | 6.969 | 6.488 | 5.274 | 25.76 | 0.355 |
+ | Gemma-2-2b | 7.225 | 5.917 | 4.835 | 28.71 | 0.428 |
+ | EXAONE-3.5-2.4b | **7.919** | **8.941** | **7.223** | **41.68** | **0.790** |
+ | **70b+ scale** | | | | | |
+ | Llama-3.1-70b | 8.275 | 8.250 | 6.970 | 46.50 | 0.875 |
+ | Qwen2.5-72b | 8.619 | 9.214 | 8.281 | 55.25 | 0.861 |
+
+ <br>
+
+ #### General Benchmarks
+
+ | Models | MMLU | KMMLU | HAE-RAE | HumanEval+ | MBPP+ | GSM8K | MATH |
+ | --- | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
+ | **27b+ scale** | | | | | | | |
+ | Kanana-Flag-32.5b | 81.08 | **64.19** | **68.18** | 77.44 | 69.84 | 90.83 | 57.82 |
+ | Qwen2.5-32b | **84.40** | 59.37 | 48.30 | **82.32** | **71.96** | **95.30** | **81.90** |
+ | Gemma-2-27b | 78.01 | 49.98 | 46.02 | 70.12 | 70.90 | 91.05 | 53.80 |
+ | EXAONE-3.5-32b | 78.30 | 55.44 | 52.27 | 78.66 | 70.90 | 93.56 | 76.80 |
+ | Aya-Expanse-32b | 74.49 | 42.35 | 51.14 | 64.63 | 65.61 | 75.06 | 42.82 |
+ | **7b+ scale** | | | | | | | |
+ | Kanana-Essence-9.8b | 70.64 | 50.76 | **47.16** | 72.56 | 69.05 | 84.91 | 42.24 |
+ | Llama-3.1-8b | 71.18 | 39.24 | 40.91 | 60.98 | 57.67 | 82.71 | 49.86 |
+ | Qwen2.5-7b | **77.23** | 46.87 | 37.50 | 73.78 | **70.63** | **91.58** | **75.22** |
+ | Gemma-2-9b | 73.47 | 44.47 | 39.77 | 59.76 | 64.55 | 87.72 | 48.10 |
+ | EXAONE-3.5-7.8b | 72.62 | **52.09** | 46.02 | **79.27** | 66.67 | 89.99 | 73.50 |
+ | Aya-Expanse-8b | 61.23 | 35.78 | 39.20 | 42.68 | 56.88 | 78.85 | 30.80 |
+ | **2b+ scale** | | | | | | | |
+ | Kanana-Nano-2.1b | 52.48 | **38.51** | **33.52** | 63.41 | 62.43 | 72.32 | 29.26 |
+ | Llama-3.2-3b | 56.09 | 3.07 | 17.05 | 56.71 | 50.26 | 66.57 | 38.18 |
+ | Qwen2.5-3b | **69.18** | 38.33 | 32.39 | 67.68 | **64.02** | **84.00** | **65.72** |
+ | Gemma-2-2b | 57.69 | 6.99 | 7.95 | 35.37 | 45.24 | 49.81 | 21.68 |
+ | EXAONE-3.5-2.4b | 63.19 | 14.27 | 14.20 | **70.73** | 59.79 | 83.78 | 64.04 |
+ | **70b+ scale** | | | | | | | |
+ | Llama-3.1-70b | 83.48 | 39.08 | 53.41 | 75.61 | 66.40 | 91.66 | 63.98 |
+ | Qwen2.5-72b | 87.14 | 65.78 | 60.80 | 81.10 | 75.66 | 95.45 | 82.60 |
+
+ <br>
+
+ ### Embedding Model Performance
+
+ | Backbone | Kanana-Nano-2.1b | Llama-3.2-3b | Qwen2.5-3b | Llama-3.2-1b | Qwen-2.5-1.5b |
+ | --- | :---: | :---: | :---: | :---: | :---: |
+ | English | 51.56 | 53.28 | **54.00** | 48.77 | 50.60 |
+ | Korean | **65.00** | 59.43 | 62.10 | 54.68 | 54.60 |
+ | Avg. | **58.28** | 56.35 | 58.05 | 51.73 | 52.60 |
+
+ <br>
+
+ ## Quickstart
+
+ ### 🤗 HuggingFace Transformers
+
+ - `transformers>=4.45.0` or the latest version is required to run the `Kanana` models.
+ ```bash
+ pip install "transformers>=4.45.0"
+ ```
+
+ #### Example Usage for `kanana-nano-2.1b-embedding`
+
+ > [!Note]
+ > You need to install `datasets` via `pip install datasets` before using the `kanana-nano-2.1b-embedding` model.
+
+ ```python
+ import torch.nn.functional as F
+ from transformers import AutoModel
+
+ instruction = "Given a question, retrieve passages that answer the question"
+ queries = [
+     "are judo throws allowed in wrestling?",
+     "how to become a radiology technician in michigan?",
+ ]
+
+ passages = [
+     "Since you're reading this, you are probably someone from a judo background or someone who is just wondering how judo techniques can be applied under wrestling rules. So without further ado, let's get to the question. Are Judo throws allowed in wrestling? Yes, judo throws are allowed in freestyle and folkstyle wrestling. You only need to be careful to follow the slam rules when executing judo throws. In wrestling, a slam is lifting and returning an opponent to the mat with unnecessary force.",
+     "Below are the basic steps to becoming a radiologic technologist in Michigan:Earn a high school diploma. As with most careers in health care, a high school education is the first step to finding entry-level employment. Taking classes in math and science, such as anatomy, biology, chemistry, physiology, and physics, can help prepare students for their college studies and future careers.Earn an associate degree. Entry-level radiologic positions typically require at least an Associate of Applied Science. Before enrolling in one of these degree programs, students should make sure it has been properly accredited by the Joint Review Committee on Education in Radiologic Technology (JRCERT).Get licensed or certified in the state of Michigan.",
+ ]
+
+ model = AutoModel.from_pretrained(
+     "kakaocorp/kanana-nano-2.1b-embedding",
+     trust_remote_code=True,
+ ).to("cuda")
+
+ max_length = 512
+ query_embeddings = model.encode(queries, instruction=instruction, max_length=max_length)
+ passage_embeddings = model.encode(passages, instruction="", max_length=max_length)
+
+ # get the embeddings with a DataLoader (splitting the datasets into multiple mini-batches)
+ # batch_size = 2
+ # query_embeddings = model._do_encode(queries, batch_size=batch_size, instruction=instruction, max_length=max_length)
+ # passage_embeddings = model._do_encode(passages, batch_size=batch_size, instruction="", max_length=max_length)
+
+ query_embeddings = F.normalize(query_embeddings, p=2, dim=1)
+ passage_embeddings = F.normalize(passage_embeddings, p=2, dim=1)
+
+ scores = (query_embeddings @ passage_embeddings.T) * 100
+ print(scores.tolist())
+
+ # Output:
+ # [[84.36527252197266, 31.752296447753906], [35.940425872802734, 81.82719421386719]]
+ ```
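+
+ The scores above are scaled cosine similarities, so each row can be read directly as a per-query ranking over the passages. A minimal retrieval sketch reusing the `queries` and `scores` variables from the example above (the printout format is illustrative, not part of the released API):
+
+ ```python
+ # Each row of `scores` holds one query's similarity to every passage,
+ # so the argmax along dim=1 picks the best-matching passage index.
+ best = scores.argmax(dim=1)
+ for i, query in enumerate(queries):
+     idx = best[i].item()
+     print(f"{query!r} -> passage {idx} (score {scores[i, idx].item():.2f})")
+ ```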
+
+ <br>
+
+ ## License
+
+ The `Kanana` models are licensed under [CC-BY-NC-4.0](https://spdx.org/licenses/CC-BY-NC-4.0).
+
+ <br>
+
+ ## Citation
+
+ ```
+ @article{kanana,
+   title={Kanana: Compute-efficient Bilingual Language Models},
+   author={Kanana LLM Team},
+   journal={TBD},
+   year={2025}
+ }
+ ```
+
+ <br>
+
+ ## Contributors
+ - Pre-training: Yunju Bak, Doohae Jung, Boseop Kim, Nayeon Kim, Hojin Lee, Jaesun Park, Minho Ryu
+ - Post-training: Jiyeon Ham, Seungjae Jung, Hyunho Kim, Hyunwoong Ko, Changmin Lee, Daniel Wontae Nam, Kyoung-Woon On
+ - Adaptation: Seulye Baeg, Junrae Cho, Taegyeong Eo, Sunghee Jung, Jieun Kang, EungGyun Kim, Eunhwa Kim, Byeongil Ko, Daniel Lee, Donghun Lee, Minchul Lee, Miok Lee, Shinbok Lee, Minho Ryu, Gaeun Seo
+
+ <br>
+
+ ## Contact
+ - Kanana LLM Team Technical Support: [email protected]
+ - Business & Partnership Contact: [email protected]
assets/logo/kanana-logo.png ADDED
assets/performance/flops-vs-mmlu.jpg ADDED

Git LFS Details

  • SHA256: 72eb65fd674025c294d4abeb4f47dec5953ba93014abd32eb02d896126d0cb5e
  • Pointer size: 131 Bytes
  • Size of remote file: 333 kB
config.json ADDED
@@ -0,0 +1,34 @@
+ {
+   "architectures": [
+     "Kanana2VecModel"
+   ],
+   "auto_map": {
+     "AutoConfig": "configuration_kanana2vec.Kanana2VecConfig",
+     "AutoModel": "modeling_kanana2vec.Kanana2VecModel"
+   },
+   "attention_bias": false,
+   "attention_dropout": 0.0,
+   "bos_token_id": 128000,
+   "eos_token_id": 128001,
+   "head_dim": 128,
+   "hidden_act": "silu",
+   "hidden_size": 1792,
+   "initializer_range": 0.02,
+   "intermediate_size": 8064,
+   "max_position_embeddings": 8192,
+   "mlp_bias": false,
+   "model_type": "kanana2vec",
+   "num_attention_heads": 24,
+   "num_hidden_layers": 32,
+   "num_key_value_heads": 8,
+   "pad_token_id": 128001,
+   "pretraining_tp": 1,
+   "rms_norm_eps": 1e-05,
+   "rope_scaling": null,
+   "rope_theta": 500000.0,
+   "tie_word_embeddings": true,
+   "torch_dtype": "bfloat16",
+   "transformers_version": "4.49.0.dev0",
+   "use_cache": true,
+   "vocab_size": 128256
+ }
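
The `auto_map` block is what lets `AutoConfig`/`AutoModel` resolve the custom classes shipped in this repository. A minimal sketch of inspecting the config before downloading the full weights (assumes Hub access; expected values are taken from the JSON above):

```python
from transformers import AutoConfig

# trust_remote_code is required because Kanana2VecConfig lives in this
# repository rather than in the transformers library itself.
config = AutoConfig.from_pretrained(
    "kakaocorp/kanana-nano-2.1b-embedding",
    trust_remote_code=True,
)
print(config.model_type)   # kanana2vec
print(config.hidden_size)  # 1792
# head_dim is set explicitly to 128; the usual hidden_size // num_heads
# fallback would give 1792 // 24 = 74, so the explicit value matters here.
print(config.head_dim)     # 128
```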
configuration_kanana2vec.py ADDED
@@ -0,0 +1,194 @@
+ # coding=utf-8
+ # Copyright 2024 Kakao Corp. team. All rights reserved.
+ #
+ # Licensed under the Apache License, Version 2.0 (the "License");
+ # you may not use this file except in compliance with the License.
+ # You may obtain a copy of the License at
+ #
+ #     http://www.apache.org/licenses/LICENSE-2.0
+ #
+ # Unless required by applicable law or agreed to in writing, software
+ # distributed under the License is distributed on an "AS IS" BASIS,
+ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ # See the License for the specific language governing permissions and
+ # limitations under the License.
+ """Kanana2Vec model configuration"""
+
+ from transformers import AutoConfig
+ from transformers.models.llama import LlamaConfig
+ from transformers.configuration_utils import PretrainedConfig
+ from transformers.modeling_rope_utils import rope_config_validation
+
+
+ class Kanana2VecConfig(PretrainedConfig):
+     r"""
+     This is the configuration class to store the configuration of a [`Kanana2VecModel`]. It is used to instantiate an LLaMA
+     model according to the specified arguments, defining the model architecture. Instantiating a configuration with the
+     defaults will yield a similar configuration to that of the LLaMA-7B.
+
+     Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
+     documentation from [`PretrainedConfig`] for more information.
+
+
+     Args:
+         vocab_size (`int`, *optional*, defaults to 32000):
+             Vocabulary size of the LLaMA model. Defines the number of different tokens that can be represented by the
+             `inputs_ids` passed when calling [`Kanana2VecModel`]
+         hidden_size (`int`, *optional*, defaults to 4096):
+             Dimension of the hidden representations.
+         intermediate_size (`int`, *optional*, defaults to 11008):
+             Dimension of the MLP representations.
+         num_hidden_layers (`int`, *optional*, defaults to 32):
+             Number of hidden layers in the Transformer decoder.
+         num_attention_heads (`int`, *optional*, defaults to 32):
+             Number of attention heads for each attention layer in the Transformer decoder.
+         num_key_value_heads (`int`, *optional*):
+             This is the number of key_value heads that should be used to implement Grouped Query Attention. If
+             `num_key_value_heads=num_attention_heads`, the model will use Multi Head Attention (MHA), if
+             `num_key_value_heads=1` the model will use Multi Query Attention (MQA) otherwise GQA is used. When
+             converting a multi-head checkpoint to a GQA checkpoint, each group key and value head should be constructed
+             by meanpooling all the original heads within that group. For more details checkout [this
+             paper](https://arxiv.org/pdf/2305.13245.pdf). If it is not specified, will default to
+             `num_attention_heads`.
+         hidden_act (`str` or `function`, *optional*, defaults to `"silu"`):
+             The non-linear activation function (function or string) in the decoder.
+         max_position_embeddings (`int`, *optional*, defaults to 2048):
+             The maximum sequence length that this model might ever be used with. Llama 1 supports up to 2048 tokens,
+             Llama 2 up to 4096, CodeLlama up to 16384.
+         initializer_range (`float`, *optional*, defaults to 0.02):
+             The standard deviation of the truncated_normal_initializer for initializing all weight matrices.
+         rms_norm_eps (`float`, *optional*, defaults to 1e-06):
+             The epsilon used by the rms normalization layers.
+         use_cache (`bool`, *optional*, defaults to `True`):
+             Whether or not the model should return the last key/values attentions (not used by all models). Only
+             relevant if `config.is_decoder=True`.
+         pad_token_id (`int`, *optional*):
+             Padding token id.
+         bos_token_id (`int`, *optional*, defaults to 1):
+             Beginning of stream token id.
+         eos_token_id (`int`, *optional*, defaults to 2):
+             End of stream token id.
+         pretraining_tp (`int`, *optional*, defaults to 1):
+             Experimental feature. Tensor parallelism rank used during pretraining. Please refer to [this
+             document](https://huggingface.co/docs/transformers/main/perf_train_gpu_many#tensor-parallelism) to
+             understand more about it. This value is necessary to ensure exact reproducibility of the pretraining
+             results. Please refer to [this issue](https://github.com/pytorch/pytorch/issues/76232).
+         tie_word_embeddings (`bool`, *optional*, defaults to `False`):
+             Whether to tie weight embeddings
+         rope_theta (`float`, *optional*, defaults to 10000.0):
+             The base period of the RoPE embeddings.
+         rope_scaling (`Dict`, *optional*):
+             Dictionary containing the scaling configuration for the RoPE embeddings. NOTE: if you apply new rope type
+             and you expect the model to work on longer `max_position_embeddings`, we recommend you to update this value
+             accordingly.
+             Expected contents:
+                 `rope_type` (`str`):
+                     The sub-variant of RoPE to use. Can be one of ['default', 'linear', 'dynamic', 'yarn', 'longrope',
+                     'llama3'], with 'default' being the original RoPE implementation.
+                 `factor` (`float`, *optional*):
+                     Used with all rope types except 'default'. The scaling factor to apply to the RoPE embeddings. In
+                     most scaling types, a `factor` of x will enable the model to handle sequences of length x *
+                     original maximum pre-trained length.
+                 `original_max_position_embeddings` (`int`, *optional*):
+                     Used with 'dynamic', 'longrope' and 'llama3'. The original max position embeddings used during
+                     pretraining.
+                 `attention_factor` (`float`, *optional*):
+                     Used with 'yarn' and 'longrope'. The scaling factor to be applied on the attention
+                     computation. If unspecified, it defaults to value recommended by the implementation, using the
+                     `factor` field to infer the suggested value.
+                 `beta_fast` (`float`, *optional*):
+                     Only used with 'yarn'. Parameter to set the boundary for extrapolation (only) in the linear
+                     ramp function. If unspecified, it defaults to 32.
+                 `beta_slow` (`float`, *optional*):
+                     Only used with 'yarn'. Parameter to set the boundary for interpolation (only) in the linear
+                     ramp function. If unspecified, it defaults to 1.
+                 `short_factor` (`List[float]`, *optional*):
+                     Only used with 'longrope'. The scaling factor to be applied to short contexts (<
+                     `original_max_position_embeddings`). Must be a list of numbers with the same length as the hidden
+                     size divided by the number of attention heads divided by 2
+                 `long_factor` (`List[float]`, *optional*):
+                     Only used with 'longrope'. The scaling factor to be applied to long contexts (<
+                     `original_max_position_embeddings`). Must be a list of numbers with the same length as the hidden
+                     size divided by the number of attention heads divided by 2
+                 `low_freq_factor` (`float`, *optional*):
+                     Only used with 'llama3'. Scaling factor applied to low frequency components of the RoPE
+                 `high_freq_factor` (`float`, *optional*):
+                     Only used with 'llama3'. Scaling factor applied to high frequency components of the RoPE
+         attention_bias (`bool`, *optional*, defaults to `False`):
+             Whether to use a bias in the query, key, value and output projection layers during self-attention.
+         attention_dropout (`float`, *optional*, defaults to 0.0):
+             The dropout ratio for the attention probabilities.
+         mlp_bias (`bool`, *optional*, defaults to `False`):
+             Whether to use a bias in up_proj, down_proj and gate_proj layers in the MLP layers.
+         head_dim (`int`, *optional*):
+             The attention head dimension. If None, it will default to hidden_size // num_heads"""
+
+     model_type = "kanana2vec"
+     keys_to_ignore_at_inference = ["past_key_values"]
+
+     def __init__(
+         self,
+         vocab_size=32000,
+         hidden_size=4096,
+         intermediate_size=11008,
+         num_hidden_layers=32,
+         num_attention_heads=32,
+         num_key_value_heads=None,
+         hidden_act="silu",
+         max_position_embeddings=2048,
+         initializer_range=0.02,
+         rms_norm_eps=1e-6,
+         use_cache=True,
+         pad_token_id=None,
+         bos_token_id=1,
+         eos_token_id=2,
+         pretraining_tp=1,
+         tie_word_embeddings=False,
+         rope_theta=10000.0,
+         rope_scaling=None,
+         attention_bias=False,
+         attention_dropout=0.0,
+         mlp_bias=False,
+         head_dim=None,
+         **kwargs,
+     ):
+         self.vocab_size = vocab_size
+         self.max_position_embeddings = max_position_embeddings
+         self.hidden_size = hidden_size
+         self.intermediate_size = intermediate_size
+         self.num_hidden_layers = num_hidden_layers
+         self.num_attention_heads = num_attention_heads
+
+         # for backward compatibility
+         if num_key_value_heads is None:
+             num_key_value_heads = num_attention_heads
+
+         self.num_key_value_heads = num_key_value_heads
+         self.hidden_act = hidden_act
+         self.initializer_range = initializer_range
+         self.rms_norm_eps = rms_norm_eps
+         self.pretraining_tp = pretraining_tp
+         self.use_cache = use_cache
+         self.rope_theta = rope_theta
+         self.rope_scaling = rope_scaling
+         self.attention_bias = attention_bias
+         self.attention_dropout = attention_dropout
+         self.mlp_bias = mlp_bias
+         self.head_dim = head_dim if head_dim is not None else self.hidden_size // self.num_attention_heads
+         # Validate the correctness of rotary position embeddings parameters
+         # BC: if there is a 'type' field, copy it to 'rope_type'.
+         if self.rope_scaling is not None and "type" in self.rope_scaling:
+             self.rope_scaling["rope_type"] = self.rope_scaling["type"]
+         rope_config_validation(self)
+
+         super().__init__(
+             pad_token_id=pad_token_id,
+             bos_token_id=bos_token_id,
+             eos_token_id=eos_token_id,
+             tie_word_embeddings=tie_word_embeddings,
+             **kwargs,
+         )
+
+
+ class BiLlamaConfig(LlamaConfig):
+     model_type = "billama"
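
For reference, a hedged sketch of how this configuration class maps onto the repository's `config.json` — instantiating it directly with the shipped values (assumes the file above is importable as a local module; values are copied from `config.json`):

```python
from configuration_kanana2vec import Kanana2VecConfig

# Values mirror config.json in this repo; anything omitted falls back to
# the LLaMA-style defaults defined in __init__ above.
config = Kanana2VecConfig(
    vocab_size=128256,
    hidden_size=1792,
    intermediate_size=8064,
    num_hidden_layers=32,
    num_attention_heads=24,
    num_key_value_heads=8,  # GQA: 24 query heads share 8 key/value heads
    max_position_embeddings=8192,
    rope_theta=500000.0,
    head_dim=128,
    tie_word_embeddings=True,
    bos_token_id=128000,
    eos_token_id=128001,
    pad_token_id=128001,
)
assert config.model_type == "kanana2vec"
```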
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:dca0866bb825bbc0b9c40201153f80d57da4c6e55b01f1a927487ce0f4217ba9
+ size 4173990512
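
The remote file size is consistent with a ~2.1B-parameter checkpoint stored in bf16 (the exact parameter count is inferred from the model name, not stated in this commit):

```python
# Sanity check: bf16 stores 2 bytes per parameter, so
# 4,173,990,512 bytes / 2 ≈ 2.09e9 parameters — matching "2.1b".
size_bytes = 4_173_990_512
print(f"{size_bytes / 2 / 1e9:.2f}B parameters")  # 2.09B parameters
```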
modeling_kanana2vec.py ADDED
@@ -0,0 +1,249 @@
+ from dataclasses import dataclass
+ from typing import List, Union, Dict, Mapping, TypedDict
+ from functools import partial
+ from tqdm.auto import tqdm
+ import numpy as np
+
+ from datasets import Dataset
+ import torch
+ from torch.utils.data import DataLoader
+ from transformers import (
+     AutoTokenizer,
+     BatchEncoding,
+     DataCollatorWithPadding,
+     PreTrainedModel,
+     PreTrainedTokenizerFast,
+ )
+ from transformers.models.llama.modeling_llama import (
+     LlamaConfig,
+     LlamaModel,
+     LLAMA_INPUTS_DOCSTRING,
+ )
+ from transformers.utils import (
+     logging,
+     ModelOutput,
+ )
+ from .configuration_kanana2vec import Kanana2VecConfig
+
+
+ logger = logging.get_logger(__name__)
+
+ def _move_to_device(maybe_tensor, device: torch.device):
+     if torch.is_tensor(maybe_tensor):
+         return maybe_tensor.to(device, non_blocking=device.type == "cuda")
+     elif isinstance(maybe_tensor, dict):
+         return {key: _move_to_device(value, device) for key, value in maybe_tensor.items()}
+     elif isinstance(maybe_tensor, list):
+         return [_move_to_device(x, device) for x in maybe_tensor]
+     elif isinstance(maybe_tensor, tuple):
+         return tuple([_move_to_device(x, device) for x in maybe_tensor])
+     elif isinstance(maybe_tensor, Mapping):
+         return type(maybe_tensor)({k: _move_to_device(v, device) for k, v in maybe_tensor.items()})
+     else:
+         return maybe_tensor
+
+ def move_to_device(sample, device: torch.device):
+     if device.type == "cpu":
+         return sample
+
+     if len(sample) == 0:
+         return {}
+     return _move_to_device(sample, device)
+
+ def input_transform_func(
+     tokenizer: PreTrainedTokenizerFast,
+     examples: Dict[str, List],
+     max_length: int,
+     instruction: str,
+ ) -> BatchEncoding:
+     if len(instruction) > 0:
+         examples['text'] = [f"{instruction.strip()} {text.strip()}" for text in examples['text']]
+     else:
+         examples['text'] = [f"{text.strip()}" for text in examples['text']]
+
+     batch_dict = tokenizer(
+         examples['text'],
+         max_length=max_length,
+         padding=True,
+         return_token_type_ids=False,
+         return_tensors="pt",
+         truncation=True)
+     return batch_dict
+
+ def format_instruction(instruction: str):
+     return f"Instruct: {instruction}\nQuery:" if len(instruction) > 0 else ""
+
+
+ class Kanana2VecFeatures(TypedDict):
+     input_ids: torch.Tensor
+     attention_mask: torch.Tensor
+     pool_mask: torch.Tensor
+
+ @dataclass
+ class EmbeddingModelOutput(ModelOutput):
+     embedding: torch.FloatTensor = None
+
+
+ class BiLlamaModel(LlamaModel):
+     config_class = Kanana2VecConfig
+
+     def __init__(self, config: LlamaConfig):
+         super().__init__(config)
+         for layer in self.layers:
+             layer.self_attn.is_causal = False
+
+     @staticmethod
+     def _prepare_4d_causal_attention_mask_with_cache_position(
+         attention_mask: torch.Tensor,
+         sequence_length: int,
+         target_length: int,
+         dtype: torch.dtype,
+         device: torch.device,
+         cache_position: torch.Tensor,
+         batch_size: int,
+         **kwargs,
+     ):
+         """
+         Creates a non-causal (bidirectional) 4D mask of shape `(batch_size, 1, query_length, key_value_length)` from a
+         2D mask of shape `(batch_size, key_value_length)`, or if the input `attention_mask` is already 4D, do nothing.
+         Only padding positions are masked; no causal masking is applied.
+
+         Args:
+             attention_mask (`torch.Tensor`):
+                 A 2D attention mask of shape `(batch_size, key_value_length)` or a 4D attention mask of shape
+                 `(batch_size, 1, query_length, key_value_length)`.
+             sequence_length (`int`):
+                 The sequence length being processed.
+             target_length (`int`):
+                 The target length: when generating with static cache, the mask should be as long as the static cache,
+                 to account for the 0 padding, the part of the cache that is not filled yet.
+             dtype (`torch.dtype`):
+                 The dtype to use for the 4D attention mask.
+             device (`torch.device`):
+                 The device to place the 4D attention mask on.
+             cache_position (`torch.Tensor`):
+                 Indices depicting the position of the input sequence tokens in the sequence.
+             batch_size (`torch.Tensor`):
+                 Batch size.
+         """
+         if attention_mask is not None and attention_mask.dim() == 4:
+             # In this case we assume that the mask comes already in inverted form and requires no inversion or slicing.
+             causal_mask = attention_mask
+         else:
+             min_dtype = torch.finfo(dtype).min
+             causal_mask = torch.zeros(
+                 (sequence_length, target_length), dtype=dtype, device=device
+             )
+             causal_mask = causal_mask[None, None, :, :].expand(batch_size, 1, -1, -1)
+             if attention_mask is not None:
+                 causal_mask = causal_mask.clone()  # copy to contiguous memory for in-place edit
+                 mask_length = attention_mask.shape[-1]
+                 padding_mask = causal_mask[:, :, :, :mask_length] + attention_mask[:, None, None, :]
+                 padding_mask = padding_mask == 0
+                 causal_mask[:, :, :, :mask_length] = causal_mask[:, :, :, :mask_length].masked_fill(
+                     padding_mask, min_dtype
+                 )
+         return causal_mask
+
+ class Kanana2VecModel(PreTrainedModel):
+     config_class = Kanana2VecConfig
+     base_model_prefix = "model"
+     supports_gradient_checkpointing = True
+     _no_split_modules = ["LlamaDecoderLayer"]
+     _skip_keys_device_placement = ["past_key_values"]
+     _supports_flash_attn_2 = True
+     _supports_sdpa = True
+     _supports_cache_class = True
+     _supports_quantized_cache = True
+     _supports_static_cache = True
+
+     def __init__(self, config: Kanana2VecConfig):
+         super().__init__(config)
+         self.model = BiLlamaModel(config)
+         self.tokenizer = AutoTokenizer.from_pretrained(config._name_or_path, trust_remote_code=True)
+         self.add_pad_token()
+
+     def add_pad_token(self):
+         self.tokenizer.pad_token = self.tokenizer.eos_token
+
+     def prepare_kwargs_from_batch(self, batch_dict: dict, instruction_lens: int, device: torch.device):
+         batch_dict = move_to_device(batch_dict, device)
+         attention_mask = batch_dict['attention_mask'].clone()
+         attention_mask[:, :instruction_lens] = 0
+         features: Kanana2VecFeatures = {
+             'input_ids': batch_dict['input_ids'],
+             'attention_mask': batch_dict['attention_mask'],
+             'pool_mask': attention_mask,
+         }
+         return features
+
+     def forward(
+         self,
+         input_ids: torch.Tensor,
+         attention_mask: torch.Tensor,
+         pool_mask: torch.Tensor,
+         return_dict: bool = True,
+         **kwargs,
+     ):
+         last_hidden_states = self.model(
+             input_ids=input_ids,
+             attention_mask=attention_mask,
+         ).last_hidden_state
+         pool_mask = pool_mask.to(last_hidden_states.device)
+         s = torch.sum(last_hidden_states * pool_mask.unsqueeze(-1).float(), dim=1)
+         d = pool_mask.sum(dim=1, keepdim=True).float()
+         embedding = s / d
+         if not return_dict:
+             return (embedding,)
+         return EmbeddingModelOutput(embedding=embedding)
+
+     @torch.no_grad()
+     def _do_encode(
+         self,
+         sentences: List[str],
+         batch_size: int = 1,
+         instruction: str = "",
+         max_length: int = 512,
+         num_workers: int = 0,
+         **kwargs,
+     ) -> Union[np.ndarray, torch.FloatTensor]:
+         dataset: Dataset = Dataset.from_dict({'text': sentences})
+         instruction = format_instruction(instruction)
+         dataset.set_transform(partial(input_transform_func,
+                                       self.tokenizer,
+                                       max_length=max_length,
+                                       instruction=instruction))
+
+         data_collator = DataCollatorWithPadding(self.tokenizer)
+         data_loader = DataLoader(
+             dataset,
+             batch_size=batch_size,
+             shuffle=False,
+             drop_last=False,
+             num_workers=num_workers,
+             collate_fn=data_collator,
+             pin_memory=True,
+         )
+         instruction_lens = len(self.tokenizer.encode(instruction)) if len(instruction) > 0 else 0
+
+         encoded_embeds = []
+         for batch_dict in tqdm(data_loader, desc='encoding', mininterval=10):
+             features = self.prepare_kwargs_from_batch(batch_dict, instruction_lens, device=self.device)
+             embeds = self(**features).embedding
+             encoded_embeds.append(embeds)
+         encoded_embeds = torch.cat(encoded_embeds, dim=0)
+         if "return_numpy" in kwargs and kwargs.get("return_numpy"):
+             encoded_embeds = encoded_embeds.cpu().detach().numpy()
+         return encoded_embeds
+
+     @torch.no_grad()
+     def encode(self, sentences: List[str], instruction: str = "", max_length: int = 512, **kwargs):
+         instruction = format_instruction(instruction)
+         instruction_lens = len(self.tokenizer.encode(instruction)) if len(instruction) > 0 else 0
+
+         batch_dict = input_transform_func(
+             self.tokenizer,
+             {'text': sentences},
+             max_length=max_length,
+             instruction=instruction,
+         )
+         features: Kanana2VecFeatures = self.prepare_kwargs_from_batch(batch_dict, instruction_lens, device=self.device)
+         return self.forward(**features).embedding
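
`forward` mean-pools the last hidden states over `pool_mask`, which `prepare_kwargs_from_batch` zeroes for instruction tokens — so the instruction conditions the encoder but is excluded from the pooled embedding. A toy sketch of the same pooling arithmetic (shapes and values are illustrative only):

```python
import torch

# Toy batch: 1 sequence, 4 tokens, hidden size 2.
last_hidden_states = torch.tensor([[[1.0, 2.0],
                                    [3.0, 4.0],
                                    [5.0, 6.0],
                                    [7.0, 8.0]]])
# First two tokens play the role of instruction tokens:
# attended to by the model, but excluded from pooling.
pool_mask = torch.tensor([[0.0, 0.0, 1.0, 1.0]])

s = torch.sum(last_hidden_states * pool_mask.unsqueeze(-1), dim=1)  # sum of kept tokens
d = pool_mask.sum(dim=1, keepdim=True)                              # count of kept tokens
print(s / d)  # tensor([[6., 7.]]) — the mean of the last two token vectors
```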
special_tokens_map.json ADDED
@@ -0,0 +1,23 @@
+ {
+   "bos_token": {
+     "content": "<|begin_of_text|>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eos_token": {
+     "content": "<|eot_id|>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "<|end_of_text|>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
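
Note that `eos_token` here is `<|eot_id|>` while `config.json` declares `eos_token_id` 128001 (`<|end_of_text|>`), and `Kanana2VecModel.add_pad_token` additionally overrides the pad token with the tokenizer's EOS at load time. A small sketch to see how the tokens resolve in practice (assumes Hub access; printed values follow the map above):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "kakaocorp/kanana-nano-2.1b-embedding",
    trust_remote_code=True,
)
# Special tokens as declared in special_tokens_map.json:
print(tokenizer.bos_token)  # <|begin_of_text|>
print(tokenizer.eos_token)  # <|eot_id|>
print(tokenizer.pad_token)  # <|end_of_text|> (Kanana2VecModel later sets pad = eos)
```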
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,2063 @@
+ {
+   "added_tokens_decoder": {
+     "128000": {
+       "content": "<|begin_of_text|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "128001": {
+       "content": "<|end_of_text|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "128002": {
+       "content": "<|reserved_special_token_0|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "128003": {
+       "content": "<|reserved_special_token_1|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "128004": {
+       "content": "<|reserved_special_token_2|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "128005": {
+       "content": "<|reserved_special_token_3|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "128006": {
+       "content": "<|start_header_id|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "128007": {
+       "content": "<|end_header_id|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "128008": {
+       "content": "<|reserved_special_token_4|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "128009": {
+       "content": "<|eot_id|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "128010": {
+       "content": "<|reserved_special_token_5|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "128011": {
+       "content": "<|reserved_special_token_6|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "128012": {
+       "content": "<|reserved_special_token_7|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "128013": {
+       "content": "<|reserved_special_token_8|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "128014": {
+       "content": "<|reserved_special_token_9|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "128015": {
+       "content": "<|reserved_special_token_10|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "128016": {
+       "content": "<|reserved_special_token_11|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "128017": {
+       "content": "<|reserved_special_token_12|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "128018": {
+       "content": "<|reserved_special_token_13|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "128019": {
+       "content": "<|reserved_special_token_14|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "128020": {
+       "content": "<|reserved_special_token_15|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "128021": {
+       "content": "<|reserved_special_token_16|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "128022": {
+       "content": "<|reserved_special_token_17|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "128023": {
+       "content": "<|reserved_special_token_18|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "128024": {
+       "content": "<|reserved_special_token_19|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "128025": {
+       "content": "<|reserved_special_token_20|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "128026": {
+       "content": "<|reserved_special_token_21|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "128027": {
+       "content": "<|reserved_special_token_22|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "128028": {
+       "content": "<|reserved_special_token_23|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "128029": {
+       "content": "<|reserved_special_token_24|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "128030": {
+       "content": "<|reserved_special_token_25|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
249
+ "special": true
250
+ },
251
+ "128031": {
252
+ "content": "<|reserved_special_token_26|>",
253
+ "lstrip": false,
254
+ "normalized": false,
255
+ "rstrip": false,
256
+ "single_word": false,
257
+ "special": true
258
+ },
259
+ "128032": {
260
+ "content": "<|reserved_special_token_27|>",
261
+ "lstrip": false,
262
+ "normalized": false,
263
+ "rstrip": false,
264
+ "single_word": false,
265
+ "special": true
266
+ },
267
+ "128033": {
268
+ "content": "<|reserved_special_token_28|>",
269
+ "lstrip": false,
270
+ "normalized": false,
271
+ "rstrip": false,
272
+ "single_word": false,
273
+ "special": true
274
+ },
275
+ "128034": {
276
+ "content": "<|reserved_special_token_29|>",
277
+ "lstrip": false,
278
+ "normalized": false,
279
+ "rstrip": false,
280
+ "single_word": false,
281
+ "special": true
282
+ },
283
+ "128035": {
284
+ "content": "<|reserved_special_token_30|>",
285
+ "lstrip": false,
286
+ "normalized": false,
287
+ "rstrip": false,
288
+ "single_word": false,
289
+ "special": true
290
+ },
291
+ "128036": {
292
+ "content": "<|reserved_special_token_31|>",
293
+ "lstrip": false,
294
+ "normalized": false,
295
+ "rstrip": false,
296
+ "single_word": false,
297
+ "special": true
298
+ },
299
+ "128037": {
300
+ "content": "<|reserved_special_token_32|>",
301
+ "lstrip": false,
302
+ "normalized": false,
303
+ "rstrip": false,
304
+ "single_word": false,
305
+ "special": true
306
+ },
307
+ "128038": {
308
+ "content": "<|reserved_special_token_33|>",
309
+ "lstrip": false,
310
+ "normalized": false,
311
+ "rstrip": false,
312
+ "single_word": false,
313
+ "special": true
314
+ },
315
+ "128039": {
316
+ "content": "<|reserved_special_token_34|>",
317
+ "lstrip": false,
318
+ "normalized": false,
319
+ "rstrip": false,
320
+ "single_word": false,
321
+ "special": true
322
+ },
323
+ "128040": {
324
+ "content": "<|reserved_special_token_35|>",
325
+ "lstrip": false,
326
+ "normalized": false,
327
+ "rstrip": false,
328
+ "single_word": false,
329
+ "special": true
330
+ },
331
+ "128041": {
332
+ "content": "<|reserved_special_token_36|>",
333
+ "lstrip": false,
334
+ "normalized": false,
335
+ "rstrip": false,
336
+ "single_word": false,
337
+ "special": true
338
+ },
339
+ "128042": {
340
+ "content": "<|reserved_special_token_37|>",
341
+ "lstrip": false,
342
+ "normalized": false,
343
+ "rstrip": false,
344
+ "single_word": false,
345
+ "special": true
346
+ },
347
+ "128043": {
348
+ "content": "<|reserved_special_token_38|>",
349
+ "lstrip": false,
350
+ "normalized": false,
351
+ "rstrip": false,
352
+ "single_word": false,
353
+ "special": true
354
+ },
355
+ "128044": {
356
+ "content": "<|reserved_special_token_39|>",
357
+ "lstrip": false,
358
+ "normalized": false,
359
+ "rstrip": false,
360
+ "single_word": false,
361
+ "special": true
362
+ },
363
+ "128045": {
364
+ "content": "<|reserved_special_token_40|>",
365
+ "lstrip": false,
366
+ "normalized": false,
367
+ "rstrip": false,
368
+ "single_word": false,
369
+ "special": true
370
+ },
371
+ "128046": {
372
+ "content": "<|reserved_special_token_41|>",
373
+ "lstrip": false,
374
+ "normalized": false,
375
+ "rstrip": false,
376
+ "single_word": false,
377
+ "special": true
378
+ },
379
+ "128047": {
380
+ "content": "<|reserved_special_token_42|>",
381
+ "lstrip": false,
382
+ "normalized": false,
383
+ "rstrip": false,
384
+ "single_word": false,
385
+ "special": true
386
+ },
387
+ "128048": {
388
+ "content": "<|reserved_special_token_43|>",
389
+ "lstrip": false,
390
+ "normalized": false,
391
+ "rstrip": false,
392
+ "single_word": false,
393
+ "special": true
394
+ },
395
+ "128049": {
396
+ "content": "<|reserved_special_token_44|>",
397
+ "lstrip": false,
398
+ "normalized": false,
399
+ "rstrip": false,
400
+ "single_word": false,
401
+ "special": true
402
+ },
403
+ "128050": {
404
+ "content": "<|reserved_special_token_45|>",
405
+ "lstrip": false,
406
+ "normalized": false,
407
+ "rstrip": false,
408
+ "single_word": false,
409
+ "special": true
410
+ },
411
+ "128051": {
412
+ "content": "<|reserved_special_token_46|>",
413
+ "lstrip": false,
414
+ "normalized": false,
415
+ "rstrip": false,
416
+ "single_word": false,
417
+ "special": true
418
+ },
419
+ "128052": {
420
+ "content": "<|reserved_special_token_47|>",
421
+ "lstrip": false,
422
+ "normalized": false,
423
+ "rstrip": false,
424
+ "single_word": false,
425
+ "special": true
426
+ },
427
+ "128053": {
428
+ "content": "<|reserved_special_token_48|>",
429
+ "lstrip": false,
430
+ "normalized": false,
431
+ "rstrip": false,
432
+ "single_word": false,
433
+ "special": true
434
+ },
435
+ "128054": {
436
+ "content": "<|reserved_special_token_49|>",
437
+ "lstrip": false,
438
+ "normalized": false,
439
+ "rstrip": false,
440
+ "single_word": false,
441
+ "special": true
442
+ },
443
+ "128055": {
444
+ "content": "<|reserved_special_token_50|>",
445
+ "lstrip": false,
446
+ "normalized": false,
447
+ "rstrip": false,
448
+ "single_word": false,
449
+ "special": true
450
+ },
451
+ "128056": {
452
+ "content": "<|reserved_special_token_51|>",
453
+ "lstrip": false,
454
+ "normalized": false,
455
+ "rstrip": false,
456
+ "single_word": false,
457
+ "special": true
458
+ },
459
+ "128057": {
460
+ "content": "<|reserved_special_token_52|>",
461
+ "lstrip": false,
462
+ "normalized": false,
463
+ "rstrip": false,
464
+ "single_word": false,
465
+ "special": true
466
+ },
467
+ "128058": {
468
+ "content": "<|reserved_special_token_53|>",
469
+ "lstrip": false,
470
+ "normalized": false,
471
+ "rstrip": false,
472
+ "single_word": false,
473
+ "special": true
474
+ },
475
+ "128059": {
476
+ "content": "<|reserved_special_token_54|>",
477
+ "lstrip": false,
478
+ "normalized": false,
479
+ "rstrip": false,
480
+ "single_word": false,
481
+ "special": true
482
+ },
483
+ "128060": {
484
+ "content": "<|reserved_special_token_55|>",
485
+ "lstrip": false,
486
+ "normalized": false,
487
+ "rstrip": false,
488
+ "single_word": false,
489
+ "special": true
490
+ },
491
+ "128061": {
492
+ "content": "<|reserved_special_token_56|>",
493
+ "lstrip": false,
494
+ "normalized": false,
495
+ "rstrip": false,
496
+ "single_word": false,
497
+ "special": true
498
+ },
499
+ "128062": {
500
+ "content": "<|reserved_special_token_57|>",
501
+ "lstrip": false,
502
+ "normalized": false,
503
+ "rstrip": false,
504
+ "single_word": false,
505
+ "special": true
506
+ },
507
+ "128063": {
508
+ "content": "<|reserved_special_token_58|>",
509
+ "lstrip": false,
510
+ "normalized": false,
511
+ "rstrip": false,
512
+ "single_word": false,
513
+ "special": true
514
+ },
515
+ "128064": {
516
+ "content": "<|reserved_special_token_59|>",
517
+ "lstrip": false,
518
+ "normalized": false,
519
+ "rstrip": false,
520
+ "single_word": false,
521
+ "special": true
522
+ },
523
+ "128065": {
524
+ "content": "<|reserved_special_token_60|>",
525
+ "lstrip": false,
526
+ "normalized": false,
527
+ "rstrip": false,
528
+ "single_word": false,
529
+ "special": true
530
+ },
531
+ "128066": {
532
+ "content": "<|reserved_special_token_61|>",
533
+ "lstrip": false,
534
+ "normalized": false,
535
+ "rstrip": false,
536
+ "single_word": false,
537
+ "special": true
538
+ },
539
+ "128067": {
540
+ "content": "<|reserved_special_token_62|>",
541
+ "lstrip": false,
542
+ "normalized": false,
543
+ "rstrip": false,
544
+ "single_word": false,
545
+ "special": true
546
+ },
547
+ "128068": {
548
+ "content": "<|reserved_special_token_63|>",
549
+ "lstrip": false,
550
+ "normalized": false,
551
+ "rstrip": false,
552
+ "single_word": false,
553
+ "special": true
554
+ },
555
+ "128069": {
556
+ "content": "<|reserved_special_token_64|>",
557
+ "lstrip": false,
558
+ "normalized": false,
559
+ "rstrip": false,
560
+ "single_word": false,
561
+ "special": true
562
+ },
563
+ "128070": {
564
+ "content": "<|reserved_special_token_65|>",
565
+ "lstrip": false,
566
+ "normalized": false,
567
+ "rstrip": false,
568
+ "single_word": false,
569
+ "special": true
570
+ },
571
+ "128071": {
572
+ "content": "<|reserved_special_token_66|>",
573
+ "lstrip": false,
574
+ "normalized": false,
575
+ "rstrip": false,
576
+ "single_word": false,
577
+ "special": true
578
+ },
579
+ "128072": {
580
+ "content": "<|reserved_special_token_67|>",
581
+ "lstrip": false,
582
+ "normalized": false,
583
+ "rstrip": false,
584
+ "single_word": false,
585
+ "special": true
586
+ },
587
+ "128073": {
588
+ "content": "<|reserved_special_token_68|>",
589
+ "lstrip": false,
590
+ "normalized": false,
591
+ "rstrip": false,
592
+ "single_word": false,
593
+ "special": true
594
+ },
595
+ "128074": {
596
+ "content": "<|reserved_special_token_69|>",
597
+ "lstrip": false,
598
+ "normalized": false,
599
+ "rstrip": false,
600
+ "single_word": false,
601
+ "special": true
602
+ },
603
+ "128075": {
604
+ "content": "<|reserved_special_token_70|>",
605
+ "lstrip": false,
606
+ "normalized": false,
607
+ "rstrip": false,
608
+ "single_word": false,
609
+ "special": true
610
+ },
611
+ "128076": {
612
+ "content": "<|reserved_special_token_71|>",
613
+ "lstrip": false,
614
+ "normalized": false,
615
+ "rstrip": false,
616
+ "single_word": false,
617
+ "special": true
618
+ },
619
+ "128077": {
620
+ "content": "<|reserved_special_token_72|>",
621
+ "lstrip": false,
622
+ "normalized": false,
623
+ "rstrip": false,
624
+ "single_word": false,
625
+ "special": true
626
+ },
627
+ "128078": {
628
+ "content": "<|reserved_special_token_73|>",
629
+ "lstrip": false,
630
+ "normalized": false,
631
+ "rstrip": false,
632
+ "single_word": false,
633
+ "special": true
634
+ },
635
+ "128079": {
636
+ "content": "<|reserved_special_token_74|>",
637
+ "lstrip": false,
638
+ "normalized": false,
639
+ "rstrip": false,
640
+ "single_word": false,
641
+ "special": true
642
+ },
643
+ "128080": {
644
+ "content": "<|reserved_special_token_75|>",
645
+ "lstrip": false,
646
+ "normalized": false,
647
+ "rstrip": false,
648
+ "single_word": false,
649
+ "special": true
650
+ },
651
+ "128081": {
652
+ "content": "<|reserved_special_token_76|>",
653
+ "lstrip": false,
654
+ "normalized": false,
655
+ "rstrip": false,
656
+ "single_word": false,
657
+ "special": true
658
+ },
659
+ "128082": {
660
+ "content": "<|reserved_special_token_77|>",
661
+ "lstrip": false,
662
+ "normalized": false,
663
+ "rstrip": false,
664
+ "single_word": false,
665
+ "special": true
666
+ },
667
+ "128083": {
668
+ "content": "<|reserved_special_token_78|>",
669
+ "lstrip": false,
670
+ "normalized": false,
671
+ "rstrip": false,
672
+ "single_word": false,
673
+ "special": true
674
+ },
675
+ "128084": {
676
+ "content": "<|reserved_special_token_79|>",
677
+ "lstrip": false,
678
+ "normalized": false,
679
+ "rstrip": false,
680
+ "single_word": false,
681
+ "special": true
682
+ },
683
+ "128085": {
684
+ "content": "<|reserved_special_token_80|>",
685
+ "lstrip": false,
686
+ "normalized": false,
687
+ "rstrip": false,
688
+ "single_word": false,
689
+ "special": true
690
+ },
691
+ "128086": {
692
+ "content": "<|reserved_special_token_81|>",
693
+ "lstrip": false,
694
+ "normalized": false,
695
+ "rstrip": false,
696
+ "single_word": false,
697
+ "special": true
698
+ },
699
+ "128087": {
700
+ "content": "<|reserved_special_token_82|>",
701
+ "lstrip": false,
702
+ "normalized": false,
703
+ "rstrip": false,
704
+ "single_word": false,
705
+ "special": true
706
+ },
707
+ "128088": {
708
+ "content": "<|reserved_special_token_83|>",
709
+ "lstrip": false,
710
+ "normalized": false,
711
+ "rstrip": false,
712
+ "single_word": false,
713
+ "special": true
714
+ },
715
+ "128089": {
716
+ "content": "<|reserved_special_token_84|>",
717
+ "lstrip": false,
718
+ "normalized": false,
719
+ "rstrip": false,
720
+ "single_word": false,
721
+ "special": true
722
+ },
723
+ "128090": {
724
+ "content": "<|reserved_special_token_85|>",
725
+ "lstrip": false,
726
+ "normalized": false,
727
+ "rstrip": false,
728
+ "single_word": false,
729
+ "special": true
730
+ },
731
+ "128091": {
732
+ "content": "<|reserved_special_token_86|>",
733
+ "lstrip": false,
734
+ "normalized": false,
735
+ "rstrip": false,
736
+ "single_word": false,
737
+ "special": true
738
+ },
739
+ "128092": {
740
+ "content": "<|reserved_special_token_87|>",
741
+ "lstrip": false,
742
+ "normalized": false,
743
+ "rstrip": false,
744
+ "single_word": false,
745
+ "special": true
746
+ },
747
+ "128093": {
748
+ "content": "<|reserved_special_token_88|>",
749
+ "lstrip": false,
750
+ "normalized": false,
751
+ "rstrip": false,
752
+ "single_word": false,
753
+ "special": true
754
+ },
755
+ "128094": {
756
+ "content": "<|reserved_special_token_89|>",
757
+ "lstrip": false,
758
+ "normalized": false,
759
+ "rstrip": false,
760
+ "single_word": false,
761
+ "special": true
762
+ },
763
+ "128095": {
764
+ "content": "<|reserved_special_token_90|>",
765
+ "lstrip": false,
766
+ "normalized": false,
767
+ "rstrip": false,
768
+ "single_word": false,
769
+ "special": true
770
+ },
771
+ "128096": {
772
+ "content": "<|reserved_special_token_91|>",
773
+ "lstrip": false,
774
+ "normalized": false,
775
+ "rstrip": false,
776
+ "single_word": false,
777
+ "special": true
778
+ },
779
+ "128097": {
780
+ "content": "<|reserved_special_token_92|>",
781
+ "lstrip": false,
782
+ "normalized": false,
783
+ "rstrip": false,
784
+ "single_word": false,
785
+ "special": true
786
+ },
787
+ "128098": {
788
+ "content": "<|reserved_special_token_93|>",
789
+ "lstrip": false,
790
+ "normalized": false,
791
+ "rstrip": false,
792
+ "single_word": false,
793
+ "special": true
794
+ },
795
+ "128099": {
796
+ "content": "<|reserved_special_token_94|>",
797
+ "lstrip": false,
798
+ "normalized": false,
799
+ "rstrip": false,
800
+ "single_word": false,
801
+ "special": true
802
+ },
803
+ "128100": {
804
+ "content": "<|reserved_special_token_95|>",
805
+ "lstrip": false,
806
+ "normalized": false,
807
+ "rstrip": false,
808
+ "single_word": false,
809
+ "special": true
810
+ },
811
+ "128101": {
812
+ "content": "<|reserved_special_token_96|>",
813
+ "lstrip": false,
814
+ "normalized": false,
815
+ "rstrip": false,
816
+ "single_word": false,
817
+ "special": true
818
+ },
819
+ "128102": {
820
+ "content": "<|reserved_special_token_97|>",
821
+ "lstrip": false,
822
+ "normalized": false,
823
+ "rstrip": false,
824
+ "single_word": false,
825
+ "special": true
826
+ },
827
+ "128103": {
828
+ "content": "<|reserved_special_token_98|>",
829
+ "lstrip": false,
830
+ "normalized": false,
831
+ "rstrip": false,
832
+ "single_word": false,
833
+ "special": true
834
+ },
835
+ "128104": {
836
+ "content": "<|reserved_special_token_99|>",
837
+ "lstrip": false,
838
+ "normalized": false,
839
+ "rstrip": false,
840
+ "single_word": false,
841
+ "special": true
842
+ },
843
+ "128105": {
844
+ "content": "<|reserved_special_token_100|>",
845
+ "lstrip": false,
846
+ "normalized": false,
847
+ "rstrip": false,
848
+ "single_word": false,
849
+ "special": true
850
+ },
851
+ "128106": {
852
+ "content": "<|reserved_special_token_101|>",
853
+ "lstrip": false,
854
+ "normalized": false,
855
+ "rstrip": false,
856
+ "single_word": false,
857
+ "special": true
858
+ },
859
+ "128107": {
860
+ "content": "<|reserved_special_token_102|>",
861
+ "lstrip": false,
862
+ "normalized": false,
863
+ "rstrip": false,
864
+ "single_word": false,
865
+ "special": true
866
+ },
867
+ "128108": {
868
+ "content": "<|reserved_special_token_103|>",
869
+ "lstrip": false,
870
+ "normalized": false,
871
+ "rstrip": false,
872
+ "single_word": false,
873
+ "special": true
874
+ },
875
+ "128109": {
876
+ "content": "<|reserved_special_token_104|>",
877
+ "lstrip": false,
878
+ "normalized": false,
879
+ "rstrip": false,
880
+ "single_word": false,
881
+ "special": true
882
+ },
883
+ "128110": {
884
+ "content": "<|reserved_special_token_105|>",
885
+ "lstrip": false,
886
+ "normalized": false,
887
+ "rstrip": false,
888
+ "single_word": false,
889
+ "special": true
890
+ },
891
+ "128111": {
892
+ "content": "<|reserved_special_token_106|>",
893
+ "lstrip": false,
894
+ "normalized": false,
895
+ "rstrip": false,
896
+ "single_word": false,
897
+ "special": true
898
+ },
899
+ "128112": {
900
+ "content": "<|reserved_special_token_107|>",
901
+ "lstrip": false,
902
+ "normalized": false,
903
+ "rstrip": false,
904
+ "single_word": false,
905
+ "special": true
906
+ },
907
+ "128113": {
908
+ "content": "<|reserved_special_token_108|>",
909
+ "lstrip": false,
910
+ "normalized": false,
911
+ "rstrip": false,
912
+ "single_word": false,
913
+ "special": true
914
+ },
915
+ "128114": {
916
+ "content": "<|reserved_special_token_109|>",
917
+ "lstrip": false,
918
+ "normalized": false,
919
+ "rstrip": false,
920
+ "single_word": false,
921
+ "special": true
922
+ },
923
+ "128115": {
924
+ "content": "<|reserved_special_token_110|>",
925
+ "lstrip": false,
926
+ "normalized": false,
927
+ "rstrip": false,
928
+ "single_word": false,
929
+ "special": true
930
+ },
931
+ "128116": {
932
+ "content": "<|reserved_special_token_111|>",
933
+ "lstrip": false,
934
+ "normalized": false,
935
+ "rstrip": false,
936
+ "single_word": false,
937
+ "special": true
938
+ },
939
+ "128117": {
940
+ "content": "<|reserved_special_token_112|>",
941
+ "lstrip": false,
942
+ "normalized": false,
943
+ "rstrip": false,
944
+ "single_word": false,
945
+ "special": true
946
+ },
947
+ "128118": {
948
+ "content": "<|reserved_special_token_113|>",
949
+ "lstrip": false,
950
+ "normalized": false,
951
+ "rstrip": false,
952
+ "single_word": false,
953
+ "special": true
954
+ },
955
+ "128119": {
956
+ "content": "<|reserved_special_token_114|>",
957
+ "lstrip": false,
958
+ "normalized": false,
959
+ "rstrip": false,
960
+ "single_word": false,
961
+ "special": true
962
+ },
963
+ "128120": {
964
+ "content": "<|reserved_special_token_115|>",
965
+ "lstrip": false,
966
+ "normalized": false,
967
+ "rstrip": false,
968
+ "single_word": false,
969
+ "special": true
970
+ },
971
+ "128121": {
972
+ "content": "<|reserved_special_token_116|>",
973
+ "lstrip": false,
974
+ "normalized": false,
975
+ "rstrip": false,
976
+ "single_word": false,
977
+ "special": true
978
+ },
979
+ "128122": {
980
+ "content": "<|reserved_special_token_117|>",
981
+ "lstrip": false,
982
+ "normalized": false,
983
+ "rstrip": false,
984
+ "single_word": false,
985
+ "special": true
986
+ },
987
+ "128123": {
988
+ "content": "<|reserved_special_token_118|>",
989
+ "lstrip": false,
990
+ "normalized": false,
991
+ "rstrip": false,
992
+ "single_word": false,
993
+ "special": true
994
+ },
995
+ "128124": {
996
+ "content": "<|reserved_special_token_119|>",
997
+ "lstrip": false,
998
+ "normalized": false,
999
+ "rstrip": false,
1000
+ "single_word": false,
1001
+ "special": true
1002
+ },
1003
+ "128125": {
1004
+ "content": "<|reserved_special_token_120|>",
1005
+ "lstrip": false,
1006
+ "normalized": false,
1007
+ "rstrip": false,
1008
+ "single_word": false,
1009
+ "special": true
1010
+ },
1011
+ "128126": {
1012
+ "content": "<|reserved_special_token_121|>",
1013
+ "lstrip": false,
1014
+ "normalized": false,
1015
+ "rstrip": false,
1016
+ "single_word": false,
1017
+ "special": true
1018
+ },
1019
+ "128127": {
1020
+ "content": "<|reserved_special_token_122|>",
1021
+ "lstrip": false,
1022
+ "normalized": false,
1023
+ "rstrip": false,
1024
+ "single_word": false,
1025
+ "special": true
1026
+ },
1027
+ "128128": {
1028
+ "content": "<|reserved_special_token_123|>",
1029
+ "lstrip": false,
1030
+ "normalized": false,
1031
+ "rstrip": false,
1032
+ "single_word": false,
1033
+ "special": true
1034
+ },
1035
+ "128129": {
1036
+ "content": "<|reserved_special_token_124|>",
1037
+ "lstrip": false,
1038
+ "normalized": false,
1039
+ "rstrip": false,
1040
+ "single_word": false,
1041
+ "special": true
1042
+ },
1043
+ "128130": {
1044
+ "content": "<|reserved_special_token_125|>",
1045
+ "lstrip": false,
1046
+ "normalized": false,
1047
+ "rstrip": false,
1048
+ "single_word": false,
1049
+ "special": true
1050
+ },
1051
+ "128131": {
1052
+ "content": "<|reserved_special_token_126|>",
1053
+ "lstrip": false,
1054
+ "normalized": false,
1055
+ "rstrip": false,
1056
+ "single_word": false,
1057
+ "special": true
1058
+ },
1059
+ "128132": {
1060
+ "content": "<|reserved_special_token_127|>",
1061
+ "lstrip": false,
1062
+ "normalized": false,
1063
+ "rstrip": false,
1064
+ "single_word": false,
1065
+ "special": true
1066
+ },
1067
+ "128133": {
1068
+ "content": "<|reserved_special_token_128|>",
1069
+ "lstrip": false,
1070
+ "normalized": false,
1071
+ "rstrip": false,
1072
+ "single_word": false,
1073
+ "special": true
1074
+ },
1075
+ "128134": {
1076
+ "content": "<|reserved_special_token_129|>",
1077
+ "lstrip": false,
1078
+ "normalized": false,
1079
+ "rstrip": false,
1080
+ "single_word": false,
1081
+ "special": true
1082
+ },
1083
+ "128135": {
1084
+ "content": "<|reserved_special_token_130|>",
1085
+ "lstrip": false,
1086
+ "normalized": false,
1087
+ "rstrip": false,
1088
+ "single_word": false,
1089
+ "special": true
1090
+ },
1091
+ "128136": {
1092
+ "content": "<|reserved_special_token_131|>",
1093
+ "lstrip": false,
1094
+ "normalized": false,
1095
+ "rstrip": false,
1096
+ "single_word": false,
1097
+ "special": true
1098
+ },
1099
+ "128137": {
1100
+ "content": "<|reserved_special_token_132|>",
1101
+ "lstrip": false,
1102
+ "normalized": false,
1103
+ "rstrip": false,
1104
+ "single_word": false,
1105
+ "special": true
1106
+ },
1107
+ "128138": {
1108
+ "content": "<|reserved_special_token_133|>",
1109
+ "lstrip": false,
1110
+ "normalized": false,
1111
+ "rstrip": false,
1112
+ "single_word": false,
1113
+ "special": true
1114
+ },
1115
+ "128139": {
1116
+ "content": "<|reserved_special_token_134|>",
1117
+ "lstrip": false,
1118
+ "normalized": false,
1119
+ "rstrip": false,
1120
+ "single_word": false,
1121
+ "special": true
1122
+ },
1123
+ "128140": {
1124
+ "content": "<|reserved_special_token_135|>",
1125
+ "lstrip": false,
1126
+ "normalized": false,
1127
+ "rstrip": false,
1128
+ "single_word": false,
1129
+ "special": true
1130
+ },
1131
+ "128141": {
1132
+ "content": "<|reserved_special_token_136|>",
1133
+ "lstrip": false,
1134
+ "normalized": false,
1135
+ "rstrip": false,
1136
+ "single_word": false,
1137
+ "special": true
1138
+ },
1139
+ "128142": {
1140
+ "content": "<|reserved_special_token_137|>",
1141
+ "lstrip": false,
1142
+ "normalized": false,
1143
+ "rstrip": false,
1144
+ "single_word": false,
1145
+ "special": true
1146
+ },
1147
+ "128143": {
1148
+ "content": "<|reserved_special_token_138|>",
1149
+ "lstrip": false,
1150
+ "normalized": false,
1151
+ "rstrip": false,
1152
+ "single_word": false,
1153
+ "special": true
1154
+ },
1155
+ "128144": {
1156
+ "content": "<|reserved_special_token_139|>",
1157
+ "lstrip": false,
1158
+ "normalized": false,
1159
+ "rstrip": false,
1160
+ "single_word": false,
1161
+ "special": true
1162
+ },
1163
+ "128145": {
1164
+ "content": "<|reserved_special_token_140|>",
1165
+ "lstrip": false,
1166
+ "normalized": false,
1167
+ "rstrip": false,
1168
+ "single_word": false,
1169
+ "special": true
1170
+ },
1171
+ "128146": {
1172
+ "content": "<|reserved_special_token_141|>",
1173
+ "lstrip": false,
1174
+ "normalized": false,
1175
+ "rstrip": false,
1176
+ "single_word": false,
1177
+ "special": true
1178
+ },
1179
+ "128147": {
1180
+ "content": "<|reserved_special_token_142|>",
1181
+ "lstrip": false,
1182
+ "normalized": false,
1183
+ "rstrip": false,
1184
+ "single_word": false,
1185
+ "special": true
1186
+ },
1187
+ "128148": {
1188
+ "content": "<|reserved_special_token_143|>",
1189
+ "lstrip": false,
1190
+ "normalized": false,
1191
+ "rstrip": false,
1192
+ "single_word": false,
1193
+ "special": true
1194
+ },
1195
+ "128149": {
1196
+ "content": "<|reserved_special_token_144|>",
1197
+ "lstrip": false,
1198
+ "normalized": false,
1199
+ "rstrip": false,
1200
+ "single_word": false,
1201
+ "special": true
1202
+ },
1203
+ "128150": {
1204
+ "content": "<|reserved_special_token_145|>",
1205
+ "lstrip": false,
1206
+ "normalized": false,
1207
+ "rstrip": false,
1208
+ "single_word": false,
1209
+ "special": true
1210
+ },
1211
+ "128151": {
1212
+ "content": "<|reserved_special_token_146|>",
1213
+ "lstrip": false,
1214
+ "normalized": false,
1215
+ "rstrip": false,
1216
+ "single_word": false,
1217
+ "special": true
1218
+ },
1219
+ "128152": {
1220
+ "content": "<|reserved_special_token_147|>",
1221
+ "lstrip": false,
1222
+ "normalized": false,
1223
+ "rstrip": false,
1224
+ "single_word": false,
1225
+ "special": true
1226
+ },
1227
+ "128153": {
1228
+ "content": "<|reserved_special_token_148|>",
1229
+ "lstrip": false,
1230
+ "normalized": false,
1231
+ "rstrip": false,
1232
+ "single_word": false,
1233
+ "special": true
1234
+ },
1235
+ "128154": {
1236
+ "content": "<|reserved_special_token_149|>",
1237
+ "lstrip": false,
1238
+ "normalized": false,
1239
+ "rstrip": false,
1240
+ "single_word": false,
1241
+ "special": true
1242
+ },
1243
+ "128155": {
1244
+ "content": "<|reserved_special_token_150|>",
1245
+ "lstrip": false,
1246
+ "normalized": false,
1247
+ "rstrip": false,
1248
+ "single_word": false,
1249
+ "special": true
1250
+ },
1251
+ "128156": {
1252
+ "content": "<|reserved_special_token_151|>",
1253
+ "lstrip": false,
1254
+ "normalized": false,
1255
+ "rstrip": false,
1256
+ "single_word": false,
1257
+ "special": true
1258
+ },
1259
+ "128157": {
1260
+ "content": "<|reserved_special_token_152|>",
1261
+ "lstrip": false,
1262
+ "normalized": false,
1263
+ "rstrip": false,
1264
+ "single_word": false,
1265
+ "special": true
1266
+ },
1267
+ "128158": {
1268
+ "content": "<|reserved_special_token_153|>",
1269
+ "lstrip": false,
1270
+ "normalized": false,
1271
+ "rstrip": false,
1272
+ "single_word": false,
1273
+ "special": true
1274
+ },
1275
+ "128159": {
1276
+ "content": "<|reserved_special_token_154|>",
1277
+ "lstrip": false,
1278
+ "normalized": false,
1279
+ "rstrip": false,
1280
+ "single_word": false,
1281
+ "special": true
1282
+ },
1283
+ "128160": {
1284
+ "content": "<|reserved_special_token_155|>",
1285
+ "lstrip": false,
1286
+ "normalized": false,
1287
+ "rstrip": false,
1288
+ "single_word": false,
1289
+ "special": true
1290
+ },
1291
+ "128161": {
1292
+ "content": "<|reserved_special_token_156|>",
1293
+ "lstrip": false,
1294
+ "normalized": false,
1295
+ "rstrip": false,
1296
+ "single_word": false,
1297
+ "special": true
1298
+ },
1299
+ "128162": {
1300
+ "content": "<|reserved_special_token_157|>",
1301
+ "lstrip": false,
1302
+ "normalized": false,
1303
+ "rstrip": false,
1304
+ "single_word": false,
1305
+ "special": true
1306
+ },
1307
+ "128163": {
1308
+ "content": "<|reserved_special_token_158|>",
1309
+ "lstrip": false,
1310
+ "normalized": false,
1311
+ "rstrip": false,
1312
+ "single_word": false,
1313
+ "special": true
1314
+ },
1315
+ "128164": {
1316
+ "content": "<|reserved_special_token_159|>",
1317
+ "lstrip": false,
1318
+ "normalized": false,
1319
+ "rstrip": false,
1320
+ "single_word": false,
1321
+ "special": true
1322
+ },
1323
+ "128165": {
1324
+ "content": "<|reserved_special_token_160|>",
1325
+ "lstrip": false,
1326
+ "normalized": false,
1327
+ "rstrip": false,
1328
+ "single_word": false,
1329
+ "special": true
1330
+ },
1331
+ "128166": {
1332
+ "content": "<|reserved_special_token_161|>",
1333
+ "lstrip": false,
1334
+ "normalized": false,
1335
+ "rstrip": false,
1336
+ "single_word": false,
1337
+ "special": true
1338
+ },
1339
+ "128167": {
1340
+ "content": "<|reserved_special_token_162|>",
1341
+ "lstrip": false,
1342
+ "normalized": false,
1343
+ "rstrip": false,
1344
+ "single_word": false,
1345
+ "special": true
1346
+ },
1347
+ "128168": {
1348
+ "content": "<|reserved_special_token_163|>",
1349
+ "lstrip": false,
1350
+ "normalized": false,
1351
+ "rstrip": false,
1352
+ "single_word": false,
1353
+ "special": true
1354
+ },
1355
+ "128169": {
1356
+ "content": "<|reserved_special_token_164|>",
1357
+ "lstrip": false,
1358
+ "normalized": false,
1359
+ "rstrip": false,
1360
+ "single_word": false,
1361
+ "special": true
1362
+ },
1363
+ "128170": {
1364
+ "content": "<|reserved_special_token_165|>",
1365
+ "lstrip": false,
1366
+ "normalized": false,
1367
+ "rstrip": false,
1368
+ "single_word": false,
1369
+ "special": true
1370
+ },
1371
+ "128171": {
1372
+ "content": "<|reserved_special_token_166|>",
1373
+ "lstrip": false,
1374
+ "normalized": false,
1375
+ "rstrip": false,
1376
+ "single_word": false,
1377
+ "special": true
1378
+ },
1379
+ "128172": {
1380
+ "content": "<|reserved_special_token_167|>",
1381
+ "lstrip": false,
1382
+ "normalized": false,
1383
+ "rstrip": false,
1384
+ "single_word": false,
1385
+ "special": true
1386
+ },
1387
+ "128173": {
1388
+ "content": "<|reserved_special_token_168|>",
1389
+ "lstrip": false,
1390
+ "normalized": false,
1391
+ "rstrip": false,
1392
+ "single_word": false,
1393
+ "special": true
1394
+ },
1395
+ "128174": {
1396
+ "content": "<|reserved_special_token_169|>",
1397
+ "lstrip": false,
1398
+ "normalized": false,
1399
+ "rstrip": false,
1400
+ "single_word": false,
1401
+ "special": true
1402
+ },
1403
+ "128175": {
1404
+ "content": "<|reserved_special_token_170|>",
1405
+ "lstrip": false,
1406
+ "normalized": false,
1407
+ "rstrip": false,
1408
+ "single_word": false,
1409
+ "special": true
1410
+ },
1411
+ "128176": {
1412
+ "content": "<|reserved_special_token_171|>",
1413
+ "lstrip": false,
1414
+ "normalized": false,
1415
+ "rstrip": false,
1416
+ "single_word": false,
1417
+ "special": true
1418
+ },
1419
+ "128177": {
1420
+ "content": "<|reserved_special_token_172|>",
1421
+ "lstrip": false,
1422
+ "normalized": false,
1423
+ "rstrip": false,
1424
+ "single_word": false,
1425
+ "special": true
1426
+ },
1427
+ "128178": {
1428
+ "content": "<|reserved_special_token_173|>",
1429
+ "lstrip": false,
1430
+ "normalized": false,
1431
+ "rstrip": false,
1432
+ "single_word": false,
1433
+ "special": true
1434
+ },
1435
+ "128179": {
1436
+ "content": "<|reserved_special_token_174|>",
1437
+ "lstrip": false,
1438
+ "normalized": false,
1439
+ "rstrip": false,
1440
+ "single_word": false,
1441
+ "special": true
1442
+ },
1443
+ "128180": {
1444
+ "content": "<|reserved_special_token_175|>",
1445
+ "lstrip": false,
1446
+ "normalized": false,
1447
+ "rstrip": false,
1448
+ "single_word": false,
1449
+ "special": true
1450
+ },
1451
+ "128181": {
1452
+ "content": "<|reserved_special_token_176|>",
1453
+ "lstrip": false,
1454
+ "normalized": false,
1455
+ "rstrip": false,
1456
+ "single_word": false,
1457
+ "special": true
1458
+ },
1459
+ "128182": {
1460
+ "content": "<|reserved_special_token_177|>",
1461
+ "lstrip": false,
1462
+ "normalized": false,
1463
+ "rstrip": false,
1464
+ "single_word": false,
1465
+ "special": true
1466
+ },
1467
+ "128183": {
1468
+ "content": "<|reserved_special_token_178|>",
1469
+ "lstrip": false,
1470
+ "normalized": false,
1471
+ "rstrip": false,
1472
+ "single_word": false,
1473
+ "special": true
1474
+ },
1475
+ "128184": {
1476
+ "content": "<|reserved_special_token_179|>",
1477
+ "lstrip": false,
1478
+ "normalized": false,
1479
+ "rstrip": false,
1480
+ "single_word": false,
1481
+ "special": true
1482
+ },
1483
+ "128185": {
1484
+ "content": "<|reserved_special_token_180|>",
1485
+ "lstrip": false,
1486
+ "normalized": false,
1487
+ "rstrip": false,
1488
+ "single_word": false,
1489
+ "special": true
1490
+ },
1491
+ "128186": {
1492
+ "content": "<|reserved_special_token_181|>",
1493
+ "lstrip": false,
1494
+ "normalized": false,
1495
+ "rstrip": false,
1496
+ "single_word": false,
1497
+ "special": true
1498
+ },
1499
+ "128187": {
1500
+ "content": "<|reserved_special_token_182|>",
1501
+ "lstrip": false,
1502
+ "normalized": false,
1503
+ "rstrip": false,
1504
+ "single_word": false,
1505
+ "special": true
1506
+ },
1507
+ "128188": {
1508
+ "content": "<|reserved_special_token_183|>",
1509
+ "lstrip": false,
1510
+ "normalized": false,
1511
+ "rstrip": false,
1512
+ "single_word": false,
1513
+ "special": true
1514
+ },
1515
+ "128189": {
1516
+ "content": "<|reserved_special_token_184|>",
1517
+ "lstrip": false,
1518
+ "normalized": false,
1519
+ "rstrip": false,
1520
+ "single_word": false,
1521
+ "special": true
1522
+ },
1523
+ "128190": {
1524
+ "content": "<|reserved_special_token_185|>",
1525
+ "lstrip": false,
1526
+ "normalized": false,
1527
+ "rstrip": false,
1528
+ "single_word": false,
1529
+ "special": true
1530
+ },
1531
+ "128191": {
1532
+ "content": "<|reserved_special_token_186|>",
1533
+ "lstrip": false,
1534
+ "normalized": false,
1535
+ "rstrip": false,
1536
+ "single_word": false,
1537
+ "special": true
1538
+ },
1539
+ "128192": {
1540
+ "content": "<|reserved_special_token_187|>",
1541
+ "lstrip": false,
1542
+ "normalized": false,
1543
+ "rstrip": false,
1544
+ "single_word": false,
1545
+ "special": true
1546
+ },
1547
+ "128193": {
1548
+ "content": "<|reserved_special_token_188|>",
1549
+ "lstrip": false,
1550
+ "normalized": false,
1551
+ "rstrip": false,
1552
+ "single_word": false,
1553
+ "special": true
1554
+ },
1555
+ "128194": {
1556
+ "content": "<|reserved_special_token_189|>",
1557
+ "lstrip": false,
1558
+ "normalized": false,
1559
+ "rstrip": false,
1560
+ "single_word": false,
1561
+ "special": true
1562
+ },
1563
+ "128195": {
1564
+ "content": "<|reserved_special_token_190|>",
1565
+ "lstrip": false,
1566
+ "normalized": false,
1567
+ "rstrip": false,
1568
+ "single_word": false,
1569
+ "special": true
1570
+ },
1571
+ "128196": {
1572
+ "content": "<|reserved_special_token_191|>",
1573
+ "lstrip": false,
1574
+ "normalized": false,
1575
+ "rstrip": false,
1576
+ "single_word": false,
1577
+ "special": true
1578
+ },
1579
+ "128197": {
1580
+ "content": "<|reserved_special_token_192|>",
1581
+ "lstrip": false,
1582
+ "normalized": false,
1583
+ "rstrip": false,
1584
+ "single_word": false,
1585
+ "special": true
1586
+ },
1587
+ "128198": {
1588
+ "content": "<|reserved_special_token_193|>",
1589
+ "lstrip": false,
1590
+ "normalized": false,
1591
+ "rstrip": false,
1592
+ "single_word": false,
1593
+ "special": true
1594
+ },
1595
+ "128199": {
1596
+ "content": "<|reserved_special_token_194|>",
1597
+ "lstrip": false,
1598
+ "normalized": false,
1599
+ "rstrip": false,
1600
+ "single_word": false,
1601
+ "special": true
1602
+ },
1603
+ "128200": {
1604
+ "content": "<|reserved_special_token_195|>",
1605
+ "lstrip": false,
1606
+ "normalized": false,
1607
+ "rstrip": false,
1608
+ "single_word": false,
1609
+ "special": true
1610
+ },
1611
+ "128201": {
1612
+ "content": "<|reserved_special_token_196|>",
1613
+ "lstrip": false,
1614
+ "normalized": false,
1615
+ "rstrip": false,
1616
+ "single_word": false,
1617
+ "special": true
1618
+ },
1619
+ "128202": {
1620
+ "content": "<|reserved_special_token_197|>",
1621
+ "lstrip": false,
1622
+ "normalized": false,
1623
+ "rstrip": false,
1624
+ "single_word": false,
1625
+ "special": true
1626
+ },
1627
+ "128203": {
1628
+ "content": "<|reserved_special_token_198|>",
1629
+ "lstrip": false,
1630
+ "normalized": false,
1631
+ "rstrip": false,
1632
+ "single_word": false,
1633
+ "special": true
1634
+ },
1635
+ "128204": {
1636
+ "content": "<|reserved_special_token_199|>",
1637
+ "lstrip": false,
1638
+ "normalized": false,
1639
+ "rstrip": false,
1640
+ "single_word": false,
1641
+ "special": true
1642
+ },
1643
+ "128205": {
1644
+ "content": "<|reserved_special_token_200|>",
1645
+ "lstrip": false,
1646
+ "normalized": false,
1647
+ "rstrip": false,
1648
+ "single_word": false,
1649
+ "special": true
1650
+ },
1651
+ "128206": {
1652
+ "content": "<|reserved_special_token_201|>",
1653
+ "lstrip": false,
1654
+ "normalized": false,
1655
+ "rstrip": false,
1656
+ "single_word": false,
1657
+ "special": true
1658
+ },
1659
+ "128207": {
1660
+ "content": "<|reserved_special_token_202|>",
1661
+ "lstrip": false,
1662
+ "normalized": false,
1663
+ "rstrip": false,
1664
+ "single_word": false,
1665
+ "special": true
1666
+ },
1667
+ "128208": {
1668
+ "content": "<|reserved_special_token_203|>",
1669
+ "lstrip": false,
1670
+ "normalized": false,
1671
+ "rstrip": false,
1672
+ "single_word": false,
1673
+ "special": true
1674
+ },
1675
+ "128209": {
1676
+ "content": "<|reserved_special_token_204|>",
1677
+ "lstrip": false,
1678
+ "normalized": false,
1679
+ "rstrip": false,
1680
+ "single_word": false,
1681
+ "special": true
1682
+ },
1683
+ "128210": {
1684
+ "content": "<|reserved_special_token_205|>",
1685
+ "lstrip": false,
1686
+ "normalized": false,
1687
+ "rstrip": false,
1688
+ "single_word": false,
1689
+ "special": true
1690
+ },
1691
+ "128211": {
1692
+ "content": "<|reserved_special_token_206|>",
1693
+ "lstrip": false,
1694
+ "normalized": false,
1695
+ "rstrip": false,
1696
+ "single_word": false,
1697
+ "special": true
1698
+ },
1699
+ "128212": {
1700
+ "content": "<|reserved_special_token_207|>",
1701
+ "lstrip": false,
1702
+ "normalized": false,
1703
+ "rstrip": false,
1704
+ "single_word": false,
1705
+ "special": true
1706
+ },
1707
+ "128213": {
1708
+ "content": "<|reserved_special_token_208|>",
1709
+ "lstrip": false,
1710
+ "normalized": false,
1711
+ "rstrip": false,
1712
+ "single_word": false,
1713
+ "special": true
1714
+ },
1715
+ "128214": {
1716
+ "content": "<|reserved_special_token_209|>",
1717
+ "lstrip": false,
1718
+ "normalized": false,
1719
+ "rstrip": false,
1720
+ "single_word": false,
1721
+ "special": true
1722
+ },
1723
+ "128215": {
1724
+ "content": "<|reserved_special_token_210|>",
1725
+ "lstrip": false,
1726
+ "normalized": false,
1727
+ "rstrip": false,
1728
+ "single_word": false,
1729
+ "special": true
1730
+ },
1731
+ "128216": {
1732
+ "content": "<|reserved_special_token_211|>",
1733
+ "lstrip": false,
1734
+ "normalized": false,
1735
+ "rstrip": false,
1736
+ "single_word": false,
1737
+ "special": true
1738
+ },
1739
+ "128217": {
1740
+ "content": "<|reserved_special_token_212|>",
1741
+ "lstrip": false,
1742
+ "normalized": false,
1743
+ "rstrip": false,
1744
+ "single_word": false,
1745
+ "special": true
1746
+ },
1747
+ "128218": {
1748
+ "content": "<|reserved_special_token_213|>",
1749
+ "lstrip": false,
1750
+ "normalized": false,
1751
+ "rstrip": false,
1752
+ "single_word": false,
1753
+ "special": true
1754
+ },
1755
+ "128219": {
1756
+ "content": "<|reserved_special_token_214|>",
1757
+ "lstrip": false,
1758
+ "normalized": false,
1759
+ "rstrip": false,
1760
+ "single_word": false,
1761
+ "special": true
1762
+ },
1763
+ "128220": {
1764
+ "content": "<|reserved_special_token_215|>",
1765
+ "lstrip": false,
1766
+ "normalized": false,
1767
+ "rstrip": false,
1768
+ "single_word": false,
1769
+ "special": true
1770
+ },
1771
+ "128221": {
1772
+ "content": "<|reserved_special_token_216|>",
1773
+ "lstrip": false,
1774
+ "normalized": false,
1775
+ "rstrip": false,
1776
+ "single_word": false,
1777
+ "special": true
1778
+ },
1779
+ "128222": {
1780
+ "content": "<|reserved_special_token_217|>",
1781
+ "lstrip": false,
1782
+ "normalized": false,
1783
+ "rstrip": false,
1784
+ "single_word": false,
1785
+ "special": true
1786
+ },
1787
+ "128223": {
1788
+ "content": "<|reserved_special_token_218|>",
1789
+ "lstrip": false,
1790
+ "normalized": false,
1791
+ "rstrip": false,
1792
+ "single_word": false,
1793
+ "special": true
1794
+ },
1795
+ "128224": {
1796
+ "content": "<|reserved_special_token_219|>",
1797
+ "lstrip": false,
1798
+ "normalized": false,
1799
+ "rstrip": false,
1800
+ "single_word": false,
1801
+ "special": true
1802
+ },
1803
+ "128225": {
1804
+ "content": "<|reserved_special_token_220|>",
1805
+ "lstrip": false,
1806
+ "normalized": false,
1807
+ "rstrip": false,
1808
+ "single_word": false,
1809
+ "special": true
1810
+ },
1811
+ "128226": {
1812
+ "content": "<|reserved_special_token_221|>",
1813
+ "lstrip": false,
1814
+ "normalized": false,
1815
+ "rstrip": false,
1816
+ "single_word": false,
1817
+ "special": true
1818
+ },
1819
+ "128227": {
1820
+ "content": "<|reserved_special_token_222|>",
1821
+ "lstrip": false,
1822
+ "normalized": false,
1823
+ "rstrip": false,
1824
+ "single_word": false,
1825
+ "special": true
1826
+ },
1827
+ "128228": {
1828
+ "content": "<|reserved_special_token_223|>",
1829
+ "lstrip": false,
1830
+ "normalized": false,
1831
+ "rstrip": false,
1832
+ "single_word": false,
1833
+ "special": true
1834
+ },
1835
+ "128229": {
1836
+ "content": "<|reserved_special_token_224|>",
1837
+ "lstrip": false,
1838
+ "normalized": false,
1839
+ "rstrip": false,
1840
+ "single_word": false,
1841
+ "special": true
1842
+ },
1843
+ "128230": {
1844
+ "content": "<|reserved_special_token_225|>",
1845
+ "lstrip": false,
1846
+ "normalized": false,
1847
+ "rstrip": false,
1848
+ "single_word": false,
1849
+ "special": true
1850
+ },
1851
+ "128231": {
1852
+ "content": "<|reserved_special_token_226|>",
1853
+ "lstrip": false,
1854
+ "normalized": false,
1855
+ "rstrip": false,
1856
+ "single_word": false,
1857
+ "special": true
1858
+ },
1859
+ "128232": {
1860
+ "content": "<|reserved_special_token_227|>",
1861
+ "lstrip": false,
1862
+ "normalized": false,
1863
+ "rstrip": false,
1864
+ "single_word": false,
1865
+ "special": true
1866
+ },
1867
+ "128233": {
1868
+ "content": "<|reserved_special_token_228|>",
1869
+ "lstrip": false,
1870
+ "normalized": false,
1871
+ "rstrip": false,
1872
+ "single_word": false,
1873
+ "special": true
1874
+ },
1875
+ "128234": {
1876
+ "content": "<|reserved_special_token_229|>",
1877
+ "lstrip": false,
1878
+ "normalized": false,
1879
+ "rstrip": false,
1880
+ "single_word": false,
1881
+ "special": true
1882
+ },
1883
+ "128235": {
1884
+ "content": "<|reserved_special_token_230|>",
1885
+ "lstrip": false,
1886
+ "normalized": false,
1887
+ "rstrip": false,
1888
+ "single_word": false,
1889
+ "special": true
1890
+ },
1891
+ "128236": {
1892
+ "content": "<|reserved_special_token_231|>",
1893
+ "lstrip": false,
1894
+ "normalized": false,
1895
+ "rstrip": false,
1896
+ "single_word": false,
1897
+ "special": true
1898
+ },
1899
+ "128237": {
1900
+ "content": "<|reserved_special_token_232|>",
1901
+ "lstrip": false,
1902
+ "normalized": false,
1903
+ "rstrip": false,
1904
+ "single_word": false,
1905
+ "special": true
1906
+ },
1907
+ "128238": {
1908
+ "content": "<|reserved_special_token_233|>",
1909
+ "lstrip": false,
1910
+ "normalized": false,
1911
+ "rstrip": false,
1912
+ "single_word": false,
1913
+ "special": true
1914
+ },
1915
+ "128239": {
1916
+ "content": "<|reserved_special_token_234|>",
1917
+ "lstrip": false,
1918
+ "normalized": false,
1919
+ "rstrip": false,
1920
+ "single_word": false,
1921
+ "special": true
1922
+ },
1923
+ "128240": {
1924
+ "content": "<|reserved_special_token_235|>",
1925
+ "lstrip": false,
1926
+ "normalized": false,
1927
+ "rstrip": false,
1928
+ "single_word": false,
1929
+ "special": true
1930
+ },
1931
+ "128241": {
1932
+ "content": "<|reserved_special_token_236|>",
1933
+ "lstrip": false,
1934
+ "normalized": false,
1935
+ "rstrip": false,
1936
+ "single_word": false,
1937
+ "special": true
1938
+ },
1939
+ "128242": {
1940
+ "content": "<|reserved_special_token_237|>",
1941
+ "lstrip": false,
1942
+ "normalized": false,
1943
+ "rstrip": false,
1944
+ "single_word": false,
1945
+ "special": true
1946
+ },
1947
+ "128243": {
1948
+ "content": "<|reserved_special_token_238|>",
1949
+ "lstrip": false,
1950
+ "normalized": false,
1951
+ "rstrip": false,
1952
+ "single_word": false,
1953
+ "special": true
1954
+ },
1955
+ "128244": {
1956
+ "content": "<|reserved_special_token_239|>",
1957
+ "lstrip": false,
1958
+ "normalized": false,
1959
+ "rstrip": false,
1960
+ "single_word": false,
1961
+ "special": true
1962
+ },
1963
+ "128245": {
1964
+ "content": "<|reserved_special_token_240|>",
1965
+ "lstrip": false,
1966
+ "normalized": false,
1967
+ "rstrip": false,
1968
+ "single_word": false,
1969
+ "special": true
1970
+ },
1971
+ "128246": {
1972
+ "content": "<|reserved_special_token_241|>",
1973
+ "lstrip": false,
1974
+ "normalized": false,
1975
+ "rstrip": false,
1976
+ "single_word": false,
1977
+ "special": true
1978
+ },
1979
+ "128247": {
1980
+ "content": "<|reserved_special_token_242|>",
1981
+ "lstrip": false,
1982
+ "normalized": false,
1983
+ "rstrip": false,
1984
+ "single_word": false,
1985
+ "special": true
1986
+ },
1987
+ "128248": {
1988
+ "content": "<|reserved_special_token_243|>",
1989
+ "lstrip": false,
1990
+ "normalized": false,
1991
+ "rstrip": false,
1992
+ "single_word": false,
1993
+ "special": true
1994
+ },
1995
+ "128249": {
1996
+ "content": "<|reserved_special_token_244|>",
1997
+ "lstrip": false,
1998
+ "normalized": false,
1999
+ "rstrip": false,
2000
+ "single_word": false,
2001
+ "special": true
2002
+ },
2003
+ "128250": {
2004
+ "content": "<|reserved_special_token_245|>",
2005
+ "lstrip": false,
2006
+ "normalized": false,
2007
+ "rstrip": false,
2008
+ "single_word": false,
2009
+ "special": true
2010
+ },
2011
+ "128251": {
2012
+ "content": "<|reserved_special_token_246|>",
2013
+ "lstrip": false,
2014
+ "normalized": false,
2015
+ "rstrip": false,
2016
+ "single_word": false,
2017
+ "special": true
2018
+ },
2019
+ "128252": {
2020
+ "content": "<|reserved_special_token_247|>",
2021
+ "lstrip": false,
2022
+ "normalized": false,
2023
+ "rstrip": false,
2024
+ "single_word": false,
2025
+ "special": true
2026
+ },
2027
+ "128253": {
2028
+ "content": "<|reserved_special_token_248|>",
2029
+ "lstrip": false,
2030
+ "normalized": false,
2031
+ "rstrip": false,
2032
+ "single_word": false,
2033
+ "special": true
2034
+ },
2035
+ "128254": {
2036
+ "content": "<|reserved_special_token_249|>",
2037
+ "lstrip": false,
2038
+ "normalized": false,
2039
+ "rstrip": false,
2040
+ "single_word": false,
2041
+ "special": true
2042
+ },
2043
+ "128255": {
2044
+ "content": "<|reserved_special_token_250|>",
2045
+ "lstrip": false,
2046
+ "normalized": false,
2047
+ "rstrip": false,
2048
+ "single_word": false,
2049
+ "special": true
2050
+ }
2051
+ },
2052
+ "bos_token": "<|begin_of_text|>",
2053
+ "chat_template": "{% set loop_messages = messages %}{% for message in loop_messages %}{% set content = '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n'+ message['content'] | trim + '<|eot_id|>' %}{% if loop.index0 == 0 %}{% set content = bos_token + content %}{% endif %}{{ content }}{% endfor %}{% if add_generation_prompt %}{{ '<|start_header_id|>assistant<|end_header_id|>\n\n' }}{% endif %}",
2054
+ "clean_up_tokenization_spaces": false,
2055
+ "eos_token": "<|eot_id|>",
2056
+ "model_input_names": [
2057
+ "input_ids",
2058
+ "attention_mask"
2059
+ ],
2060
+ "model_max_length": 1000000000000000019884624838656,
2061
+ "pad_token": "<|end_of_text|>",
2062
+ "tokenizer_class": "PreTrainedTokenizerFast"
2063
+ }
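
This configuration is consumed directly by `AutoTokenizer` at load time. As a minimal sketch — assuming the `transformers` library and the hosted repo id, neither of which is specified by the file itself — the snippet below shows how the `added_tokens_decoder` entries resolve to fixed token ids and how the `chat_template` string renders a conversation:

```python
# Minimal sketch of how this tokenizer_config.json is consumed.
# Assumes `transformers` is installed; the repo id below is illustrative —
# any local path containing this config behaves the same way.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("kakaocorp/kanana-nano-2.1b-embedding")

# Special tokens declared in `added_tokens_decoder` map to fixed ids.
print(tok.bos_token, tok.bos_token_id)  # <|begin_of_text|> 128000
print(tok.eos_token, tok.eos_token_id)  # <|eot_id|> 128009
print(tok.pad_token, tok.pad_token_id)  # <|end_of_text|> 128001

# The `chat_template` wraps each message in header/eot markers, prepends the
# BOS token once, and appends an empty assistant header when
# add_generation_prompt=True.
messages = [{"role": "user", "content": "Hello!"}]
print(tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))
# <|begin_of_text|><|start_header_id|>user<|end_header_id|>
#
# Hello!<|eot_id|><|start_header_id|>assistant<|end_header_id|>
#
```

Note the deliberate split between `eos_token` (`<|eot_id|>`, closing each chat turn) and `pad_token` (`<|end_of_text|>`), which keeps padding in batched encoding distinguishable from genuine end-of-turn markers.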