Upload folder using huggingface_hub

Files changed (13)

.gitattributes +1 -0
README.md +58 -5
config.json +41 -0
generation_config.json +7 -0
huggingface-metadata.txt +20 -0
measurement.json +0 -0
model.safetensors.index.json +329 -0
output-00001-of-00003.safetensors +3 -0
output-00002-of-00003.safetensors +3 -0
output-00003-of-00003.safetensors +3 -0
special_tokens_map.json +23 -0
tokenizer.json +3 -0
tokenizer_config.json +314 -0

.gitattributes CHANGED

@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text

+tokenizer.json filter=lfs diff=lfs merge=lfs -text

README.md CHANGED

@@ -1,5 +1,58 @@
----
-license: other
-license_name: wtfpl
-license_link: LICENSE
----

+---
+license: wtfpl
+language:
+- en
+- zh
+- ja
+- de
+datasets:
+- JosephusCheung/GuanacoDataset
+- meta-math/MetaMathQA
+- jondurbin/airoboros-3.1
+- WizardLM/WizardLM_evol_instruct_V2_196k
+- RyokoAI/ShareGPT52K
+- RyokoAI/Fandom23K
+- milashkaarshif/MoeGirlPedia_wikitext_raw_archive
+- wikipedia
+- wiki_lingua
+- garage-bAInd/Open-Platypus
+- LDJnr/Puffin
+- BAAI/COIG
+- TigerResearch/tigerbot-zhihu-zh-10k
+- liwu/MNBVC
+- teknium/openhermes
+- CausalLM/Refined-Anime-Text
+- microsoft/orca-math-word-problems-200k
+- m-a-p/CodeFeedback-Filtered-Instruction
+---
+**Sorry, it's no longer available on Hugging Face. Please reach out to those who have already downloaded it. If you have a copy, please refrain from re-uploading it to Hugging Face.**
+**Due to repeated conflicts with HF and what we perceive as their repeated misuse of the "Contributor Covenant Code of Conduct," we have lost confidence in the platform and decided to temporarily suspend all new download access requests. It appears to us that HF's original intention has been abandoned in pursuit of commercialization, and they no longer prioritize the well-being of the community.**
+Demo: [![](https://huggingface.co/datasets/huggingface/badges/raw/main/open-in-hf-spaces-sm.svg)](https://huggingface.co/spaces/JosephusCheung/CausalLM-35B-long-Q6K-GGUF)
+# 35b-beta-long
+This release, CausalLM/35b-beta-long, represents the culmination of our experience and accumulated training data in fine-tuning large language models. We are open-sourcing these weights to foster development within the open-source community.
+We chose Cohere's multilingual, 35B-parameter with long context [CohereForAI/c4ai-command-r-v01] MHA model as our base. In our evaluation, it proved to be the most responsive to the quality of training data throughout the Supervised Fine-Tuning process, outperforming other open-source LLMs. Although its initial SFT/RL focuses on specific tasks and comes with a non-commercial license, we believe it's currently the best foundation for personal and internal use cases.
+Utilizing extensive factual content from web crawls, we synthesized over 30 million multi-turn dialogue data entries, grounded in multiple web-pages or documents. This process involved substantial human oversight and a data pipeline designed to ensure high quality. The model was then trained on this data in full 128K context using BF16 precision. We also incorporated widely-used open-source dialogue datasets to enhance general conversational fluency.
+Our data synthesis approach addressed crucial limitations in typical LLM training corpora. LLMs often struggle to extract thematic summaries, key information, or perform comparisons at the paragraph or document level. Therefore, we focused on generating fact-based data using multiple documents within a long context setting. This involved leveraging existing SOTA LLMs with human guidance to synthesize information through thematic summarization, information extraction, and comparison of source materials.
+This approach yielded significant improvements in model performance during fine-tuning. We observed reductions in hallucinations, enhanced long-context capabilities, and improvements in general abilities such as math, coding, and knowledge recall. The training process incorporated both the original source material and the synthesized outputs, further reinforcing the model's ability to recall and utilize abstract concepts embedded within the pre-training data. Our analysis revealed that this combination of original and synthesized data was crucial for achieving a more balanced performance profile. Intermediate checkpoints and models trained solely on synthesized data are also released for research purposes.
+Compared to the original task-specific model, our further fine-tuned model demonstrates more robust recall in long-context scenarios without requiring specific document formatting or prompt engineering. This fine-tuned model also exhibits performance comparable to models twice its size in quantifiable benchmarks.
+As this model has only undergone SFT, it may still exhibit biases or generate undesirable content. We implemented basic safety measures using open-source refusal datasets to mitigate outputs related to illegal activities, NSFW content, and violence. However, further Reinforcement Learning is necessary for robust alignment with human values.
+## Please note
+Tokenizer is different from cohere - and chat template is **ChatML**.
+Pressure Testing from: https://github.com/LeonEricsson/llmcontext
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/63468a143ea42ee2cb49ddd1/2XbONpyTeMH1qWCtE9ziH.png)
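
Note on the ChatML remark in the README above: a minimal sketch of how a ChatML prompt is usually assembled. The `<|im_start|>` and `<|im_end|>` markers are the standard ChatML delimiters; whether this repository's tokenizer maps them to dedicated special tokens is an assumption, and the helper name `build_chatml_prompt` is hypothetical. If the tokenizer ships a chat template, calling `tokenizer.apply_chat_template` from `transformers` is likely the safer route.

```python
from typing import Dict, List

# Standard ChatML delimiters (assumed to match this model's tokenizer).
IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def build_chatml_prompt(
    messages: List[Dict[str, str]],
    add_generation_prompt: bool = True,
) -> str:
    """Render a list of {"role", "content"} messages as a ChatML string."""
    parts = []
    for message in messages:
        # Each turn is: <|im_start|>role\ncontent<|im_end|>\n
        parts.append(f"{IM_START}{message['role']}\n{message['content']}{IM_END}\n")
    if add_generation_prompt:
        # Leave an open assistant turn for the model to complete.
        parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


if __name__ == "__main__":
    prompt = build_chatml_prompt(
        [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Summarize the two documents below."},
        ]
    )
    print(prompt)
```

Generation should then stop on the `<|im_end|>` marker, since a Cohere-style prompt format would not match the template this model was fine-tuned with.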

config.json ADDED

@@ -0,0 +1,41 @@
+{
+    "_name_or_path": "35b",
+    "architectures": [
+        "CohereForCausalLM"
+    ],
+    "attention_bias": false,
+    "attention_dropout": 0.0,
+    "bos_token_id": 5,
+    "eos_token_id": 6,
+    "hidden_act": "silu",
+    "hidden_size": 8192,
+    "initializer_range": 0.02,
+    "intermediate_size": 22528,
+    "layer_norm_eps": 1e-05,
+    "logit_scale": 0.0625,
+    "max_position_embeddings": 8192,
+    "model_max_length": 131072,
+    "model_type": "cohere",
+    "num_attention_heads": 64,
+    "num_hidden_layers": 40,
+    "num_key_value_heads": 64,
+    "pad_token_id": 0,
+    "pretraining_tp": 1,
+    "rms_norm_eps": 1e-05,
+    "rope_theta": 8000000.0,
+    "torch_dtype": "bfloat16",
+    "transformers_version": "4.38.2",
+    "use_cache": true,
+    "vocab_size": 256000,
+    "quantization_config": {
+        "quant_method": "exl2",
+        "version": "0.2.2",
+        "bits": 4.0,
+        "head_bits": 8,
+        "calibration": {
+            "rows": 115,
+            "length": 2048,
+            "dataset": "(default)"
+        }
+    }
+}
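
Note on the values in config.json above: a rough sketch of what they imply for model size and KV-cache memory. It assumes tied input/output embeddings, one norm weight per layer, and a single final norm weight of size `hidden_size`; those three points are assumptions not visible in this diff. Under them, the BF16 byte count equals the `total_size` of 69961662464 recorded in model.safetensors.index.json, which suggests that index describes the unquantized weights rather than the EXL2 output files. The function names are hypothetical.

```python
# Values copied from config.json in this commit.
config = {
    "hidden_size": 8192,
    "intermediate_size": 22528,
    "num_hidden_layers": 40,
    "num_attention_heads": 64,
    "num_key_value_heads": 64,
    "vocab_size": 256000,
    "model_max_length": 131072,
}

BYTES_PER_BF16 = 2


def count_parameters(cfg: dict) -> int:
    """Estimate the parameter count (assumes tied embeddings, no biases)."""
    h = cfg["hidden_size"]
    head_dim = h // cfg["num_attention_heads"]
    kv_dim = cfg["num_key_value_heads"] * head_dim
    attention = 2 * h * h + 2 * h * kv_dim      # q_proj, o_proj + k_proj, v_proj
    mlp = 3 * h * cfg["intermediate_size"]      # gate_proj, up_proj, down_proj
    layer_norm = h                              # input_layernorm weight
    per_layer = attention + mlp + layer_norm
    embeddings = cfg["vocab_size"] * h          # shared with the output head (assumed)
    final_norm = h                              # assumed final norm weight
    return cfg["num_hidden_layers"] * per_layer + embeddings + final_norm


def kv_cache_bytes_per_token(cfg: dict, bytes_per_value: int = BYTES_PER_BF16) -> int:
    """Bytes of K and V cache stored per token across all layers."""
    head_dim = cfg["hidden_size"] // cfg["num_attention_heads"]
    per_layer = 2 * cfg["num_key_value_heads"] * head_dim  # one K and one V vector
    return cfg["num_hidden_layers"] * per_layer * bytes_per_value


if __name__ == "__main__":
    params = count_parameters(config)
    print(f"parameters:        {params:,}")
    print(f"BF16 weight bytes: {params * BYTES_PER_BF16:,}")
    per_token = kv_cache_bytes_per_token(config)
    full_context = per_token * config["model_max_length"]
    print(f"KV cache / token:  {per_token:,} bytes")
    print(f"KV cache at 128K:  {full_context / 2**30:.0f} GiB")
```

Because `num_key_value_heads` equals `num_attention_heads` (plain MHA, no grouped-query attention), an unquantized 16-bit cache costs 1.25 MiB per token, or about 160 GiB at the full 131072-token length, so a quantized or shortened cache is likely needed in practice.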

generation_config.json ADDED

@@ -0,0 +1,7 @@
+{
+  "_from_model_config": true,
+  "bos_token_id": 5,
+  "eos_token_id": 6,
+  "pad_token_id": 0,
+  "transformers_version": "4.38.2"
+}

huggingface-metadata.txt ADDED

@@ -0,0 +1,20 @@
+url: https://huggingface.co/CausalLM/35b-beta-long
+branch: main
+download date: 2024-09-15 22:30:39
+sha256sum:
+    5428fa31fd03765d5c0eb14d3680ba058ee1e0eca4b25140092bb9d669914bbf model-00001-of-00015.safetensors
+    c8d70c9ce69e42faf9616e1bda1448c2766f00fcaa20800d3beda8302cbb8e5c model-00002-of-00015.safetensors
+    10f1162a4f10ebf07324293635b6b9ee3509a835dc47a409abd87b92203f4d26 model-00003-of-00015.safetensors
+    6d3d16b8c67947bbfbd37c3b235f50337aa4e0b8450a5c1f21d216bb75456e59 model-00004-of-00015.safetensors
+    120a4429056e6efd04b1d2756b3e625bd829c4b458203d1cbf1e2e8a7b678489 model-00005-of-00015.safetensors
+    478c89965e4390aa458d52bbef525f95cb69eb277db1f8454ad3b0dbd8b52b7c model-00006-of-00015.safetensors
+    5d71536b1c2a5c33f27330b19010f7493c1599898207dc57aa1e7e38767a4c2b model-00007-of-00015.safetensors
+    57a64b41fcf22f9fd4855f542dac9d99aae242c9ad1245d34d2b71c428fe32aa model-00008-of-00015.safetensors
+    7ad83531189bb6d9456710a903396ec02987be03f6539048b85f1ac25a01dd10 model-00009-of-00015.safetensors
+    b18986af87bed9d98c7b9deff616540b7721c379113668191bf8f848e5a050fc model-00010-of-00015.safetensors
+    5ac15fdc4368f7a3532b7e114938aa5e8e50db07f01962bc3801240b9939d9c1 model-00011-of-00015.safetensors
+    b9ae9ccf809835bfd3c3466c80b1377da957896b34a7090614a508220bd7c1df model-00012-of-00015.safetensors
+    b058ba038b230322ef83091c8a91731d384eb6ca11058a9f5df38a7c3da3df83 model-00013-of-00015.safetensors
+    7f6db7c3e17ce948ac5202197613ad977a2dd6a8e474e30076f5146571a4a0a4 model-00014-of-00015.safetensors
+    cd1fabad5e9533b25b07d107ba37c5f580bc7a8c1871b794baefdae3fa976b76 model-00015-of-00015.safetensors
+    3ec24d1fe80ac960489b2004b7399ea561799de2fae774bd5a9234c13e6a0726 tokenizer.json
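
Note on the checksum list in huggingface-metadata.txt above: a minimal sketch of checking local files against it, using only the Python standard library. It assumes the raw file has the layout shown (a `sha256sum:` line followed by indented `<hash> <filename>` pairs, without the leading `+` of the diff view). The function names are hypothetical. The listed `model-*-of-00015.safetensors` shards belong to the original download, so in this repository only `tokenizer.json` is expected to be present and match.

```python
import hashlib
from pathlib import Path
from typing import Dict


def parse_sha256_section(metadata_text: str) -> Dict[str, str]:
    """Return {filename: expected_hex_digest} from the metadata text."""
    expected: Dict[str, str] = {}
    in_section = False
    for line in metadata_text.splitlines():
        stripped = line.strip()
        if stripped == "sha256sum:":
            in_section = True
            continue
        parts = stripped.split()
        # A checksum entry is a 64-character hex digest followed by a filename.
        if in_section and len(parts) == 2 and len(parts[0]) == 64:
            expected[parts[1]] = parts[0].lower()
    return expected


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so multi-gigabyte shards do not need to fit in RAM."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_directory(directory: Path, expected: Dict[str, str]) -> Dict[str, bool]:
    """Map each listed filename to True if present and matching, else False."""
    results: Dict[str, bool] = {}
    for name, digest in expected.items():
        candidate = directory / name
        results[name] = candidate.is_file() and sha256_of_file(candidate) == digest
    return results


if __name__ == "__main__":
    metadata_path = Path("huggingface-metadata.txt")
    if metadata_path.is_file():
        checks = verify_directory(Path("."), parse_sha256_section(metadata_path.read_text()))
        for name, ok in checks.items():
            print(f"{'OK      ' if ok else 'MISMATCH'} {name}")
```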

measurement.json ADDED

The diff for this file is too large to render.

model.safetensors.index.json ADDED

@@ -0,0 +1,329 @@
+{
+  "metadata": {
+    "total_size": 69961662464
+  },
+  "weight_map": {
+    "model.embed_tokens.weight": "model-00001-of-00015.safetensors",
+    "model.layers.0.input_layernorm.weight": "model-00002-of-00015.safetensors",
+    "model.layers.0.mlp.down_proj.weight": "model-00002-of-00015.safetensors",
+    "model.layers.0.mlp.gate_proj.weight": "model-00002-of-00015.safetensors",
+    "model.layers.0.mlp.up_proj.weight": "model-00002-of-00015.safetensors",
+    "model.layers.0.self_attn.k_proj.weight": "model-00001-of-00015.safetensors",
+    "model.layers.0.self_attn.o_proj.weight": "model-00001-of-00015.safetensors",
+    "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00015.safetensors",
+    "model.layers.0.self_attn.v_proj.weight": "model-00001-of-00015.safetensors",
+    "model.layers.1.input_layernorm.weight": "model-00002-of-00015.safetensors",
+    "model.layers.1.mlp.down_proj.weight": "model-00002-of-00015.safetensors",
+    "model.layers.1.mlp.gate_proj.weight": "model-00002-of-00015.safetensors",
+    "model.layers.1.mlp.up_proj.weight": "model-00002-of-00015.safetensors",
+    "model.layers.1.self_attn.k_proj.weight": "model-00002-of-00015.safetensors",
+    "model.layers.1.self_attn.o_proj.weight": "model-00002-of-00015.safetensors",
+    "model.layers.1.self_attn.q_proj.weight": "model-00002-of-00015.safetensors",
+    "model.layers.1.self_attn.v_proj.weight": "model-00002-of-00015.safetensors",
+    "model.layers.10.input_layernorm.weight": "model-00005-of-00015.safetensors",
+    "model.layers.10.mlp.down_proj.weight": "model-00005-of-00015.safetensors",
+    "model.layers.10.mlp.gate_proj.weight": "model-00005-of-00015.safetensors",
+    "model.layers.10.mlp.up_proj.weight": "model-00005-of-00015.safetensors",
+    "model.layers.10.self_attn.k_proj.weight": "model-00005-of-00015.safetensors",
+    "model.layers.10.self_attn.o_proj.weight": "model-00005-of-00015.safetensors",
+    "model.layers.10.self_attn.q_proj.weight": "model-00005-of-00015.safetensors",
+    "model.layers.10.self_attn.v_proj.weight": "model-00005-of-00015.safetensors",
+    "model.layers.11.input_layernorm.weight": "model-00005-of-00015.safetensors",
+    "model.layers.11.mlp.down_proj.weight": "model-00005-of-00015.safetensors",
+    "model.layers.11.mlp.gate_proj.weight": "model-00005-of-00015.safetensors",
+    "model.layers.11.mlp.up_proj.weight": "model-00005-of-00015.safetensors",
+    "model.layers.11.self_attn.k_proj.weight": "model-00005-of-00015.safetensors",
+    "model.layers.11.self_attn.o_proj.weight": "model-00005-of-00015.safetensors",
+    "model.layers.11.self_attn.q_proj.weight": "model-00005-of-00015.safetensors",
+    "model.layers.11.self_attn.v_proj.weight": "model-00005-of-00015.safetensors",
+    "model.layers.12.input_layernorm.weight": "model-00006-of-00015.safetensors",
+    "model.layers.12.mlp.down_proj.weight": "model-00006-of-00015.safetensors",
+    "model.layers.12.mlp.gate_proj.weight": "model-00006-of-00015.safetensors",
+    "model.layers.12.mlp.up_proj.weight": "model-00006-of-00015.safetensors",
+    "model.layers.12.self_attn.k_proj.weight": "model-00005-of-00015.safetensors",
+    "model.layers.12.self_attn.o_proj.weight": "model-00005-of-00015.safetensors",
+    "model.layers.12.self_attn.q_proj.weight": "model-00005-of-00015.safetensors",
+    "model.layers.12.self_attn.v_proj.weight": "model-00005-of-00015.safetensors",
+    "model.layers.13.input_layernorm.weight": "model-00006-of-00015.safetensors",
+    "model.layers.13.mlp.down_proj.weight": "model-00006-of-00015.safetensors",
+    "model.layers.13.mlp.gate_proj.weight": "model-00006-of-00015.safetensors",
+    "model.layers.13.mlp.up_proj.weight": "model-00006-of-00015.safetensors",
+    "model.layers.13.self_attn.k_proj.weight": "model-00006-of-00015.safetensors",
+    "model.layers.13.self_attn.o_proj.weight": "model-00006-of-00015.safetensors",
+    "model.layers.13.self_attn.q_proj.weight": "model-00006-of-00015.safetensors",
+    "model.layers.13.self_attn.v_proj.weight": "model-00006-of-00015.safetensors",
+    "model.layers.14.input_layernorm.weight": "model-00006-of-00015.safetensors",
+    "model.layers.14.mlp.down_proj.weight": "model-00006-of-00015.safetensors",
+    "model.layers.14.mlp.gate_proj.weight": "model-00006-of-00015.safetensors",
+    "model.layers.14.mlp.up_proj.weight": "model-00006-of-00015.safetensors",
+    "model.layers.14.self_attn.k_proj.weight": "model-00006-of-00015.safetensors",
+    "model.layers.14.self_attn.o_proj.weight": "model-00006-of-00015.safetensors",
+    "model.layers.14.self_attn.q_proj.weight": "model-00006-of-00015.safetensors",
+    "model.layers.14.self_attn.v_proj.weight": "model-00006-of-00015.safetensors",
+    "model.layers.15.input_layernorm.weight": "model-00007-of-00015.safetensors",
+    "model.layers.15.mlp.down_proj.weight": "model-00007-of-00015.safetensors",
+    "model.layers.15.mlp.gate_proj.weight": "model-00007-of-00015.safetensors",
+    "model.layers.15.mlp.up_proj.weight": "model-00007-of-00015.safetensors",
+    "model.layers.15.self_attn.k_proj.weight": "model-00006-of-00015.safetensors",
+    "model.layers.15.self_attn.o_proj.weight": "model-00006-of-00015.safetensors",
+    "model.layers.15.self_attn.q_proj.weight": "model-00006-of-00015.safetensors",
+    "model.layers.15.self_attn.v_proj.weight": "model-00006-of-00015.safetensors",
+    "model.layers.16.input_layernorm.weight": "model-00007-of-00015.safetensors",
+    "model.layers.16.mlp.down_proj.weight": "model-00007-of-00015.safetensors",
+    "model.layers.16.mlp.gate_proj.weight": "model-00007-of-00015.safetensors",
+    "model.layers.16.mlp.up_proj.weight": "model-00007-of-00015.safetensors",
+    "model.layers.16.self_attn.k_proj.weight": "model-00007-of-00015.safetensors",
+    "model.layers.16.self_attn.o_proj.weight": "model-00007-of-00015.safetensors",
+    "model.layers.16.self_attn.q_proj.weight": "model-00007-of-00015.safetensors",
+    "model.layers.16.self_attn.v_proj.weight": "model-00007-of-00015.safetensors",
+    "model.layers.17.input_layernorm.weight": "model-00007-of-00015.safetensors",
+    "model.layers.17.mlp.down_proj.weight": "model-00007-of-00015.safetensors",
+    "model.layers.17.mlp.gate_proj.weight": "model-00007-of-00015.safetensors",
+    "model.layers.17.mlp.up_proj.weight": "model-00007-of-00015.safetensors",
+    "model.layers.17.self_attn.k_proj.weight": "model-00007-of-00015.safetensors",
+    "model.layers.17.self_attn.o_proj.weight": "model-00007-of-00015.safetensors",
+    "model.layers.17.self_attn.q_proj.weight": "model-00007-of-00015.safetensors",
+    "model.layers.17.self_attn.v_proj.weight": "model-00007-of-00015.safetensors",
+    "model.layers.18.input_layernorm.weight": "model-00008-of-00015.safetensors",
+    "model.layers.18.mlp.down_proj.weight": "model-00008-of-00015.safetensors",
+    "model.layers.18.mlp.gate_proj.weight": "model-00008-of-00015.safetensors",
+    "model.layers.18.mlp.up_proj.weight": "model-00008-of-00015.safetensors",
+    "model.layers.18.self_attn.k_proj.weight": "model-00007-of-00015.safetensors",
+    "model.layers.18.self_attn.o_proj.weight": "model-00007-of-00015.safetensors",
+    "model.layers.18.self_attn.q_proj.weight": "model-00007-of-00015.safetensors",
+    "model.layers.18.self_attn.v_proj.weight": "model-00007-of-00015.safetensors",
+    "model.layers.19.input_layernorm.weight": "model-00008-of-00015.safetensors",
+    "model.layers.19.mlp.down_proj.weight": "model-00008-of-00015.safetensors",
+    "model.layers.19.mlp.gate_proj.weight": "model-00008-of-00015.safetensors",
+    "model.layers.19.mlp.up_proj.weight": "model-00008-of-00015.safetensors",
+    "model.layers.19.self_attn.k_proj.weight": "model-00008-of-00015.safetensors",
+    "model.layers.19.self_attn.o_proj.weight": "model-00008-of-00015.safetensors",
+    "model.layers.19.self_attn.q_proj.weight": "model-00008-of-00015.safetensors",
+    "model.layers.19.self_attn.v_proj.weight": "model-00008-of-00015.safetensors",
+    "model.layers.2.input_layernorm.weight": "model-00002-of-00015.safetensors",
+    "model.layers.2.mlp.down_proj.weight": "model-00002-of-00015.safetensors",
+    "model.layers.2.mlp.gate_proj.weight": "model-00002-of-00015.safetensors",
+    "model.layers.2.mlp.up_proj.weight": "model-00002-of-00015.safetensors",
+    "model.layers.2.self_attn.k_proj.weight": "model-00002-of-00015.safetensors",
+    "model.layers.2.self_attn.o_proj.weight": "model-00002-of-00015.safetensors",
+    "model.layers.2.self_attn.q_proj.weight": "model-00002-of-00015.safetensors",
+    "model.layers.2.self_attn.v_proj.weight": "model-00002-of-00015.safetensors",
+    "model.layers.20.input_layernorm.weight": "model-00008-of-00015.safetensors",
+    "model.layers.20.mlp.down_proj.weight": "model-00008-of-00015.safetensors",
+    "model.layers.20.mlp.gate_proj.weight": "model-00008-of-00015.safetensors",
+    "model.layers.20.mlp.up_proj.weight": "model-00008-of-00015.safetensors",
+    "model.layers.20.self_attn.k_proj.weight": "model-00008-of-00015.safetensors",
+    "model.layers.20.self_attn.o_proj.weight": "model-00008-of-00015.safetensors",
+    "model.layers.20.self_attn.q_proj.weight": "model-00008-of-00015.safetensors",
+    "model.layers.20.self_attn.v_proj.weight": "model-00008-of-00015.safetensors",
+    "model.layers.21.input_layernorm.weight": "model-00009-of-00015.safetensors",
+    "model.layers.21.mlp.down_proj.weight": "model-00009-of-00015.safetensors",
+    "model.layers.21.mlp.gate_proj.weight": "model-00009-of-00015.safetensors",
+    "model.layers.21.mlp.up_proj.weight": "model-00009-of-00015.safetensors",
+    "model.layers.21.self_attn.k_proj.weight": "model-00008-of-00015.safetensors",
+    "model.layers.21.self_attn.o_proj.weight": "model-00008-of-00015.safetensors",
+    "model.layers.21.self_attn.q_proj.weight": "model-00008-of-00015.safetensors",
+    "model.layers.21.self_attn.v_proj.weight": "model-00008-of-00015.safetensors",
+    "model.layers.22.input_layernorm.weight": "model-00009-of-00015.safetensors",
+    "model.layers.22.mlp.down_proj.weight": "model-00009-of-00015.safetensors",
+    "model.layers.22.mlp.gate_proj.weight": "model-00009-of-00015.safetensors",
+    "model.layers.22.mlp.up_proj.weight": "model-00009-of-00015.safetensors",
+    "model.layers.22.self_attn.k_proj.weight": "model-00009-of-00015.safetensors",
+    "model.layers.22.self_attn.o_proj.weight": "model-00009-of-00015.safetensors",
+    "model.layers.22.self_attn.q_proj.weight": "model-00009-of-00015.safetensors",
+    "model.layers.22.self_attn.v_proj.weight": "model-00009-of-00015.safetensors",
+    "model.layers.23.input_layernorm.weight": "model-00009-of-00015.safetensors",
+    "model.layers.23.mlp.down_proj.weight": "model-00009-of-00015.safetensors",
+    "model.layers.23.mlp.gate_proj.weight": "model-00009-of-00015.safetensors",
+    "model.layers.23.mlp.up_proj.weight": "model-00009-of-00015.safetensors",
+    "model.layers.23.self_attn.k_proj.weight": "model-00009-of-00015.safetensors",
+    "model.layers.23.self_attn.o_proj.weight": "model-00009-of-00015.safetensors",
+    "model.layers.23.self_attn.q_proj.weight": "model-00009-of-00015.safetensors",
+    "model.layers.23.self_attn.v_proj.weight": "model-00009-of-00015.safetensors",
+    "model.layers.24.input_layernorm.weight": "model-00010-of-00015.safetensors",
+    "model.layers.24.mlp.down_proj.weight": "model-00010-of-00015.safetensors",
+    "model.layers.24.mlp.gate_proj.weight": "model-00010-of-00015.safetensors",
+    "model.layers.24.mlp.up_proj.weight": "model-00010-of-00015.safetensors",
+    "model.layers.24.self_attn.k_proj.weight": "model-00009-of-00015.safetensors",
+    "model.layers.24.self_attn.o_proj.weight": "model-00009-of-00015.safetensors",
+    "model.layers.24.self_attn.q_proj.weight": "model-00009-of-00015.safetensors",
+    "model.layers.24.self_attn.v_proj.weight": "model-00009-of-00015.safetensors",
+    "model.layers.25.input_layernorm.weight": "model-00010-of-00015.safetensors",
+    "model.layers.25.mlp.down_proj.weight": "model-00010-of-00015.safetensors",
+    "model.layers.25.mlp.gate_proj.weight": "model-00010-of-00015.safetensors",
+    "model.layers.25.mlp.up_proj.weight": "model-00010-of-00015.safetensors",
+    "model.layers.25.self_attn.k_proj.weight": "model-00010-of-00015.safetensors",
+    "model.layers.25.self_attn.o_proj.weight": "model-00010-of-00015.safetensors",
+    "model.layers.25.self_attn.q_proj.weight": "model-00010-of-00015.safetensors",
+    "model.layers.25.self_attn.v_proj.weight": "model-00010-of-00015.safetensors",
+    "model.layers.26.input_layernorm.weight": "model-00010-of-00015.safetensors",
+    "model.layers.26.mlp.down_proj.weight": "model-00010-of-00015.safetensors",
+    "model.layers.26.mlp.gate_proj.weight": "model-00010-of-00015.safetensors",
+    "model.layers.26.mlp.up_proj.weight": "model-00010-of-00015.safetensors",
+    "model.layers.26.self_attn.k_proj.weight": "model-00010-of-00015.safetensors",
+    "model.layers.26.self_attn.o_proj.weight": "model-00010-of-00015.safetensors",
+    "model.layers.26.self_attn.q_proj.weight": "model-00010-of-00015.safetensors",
+    "model.layers.26.self_attn.v_proj.weight": "model-00010-of-00015.safetensors",
+    "model.layers.27.input_layernorm.weight": "model-00011-of-00015.safetensors",
+    "model.layers.27.mlp.down_proj.weight": "model-00011-of-00015.safetensors",
+    "model.layers.27.mlp.gate_proj.weight": "model-00011-of-00015.safetensors",
+    "model.layers.27.mlp.up_proj.weight": "model-00011-of-00015.safetensors",
+    "model.layers.27.self_attn.k_proj.weight": "model-00010-of-00015.safetensors",
+    "model.layers.27.self_attn.o_proj.weight": "model-00010-of-00015.safetensors",
+    "model.layers.27.self_attn.q_proj.weight": "model-00010-of-00015.safetensors",
+    "model.layers.27.self_attn.v_proj.weight": "model-00010-of-00015.safetensors",
+    "model.layers.28.input_layernorm.weight": "model-00011-of-00015.safetensors",
+    "model.layers.28.mlp.down_proj.weight": "model-00011-of-00015.safetensors",
+    "model.layers.28.mlp.gate_proj.weight": "model-00011-of-00015.safetensors",
+    "model.layers.28.mlp.up_proj.weight": "model-00011-of-00015.safetensors",
+    "model.layers.28.self_attn.k_proj.weight": "model-00011-of-00015.safetensors",
+    "model.layers.28.self_attn.o_proj.weight": "model-00011-of-00015.safetensors",
+    "model.layers.28.self_attn.q_proj.weight": "model-00011-of-00015.safetensors",
+    "model.layers.28.self_attn.v_proj.weight": "model-00011-of-00015.safetensors",
+    "model.layers.29.input_layernorm.weight": "model-00011-of-00015.safetensors",
+    "model.layers.29.mlp.down_proj.weight": "model-00011-of-00015.safetensors",
+    "model.layers.29.mlp.gate_proj.weight": "model-00011-of-00015.safetensors",
+    "model.layers.29.mlp.up_proj.weight": "model-00011-of-00015.safetensors",
+    "model.layers.29.self_attn.k_proj.weight": "model-00011-of-00015.safetensors",
+    "model.layers.29.self_attn.o_proj.weight": "model-00011-of-00015.safetensors",
+    "model.layers.29.self_attn.q_proj.weight": "model-00011-of-00015.safetensors",
+    "model.layers.29.self_attn.v_proj.weight": "model-00011-of-00015.safetensors",
+    "model.layers.3.input_layernorm.weight": "model-00003-of-00015.safetensors",
+    "model.layers.3.mlp.down_proj.weight": "model-00003-of-00015.safetensors",
+    "model.layers.3.mlp.gate_proj.weight": "model-00003-of-00015.safetensors",
+    "model.layers.3.mlp.up_proj.weight": "model-00003-of-00015.safetensors",
+    "model.layers.3.self_attn.k_proj.weight": "model-00002-of-00015.safetensors",
+    "model.layers.3.self_attn.o_proj.weight": "model-00002-of-00015.safetensors",
+    "model.layers.3.self_attn.q_proj.weight": "model-00002-of-00015.safetensors",
+    "model.layers.3.self_attn.v_proj.weight": "model-00002-of-00015.safetensors",
+    "model.layers.30.input_layernorm.weight": "model-00012-of-00015.safetensors",
+    "model.layers.30.mlp.down_proj.weight": "model-00012-of-00015.safetensors",
+    "model.layers.30.mlp.gate_proj.weight": "model-00012-of-00015.safetensors",
+    "model.layers.30.mlp.up_proj.weight": "model-00012-of-00015.safetensors",
+    "model.layers.30.self_attn.k_proj.weight": "model-00011-of-00015.safetensors",
+    "model.layers.30.self_attn.o_proj.weight": "model-00011-of-00015.safetensors",
+    "model.layers.30.self_attn.q_proj.weight": "model-00011-of-00015.safetensors",
+    "model.layers.30.self_attn.v_proj.weight": "model-00011-of-00015.safetensors",
+    "model.layers.31.input_layernorm.weight": "model-00012-of-00015.safetensors",
+    "model.layers.31.mlp.down_proj.weight": "model-00012-of-00015.safetensors",
+    "model.layers.31.mlp.gate_proj.weight": "model-00012-of-00015.safetensors",
+    "model.layers.31.mlp.up_proj.weight": "model-00012-of-00015.safetensors",
+    "model.layers.31.self_attn.k_proj.weight": "model-00012-of-00015.safetensors",
+    "model.layers.31.self_attn.o_proj.weight": "model-00012-of-00015.safetensors",
+    "model.layers.31.self_attn.q_proj.weight": "model-00012-of-00015.safetensors",
+    "model.layers.31.self_attn.v_proj.weight": "model-00012-of-00015.safetensors",
+    "model.layers.32.input_layernorm.weight": "model-00012-of-00015.safetensors",
+    "model.layers.32.mlp.down_proj.weight": "model-00012-of-00015.safetensors",
+    "model.layers.32.mlp.gate_proj.weight": "model-00012-of-00015.safetensors",
+    "model.layers.32.mlp.up_proj.weight": "model-00012-of-00015.safetensors",
+    "model.layers.32.self_attn.k_proj.weight": "model-00012-of-00015.safetensors",
+    "model.layers.32.self_attn.o_proj.weight": "model-00012-of-00015.safetensors",
+    "model.layers.32.self_attn.q_proj.weight": "model-00012-of-00015.safetensors",
+    "model.layers.32.self_attn.v_proj.weight": "model-00012-of-00015.safetensors",
+    "model.layers.33.input_layernorm.weight": "model-00013-of-00015.safetensors",
+    "model.layers.33.mlp.down_proj.weight": "model-00013-of-00015.safetensors",
+    "model.layers.33.mlp.gate_proj.weight": "model-00013-of-00015.safetensors",
+    "model.layers.33.mlp.up_proj.weight": "model-00013-of-00015.safetensors",
+    "model.layers.33.self_attn.k_proj.weight": "model-00012-of-00015.safetensors",
+    "model.layers.33.self_attn.o_proj.weight": "model-00012-of-00015.safetensors",
+    "model.layers.33.self_attn.q_proj.weight": "model-00012-of-00015.safetensors",
+    "model.layers.33.self_attn.v_proj.weight": "model-00012-of-00015.safetensors",
+    "model.layers.34.input_layernorm.weight": "model-00013-of-00015.safetensors",
+    "model.layers.34.mlp.down_proj.weight": "model-00013-of-00015.safetensors",
+    "model.layers.34.mlp.gate_proj.weight": "model-00013-of-00015.safetensors",
+    "model.layers.34.mlp.up_proj.weight": "model-00013-of-00015.safetensors",
+    "model.layers.34.self_attn.k_proj.weight": "model-00013-of-00015.safetensors",
+    "model.layers.34.self_attn.o_proj.weight": "model-00013-of-00015.safetensors",
+    "model.layers.34.self_attn.q_proj.weight": "model-00013-of-00015.safetensors",
+    "model.layers.34.self_attn.v_proj.weight": "model-00013-of-00015.safetensors",
+    "model.layers.35.input_layernorm.weight": "model-00013-of-00015.safetensors",
+    "model.layers.35.mlp.down_proj.weight": "model-00013-of-00015.safetensors",
+    "model.layers.35.mlp.gate_proj.weight": "model-00013-of-00015.safetensors",
+    "model.layers.35.mlp.up_proj.weight": "model-00013-of-00015.safetensors",
+    "model.layers.35.self_attn.k_proj.weight": "model-00013-of-00015.safetensors",
+    "model.layers.35.self_attn.o_proj.weight": "model-00013-of-00015.safetensors",
+    "model.layers.35.self_attn.q_proj.weight": "model-00013-of-00015.safetensors",
+    "model.layers.35.self_attn.v_proj.weight": "model-00013-of-00015.safetensors",
+    "model.layers.36.input_layernorm.weight": "model-00014-of-00015.safetensors",
+    "model.layers.36.mlp.down_proj.weight": "model-00014-of-00015.safetensors",
+    "model.layers.36.mlp.gate_proj.weight": "model-00014-of-00015.safetensors",
+    "model.layers.36.mlp.up_proj.weight": "model-00014-of-00015.safetensors",
+    "model.layers.36.self_attn.k_proj.weight": "model-00013-of-00015.safetensors",
+    "model.layers.36.self_attn.o_proj.weight": "model-00013-of-00015.safetensors",
+    "model.layers.36.self_attn.q_proj.weight": "model-00013-of-00015.safetensors",
+    "model.layers.36.self_attn.v_proj.weight": "model-00013-of-00015.safetensors",
+    "model.layers.37.input_layernorm.weight": "model-00014-of-00015.safetensors",
+    "model.layers.37.mlp.down_proj.weight": "model-00014-of-00015.safetensors",
+    "model.layers.37.mlp.gate_proj.weight": "model-00014-of-00015.safetensors",
+    "model.layers.37.mlp.up_proj.weight": "model-00014-of-00015.safetensors",
+    "model.layers.37.self_attn.k_proj.weight": "model-00014-of-00015.safetensors",
+    "model.layers.37.self_attn.o_proj.weight": "model-00014-of-00015.safetensors",
+    "model.layers.37.self_attn.q_proj.weight": "model-00014-of-00015.safetensors",
+    "model.layers.37.self_attn.v_proj.weight": "model-00014-of-00015.safetensors",
+    "model.layers.38.input_layernorm.weight": "model-00014-of-00015.safetensors",
+    "model.layers.38.mlp.down_proj.weight": "model-00014-of-00015.safetensors",
+    "model.layers.38.mlp.gate_proj.weight": "model-00014-of-00015.safetensors",
+    "model.layers.38.mlp.up_proj.weight": "model-00014-of-00015.safetensors",
+    "model.layers.38.self_attn.k_proj.weight": "model-00014-of-00015.safetensors",
+    "model.layers.38.self_attn.o_proj.weight": "model-00014-of-00015.safetensors",
+    "model.layers.38.self_attn.q_proj.weight": "model-00014-of-00015.safetensors",
+    "model.layers.38.self_attn.v_proj.weight": "model-00014-of-00015.safetensors",
+    "model.layers.39.input_layernorm.weight": "model-00015-of-00015.safetensors",
+    "model.layers.39.mlp.down_proj.weight": "model-00015-of-00015.safetensors",
+    "model.layers.39.mlp.gate_proj.weight": "model-00015-of-00015.safetensors",
+    "model.layers.39.mlp.up_proj.weight": "model-00015-of-00015.safetensors",
+    "model.layers.39.self_attn.k_proj.weight": "model-00014-of-00015.safetensors",
+    "model.layers.39.self_attn.o_proj.weight": "model-00014-of-00015.safetensors",
+    "model.layers.39.self_attn.q_proj.weight": "model-00014-of-00015.safetensors",
+    "model.layers.39.self_attn.v_proj.weight": "model-00014-of-00015.safetensors",
+    "model.layers.4.input_layernorm.weight": "model-00003-of-00015.safetensors",
+    "model.layers.4.mlp.down_proj.weight": "model-00003-of-00015.safetensors",
+    "model.layers.4.mlp.gate_proj.weight": "model-00003-of-00015.safetensors",
+    "model.layers.4.mlp.up_proj.weight": "model-00003-of-00015.safetensors",
+    "model.layers.4.self_attn.k_proj.weight": "model-00003-of-00015.safetensors",
+    "model.layers.4.self_attn.o_proj.weight": "model-00003-of-00015.safetensors",
+    "model.layers.4.self_attn.q_proj.weight": "model-00003-of-00015.safetensors",
+    "model.layers.4.self_attn.v_proj.weight": "model-00003-of-00015.safetensors",
+    "model.layers.5.input_layernorm.weight": "model-00003-of-00015.safetensors",
+    "model.layers.5.mlp.down_proj.weight": "model-00003-of-00015.safetensors",
+    "model.layers.5.mlp.gate_proj.weight": "model-00003-of-00015.safetensors",
+    "model.layers.5.mlp.up_proj.weight": "model-00003-of-00015.safetensors",
+    "model.layers.5.self_attn.k_proj.weight": "model-00003-of-00015.safetensors",
+    "model.layers.5.self_attn.o_proj.weight": "model-00003-of-00015.safetensors",
+    "model.layers.5.self_attn.q_proj.weight": "model-00003-of-00015.safetensors",
+    "model.layers.5.self_attn.v_proj.weight": "model-00003-of-00015.safetensors",
+    "model.layers.6.input_layernorm.weight": "model-00004-of-00015.safetensors",
+    "model.layers.6.mlp.down_proj.weight": "model-00004-of-00015.safetensors",
+    "model.layers.6.mlp.gate_proj.weight": "model-00004-of-00015.safetensors",
+    "model.layers.6.mlp.up_proj.weight": "model-00004-of-00015.safetensors",
+    "model.layers.6.self_attn.k_proj.weight": "model-00003-of-00015.safetensors",
+    "model.layers.6.self_attn.o_proj.weight": "model-00003-of-00015.safetensors",
+    "model.layers.6.self_attn.q_proj.weight": "model-00003-of-00015.safetensors",
+    "model.layers.6.self_attn.v_proj.weight": "model-00003-of-00015.safetensors",
+    "model.layers.7.input_layernorm.weight": "model-00004-of-00015.safetensors",
+    "model.layers.7.mlp.down_proj.weight": "model-00004-of-00015.safetensors",
+    "model.layers.7.mlp.gate_proj.weight": "model-00004-of-00015.safetensors",
+    "model.layers.7.mlp.up_proj.weight": "model-00004-of-00015.safetensors",
+    "model.layers.7.self_attn.k_proj.weight": "model-00004-of-00015.safetensors",
+    "model.layers.7.self_attn.o_proj.weight": "model-00004-of-00015.safetensors",
+    "model.layers.7.self_attn.q_proj.weight": "model-00004-of-00015.safetensors",
+    "model.layers.7.self_attn.v_proj.weight": "model-00004-of-00015.safetensors",
+    "model.layers.8.input_layernorm.weight": "model-00004-of-00015.safetensors",
+    "model.layers.8.mlp.down_proj.weight": "model-00004-of-00015.safetensors",
+    "model.layers.8.mlp.gate_proj.weight": "model-00004-of-00015.safetensors",
+    "model.layers.8.mlp.up_proj.weight": "model-00004-of-00015.safetensors",
+    "model.layers.8.self_attn.k_proj.weight": "model-00004-of-00015.safetensors",
+    "model.layers.8.self_attn.o_proj.weight": "model-00004-of-00015.safetensors",
+    "model.layers.8.self_attn.q_proj.weight": "model-00004-of-00015.safetensors",
+    "model.layers.8.self_attn.v_proj.weight": "model-00004-of-00015.safetensors",
+    "model.layers.9.input_layernorm.weight": "model-00005-of-00015.safetensors",
+    "model.layers.9.mlp.down_proj.weight": "model-00005-of-00015.safetensors",
+    "model.layers.9.mlp.gate_proj.weight": "model-00005-of-00015.safetensors",
+    "model.layers.9.mlp.up_proj.weight": "model-00005-of-00015.safetensors",
+    "model.layers.9.self_attn.k_proj.weight": "model-00004-of-00015.safetensors",
+    "model.layers.9.self_attn.o_proj.weight": "model-00004-of-00015.safetensors",
+    "model.layers.9.self_attn.q_proj.weight": "model-00004-of-00015.safetensors",
+    "model.layers.9.self_attn.v_proj.weight": "model-00004-of-00015.safetensors",
+    "model.norm.weight": "model-00015-of-00015.safetensors"
+  }
+}
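
The index file above maps each tensor name to the shard file that stores it. A minimal sketch, assuming the standard `weight_map` layout of a safetensors index, of how the mapping might be inverted to list the tensors per shard. The helper name `group_by_shard` and the four-entry excerpt are illustrative only and are not part of the commit:

```python
from collections import defaultdict


def group_by_shard(weight_map):
    """Invert a tensor-name -> shard-file mapping into shard-file -> sorted tensor names."""
    shards = defaultdict(list)
    for tensor_name, shard_file in weight_map.items():
        shards[shard_file].append(tensor_name)
    return {shard: sorted(names) for shard, names in shards.items()}


# Excerpt copied from the index above; layer 39 is split across two shards.
excerpt = {
    "model.layers.39.input_layernorm.weight": "model-00015-of-00015.safetensors",
    "model.layers.39.mlp.down_proj.weight": "model-00015-of-00015.safetensors",
    "model.layers.39.self_attn.k_proj.weight": "model-00014-of-00015.safetensors",
    "model.norm.weight": "model-00015-of-00015.safetensors",
}

grouped = group_by_shard(excerpt)
for shard, names in sorted(grouped.items()):
    print(shard, len(names))
```

Note that the index names `model-…-of-00015.safetensors` shards, while this commit adds `output-…-of-00003.safetensors` files, so the index may describe a different sharding than the files uploaded here.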

output-00001-of-00003.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:1aacfd58261f09e1a78ad82f31c92c3d09b820053e01b6162072dbfb663edf33
+size 8495211742

output-00002-of-00003.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:2a0c5fb2f23ad266985f961b37ad553a1118dade0e76a38d65868340bf0dfd9d
+size 8558121072

output-00003-of-00003.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:2e6adf9995a88b0b8723a0e71ca7169a4c497edafc556331e1222ac4a0b834e0
+size 6737196570
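
The three files above are Git LFS pointers: each records a spec version, a SHA-256 object ID, and the size in bytes of the real file. A minimal sketch of how such pointers might be parsed and the total shard size summed. The function name `parse_lfs_pointer` is hypothetical, and the pointer text is copied from the diff without the leading `+`:

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into its version, hash algorithm, digest and size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer contents for the three output shards in this commit.
pointers = [
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:1aacfd58261f09e1a78ad82f31c92c3d09b820053e01b6162072dbfb663edf33\n"
    "size 8495211742\n",
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:2a0c5fb2f23ad266985f961b37ad553a1118dade0e76a38d65868340bf0dfd9d\n"
    "size 8558121072\n",
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:2e6adf9995a88b0b8723a0e71ca7169a4c497edafc556331e1222ac4a0b834e0\n"
    "size 6737196570\n",
]

parsed = [parse_lfs_pointer(p) for p in pointers]
total_bytes = sum(p["size"] for p in parsed)
print(f"{total_bytes} bytes, about {total_bytes / 2**30:.2f} GiB")
```

The digest can be compared against a locally computed SHA-256 of a downloaded shard to confirm the file is intact.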

special_tokens_map.json ADDED

	@@ -0,0 +1,23 @@

+{
+  "bos_token": {
+    "content": "<s>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "eos_token": {
+    "content": "</s>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "pad_token": {
+    "content": "<PAD>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  }
+}
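
The map above names the BOS, EOS, and padding tokens by content only. A minimal sketch of how those names might be resolved to token IDs, using the ID table from the `added_tokens_decoder` section of `tokenizer_config.json` in this same commit. The helper `resolve_special_ids` is hypothetical, and the dictionaries are hand-copied excerpts rather than loaded files:

```python
# Token contents from special_tokens_map.json (flags such as lstrip omitted).
special_tokens_map = {"bos_token": "<s>", "eos_token": "</s>", "pad_token": "<PAD>"}

# IDs 0-7 from added_tokens_decoder in tokenizer_config.json.
added_tokens_decoder = {
    0: "<PAD>", 1: "<UNK>", 2: "<CLS>", 3: "<SEP>",
    4: "<MASK_TOKEN>", 5: "<s>", 6: "</s>", 7: "<EOP_TOKEN>",
}


def resolve_special_ids(special_map, decoder):
    """Look up the ID of each special token by its content; raise if one is missing."""
    content_to_id = {content: token_id for token_id, content in decoder.items()}
    return {role: content_to_id[content] for role, content in special_map.items()}


special_ids = resolve_special_ids(special_tokens_map, added_tokens_decoder)
print(special_ids)
```

A `KeyError` from this lookup would indicate that the two files disagree about which special tokens exist.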

tokenizer.json ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:3ec24d1fe80ac960489b2004b7399ea561799de2fae774bd5a9234c13e6a0726
+size 12777306

tokenizer_config.json ADDED

	@@ -0,0 +1,314 @@

+{
+    "add_bos_token": true,
+    "add_eos_token": false,
+    "added_tokens_decoder": {
+      "0": {
+        "content": "<PAD>",
+        "lstrip": false,
+        "normalized": false,
+        "rstrip": false,
+        "single_word": false,
+        "special": true
+      },
+      "1": {
+        "content": "<UNK>",
+        "lstrip": false,
+        "normalized": false,
+        "rstrip": false,
+        "single_word": false,
+        "special": true
+      },
+      "2": {
+        "content": "<CLS>",
+        "lstrip": false,
+        "normalized": false,
+        "rstrip": false,
+        "single_word": false,
+        "special": true
+      },
+      "3": {
+        "content": "<SEP>",
+        "lstrip": false,
+        "normalized": false,
+        "rstrip": false,
+        "single_word": false,
+        "special": true
+      },
+      "4": {
+        "content": "<MASK_TOKEN>",
+        "lstrip": false,
+        "normalized": false,
+        "rstrip": false,
+        "single_word": false,
+        "special": true
+      },
+      "5": {
+        "content": "<s>",
+        "lstrip": false,
+        "normalized": false,
+        "rstrip": false,
+        "single_word": false,
+        "special": true
+      },
+      "6": {
+        "content": "</s>",
+        "lstrip": false,
+        "normalized": false,
+        "rstrip": false,
+        "single_word": false,
+        "special": true
+      },
+      "7": {
+        "content": "<EOP_TOKEN>",
+        "lstrip": false,
+        "normalized": false,
+        "rstrip": false,
+        "single_word": false,
+        "special": true
+      },
+      "255000": {
+        "content": "<|im_start|>",
+        "lstrip": false,
+        "normalized": false,
+        "rstrip": false,
+        "single_word": false,
+        "special": false
+      },
+      "255001": {
+        "content": "<|im_end|>",
+        "lstrip": false,
+        "normalized": false,
+        "rstrip": false,
+        "single_word": false,
+        "special": false
+      },
+      "255002": {
+        "content": "<|YES_TOKEN|>",
+        "lstrip": false,
+        "normalized": false,
+        "rstrip": false,
+        "single_word": false,
+        "special": false
+      },
+      "255003": {
+        "content": "<|NO_TOKEN|>",
+        "lstrip": false,
+        "normalized": false,
+        "rstrip": false,
+        "single_word": false,
+        "special": false
+      },
+      "255004": {
+        "content": "<|GOOD_TOKEN|>",
+        "lstrip": false,
+        "normalized": false,
+        "rstrip": false,
+        "single_word": false,
+        "special": false
+      },
+      "255005": {
+        "content": "<|BAD_TOKEN|>",
+        "lstrip": false,
+        "normalized": false,
+        "rstrip": false,
+        "single_word": false,
+        "special": false
+      },
+      "255006": {
+        "content": "<|USER_TOKEN|>",
+        "lstrip": false,
+        "normalized": false,
+        "rstrip": false,
+        "single_word": false,
+        "special": false
+      },
+      "255007": {
+        "content": "<|CHATBOT_TOKEN|>",
+        "lstrip": false,
+        "normalized": false,
+        "rstrip": false,
+        "single_word": false,
+        "special": false
+      },
+      "255008": {
+        "content": "<|SYSTEM_TOKEN|>",
+        "lstrip": false,
+        "normalized": false,
+        "rstrip": false,
+        "single_word": false,
+        "special": false
+      },
+      "255009": {
+        "content": "<|USER_0_TOKEN|>",
+        "lstrip": false,
+        "normalized": false,
+        "rstrip": false,
+        "single_word": false,
+        "special": false
+      },
+      "255010": {
+        "content": "<|USER_1_TOKEN|>",
+        "lstrip": false,
+        "normalized": false,
+        "rstrip": false,
+        "single_word": false,
+        "special": false
+      },
+      "255011": {
+        "content": "<|USER_2_TOKEN|>",
+        "lstrip": false,
+        "normalized": false,
+        "rstrip": false,
+        "single_word": false,
+        "special": false
+      },
+      "255012": {
+        "content": "<|USER_3_TOKEN|>",
+        "lstrip": false,
+        "normalized": false,
+        "rstrip": false,
+        "single_word": false,
+        "special": false
+      },
+      "255013": {
+        "content": "<|USER_4_TOKEN|>",
+        "lstrip": false,
+        "normalized": false,
+        "rstrip": false,
+        "single_word": false,
+        "special": false
+      },
+      "255014": {
+        "content": "<|USER_5_TOKEN|>",
+        "lstrip": false,
+        "normalized": false,
+        "rstrip": false,
+        "single_word": false,
+        "special": false
+      },
+      "255015": {
+        "content": "<|USER_6_TOKEN|>",
+        "lstrip": false,
+        "normalized": false,
+        "rstrip": false,
+        "single_word": false,
+        "special": false
+      },
+      "255016": {
+        "content": "<|USER_7_TOKEN|>",
+        "lstrip": false,
+        "normalized": false,
+        "rstrip": false,
+        "single_word": false,
+        "special": false
+      },
+      "255017": {
+        "content": "<|USER_8_TOKEN|>",
+        "lstrip": false,
+        "normalized": false,
+        "rstrip": false,
+        "single_word": false,
+        "special": false
+      },
+      "255018": {
+        "content": "<|USER_9_TOKEN|>",
+        "lstrip": false,
+        "normalized": false,
+        "rstrip": false,
+        "single_word": false,
+        "special": false
+      },
+      "255019": {
+        "content": "<|EXTRA_0_TOKEN|>",
+        "lstrip": false,
+        "normalized": false,
+        "rstrip": false,
+        "single_word": false,
+        "special": false
+      },
+      "255020": {
+        "content": "<|EXTRA_1_TOKEN|>",
+        "lstrip": false,
+        "normalized": false,
+        "rstrip": false,
+        "single_word": false,
+        "special": false
+      },
+      "255021": {
+        "content": "<|EXTRA_2_TOKEN|>",
+        "lstrip": false,
+        "normalized": false,
+        "rstrip": false,
+        "single_word": false,
+        "special": false
+      },
+      "255022": {
+        "content": "<|EXTRA_3_TOKEN|>",
+        "lstrip": false,
+        "normalized": false,
+        "rstrip": false,
+        "single_word": false,
+        "special": false
+      },
+      "255023": {
+        "content": "<|EXTRA_4_TOKEN|>",
+        "lstrip": false,
+        "normalized": false,
+        "rstrip": false,
+        "single_word": false,
+        "special": false
+      },
+      "255024": {
+        "content": "<|EXTRA_5_TOKEN|>",
+        "lstrip": false,
+        "normalized": false,
+        "rstrip": false,
+        "single_word": false,
+        "special": false
+      },
+      "255025": {
+        "content": "<|EXTRA_6_TOKEN|>",
+        "lstrip": false,
+        "normalized": false,
+        "rstrip": false,
+        "single_word": false,
+        "special": false
+      },
+      "255026": {
+        "content": "<|EXTRA_7_TOKEN|>",
+        "lstrip": false,
+        "normalized": false,
+        "rstrip": false,
+        "single_word": false,
+        "special": false
+      },
+      "255027": {
+        "content": "<|EXTRA_8_TOKEN|>",
+        "lstrip": false,
+        "normalized": false,
+        "rstrip": false,
+        "single_word": false,
+        "special": false
+      },
+      "255028": {
+        "content": "<|EXTRA_9_TOKEN|>",
+        "lstrip": false,
+        "normalized": false,
+        "rstrip": false,
+        "single_word": false,
+        "special": false
+      }
+    },
+    "bos_token": "<s>",
+    "clean_up_tokenization_spaces": false,
+    "eos_token": "</s>",
+    "chat_template": "{% for message in messages %}{{'<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n'}}{% endfor %}",
+    "legacy": true,
+    "model_max_length": 1000000000000000019884624838656,
+    "pad_token": "<PAD>",
+    "sp_model_kwargs": {},
+    "spaces_between_special_tokens": false,
+    "tokenizer_class": "LlamaTokenizer",
+    "unk_token": null,
+    "use_default_system_prompt": false
+  }
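
The `chat_template` in the config above wraps every message as `<|im_start|>` + role + newline + content + `<|im_end|>` + newline. A minimal plain-Python sketch of the same formatting, written without Jinja so it runs on the standard library alone. The function name `render_chatml` and the sample messages are illustrative and not part of the commit:

```python
def render_chatml(messages):
    """Format messages the way the chat_template does: one im_start/im_end block per message."""
    return "".join(
        "<|im_start|>" + message["role"] + "\n" + message["content"] + "<|im_end|>" + "\n"
        for message in messages
    )


# Hypothetical conversation used only to show the layout.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]

prompt = render_chatml(messages)
print(prompt)
```

The template as written does not append an opening header for the reply, so a caller would presumably need to add something like `<|im_start|>assistant` plus a newline before generation; that suffix is an assumption and is not stated in the config.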