Vikhr-Qwen-2.5-0.5B-Instruct
An instruction-tuned model based on Qwen-2.5-0.5B-Instruct, trained on the Russian-language dataset GrandMaster-PRO-MAX. It is 4 times more efficient than the base model, making it well suited for deployment on low-end mobile devices.
GGUF
Features:
- Base: Qwen-2.5-0.5B-Instruct
- Specialization: RU (Russian)
- Dataset: GrandMaster-PRO-MAX
Try now:
Description:
Vikhr-Qwen-2.5-0.5B-Instruct is a compact language model trained on the GrandMaster-PRO-MAX dataset, specifically tuned for processing the Russian language. It is 4 times more efficient than the base model, and its size is about 1 GB, making it an excellent choice for deployment on low-end mobile devices.
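A GGUF build is linked above. For low-end devices it can be run with llama-cpp-python; the sketch below is illustrative, and the repository id and quantization filename are assumptions, so check the GGUF repository for the exact names.

from llama_cpp import Llama

# Download an (assumed) 4-bit quantization from the Hub and load it.
llm = Llama.from_pretrained(
    repo_id="Vikhrmodels/Vikhr-Qwen-2.5-0.5B-Instruct-GGUF",  # assumed repo id
    filename="*Q4_K_M.gguf",  # assumed quantization filename
    n_ctx=2048,
)

response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Привет! Кто ты?"}],
    temperature=0.3,  # recommended generation temperature
)
print(response["choices"][0]["message"]["content"])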
Training:
Vikhr-Qwen-2.5-0.5B-Instruct was created using SFT (Supervised Fine-Tuning). We trained the model on the synthetic dataset Vikhrmodels/GrandMaster-PRO-MAX (150k instructions) with CoT (Chain-of-Thought) support, using prompts generated with GPT-4-turbo.
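The exact training scripts are not included in this card. Below is a minimal SFT sketch with the trl library mirroring the setup described above; the dataset handling and hyperparameters are illustrative assumptions, not the authors' actual configuration.

from datasets import load_dataset
from transformers import AutoModelForCausalLM
from trl import SFTConfig, SFTTrainer

# Start from the instruct base model named above.
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")

# GrandMaster-PRO-MAX is a chat-style instruction dataset; SFTTrainer can
# consume conversational data directly (split name is an assumption).
dataset = load_dataset("Vikhrmodels/GrandMaster-PRO-MAX", split="train")

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    args=SFTConfig(
        output_dir="vikhr-qwen-2.5-0.5b-sft",
        per_device_train_batch_size=8,  # illustrative hyperparameters
        num_train_epochs=1,
        learning_rate=2e-5,
    ),
)
trainer.train()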
Sample code to run:
Recommended generation temperature: 0.3.
from transformers import AutoModelForCausalLM, AutoTokenizer
# Load the model and tokenizer
model_name = "Vikhrmodels/Vikhr-Qwen-2.5-0.5B-Instruct"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Prepare the input text
input_text = "Напиши очень краткую рецензию о книге Гарри Поттер."
messages = [
    {"role": "system", "content": "Вы - Vikhr, помощник с искусственным интеллектом, созданный компанией Vikhr models, чтобы быть полезным, безобидным и честным."},
    {"role": "user", "content": input_text},
]
# Tokenize the conversation and generate a response
input_ids = tokenizer.apply_chat_template(messages, truncation=True, add_generation_prompt=True, return_tensors="pt")
output = model.generate(
    input_ids,
    max_length=1512,
    do_sample=True,  # sampling must be enabled for temperature/top_k/top_p to take effect
    temperature=0.3,
    num_return_sequences=1,
    no_repeat_ngram_size=2,
    top_k=50,
    top_p=0.95,
)
# Decode and print the result
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
print(generated_text)
Model response:
The book "Harry Potter" is a series of books written by the British author Joanne Rowling. It is one of the most famous works in world literature and popular children's fiction.
Main features of the series:
Plot: The events revolve around a boy named Harry Potter, who studies at the School of Magic and Philosophy at Hogwarts University. He faces various obstacles, including the fight against evil, the search for friends, and self-discovery.
Characters: The book features many characters, each with unique personality traits, motivations, and backstories. The protagonist, Harry Potter, is an example of a kind and brave person, as well as a remarkable personality.
Themes and ideas: The books touch on themes of love, friendship, justice, morality, human defiance, and the importance of learning through adventure.
History and character development: Through events and interactions with other characters, the books explore deep psychological and philosophical questions.
Cultural impact: "Harry Potter" has had an enormous influence on world literature, becoming a cult genre and a symbol of knowledge and wisdom.
Accessibility: The books in the series are accessible to a wide audience and in high demand, which makes them a popular choice among readers of all ages.
Genre development: Although "Harry Potter" is part of a series, it remains beloved and relevant, continuing to surprise readers with new stories and characters.
This book series remains one of the most significant and influential in the history of literature, having shaped world culture and education.
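For interactive use on slow hardware it helps to print tokens as they are generated instead of waiting for the full sequence. Below is a small variation of the sample above using transformers' TextStreamer; model, tokenizer and input_ids are the objects created in the sample code.

from transformers import TextStreamer

# Stream decoded tokens to stdout as they are produced.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
model.generate(
    input_ids,
    max_length=1512,
    do_sample=True,
    temperature=0.3,
    streamer=streamer,
)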
Authors
- Sergei Bratchikov, NLP Wanderer, Vikhr Team
- Nikolay Kompanets, LakoMoor, Vikhr Team
- Konstantin Korolev, Vikhr Team
- Aleksandr Nikolich, Vikhr Team
@article{nikolich2024vikhr,
title={Vikhr: The Family of Open-Source Instruction-Tuned Large Language Models for Russian},
author={Aleksandr Nikolich and Konstantin Korolev and Sergey Bratchikov and Nikolay Kompanets and Artem Shelmanov},
journal={arXiv preprint arXiv:2405.13929},
year={2024},
url={https://arxiv.org/pdf/2405.13929}
}