Overview

This model was built with Mergekit and fine-tuning (FT), with the goal of combining QwQ's long-form generation ability with R1's performance.

How to use

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "DataPilot/SKYDRIVE-32B-v0.1"

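# Optionally load the tokenizer from a different repository.
# An empty string means "use the tokenizer shipped with the model".
tokenizer_name = ""

if tokenizer_name == "":
    tokenizer_name = model_name

# torch_dtype="auto" keeps the dtype stored in the checkpoint;
# device_map="auto" spreads the weights over the available devices (requires accelerate).
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(tokenizer_name)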

prompt = "ใƒกใ‚ฟใƒ‡ใƒผใ‚ฟใ‚’่งฃๆžใ—ใ€่‡ชๅทฑ้€ฒๅŒ–ใ‚’ใ™ใ‚‹AIใงใ‚ใ‚‹nurture intelligenceใŒๅฎŸ็พใ—ใŸๆœชๆฅใฎๆ—ฅๅธธ็”Ÿๆดปใฎๅงฟใ‚’ๆ•™ใˆใฆใใ ใ•ใ„ใ€‚"
messages = [
    {"role": "system", "content": "ใ‚ใชใŸใฏๅ„ช็ง€ใชๆ—ฅๆœฌ่ชžใ‚ขใ‚ทใ‚นใ‚ฟใƒณใƒˆใงใ‚ใ‚Š้•ท่€ƒใƒขใƒ‡ใƒซใงใ™ใ€‚ๅ•้กŒ่งฃๆฑบใ‚’ใ™ใ‚‹ใŸใ‚ใฎๆ€่€ƒใ‚’ใ—ใŸไธŠใงๅ›ž็ญ”ใ‚’่กŒใฃใฆใใ ใ•ใ„ใ€‚"},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=4096
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]

print(response)
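
For decoder-only models, `generate` returns each prompt followed by the newly generated tokens, which is why the snippet above slices the output before decoding. The toy example below shows that slicing on plain Python lists. The helper name `strip_prompt` and the token IDs are made up for illustration and are not part of the model or of transformers.

```python
def strip_prompt(input_ids_batch, output_ids_batch):
    """Drop the echoed prompt tokens from each generated sequence."""
    return [
        output_ids[len(input_ids):]
        for input_ids, output_ids in zip(input_ids_batch, output_ids_batch)
    ]


# Made-up token IDs: a 3-token prompt followed by 2 newly generated tokens.
prompt_ids = [[101, 102, 103]]
full_output = [[101, 102, 103, 7, 8]]

new_tokens = strip_prompt(prompt_ids, full_output)
print(new_tokens)  # [[7, 8]]
```

Without this step the decoded response would start with the system and user messages again.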

Acknowledgements

We thank everyone involved in creating this model, and VOLTMIND for lending us the compute resources. We are also grateful to hayashi-san for helping us solve problems along the way.

Mergekit config

merge_method: slerp
base_model: karakuri-ai/karakuri-lm-32b-thinking-2501-exp
models:
  - model: karakuri-ai/karakuri-lm-32b-thinking-2501-exp
  - model: NovaSky-AI/Sky-T1-32B-Flash
parameters:
  t: 0.4
dtype: bfloat16
name: SKYCAVE_element_Sky_jp
---
merge_method: breadcrumbs_ties
base_model: Qwen/Qwen2.5-32B
tokenizer_source: karakuri-ai/karakuri-lm-32b-thinking-2501-exp
name: SKYDRIVE_element_jp_01
models:
  - model: karakuri-ai/karakuri-lm-32b-thinking-2501-exp
    parameters:
      weight: 1.0
  - model: FuseAI/FuseO1-DeepSeekR1-Qwen2.5-Coder-32B-Preview
    parameters:
      weight: 0.75
dtype: bfloat16
---
merge_method: task_arithmetic
base_model: Qwen/Qwen2.5-32B
tokenizer_source: karakuri-ai/karakuri-lm-32b-thinking-2501-exp
name: SKYDRIVE_element_jp_02
models:
  - model: karakuri-ai/karakuri-lm-32b-thinking-2501-exp
    parameters:
      weight: 1.0
  - model: cyberagent/DeepSeek-R1-Distill-Qwen-32B-Japanese
    parameters:
      weight: 0.9
dtype: bfloat16
---
merge_method: slerp
base_model: karakuri-ai/karakuri-lm-32b-thinking-2501-exp
models:
  - model: karakuri-ai/karakuri-lm-32b-thinking-2501-exp
  - model: TeamDelta/ABEJA-Qwen2.5-32B-base-jp-v0.1
parameters:
  t: 0.5
dtype: bfloat16
name: SKYDRIVE_element_jp_03

---
merge_method: model_stock
base_model: Qwen/Qwen2.5-32B-Instruct
models:
  - model: karakuri-ai/karakuri-lm-32b-thinking-2501-exp
  - model: SKYCAVE_element_Sky_jp
  - model: SKYDRIVE_element_jp_01
  - model: SKYDRIVE_element_jp_02
  - model: SKYDRIVE_element_jp_03
dtype: bfloat16
pad_to_multiple_of: 512
tokenizer_source: base
name: SKYDRIVE-32B-v0.1
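
As rough intuition for two of the merge methods named in the config, here is a minimal sketch of the textbook formulas behind `slerp` and `task_arithmetic`, applied to tiny Python lists instead of real weight tensors. This is not Mergekit's implementation: the function names, the `eps` threshold, and the toy vectors are assumptions made for illustration, and the real tool works per tensor and handles details that are not shown here.

```python
import math


def slerp(a, b, t, eps=1e-8):
    """Spherical linear interpolation between two equal-length vectors."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a < eps or norm_b < eps:
        # A zero vector has no direction: fall back to linear interpolation.
        return [(1 - t) * x + t * y for x, y in zip(a, b)]
    # Angle between the two directions, with the cosine clamped for acos.
    cos_omega = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    omega = math.acos(max(-1.0, min(1.0, cos_omega)))
    sin_omega = math.sin(omega)
    if sin_omega < eps:
        # (Anti-)parallel vectors: fall back to linear interpolation.
        return [(1 - t) * x + t * y for x, y in zip(a, b)]
    weight_a = math.sin((1 - t) * omega) / sin_omega
    weight_b = math.sin(t * omega) / sin_omega
    return [weight_a * x + weight_b * y for x, y in zip(a, b)]


def task_arithmetic(base, models, weights):
    """Add each model's weighted difference from the base back onto the base."""
    merged = list(base)
    for model, weight in zip(models, weights):
        for i, (m, b) in enumerate(zip(model, base)):
            merged[i] += weight * (m - b)
    return merged


# Toy vectors standing in for weights; t and the weights mirror values from the config.
interpolated = slerp([1.0, 0.0], [0.0, 1.0], t=0.4)
combined = task_arithmetic([1.0, 1.0], [[2.0, 1.0], [1.0, 3.0]], [1.0, 0.9])
print(interpolated)
print(combined)
```

In this sketch `t = 0` returns the first vector and `t = 1` the second, so `t: 0.4` lands somewhat closer to the first one.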