Uploaded model

Developed by: KanNaga
License: apache-2.0
Finetuned from model: llm-jp/llm-jp-3-13b

This llama model was trained 2x faster with Unsloth and Hugging Face's TRL library.

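This large language model is the publicly available LLM "llm-jp/llm-jp-3-13b", fine-tuned on the Japanese instruction dataset "Ichikara Instruction".

Japanese instruction dataset used: Ichikara Instruction. Satoshi Sekine, Maya Ando, Michiko Goto, Hisami Suzuki, Daisuke Kawahara, Naoya Inoue, Kentaro Inui. ichikara-instruction: Constructing Japanese Instruction Data for LLMs. 30th Annual Meeting of the Association for Natural Language Processing (2024)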

The following code uses this fine-tuned model to solve tasks and saves the results to a JSONL file.

```python
import json

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer
model_name = "KanNaga/my_first_it_model_llm-jp-3-13b-it"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Load the dataset (one JSON object per line, each with an "input" field)
input_file = "<path to the dataset file to load>"
with open(input_file, "r", encoding="utf-8") as f:
    records = [json.loads(line) for line in f]

# Run inference with the model and collect the results
results = []
for record in records:
    input_text = record["input"]
    # Use a separate name for the token IDs so the dataset list is not shadowed
    input_ids = tokenizer.encode(input_text, return_tensors="pt")
    outputs = model.generate(input_ids)
    output_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
    results.append({"input": input_text, "output": output_text})

# Save the results as a JSONL file
output_file = "<output file name>.jsonl"
with open(output_file, "w", encoding="utf-8") as f:
    for result in results:
        json.dump(result, f, ensure_ascii=False)
        f.write("\n")
```
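The script above implies a file format without stating it: one JSON object per line, with the prompt stored under an `input` key and the generated text written back under `output`. As a minimal sketch of that assumed format, the helpers below read and write such files using only the standard library. The names `read_jsonl` and `write_jsonl` are hypothetical and not part of this model or of any library, and the check for the `input` field is an assumption based on how the script accesses each record.

```python
import json


def read_jsonl(path):
    """Read a JSONL file into a list of dicts, skipping blank lines.

    Assumes every record carries an "input" field, as the inference
    script expects; raises KeyError with the line number otherwise.
    """
    records = []
    with open(path, "r", encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate trailing or empty lines
            record = json.loads(line)
            if "input" not in record:
                raise KeyError(f"line {line_no}: missing 'input' field")
            records.append(record)
    return records


def write_jsonl(path, records):
    """Write one JSON object per line, keeping Japanese text unescaped."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
```

Opening the files with an explicit `encoding="utf-8"` and writing with `ensure_ascii=False` keeps Japanese prompts and outputs readable in the saved file regardless of the platform's default encoding.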
