Qwen-3 14B 🦙 LoRA for Historical-Turkish NER

This LoRA adapter teaches Qwen-3 14B to perform named-entity recognition (NER) on Historical (Latin-script Ottoman) Turkish texts. It was fine-tuned for BIO tagging of PER, LOC, and ORG entities on a manually annotated corpus of ~8 k examples (datasets/BUCOLIN/HisTR). The base model's weights remain unchanged.
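For readers new to the scheme: B- marks the first token of an entity, I- a continuation, and O a non-entity token. A purely illustrative (hypothetical, not taken from the training data) example:

```python
# Hypothetical illustration of the BIO scheme the adapter emits:
# one tag per token, entity type suffixed to B-/I-.
tokens = ["Ahmed", "Paşa", "İstanbul'a", "gitti"]
tags   = ["B-PER", "I-PER", "B-LOC",     "O"]
```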

  • Adapter type: PEFT-LoRA (rank 32, α 32, no dropout)
  • Base model: unsloth/Qwen3-14B-unsloth-bnb-4bit
  • Context length: 2 048
  • Quantisation: 4-bit NF4 (base), fp16 (adapter)
  • Training platform: Google Colab free tier (Tesla T4 16 GB) using Unsloth 2025-04-07

Usage 📊

```bash
pip install unsloth torch peft
```

```python
from unsloth import FastLanguageModel
import torch

BASE_ID    = "unsloth/Qwen3-14B"          # base model
ADAPTER_ID = "cihanunlu/qwen3-ner-lora"   # LoRA adapter

# 1️⃣  Load the 4-bit base model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name     = BASE_ID,
    load_in_4bit   = True,
    max_seq_length = 2048,
)

# 2️⃣  Wrap it in a PEFT container, then attach and activate the adapter
model = FastLanguageModel.get_peft_model(model)
model.load_adapter(ADAPTER_ID, adapter_name="ner")
model.set_adapter("ner")
FastLanguageModel.for_inference(model)  # enable Unsloth's fast inference path

# 3️⃣  Tag a sentence: the model answers with one BIO tag per token
def ner(sentence):
    prompt = [{"role": "user",
               "content": f"Label this sentence {sentence}"}]
    chat = tokenizer.apply_chat_template(prompt, tokenize=False,
                                         add_generation_prompt=True,
                                         enable_thinking=False)
    out = model.generate(**tokenizer(chat, return_tensors="pt").to(model.device),
                         max_new_tokens=64, do_sample=False)[0]
    return tokenizer.decode(out, skip_special_tokens=True)

print(ner("Emin Bey’in kuklaları Tepebaşı’nda oynuyor."))
# → B-PER I-PER O O O B-LOC O
```
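Note that `generate` returns the whole transcript (prompt plus answer), so the decoded string needs light post-processing before the tags can be used programmatically. A minimal sketch, assuming the tag sequence is the last line of the decoded output and that the model emits exactly one tag per whitespace token (both are assumptions; verify against your own runs):

```python
def parse_ner(sentence: str, decoded: str):
    """Pair whitespace tokens with BIO tags.

    Assumes the tag sequence is the last line of the decoded output
    and that there is one tag per whitespace token.
    """
    tags   = decoded.strip().splitlines()[-1].split()
    tokens = sentence.split()
    # If counts diverge (sub-word splits, stray text), fail loudly rather than mis-align.
    if len(tags) != len(tokens):
        raise ValueError(f"{len(tokens)} tokens vs {len(tags)} tags; check the raw output")
    return list(zip(tokens, tags))

# Hypothetical usage:
# parse_ner("Ahmed Paşa İstanbul'a gitti", ner("Ahmed Paşa İstanbul'a gitti"))
```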

Evaluation 📊

TBA

Training Details 🛠️

  • LoRA rank / α: 32 / 32
  • Optimiser: AdamW-8bit
  • LR / scheduler: 2e-4, linear
  • Batch size: 1 (gradient accumulation 4 → effective 4)
  • Epochs: 3 (≈ 2 k steps)
  • Gradient checkpointing: “unsloth”
  • Hardware: Tesla T4 16 GB (free Colab)
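These hyperparameters translate fairly directly into an Unsloth + TRL script. A minimal sketch, assuming a text-formatted copy of HisTR in a "text" column and trl's pre-0.9 `SFTTrainer` signature (argument names vary across trl versions); this is an illustration, not the exact training script used:

```python
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name     = "unsloth/Qwen3-14B-unsloth-bnb-4bit",
    load_in_4bit   = True,
    max_seq_length = 2048,
)

# LoRA settings from the card: rank 32, alpha 32, no dropout
model = FastLanguageModel.get_peft_model(
    model,
    r                          = 32,
    lora_alpha                 = 32,
    lora_dropout               = 0,
    target_modules             = ["q_proj", "k_proj", "v_proj", "o_proj",
                                  "gate_proj", "up_proj", "down_proj"],  # usual Unsloth defaults
    use_gradient_checkpointing = "unsloth",
)

dataset = load_dataset("BUCOLIN/HisTR", split="train")  # assumed split / format

trainer = SFTTrainer(
    model              = model,
    tokenizer          = tokenizer,
    train_dataset      = dataset,
    dataset_text_field = "text",          # assumed column name
    max_seq_length     = 2048,
    args = TrainingArguments(
        per_device_train_batch_size = 1,
        gradient_accumulation_steps = 4,  # effective batch size 4
        num_train_epochs            = 3,
        learning_rate               = 2e-4,
        lr_scheduler_type           = "linear",
        optim                       = "adamw_8bit",
        output_dir                  = "outputs",
    ),
)
trainer.train()
```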

  • Developed by: cihanunlu
  • License: apache-2.0
  • Finetuned from model: unsloth/Qwen3-14B-unsloth-bnb-4bit

Limitations:

  • Not trained for modern Turkish or for entity types beyond PER, LOC, and ORG.
  • Small, domain-specific training corpus; the adapter may hallucinate tags on very long sentences. A simple windowing workaround is sketched below.
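As a workaround for the long-sentence caveat, inputs can be tagged in fixed-size windows. A minimal sketch using the `ner` helper from the Usage section (the window size of 40 tokens is an arbitrary choice, and entities that straddle a window boundary may be split):

```python
def ner_windowed(sentence, window=40):
    """Tag a long sentence in whitespace-token windows of `window` tokens.

    Entities spanning a window boundary may be split; overlap the windows
    or merge adjacent B-/I- tags if that matters for your data.
    """
    tokens = sentence.split()
    tagged = []
    for i in range(0, len(tokens), window):
        chunk = " ".join(tokens[i:i + window])
        tagged.append(ner(chunk))
    return " ".join(tagged)
```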


Dataset used to train cihanunlu/qwen3-ner-lora-HistTurk: BUCOLIN/HisTR