Qwen-3 14B 🦙 LoRA for Historical-Turkish NER

This LoRA adapter teaches Qwen-3 14B to perform named-entity recognition (NER) on Historical (Latin-script Ottoman) Turkish texts. It was fine-tuned for BIO tagging of PER, LOC, and ORG entities on a manually annotated corpus of ~8 k examples (datasets/BUCOLIN/HisTR). The base model's weights remain unchanged.
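For readers new to the scheme: B- marks the first token of an entity, I- a continuation, and O a non-entity token. A purely illustrative (hypothetical, not taken from the training data) example:

```python
# Hypothetical illustration of the BIO scheme the adapter emits:
# one tag per token, entity type suffixed to B-/I-.
tokens = ["Ahmed", "Paşa", "İstanbul'a", "gitti"]
tags   = ["B-PER", "I-PER", "B-LOC",     "O"]
```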

  • Adapter type: PEFT-LoRA (rank 32, α 32, no dropout)
  • Base model: unsloth/Qwen3-14B-unsloth-bnb-4bit
  • Context length: 2 048
  • Quantisation: 4-bit NF4 (base), fp16 (adapter)
  • Training platform: Google Colab free tier (Tesla T4 16 GB) using Unsloth 2025-04-07

Usage 📊

```bash
pip install unsloth torch peft
```

```python
from unsloth import FastLanguageModel
import torch

BASE_ID    = "unsloth/Qwen3-14B"          # base model
ADAPTER_ID = "cihanunlu/qwen3-ner-lora"   # LoRA adapter

# 1️⃣  Load the 4-bit base model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name     = BASE_ID,
    load_in_4bit   = True,
    max_seq_length = 2048,
)

# 2️⃣  Wrap it in a PEFT container, then attach and activate the adapter
model = FastLanguageModel.get_peft_model(model)
model.load_adapter(ADAPTER_ID, adapter_name="ner")
model.set_adapter("ner")
FastLanguageModel.for_inference(model)  # enable Unsloth's fast inference path

# 3️⃣  Tag a sentence: the model answers with one BIO tag per token
def ner(sentence):
    prompt = [{"role": "user",
               "content": f"Label this sentence {sentence}"}]
    chat = tokenizer.apply_chat_template(prompt, tokenize=False,
                                         add_generation_prompt=True,
                                         enable_thinking=False)
    out = model.generate(**tokenizer(chat, return_tensors="pt").to(model.device),
                         max_new_tokens=64, do_sample=False)[0]
    return tokenizer.decode(out, skip_special_tokens=True)

print(ner("Emin Bey’in kuklaları Tepebaşı’nda oynuyor."))
# → B-PER I-PER O O O B-LOC O
```
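Note that `generate` returns the whole transcript (prompt plus answer), so the decoded string needs light post-processing before the tags can be used programmatically. A minimal sketch, assuming the tag sequence is the last line of the decoded output and that the model emits exactly one tag per whitespace token (both are assumptions; verify against your own runs):

```python
def parse_ner(sentence: str, decoded: str):
    """Pair whitespace tokens with BIO tags.

    Assumes the tag sequence is the last line of the decoded output
    and that there is one tag per whitespace token.
    """
    tags   = decoded.strip().splitlines()[-1].split()
    tokens = sentence.split()
    # If counts diverge (sub-word splits, stray text), fail loudly rather than mis-align.
    if len(tags) != len(tokens):
        raise ValueError(f"{len(tokens)} tokens vs {len(tags)} tags; check the raw output")
    return list(zip(tokens, tags))

# Hypothetical usage:
# parse_ner("Ahmed Paşa İstanbul'a gitti", ner("Ahmed Paşa İstanbul'a gitti"))
```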

Evaluation 📊

TBA

Training Details 🛠️

  • LoRA rank / α: 32 / 32
  • Optimiser: AdamW-8bit
  • LR / scheduler: 2e-4, linear
  • Batch size: 1 (gradient accumulation 4 → effective 4)
  • Epochs: 3 (≈ 2 k steps)
  • Gradient checkpointing: “unsloth”
  • Hardware: Tesla T4 16 GB (free Colab)
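These hyperparameters translate fairly directly into an Unsloth + TRL script. A minimal sketch, assuming a text-formatted copy of HisTR in a "text" column and trl's pre-0.9 `SFTTrainer` signature (argument names vary across trl versions); this is an illustration, not the exact training script used:

```python
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name     = "unsloth/Qwen3-14B-unsloth-bnb-4bit",
    load_in_4bit   = True,
    max_seq_length = 2048,
)

# LoRA settings from the card: rank 32, alpha 32, no dropout
model = FastLanguageModel.get_peft_model(
    model,
    r                          = 32,
    lora_alpha                 = 32,
    lora_dropout               = 0,
    target_modules             = ["q_proj", "k_proj", "v_proj", "o_proj",
                                  "gate_proj", "up_proj", "down_proj"],  # usual Unsloth defaults
    use_gradient_checkpointing = "unsloth",
)

dataset = load_dataset("BUCOLIN/HisTR", split="train")  # assumed split / format

trainer = SFTTrainer(
    model              = model,
    tokenizer          = tokenizer,
    train_dataset      = dataset,
    dataset_text_field = "text",          # assumed column name
    max_seq_length     = 2048,
    args = TrainingArguments(
        per_device_train_batch_size = 1,
        gradient_accumulation_steps = 4,  # effective batch size 4
        num_train_epochs            = 3,
        learning_rate               = 2e-4,
        lr_scheduler_type           = "linear",
        optim                       = "adamw_8bit",
        output_dir                  = "outputs",
    ),
)
trainer.train()
```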

  • Developed by: cihanunlu
  • License: apache-2.0
  • Finetuned from model: unsloth/Qwen3-14B-unsloth-bnb-4bit

Limitations:

  • Not trained for modern Turkish or for entity types beyond PER, LOC, and ORG.
  • Small, domain-specific training corpus; the adapter may hallucinate tags on very long sentences. A simple windowing workaround is sketched below.
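As a workaround for the long-sentence caveat, inputs can be tagged in fixed-size windows. A minimal sketch using the `ner` helper from the Usage section (the window size of 40 tokens is an arbitrary choice, and entities that straddle a window boundary may be split):

```python
def ner_windowed(sentence, window=40):
    """Tag a long sentence in whitespace-token windows of `window` tokens.

    Entities spanning a window boundary may be split; overlap the windows
    or merge adjacent B-/I- tags if that matters for your data.
    """
    tokens = sentence.split()
    tagged = []
    for i in range(0, len(tokens), window):
        chunk = " ".join(tokens[i:i + window])
        tagged.append(ner(chunk))
    return " ".join(tagged)
```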


Dataset used to train cihanunlu/qwen3-ner-lora-HistTurk: BUCOLIN/HisTR