Devavrat28/peshwai-historian-ai

# 📜 Peshwai Historian AI (Marathi Language Model)
**An LLM fine-tuned to answer deep, culturally rich questions about the Peshwa era of Pune, in fluent Marathi.**

---

## ✨ Overview

`peshwai-historian-ai` is a language model fine-tuned on **curated historical text** in **Marathi** about the **Maratha Peshwa dynasty**. It is intended to answer questions about lesser-known events, policies, and figures from the 18th-century Pune region, and aims to give contextual, factual, and culturally accurate responses. This makes it especially useful for educators, students, historians, and heritage lovers.

---

## 🔍 What Makes It Special?

- 📖 **Marathi-native output**: Generates grammatically rich and natural Marathi text
- 🕰️ **Historical awareness**: Looks past widely known topics (such as Bajirao and Shaniwarwada) and focuses on underrepresented ones
- 🧠 **Fine-tuned on real historical documents**: Covers rare facts about Nana Phadnavis, Mahadji Shinde, diplomacy, and cultural shifts
- 🗣️ **Optimized for few-shot prompting**: Example topic-and-answer pairs placed in the prompt improve answer quality
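
The few-shot point above can be illustrated with a small helper that prepends solved examples to a new question. This is a minimal sketch, not the author's own tooling: it assumes the `विषय: ... / सविस्तर माहिती:` layout used in this card's example prompt, and the name `build_few_shot_prompt` and the placeholder example pair are hypothetical.

```python
from typing import List, Tuple

# Prompt layout taken from this card's example prompt
# ("विषय" = topic, "सविस्तर माहिती" = detailed information).
PROMPT_TEMPLATE = "विषय: {topic}\nसविस्तर माहिती:"


def build_few_shot_prompt(examples: List[Tuple[str, str]], topic: str) -> str:
    """Join solved (topic, answer) pairs, then append the new topic left open."""
    blocks = [
        f"{PROMPT_TEMPLATE.format(topic=t.strip())} {a.strip()}"
        for t, a in examples
    ]
    # The final block ends at "सविस्तर माहिती:" so the model continues from there.
    blocks.append(PROMPT_TEMPLATE.format(topic=topic.strip()))
    return "\n\n".join(blocks)


# Placeholder pair: replace with real Marathi topic/answer examples.
examples = [("<topic 1 in Marathi>", "<reference answer 1 in Marathi>")]
prompt = build_few_shot_prompt(examples, "नाना फडणवीसांचे गुप्त राजकारण")
print(prompt)
```

The resulting string can be passed to the tokenizer in place of a single-topic prompt. How many examples help, if any, is something to check empirically.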

---

## 🧠 Example Prompt

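```marathi
विषय: नाना फडणवीसांचे गुप्त राजकारण
सविस्तर माहिती:
```

In English: "Topic: The secret politics of Nana Phadnavis. Detailed information:"

---

## 💬 Sample Response

नाना फडणवीस हे पेशवाईतील अत्यंत मुत्सद्दी आणि धोरणशक्ती असलेले व्यक्तिमत्व होते. माधवराव पेशव्यांच्या मृत्यूनंतर, सत्तेची रिकामी जागा भरून काढण्यासाठी त्यांनी 'बारभाई मंडळ' तयार केले...

In English: "Nana Phadnavis was an exceptionally shrewd statesman and strategist of the Peshwai. After the death of Madhavrao Peshwa, he formed the 'Barbhai Council' to fill the resulting power vacuum..."

---

## 🧪 How to Use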

In Python:

```python
from unsloth import FastLanguageModel
import torch

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "Devavrat28/peshwai-historian-ai",  # model repository on the Hub
    max_seq_length = 4096,
    dtype = torch.float16,
    load_in_4bit = True
)

# Switch the model to Unsloth's inference mode.
FastLanguageModel.for_inference(model)

# Prompt format: "विषय: <topic>" followed by "सविस्तर माहिती:" (detailed information).
prompt = "विषय: माधवराव पेशव्यांचा आरोग्यावर झालेला परिणाम\nसविस्तर माहिती:"
inputs = tokenizer([prompt], return_tensors="pt").to("cuda")  # requires a CUDA GPU

outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
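
With decoder-only models, `generate` typically returns the prompt tokens followed by the newly generated ones, so the decoded string usually starts with the prompt itself. The following is a minimal sketch of a post-processing helper that keeps only the answer. The name `extract_answer` is hypothetical, and the demo string stands in for real model output.

```python
ANSWER_MARKER = "सविस्तर माहिती:"  # marker that ends the prompt in this card


def extract_answer(decoded: str, prompt: str) -> str:
    """Return only the generated continuation from a decoded output string."""
    # Common case: the decoded text begins with the exact prompt.
    if decoded.startswith(prompt):
        return decoded[len(prompt):].strip()
    # Fallback: tokenization may alter whitespace, so split on the first marker.
    _, found, tail = decoded.partition(ANSWER_MARKER)
    return tail.strip() if found else decoded.strip()


prompt = "विषय: माधवराव पेशव्यांचा आरोग्यावर झालेला परिणाम\nसविस्तर माहिती:"
decoded = prompt + " <generated Marathi text>"  # stand-in for tokenizer.decode(...)
print(extract_answer(decoded, prompt))
```

If a few-shot prompt is used, the fallback splits on the first marker only, so the exact-prefix branch is the one to rely on in that case.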

---

## 💡 Fine-Tuning Details

| Setting | Value |
| --- | --- |
| Base Model | Gemma / LLaMA / Open LLM |
| Fine-tuning Method | Supervised Fine-Tuning (SFT) |
| Framework | Unsloth + 🤗 TRL |
| Dataset | Marathi historical text (~50K tokens) |
| Technique | Continued pretraining + SFT |
| Language | मराठी (Marathi) |
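
This card does not state how the SFT examples were laid out. As a hedged illustration only, the sketch below formats topic/passage records into training strings that mirror the inference prompt and end with an end-of-sequence token. The record fields `topic` and `text`, the function name `format_records`, and the `"</s>"` token are all assumptions; in practice the token would come from `tokenizer.eos_token`.

```python
from typing import Dict, List

# Mirrors the inference prompt, with the answer and an EOS token appended.
TRAIN_TEMPLATE = "विषय: {topic}\nसविस्तर माहिती: {text}{eos}"


def format_records(records: List[Dict[str, str]], eos_token: str) -> List[str]:
    """Turn {'topic', 'text'} records into single training strings."""
    return [
        TRAIN_TEMPLATE.format(
            topic=r["topic"].strip(),
            text=r["text"].strip(),
            eos=eos_token,  # lets the model learn where an answer ends
        )
        for r in records
    ]


# Placeholder record: replace with real Marathi source material.
records = [{"topic": "<topic in Marathi>", "text": "<historical passage in Marathi>"}]
texts = format_records(records, eos_token="</s>")  # assumed token, see lead-in
print(texts[0])
```

Keeping the training layout identical to the inference prompt is a common design choice, because the model then sees the same markers at both stages.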

---

## 🛠️ Intended Use

- 📚 Educational apps in schools or colleges
- 🏛️ Museums or digital history archives
- 🗣️ Voice-based Marathi chatbots for local history
- 📖 Research tools for historians and scholars

---

## 📜 Citation

If you use this model in research or production, please consider citing:

```bibtex
@misc{peshwaiHistorian2024,
  title={Peshwai Historian AI: A Marathi LLM for Regional Heritage},
  author={Devavrat Samak},
  year={2024},
  howpublished={\url{https://huggingface.co/Devavrat28/peshwai-historian-ai}},
}
```

---

## ❤️ Credits

Developed by Devavrat Samak.
Inspired by the rich cultural heritage of Pune and the legacy of the Peshwas.

---

## 📬 Feedback / Contributions

I welcome pull requests, prompts, dataset contributions, and collaborations. Reach out via Hugging Face or GitHub.
