
Urdu-BitNet-medium

This is an Urdu language model trained using the BitNet architecture. The model is based on the LLaMA architecture but uses 1.58-bit (ternary) quantization for weights and 8-bit quantization for activations.

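The 1.58-bit scheme constrains each weight to the ternary set {-1, 0, +1} using an absmean scale, while activations are quantized to 8 bits with an absmax scale, as described in the BitNet b1.58 work. The snippet below is only an illustrative PyTorch sketch of that idea, not this model's actual training code; the function names are placeholders.

import torch

def quantize_weights_ternary(w: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    # Scale by the mean absolute weight, then round to {-1, 0, +1}.
    scale = w.abs().mean().clamp(min=eps)
    w_q = (w / scale).round().clamp(-1, 1)
    # Return dequantized values (simulated quantization, as used in
    # quantization-aware training).
    return w_q * scale

def quantize_activations_int8(x: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    # Symmetric absmax quantization to the int8 range, then dequantize.
    scale = x.abs().max().clamp(min=eps) / 127.0
    return (x / scale).round().clamp(-128, 127) * scale
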
Model details

  • Architecture: BitNet (based on LLaMA)
  • Size: Medium (~210M parameters)
  • Training data: Combined Urdu corpus (news, Wikipedia, OSCAR)

Performance metrics

  • Test loss: 1.5825

Usage

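A minimal loading and generation sketch, assuming the checkpoint works with the standard Hugging Face transformers AutoTokenizer/AutoModelForCausalLM interface; the BitNet-specific layers may require trust_remote_code=True, and the prompt is just an example.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mahwizzzz/urdu-bitnet-medium"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# Generate a short continuation for an Urdu prompt ("The capital of Pakistan").
prompt = "پاکستان کا دارالحکومت"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))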

Limitations

This model is trained specifically for Urdu language understanding and generation. It may not perform well on other languages or in specialized domains.

Citation

If you use this model in your research, please cite:


@misc{urdu_bitnet,
  author = {Mahwiz Khalil},
  title = {Urdu-BitNet-medium: A Quantized Urdu Language Model},
  year = {2025},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/mahwizzzz/urdu-bitnet-medium}}
}