Urdu-BitNet-medium
This is an Urdu language model trained with the BitNet architecture. The model follows the LLaMA design but uses 1.58-bit (ternary) quantization for weights and 8-bit quantization for activations.
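For intuition, below is a minimal sketch of the quantization scheme described in the BitNet b1.58 paper (absmean ternary quantization of weights, per-token absmax 8-bit quantization of activations). It is illustrative only and is not taken from this repository's training code; function names are hypothetical.

```python
import torch

def weight_quant_ternary(w: torch.Tensor):
    """Absmean ternary quantization (BitNet b1.58 style).

    Weights are scaled by their mean absolute value, then rounded
    and clipped to the set {-1, 0, +1}.
    """
    scale = w.abs().mean().clamp(min=1e-5)
    w_q = (w / scale).round().clamp(-1, 1)
    return w_q, scale

def activation_quant_int8(x: torch.Tensor):
    """Per-token absmax quantization of activations to the 8-bit range."""
    scale = 127.0 / x.abs().max(dim=-1, keepdim=True).values.clamp(min=1e-5)
    x_q = (x * scale).round().clamp(-128, 127)
    return x_q, scale
```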
Model details
- Architecture: BitNet (based on LLaMA)
- Size: Medium
- Training data: Combined Urdu corpus (news, Wikipedia, OSCAR)
Performance metrics
- Test loss: 1.5825
Usage
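A minimal loading sketch, assuming the checkpoint is compatible with the standard transformers causal-LM loading path; a custom BitNet implementation may require `trust_remote_code=True` or repository-specific code, and the prompt below is only an example.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mahwizzzz/urdu-bitnet-medium"

# Load tokenizer and model; add trust_remote_code=True if the
# repository ships a custom BitNet architecture implementation.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Example Urdu prompt ("The capital of Pakistan")
prompt = "پاکستان کا دارالحکومت"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```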
Limitations
This model is specifically trained for Urdu language understanding and generation. It may not perform well for other languages or specialized domains.
Citation
If you use this model in your research, please cite:
@misc{urdu_bitnet,
  author       = {Mahwiz Khalil},
  title        = {Urdu-BitNet-medium: A Quantized Urdu Language Model},
  year         = {2025},
  publisher    = {HuggingFace},
  howpublished = {\url{https://huggingface.co/mahwizzzz/urdu-bitnet-medium}}
}
Base model
- meta-llama/Llama-2-7b-hf