Urdu-BitNet-medium
This is an Urdu language model trained with the BitNet architecture. The model follows the LLaMA design but uses 1.58-bit (ternary) quantization for weights and 8-bit quantization for activations.
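For intuition, below is a minimal sketch of the quantization scheme described in the BitNet b1.58 paper (absmean ternary quantization of weights, per-token absmax 8-bit quantization of activations). It is illustrative only and is not taken from this repository's training code; function names are hypothetical.

```python
import torch

def weight_quant_ternary(w: torch.Tensor):
    """Absmean ternary quantization (BitNet b1.58 style).

    Weights are scaled by their mean absolute value, then rounded
    and clipped to the set {-1, 0, +1}.
    """
    scale = w.abs().mean().clamp(min=1e-5)
    w_q = (w / scale).round().clamp(-1, 1)
    return w_q, scale

def activation_quant_int8(x: torch.Tensor):
    """Per-token absmax quantization of activations to the 8-bit range."""
    scale = 127.0 / x.abs().max(dim=-1, keepdim=True).values.clamp(min=1e-5)
    x_q = (x * scale).round().clamp(-128, 127)
    return x_q, scale
```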
Model details
- Architecture: BitNet (based on LLaMA)
- Size: Medium
- Training data: Combined Urdu corpus (news, Wikipedia, OSCAR)
Performance metrics
- Test loss: 1.5825
Usage
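A minimal loading sketch, assuming the checkpoint is compatible with the standard transformers causal-LM loading path; a custom BitNet implementation may require `trust_remote_code=True` or repository-specific code, and the prompt below is only an example.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mahwizzzz/urdu-bitnet-medium"

# Load tokenizer and model; add trust_remote_code=True if the
# repository ships a custom BitNet architecture implementation.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Example Urdu prompt ("The capital of Pakistan")
prompt = "پاکستان کا دارالحکومت"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```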
Limitations
This model is specifically trained for Urdu language understanding and generation. It may not perform well for other languages or specialized domains.
Citation
If you use this model in your research, please cite:
@misc{urdu_bitnet,
  author       = {Mahwiz Khalil},
  title        = {Urdu-BitNet-medium: A Quantized Urdu Language Model},
  year         = {2025},
  publisher    = {HuggingFace},
  howpublished = {\url{https://huggingface.co/mahwizzzz/urdu-bitnet-medium}}
}
Base model
- meta-llama/Llama-2-7b-hf