English to Bhojpuri Translation Model Alpha2

Alpha2 is a translation model fine-tuned from facebook/mbart-large-50-many-to-many-mmt to translate English text into Bhojpuri. It builds on the Alpha1 version and was trained for 4 epochs on a custom parallel dataset.
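Since the model is derived from mBART-50, it can probably be loaded with the standard mBART classes from the transformers library. The following is a minimal sketch, not a confirmed recipe from the author. It assumes the repository ID nilayshenai/BART-English-to-Bhojpuri-Alpha2, that transformers and PyTorch are installed, and, most importantly, that the target language token is `hi_IN`. mBART-50 has no dedicated Bhojpuri code, so the code actually used during fine-tuning is unknown here and may differ. The helper names `load` and `translate` are hypothetical.

```python
MODEL_ID = "nilayshenai/BART-English-to-Bhojpuri-Alpha2"  # repository ID (assumed)
SRC_LANG = "en_XX"  # mBART-50 code for English
TGT_LANG = "hi_IN"  # ASSUMPTION: Hindi code reused for Bhojpuri; check the training setup


def load(model_id=MODEL_ID):
    """Load tokenizer and model from the Hub (downloads weights on first call)."""
    # Imported lazily so the module can be imported without transformers installed.
    from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

    tokenizer = MBart50TokenizerFast.from_pretrained(model_id, src_lang=SRC_LANG)
    model = MBartForConditionalGeneration.from_pretrained(model_id)
    return tokenizer, model


def translate(text, tokenizer, model, tgt_lang=TGT_LANG, max_length=128, num_beams=4):
    """Translate one English sentence using beam search."""
    inputs = tokenizer(text, return_tensors="pt")
    generated = model.generate(
        **inputs,
        # Force the decoder to start with the target-language token.
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(tgt_lang),
        max_length=max_length,
        num_beams=num_beams,
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]
```

A typical call would be `tokenizer, model = load()` followed by `translate("How are you?", tokenizer, model)`. The `max_length` and `num_beams` values are illustrative defaults; the repository's generation_config.json may already define its own.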

Space

https://huggingface.co/spaces/nilayshenai/English-to-Bhojpuri-Translator

Updates from Alpha1

  • Trained for 4 epochs for better generalization.
  • Improved translation fluency and accuracy for Bhojpuri.

Contents

  • config.json – Model configuration.
  • generation_config.json – Generation parameters (e.g., max length, beam search).
  • model.safetensors – Fine-tuned model weights.
  • sentencepiece.bpe.model – Tokenizer vocabulary (SentencePiece model).
  • special_tokens_map.json – Mapping of special tokens (e.g., BOS, EOS).
  • tokenizer.json – Full tokenizer JSON.
  • tokenizer_config.json – Tokenizer configuration.
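
When working from a local copy of the repository (for example, after a git clone), it can help to confirm that all of the files listed above are present before loading. The sketch below uses only the Python standard library; the function name `missing_files` and the idea of a local directory are assumptions for illustration.

```python
from pathlib import Path

# File names taken from the Contents list above.
EXPECTED_FILES = (
    "config.json",
    "generation_config.json",
    "model.safetensors",
    "sentencepiece.bpe.model",
    "special_tokens_map.json",
    "tokenizer.json",
    "tokenizer_config.json",
)


def missing_files(model_dir):
    """Return the expected file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

An empty list means every listed file was found in the directory.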

License: MIT

Model size: 611M params, stored as F32 tensors in Safetensors format.