metadata
license: apache-2.0
datasets:
  - thaslimthoufica/thoufica_final_dataset
language:
  - en
  - ta
base_model:
  - unsloth/Meta-Llama-3.1-8B-bnb-4bit

English to Tamil Colloquial Translator 🚀

Fine-tuned LLaMA 3.1 8B Model for Spoken Tamil Translation

🌟 Overview

This project fine-tunes the unsloth/Meta-Llama-3.1-8B-bnb-4bit model with LoRA adapters to translate English sentences into colloquial (spoken) Tamil.
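To give a feel for why LoRA adapters make fine-tuning an 8B model affordable, the sketch below counts the trainable parameters LoRA adds to a single weight matrix. The rank of 16 is an assumption chosen for illustration, not a value stated for this project; the 4096 x 4096 shape corresponds to a square attention projection at the hidden size of Llama 3.1 8B.

```python
def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in the two LoRA factors: B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


def full_params(d_out: int, d_in: int) -> int:
    """Parameters in the frozen base weight matrix W (d_out x d_in)."""
    return d_out * d_in


hidden = 4096  # hidden size of Llama 3.1 8B
rank = 16      # ASSUMPTION: illustrative LoRA rank, not taken from this project

lora = lora_trainable_params(hidden, hidden, rank)
full = full_params(hidden, hidden)
print(f"LoRA params per matrix: {lora:,}")
print(f"Full params per matrix: {full:,}")
print(f"Trainable fraction:     {lora / full:.2%}")
```

Only the small factors are updated during training while the 4-bit base weights stay frozen, which is what keeps memory use low.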

🔥 Features

  • Uses LoRA-based fine-tuning to efficiently train a large model on limited resources.
  • Trained on the dataset thaslimthoufica/English_tamil_dataset.
  • Supports inference with sampling (temperature, top-p, and repetition penalty) for natural responses.
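The three sampling controls mentioned above can be illustrated with a small, library-free sketch of one decoding step. This is not the project's inference code: the function name `sample_next_token` and the default values are hypothetical, and the repetition penalty follows the common convention of dividing positive logits (and multiplying negative ones) for tokens that were already generated.

```python
import math
import random


def sample_next_token(logits, generated, temperature=0.7, top_p=0.9,
                      repetition_penalty=1.2, rng=random):
    """Pick the next token id from raw logits (hypothetical helper for illustration)."""
    # 1. Repetition penalty: make already-generated tokens less likely.
    adjusted = list(logits)
    for tok in set(generated):
        if adjusted[tok] > 0:
            adjusted[tok] /= repetition_penalty
        else:
            adjusted[tok] *= repetition_penalty

    # 2. Temperature: values below 1 sharpen the distribution, above 1 flatten it.
    scaled = [x / temperature for x in adjusted]

    # 3. Softmax (shifted by the max for numerical stability).
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # 4. Top-p (nucleus): keep the smallest set of tokens whose mass reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # 5. Sample among the kept tokens, weighted by their probabilities.
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]


# Toy vocabulary of three tokens; token 0 has the highest logit.
print(sample_next_token([2.0, 1.0, 0.1], generated=[], top_p=0.5))
```

In practice these knobs are usually passed to the generation call of whichever inference library is used; the sketch only shows what each one does to the next-token distribution.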

📚 Dataset

The dataset contains parallel pairs of English sentences and their colloquial Tamil translations.
You can find it here: Hugging Face Dataset
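As a rough sketch of how one parallel pair might be turned into a training string, the helper below fills a simple prompt template. Everything here is an assumption for illustration: the field names `english` and `tamil`, the template wording, and the sample pair are hypothetical and are not taken from the dataset or from this project's training script.

```python
# ASSUMPTION: hypothetical prompt template, not the one used to train this model.
PROMPT_TEMPLATE = (
    "Translate the following English sentence into colloquial Tamil.\n\n"
    "### English:\n{english}\n\n"
    "### Tamil:\n{tamil}"
)


def format_example(record: dict, eos_token: str = "") -> str:
    """Build one training string from a record with 'english' and 'tamil' keys.

    The key names are hypothetical; adjust them to the dataset's real columns.
    An end-of-sequence token can be appended so the model learns where to stop.
    """
    text = PROMPT_TEMPLATE.format(
        english=record["english"].strip(),
        tamil=record["tamil"].strip(),
    )
    return text + eos_token


# Illustrative pair written for this example, not a row from the dataset.
pair = {"english": "How are you?", "tamil": "எப்படி இருக்கீங்க?"}
print(format_example(pair))
```

At inference time the same template would typically be used with the Tamil field left empty, so the model completes the translation.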