---
license: mit
base_model:
- google-bert/bert-base-cased
- Ransaka/sinhala-bert-medium-v2
tags:
- generated_from_trainer
- finetune
datasets:
- vrclc/dakshina-lexicons-ml
language:
- si
new_version: google-bert/bert-base-cased
widget:
- text: "අපි තමයි [MASK] කරේ."
- text: "මට හෙට එන්න වෙන්නේ [MASK]."
- text: "අපි ගෙදර [MASK]."
- text: "සිංහල සහ [MASK] අලුත් අවුරුද්ද."
---

# fine-tune-sinhala-bert

This model is a BERT masked language model pretrained on Sinhala data resources.

## Model description

- hidden_size: 786
- num_hidden_layers: 6
- num_attention_heads: 6
- intermediate_size: 1024

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 5e-05
- train_batch_size: 64
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 6

### Training results

| Epoch | Training Loss | Validation Loss |
|:-----:|:-------------:|:---------------:|
| 1     | 3.946600      | 3.898129        |
| 2     | 3.782100      | 3.800080        |
| 3     | 3.678300      | 3.706316        |
| 4     | 3.485600      | 3.646217        |
| 5     | 3.480900      | 3.601913        |
| 6     | 3.420000      | 3.615573        |

### Framework versions

- Transformers 4.47.0
- Pytorch 2.0.0
- Datasets 3.2.0
- Tokenizers 0.21.0
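
For reference, the values listed under *Model description* correspond to a small BERT configuration. The snippet below is a minimal sketch only; `vocab_size` is an assumption (it is not stated in this card) and must match the tokenizer actually shipped with the checkpoint.

```python
from transformers import BertConfig, BertForMaskedLM

# Minimal sketch of the configuration described in "Model description".
# vocab_size is an assumed placeholder, not taken from this card.
config = BertConfig(
    vocab_size=30522,
    hidden_size=786,
    num_hidden_layers=6,
    num_attention_heads=6,
    intermediate_size=1024,
)
model = BertForMaskedLM(config)
print(f"{model.num_parameters():,} parameters")
```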
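
Similarly, the training hyperparameters above map onto `TrainingArguments` roughly as sketched below. This is not the exact training script: the `output_dir`, dataset preparation, and data collator are assumptions not covered by this card, and the Adam betas/epsilon listed above are the library defaults, so they are not set explicitly.

```python
from transformers import TrainingArguments

# Sketch mapping the listed hyperparameters onto TrainingArguments.
# output_dir is an assumed placeholder.
training_args = TrainingArguments(
    output_dir="fine-tune-sinhala-bert",
    learning_rate=5e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=6,
)
```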
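
Finally, since the widget prompts above exercise masked-token prediction, here is a minimal fill-mask inference sketch; the model id is a placeholder for wherever this checkpoint is published on the Hub.

```python
from transformers import pipeline

# Placeholder model id; replace with the actual Hub path of this checkpoint.
fill_mask = pipeline("fill-mask", model="your-username/fine-tune-sinhala-bert")

# One of the widget prompts from this card.
for prediction in fill_mask("අපි ගෙදර [MASK]."):
    print(prediction["token_str"], round(prediction["score"], 4))
```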