---
library_name: transformers
license: mit
base_model: facebook/m2m100_1.2B
tags:
- generated_from_trainer
metrics:
- bleu
model-index:
- name: m2m100_1.2B-ft-en-to-cy
  results: []
datasets:
- techiaith/llyw-cymru-en-cy-ogl
- mgrbyte/bydtermcymru-tm-en-cy
- mgrbyte/cardiff-university-tm-en-cy
- mgrbyte/cwm-taf-morgannwg-university-health-board-tm-en-cy
language:
- cy
- en
pipeline_tag: translation
---

# m2m100_1.2B-ft-en-to-cy

This model is a fine-tuned version of [facebook/m2m100_1.2B](https://huggingface.co/facebook/m2m100_1.2B) for English-to-Welsh translation, trained on the English-Welsh datasets listed in the metadata above. It achieves the following results on the evaluation set:
- Loss: 0.5864
- Bleu: 54.8016
- Gen Len: 33.9191

## Model description

M2M100 is a multilingual encoder-decoder model trained for many-to-many translation. This checkpoint fine-tunes the 1.2B-parameter variant specifically for translating English (`en`) into Welsh (`cy`).

## Intended uses & limitations

The model is intended for English-to-Welsh machine translation. Because the fine-tuning data consists largely of public-sector translation memories, it is likely to perform best on formal and administrative text; quality on informal or highly specialised text has not been evaluated.

## Training and evaluation data

The model was fine-tuned on English-Welsh parallel data from the following datasets:
- [techiaith/llyw-cymru-en-cy-ogl](https://huggingface.co/datasets/techiaith/llyw-cymru-en-cy-ogl)
- [mgrbyte/bydtermcymru-tm-en-cy](https://huggingface.co/datasets/mgrbyte/bydtermcymru-tm-en-cy)
- [mgrbyte/cardiff-university-tm-en-cy](https://huggingface.co/datasets/mgrbyte/cardiff-university-tm-en-cy)
- [mgrbyte/cwm-taf-morgannwg-university-health-board-tm-en-cy](https://huggingface.co/datasets/mgrbyte/cwm-taf-morgannwg-university-health-board-tm-en-cy)

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 6000
- training_steps: 30000

### Training results

| Training Loss | Epoch  | Step  | Validation Loss | Bleu    | Gen Len |
|:-------------:|:------:|:-----:|:---------------:|:-------:|:-------:|
| 2.3252        | 0.0166 | 2000  | 1.9613          | 21.0612 | 35.9022 |
| 1.5651        | 0.0332 | 4000  | 1.2919          | 34.3962 | 34.8431 |
| 1.1755        | 0.0499 | 6000  | 0.9977          | 42.2725 | 34.4593 |
| 0.9801        | 0.0665 | 8000  | 0.8545          | 46.4396 | 33.9573 |
| 0.8697        | 0.0831 | 10000 | 0.7763          | 48.9327 | 34.0146 |
| 0.8130        | 0.0997 | 12000 | 0.7224          | 50.2154 | 33.8613 |
| 0.7790        | 0.1164 | 14000 | 0.6911          | 51.4013 | 33.9477 |
| 0.7436        | 0.1330 | 16000 | 0.6648          | 52.2204 | 33.9345 |
| 0.7224        | 0.1496 | 18000 | 0.6437          | 52.9165 | 33.9964 |
| 0.7034        | 0.1662 | 20000 | 0.6279          | 53.6142 | 33.9663 |
| 0.6783        | 0.1829 | 22000 | 0.6134          | 53.7386 | 33.9527 |
| 0.6765        | 0.1995 | 24000 | 0.6029          | 54.4546 | 33.9550 |
| 0.6560        | 0.2161 | 26000 | 0.5941          | 54.5817 | 33.9145 |
| 0.6522        | 0.2327 | 28000 | 0.5884          | 54.7280 | 33.9163 |
| 0.6562        | 0.2494 | 30000 | 0.5864          | 54.8016 | 33.9191 |

### Framework versions

- Transformers 4.50.0
- Pytorch 2.6.0+cu124
- Datasets 3.4.1
- Tokenizers 0.21.1
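
## How to use

A minimal inference sketch using the standard M2M100 API from `transformers`. The repository id below is assumed from the model name; replace it with the full Hub id (including namespace) of this repository, and note that the example sentence is illustrative only.

```python
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

# Assumed repository id; substitute the full "<namespace>/m2m100_1.2B-ft-en-to-cy" Hub id.
model_id = "m2m100_1.2B-ft-en-to-cy"

tokenizer = M2M100Tokenizer.from_pretrained(model_id)
model = M2M100ForConditionalGeneration.from_pretrained(model_id)

# M2M100 needs the source language set on the tokenizer, and the target
# language forced as the first generated token.
tokenizer.src_lang = "en"
inputs = tokenizer("The weather is fine today.", return_tensors="pt")
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.get_lang_id("cy"),
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```

Forcing `tokenizer.get_lang_id("cy")` as the first decoder token is what selects Welsh as the output language; without it, M2M100 may translate into the wrong language.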
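
## Training configuration sketch

A hedged reconstruction of the configuration implied by the hyperparameters above, expressed with `Seq2SeqTrainer`. This is a sketch, not the authors' actual training script: dataset loading, tokenization, and the BLEU `compute_metrics` function are omitted, `tokenized_train` and `tokenized_eval` are placeholders for preprocessed English-Welsh pairs, and `eval_steps=2000` is inferred from the evaluation cadence in the results table.

```python
from transformers import (
    DataCollatorForSeq2Seq,
    M2M100ForConditionalGeneration,
    M2M100Tokenizer,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_1.2B")
model = M2M100ForConditionalGeneration.from_pretrained("facebook/m2m100_1.2B")

args = Seq2SeqTrainingArguments(
    output_dir="m2m100_1.2B-ft-en-to-cy",
    learning_rate=1e-5,                 # as listed above
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adamw_torch",                # AdamW; betas=(0.9, 0.999), eps=1e-08 are the defaults
    lr_scheduler_type="linear",
    warmup_steps=6000,
    max_steps=30000,
    eval_strategy="steps",
    eval_steps=2000,                    # assumption: inferred from the results table
    predict_with_generate=True,         # needed so BLEU can be computed at evaluation time
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized_train,      # placeholder: tokenized en->cy pairs
    eval_dataset=tokenized_eval,        # placeholder
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    processing_class=tokenizer,
)
trainer.train()
```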