---
base_model: google/mt5-base
license: apache-2.0
metrics:
  - bleu
tags:
  - generated_from_trainer
model-index:
  - name: mt5-base-ainu
    results: []
---

# mt5-base-ainu

This model is a fine-tuned version of [google/mt5-base](https://huggingface.co/google/mt5-base) on an unknown dataset.
It achieves the following results on the evaluation set:

- Loss: 2.1105
- Bleu: 37.4939
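
Usage is not documented yet, so the following is only a minimal inference sketch with 🤗 Transformers. The repository id, the input format (task prefix, language tags), and the translation direction are assumptions, not documented behaviour.

```python
from transformers import AutoTokenizer, MT5ForConditionalGeneration

# The repository id below is an assumption; substitute the actual Hub path.
model_id = "rigarashi/mt5-base-ainu"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = MT5ForConditionalGeneration.from_pretrained(model_id)

# The expected input format (task prefix, translation direction, language tags)
# is not documented on this card, so plain source text is used as a placeholder.
text = "Replace this with source-language text."
inputs = tokenizer(text, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```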

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

- learning_rate: 0.0005
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.06
- num_epochs: 20
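
These settings map roughly onto `Seq2SeqTrainingArguments` as sketched below; `output_dir`, `evaluation_strategy`, and `predict_with_generate` are assumptions that are not listed above.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch only: reproduces the listed hyperparameters; other arguments are assumptions.
training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-base-ainu",      # assumption
    learning_rate=5e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    gradient_accumulation_steps=2,   # effective train batch size: 16 * 2 = 32
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.06,
    num_train_epochs=20,
    evaluation_strategy="epoch",     # assumption: the results table reports one eval per epoch
    predict_with_generate=True,      # assumption: needed to compute BLEU during evaluation
)
```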

### Training results

| Training Loss | Epoch | Step   | Validation Loss | Bleu    |
|:-------------:|:-----:|:------:|:---------------:|:-------:|
| 2.1267        | 1.0   | 9341   | 1.8026          | 20.8450 |
| 1.6408        | 2.0   | 18682  | 1.4706          | 26.7109 |
| 1.4098        | 3.0   | 28023  | 1.3494          | 30.7048 |
| 1.2546        | 4.0   | 37364  | 1.2910          | 32.5056 |
| 1.124         | 5.0   | 46705  | 1.2617          | 33.7060 |
| 1.0048        | 6.0   | 56046  | 1.2578          | 34.5824 |
| 0.8872        | 7.0   | 65387  | 1.2639          | 35.1029 |
| 0.8103        | 8.0   | 74728  | 1.2955          | 35.7998 |
| 0.7298        | 9.0   | 84069  | 1.3284          | 35.8310 |
| 0.6494        | 10.0  | 93410  | 1.3780          | 36.3268 |
| 0.5696        | 11.0  | 102751 | 1.4343          | 36.2494 |
| 0.5148        | 12.0  | 112092 | 1.4957          | 36.8702 |
| 0.4487        | 13.0  | 121433 | 1.5511          | 36.8981 |
| 0.3941        | 14.0  | 130774 | 1.6235          | 36.8809 |
| 0.3432        | 15.0  | 140115 | 1.6957          | 37.0269 |
| 0.3023        | 16.0  | 149456 | 1.7935          | 37.1839 |
| 0.2614        | 17.0  | 158797 | 1.8619          | 37.1935 |
| 0.2267        | 18.0  | 168138 | 1.9485          | 37.4170 |
| 0.1996        | 19.0  | 177479 | 2.0348          | 37.3585 |
| 0.1746        | 20.0  | 186820 | 2.1105          | 37.4939 |
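
The exact evaluation setup is not documented; a generic `compute_metrics` sketch for BLEU, assuming the common sacrebleu-via-`evaluate` pattern and the base model's tokenizer, could look like this:

```python
import numpy as np
import evaluate
from transformers import AutoTokenizer

# Assumptions: BLEU is computed with sacrebleu through the `evaluate` library,
# and the tokenizer is the one from google/mt5-base.
tokenizer = AutoTokenizer.from_pretrained("google/mt5-base")
sacrebleu = evaluate.load("sacrebleu")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    # Labels use -100 for padding (ignored by the loss); restore the pad token id before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = sacrebleu.compute(
        predictions=decoded_preds,
        references=[[ref] for ref in decoded_labels],
    )
    return {"bleu": result["score"]}
```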

### Framework versions

- Transformers 4.40.1
- Pytorch 2.1.2
- Datasets 2.19.0
- Tokenizers 0.19.1