mt5-small-finetune-finetuned-research-papers-XX

This model is a fine-tuned version of google/mt5-small on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 2.5181
  • Rouge1: 37.7539
  • Rouge2: 18.9504
  • RougeL: 33.145
  • RougeLsum: 33.1903
  • Gen Len: 16.3255

Model description

More information needed

Intended uses & limitations

More information needed
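
Pending details from the author, below is a minimal inference sketch. It assumes the model performs abstractive summarization of research-paper text (as the repository name suggests); the generation settings (max_length, num_beams) are illustrative assumptions, not documented values.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "Mug3n24/mt5-small-finetune-finetuned-research-papers-XX"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Illustrative input; replace with a real research-paper abstract or body.
text = "We propose a new method for ..."

inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
# max_length and num_beams are assumptions; the card's Gen Len (~16 tokens)
# suggests the model generates short sequences.
summary_ids = model.generate(**inputs, max_length=64, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```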

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5.6e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 4
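
For reference, these settings map onto a Seq2SeqTrainingArguments configuration roughly as sketched below. The output_dir and evaluation cadence are assumptions (the results table suggests evaluation and logging every 500 steps), not values documented by this card.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-small-finetune-finetuned-research-papers-XX",  # assumed
    learning_rate=5.6e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    lr_scheduler_type="linear",
    num_train_epochs=4,
    eval_strategy="steps",  # assumed from the 500-step results table
    eval_steps=500,
    logging_steps=500,
    predict_with_generate=True,
)
```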

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | RougeL  | RougeLsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 2.8525        | 0.5   | 500  | 2.6013          | 36.7086 | 17.6689 | 31.9973 | 32.0291   | 16.3635 |
| 2.8439        | 1.0   | 1000 | 2.6248          | 37.6033 | 18.3621 | 32.8963 | 32.9445   | 16.497  |
| 2.7426        | 1.5   | 1500 | 2.5630          | 36.6049 | 17.5093 | 31.9676 | 31.9867   | 16.1745 |
| 2.714         | 2.0   | 2000 | 2.5636          | 37.1961 | 18.0863 | 32.5238 | 32.5846   | 16.397  |
| 2.6864        | 2.5   | 2500 | 2.5606          | 37.9728 | 18.93   | 33.351  | 33.374    | 16.2275 |
| 2.7265        | 3.0   | 3000 | 2.5343          | 37.5678 | 18.7011 | 33.0497 | 33.083    | 16.3985 |
| 2.7086        | 3.5   | 3500 | 2.5322          | 37.8949 | 18.8538 | 33.1814 | 33.2308   | 16.342  |
| 2.7841        | 4.0   | 4000 | 2.5181          | 37.7539 | 18.9504 | 33.145  | 33.1903   | 16.3255 |
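
The ROUGE and Gen Len columns match the output of a standard summarization compute_metrics hook. A sketch of such a function follows, assuming the evaluate library and a tokenizer in the enclosing scope; the exact function used for this run is not documented here.

```python
import numpy as np
import evaluate

rouge = evaluate.load("rouge")

def compute_metrics(eval_pred):
    # Assumes `tokenizer` is defined in the enclosing scope.
    predictions, labels = eval_pred
    decoded_preds = tokenizer.batch_decode(predictions, skip_special_tokens=True)
    # Replace label padding (-100) before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = rouge.compute(
        predictions=decoded_preds, references=decoded_labels, use_stemmer=True
    )
    # Scale ROUGE scores to percentages, as reported in the table above.
    result = {key: round(value * 100, 4) for key, value in result.items()}
    # Mean generated length, excluding padding (the Gen Len column).
    result["gen_len"] = np.mean(
        [np.count_nonzero(pred != tokenizer.pad_token_id) for pred in predictions]
    )
    return result
```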

Framework versions

  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1