---
license: apache-2.0
base_model: IlyaGusev/rut5_base_sum_gazeta
tags:
  - summarization_3
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: rut5_base_sum_gazeta-finetuned_week_gpt
    results: []
---

# rut5_base_sum_gazeta-finetuned_week_gpt

This model is a fine-tuned version of IlyaGusev/rut5_base_sum_gazeta on an unspecified dataset. It achieves the following results on the evaluation set:

- Loss: 1.0665
- ROUGE-1: 38.7802
- ROUGE-2: 18.8758
- ROUGE-L: 38.1542
- ROUGE-Lsum: 38.195
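The scores above were produced by the `rouge` metric during evaluation. For intuition only, ROUGE-1 rewards unigram overlap between the generated and reference summaries; the sketch below is a simplified illustration (whitespace tokenization, no stemming), not the implementation used for this card:

```python
from collections import Counter

def rouge1_f(candidate: str, reference: str) -> float:
    """Simplified ROUGE-1 F1: unigram overlap between two whitespace-tokenized strings."""
    cand = Counter(candidate.split())
    ref = Counter(reference.split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

print(rouge1_f("the cat sat on the mat", "the cat lay on the mat"))
```

ROUGE-2 applies the same idea to bigrams, and ROUGE-L/ROUGE-Lsum use longest common subsequences instead of fixed n-grams.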

## Model description

More information needed

## Intended uses & limitations

More information needed
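Since the base model is a Russian news summarizer, a plausible use is abstractive summarization of Russian text. A minimal inference sketch follows; note that the repo id is inferred from the card title and may differ from the actual Hub path, and the generation parameters are illustrative defaults, not the settings used by the author:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Assumed repo id, inferred from the model name in this card.
model_id = "Natet/rut5_base_sum_gazeta-finetuned_week_gpt"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "..."  # a long Russian text to summarize

inputs = tokenizer(text, max_length=600, truncation=True, return_tensors="pt")
summary_ids = model.generate(
    **inputs,
    max_length=200,          # illustrative generation settings
    num_beams=4,
    no_repeat_ngram_size=4,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```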

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 5.6e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 8
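With a linear scheduler and the 4,440 total optimizer steps shown in the results table (555 steps × 8 epochs), the learning rate decays from 5.6e-05 to zero. The sketch below assumes no warmup steps, since none are listed in this card:

```python
def linear_lr(step: int, base_lr: float = 5.6e-5, total_steps: int = 4440) -> float:
    """Linear decay from base_lr at step 0 to 0.0 at total_steps (no warmup assumed)."""
    return base_lr * max(0.0, 1.0 - step / total_steps)

print(linear_lr(0))     # start of training
print(linear_lr(2220))  # halfway through
print(linear_lr(4440))  # end of training
```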

### Training results

| Training Loss | Epoch | Step | Validation Loss | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:----------:|
| No log        | 1.0   | 555  | 1.1788          | 36.7978 | 17.6912 | 36.1337 | 36.1391    |
| 1.3896        | 2.0   | 1110 | 1.0992          | 37.9462 | 18.6497 | 37.3932 | 37.4791    |
| 1.3896        | 3.0   | 1665 | 1.1053          | 38.8205 | 18.8297 | 38.0614 | 38.1843    |
| 1.1331        | 4.0   | 2220 | 1.1029          | 38.3632 | 18.7051 | 37.6540 | 37.7872    |
| 1.1331        | 5.0   | 2775 | 1.0798          | 39.1371 | 18.8761 | 38.4425 | 38.4942    |
| 1.0312        | 6.0   | 3330 | 1.0602          | 38.6421 | 18.9015 | 38.0504 | 38.0638    |
| 1.0312        | 7.0   | 3885 | 1.0650          | 39.2291 | 19.0341 | 38.6098 | 38.6528    |
| 0.9750        | 8.0   | 4440 | 1.0665          | 38.7802 | 18.8758 | 38.1542 | 38.1950    |
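The step column grows by 555 per epoch, consistent with the 4,440 total steps over 8 epochs. A back-of-the-envelope check (assuming no gradient accumulation and that the last batch may be partial) bounds the size of the otherwise-undocumented training set:

```python
# Infer training-set size from steps per epoch and batch size.
steps_per_epoch = 555
batch_size = 16

upper = steps_per_epoch * batch_size            # every batch full
lower = (steps_per_epoch - 1) * batch_size + 1  # last batch holds one example
print(f"implied training-set size: {lower}..{upper}")
```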

### Framework versions

- Transformers 4.33.0
- PyTorch 2.0.0
- Datasets 2.1.0
- Tokenizers 0.13.3
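To approximate this training environment, the versions above can be pinned directly (`torch` is the PyPI package name for PyTorch; a matching CUDA build may need the appropriate index URL):

```shell
pip install "transformers==4.33.0" "torch==2.0.0" "datasets==2.1.0" "tokenizers==0.13.3"
```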