# my-lora-local-combined-sum
This model is a fine-tuned version of google/mt5-small on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 5.3239
- Rouge1: 3.1309
- Rouge2: 0.1383
- Rougel: 2.9320
- Rougelsum: 2.9324
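
The card does not include usage instructions; the sketch below is a minimal, hedged example of loading the adapter on top of the base model with PEFT, assuming the adapter is published as `benitoals/my-lora-local-combined-sum` (the repository this card belongs to).

```python
# Minimal usage sketch (not part of the original card): load the LoRA adapter
# on top of the google/mt5-small base model with PEFT and generate a summary.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from peft import PeftModel

tokenizer = AutoTokenizer.from_pretrained("google/mt5-small")
base_model = AutoModelForSeq2SeqLM.from_pretrained("google/mt5-small")
model = PeftModel.from_pretrained(base_model, "benitoals/my-lora-local-combined-sum")

text = "Your input document goes here."  # placeholder input
inputs = tokenizer(text, return_tensors="pt", truncation=True)
summary_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```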
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 2
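
As a rough illustration (an assumption, since the training script is not part of the card), the values above map onto `Seq2SeqTrainingArguments` roughly as follows; `output_dir` and any defaults left unset are placeholders, not taken from the card.

```python
# Hedged reconstruction of the training configuration from the list above.
# Only the listed hyperparameters come from the card.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="my-lora-local-combined-sum",  # assumed
    learning_rate=1e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    optim="adamw_torch",        # AdamW with betas=(0.9, 0.999), eps=1e-08
    lr_scheduler_type="linear",
    num_train_epochs=2,
)
```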
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
23.652 | 0.0160 | 5 | 11.7543 | 0.5335 | 0.0402 | 0.4991 | 0.4949 |
21.4941 | 0.0319 | 10 | 11.7207 | 0.5200 | 0.0374 | 0.4762 | 0.4711 |
19.422 | 0.0479 | 15 | 11.6789 | 0.6095 | 0.0500 | 0.5583 | 0.5571 |
22.1389 | 0.0639 | 20 | 11.6402 | 0.6042 | 0.0481 | 0.5449 | 0.5407 |
20.0441 | 0.0799 | 25 | 11.5933 | 0.6250 | 0.0374 | 0.5482 | 0.5459 |
19.8832 | 0.0958 | 30 | 11.5414 | 0.6758 | 0.0517 | 0.5900 | 0.5834 |
19.6299 | 0.1118 | 35 | 11.4640 | 0.6860 | 0.0464 | 0.6029 | 0.6038 |
21.9358 | 0.1278 | 40 | 11.4154 | 0.6940 | 0.0536 | 0.6154 | 0.6163 |
22.2149 | 0.1438 | 45 | 11.3558 | 0.6759 | 0.0489 | 0.6079 | 0.6052 |
21.5263 | 0.1597 | 50 | 11.2812 | 0.6470 | 0.0584 | 0.5696 | 0.5700 |
20.7417 | 0.1757 | 55 | 11.2026 | 0.6474 | 0.0542 | 0.5906 | 0.5867 |
23.6982 | 0.1917 | 60 | 11.1133 | 0.5920 | 0.0661 | 0.5417 | 0.5435 |
17.6926 | 0.2077 | 65 | 11.0459 | 0.6807 | 0.0713 | 0.6018 | 0.5990 |
21.006 | 0.2236 | 70 | 10.9396 | 0.5966 | 0.0472 | 0.5545 | 0.5506 |
16.5991 | 0.2396 | 75 | 10.8254 | 0.6446 | 0.0594 | 0.5795 | 0.5795 |
16.2146 | 0.2556 | 80 | 10.7155 | 0.6542 | 0.0608 | 0.5892 | 0.5876 |
22.3505 | 0.2716 | 85 | 10.6118 | 0.6909 | 0.0519 | 0.6129 | 0.6082 |
21.3726 | 0.2875 | 90 | 10.4817 | 0.6439 | 0.0595 | 0.5608 | 0.5584 |
16.7014 | 0.3035 | 95 | 10.3467 | 0.6685 | 0.0720 | 0.5853 | 0.5847 |
18.2504 | 0.3195 | 100 | 10.2216 | 0.7344 | 0.0705 | 0.6395 | 0.6370 |
17.9526 | 0.3355 | 105 | 10.1161 | 0.7135 | 0.0850 | 0.6213 | 0.6190 |
19.2727 | 0.3514 | 110 | 10.0225 | 0.6474 | 0.0965 | 0.5498 | 0.5467 |
17.3108 | 0.3674 | 115 | 9.8542 | 0.6541 | 0.0800 | 0.5539 | 0.5547 |
14.8073 | 0.3834 | 120 | 9.7522 | 0.6335 | 0.0599 | 0.5548 | 0.5489 |
16.2244 | 0.3994 | 125 | 9.6181 | 0.7317 | 0.0618 | 0.6425 | 0.6337 |
14.5446 | 0.4153 | 130 | 9.5206 | 0.7480 | 0.0618 | 0.6331 | 0.6282 |
15.6145 | 0.4313 | 135 | 9.4280 | 0.7840 | 0.0915 | 0.6683 | 0.6591 |
14.5598 | 0.4473 | 140 | 9.3180 | 0.7976 | 0.0747 | 0.6857 | 0.6803 |
16.626 | 0.4633 | 145 | 9.1933 | 0.7144 | 0.0680 | 0.6037 | 0.6026 |
15.2717 | 0.4792 | 150 | 9.0056 | 0.6882 | 0.0764 | 0.5920 | 0.5917 |
13.4159 | 0.4952 | 155 | 8.8481 | 0.7891 | 0.0724 | 0.6943 | 0.6895 |
14.1487 | 0.5112 | 160 | 8.6840 | 0.8197 | 0.0542 | 0.7087 | 0.7051 |
16.2853 | 0.5272 | 165 | 8.5537 | 0.8577 | 0.0479 | 0.7464 | 0.7425 |
13.7284 | 0.5431 | 170 | 8.4408 | 0.8904 | 0.0404 | 0.8066 | 0.8058 |
13.035 | 0.5591 | 175 | 8.3316 | 0.9318 | 0.0634 | 0.8208 | 0.8100 |
12.7887 | 0.5751 | 180 | 8.2053 | 1.0418 | 0.0625 | 0.9095 | 0.9066 |
12.2665 | 0.5911 | 185 | 8.0928 | 1.0168 | 0.0567 | 0.8907 | 0.8891 |
12.7005 | 0.6070 | 190 | 7.9882 | 1.1973 | 0.0850 | 1.0934 | 1.0860 |
12.7293 | 0.6230 | 195 | 7.8489 | 1.2476 | 0.0833 | 1.1583 | 1.1515 |
12.1713 | 0.6390 | 200 | 7.6980 | 1.6482 | 0.1230 | 1.4793 | 1.4661 |
11.501 | 0.6550 | 205 | 7.5613 | 1.6241 | 0.1111 | 1.4590 | 1.4586 |
11.6457 | 0.6709 | 210 | 7.4689 | 1.8776 | 0.1296 | 1.6991 | 1.6871 |
11.7026 | 0.6869 | 215 | 7.3935 | 1.9311 | 0.1224 | 1.7739 | 1.7714 |
10.5877 | 0.7029 | 220 | 7.3031 | 2.1049 | 0.1427 | 1.8897 | 1.8797 |
10.9339 | 0.7188 | 225 | 7.2184 | 2.2620 | 0.1604 | 2.0377 | 2.0331 |
9.8031 | 0.7348 | 230 | 7.1438 | 2.3388 | 0.1500 | 2.1312 | 2.1271 |
10.2145 | 0.7508 | 235 | 7.0725 | 2.4207 | 0.1572 | 2.1943 | 2.1750 |
10.0909 | 0.7668 | 240 | 7.0029 | 2.4045 | 0.1427 | 2.1776 | 2.1770 |
9.8222 | 0.7827 | 245 | 6.9333 | 2.3929 | 0.1324 | 2.1964 | 2.1889 |
9.0728 | 0.7987 | 250 | 6.8557 | 2.4630 | 0.1364 | 2.2430 | 2.2337 |
9.308 | 0.8147 | 255 | 6.7682 | 2.5453 | 0.1473 | 2.3249 | 2.3172 |
9.0191 | 0.8307 | 260 | 6.6723 | 2.6619 | 0.1206 | 2.4392 | 2.4277 |
9.2552 | 0.8466 | 265 | 6.5731 | 2.5866 | 0.1098 | 2.3381 | 2.3322 |
8.7318 | 0.8626 | 270 | 6.4862 | 2.5568 | 0.1112 | 2.3082 | 2.3006 |
8.9958 | 0.8786 | 275 | 6.4084 | 2.5543 | 0.1102 | 2.3193 | 2.3160 |
8.2857 | 0.8946 | 280 | 6.3348 | 2.5717 | 0.1146 | 2.3284 | 2.3330 |
8.3112 | 0.9105 | 285 | 6.2517 | 2.7736 | 0.1357 | 2.5063 | 2.5098 |
8.8349 | 0.9265 | 290 | 6.1757 | 2.7153 | 0.1111 | 2.4674 | 2.4641 |
8.6952 | 0.9425 | 295 | 6.1052 | 2.7401 | 0.1081 | 2.5038 | 2.5073 |
8.5782 | 0.9585 | 300 | 6.0515 | 2.7391 | 0.1211 | 2.5241 | 2.5096 |
8.6426 | 0.9744 | 305 | 6.0058 | 2.7657 | 0.1443 | 2.5530 | 2.5519 |
8.1808 | 0.9904 | 310 | 5.9701 | 2.7706 | 0.1457 | 2.5485 | 2.5441 |
8.3015 | 1.0064 | 315 | 5.9410 | 2.7316 | 0.1531 | 2.5080 | 2.5072 |
8.3451 | 1.0224 | 320 | 5.9180 | 2.6415 | 0.1317 | 2.4510 | 2.4458 |
7.873 | 1.0383 | 325 | 5.8935 | 2.6211 | 0.1097 | 2.4057 | 2.4020 |
7.8035 | 1.0543 | 330 | 5.8663 | 2.5718 | 0.0779 | 2.3788 | 2.3755 |
7.4984 | 1.0703 | 335 | 5.8367 | 2.4868 | 0.0582 | 2.3166 | 2.3104 |
7.3556 | 1.0863 | 340 | 5.8020 | 2.4792 | 0.0550 | 2.3044 | 2.2916 |
8.0786 | 1.1022 | 345 | 5.7732 | 2.4080 | 0.0420 | 2.2622 | 2.2553 |
7.5338 | 1.1182 | 350 | 5.7436 | 2.4495 | 0.0461 | 2.2676 | 2.2592 |
7.5628 | 1.1342 | 355 | 5.7189 | 2.5874 | 0.0641 | 2.3737 | 2.3701 |
7.4467 | 1.1502 | 360 | 5.6907 | 2.5347 | 0.0413 | 2.3525 | 2.3482 |
7.6056 | 1.1661 | 365 | 5.6671 | 2.5697 | 0.0448 | 2.3524 | 2.3528 |
7.463 | 1.1821 | 370 | 5.6491 | 2.6025 | 0.0448 | 2.3822 | 2.3773 |
7.1363 | 1.1981 | 375 | 5.6348 | 2.5469 | 0.0357 | 2.3393 | 2.3311 |
7.0927 | 1.2141 | 380 | 5.6140 | 2.5685 | 0.0589 | 2.3623 | 2.3654 |
7.0816 | 1.2300 | 385 | 5.5978 | 2.6166 | 0.0589 | 2.4159 | 2.4212 |
6.8517 | 1.2460 | 390 | 5.5818 | 2.6452 | 0.0499 | 2.4727 | 2.4786 |
8.0498 | 1.2620 | 395 | 5.5675 | 2.7419 | 0.0512 | 2.5631 | 2.5606 |
7.0932 | 1.2780 | 400 | 5.5517 | 2.6195 | 0.0476 | 2.4942 | 2.4895 |
6.8587 | 1.2939 | 405 | 5.5344 | 2.5747 | 0.0391 | 2.4363 | 2.4312 |
6.9344 | 1.3099 | 410 | 5.5205 | 2.6110 | 0.0351 | 2.4583 | 2.4549 |
6.7229 | 1.3259 | 415 | 5.5108 | 2.6736 | 0.0334 | 2.5078 | 2.5052 |
6.6704 | 1.3419 | 420 | 5.5011 | 2.6529 | 0.0289 | 2.4826 | 2.4746 |
6.6762 | 1.3578 | 425 | 5.4873 | 2.6323 | 0.0289 | 2.4645 | 2.4530 |
6.6617 | 1.3738 | 430 | 5.4773 | 2.6210 | 0.0211 | 2.4641 | 2.4561 |
6.783 | 1.3898 | 435 | 5.4706 | 2.6646 | 0.0226 | 2.4979 | 2.4957 |
6.54 | 1.4058 | 440 | 5.4643 | 2.6527 | 0.0170 | 2.4945 | 2.4867 |
6.5901 | 1.4217 | 445 | 5.4572 | 2.6805 | 0.0170 | 2.5278 | 2.5214 |
6.6093 | 1.4377 | 450 | 5.4519 | 2.6769 | 0.0234 | 2.5196 | 2.5075 |
6.5077 | 1.4537 | 455 | 5.4466 | 2.6851 | 0.0274 | 2.5514 | 2.5439 |
6.3797 | 1.4696 | 460 | 5.4402 | 2.6756 | 0.0270 | 2.5497 | 2.5380 |
6.64 | 1.4856 | 465 | 5.4353 | 2.7025 | 0.0423 | 2.5647 | 2.5530 |
6.9645 | 1.5016 | 470 | 5.4311 | 2.7576 | 0.0257 | 2.6191 | 2.6132 |
6.6399 | 1.5176 | 475 | 5.4263 | 2.7655 | 0.0217 | 2.6279 | 2.6225 |
7.1064 | 1.5335 | 480 | 5.4224 | 2.7561 | 0.0217 | 2.6194 | 2.6118 |
6.5318 | 1.5495 | 485 | 5.4186 | 2.7226 | 0.0217 | 2.5902 | 2.5744 |
6.6985 | 1.5655 | 490 | 5.4142 | 2.7083 | 0.0217 | 2.5735 | 2.5629 |
6.3787 | 1.5815 | 495 | 5.4100 | 2.7172 | 0.0217 | 2.5852 | 2.5765 |
6.4844 | 1.5974 | 500 | 5.4051 | 2.7730 | 0.0217 | 2.6503 | 2.6442 |
6.5958 | 1.6134 | 505 | 5.4002 | 2.8511 | 0.0384 | 2.7220 | 2.7149 |
6.4135 | 1.6294 | 510 | 5.3951 | 2.8595 | 0.0409 | 2.7355 | 2.7261 |
6.3097 | 1.6454 | 515 | 5.3896 | 2.8092 | 0.0462 | 2.6927 | 2.6906 |
6.3015 | 1.6613 | 520 | 5.3836 | 2.8498 | 0.0579 | 2.7391 | 2.7357 |
6.2501 | 1.6773 | 525 | 5.3776 | 2.8643 | 0.0612 | 2.7585 | 2.7506 |
6.4105 | 1.6933 | 530 | 5.3714 | 2.8698 | 0.0612 | 2.7576 | 2.7500 |
6.9844 | 1.7093 | 535 | 5.3667 | 2.9023 | 0.0698 | 2.7706 | 2.7640 |
6.5541 | 1.7252 | 540 | 5.3623 | 2.9349 | 0.0733 | 2.8098 | 2.8037 |
6.2722 | 1.7412 | 545 | 5.3579 | 2.9333 | 0.0675 | 2.8005 | 2.7961 |
6.3432 | 1.7572 | 550 | 5.3531 | 3.0079 | 0.0698 | 2.8572 | 2.8527 |
6.392 | 1.7732 | 555 | 5.3490 | 3.0083 | 0.0728 | 2.8491 | 2.8482 |
6.4564 | 1.7891 | 560 | 5.3457 | 2.9957 | 0.0699 | 2.8449 | 2.8436 |
6.1712 | 1.8051 | 565 | 5.3425 | 3.0035 | 0.0698 | 2.8504 | 2.8484 |
6.4905 | 1.8211 | 570 | 5.3386 | 3.0050 | 0.0741 | 2.8525 | 2.8527 |
6.483 | 1.8371 | 575 | 5.3353 | 3.0098 | 0.0848 | 2.8661 | 2.8604 |
6.4069 | 1.8530 | 580 | 5.3333 | 3.0188 | 0.0817 | 2.8816 | 2.8737 |
6.3736 | 1.8690 | 585 | 5.3318 | 3.0172 | 0.0817 | 2.8718 | 2.8651 |
6.3639 | 1.8850 | 590 | 5.3299 | 3.0201 | 0.0847 | 2.8755 | 2.8717 |
6.214 | 1.9010 | 595 | 5.3282 | 3.0291 | 0.0847 | 2.8898 | 2.8852 |
6.537 | 1.9169 | 600 | 5.3270 | 3.0432 | 0.0847 | 2.9011 | 2.8966 |
6.4734 | 1.9329 | 605 | 5.3260 | 3.0428 | 0.0908 | 2.8934 | 2.8907 |
6.298 | 1.9489 | 610 | 5.3251 | 3.0780 | 0.1251 | 2.9044 | 2.9019 |
6.4594 | 1.9649 | 615 | 5.3245 | 3.1170 | 0.1383 | 2.9269 | 2.9250 |
6.2875 | 1.9808 | 620 | 5.3241 | 3.1276 | 0.1383 | 2.9320 | 2.9324 |
6.1948 | 1.9968 | 625 | 5.3239 | 3.1309 | 0.1383 | 2.9320 | 2.9324 |
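
For reference, here is a minimal sketch of computing ROUGE with the Hugging Face `evaluate` library (an assumption; the card does not include the evaluation code). Note that `evaluate` returns scores in [0, 1], while the table above appears to report them scaled by 100.

```python
# Minimal ROUGE sketch (assumption: the card's metrics came from a similar
# compute_metrics step). evaluate's rouge returns fractions in [0, 1].
import evaluate

rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["the cat sat on the mat"],       # hypothetical model output
    references=["a cat was sitting on the mat"],  # hypothetical reference
)
print(scores)  # {'rouge1': ..., 'rouge2': ..., 'rougeL': ..., 'rougeLsum': ...}
```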
### Framework versions
- PEFT 0.14.0
- Transformers 4.49.0
- Pytorch 2.6.0+cu124
- Datasets 3.3.2
- Tokenizers 0.21.0