my-lora

This model is a LoRA adapter (trained with PEFT) fine-tuning google/mt5-small on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 4.4558
  • Rouge1: 6.9916
  • Rouge2: 0.6816
  • Rougel: 5.7118
  • Rougelsum: 5.7018
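
The ROUGE-N scores above measure n-gram overlap between generated and reference text. As a rough illustration of what the Rouge1 number means (not the exact implementation used for this card, which presumably came from the HF `evaluate` rouge metric), here is a minimal pure-Python sketch of ROUGE-N as an n-gram F1 score:

```python
from collections import Counter

def rouge_n(candidate: str, reference: str, n: int = 1) -> float:
    """F1-style ROUGE-N: clipped n-gram overlap between candidate and reference."""
    def ngrams(text: str) -> Counter:
        toks = text.lower().split()
        return Counter(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))

    cand, ref = ngrams(candidate), ngrams(reference)
    if not cand or not ref:
        return 0.0
    overlap = sum((cand & ref).values())  # each n-gram counted at most min(cand, ref) times
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

For example, `rouge_n("the cat sat", "the cat sat on the mat")` gives precision 1.0, recall 0.5, and F1 ≈ 0.667. The scores in this card are reported on a 0–100 scale, so a Rouge1 of 6.99 corresponds to an F1 of roughly 0.07.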

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 4
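
With `lr_scheduler_type: linear` and no warmup listed, the learning rate decays linearly from 0.0001 to 0 over training. A small sketch of that schedule (the total step count of 1252 is an estimate inferred from the evaluation log below, which runs to step 1250 at roughly epoch 4):

```python
def linear_lr(step: int,
              base_lr: float = 1e-4,
              total_steps: int = 1252,  # assumption: ~4 epochs at ~313 steps/epoch
              warmup_steps: int = 0) -> float:
    """Linear warmup then linear decay, as in the transformers linear scheduler."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    return base_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
```

Halfway through training (step 626) the learning rate is therefore 5e-05, and it reaches 0 at the final step.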

Training results

Training Loss Epoch Step Validation Loss Rouge1 Rouge2 Rougel Rougelsum
23.6756 0.0160 5 11.7300 0.4936 0.0298 0.4390 0.4355
21.3163 0.0319 10 11.6817 0.5950 0.0303 0.5441 0.5398
19.4831 0.0479 15 11.6197 0.5874 0.0422 0.5365 0.5379
22.1207 0.0639 20 11.5248 0.5548 0.0413 0.5229 0.5198
19.8132 0.0799 25 11.4054 0.5510 0.0240 0.5010 0.4974
20.1565 0.0958 30 11.3720 0.6226 0.0216 0.5805 0.5763
19.7845 0.1118 35 11.3636 0.6424 0.0370 0.5877 0.5833
21.83 0.1278 40 11.3292 0.6306 0.0416 0.5777 0.5757
22.8683 0.1438 45 11.2816 0.6071 0.0340 0.5619 0.5612
21.5372 0.1597 50 11.2166 0.6361 0.0311 0.5761 0.5722
20.4096 0.1757 55 11.1637 0.6519 0.0339 0.5762 0.5752
23.4684 0.1917 60 11.1275 0.6089 0.0326 0.5266 0.5304
18.1937 0.2077 65 11.0679 0.6298 0.0284 0.5691 0.5679
20.9906 0.2236 70 11.0010 0.6016 0.0294 0.5618 0.5626
16.9343 0.2396 75 10.9141 0.6042 0.0341 0.5629 0.5605
16.213 0.2556 80 10.8014 0.5869 0.0341 0.5468 0.5422
23.0994 0.2716 85 10.6858 0.6965 0.0399 0.6411 0.6366
21.3159 0.2875 90 10.5525 0.7669 0.0560 0.6992 0.6939
17.8493 0.3035 95 10.4093 0.7420 0.0527 0.6913 0.6831
18.3706 0.3195 100 10.2754 0.7243 0.0332 0.6716 0.6648
18.2941 0.3355 105 10.1246 0.7495 0.0387 0.6865 0.6772
19.6502 0.3514 110 9.9765 0.7874 0.0579 0.7173 0.7144
17.7381 0.3674 115 9.8496 0.7391 0.0366 0.6777 0.6765
15.6761 0.3834 120 9.7321 0.8022 0.0563 0.7337 0.7291
16.2067 0.3994 125 9.6046 0.7913 0.0531 0.7319 0.7289
13.8577 0.4153 130 9.4678 0.8529 0.0571 0.7788 0.7763
15.4965 0.4313 135 9.3599 0.8251 0.0542 0.7540 0.7478
14.362 0.4473 140 9.2515 0.8661 0.0632 0.7880 0.7862
17.1324 0.4633 145 9.0933 0.8565 0.0416 0.8081 0.8043
15.5484 0.4792 150 8.9389 0.8524 0.0501 0.7688 0.7655
13.7671 0.4952 155 8.7882 0.9513 0.0701 0.8614 0.8538
14.2473 0.5112 160 8.6085 0.9936 0.0760 0.8882 0.8819
16.2646 0.5272 165 8.4261 1.0280 0.0825 0.9123 0.9112
14.422 0.5431 170 8.2682 0.9989 0.0732 0.9057 0.9046
12.8457 0.5591 175 8.1451 1.1030 0.0744 0.9912 0.9920
13.1231 0.5751 180 8.0125 1.2091 0.0769 1.0877 1.0854
12.4348 0.5911 185 7.8906 1.2220 0.0873 1.1092 1.1044
12.7066 0.6070 190 7.7846 1.2845 0.0905 1.1493 1.1445
12.6802 0.6230 195 7.6824 1.2024 0.0764 1.0739 1.0730
12.6226 0.6390 200 7.5869 1.2994 0.1020 1.1752 1.1781
11.7003 0.6550 205 7.4930 1.3573 0.0789 1.2515 1.2518
11.7344 0.6709 210 7.4188 1.3916 0.0922 1.2819 1.2819
11.7628 0.6869 215 7.3503 1.4452 0.0918 1.3172 1.3105
10.2049 0.7029 220 7.2817 1.6040 0.0990 1.4637 1.4567
10.803 0.7188 225 7.2149 1.8393 0.1138 1.6247 1.6199
10.014 0.7348 230 7.1472 1.9519 0.1254 1.7554 1.7515
10.6801 0.7508 235 7.0828 2.1037 0.1414 1.8470 1.8434
10.0471 0.7668 240 7.0481 2.0665 0.1527 1.7971 1.7958
9.7399 0.7827 245 7.0096 2.0347 0.1374 1.7735 1.7707
9.1483 0.7987 250 6.9719 2.1064 0.0887 1.8688 1.8726
8.8226 0.8147 255 6.9016 2.3382 0.1402 2.0657 2.0607
8.6236 0.8307 260 6.8058 2.2605 0.1258 2.0048 1.9982
8.8137 0.8466 265 6.7130 2.4050 0.1255 2.1567 2.1508
8.4192 0.8626 270 6.6078 2.4312 0.0998 2.2026 2.1954
8.3304 0.8786 275 6.4693 2.5363 0.1506 2.3432 2.3501
8.0673 0.8946 280 6.3136 2.5803 0.1364 2.3812 2.3723
7.8489 0.9105 285 6.1742 2.5226 0.1223 2.3425 2.3377
8.9521 0.9265 290 6.0328 2.5598 0.1001 2.3583 2.3461
7.6618 0.9425 295 5.8830 2.7390 0.1270 2.5325 2.5231
7.6692 0.9585 300 5.7741 2.7669 0.0933 2.5127 2.5096
7.9787 0.9744 305 5.7020 2.7929 0.0678 2.5641 2.5689
7.577 0.9904 310 5.6454 2.7002 0.0625 2.4869 2.4944
7.7221 1.0064 315 5.6025 2.7943 0.0410 2.5835 2.5853
7.3943 1.0224 320 5.5779 2.8294 0.0574 2.6103 2.6068
7.2515 1.0383 325 5.5558 2.6871 0.0391 2.5212 2.5114
7.13 1.0543 330 5.5428 2.7466 0.0538 2.5637 2.5593
6.7682 1.0703 335 5.5243 2.5682 0.0561 2.4248 2.4171
6.6841 1.0863 340 5.5019 2.5695 0.0390 2.4028 2.3910
7.061 1.1022 345 5.4904 2.6353 0.0232 2.4626 2.4600
6.8416 1.1182 350 5.4781 2.7188 0.0364 2.5442 2.5418
6.833 1.1342 355 5.4651 2.7963 0.0946 2.6191 2.6053
6.6498 1.1502 360 5.4439 2.7616 0.0794 2.5699 2.5632
6.5082 1.1661 365 5.4256 2.8136 0.0643 2.6238 2.6181
6.7253 1.1821 370 5.4129 2.8281 0.0564 2.6235 2.6203
6.3954 1.1981 375 5.3968 2.9543 0.0471 2.7386 2.7316
6.381 1.2141 380 5.3664 3.0891 0.1198 2.8635 2.8481
6.2691 1.2300 385 5.3405 3.1817 0.0991 2.9678 2.9600
6.3465 1.2460 390 5.3137 3.2596 0.0955 3.0371 3.0193
7.5564 1.2620 395 5.2915 3.4445 0.1965 3.2355 3.2178
6.297 1.2780 400 5.2659 3.3451 0.1735 3.1460 3.1341
6.0585 1.2939 405 5.2356 3.5268 0.1811 3.2960 3.2930
6.146 1.3099 410 5.2035 3.5298 0.2186 3.3129 3.3068
5.8611 1.3259 415 5.1748 3.5515 0.2134 3.3375 3.3356
6.1246 1.3419 420 5.1458 3.5293 0.1791 3.3109 3.3109
5.941 1.3578 425 5.1199 3.6252 0.1844 3.3822 3.3820
5.9336 1.3738 430 5.0968 3.7394 0.2487 3.4850 3.4839
6.0573 1.3898 435 5.0772 3.8920 0.2854 3.5845 3.5833
5.8787 1.4058 440 5.0583 3.9947 0.3167 3.7099 3.7060
5.742 1.4217 445 5.0390 4.2624 0.3612 3.9232 3.9237
5.9164 1.4377 450 5.0187 4.5394 0.3685 4.1808 4.1667
5.8206 1.4537 455 4.9996 4.7519 0.4171 4.3353 4.3228
5.7587 1.4696 460 4.9800 4.8192 0.4494 4.4227 4.4091
5.782 1.4856 465 4.9617 4.8078 0.4496 4.3912 4.3769
6.0275 1.5016 470 4.9466 4.8523 0.5041 4.4154 4.3983
5.9688 1.5176 475 4.9314 4.8285 0.4628 4.4055 4.3831
6.2916 1.5335 480 4.9161 4.8893 0.4697 4.4384 4.4320
5.8275 1.5495 485 4.9018 4.8905 0.4666 4.4296 4.4204
5.7246 1.5655 490 4.8863 4.8774 0.4725 4.4002 4.3823
5.6779 1.5815 495 4.8671 4.7665 0.4170 4.2980 4.2936
5.6032 1.5974 500 4.8466 4.9297 0.4819 4.4265 4.4229
5.8246 1.6134 505 4.8283 5.0824 0.4941 4.4664 4.4746
5.6855 1.6294 510 4.8129 5.0445 0.5022 4.4670 4.4665
5.4763 1.6454 515 4.7979 5.1488 0.4781 4.5613 4.5503
5.5418 1.6613 520 4.7839 5.3006 0.4677 4.7011 4.6898
5.489 1.6773 525 4.7723 5.3246 0.4893 4.7215 4.7127
5.6303 1.6933 530 4.7623 5.5604 0.5124 4.8999 4.8991
6.2405 1.7093 535 4.7547 5.5092 0.4656 4.8064 4.7866
5.8127 1.7252 540 4.7469 5.5956 0.5156 4.8723 4.8640
5.5919 1.7412 545 4.7389 5.6788 0.5618 4.9519 4.9475
5.574 1.7572 550 4.7309 5.7099 0.5667 5.0054 4.9991
5.415 1.7732 555 4.7239 5.6398 0.5503 4.8735 4.8653
5.5172 1.7891 560 4.7158 5.5743 0.4962 4.8134 4.7998
5.3554 1.8051 565 4.7086 5.6326 0.5413 4.7946 4.7893
5.632 1.8211 570 4.6997 5.6395 0.5585 4.8162 4.8135
5.6441 1.8371 575 4.6913 5.6594 0.5934 4.8383 4.8440
5.6121 1.8530 580 4.6848 5.7094 0.6287 4.8887 4.8955
5.6247 1.8690 585 4.6811 5.7092 0.5953 4.8865 4.8931
5.5127 1.8850 590 4.6774 5.7369 0.6294 4.9060 4.9123
5.3986 1.9010 595 4.6725 5.8236 0.6239 5.0256 5.0308
5.3653 1.9169 600 4.6657 5.8826 0.6295 5.0282 5.0357
5.5225 1.9329 605 4.6599 5.8561 0.6483 5.0548 5.0661
5.4748 1.9489 610 4.6542 5.8777 0.6013 5.0208 5.0236
5.4552 1.9649 615 4.6488 5.8850 0.6102 4.9776 4.9827
5.4006 1.9808 620 4.6435 5.8263 0.5668 4.9280 4.9319
5.2525 1.9968 625 4.6374 5.8090 0.5567 4.8903 4.8934
5.3651 2.0128 630 4.6327 5.8141 0.5850 4.8955 4.9018
5.3102 2.0288 635 4.6288 5.7891 0.6054 4.9275 4.9390
5.4252 2.0447 640 4.6243 5.7595 0.5903 4.9053 4.9143
5.623 2.0607 645 4.6207 5.8272 0.5991 4.9243 4.9271
5.9216 2.0767 650 4.6182 5.7977 0.5746 4.8630 4.8650
5.2885 2.0927 655 4.6160 5.8832 0.6089 4.9655 4.9590
5.4478 2.1086 660 4.6143 6.1030 0.6493 5.1520 5.1469
5.4389 2.1246 665 4.6138 6.2193 0.6379 5.2267 5.2330
5.3506 2.1406 670 4.6115 6.2155 0.6610 5.2507 5.2505
5.2738 2.1565 675 4.6067 6.4144 0.6590 5.3567 5.3562
5.2637 2.1725 680 4.6028 6.6140 0.6895 5.5451 5.5404
5.4674 2.1885 685 4.5991 6.6112 0.6680 5.5371 5.5289
5.4239 2.2045 690 4.5962 6.6814 0.6930 5.6114 5.6027
5.5427 2.2204 695 4.5940 6.6225 0.6488 5.5083 5.5054
5.33 2.2364 700 4.5913 6.5749 0.6924 5.4796 5.4663
5.326 2.2524 705 4.5876 6.5141 0.6854 5.4379 5.4275
5.1821 2.2684 710 4.5827 6.5709 0.7154 5.4791 5.4750
5.3379 2.2843 715 4.5779 6.6474 0.7288 5.5817 5.5757
5.5842 2.3003 720 4.5742 6.6228 0.6962 5.5504 5.5346
5.3523 2.3163 725 4.5706 6.6456 0.7115 5.5543 5.5382
5.3367 2.3323 730 4.5668 6.6447 0.7167 5.5495 5.5403
5.3159 2.3482 735 4.5639 6.6856 0.7344 5.6046 5.5874
5.2955 2.3642 740 4.5617 6.6819 0.7438 5.6109 5.5915
5.3509 2.3802 745 4.5588 6.6925 0.7367 5.6635 5.6420
5.4309 2.3962 750 4.5564 6.6572 0.7239 5.6270 5.6078
5.3232 2.4121 755 4.5549 6.6658 0.7242 5.6407 5.6248
5.3663 2.4281 760 4.5545 6.6939 0.7227 5.6980 5.6823
5.2839 2.4441 765 4.5528 6.6417 0.7172 5.6179 5.6046
5.1068 2.4601 770 4.5507 6.6977 0.7300 5.6509 5.6364
5.2285 2.4760 775 4.5483 6.6988 0.7397 5.6623 5.6474
5.3325 2.4920 780 4.5455 6.6992 0.7395 5.6787 5.6597
5.2692 2.5080 785 4.5423 6.7269 0.7354 5.6832 5.6717
5.2416 2.5240 790 4.5402 6.6987 0.7277 5.6825 5.6674
5.2668 2.5399 795 4.5380 6.7267 0.7257 5.7369 5.7216
5.3415 2.5559 800 4.5363 6.7654 0.7323 5.7682 5.7448
5.2675 2.5719 805 4.5346 6.9076 0.7348 5.8630 5.8460
5.1813 2.5879 810 4.5333 6.9111 0.7139 5.8215 5.8088
5.3719 2.6038 815 4.5319 6.9676 0.7501 5.8938 5.8751
5.2826 2.6198 820 4.5296 6.9759 0.7424 5.9081 5.8935
5.2366 2.6358 825 4.5281 6.9508 0.7371 5.8471 5.8354
5.1769 2.6518 830 4.5276 6.9875 0.7161 5.8823 5.8731
5.2484 2.6677 835 4.5270 6.9863 0.7222 5.8866 5.8754
5.2323 2.6837 840 4.5263 6.9414 0.6963 5.8284 5.8195
5.1327 2.6997 845 4.5240 6.9841 0.6984 5.8296 5.8195
5.138 2.7157 850 4.5209 7.0102 0.7134 5.8757 5.8637
5.1938 2.7316 855 4.5168 6.9719 0.7021 5.8537 5.8429
5.1618 2.7476 860 4.5136 6.8674 0.6884 5.7749 5.7624
5.1931 2.7636 865 4.5116 6.8026 0.6750 5.7132 5.7083
5.0372 2.7796 870 4.5097 6.7480 0.6611 5.6958 5.6879
5.2144 2.7955 875 4.5084 6.7780 0.6605 5.7056 5.6933
5.3514 2.8115 880 4.5070 6.7706 0.6476 5.7317 5.7224
5.3267 2.8275 885 4.5049 6.7649 0.6556 5.7269 5.7248
5.2012 2.8435 890 4.5025 6.7671 0.6727 5.7185 5.7078
5.231 2.8594 895 4.5000 6.7752 0.6605 5.7131 5.7062
5.5058 2.8754 900 4.4982 6.8061 0.6414 5.7308 5.7243
5.1725 2.8914 905 4.4971 6.7574 0.6312 5.6988 5.6969
5.1605 2.9073 910 4.4953 6.7771 0.6541 5.7257 5.7234
5.205 2.9233 915 4.4932 6.7896 0.6608 5.7603 5.7574
5.3089 2.9393 920 4.4916 6.7881 0.6504 5.7507 5.7542
5.2994 2.9553 925 4.4903 6.7905 0.6427 5.7104 5.7157
5.0648 2.9712 930 4.4885 6.7691 0.6338 5.7117 5.7143
5.1906 2.9872 935 4.4870 6.7841 0.6414 5.7250 5.7231
5.2381 3.0032 940 4.4852 6.8148 0.6582 5.7312 5.7279
5.2319 3.0192 945 4.4837 6.8141 0.6613 5.7176 5.7204
5.1976 3.0351 950 4.4823 6.7903 0.6651 5.6967 5.6950
5.0499 3.0511 955 4.4804 6.8182 0.6681 5.7180 5.7155
5.2995 3.0671 960 4.4784 6.8703 0.6622 5.7614 5.7628
5.1794 3.0831 965 4.4767 6.8704 0.6576 5.7600 5.7633
5.0923 3.0990 970 4.4757 6.8784 0.6412 5.7703 5.7741
5.1729 3.1150 975 4.4747 6.9144 0.6403 5.8102 5.8064
5.1864 3.1310 980 4.4742 6.9348 0.6531 5.8079 5.8027
5.2251 3.1470 985 4.4737 6.9469 0.6869 5.8021 5.8002
5.1631 3.1629 990 4.4733 6.9770 0.6909 5.8255 5.8155
5.2361 3.1789 995 4.4732 6.9978 0.7205 5.8385 5.8312
5.2153 3.1949 1000 4.4727 7.0194 0.7462 5.8473 5.8427
4.949 3.2109 1005 4.4724 7.0442 0.7346 5.8631 5.8543
5.0817 3.2268 1010 4.4721 7.0183 0.7229 5.8697 5.8590
5.0898 3.2428 1015 4.4716 7.0135 0.7131 5.8406 5.8362
5.1289 3.2588 1020 4.4715 6.9679 0.7175 5.8269 5.8193
5.2497 3.2748 1025 4.4708 6.9824 0.7243 5.8302 5.8229
5.2646 3.2907 1030 4.4700 6.9954 0.7178 5.8304 5.8193
5.2577 3.3067 1035 4.4689 6.9859 0.7132 5.8183 5.8091
5.0303 3.3227 1040 4.4675 6.9681 0.7205 5.7512 5.7422
5.0681 3.3387 1045 4.4662 6.9814 0.7321 5.7606 5.7567
5.1703 3.3546 1050 4.4648 6.9691 0.7201 5.7592 5.7554
5.1361 3.3706 1055 4.4635 6.9370 0.7088 5.7440 5.7445
5.2493 3.3866 1060 4.4625 6.9814 0.7020 5.7481 5.7480
5.2599 3.4026 1065 4.4615 6.9617 0.7029 5.7426 5.7407
5.6987 3.4185 1070 4.4608 6.9748 0.7117 5.7502 5.7481
5.1897 3.4345 1075 4.4606 6.9616 0.7094 5.7620 5.7587
5.1351 3.4505 1080 4.4608 6.9574 0.7200 5.7551 5.7568
5.1565 3.4665 1085 4.4612 6.9928 0.7335 5.7861 5.7773
5.0921 3.4824 1090 4.4611 6.9979 0.7437 5.7963 5.7911
5.2995 3.4984 1095 4.4613 6.9737 0.7398 5.7852 5.7807
5.2095 3.5144 1100 4.4610 6.9365 0.7029 5.7513 5.7477
4.9812 3.5304 1105 4.4608 6.9247 0.7040 5.7530 5.7444
5.0206 3.5463 1110 4.4604 6.9237 0.7161 5.7440 5.7371
5.1183 3.5623 1115 4.4602 6.9120 0.7013 5.7443 5.7361
5.1914 3.5783 1120 4.4603 6.9370 0.7057 5.7433 5.7367
4.9982 3.5942 1125 4.4603 6.9551 0.7008 5.7582 5.7546
5.0126 3.6102 1130 4.4604 6.9574 0.6951 5.7389 5.7325
5.1075 3.6262 1135 4.4603 6.9903 0.7045 5.7472 5.7432
5.1985 3.6422 1140 4.4601 6.9734 0.6858 5.7350 5.7301
5.3398 3.6581 1145 4.4600 6.9579 0.6883 5.7210 5.7124
5.1475 3.6741 1150 4.4596 6.9668 0.6785 5.7307 5.7186
5.016 3.6901 1155 4.4594 6.9717 0.6789 5.7341 5.7186
5.2785 3.7061 1160 4.4592 7.0314 0.6962 5.7750 5.7639
5.295 3.7220 1165 4.4589 7.0075 0.6895 5.7655 5.7547
5.1939 3.7380 1170 4.4587 7.0287 0.6944 5.7671 5.7590
5.2976 3.7540 1175 4.4585 7.0327 0.6917 5.7685 5.7589
4.9886 3.7700 1180 4.4581 7.0105 0.6844 5.7469 5.7372
5.2187 3.7859 1185 4.4577 6.9933 0.6810 5.7267 5.7187
5.3142 3.8019 1190 4.4573 7.0135 0.6810 5.7426 5.7333
5.1089 3.8179 1195 4.4572 7.0202 0.6810 5.7430 5.7323
5.2423 3.8339 1200 4.4569 7.0209 0.6810 5.7376 5.7267
5.1709 3.8498 1205 4.4566 7.0130 0.6810 5.7197 5.7088
5.2386 3.8658 1210 4.4564 7.0002 0.6753 5.7134 5.7056
5.0235 3.8818 1215 4.4562 6.9875 0.6753 5.7133 5.7057
5.1144 3.8978 1220 4.4561 6.9713 0.6816 5.6947 5.6868
5.1089 3.9137 1225 4.4560 6.9713 0.6816 5.6947 5.6868
5.1123 3.9297 1230 4.4559 6.9656 0.6816 5.6991 5.6932
4.9316 3.9457 1235 4.4559 6.9584 0.6673 5.6909 5.6877
5.1495 3.9617 1240 4.4558 6.9390 0.6673 5.6744 5.6706
5.169 3.9776 1245 4.4558 6.9656 0.6816 5.6991 5.6932
5.3732 3.9936 1250 4.4558 6.9916 0.6816 5.7118 5.7018
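
The log above shows validation loss decreasing monotonically from 11.73 to 4.4558 over 1250 steps. A trivial sketch of scanning such a log for the best checkpoint (the sample pairs are transcribed from the table):

```python
# (step, validation loss) pairs sampled from the evaluation log above
eval_log = [(5, 11.7300), (315, 5.6025), (625, 4.6374), (940, 4.4852), (1250, 4.4558)]

def best_checkpoint(log):
    """Return the (step, loss) pair with the lowest validation loss."""
    return min(log, key=lambda pair: pair[1])
```

Here the final step wins, consistent with the loss still (slowly) improving at the end of epoch 4.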

Framework versions

  • PEFT 0.14.0
  • Transformers 4.49.0
  • Pytorch 2.6.0+cu124
  • Datasets 3.3.2
  • Tokenizers 0.21.0
Model tree for benitoals/my-lora

  • Base model: google/mt5-small
  • This model: adapter for the base model