train_mrpc_1744902644

This model is a fine-tuned version of google/gemma-3-1b-it on the MRPC (Microsoft Research Paraphrase Corpus) dataset. It achieves the following results on the evaluation set (a loading sketch follows the list):

  • Loss: 0.1113
  • Num Input Tokens Seen: 68544800
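A minimal sketch of loading this adapter for inference, assuming the repository id rbelanec/train_mrpc_1744902644 hosts a PEFT adapter on top of google/gemma-3-1b-it; the prompt template used during fine-tuning is not documented in this card, so the one below is illustrative only:

```python
# Sketch only: assumes a PEFT adapter for google/gemma-3-1b-it hosted at
# rbelanec/train_mrpc_1744902644.
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

model = AutoPeftModelForCausalLM.from_pretrained("rbelanec/train_mrpc_1744902644")
tokenizer = AutoTokenizer.from_pretrained("google/gemma-3-1b-it")

# MRPC is a sentence-pair paraphrase task; this prompt format is a guess,
# not the template used during training.
prompt = (
    "Sentence 1: He said the food was good.\n"
    "Sentence 2: He stated that the meal was tasty.\n"
    "Are these two sentences paraphrases of each other?"
)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```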

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 123
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: adamw_torch (betas=(0.9, 0.999), epsilon=1e-08; no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • training_steps: 40000
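A minimal sketch of how these settings might map onto transformers.TrainingArguments, assuming the Hugging Face Trainer was used; the PEFT configuration and data preprocessing are not documented in this card:

```python
# Illustrative reconstruction of the configuration listed above; the actual
# training script is not part of this card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_mrpc_1744902644",  # assumed output directory
    learning_rate=5e-05,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=123,
    gradient_accumulation_steps=4,  # total train batch size: 4 * 4 = 16
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    max_steps=40000,  # training_steps
)
```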

Training results

The reported evaluation loss (0.1113) matches the best validation loss in the table below, reached at step 600 (epoch 2.91); beyond that point the training loss collapses to 0.0 while the validation loss rises steadily, indicating overfitting.

Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen
0.1404 0.9685 200 0.1695 342592
0.1275 1.9395 400 0.1177 685504
0.0781 2.9104 600 0.1113 1027680
0.0769 3.8814 800 0.1123 1371040
0.1249 4.8523 1000 0.1314 1713440
0.0444 5.8232 1200 0.1384 2056384
0.1441 6.7942 1400 0.2321 2400544
0.014 7.7651 1600 0.2418 2741344
0.0086 8.7361 1800 0.3100 3083872
0.0179 9.7070 2000 0.3621 3425696
0.0001 10.6780 2200 0.4251 3769888
0.0 11.6489 2400 0.4051 4110336
0.0001 12.6199 2600 0.4575 4453600
0.0 13.5908 2800 0.4933 4796192
0.0016 14.5617 3000 0.4129 5138720
0.0237 15.5327 3200 0.4371 5480512
0.013 16.5036 3400 0.4575 5822816
0.0073 17.4746 3600 0.6311 6165056
0.0 18.4455 3800 0.4838 6507264
0.0 19.4165 4000 0.4974 6849792
0.0 20.3874 4200 0.5235 7192864
0.0001 21.3584 4400 0.4721 7534272
0.0046 22.3293 4600 0.4387 7877248
0.0 23.3002 4800 0.4762 8220544
0.0 24.2712 5000 0.3946 8562144
0.0 25.2421 5200 0.4913 8905568
0.0114 26.2131 5400 0.5561 9248640
0.0 27.1840 5600 0.3531 9592608
0.0 28.1550 5800 0.4120 9933568
0.0 29.1259 6000 0.4592 10277088
0.0 30.0969 6200 0.5133 10619488
0.0 31.0678 6400 0.4583 10962112
0.0 32.0387 6600 0.4551 11306080
0.0 33.0097 6800 0.4097 11649024
0.0 33.9782 7000 0.3684 11992032
0.0194 34.9492 7200 0.3791 12334784
0.0166 35.9201 7400 0.4535 12677888
0.0 36.8910 7600 0.4523 13020640
0.0 37.8620 7800 0.4779 13363648
0.0 38.8329 8000 0.4870 13706752
0.0 39.8039 8200 0.4966 14048256
0.0 40.7748 8400 0.5086 14392064
0.0 41.7458 8600 0.5198 14733504
0.0 42.7167 8800 0.5251 15076736
0.0 43.6877 9000 0.5299 15418176
0.0 44.6586 9200 0.5417 15762912
0.0 45.6295 9400 0.5441 16105760
0.0 46.6005 9600 0.5475 16448096
0.0 47.5714 9800 0.5651 16790336
0.0 48.5424 10000 0.5677 17132896
0.0 49.5133 10200 0.5680 17477376
0.0 50.4843 10400 0.5789 17817792
0.0 51.4552 10600 0.5780 18160384
0.0 52.4262 10800 0.5925 18502784
0.0 53.3971 11000 0.5932 18845184
0.0 54.3680 11200 0.6012 19187296
0.0 55.3390 11400 0.6042 19529792
0.0 56.3099 11600 0.6070 19873728
0.0 57.2809 11800 0.6165 20215680
0.0 58.2518 12000 0.6266 20558624
0.0 59.2228 12200 0.6337 20901984
0.0 60.1937 12400 0.6364 21244800
0.0 61.1646 12600 0.6426 21588704
0.0 62.1356 12800 0.6454 21931872
0.0 63.1065 13000 0.6533 22274560
0.0 64.0775 13200 0.6551 22618432
0.0 65.0484 13400 0.6620 22961216
0.0 66.0194 13600 0.6525 23304288
0.0 66.9879 13800 0.6678 23646592
0.0 67.9588 14000 0.6672 23989408
0.0 68.9298 14200 0.6774 24332544
0.0 69.9007 14400 0.6770 24675424
0.0 70.8717 14600 0.6935 25017632
0.0 71.8426 14800 0.6860 25360352
0.0 72.8136 15000 0.7022 25701344
0.0 73.7845 15200 0.7024 26046016
0.0 74.7554 15400 0.6954 26388448
0.0 75.7264 15600 0.7024 26729856
0.0 76.6973 15800 0.7081 27072064
0.0 77.6683 16000 0.7094 27415968
0.0 78.6392 16200 0.7083 27759520
0.0 79.6102 16400 0.7118 28101632
0.0 80.5811 16600 0.7147 28446208
0.0 81.5521 16800 0.7190 28787840
0.0 82.5230 17000 0.7137 29129536
0.0 83.4939 17200 0.7217 29473344
0.0 84.4649 17400 0.7221 29815360
0.0 85.4358 17600 0.7304 30157632
0.0 86.4068 17800 0.7359 30501440
0.0 87.3777 18000 0.7194 30843072
0.0 88.3487 18200 0.7295 31187360
0.0 89.3196 18400 0.7234 31528480
0.0 90.2906 18600 0.7244 31872544
0.0 91.2615 18800 0.7390 32214560
0.0 92.2324 19000 0.7186 32558112
0.0 93.2034 19200 0.7357 32900448
0.0 94.1743 19400 0.7300 33244800
0.0 95.1453 19600 0.7255 33587168
0.0 96.1162 19800 0.7249 33929248
0.0 97.0872 20000 0.7191 34271648
0.0 98.0581 20200 0.7206 34613344
0.0 99.0291 20400 0.7128 34957056
0.0 99.9976 20600 0.7137 35299200
0.0 100.9685 20800 0.7131 35642464
0.0 101.9395 21000 0.7303 35985280
0.0 102.9104 21200 0.7185 36327840
0.0 103.8814 21400 0.7315 36669664
0.0 104.8523 21600 0.7328 37012960
0.0 105.8232 21800 0.7422 37355968
0.0 106.7942 22000 0.7540 37698112
0.0 107.7651 22200 0.7701 38040768
0.0 108.7361 22400 0.7782 38383744
0.0 109.7070 22600 0.7898 38726880
0.0 110.6780 22800 0.8094 39068512
0.0 111.6489 23000 0.8276 39411712
0.0 112.6199 23200 0.8414 39754784
0.0 113.5908 23400 0.8567 40097568
0.0 114.5617 23600 0.8633 40441152
0.0 115.5327 23800 0.8983 40784672
0.0 116.5036 24000 0.9232 41127232
0.0 117.4746 24200 0.9453 41468768
0.0 118.4455 24400 0.9560 41811328
0.0 119.4165 24600 0.9686 42154688
0.0 120.3874 24800 0.9733 42497024
0.0 121.3584 25000 0.9843 42838112
0.0 122.3293 25200 0.9787 43181600
0.0 123.3002 25400 0.9857 43524256
0.0 124.2712 25600 0.9885 43867840
0.0 125.2421 25800 0.9734 44207680
0.0 126.2131 26000 0.9819 44551232
0.0 127.1840 26200 0.9840 44894816
0.0 128.1550 26400 0.9968 45236928
0.0 129.1259 26600 0.9759 45579584
0.0 130.0969 26800 0.9648 45923328
0.0 131.0678 27000 0.9423 46264032
0.0 132.0387 27200 0.9583 46607776
0.0 133.0097 27400 0.9305 46950752
0.0 133.9782 27600 0.9275 47293824
0.0 134.9492 27800 0.9141 47637248
0.0 135.9201 28000 0.8370 47979552
0.0 136.8910 28200 0.8343 48322528
0.0 137.8620 28400 1.2008 48663488
0.0 138.8329 28600 0.9126 49008000
0.0 139.8039 28800 0.9262 49350304
0.0 140.7748 29000 0.9216 49694528
0.0 141.7458 29200 0.9236 50035616
0.0 142.7167 29400 0.9204 50378912
0.0 143.6877 29600 0.9210 50722400
0.0 144.6586 29800 0.9201 51064768
0.0 145.6295 30000 0.9210 51407840
0.0 146.6005 30200 0.9201 51749792
0.0 147.5714 30400 0.9301 52094304
0.0 148.5424 30600 0.9215 52436000
0.0 149.5133 30800 0.9322 52777984
0.0 150.4843 31000 0.9233 53119904
0.0 151.4552 31200 0.9287 53462560
0.0 152.4262 31400 0.9219 53806272
0.0 153.3971 31600 0.9260 54148640
0.0 154.3680 31800 0.9341 54489984
0.0 155.3390 32000 0.9273 54832032
0.0 156.3099 32200 0.9320 55173664
0.0 157.2809 32400 0.9285 55517376
0.0 158.2518 32600 0.9442 55861088
0.0 159.2228 32800 0.9394 56203392
0.0 160.1937 33000 0.9390 56545632
0.0 161.1646 33200 0.9412 56888352
0.0 162.1356 33400 0.9363 57231584
0.0 163.1065 33600 0.9346 57574112
0.0 164.0775 33800 0.9826 57917728
0.0 165.0484 34000 0.9828 58261184
0.0 166.0194 34200 0.9926 58604352
0.0 166.9879 34400 0.9897 58946112
0.0 167.9588 34600 0.9964 59289344
0.0 168.9298 34800 0.9922 59631584
0.0 169.9007 35000 0.9911 59974880
0.0 170.8717 35200 0.9971 60318560
0.0 171.8426 35400 0.9983 60662016
0.0 172.8136 35600 0.9955 61004352
0.0 173.7845 35800 0.9933 61347296
0.0 174.7554 36000 0.9964 61689824
0.0 175.7264 36200 0.9940 62033792
0.0 176.6973 36400 0.9953 62376224
0.0 177.6683 36600 0.9961 62720096
0.0 178.6392 36800 1.0026 63062656
0.0 179.6102 37000 1.0058 63405504
0.0 180.5811 37200 1.0105 63748768
0.0 181.5521 37400 0.9988 64092416
0.0 182.5230 37600 0.9969 64436992
0.0 183.4939 37800 0.9983 64777984
0.0 184.4649 38000 1.0028 65120224
0.0 185.4358 38200 1.0073 65462240
0.0 186.4068 38400 1.0058 65805504
0.0 187.3777 38600 1.0065 66148448
0.0 188.3487 38800 1.0097 66490240
0.0 189.3196 39000 1.0036 66832256
0.0 190.2906 39200 1.0056 67174336
0.0 191.2615 39400 1.0044 67517920
0.0 192.2324 39600 1.0036 67860384
0.0 193.2034 39800 1.0035 68203104
0.0 194.1743 40000 1.0042 68544800

Framework versions

  • PEFT 0.15.1
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1
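To confirm that a local environment matches these pins, a quick check (import names as used in Python):

```python
# Prints installed versions; expected values are the pins listed above.
import datasets, peft, tokenizers, torch, transformers

print("PEFT:", peft.__version__)                  # 0.15.1
print("Transformers:", transformers.__version__)  # 4.51.3
print("PyTorch:", torch.__version__)              # 2.6.0+cu124
print("Datasets:", datasets.__version__)          # 3.5.0
print("Tokenizers:", tokenizers.__version__)      # 0.21.1
```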