train_mrpc_1744902651

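This model is a fine-tuned version of mistralai/Mistral-7B-Instruct-v0.3 on the mrpc dataset. It achieves the following results on the evaluation set (the reported loss matches the lowest validation loss in the training results, logged at step 4400, while the token count is the total at the final step, 40000):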

  • Loss: 0.1597
  • Num Input Tokens Seen: 69324064

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.3
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 123
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: adamw_torch with betas=(0.9,0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • training_steps: 40000
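
The quantities in this list are related by simple arithmetic, and the sketch below works through it. It assumes a single training device and a cosine schedule with no warmup; neither is stated in the card, so treat both as assumptions. The helper names (`effective_batch_size`, `cosine_lr`) are illustrative and not part of any library.

```python
import math

# Values copied from the hyperparameter list and the evaluation summary.
LEARNING_RATE = 0.3
TRAIN_BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 4
TRAINING_STEPS = 40_000
TOKENS_SEEN = 69_324_064  # "Num Input Tokens Seen"

# Assumption: one device. The card does not say how many were used.
NUM_DEVICES = 1


def effective_batch_size(per_device: int, accum: int, devices: int = 1) -> int:
    """Examples contributing to one optimizer step."""
    return per_device * accum * devices


def cosine_lr(step: int, total_steps: int = TRAINING_STEPS,
              peak_lr: float = LEARNING_RATE) -> float:
    """Half-cosine decay from peak_lr to 0.

    Assumption: no warmup phase, since none is listed in the card.
    """
    progress = min(max(step / total_steps, 0.0), 1.0)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


total_batch = effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS, NUM_DEVICES)
examples_processed = total_batch * TRAINING_STEPS
avg_tokens_per_example = TOKENS_SEEN / examples_processed

print("total_train_batch_size:", total_batch)
print("examples processed:", examples_processed)
print("avg input tokens per example:", round(avg_tokens_per_example, 1))
print("lr at start / midpoint / end:",
      cosine_lr(0), cosine_lr(TRAINING_STEPS // 2), cosine_lr(TRAINING_STEPS))
```

Under the single-device assumption the product 4 × 4 reproduces the listed total_train_batch_size of 16. The per-example token average is only a rough figure, since the token counter may include padding.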

Training results

Training Loss Epoch Step Validation Loss Input Tokens Seen
0.2129 0.9685 200 0.2560 346816
0.2093 1.9395 400 0.1950 694112
0.2057 2.9104 600 0.1901 1040448
0.1981 3.8814 800 0.1907 1386944
0.2159 4.8523 1000 0.2127 1733568
0.2096 5.8232 1200 0.1992 2080576
0.198 6.7942 1400 0.1953 2428000
0.1953 7.7651 1600 0.1985 2772832
0.1967 8.7361 1800 0.1901 3119936
0.2 9.7070 2000 0.1920 3464864
0.2084 10.6780 2200 0.1900 3812608
0.215 11.6489 2400 0.1903 4157312
0.1898 12.6199 2600 0.1912 4504256
0.1918 13.5908 2800 0.2050 4850880
0.1703 14.5617 3000 0.2072 5197664
0.191 15.5327 3200 0.1994 5543392
0.1721 16.5036 3400 0.2015 5889024
0.1644 17.4746 3600 0.2295 6234688
0.1857 18.4455 3800 0.1838 6580608
0.1392 19.4165 4000 0.1841 6926432
0.15 20.3874 4200 0.1688 7272896
0.1336 21.3584 4400 0.1597 7618208
0.1329 22.3293 4600 0.1677 7965376
0.1179 23.3002 4800 0.1794 8312352
0.1344 24.2712 5000 0.1844 8657568
0.1166 25.2421 5200 0.1897 9004576
0.0846 26.2131 5400 0.2017 9351552
0.0684 27.1840 5600 0.2355 9699840
0.0524 28.1550 5800 0.2284 10045120
0.0676 29.1259 6000 0.1920 10392096
0.0607 30.0969 6200 0.2144 10738624
0.0581 31.0678 6400 0.2491 11084512
0.0294 32.0387 6600 0.2425 11432096
0.0461 33.0097 6800 0.2511 11779520
0.0274 33.9782 7000 0.2553 12126016
0.0409 34.9492 7200 0.2505 12472160
0.0537 35.9201 7400 0.2833 12819680
0.0377 36.8910 7600 0.3290 13166368
0.019 37.8620 7800 0.3027 13513280
0.0382 38.8329 8000 0.2663 13860256
0.0161 39.8039 8200 0.2997 14205856
0.0249 40.7748 8400 0.4116 14553152
0.0105 41.7458 8600 0.3525 14898752
0.0078 42.7167 8800 0.3092 15245344
0.003 43.6877 9000 0.4724 15590560
0.0212 44.6586 9200 0.2946 15939776
0.0057 45.6295 9400 0.3509 16286016
0.0027 46.6005 9600 0.4230 16633088
0.027 47.5714 9800 0.2952 16978656
0.011 48.5424 10000 0.3346 17325024
0.0007 49.5133 10200 0.4630 17673440
0.0004 50.4843 10400 0.5090 18018272
0.0021 51.4552 10600 0.4792 18364992
0.0037 52.4262 10800 0.3702 18710720
0.0215 53.3971 11000 0.2850 19057408
0.0037 54.3680 11200 0.4289 19403360
0.0029 55.3390 11400 0.4302 19749408
0.0008 56.3099 11600 0.3519 20096416
0.0003 57.2809 11800 0.3904 20442944
0.0001 58.2518 12000 0.4402 20789120
0.0001 59.2228 12200 0.4630 21136768
0.0001 60.1937 12400 0.4810 21482944
0.0 61.1646 12600 0.4910 21830400
0.0 62.1356 12800 0.5048 22177696
0.0 63.1065 13000 0.5164 22523776
0.0 64.0775 13200 0.5217 22871744
0.0 65.0484 13400 0.5344 23218432
0.0 66.0194 13600 0.5355 23565280
0.0 66.9879 13800 0.5501 23911616
0.0 67.9588 14000 0.5571 24257984
0.0 68.9298 14200 0.5620 24604960
0.0 69.9007 14400 0.5700 24951648
0.0 70.8717 14600 0.5768 25297664
0.0 71.8426 14800 0.5825 25644032
0.0 72.8136 15000 0.5863 25989408
0.0 73.7845 15200 0.5961 26337760
0.0 74.7554 15400 0.6001 26684800
0.0 75.7264 15600 0.6087 27029856
0.0 76.6973 15800 0.6104 27376160
0.0 77.6683 16000 0.6174 27723904
0.0 78.6392 16200 0.6252 28071104
0.0 79.6102 16400 0.6269 28417344
0.0 80.5811 16600 0.6351 28766240
0.0 81.5521 16800 0.6470 29111104
0.0 82.5230 17000 0.6534 29456800
0.0 83.4939 17200 0.6542 29804640
0.0 84.4649 17400 0.6596 30151168
0.0 85.4358 17600 0.6673 30497536
0.0 86.4068 17800 0.6761 30845536
0.0 87.3777 18000 0.6790 31191456
0.0 88.3487 18200 0.6859 31539136
0.0 89.3196 18400 0.6945 31884000
0.0 90.2906 18600 0.7009 32231584
0.0 91.2615 18800 0.7076 32577088
0.0 92.2324 19000 0.7129 32924768
0.0 93.2034 19200 0.7167 33271392
0.0 94.1743 19400 0.7232 33619232
0.0 95.1453 19600 0.7298 33965280
0.0 96.1162 19800 0.7323 34311712
0.0 97.0872 20000 0.7365 34658112
0.0 98.0581 20200 0.7460 35004384
0.0 99.0291 20400 0.7533 35351392
0.0 99.9976 20600 0.7673 35698272
0.0 100.9685 20800 0.7691 36045088
0.0 101.9395 21000 0.7673 36391968
0.0 102.9104 21200 0.7807 36739040
0.0 103.8814 21400 0.7864 37084768
0.0 104.8523 21600 0.7933 37431808
0.0 105.8232 21800 0.7928 37779232
0.0 106.7942 22000 0.7955 38126112
0.0 107.7651 22200 0.8025 38472672
0.0 108.7361 22400 0.8094 38818464
0.0 109.7070 22600 0.8131 39165472
0.0 110.6780 22800 0.8207 39511328
0.0 111.6489 23000 0.8154 39858048
0.0 112.6199 23200 0.8248 40205184
0.0 113.5908 23400 0.8288 40552448
0.0 114.5617 23600 0.8311 40899872
0.0 115.5327 23800 0.8397 41246848
0.0 116.5036 24000 0.8407 41593088
0.0 117.4746 24200 0.8490 41938464
0.0 118.4455 24400 0.8508 42284064
0.0 119.4165 24600 0.8555 42631296
0.0 120.3874 24800 0.8583 42976992
0.0 121.3584 25000 0.8649 43321920
0.0 122.3293 25200 0.8704 43669344
0.0 123.3002 25400 0.8657 44016096
0.0 124.2712 25600 0.8713 44363232
0.0 125.2421 25800 0.8755 44706400
0.0 126.2131 26000 0.8748 45054080
0.0 127.1840 26200 0.8776 45400864
0.0 128.1550 26400 0.8825 45746688
0.0 129.1259 26600 0.8819 46093216
0.0 130.0969 26800 0.8913 46440960
0.0 131.0678 27000 0.8923 46785984
0.0 132.0387 27200 0.8876 47133856
0.0 133.0097 27400 0.8983 47481088
0.0 133.9782 27600 0.8977 47827904
0.0 134.9492 27800 0.9005 48175392
0.0 135.9201 28000 0.8896 48521536
0.0 136.8910 28200 0.9022 48867904
0.0 137.8620 28400 0.9042 49212704
0.0 138.8329 28600 0.9121 49561312
0.0 139.8039 28800 0.9098 49907264
0.0 140.7748 29000 0.9094 50254720
0.0 141.7458 29200 0.9094 50600480
0.0 142.7167 29400 0.9118 50947456
0.0 143.6877 29600 0.9100 51295040
0.0 144.6586 29800 0.9159 51641376
0.0 145.6295 30000 0.9118 51988288
0.0 146.6005 30200 0.9192 52334112
0.0 147.5714 30400 0.9197 52683008
0.0 148.5424 30600 0.9226 53028128
0.0 149.5133 30800 0.9173 53374400
0.0 150.4843 31000 0.9247 53720704
0.0 151.4552 31200 0.9205 54067392
0.0 152.4262 31400 0.9210 54414880
0.0 153.3971 31600 0.9176 54760672
0.0 154.3680 31800 0.9281 55106400
0.0 155.3390 32000 0.9198 55452512
0.0 156.3099 32200 0.9223 55798400
0.0 157.2809 32400 0.9278 56146592
0.0 158.2518 32600 0.9310 56493696
0.0 159.2228 32800 0.9294 56840064
0.0 160.1937 33000 0.9296 57186368
0.0 161.1646 33200 0.9284 57532416
0.0 162.1356 33400 0.9298 57880832
0.0 163.1065 33600 0.9338 58227680
0.0 164.0775 33800 0.9280 58574880
0.0 165.0484 34000 0.9291 58922528
0.0 166.0194 34200 0.9313 59269760
0.0 166.9879 34400 0.9314 59615872
0.0 167.9588 34600 0.9329 59962368
0.0 168.9298 34800 0.9302 60308640
0.0 169.9007 35000 0.9295 60655616
0.0 170.8717 35200 0.9329 61003136
0.0 171.8426 35400 0.9291 61350016
0.0 172.8136 35600 0.9312 61696224
0.0 173.7845 35800 0.9349 62044256
0.0 174.7554 36000 0.9345 62389792
0.0 175.7264 36200 0.9363 62738496
0.0 176.6973 36400 0.9296 63084544
0.0 177.6683 36600 0.9367 63431712
0.0 178.6392 36800 0.9293 63778656
0.0 179.6102 37000 0.9344 64124736
0.0 180.5811 37200 0.9393 64471808
0.0 181.5521 37400 0.9260 64820352
0.0 182.5230 37600 0.9328 65167904
0.0 183.4939 37800 0.9315 65513280
0.0 184.4649 38000 0.9286 65859136
0.0 185.4358 38200 0.9327 66205888
0.0 186.4068 38400 0.9287 66552576
0.0 187.3777 38600 0.9321 66899904
0.0 188.3487 38800 0.9281 67245856
0.0 189.3196 39000 0.9361 67591648
0.0 190.2906 39200 0.9313 67937440
0.0 191.2615 39400 0.9313 68285088
0.0 192.2324 39600 0.9308 68631104
0.0 193.2034 39800 0.9323 68978016
0.0 194.1743 40000 0.9357 69324064
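
The table shows validation loss bottoming out early and then rising while training loss falls to zero. A minimal sketch of how one might locate the best checkpoint from such a log follows. The rows are a small excerpt copied from the table; the `LogRow` type and the helper functions are hypothetical names introduced for illustration.

```python
from typing import List, NamedTuple

# Excerpt of rows copied verbatim from the table above. Columns are:
# training loss, epoch, step, validation loss, input tokens seen.
ROWS = """\
0.2129 0.9685 200 0.2560 346816
0.1857 18.4455 3800 0.1838 6580608
0.15 20.3874 4200 0.1688 7272896
0.1336 21.3584 4400 0.1597 7618208
0.1329 22.3293 4600 0.1677 7965376
0.0004 50.4843 10400 0.5090 18018272
0.0 194.1743 40000 0.9357 69324064
"""


class LogRow(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float
    tokens_seen: int


def parse_rows(text: str) -> List[LogRow]:
    """Parse whitespace-separated log lines into typed rows."""
    rows = []
    for line in text.strip().splitlines():
        train_loss, epoch, step, val_loss, tokens = line.split()
        rows.append(LogRow(float(train_loss), float(epoch), int(step),
                           float(val_loss), int(tokens)))
    return rows


def best_checkpoint(rows: List[LogRow]) -> LogRow:
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda r: r.val_loss)


rows = parse_rows(ROWS)
best = best_checkpoint(rows)
final = rows[-1]

print("best step:", best.step, "validation loss:", best.val_loss)
print("final step:", final.step, "validation loss:", final.val_loss)
print("gap (final - best):", round(final.val_loss - best.val_loss, 4))
```

On this excerpt the minimum falls at step 4400, which matches the loss reported in the evaluation summary. The final-step validation loss is far higher, which is consistent with overfitting over the remaining steps.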

Framework versions

  • PEFT 0.15.1
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1
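
Since PEFT is among the listed frameworks, the weights are presumably stored as an adapter on top of the base model. The sketch below shows one way such an adapter might be loaded. It assumes the adapter is published on the Hub as rbelanec/train_mrpc_1744902651 and that its config points at the base model named in this card; the `load_adapter` helper is a hypothetical name. Downloading the base model requires substantial memory and may require accepting its license on the Hub.

```python
# Assumed Hub repository id of this adapter and the base model named in the card.
ADAPTER_ID = "rbelanec/train_mrpc_1744902651"
BASE_MODEL_ID = "mistralai/Mistral-7B-Instruct-v0.3"


def load_adapter(adapter_id: str = ADAPTER_ID, base_model_id: str = BASE_MODEL_ID):
    """Load the base model with the adapter applied, plus the base tokenizer.

    Imports are local so that defining this function does not require
    peft or transformers to be installed.
    """
    from peft import AutoPeftModelForCausalLM
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    # AutoPeftModelForCausalLM reads the adapter config, loads the base
    # model it references, and attaches the adapter weights.
    model = AutoPeftModelForCausalLM.from_pretrained(adapter_id)
    model.eval()
    return model, tokenizer


# Usage (not executed here because it downloads a 7B-parameter model):
# model, tokenizer = load_adapter()
```

The prompt format used for the mrpc examples during training is not documented in this card, so any inference prompt would have to be determined separately.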

Model tree for rbelanec/train_mrpc_1744902651

Adapter (352): this model