train_mrpc_1744902646

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the mrpc dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1242
  • Num Input Tokens Seen: 65784064

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
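For context, mrpc (the Microsoft Research Paraphrase Corpus, part of GLUE) pairs two sentences with a binary paraphrase label. The exact prompt template used for this run is not documented in the card; the sketch below is only one plausible way such pairs are serialized for an instruction-tuned causal LM, and the wording and label tokens ("yes"/"no") are assumptions:

```python
# Hypothetical serialization of an MRPC example for causal-LM fine-tuning.
# The actual template used in this training run is not documented here.

def format_mrpc(sentence1, sentence2, label=None):
    """Render a sentence pair as an instruction prompt. When a gold label
    is given (training), append the answer; at inference, omit it."""
    prompt = (
        "Do the following two sentences mean the same thing?\n"
        f"Sentence 1: {sentence1}\n"
        f"Sentence 2: {sentence2}\n"
        "Answer:"
    )
    if label is not None:
        prompt += " " + ("yes" if label == 1 else "no")
    return prompt

example = format_mrpc(
    "The company said profits rose.",
    "Profits increased, the company reported.",
    label=1,
)
print(example)
```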

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 123
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • training_steps: 40000
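Note that total_train_batch_size is not an independent setting: it is the per-device batch size times the gradient-accumulation steps (times the number of devices, which the reported value of 16 implies was one here). A quick check:

```python
train_batch_size = 4             # per-device micro-batch size
gradient_accumulation_steps = 4
num_devices = 1                  # implied: 4 * 4 * 1 matches the reported 16

total_train_batch_size = (
    train_batch_size * gradient_accumulation_steps * num_devices
)
print(total_train_batch_size)    # 16, matching the value listed above
```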

Training results

Training Loss Epoch Step Validation Loss Input Tokens Seen
0.1903 0.9685 200 0.1730 329312
0.1798 1.9395 400 0.1689 658560
0.1635 2.9104 600 0.1617 987040
0.1378 3.8814 800 0.1578 1316448
0.1601 4.8523 1000 0.1540 1644608
0.1313 5.8232 1200 0.1527 1974016
0.1632 6.7942 1400 0.1477 2303584
0.0939 7.7651 1600 0.1462 2630688
0.12 8.7361 1800 0.1423 2959808
0.1544 9.7070 2000 0.1421 3287584
0.1232 10.6780 2200 0.1365 3617920
0.1438 11.6489 2400 0.1368 3945536
0.1162 12.6199 2600 0.1330 4274560
0.0953 13.5908 2800 0.1309 4603168
0.1619 14.5617 3000 0.1300 4932448
0.1143 15.5327 3200 0.1301 5261312
0.0975 16.5036 3400 0.1294 5589632
0.1017 17.4746 3600 0.1271 5918112
0.0892 18.4455 3800 0.1308 6246368
0.0953 19.4165 4000 0.1285 6574848
0.1428 20.3874 4200 0.1337 6903520
0.1188 21.3584 4400 0.1258 7231904
0.0901 22.3293 4600 0.1299 7561504
0.084 23.3002 4800 0.1243 7890912
0.1157 24.2712 5000 0.1260 8218592
0.1114 25.2421 5200 0.1245 8548256
0.0834 26.2131 5400 0.1310 8876704
0.1272 27.1840 5600 0.1254 9206272
0.0783 28.1550 5800 0.1287 9534720
0.0962 29.1259 6000 0.1251 9864384
0.0949 30.0969 6200 0.1312 10193376
0.0935 31.0678 6400 0.1262 10521952
0.0857 32.0387 6600 0.1270 10851520
0.0887 33.0097 6800 0.1242 11180544
0.1057 33.9782 7000 0.1293 11509344
0.0701 34.9492 7200 0.1287 11838208
0.0938 35.9201 7400 0.1295 12167872
0.0592 36.8910 7600 0.1312 12496352
0.0979 37.8620 7800 0.1301 12826048
0.0695 38.8329 8000 0.1281 13155040
0.0535 39.8039 8200 0.1323 13483008
0.0947 40.7748 8400 0.1333 13812064
0.0939 41.7458 8600 0.1313 14140576
0.1 42.7167 8800 0.1336 14469248
0.0926 43.6877 9000 0.1334 14796672
0.0547 44.6586 9200 0.1406 15126752
0.0441 45.6295 9400 0.1335 15456160
0.0652 46.6005 9600 0.1413 15784928
0.0773 47.5714 9800 0.1463 16113248
0.0428 48.5424 10000 0.1440 16442496
0.0716 49.5133 10200 0.1483 16772640
0.0924 50.4843 10400 0.1518 17100000
0.0422 51.4552 10600 0.1508 17428768
0.0843 52.4262 10800 0.1514 17757344
0.0498 53.3971 11000 0.1496 18085920
0.0625 54.3680 11200 0.1532 18414336
0.0547 55.3390 11400 0.1539 18743040
0.0618 56.3099 11600 0.1569 19072928
0.0785 57.2809 11800 0.1595 19401376
0.0515 58.2518 12000 0.1727 19730336
0.0549 59.2228 12200 0.1590 20059488
0.0464 60.1937 12400 0.1767 20388064
0.0405 61.1646 12600 0.1657 20718144
0.0726 62.1356 12800 0.1717 21048224
0.0425 63.1065 13000 0.1792 21376576
0.0316 64.0775 13200 0.1876 21706080
0.0474 65.0484 13400 0.1780 22034624
0.0531 66.0194 13600 0.1883 22364128
0.0192 66.9879 13800 0.1883 22692352
0.0419 67.9588 14000 0.1933 23020864
0.067 68.9298 14200 0.2017 23349920
0.0988 69.9007 14400 0.2010 23679072
0.0324 70.8717 14600 0.2105 24007776
0.0344 71.8426 14800 0.2170 24336640
0.0404 72.8136 15000 0.2106 24664576
0.0298 73.7845 15200 0.2163 24994848
0.0582 74.7554 15400 0.2287 25322720
0.0449 75.7264 15600 0.2341 25650784
0.0305 76.6973 15800 0.2328 25980512
0.0386 77.6683 16000 0.2371 26309536
0.0422 78.6392 16200 0.2364 26638944
0.0161 79.6102 16400 0.2499 26967360
0.044 80.5811 16600 0.2613 27297120
0.054 81.5521 16800 0.2651 27626144
0.067 82.5230 17000 0.2756 27954656
0.0305 83.4939 17200 0.2709 28284160
0.05 84.4649 17400 0.2834 28612224
0.0233 85.4358 17600 0.2782 28940448
0.0365 86.4068 17800 0.3002 29270912
0.0151 87.3777 18000 0.3146 29599424
0.022 88.3487 18200 0.2994 29929280
0.0112 89.3196 18400 0.3115 30257504
0.0273 90.2906 18600 0.3140 30586944
0.0263 91.2615 18800 0.3332 30915744
0.0082 92.2324 19000 0.3191 31245216
0.0248 93.2034 19200 0.3308 31573600
0.0162 94.1743 19400 0.3445 31903616
0.0328 95.1453 19600 0.3452 32232032
0.0213 96.1162 19800 0.3485 32560480
0.0362 97.0872 20000 0.3690 32889696
0.012 98.0581 20200 0.3662 33218016
0.0427 99.0291 20400 0.3687 33547296
0.0047 99.9976 20600 0.3761 33876000
0.0102 100.9685 20800 0.3858 34205376
0.0568 101.9395 21000 0.3908 34534496
0.0237 102.9104 21200 0.4042 34864000
0.0047 103.8814 21400 0.3988 35192256
0.0061 104.8523 21600 0.4167 35521376
0.024 105.8232 21800 0.4195 35851264
0.0295 106.7942 22000 0.4325 36180000
0.007 107.7651 22200 0.4265 36508832
0.0174 108.7361 22400 0.4390 36837600
0.002 109.7070 22600 0.4417 37166720
0.0157 110.6780 22800 0.4535 37495520
0.007 111.6489 23000 0.4557 37824352
0.0106 112.6199 23200 0.4728 38153856
0.0069 113.5908 23400 0.4805 38483200
0.0373 114.5617 23600 0.4751 38812672
0.0029 115.5327 23800 0.4993 39142400
0.0227 116.5036 24000 0.4784 39471200
0.0058 117.4746 24200 0.5040 39798848
0.0078 118.4455 24400 0.5107 40127360
0.0033 119.4165 24600 0.5235 40456736
0.0368 120.3874 24800 0.5293 40785312
0.0312 121.3584 25000 0.5303 41112576
0.0136 122.3293 25200 0.5345 41442112
0.0083 123.3002 25400 0.5397 41771552
0.0068 124.2712 25600 0.5434 42101248
0.0024 125.2421 25800 0.5456 42427392
0.0296 126.2131 26000 0.5471 42756704
0.005 127.1840 26200 0.5486 43085664
0.012 128.1550 26400 0.5543 43414240
0.0085 129.1259 26600 0.5537 43743072
0.0045 130.0969 26800 0.5692 44072768
0.0105 131.0678 27000 0.5719 44400192
0.0041 132.0387 27200 0.5656 44729632
0.0204 133.0097 27400 0.5705 45058976
0.001 133.9782 27600 0.5736 45388352
0.002 134.9492 27800 0.5692 45717952
0.0163 135.9201 28000 0.5921 46046144
0.0079 136.8910 28200 0.5880 46375168
0.0019 137.8620 28400 0.5980 46702816
0.0079 138.8329 28600 0.5973 47033152
0.0041 139.8039 28800 0.6017 47361472
0.0179 140.7748 29000 0.5970 47691424
0.0008 141.7458 29200 0.6097 48019712
0.0019 142.7167 29400 0.6020 48348832
0.0009 143.6877 29600 0.6162 48678560
0.0008 144.6586 29800 0.6114 49008256
0.0063 145.6295 30000 0.6136 49337088
0.0021 146.6005 30200 0.6179 49665344
0.0011 147.5714 30400 0.6235 49996128
0.0144 148.5424 30600 0.6181 50324736
0.0009 149.5133 30800 0.6214 50652864
0.0016 150.4843 31000 0.6267 50981920
0.0024 151.4552 31200 0.6401 51310752
0.0017 152.4262 31400 0.6248 51640352
0.0015 153.3971 31600 0.6388 51969184
0.0022 154.3680 31800 0.6335 52297280
0.0009 155.3390 32000 0.6335 52625600
0.0048 156.3099 32200 0.6401 52953920
0.0013 157.2809 32400 0.6387 53283648
0.0011 158.2518 32600 0.6407 53613056
0.0169 159.2228 32800 0.6451 53941632
0.0015 160.1937 33000 0.6412 54270272
0.0005 161.1646 33200 0.6370 54599104
0.0009 162.1356 33400 0.6392 54929056
0.0007 163.1065 33600 0.6526 55257728
0.0005 164.0775 33800 0.6567 55587456
0.0006 165.0484 34000 0.6331 55916576
0.0011 166.0194 34200 0.6455 56245664
0.0008 166.9879 34400 0.6661 56574272
0.001 167.9588 34600 0.6539 56903360
0.0008 168.9298 34800 0.6489 57232032
0.0008 169.9007 35000 0.6646 57561504
0.0007 170.8717 35200 0.6584 57891168
0.0079 171.8426 35400 0.6569 58220352
0.0003 172.8136 35600 0.6573 58548960
0.0085 173.7845 35800 0.6679 58878688
0.0007 174.7554 36000 0.6649 59207104
0.0026 175.7264 36200 0.6500 59536800
0.0329 176.6973 36400 0.6776 59865312
0.0014 177.6683 36600 0.6582 60194816
0.0005 178.6392 36800 0.6660 60523584
0.0006 179.6102 37000 0.6595 60852352
0.0052 180.5811 37200 0.6634 61181024
0.0007 181.5521 37400 0.6693 61510624
0.0009 182.5230 37600 0.6616 61840672
0.0022 183.4939 37800 0.6651 62167808
0.0006 184.4649 38000 0.6539 62496960
0.0018 185.4358 38200 0.6568 62826016
0.0005 186.4068 38400 0.6567 63154784
0.0043 187.3777 38600 0.6627 63483904
0.0013 188.3487 38800 0.6679 63811808
0.0008 189.3196 39000 0.6554 64139488
0.0114 190.2906 39200 0.6666 64467808
0.0015 191.2615 39400 0.6620 64798112
0.001 192.2324 39600 0.6620 65126304
0.0012 193.2034 39800 0.6620 65455776
0.0003 194.1743 40000 0.6620 65784064
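The table shows validation loss bottoming out early and then climbing steadily while training loss keeps shrinking, i.e. the run overfits long before the 40,000-step budget is exhausted. The 0.1242 reported at the top corresponds to the minimum at step 6800 (epoch ~33), not to the final checkpoint (0.6620 at step 40000). Selecting the best step from (a slice of) the eval history can be sketched as:

```python
# (step, validation_loss) pairs sampled from the table above
eval_history = [
    (6000, 0.1251),
    (6400, 0.1262),
    (6800, 0.1242),  # global minimum in the full table
    (7200, 0.1287),
    (20000, 0.3690),
    (40000, 0.6620),
]

# The checkpoint with the lowest validation loss is the one worth keeping.
best_step, best_loss = min(eval_history, key=lambda pair: pair[1])
print(best_step, best_loss)  # 6800 0.1242
```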

Framework versions

  • PEFT 0.15.1
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1
This is a PEFT adapter for meta-llama/Meta-Llama-3-8B-Instruct.