# train_stsb_1745333591

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the stsb dataset. It achieves the following results on the evaluation set:

- Loss: 0.5494
- Num Input Tokens Seen: 54490336

The reported loss matches the best validation loss in the results table (step 3400) rather than the final-step value (2.1370), which suggests the best checkpoint was retained for evaluation.

## Model description

train_stsb_1745333591 is a PEFT adapter trained on top of meta-llama/Meta-Llama-3-8B-Instruct using the stsb dataset (see Framework versions below). No further details about the adapter method or prompt format are provided.
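
Since the framework list below pins PEFT and this repository ships an adapter, it can presumably be loaded with the `peft` library. The following is a minimal, hedged sketch assuming the repository id shown on this page and that access to the gated base model has been granted; it is not an official usage snippet.

```python
# Minimal sketch: load the adapter together with its base model.
# Assumes this repo hosts a PEFT adapter whose config points at
# meta-llama/Meta-Llama-3-8B-Instruct (a gated model on the Hub).
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

model = AutoPeftModelForCausalLM.from_pretrained("rbelanec/train_stsb_1745333591")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
```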

## Intended uses & limitations

More information needed

## Training and evaluation data

The model was trained and evaluated on the stsb dataset; the validation losses in the results table below are reported on its evaluation split. Preprocessing details are not documented here.
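
The card names the dataset only as "stsb". Assuming this refers to the GLUE STS-B (Semantic Textual Similarity Benchmark) task, it can be inspected with the pinned `datasets` version as follows; the GLUE identifier is an assumption, not something stated in this card.

```python
# Hedged sketch: assumes "stsb" is the GLUE STS-B task.
from datasets import load_dataset

stsb = load_dataset("glue", "stsb")
print(stsb)              # DatasetDict with train / validation / test splits
print(stsb["train"][0])  # fields: sentence1, sentence2, label (0-5 similarity), idx
```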

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a hedged sketch mapping them onto `TrainingArguments` follows the list):

- learning_rate: 0.3
- train_batch_size: 4
- eval_batch_size: 4
- seed: 123
- gradient_accumulation_steps: 4
- total_train_batch_size: 16
- optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: cosine
- training_steps: 40000
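
As a rough illustration only (the actual training script is not included in this card), these values map onto `transformers.TrainingArguments` as sketched below. The `output_dir` is an assumption, and `eval_steps`/`logging_steps` of 200 are inferred from the evaluation cadence visible in the results table.

```python
# Hedged sketch of the hyperparameters above as TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_stsb_1745333591",  # assumed output directory
    learning_rate=0.3,                   # unusually high; typical of prompt-style PEFT methods
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=4,       # 4 * 4 = total_train_batch_size of 16
    seed=123,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    max_steps=40_000,
    eval_strategy="steps",
    eval_steps=200,                      # inferred from the 200-step cadence below
    logging_steps=200,
)
```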

### Training results

| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:--------:|:-----:|:---------------:|:-----------------:|
| 0.8213 | 0.6182 | 200 | 0.9578 | 272576 |
| 0.5891 | 1.2349 | 400 | 0.7184 | 544096 |
| 0.6185 | 1.8532 | 600 | 0.6815 | 818048 |
| 0.5622 | 2.4699 | 800 | 0.6753 | 1089600 |
| 0.5506 | 3.0866 | 1000 | 0.6588 | 1361504 |
| 0.7865 | 3.7048 | 1200 | 0.6508 | 1636960 |
| 0.5862 | 4.3215 | 1400 | 0.6580 | 1909696 |
| 0.5254 | 4.9397 | 1600 | 0.6381 | 2182656 |
| 0.5086 | 5.5564 | 1800 | 0.6330 | 2453904 |
| 0.5407 | 6.1731 | 2000 | 0.6232 | 2727984 |
| 0.5487 | 6.7913 | 2200 | 0.6168 | 2999760 |
| 0.5661 | 7.4080 | 2400 | 0.5622 | 3274528 |
| 0.4508 | 8.0247 | 2600 | 0.5814 | 3546880 |
| 0.5109 | 8.6430 | 2800 | 0.5915 | 3821184 |
| 0.428 | 9.2597 | 3000 | 0.5584 | 4090704 |
| 0.4683 | 9.8779 | 3200 | 0.5621 | 4363696 |
| 0.4473 | 10.4946 | 3400 | 0.5494 | 4636656 |
| 0.4395 | 11.1113 | 3600 | 0.5833 | 4908928 |
| 0.4209 | 11.7295 | 3800 | 0.5668 | 5179040 |
| 0.4095 | 12.3462 | 4000 | 0.5749 | 5452192 |
| 0.444 | 12.9645 | 4200 | 0.5647 | 5724448 |
| 0.4249 | 13.5811 | 4400 | 0.5572 | 5998032 |
| 0.3631 | 14.1978 | 4600 | 0.5687 | 6269792 |
| 0.4563 | 14.8161 | 4800 | 0.5626 | 6541248 |
| 0.3545 | 15.4328 | 5000 | 0.5852 | 6815200 |
| 0.3135 | 16.0495 | 5200 | 0.6189 | 7086224 |
| 0.3561 | 16.6677 | 5400 | 0.6123 | 7360560 |
| 0.3582 | 17.2844 | 5600 | 0.6112 | 7632240 |
| 0.3834 | 17.9026 | 5800 | 0.5843 | 7904432 |
| 0.3268 | 18.5193 | 6000 | 0.6199 | 8177168 |
| 0.27 | 19.1360 | 6200 | 0.6794 | 8449968 |
| 0.3261 | 19.7543 | 6400 | 0.6375 | 8722992 |
| 0.262 | 20.3709 | 6600 | 0.6706 | 8996224 |
| 0.2588 | 20.9892 | 6800 | 0.6481 | 9269504 |
| 0.2761 | 21.6059 | 7000 | 0.7299 | 9542432 |
| 0.2007 | 22.2226 | 7200 | 0.7841 | 9812704 |
| 0.2119 | 22.8408 | 7400 | 0.7382 | 10086272 |
| 0.1575 | 23.4575 | 7600 | 0.7728 | 10358832 |
| 0.1397 | 24.0742 | 7800 | 0.8269 | 10630000 |
| 0.1879 | 24.6924 | 8000 | 0.8175 | 10904880 |
| 0.1317 | 25.3091 | 8200 | 0.8720 | 11176208 |
| 0.1594 | 25.9274 | 8400 | 0.9042 | 11451344 |
| 0.1193 | 26.5440 | 8600 | 0.8620 | 11723328 |
| 0.0803 | 27.1607 | 8800 | 0.9757 | 11996224 |
| 0.1226 | 27.7790 | 9000 | 0.9386 | 12267520 |
| 0.0938 | 28.3957 | 9200 | 0.9238 | 12542064 |
| 0.0519 | 29.0124 | 9400 | 1.0646 | 12812048 |
| 0.0728 | 29.6306 | 9600 | 1.0750 | 13085264 |
| 0.0463 | 30.2473 | 9800 | 1.0078 | 13356384 |
| 0.0744 | 30.8655 | 10000 | 1.0580 | 13629216 |
| 0.0701 | 31.4822 | 10200 | 1.0451 | 13902736 |
| 0.0504 | 32.0989 | 10400 | 1.0477 | 14174192 |
| 0.0352 | 32.7172 | 10600 | 1.1435 | 14448176 |
| 0.0855 | 33.3338 | 10800 | 1.0730 | 14718096 |
| 0.0376 | 33.9521 | 11000 | 1.0351 | 14992048 |
| 0.0484 | 34.5688 | 11200 | 1.1395 | 15265072 |
| 0.0267 | 35.1855 | 11400 | 1.1202 | 15538960 |
| 0.0298 | 35.8037 | 11600 | 1.1337 | 15812880 |
| 0.0341 | 36.4204 | 11800 | 1.1777 | 16082608 |
| 0.0415 | 37.0371 | 12000 | 1.1897 | 16357888 |
| 0.0287 | 37.6553 | 12200 | 1.2221 | 16627872 |
| 0.0388 | 38.2720 | 12400 | 1.1698 | 16900336 |
| 0.0232 | 38.8903 | 12600 | 1.1674 | 17175024 |
| 0.0238 | 39.5070 | 12800 | 1.1664 | 17446864 |
| 0.0163 | 40.1236 | 13000 | 1.2493 | 17716560 |
| 0.0154 | 40.7419 | 13200 | 1.3187 | 17991792 |
| 0.0173 | 41.3586 | 13400 | 1.2568 | 18262992 |
| 0.0164 | 41.9768 | 13600 | 1.2448 | 18536880 |
| 0.0158 | 42.5935 | 13800 | 1.2337 | 18806784 |
| 0.0111 | 43.2102 | 14000 | 1.2544 | 19080608 |
| 0.0216 | 43.8284 | 14200 | 1.3476 | 19352320 |
| 0.0259 | 44.4451 | 14400 | 1.2956 | 19624544 |
| 0.013 | 45.0618 | 14600 | 1.2143 | 19896064 |
| 0.0283 | 45.6801 | 14800 | 1.2005 | 20168064 |
| 0.0095 | 46.2968 | 15000 | 1.3231 | 20440208 |
| 0.0119 | 46.9150 | 15200 | 1.2639 | 20713296 |
| 0.008 | 47.5317 | 15400 | 1.3380 | 20985744 |
| 0.0117 | 48.1484 | 15600 | 1.2504 | 21257920 |
| 0.0175 | 48.7666 | 15800 | 1.2863 | 21529248 |
| 0.019 | 49.3833 | 16000 | 1.3123 | 21800992 |
| 0.0077 | 50.0 | 16200 | 1.2967 | 22073392 |
| 0.0052 | 50.6182 | 16400 | 1.3633 | 22345648 |
| 0.01 | 51.2349 | 16600 | 1.3670 | 22617984 |
| 0.0185 | 51.8532 | 16800 | 1.3321 | 22892544 |
| 0.0037 | 52.4699 | 17000 | 1.4302 | 23163488 |
| 0.0382 | 53.0866 | 17200 | 1.3213 | 23438320 |
| 0.0034 | 53.7048 | 17400 | 1.4571 | 23708720 |
| 0.0031 | 54.3215 | 17600 | 1.3874 | 23984304 |
| 0.011 | 54.9397 | 17800 | 1.4203 | 24256368 |
| 0.0037 | 55.5564 | 18000 | 1.3831 | 24527040 |
| 0.0009 | 56.1731 | 18200 | 1.4859 | 24799312 |
| 0.0012 | 56.7913 | 18400 | 1.5054 | 25072848 |
| 0.0028 | 57.4080 | 18600 | 1.4733 | 25347056 |
| 0.0124 | 58.0247 | 18800 | 1.5096 | 25618400 |
| 0.0144 | 58.6430 | 19000 | 1.3225 | 25892960 |
| 0.0057 | 59.2597 | 19200 | 1.4172 | 26164688 |
| 0.0103 | 59.8779 | 19400 | 1.3579 | 26437392 |
| 0.0153 | 60.4946 | 19600 | 1.4063 | 26710176 |
| 0.0056 | 61.1113 | 19800 | 1.4266 | 26981728 |
| 0.0104 | 61.7295 | 20000 | 1.3551 | 27253632 |
| 0.0035 | 62.3462 | 20200 | 1.4744 | 27524928 |
| 0.0028 | 62.9645 | 20400 | 1.5116 | 27799712 |
| 0.0006 | 63.5811 | 20600 | 1.5977 | 28071024 |
| 0.0005 | 64.1978 | 20800 | 1.5763 | 28342880 |
| 0.0003 | 64.8161 | 21000 | 1.6289 | 28617696 |
| 0.0003 | 65.4328 | 21200 | 1.6688 | 28888112 |
| 0.0004 | 66.0495 | 21400 | 1.6156 | 29162944 |
| 0.0003 | 66.6677 | 21600 | 1.6829 | 29434784 |
| 0.0001 | 67.2844 | 21800 | 1.6700 | 29706800 |
| 0.0003 | 67.9026 | 22000 | 1.6916 | 29980240 |
| 0.0001 | 68.5193 | 22200 | 1.7333 | 30250192 |
| 0.0002 | 69.1360 | 22400 | 1.7389 | 30522672 |
| 0.0001 | 69.7543 | 22600 | 1.7203 | 30795024 |
| 0.0001 | 70.3709 | 22800 | 1.7700 | 31066544 |
| 0.0001 | 70.9892 | 23000 | 1.7697 | 31338128 |
| 0.0001 | 71.6059 | 23200 | 1.8099 | 31609104 |
| 0.0007 | 72.2226 | 23400 | 1.8562 | 31881424 |
| 0.0002 | 72.8408 | 23600 | 1.7837 | 32155024 |
| 0.0001 | 73.4575 | 23800 | 1.8126 | 32425312 |
| 0.0001 | 74.0742 | 24000 | 1.8575 | 32698784 |
| 0.0001 | 74.6924 | 24200 | 1.8753 | 32974144 |
| 0.0001 | 75.3091 | 24400 | 1.9167 | 33245216 |
| 0.026 | 75.9274 | 24600 | 1.1968 | 33517088 |
| 0.0078 | 76.5440 | 24800 | 1.3782 | 33788432 |
| 0.0223 | 77.1607 | 25000 | 1.5010 | 34060416 |
| 0.003 | 77.7790 | 25200 | 1.5150 | 34333408 |
| 0.0016 | 78.3957 | 25400 | 1.6160 | 34605392 |
| 0.0009 | 79.0124 | 25600 | 1.5820 | 34879536 |
| 0.0004 | 79.6306 | 25800 | 1.6513 | 35153488 |
| 0.0002 | 80.2473 | 26000 | 1.6964 | 35424912 |
| 0.0001 | 80.8655 | 26200 | 1.7483 | 35698064 |
| 0.0002 | 81.4822 | 26400 | 1.7371 | 35968160 |
| 0.0001 | 82.0989 | 26600 | 1.7791 | 36240928 |
| 0.0021 | 82.7172 | 26800 | 1.7728 | 36514208 |
| 0.0001 | 83.3338 | 27000 | 1.7723 | 36785136 |
| 0.0001 | 83.9521 | 27200 | 1.8002 | 37061648 |
| 0.0012 | 84.5688 | 27400 | 1.8043 | 37333648 |
| 0.0001 | 85.1855 | 27600 | 1.8355 | 37605184 |
| 0.0001 | 85.8037 | 27800 | 1.8401 | 37875360 |
| 0.0001 | 86.4204 | 28000 | 1.8688 | 38150208 |
| 0.0001 | 87.0371 | 28200 | 1.8104 | 38422048 |
| 0.0001 | 87.6553 | 28400 | 1.8730 | 38692224 |
| 0.0001 | 88.2720 | 28600 | 1.8787 | 38964176 |
| 0.0001 | 88.8903 | 28800 | 1.8849 | 39235184 |
| 0.003 | 89.5070 | 29000 | 1.9233 | 39507520 |
| 0.001 | 90.1236 | 29200 | 1.9127 | 39779328 |
| 0.0022 | 90.7419 | 29400 | 1.8981 | 40051520 |
| 0.0 | 91.3586 | 29600 | 1.9303 | 40322576 |
| 0.0001 | 91.9768 | 29800 | 1.9180 | 40596016 |
| 0.0001 | 92.5935 | 30000 | 1.9204 | 40867568 |
| 0.0001 | 93.2102 | 30200 | 1.9712 | 41140848 |
| 0.0 | 93.8284 | 30400 | 1.9761 | 41412848 |
| 0.0001 | 94.4451 | 30600 | 1.9585 | 41683920 |
| 0.0 | 95.0618 | 30800 | 1.9967 | 41959008 |
| 0.0 | 95.6801 | 31000 | 1.9950 | 42231520 |
| 0.0001 | 96.2968 | 31200 | 1.9839 | 42502416 |
| 0.0 | 96.9150 | 31400 | 2.0041 | 42776304 |
| 0.0001 | 97.5317 | 31600 | 2.0162 | 43048176 |
| 0.0 | 98.1484 | 31800 | 2.0103 | 43320144 |
| 0.0 | 98.7666 | 32000 | 2.0081 | 43591728 |
| 0.0027 | 99.3833 | 32200 | 2.0273 | 43866048 |
| 0.0 | 100.0 | 32400 | 2.0347 | 44137040 |
| 0.0 | 100.6182 | 32600 | 2.0524 | 44408848 |
| 0.0008 | 101.2349 | 32800 | 2.0672 | 44682912 |
| 0.0 | 101.8532 | 33000 | 2.0429 | 44956000 |
| 0.0022 | 102.4699 | 33200 | 2.0500 | 45227824 |
| 0.0 | 103.0866 | 33400 | 2.0476 | 45498320 |
| 0.0 | 103.7048 | 33600 | 2.0636 | 45773648 |
| 0.0012 | 104.3215 | 33800 | 2.0808 | 46044128 |
| 0.0 | 104.9397 | 34000 | 2.0721 | 46317504 |
| 0.0 | 105.5564 | 34200 | 2.0830 | 46589024 |
| 0.0013 | 106.1731 | 34400 | 2.0945 | 46863680 |
| 0.0 | 106.7913 | 34600 | 2.0967 | 47135520 |
| 0.0 | 107.4080 | 34800 | 2.1042 | 47407056 |
| 0.0 | 108.0247 | 35000 | 2.0969 | 47680112 |
| 0.0 | 108.6430 | 35200 | 2.1074 | 47951632 |
| 0.0 | 109.2597 | 35400 | 2.1103 | 48224016 |
| 0.0 | 109.8779 | 35600 | 2.1072 | 48497072 |
| 0.0 | 110.4946 | 35800 | 2.1081 | 48768624 |
| 0.0 | 111.1113 | 36000 | 2.1116 | 49041488 |
| 0.0 | 111.7295 | 36200 | 2.1243 | 49314352 |
| 0.0 | 112.3462 | 36400 | 2.1215 | 49584848 |
| 0.0 | 112.9645 | 36600 | 2.1199 | 49858864 |
| 0.0 | 113.5811 | 36800 | 2.1292 | 50130000 |
| 0.0012 | 114.1978 | 37000 | 2.1276 | 50404128 |
| 0.0 | 114.8161 | 37200 | 2.1346 | 50678112 |
| 0.0012 | 115.4328 | 37400 | 2.1323 | 50946800 |
| 0.0 | 116.0495 | 37600 | 2.1319 | 51219680 |
| 0.0 | 116.6677 | 37800 | 2.1324 | 51492544 |
| 0.0 | 117.2844 | 38000 | 2.1351 | 51764160 |
| 0.0 | 117.9026 | 38200 | 2.1349 | 52039488 |
| 0.0 | 118.5193 | 38400 | 2.1382 | 52311648 |
| 0.0 | 119.1360 | 38600 | 2.1390 | 52584960 |
| 0.0 | 119.7543 | 38800 | 2.1410 | 52855712 |
| 0.0 | 120.3709 | 39000 | 2.1428 | 53128480 |
| 0.0 | 120.9892 | 39200 | 2.1429 | 53401056 |
| 0.0 | 121.6059 | 39400 | 2.1412 | 53673600 |
| 0.0 | 122.2226 | 39600 | 2.1376 | 53943712 |
| 0.0 | 122.8408 | 39800 | 2.1381 | 54217344 |
| 0.0 | 123.4575 | 40000 | 2.1370 | 54490336 |

### Framework versions

- PEFT 0.15.1
- Transformers 4.51.3
- Pytorch 2.6.0+cu124
- Datasets 3.5.0
- Tokenizers 0.21.1