train_stsb_1745333590

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the stsb dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4812
  • Num Input Tokens Seen: 54490336

Model description

More information needed

Intended uses & limitations

More information needed
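
Since the card does not yet document usage, here is a minimal inference sketch, assuming the adapter is published as rbelanec/train_stsb_1745333590 and is loaded on top of the base model with PEFT. The prompt shown is a hypothetical STS-B-style instruction; the actual prompt template used during fine-tuning is not documented here.

```python
# Minimal inference sketch: load the base model and apply the PEFT adapter.
# Assumptions: adapter repo id rbelanec/train_stsb_1745333590, and a
# hypothetical STS-B-style prompt (the real training template is undocumented).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_stsb_1745333590"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()

# Hypothetical prompt: STS-B asks for a 0-5 similarity score for a sentence pair.
prompt = (
    "Rate the semantic similarity of the two sentences on a scale from 0 to 5.\n"
    "Sentence 1: A man is playing a guitar.\n"
    "Sentence 2: A person plays a guitar.\n"
    "Score:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```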

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 123
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • training_steps: 40000
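
For orientation, these values map onto a transformers.TrainingArguments configuration roughly as sketched below. This is a reconstruction from the list above, not the actual training script; output_dir, the evaluation/logging cadence (every 200 steps, per the results table), and all omitted arguments are assumptions.

```python
# Sketch of the hyperparameters above as transformers.TrainingArguments.
# Reconstructed from the bullet list; the real training script is not included here.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_stsb_1745333590",  # assumed; matches the card title
    learning_rate=5e-5,
    per_device_train_batch_size=4,       # train_batch_size: 4
    per_device_eval_batch_size=4,        # eval_batch_size: 4
    seed=123,
    gradient_accumulation_steps=4,       # total effective train batch size: 4 * 4 = 16
    optim="adamw_torch",                 # AdamW with betas=(0.9, 0.999), eps=1e-8
    lr_scheduler_type="cosine",
    max_steps=40000,                     # training_steps: 40000
    eval_strategy="steps",               # assumed from the 200-step cadence in the table
    eval_steps=200,
    logging_steps=200,
)
```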

Training results

Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen
1.3021 0.6182 200 1.3604 272576
0.8563 1.2349 400 0.9537 544096
0.7775 1.8532 600 0.7933 818048
0.5938 2.4699 800 0.7215 1089600
0.5399 3.0866 1000 0.6877 1361504
0.8215 3.7048 1200 0.6619 1636960
0.5859 4.3215 1400 0.6443 1909696
0.5313 4.9397 1600 0.6270 2182656
0.4757 5.5564 1800 0.6122 2453904
0.5818 6.1731 2000 0.6038 2727984
0.5458 6.7913 2200 0.5937 2999760
0.6244 7.4080 2400 0.5835 3274528
0.4439 8.0247 2600 0.5784 3546880
0.5198 8.6430 2800 0.5700 3821184
0.565 9.2597 3000 0.5679 4090704
0.4908 9.8779 3200 0.5581 4363696
0.4763 10.4946 3400 0.5546 4636656
0.4321 11.1113 3600 0.5479 4908928
0.4439 11.7295 3800 0.5471 5179040
0.4503 12.3462 4000 0.5413 5452192
0.5098 12.9645 4200 0.5384 5724448
0.4825 13.5811 4400 0.5341 5998032
0.4889 14.1978 4600 0.5312 6269792
0.5382 14.8161 4800 0.5296 6541248
0.388 15.4328 5000 0.5236 6815200
0.4527 16.0495 5200 0.5235 7086224
0.4035 16.6677 5400 0.5216 7360560
0.4303 17.2844 5600 0.5218 7632240
0.4525 17.9026 5800 0.5172 7904432
0.45 18.5193 6000 0.5166 8177168
0.4519 19.1360 6200 0.5149 8449968
0.5066 19.7543 6400 0.5116 8722992
0.3938 20.3709 6600 0.5128 8996224
0.4307 20.9892 6800 0.5109 9269504
0.4476 21.6059 7000 0.5083 9542432
0.4518 22.2226 7200 0.5089 9812704
0.3907 22.8408 7400 0.5076 10086272
0.3981 23.4575 7600 0.5049 10358832
0.4521 24.0742 7800 0.5065 10630000
0.4188 24.6924 8000 0.5049 10904880
0.3882 25.3091 8200 0.5033 11176208
0.4145 25.9274 8400 0.5025 11451344
0.4499 26.5440 8600 0.5004 11723328
0.3958 27.1607 8800 0.5019 11996224
0.4886 27.7790 9000 0.5008 12267520
0.3909 28.3957 9200 0.4996 12542064
0.415 29.0124 9400 0.4997 12812048
0.4272 29.6306 9600 0.4997 13085264
0.464 30.2473 9800 0.4962 13356384
0.4089 30.8655 10000 0.4970 13629216
0.4384 31.4822 10200 0.4971 13902736
0.3831 32.0989 10400 0.4958 14174192
0.4332 32.7172 10600 0.4937 14448176
0.3366 33.3338 10800 0.4965 14718096
0.4022 33.9521 11000 0.4944 14992048
0.372 34.5688 11200 0.4929 15265072
0.3742 35.1855 11400 0.4958 15538960
0.4519 35.8037 11600 0.4919 15812880
0.4465 36.4204 11800 0.4916 16082608
0.4008 37.0371 12000 0.4913 16357888
0.4345 37.6553 12200 0.4953 16627872
0.384 38.2720 12400 0.4915 16900336
0.449 38.8903 12600 0.4908 17175024
0.4306 39.5070 12800 0.4901 17446864
0.3708 40.1236 13000 0.4922 17716560
0.4045 40.7419 13200 0.4897 17991792
0.3715 41.3586 13400 0.4882 18262992
0.4101 41.9768 13600 0.4913 18536880
0.3875 42.5935 13800 0.4885 18806784
0.4068 43.2102 14000 0.4881 19080608
0.4474 43.8284 14200 0.4877 19352320
0.3993 44.4451 14400 0.4883 19624544
0.4251 45.0618 14600 0.4888 19896064
0.396 45.6801 14800 0.4869 20168064
0.3739 46.2968 15000 0.4898 20440208
0.3761 46.9150 15200 0.4862 20713296
0.3866 47.5317 15400 0.4865 20985744
0.3928 48.1484 15600 0.4889 21257920
0.4348 48.7666 15800 0.4865 21529248
0.3957 49.3833 16000 0.4879 21800992
0.3997 50.0 16200 0.4856 22073392
0.4013 50.6182 16400 0.4847 22345648
0.4087 51.2349 16600 0.4867 22617984
0.3616 51.8532 16800 0.4871 22892544
0.3835 52.4699 17000 0.4860 23163488
0.3736 53.0866 17200 0.4872 23438320
0.3932 53.7048 17400 0.4868 23708720
0.3652 54.3215 17600 0.4842 23984304
0.3621 54.9397 17800 0.4853 24256368
0.4472 55.5564 18000 0.4868 24527040
0.4439 56.1731 18200 0.4857 24799312
0.4578 56.7913 18400 0.4862 25072848
0.3759 57.4080 18600 0.4843 25347056
0.3794 58.0247 18800 0.4843 25618400
0.3759 58.6430 19000 0.4843 25892960
0.3669 59.2597 19200 0.4875 26164688
0.4237 59.8779 19400 0.4822 26437392
0.3628 60.4946 19600 0.4840 26710176
0.5107 61.1113 19800 0.4851 26981728
0.4247 61.7295 20000 0.4848 27253632
0.3973 62.3462 20200 0.4838 27524928
0.3955 62.9645 20400 0.4842 27799712
0.3644 63.5811 20600 0.4842 28071024
0.4207 64.1978 20800 0.4853 28342880
0.3765 64.8161 21000 0.4843 28617696
0.5168 65.4328 21200 0.4853 28888112
0.4367 66.0495 21400 0.4817 29162944
0.4098 66.6677 21600 0.4836 29434784
0.3641 67.2844 21800 0.4829 29706800
0.4976 67.9026 22000 0.4826 29980240
0.3862 68.5193 22200 0.4841 30250192
0.4779 69.1360 22400 0.4845 30522672
0.3349 69.7543 22600 0.4827 30795024
0.3657 70.3709 22800 0.4833 31066544
0.368 70.9892 23000 0.4847 31338128
0.3496 71.6059 23200 0.4832 31609104
0.495 72.2226 23400 0.4847 31881424
0.3431 72.8408 23600 0.4832 32155024
0.34 73.4575 23800 0.4835 32425312
0.4171 74.0742 24000 0.4837 32698784
0.4163 74.6924 24200 0.4853 32974144
0.4195 75.3091 24400 0.4843 33245216
0.3131 75.9274 24600 0.4826 33517088
0.3765 76.5440 24800 0.4833 33788432
0.3679 77.1607 25000 0.4826 34060416
0.364 77.7790 25200 0.4815 34333408
0.4308 78.3957 25400 0.4824 34605392
0.3733 79.0124 25600 0.4846 34879536
0.3827 79.6306 25800 0.4834 35153488
0.4743 80.2473 26000 0.4842 35424912
0.3484 80.8655 26200 0.4827 35698064
0.429 81.4822 26400 0.4826 35968160
0.3979 82.0989 26600 0.4825 36240928
0.4266 82.7172 26800 0.4823 36514208
0.3347 83.3338 27000 0.4834 36785136
0.3878 83.9521 27200 0.4831 37061648
0.3678 84.5688 27400 0.4825 37333648
0.372 85.1855 27600 0.4832 37605184
0.3617 85.8037 27800 0.4842 37875360
0.4043 86.4204 28000 0.4821 38150208
0.4293 87.0371 28200 0.4839 38422048
0.4234 87.6553 28400 0.4835 38692224
0.4145 88.2720 28600 0.4831 38964176
0.3874 88.8903 28800 0.4826 39235184
0.352 89.5070 29000 0.4838 39507520
0.3479 90.1236 29200 0.4824 39779328
0.3455 90.7419 29400 0.4823 40051520
0.3366 91.3586 29600 0.4835 40322576
0.4397 91.9768 29800 0.4823 40596016
0.3862 92.5935 30000 0.4819 40867568
0.3997 93.2102 30200 0.4836 41140848
0.3862 93.8284 30400 0.4815 41412848
0.3685 94.4451 30600 0.4812 41683920
0.3664 95.0618 30800 0.4834 41959008
0.3729 95.6801 31000 0.4836 42231520
0.4265 96.2968 31200 0.4827 42502416
0.3464 96.9150 31400 0.4834 42776304
0.3806 97.5317 31600 0.4833 43048176
0.3615 98.1484 31800 0.4827 43320144
0.348 98.7666 32000 0.4829 43591728
0.3959 99.3833 32200 0.4822 43866048
0.3813 100.0 32400 0.4834 44137040
0.3691 100.6182 32600 0.4835 44408848
0.3928 101.2349 32800 0.4827 44682912
0.4105 101.8532 33000 0.4831 44956000
0.3672 102.4699 33200 0.4826 45227824
0.3992 103.0866 33400 0.4820 45498320
0.3749 103.7048 33600 0.4825 45773648
0.388 104.3215 33800 0.4824 46044128
0.3478 104.9397 34000 0.4826 46317504
0.4038 105.5564 34200 0.4823 46589024
0.3426 106.1731 34400 0.4827 46863680
0.3749 106.7913 34600 0.4819 47135520
0.3281 107.4080 34800 0.4836 47407056
0.382 108.0247 35000 0.4832 47680112
0.3511 108.6430 35200 0.4841 47951632
0.3876 109.2597 35400 0.4815 48224016
0.3909 109.8779 35600 0.4840 48497072
0.3368 110.4946 35800 0.4827 48768624
0.3998 111.1113 36000 0.4821 49041488
0.3596 111.7295 36200 0.4829 49314352
0.3438 112.3462 36400 0.4833 49584848
0.3969 112.9645 36600 0.4839 49858864
0.4149 113.5811 36800 0.4836 50130000
0.3473 114.1978 37000 0.4839 50404128
0.3392 114.8161 37200 0.4823 50678112
0.3617 115.4328 37400 0.4842 50946800
0.4183 116.0495 37600 0.4829 51219680
0.3372 116.6677 37800 0.4829 51492544
0.3408 117.2844 38000 0.4827 51764160
0.368 117.9026 38200 0.4823 52039488
0.3565 118.5193 38400 0.4824 52311648
0.3893 119.1360 38600 0.4842 52584960
0.3727 119.7543 38800 0.4838 52855712
0.3457 120.3709 39000 0.4829 53128480
0.3783 120.9892 39200 0.4820 53401056
0.469 121.6059 39400 0.4831 53673600
0.35 122.2226 39600 0.4831 53943712
0.3627 122.8408 39800 0.4831 54217344
0.3989 123.4575 40000 0.4831 54490336
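
The validation loss plateaus in the 0.482-0.485 range from roughly step 20000 onward. The best value, 0.4812 at step 30600, matches the headline loss reported above, which suggests the best checkpoint was retained rather than the final one (0.4831 at step 40000).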

Framework versions

  • PEFT 0.15.1
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1