train_cb_1745950308

This model is a fine-tuned version of google/gemma-3-1b-it on the cb dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2023
  • Num Input Tokens Seen: 22718312

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 123
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 4
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • training_steps: 40000
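The total train batch size listed above follows from the per-device batch size and gradient accumulation (assuming single-device training, consistent with the reported total of 4). A minimal sanity check:

```python
# Effective (total) train batch size = per-device batch size
# x gradient accumulation steps, for a single device.
train_batch_size = 2
gradient_accumulation_steps = 2

total_train_batch_size = train_batch_size * gradient_accumulation_steps
print(total_train_batch_size)  # 4, matching total_train_batch_size above
```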

Training results

Training Loss Epoch Step Validation Loss Input Tokens Seen
0.6391 3.5133 200 0.5417 114504
0.4911 7.0177 400 0.3985 228504
0.3148 10.5310 600 0.3381 341136
0.265 14.0354 800 0.2901 455488
0.1024 17.5487 1000 0.2637 569504
0.1238 21.0531 1200 0.2477 682024
0.0141 24.5664 1400 0.2390 796328
0.0747 28.0708 1600 0.2173 909320
0.0455 31.5841 1800 0.2249 1023696
0.0653 35.0885 2000 0.2210 1137280
0.0144 38.6018 2200 0.2023 1251592
0.0855 42.1062 2400 0.2247 1364312
0.0183 45.6195 2600 0.2138 1478704
0.0104 49.1239 2800 0.2256 1591424
0.0089 52.6372 3000 0.2304 1705000
0.0743 56.1416 3200 0.2253 1818688
0.013 59.6549 3400 0.2388 1932248
0.0327 63.1593 3600 0.2343 2045464
0.0439 66.6726 3800 0.2679 2159128
0.0082 70.1770 4000 0.2531 2272792
0.0034 73.6903 4200 0.2519 2387344
0.0178 77.1947 4400 0.2620 2500160
0.0041 80.7080 4600 0.2711 2614032
0.0022 84.2124 4800 0.2904 2728488
0.0011 87.7257 5000 0.3025 2842656
0.0033 91.2301 5200 0.2947 2956824
0.0015 94.7434 5400 0.3044 3069840
0.0035 98.2478 5600 0.3178 3183600
0.0017 101.7611 5800 0.3312 3297896
0.0004 105.2655 6000 0.3405 3411544
0.0011 108.7788 6200 0.3566 3525472
0.0005 112.2832 6400 0.3447 3638584
0.001 115.7965 6600 0.3534 3752608
0.0005 119.3009 6800 0.3653 3865376
0.0005 122.8142 7000 0.3451 3979464
0.0005 126.3186 7200 0.3638 4093296
0.0003 129.8319 7400 0.3652 4207120
0.0003 133.3363 7600 0.3779 4320568
0.0001 136.8496 7800 0.3726 4434056
0.0002 140.3540 8000 0.3781 4547840
0.0002 143.8673 8200 0.3936 4662192
0.0002 147.3717 8400 0.3909 4774160
0.0001 150.8850 8600 0.3995 4887640
0.0001 154.3894 8800 0.4058 5002864
0.0001 157.9027 9000 0.4136 5116216
0.0001 161.4071 9200 0.4089 5229496
0.0001 164.9204 9400 0.4107 5343528
0.0001 168.4248 9600 0.4285 5455520
0.0001 171.9381 9800 0.4210 5571144
0.0 175.4425 10000 0.4252 5684752
0.0001 178.9558 10200 0.4359 5799088
0.0 182.4602 10400 0.4257 5911888
0.0 185.9735 10600 0.4229 6025544
0.0 189.4779 10800 0.4261 6139264
0.0 192.9912 11000 0.4383 6252832
0.0 196.4956 11200 0.4593 6366440
0.0 200.0 11400 0.4587 6478776
0.0 203.5133 11600 0.4423 6592280
0.0 207.0177 11800 0.4542 6704968
0.0 210.5310 12000 0.4529 6819568
0.0 214.0354 12200 0.4446 6933264
0.0 217.5487 12400 0.4566 7045688
0.0 221.0531 12600 0.4661 7159888
0.0 224.5664 12800 0.4743 7274296
0.0 228.0708 13000 0.4834 7387544
0.0 231.5841 13200 0.4638 7500200
0.0 235.0885 13400 0.4666 7614696
0.0 238.6018 13600 0.4755 7727608
0.0 242.1062 13800 0.4843 7840696
0.0 245.6195 14000 0.4933 7954632
0.0 249.1239 14200 0.4881 8068648
0.0 252.6372 14400 0.5147 8181840
0.0 256.1416 14600 0.4881 8294896
0.0 259.6549 14800 0.5142 8408512
0.0 263.1593 15000 0.4932 8522664
0.0 266.6726 15200 0.4977 8636032
0.0 270.1770 15400 0.5226 8748624
0.0 273.6903 15600 0.5147 8863248
0.0 277.1947 15800 0.5117 8976424
0.0 280.7080 16000 0.5130 9088984
0.0 284.2124 16200 0.5174 9204128
0.0 287.7257 16400 0.5122 9317208
0.0 291.2301 16600 0.5242 9431208
0.0 294.7434 16800 0.5225 9544328
0.0 298.2478 17000 0.5478 9657432
0.0 301.7611 17200 0.5591 9770824
0.0 305.2655 17400 0.5156 9884648
0.0 308.7788 17600 0.5336 9997288
0.0 312.2832 17800 0.5303 10111472
0.0 315.7965 18000 0.5557 10223648
0.0 319.3009 18200 0.5313 10336864
0.0 322.8142 18400 0.5492 10450688
0.0 326.3186 18600 0.5344 10563128
0.0 329.8319 18800 0.5433 10677928
0.0 333.3363 19000 0.5773 10790896
0.0 336.8496 19200 0.5537 10904600
0.0 340.3540 19400 0.5574 11018112
0.0 343.8673 19600 0.5366 11131712
0.0 347.3717 19800 0.5600 11245728
0.0 350.8850 20000 0.5699 11358800
0.0 354.3894 20200 0.5486 11471832
0.0 357.9027 20400 0.5586 11586368
0.0 361.4071 20600 0.5623 11700176
0.0 364.9204 20800 0.5771 11814304
0.0 368.4248 21000 0.5425 11927464
0.0 371.9381 21200 0.5818 12041416
0.0 375.4425 21400 0.5916 12153176
0.0 378.9558 21600 0.5889 12267984
0.0 382.4602 21800 0.5943 12381424
0.0 385.9735 22000 0.5870 12494280
0.0 389.4779 22200 0.5731 12608008
0.0 392.9912 22400 0.6058 12721456
0.0 396.4956 22600 0.5977 12835240
0.0 400.0 22800 0.6147 12948416
0.0 403.5133 23000 0.6086 13061472
0.0 407.0177 23200 0.6105 13175888
0.0 410.5310 23400 0.6152 13289752
0.0 414.0354 23600 0.6163 13403848
0.0 417.5487 23800 0.6257 13518496
0.0 421.0531 24000 0.5990 13631704
0.0 424.5664 24200 0.5993 13745200
0.0 428.0708 24400 0.6045 13859752
0.0 431.5841 24600 0.6135 13972648
0.0 435.0885 24800 0.6303 14086360
0.0 438.6018 25000 0.6207 14201656
0.0 442.1062 25200 0.6126 14314736
0.0 445.6195 25400 0.6147 14428104
0.0 449.1239 25600 0.6082 14541136
0.0 452.6372 25800 0.6216 14655696
0.0 456.1416 26000 0.6219 14768168
0.0 459.6549 26200 0.6315 14882048
0.0 463.1593 26400 0.6396 14996008
0.0 466.6726 26600 0.6411 15109352
0.0 470.1770 26800 0.6570 15223592
0.0 473.6903 27000 0.6647 15338072
0.0 477.1947 27200 0.6556 15451312
0.0 480.7080 27400 0.6473 15565784
0.0 484.2124 27600 0.6647 15679720
0.0 487.7257 27800 0.6632 15792680
0.0 491.2301 28000 0.6731 15906624
0.0 494.7434 28200 0.6559 16019936
0.0 498.2478 28400 0.6320 16133784
0.0 501.7611 28600 0.6781 16248200
0.0 505.2655 28800 0.6782 16361560
0.0 508.7788 29000 0.6502 16475624
0.0 512.2832 29200 0.6390 16588984
0.0 515.7965 29400 0.6706 16702496
0.0 519.3009 29600 0.6885 16816272
0.0 522.8142 29800 0.6672 16929072
0.0 526.3186 30000 0.6908 17043120
0.0 529.8319 30200 0.7010 17156344
0.0 533.3363 30400 0.7022 17268656
0.0 536.8496 30600 0.6844 17383696
0.0 540.3540 30800 0.6849 17495648
0.0 543.8673 31000 0.7018 17609616
0.0 547.3717 31200 0.6727 17723600
0.0 550.8850 31400 0.6931 17836576
0.0 554.3894 31600 0.6648 17949928
0.0 557.9027 31800 0.6720 18064576
0.0 561.4071 32000 0.6760 18177096
0.0 564.9204 32200 0.6887 18290608
0.0 568.4248 32400 0.7023 18404648
0.0 571.9381 32600 0.6980 18517216
0.0 575.4425 32800 0.6711 18631296
0.0 578.9558 33000 0.6660 18745416
0.0 582.4602 33200 0.6717 18857896
0.0 585.9735 33400 0.6783 18971344
0.0 589.4779 33600 0.6766 19085248
0.0 592.9912 33800 0.6796 19199136
0.0 596.4956 34000 0.7248 19311344
0.0 600.0 34200 0.6982 19425472
0.0 603.5133 34400 0.6736 19539112
0.0 607.0177 34600 0.6695 19652392
0.0 610.5310 34800 0.7022 19766904
0.0 614.0354 35000 0.6896 19879808
0.0 617.5487 35200 0.6923 19993952
0.0 621.0531 35400 0.7184 20107560
0.0 624.5664 35600 0.6938 20220888
0.0 628.0708 35800 0.7055 20333904
0.0 631.5841 36000 0.6938 20446736
0.0 635.0885 36200 0.7019 20560472
0.0 638.6018 36400 0.6990 20673984
0.0 642.1062 36600 0.6915 20786240
0.0 645.6195 36800 0.6995 20899128
0.0 649.1239 37000 0.7121 21011928
0.0 652.6372 37200 0.7113 21126880
0.0 656.1416 37400 0.6808 21239760
0.0 659.6549 37600 0.6962 21353776
0.0 663.1593 37800 0.6780 21467368
0.0 666.6726 38000 0.6750 21581512
0.0 670.1770 38200 0.6950 21694376
0.0 673.6903 38400 0.6880 21808568
0.0 677.1947 38600 0.6614 21922424
0.0 680.7080 38800 0.7017 22036600
0.0 684.2124 39000 0.7000 22150992
0.0 687.7257 39200 0.7024 22263616
0.0 691.2301 39400 0.7024 22377936
0.0 694.7434 39600 0.7024 22490328
0.0 698.2478 39800 0.7024 22604096
0.0 701.7611 40000 0.7024 22718312
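The validation loss above bottoms out early and then climbs steadily while training loss goes to zero, a typical overfitting pattern on a small dataset. The reported evaluation loss (0.2023) is the minimum validation loss, reached at step 2200, not the final-step loss (0.7024). A short sketch, using a few rows excerpted from the table, that recovers the best checkpoint:

```python
# (step, validation_loss) pairs excerpted from the results table above;
# the full table follows the same pattern.
log = [
    (200, 0.5417), (1000, 0.2637), (2200, 0.2023),
    (2400, 0.2247), (10000, 0.4252), (40000, 0.7024),
]

# The best checkpoint is the one with minimum validation loss.
best_step, best_loss = min(log, key=lambda row: row[1])
print(best_step, best_loss)  # 2200 0.2023
```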

Framework versions

  • PEFT 0.15.2.dev0
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1