train_cb_1745950309

This model is a fine-tuned version of google/gemma-3-1b-it on the cb dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1622
  • Num Input Tokens Seen: 22718312
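
Since this is a PEFT adapter on top of google/gemma-3-1b-it (see the framework versions below), the usual loading pattern applies. The following is a minimal sketch, assuming the adapter is published under this card's repo id, rbelanec/train_cb_1745950309:

```python
# Minimal sketch: load the base model, then apply this card's adapter with PEFT.
# The adapter repo id below is assumed from the card title.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("google/gemma-3-1b-it")
tokenizer = AutoTokenizer.from_pretrained("google/gemma-3-1b-it")

# Wrap the base model with the fine-tuned adapter weights.
model = PeftModel.from_pretrained(base, "rbelanec/train_cb_1745950309")
model.eval()
```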

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
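
The card does not specify the dataset source, but "cb" most likely refers to CommitmentBank from SuperGLUE. Under that assumption, a sketch of loading it with the datasets library:

```python
# Hedged sketch: assumes "cb" is the CommitmentBank config of SuperGLUE,
# which the card itself does not confirm.
from datasets import load_dataset

cb = load_dataset("super_glue", "cb")
print(cb["train"][0])  # fields: premise, hypothesis, idx, label
```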

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.3
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 123
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 4
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • training_steps: 40000
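
The training script itself is not included in this card, so the following is only a hedged reconstruction of the listed values as Hugging Face TrainingArguments; output_dir is a placeholder and anything not listed above is left at its default:

```python
# Hedged reconstruction of the recorded hyperparameters; not the original script.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_cb_1745950309",  # placeholder
    learning_rate=0.3,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=123,
    gradient_accumulation_steps=2,     # effective total train batch size: 4
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    max_steps=40000,                   # "training_steps" in the list above
)
```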

Training results

The evaluation loss reported above (0.1622) corresponds to the checkpoint at step 1000. From around step 2600 onward the training loss collapses to 0.0 while the validation loss trends steadily upward, so the later checkpoints are heavily overfit.

Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen
0.388 3.5133 200 0.2614 114504
0.1626 7.0177 400 0.2616 228504
0.1683 10.5310 600 0.1827 341136
0.1169 14.0354 800 0.3692 455488
0.2206 17.5487 1000 0.1622 569504
0.0923 21.0531 1200 0.3019 682024
0.0482 24.5664 1400 0.2098 796328
0.1297 28.0708 1600 0.1715 909320
0.0484 31.5841 1800 0.2892 1023696
0.1897 35.0885 2000 0.3683 1137280
0.0221 38.6018 2200 0.2030 1251592
0.0254 42.1062 2400 0.3250 1364312
0.0001 45.6195 2600 0.3399 1478704
0.0001 49.1239 2800 0.3453 1591424
0.0 52.6372 3000 0.3432 1705000
0.0 56.1416 3200 0.3670 1818688
0.0 59.6549 3400 0.3640 1932248
0.0 63.1593 3600 0.3626 2045464
0.0 66.6726 3800 0.3678 2159128
0.0 70.1770 4000 0.3676 2272792
0.0 73.6903 4200 0.3672 2387344
0.0 77.1947 4400 0.3837 2500160
0.0 80.7080 4600 0.3871 2614032
0.0 84.2124 4800 0.3889 2728488
0.0 87.7257 5000 0.3831 2842656
0.0 91.2301 5200 0.3844 2956824
0.0 94.7434 5400 0.3786 3069840
0.0 98.2478 5600 0.3956 3183600
0.0 101.7611 5800 0.3871 3297896
0.0 105.2655 6000 0.3919 3411544
0.0 108.7788 6200 0.4034 3525472
0.0 112.2832 6400 0.4128 3638584
0.0 115.7965 6600 0.4014 3752608
0.0 119.3009 6800 0.4136 3865376
0.0 122.8142 7000 0.4074 3979464
0.0 126.3186 7200 0.4088 4093296
0.0 129.8319 7400 0.3879 4207120
0.0 133.3363 7600 0.4211 4320568
0.0 136.8496 7800 0.4221 4434056
0.0 140.3540 8000 0.4260 4547840
0.0 143.8673 8200 0.4357 4662192
0.0 147.3717 8400 0.4341 4774160
0.0 150.8850 8600 0.4432 4887640
0.0 154.3894 8800 0.4430 5002864
0.0 157.9027 9000 0.4386 5116216
0.0 161.4071 9200 0.4497 5229496
0.0 164.9204 9400 0.4417 5343528
0.0 168.4248 9600 0.4428 5455520
0.0 171.9381 9800 0.4645 5571144
0.0 175.4425 10000 0.4616 5684752
0.0 178.9558 10200 0.4657 5799088
0.0 182.4602 10400 0.4708 5911888
0.0 185.9735 10600 0.4736 6025544
0.0 189.4779 10800 0.4731 6139264
0.0 192.9912 11000 0.4757 6252832
0.0 196.4956 11200 0.4847 6366440
0.0 200.0 11400 0.4812 6478776
0.0 203.5133 11600 0.4872 6592280
0.0 207.0177 11800 0.4893 6704968
0.0 210.5310 12000 0.4962 6819568
0.0 214.0354 12200 0.4929 6933264
0.0 217.5487 12400 0.4958 7045688
0.0 221.0531 12600 0.5161 7159888
0.0 224.5664 12800 0.5078 7274296
0.0 228.0708 13000 0.5155 7387544
0.0 231.5841 13200 0.5224 7500200
0.0 235.0885 13400 0.5161 7614696
0.0 238.6018 13600 0.5258 7727608
0.0 242.1062 13800 0.5333 7840696
0.0 245.6195 14000 0.5420 7954632
0.0 249.1239 14200 0.5439 8068648
0.0 252.6372 14400 0.5527 8181840
0.0 256.1416 14600 0.5454 8294896
0.0 259.6549 14800 0.5406 8408512
0.0 263.1593 15000 0.5593 8522664
0.0 266.6726 15200 0.5593 8636032
0.0 270.1770 15400 0.5581 8748624
0.0 273.6903 15600 0.5558 8863248
0.0 277.1947 15800 0.5764 8976424
0.0 280.7080 16000 0.5791 9088984
0.0 284.2124 16200 0.5732 9204128
0.0 287.7257 16400 0.5830 9317208
0.0 291.2301 16600 0.5897 9431208
0.0 294.7434 16800 0.5792 9544328
0.0 298.2478 17000 0.5745 9657432
0.0 301.7611 17200 0.5907 9770824
0.0 305.2655 17400 0.5804 9884648
0.0 308.7788 17600 0.6051 9997288
0.0 312.2832 17800 0.5829 10111472
0.0 315.7965 18000 0.6008 10223648
0.0 319.3009 18200 0.6038 10336864
0.0 322.8142 18400 0.6034 10450688
0.0 326.3186 18600 0.6052 10563128
0.0 329.8319 18800 0.6043 10677928
0.0 333.3363 19000 0.6169 10790896
0.0 336.8496 19200 0.6139 10904600
0.0 340.3540 19400 0.6213 11018112
0.0 343.8673 19600 0.6173 11131712
0.0 347.3717 19800 0.6094 11245728
0.0 350.8850 20000 0.6229 11358800
0.0 354.3894 20200 0.6238 11471832
0.0 357.9027 20400 0.6088 11586368
0.0 361.4071 20600 0.6196 11700176
0.0 364.9204 20800 0.6083 11814304
0.0 368.4248 21000 0.6165 11927464
0.0 371.9381 21200 0.6383 12041416
0.0 375.4425 21400 0.6364 12153176
0.0 378.9558 21600 0.6361 12267984
0.0 382.4602 21800 0.6420 12381424
0.0 385.9735 22000 0.6455 12494280
0.0 389.4779 22200 0.6498 12608008
0.0 392.9912 22400 0.6348 12721456
0.0 396.4956 22600 0.6526 12835240
0.0 400.0 22800 0.6556 12948416
0.0 403.5133 23000 0.6734 13061472
0.0 407.0177 23200 0.6624 13175888
0.0 410.5310 23400 0.6691 13289752
0.0 414.0354 23600 0.6631 13403848
0.0 417.5487 23800 0.6665 13518496
0.0 421.0531 24000 0.6828 13631704
0.0 424.5664 24200 0.6716 13745200
0.0 428.0708 24400 0.6814 13859752
0.0 431.5841 24600 0.7037 13972648
0.0 435.0885 24800 0.7048 14086360
0.0 438.6018 25000 0.7057 14201656
0.0 442.1062 25200 0.7231 14314736
0.0 445.6195 25400 0.7107 14428104
0.0 449.1239 25600 0.7242 14541136
0.0 452.6372 25800 0.7249 14655696
0.0 456.1416 26000 0.7227 14768168
0.0 459.6549 26200 0.7371 14882048
0.0 463.1593 26400 0.7550 14996008
0.0 466.6726 26600 0.7279 15109352
0.0 470.1770 26800 0.7389 15223592
0.0 473.6903 27000 0.7722 15338072
0.0 477.1947 27200 0.7546 15451312
0.0 480.7080 27400 0.7864 15565784
0.0 484.2124 27600 0.7918 15679720
0.0 487.7257 27800 0.8236 15792680
0.0 491.2301 28000 0.8143 15906624
0.0 494.7434 28200 0.8141 16019936
0.0 498.2478 28400 0.8328 16133784
0.0 501.7611 28600 0.8181 16248200
0.0 505.2655 28800 0.8467 16361560
0.0 508.7788 29000 0.8443 16475624
0.0 512.2832 29200 0.8483 16588984
0.0 515.7965 29400 0.8532 16702496
0.0 519.3009 29600 0.8267 16816272
0.0 522.8142 29800 0.8347 16929072
0.0 526.3186 30000 0.8453 17043120
0.0 529.8319 30200 0.8594 17156344
0.0 533.3363 30400 0.8480 17268656
0.0 536.8496 30600 0.8396 17383696
0.0 540.3540 30800 0.8238 17495648
0.0 543.8673 31000 0.8566 17609616
0.0 547.3717 31200 0.8399 17723600
0.0 550.8850 31400 0.8477 17836576
0.0 554.3894 31600 0.8596 17949928
0.0 557.9027 31800 0.8298 18064576
0.0 561.4071 32000 0.8073 18177096
0.0 564.9204 32200 0.8399 18290608
0.0 568.4248 32400 0.8064 18404648
0.0 571.9381 32600 0.8301 18517216
0.0 575.4425 32800 0.8266 18631296
0.0 578.9558 33000 0.8064 18745416
0.0 582.4602 33200 0.8053 18857896
0.0 585.9735 33400 0.8251 18971344
0.0 589.4779 33600 0.8043 19085248
0.0 592.9912 33800 0.8018 19199136
0.0 596.4956 34000 0.8218 19311344
0.0 600.0 34200 0.8327 19425472
0.0 603.5133 34400 0.7973 19539112
0.0 607.0177 34600 0.7865 19652392
0.0 610.5310 34800 0.8027 19766904
0.0 614.0354 35000 0.8159 19879808
0.0 617.5487 35200 0.8088 19993952
0.0 621.0531 35400 0.8045 20107560
0.0 624.5664 35600 0.8094 20220888
0.0 628.0708 35800 0.8172 20333904
0.0 631.5841 36000 0.7859 20446736
0.0 635.0885 36200 0.8262 20560472
0.0 638.6018 36400 0.7735 20673984
0.0 642.1062 36600 0.8092 20786240
0.0 645.6195 36800 0.8278 20899128
0.0 649.1239 37000 0.8118 21011928
0.0 652.6372 37200 0.7886 21126880
0.0 656.1416 37400 0.7863 21239760
0.0 659.6549 37600 0.7899 21353776
0.0 663.1593 37800 0.7848 21467368
0.0 666.6726 38000 0.7814 21581512
0.0 670.1770 38200 0.7832 21694376
0.0 673.6903 38400 0.7953 21808568
0.0 677.1947 38600 0.8041 21922424
0.0 680.7080 38800 0.8144 22036600
0.0 684.2124 39000 0.7954 22150992
0.0 687.7257 39200 0.7884 22263616
0.0 691.2301 39400 0.7761 22377936
0.0 694.7434 39600 0.7777 22490328
0.0 698.2478 39800 0.7950 22604096
0.0 701.7611 40000 0.8050 22718312

Framework versions

  • PEFT 0.15.2.dev0
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1