train_cb_1745950310

This model is a PEFT adapter fine-tuned from google/gemma-3-1b-it on the cb dataset (a minimal loading sketch follows the results list). It achieves the following results on the evaluation set:

  • Loss: 0.2460
  • Num Input Tokens Seen: 22718312
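
A hedged sketch for loading this adapter with PEFT is below. The repo ids come from this card; the prompt string is a placeholder, since the card does not document how cb examples were serialized during training, and the base model is a gated repo that requires license acceptance.

```python
# Hedged sketch: load this adapter on top of its base model with PEFT.
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM

adapter_id = "rbelanec/train_cb_1745950310"

# AutoPeftModelForCausalLM resolves the base model (google/gemma-3-1b-it)
# from the adapter config and applies the adapter weights on top of it.
model = AutoPeftModelForCausalLM.from_pretrained(adapter_id)
tokenizer = AutoTokenizer.from_pretrained("google/gemma-3-1b-it")

prompt = "..."  # placeholder: use the same serialization as training
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```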

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (an equivalent TrainingArguments sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 123
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 4
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • training_steps: 40000
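
A hedged sketch of how these values map onto transformers' TrainingArguments (argument names follow transformers 4.51.x); output_dir is an assumption, and dataset loading, PEFT setup, and the Trainer itself are omitted:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_cb_1745950310",  # assumed; matches the model name
    learning_rate=5e-05,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=123,
    gradient_accumulation_steps=2,     # total train batch size: 2 * 2 = 4
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    max_steps=40_000,                  # training_steps above
)
```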

Training results

Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen
0.1231 3.5133 200 0.2983 114504
0.0 7.0177 400 0.2772 228504
0.0 10.5310 600 0.2460 341136
0.0 14.0354 800 0.3245 455488
0.0 17.5487 1000 0.3226 569504
0.0 21.0531 1200 0.3285 682024
0.0 24.5664 1400 0.3362 796328
0.0 28.0708 1600 0.3420 909320
0.0 31.5841 1800 0.3482 1023696
0.0 35.0885 2000 0.3368 1137280
0.0 38.6018 2200 0.3498 1251592
0.0 42.1062 2400 0.3494 1364312
0.0 45.6195 2600 0.3590 1478704
0.0 49.1239 2800 0.3601 1591424
0.0 52.6372 3000 0.3583 1705000
0.0 56.1416 3200 0.3588 1818688
0.0 59.6549 3400 0.3548 1932248
0.0 63.1593 3600 0.3601 2045464
0.0 66.6726 3800 0.3658 2159128
0.0 70.1770 4000 0.3729 2272792
0.0 73.6903 4200 0.3782 2387344
0.0 77.1947 4400 0.3814 2500160
0.0 80.7080 4600 0.3626 2614032
0.0 84.2124 4800 0.3679 2728488
0.0 87.7257 5000 0.3792 2842656
0.0 91.2301 5200 0.3791 2956824
0.0 94.7434 5400 0.4004 3069840
0.0 98.2478 5600 0.3897 3183600
0.0 101.7611 5800 0.3824 3297896
0.0 105.2655 6000 0.3835 3411544
0.0 108.7788 6200 0.3907 3525472
0.0 112.2832 6400 0.4030 3638584
0.0 115.7965 6600 0.4009 3752608
0.0 119.3009 6800 0.4006 3865376
0.0 122.8142 7000 0.4033 3979464
0.0 126.3186 7200 0.4094 4093296
0.0 129.8319 7400 0.4080 4207120
0.0 133.3363 7600 0.4074 4320568
0.0 136.8496 7800 0.4120 4434056
0.0 140.3540 8000 0.4256 4547840
0.0 143.8673 8200 0.4117 4662192
0.0 147.3717 8400 0.4215 4774160
0.0 150.8850 8600 0.4241 4887640
0.0 154.3894 8800 0.4225 5002864
0.0 157.9027 9000 0.4309 5116216
0.0 161.4071 9200 0.4269 5229496
0.0 164.9204 9400 0.4272 5343528
0.0 168.4248 9600 0.4281 5455520
0.0 171.9381 9800 0.4237 5571144
0.0 175.4425 10000 0.4401 5684752
0.0 178.9558 10200 0.4291 5799088
0.0 182.4602 10400 0.4354 5911888
0.0 185.9735 10600 0.4433 6025544
0.0 189.4779 10800 0.4493 6139264
0.0 192.9912 11000 0.4488 6252832
0.0 196.4956 11200 0.4484 6366440
0.0 200.0 11400 0.4492 6478776
0.0 203.5133 11600 0.4521 6592280
0.0 207.0177 11800 0.4557 6704968
0.0 210.5310 12000 0.4463 6819568
0.0 214.0354 12200 0.4519 6933264
0.0 217.5487 12400 0.4537 7045688
0.0 221.0531 12600 0.4610 7159888
0.0 224.5664 12800 0.4564 7274296
0.0 228.0708 13000 0.4594 7387544
0.0 231.5841 13200 0.4661 7500200
0.0 235.0885 13400 0.4695 7614696
0.0 238.6018 13600 0.4755 7727608
0.0 242.1062 13800 0.4837 7840696
0.0 245.6195 14000 0.4702 7954632
0.0 249.1239 14200 0.4909 8068648
0.0 252.6372 14400 0.4822 8181840
0.0 256.1416 14600 0.4791 8294896
0.0 259.6549 14800 0.4915 8408512
0.0 263.1593 15000 0.4854 8522664
0.0 266.6726 15200 0.5012 8636032
0.0 270.1770 15400 0.5022 8748624
0.0 273.6903 15600 0.5095 8863248
0.0 277.1947 15800 0.5141 8976424
0.0 280.7080 16000 0.5122 9088984
0.0 284.2124 16200 0.5215 9204128
0.0 287.7257 16400 0.5182 9317208
0.0 291.2301 16600 0.5424 9431208
0.0 294.7434 16800 0.5420 9544328
0.0 298.2478 17000 0.5455 9657432
0.0 301.7611 17200 0.5556 9770824
0.0 305.2655 17400 0.5646 9884648
0.0 308.7788 17600 0.5576 9997288
0.0 312.2832 17800 0.5532 10111472
0.0 315.7965 18000 0.5568 10223648
0.0 319.3009 18200 0.5883 10336864
0.0 322.8142 18400 0.5703 10450688
0.0 326.3186 18600 0.5664 10563128
0.0 329.8319 18800 0.5949 10677928
0.0 333.3363 19000 0.5918 10790896
0.0 336.8496 19200 0.5862 10904600
0.0 340.3540 19400 0.5627 11018112
0.0 343.8673 19600 0.6012 11131712
0.0 347.3717 19800 0.5383 11245728
0.0 350.8850 20000 0.5387 11358800
0.0 354.3894 20200 0.5425 11471832
0.0 357.9027 20400 0.5417 11586368
0.0 361.4071 20600 0.5680 11700176
0.0 364.9204 20800 0.5215 11814304
0.0 368.4248 21000 0.5595 11927464
0.0 371.9381 21200 0.5175 12041416
0.0 375.4425 21400 0.5527 12153176
0.0 378.9558 21600 0.5344 12267984
0.0 382.4602 21800 0.5042 12381424
0.0 385.9735 22000 0.5430 12494280
0.0 389.4779 22200 0.5208 12608008
0.0 392.9912 22400 0.5807 12721456
0.0 396.4956 22600 0.5171 12835240
0.0 400.0 22800 0.5288 12948416
0.0 403.5133 23000 0.5604 13061472
0.0 407.0177 23200 0.5698 13175888
0.0 410.5310 23400 0.5086 13289752
0.0 414.0354 23600 0.4858 13403848
0.0 417.5487 23800 0.5353 13518496
0.0 421.0531 24000 0.4958 13631704
0.0 424.5664 24200 0.4936 13745200
0.0 428.0708 24400 0.5261 13859752
0.0 431.5841 24600 0.5022 13972648
0.0 435.0885 24800 0.5777 14086360
0.0 438.6018 25000 0.5152 14201656
0.0 442.1062 25200 0.5149 14314736
0.0 445.6195 25400 0.5318 14428104
0.0 449.1239 25600 0.4894 14541136
0.0 452.6372 25800 0.5164 14655696
0.0 456.1416 26000 0.5153 14768168
0.0 459.6549 26200 0.5005 14882048
0.0 463.1593 26400 0.5168 14996008
0.139 466.6726 26600 0.8271 15109352
0.0 470.1770 26800 0.9104 15223592
0.0 473.6903 27000 0.9009 15338072
0.0 477.1947 27200 0.9213 15451312
0.0 480.7080 27400 0.9220 15565784
0.0 484.2124 27600 0.9057 15679720
0.0 487.7257 27800 0.9155 15792680
0.0 491.2301 28000 0.9253 15906624
0.0 494.7434 28200 0.9103 16019936
0.0 498.2478 28400 0.9245 16133784
0.0 501.7611 28600 0.8963 16248200
0.0 505.2655 28800 0.9024 16361560
0.0 508.7788 29000 0.9256 16475624
0.0 512.2832 29200 0.9239 16588984
0.0 515.7965 29400 0.9102 16702496
0.0 519.3009 29600 0.9128 16816272
0.0 522.8142 29800 0.9139 16929072
0.0 526.3186 30000 0.9153 17043120
0.0 529.8319 30200 0.9343 17156344
0.0 533.3363 30400 0.9051 17268656
0.0 536.8496 30600 0.9375 17383696
0.0 540.3540 30800 0.9452 17495648
0.0 543.8673 31000 0.9113 17609616
0.0 547.3717 31200 0.9103 17723600
0.0 550.8850 31400 0.8986 17836576
0.0 554.3894 31600 0.8948 17949928
0.0 557.9027 31800 0.9036 18064576
0.0 561.4071 32000 0.9059 18177096
0.0 564.9204 32200 0.9259 18290608
0.0 568.4248 32400 0.9182 18404648
0.0 571.9381 32600 0.9214 18517216
0.0 575.4425 32800 0.9142 18631296
0.0 578.9558 33000 0.9106 18745416
0.0 582.4602 33200 0.9187 18857896
0.0 585.9735 33400 0.9218 18971344
0.0 589.4779 33600 0.9236 19085248
0.0 592.9912 33800 0.9061 19199136
0.0 596.4956 34000 0.8945 19311344
0.0 600.0 34200 0.8979 19425472
0.0 603.5133 34400 0.9250 19539112
0.0 607.0177 34600 0.9027 19652392
0.0 610.5310 34800 0.9087 19766904
0.0 614.0354 35000 0.8934 19879808
0.0 617.5487 35200 0.9040 19993952
0.0 621.0531 35400 0.9110 20107560
0.0 624.5664 35600 0.8989 20220888
0.0 628.0708 35800 0.9270 20333904
0.0 631.5841 36000 0.8909 20446736
0.0 635.0885 36200 0.9129 20560472
0.0 638.6018 36400 0.8988 20673984
0.0 642.1062 36600 0.8977 20786240
0.0 645.6195 36800 0.8956 20899128
0.0 649.1239 37000 0.9297 21011928
0.0 652.6372 37200 0.8970 21126880
0.0 656.1416 37400 0.9159 21239760
0.0 659.6549 37600 0.9120 21353776
0.0 663.1593 37800 0.8969 21467368
0.0 666.6726 38000 0.8925 21581512
0.0 670.1770 38200 0.8996 21694376
0.0 673.6903 38400 0.8811 21808568
0.0 677.1947 38600 0.9198 21922424
0.0 680.7080 38800 0.9037 22036600
0.0 684.2124 39000 0.8997 22150992
0.0 687.7257 39200 0.9019 22263616
0.0 691.2301 39400 0.8945 22377936
0.0 694.7434 39600 0.9180 22490328
0.0 698.2478 39800 0.9090 22604096
0.0 701.7611 40000 0.9120 22718312
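
Note that the evaluation loss reported at the top of this card (0.2460) corresponds to the best checkpoint, at step 600; after that, the training loss sits at 0.0 while validation loss climbs toward roughly 0.9, a pattern consistent with heavy overfitting on the small cb dataset, so the step-600 checkpoint is likely the one worth using.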

Framework versions

  • PEFT 0.15.2.dev0
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1