train_copa_1745950324

This model is a PEFT adapter fine-tuned from google/gemma-3-1b-it on the COPA dataset. It achieves the following results on the evaluation set:

  • Loss: 6.9884
  • Num Input Tokens Seen: 11200800
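
Since the repository contains a PEFT adapter rather than full model weights, inference requires attaching the adapter to the base google/gemma-3-1b-it checkpoint. A minimal sketch, assuming the adapter is hosted at rbelanec/train_copa_1745950324 (the repository this card belongs to):

```python
# Sketch: attach the fine-tuned adapter to the base Gemma checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "google/gemma-3-1b-it"
adapter_id = "rbelanec/train_copa_1745950324"  # this repository

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id)

# PeftModel.from_pretrained downloads the adapter weights and wires them
# into the corresponding layers of the base model.
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()
```

From here, model.generate works as with any causal language model.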

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
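
The card itself gives no details, but COPA (Choice of Plausible Alternatives) is commonly loaded from the SuperGLUE benchmark. A hedged sketch; the actual release, split, and preprocessing used for this run are not documented:

```python
# Sketch: load COPA from the SuperGLUE benchmark with the datasets library.
# Which release/split this run actually used is not stated in the card.
from datasets import load_dataset

copa = load_dataset("super_glue", "copa")
example = copa["train"][0]
# Each example has a premise, two candidate alternatives, a question type
# ("cause" or "effect"), and the index of the correct choice.
print(example["premise"], example["choice1"], example["choice2"],
      example["question"], example["label"])
```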

Training procedure

Training hyperparameters

The following hyperparameters were used during training (an illustrative TrainingArguments equivalent follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 123
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 4
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • training_steps: 40000
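
As a reference, a sketch of the transformers.TrainingArguments that would reproduce the hyperparameters above; the output directory and all unlisted settings are assumptions:

```python
# Sketch: TrainingArguments matching the listed hyperparameters.
# Only the values above are grounded in the card; the rest are defaults.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_copa_1745950324",  # assumed name
    learning_rate=5e-05,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=123,
    gradient_accumulation_steps=2,       # 2 x 2 = total train batch size 4
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    max_steps=40000,
)
```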

Training results

The reported evaluation loss of 6.9884 corresponds to the best checkpoint, reached at step 6,000; from step 15,200 onward the validation loss is flat at 7.1093, so the metric had plateaued long before the 40,000-step budget was exhausted.

Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen
8.0443 2.2222 200 7.2653 56176
7.9919 4.4444 400 7.1376 112064
7.2378 6.6667 600 7.1201 168112
7.2996 8.8889 800 7.1118 224048
7.6428 11.1111 1000 7.0621 279904
7.0353 13.3333 1200 7.1229 336032
6.7378 15.5556 1400 7.1355 391904
7.2576 17.7778 1600 7.1211 448112
6.5899 20.0 1800 7.0962 503920
7.5206 22.2222 2000 7.1011 560016
7.8536 24.4444 2200 7.0675 615952
7.0679 26.6667 2400 7.0583 672080
6.7812 28.8889 2600 7.0307 728000
6.7714 31.1111 2800 7.1189 783872
6.769 33.3333 3000 7.1021 839808
7.1798 35.5556 3200 7.0784 896064
7.0755 37.7778 3400 7.0695 951888
7.2945 40.0 3600 7.0575 1007760
7.9462 42.2222 3800 7.0405 1063648
6.9123 44.4444 4000 7.0297 1119744
6.8814 46.6667 4200 7.0303 1175680
7.1626 48.8889 4400 7.0410 1231696
7.2839 51.1111 4600 7.0262 1287744
6.9713 53.3333 4800 7.0311 1343760
7.0528 55.5556 5000 7.0483 1399856
7.1798 57.7778 5200 7.0414 1455808
7.8952 60.0 5400 7.1052 1511856
7.2479 62.2222 5600 7.0172 1567808
7.5761 64.4444 5800 7.0250 1623744
6.743 66.6667 6000 6.9884 1679888
6.5401 68.8889 6200 6.9972 1735952
6.4334 71.1111 6400 7.0941 1791904
7.9049 73.3333 6600 7.0641 1847888
7.0653 75.5556 6800 7.0148 1904000
7.8385 77.7778 7000 7.0460 1959920
7.4646 80.0 7200 7.0653 2015792
7.4211 82.2222 7400 7.0568 2071808
7.1189 84.4444 7600 7.0633 2127808
7.4384 86.6667 7800 7.0779 2183888
7.0618 88.8889 8000 7.0952 2239840
7.2709 91.1111 8200 7.0645 2295888
8.1774 93.3333 8400 7.0724 2351872
6.6601 95.5556 8600 7.1252 2407824
7.2864 97.7778 8800 7.1117 2463744
7.7047 100.0 9000 7.0918 2519680
7.5297 102.2222 9200 7.0915 2575584
7.821 104.4444 9400 7.0944 2631680
7.3182 106.6667 9600 7.1092 2687728
6.858 108.8889 9800 7.0975 2743792
6.9811 111.1111 10000 7.1012 2799840
7.2245 113.3333 10200 7.0897 2855808
7.4336 115.5556 10400 7.0896 2911648
7.1664 117.7778 10600 7.1065 2967856
7.8065 120.0 10800 7.0927 3023792
7.9499 122.2222 11000 7.0762 3079920
7.0617 124.4444 11200 7.0748 3135904
7.6056 126.6667 11400 7.0930 3191808
6.6297 128.8889 11600 7.1012 3247840
7.6139 131.1111 11800 7.1025 3303712
6.7219 133.3333 12000 7.1018 3359680
7.1882 135.5556 12200 7.1178 3415824
7.5924 137.7778 12400 7.1178 3471520
8.07 140.0 12600 7.1220 3527664
7.3366 142.2222 12800 7.1181 3583696
7.0881 144.4444 13000 7.1181 3639680
7.4198 146.6667 13200 7.1095 3695712
7.2589 148.8889 13400 7.1095 3751728
7.1645 151.1111 13600 7.1095 3807744
7.5685 153.3333 13800 7.1090 3863664
7.3565 155.5556 14000 7.1090 3919584
7.405 157.7778 14200 7.1078 3975568
7.2451 160.0 14400 7.1078 4031632
7.2936 162.2222 14600 7.1078 4087632
7.2314 164.4444 14800 7.1078 4143664
6.9781 166.6667 15000 7.1046 4199552
7.1826 168.8889 15200 7.1093 4255584
7.2148 171.1111 15400 7.1093 4311504
6.9793 173.3333 15600 7.1093 4367408
7.4752 175.5556 15800 7.1093 4423376
7.6735 177.7778 16000 7.1093 4479456
7.8021 180.0 16200 7.1093 4535504
7.1948 182.2222 16400 7.1093 4591504
7.8415 184.4444 16600 7.1093 4647424
7.4026 186.6667 16800 7.1093 4703376
7.1809 188.8889 17000 7.1093 4759552
7.2958 191.1111 17200 7.1093 4815552
7.5816 193.3333 17400 7.1093 4871600
7.5369 195.5556 17600 7.1093 4927696
7.3277 197.7778 17800 7.1093 4983424
7.5088 200.0 18000 7.1093 5039536
7.335 202.2222 18200 7.1093 5095376
7.4595 204.4444 18400 7.1093 5151440
7.576 206.6667 18600 7.1093 5207488
7.3778 208.8889 18800 7.1093 5263360
7.2081 211.1111 19000 7.1093 5319344
7.2045 213.3333 19200 7.1093 5375280
7.3599 215.5556 19400 7.1093 5431520
6.3584 217.7778 19600 7.1093 5487472
7.9802 220.0 19800 7.1093 5543504
7.5099 222.2222 20000 7.1093 5599440
7.3845 224.4444 20200 7.1093 5655424
7.8468 226.6667 20400 7.1093 5711344
6.9991 228.8889 20600 7.1093 5767376
7.3156 231.1111 20800 7.1093 5823264
6.9584 233.3333 21000 7.1093 5879248
7.286 235.5556 21200 7.1093 5935168
6.8662 237.7778 21400 7.1093 5991232
8.0219 240.0 21600 7.1093 6047376
6.8685 242.2222 21800 7.1093 6103328
7.4982 244.4444 22000 7.1093 6159376
6.7313 246.6667 22200 7.1093 6215360
7.3844 248.8889 22400 7.1093 6271232
7.4578 251.1111 22600 7.1093 6327136
6.8552 253.3333 22800 7.1093 6383248
7.1551 255.5556 23000 7.1093 6439168
7.4121 257.7778 23200 7.1093 6495280
7.3986 260.0 23400 7.1093 6551264
7.3824 262.2222 23600 7.1093 6607424
7.18 264.4444 23800 7.1093 6663168
7.9576 266.6667 24000 7.1093 6719216
7.3671 268.8889 24200 7.1093 6775344
7.2866 271.1111 24400 7.1093 6831344
7.0735 273.3333 24600 7.1093 6887344
7.4888 275.5556 24800 7.1093 6943632
7.073 277.7778 25000 7.1093 6999632
7.5027 280.0 25200 7.1093 7055664
7.6028 282.2222 25400 7.1093 7111664
7.136 284.4444 25600 7.1093 7167744
6.6772 286.6667 25800 7.1093 7223696
6.5945 288.8889 26000 7.1093 7279760
7.4694 291.1111 26200 7.1093 7335792
7.1574 293.3333 26400 7.1093 7391808
6.9243 295.5556 26600 7.1093 7447808
8.1054 297.7778 26800 7.1093 7503824
7.6388 300.0 27000 7.1093 7559856
7.2333 302.2222 27200 7.1093 7615904
8.2743 304.4444 27400 7.1093 7672000
7.5157 306.6667 27600 7.1093 7727808
7.0823 308.8889 27800 7.1093 7783744
7.3501 311.1111 28000 7.1093 7839808
7.709 313.3333 28200 7.1093 7895872
7.8197 315.5556 28400 7.1093 7951664
6.783 317.7778 28600 7.1093 8007744
7.4538 320.0 28800 7.1093 8063616
7.4345 322.2222 29000 7.1093 8119520
7.3458 324.4444 29200 7.1093 8175584
6.9791 326.6667 29400 7.1093 8231760
7.2433 328.8889 29600 7.1093 8287696
7.3159 331.1111 29800 7.1093 8343760
7.5194 333.3333 30000 7.1093 8399696
7.0194 335.5556 30200 7.1093 8455776
7.2128 337.7778 30400 7.1093 8511760
7.18 340.0 30600 7.1093 8567792
7.3856 342.2222 30800 7.1093 8623728
7.2501 344.4444 31000 7.1093 8679920
7.3079 346.6667 31200 7.1093 8736032
7.0491 348.8889 31400 7.1093 8791888
7.0758 351.1111 31600 7.1093 8847728
7.038 353.3333 31800 7.1093 8903952
6.8764 355.5556 32000 7.1093 8959920
7.5744 357.7778 32200 7.1093 9016096
7.2366 360.0 32400 7.1093 9072192
7.3998 362.2222 32600 7.1093 9128272
7.2568 364.4444 32800 7.1093 9184240
7.4078 366.6667 33000 7.1093 9240064
7.3703 368.8889 33200 7.1093 9295952
7.1993 371.1111 33400 7.1093 9352016
7.1025 373.3333 33600 7.1093 9407968
7.1228 375.5556 33800 7.1093 9463920
6.7341 377.7778 34000 7.1093 9519984
7.6929 380.0 34200 7.1093 9575936
7.3021 382.2222 34400 7.1093 9631952
7.5264 384.4444 34600 7.1093 9687936
7.119 386.6667 34800 7.1093 9743968
7.8449 388.8889 35000 7.1093 9800016
7.3523 391.1111 35200 7.1093 9856016
7.3693 393.3333 35400 7.1093 9912112
7.1478 395.5556 35600 7.1093 9968112
6.9809 397.7778 35800 7.1093 10024160
7.9866 400.0 36000 7.1093 10080240
7.8767 402.2222 36200 7.1093 10136208
6.9538 404.4444 36400 7.1093 10192208
7.0258 406.6667 36600 7.1093 10248192
7.3948 408.8889 36800 7.1093 10304144
7.2103 411.1111 37000 7.1093 10360192
7.448 413.3333 37200 7.1093 10416288
7.3405 415.5556 37400 7.1093 10472368
7.207 417.7778 37600 7.1093 10528352
6.8479 420.0 37800 7.1093 10584384
7.8582 422.2222 38000 7.1093 10640496
7.4546 424.4444 38200 7.1093 10696528
8.3212 426.6667 38400 7.1093 10752640
7.0448 428.8889 38600 7.1093 10808672
8.2754 431.1111 38800 7.1093 10864512
7.447 433.3333 39000 7.1093 10920608
7.231 435.5556 39200 7.1093 10976624
7.0236 437.7778 39400 7.1093 11032608
7.4742 440.0 39600 7.1093 11088720
7.6562 442.2222 39800 7.1093 11144688
6.7836 444.4444 40000 7.1093 11200800

Framework versions

  • PEFT 0.15.2.dev0
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1