smollm2-lora-hrz1ceio-1742335328

This model is a LoRA adapter (trained with PEFT) fine-tuned from HuggingFaceTB/SmolLM2-1.7B-Instruct on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 2.4598
  • Perplexity: 11.7017
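
The reported perplexity is simply the exponentiated evaluation loss, so the two figures agree up to rounding of the reported loss. A quick check in Python:

```python
import math

eval_loss = 2.4598
print(math.exp(eval_loss))  # ~11.70, matching the reported perplexity
```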

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-06
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 16
  • optimizer: adamw_torch (AdamW) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.01
  • num_epochs: 50
  • mixed_precision_training: Native AMP
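
For orientation, below is a minimal sketch of how these hyperparameters map onto a transformers/PEFT training setup. It is a reconstruction under stated assumptions, not the author's script: the LoRA settings (r, lora_alpha, target_modules) and the output directory are hypothetical, since the card does not record them, and the dataset is unknown, so the final Trainer call is left as a comment.

```python
# Reconstruction sketch of the training configuration implied by the
# hyperparameters above -- not the author's actual script. Requires a
# CUDA device (fp16 mixed precision).
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments

base_id = "HuggingFaceTB/SmolLM2-1.7B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)

# Hypothetical LoRA adapter configuration (values assumed, not from the card).
peft_config = LoraConfig(
    task_type="CAUSAL_LM",
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
)
model = get_peft_model(model, peft_config)

# Hyperparameters as listed in this card.
args = TrainingArguments(
    output_dir="smollm2-lora",       # hypothetical output directory
    learning_rate=5e-06,
    per_device_train_batch_size=1,   # train_batch_size: 1
    per_device_eval_batch_size=1,    # eval_batch_size: 1
    gradient_accumulation_steps=16,  # total_train_batch_size: 1 * 16 = 16
    num_train_epochs=50,
    lr_scheduler_type="linear",
    warmup_ratio=0.01,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    seed=42,
    fp16=True,  # "Native AMP" mixed-precision training
)

# A Trainer would wire this together with the (unknown) datasets:
# Trainer(model=model, args=args, train_dataset=..., eval_dataset=...).train()
```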

Training results

| Training Loss | Epoch | Step | Validation Loss | Perplexity |
|:-------------:|:-----:|:----:|:---------------:|:----------:|
| 3.8247 | 1.48 | 10 | 3.8893 | 48.8773 |
| 3.7821 | 2.96 | 20 | 3.8282 | 45.9783 |
| 3.7187 | 4.32 | 30 | 3.7582 | 42.8707 |
| 3.6627 | 5.8 | 40 | 3.6850 | 39.8460 |
| 3.5609 | 7.16 | 50 | 3.6080 | 36.8919 |
| 3.4883 | 8.64 | 60 | 3.5351 | 34.2986 |
| 3.3997 | 10.0 | 70 | 3.4620 | 31.8798 |
| 3.3753 | 11.48 | 80 | 3.3897 | 29.6568 |
| 3.249 | 12.96 | 90 | 3.3187 | 27.6251 |
| 3.2619 | 14.32 | 100 | 3.2492 | 25.7684 |
| 3.1184 | 15.8 | 110 | 3.1800 | 24.0468 |
| 3.0852 | 17.16 | 120 | 3.1133 | 22.4950 |
| 2.9999 | 18.64 | 130 | 3.0489 | 21.0926 |
| 2.9315 | 20.0 | 140 | 2.9865 | 19.8159 |
| 2.848 | 21.48 | 150 | 2.9260 | 18.6519 |
| 2.8046 | 22.96 | 160 | 2.8679 | 17.6005 |
| 2.7379 | 24.32 | 170 | 2.8153 | 16.6979 |
| 2.704 | 25.8 | 180 | 2.7637 | 15.8585 |
| 2.6349 | 27.16 | 190 | 2.7168 | 15.1316 |
| 2.5972 | 28.64 | 200 | 2.6749 | 14.5109 |
| 2.5585 | 30.0 | 210 | 2.6327 | 13.9107 |
| 2.5502 | 31.48 | 220 | 2.5979 | 13.4359 |
| 2.5166 | 32.96 | 230 | 2.5670 | 13.0264 |
| 2.4733 | 34.32 | 240 | 2.5395 | 12.6737 |
| 2.4502 | 35.8 | 250 | 2.5152 | 12.3691 |
| 2.4268 | 37.16 | 260 | 2.4965 | 12.1393 |
| 2.365 | 38.64 | 270 | 2.4808 | 11.9507 |
| 2.4208 | 40.0 | 280 | 2.4707 | 11.8304 |
| 2.3818 | 41.48 | 290 | 2.4623 | 11.7311 |
| 2.391 | 42.96 | 300 | 2.4598 | 11.7017 |

Framework versions

  • PEFT 0.14.0
  • Transformers 4.48.2
  • PyTorch 2.1.0+cu118
  • Datasets 3.4.1
  • Tokenizers 0.21.1
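
Usage example

A minimal inference sketch using the versions above. The adapter id below is a placeholder; substitute the actual Hub id or local path of this adapter's weights:

```python
# pip install "transformers==4.48.2" "peft==0.14.0" torch
# Sketch: apply this LoRA adapter to the base model for generation.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "HuggingFaceTB/SmolLM2-1.7B-Instruct"
adapter_id = "path/to/this-adapter"  # placeholder: substitute the adapter's location

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()

inputs = tokenizer("Hello!", return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```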