creation-v1

This model is a fine-tuned version of Qwen/Qwen3-0.6B on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 3.2783
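Since this is a causal language model, the reported loss is a mean per-token cross-entropy (in nats), so the corresponding perplexity is simply its exponential. A quick check, assuming that interpretation of the loss:

```python
import math

eval_loss = 3.2783  # final validation loss reported above
perplexity = math.exp(eval_loss)
print(f"perplexity ≈ {perplexity:.2f}")  # ≈ 26.53
```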

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine_with_restarts
  • lr_scheduler_warmup_ratio: 0.15
  • num_epochs: 30
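The training script and dataset are not published with this card, but the hyperparameters above can be expressed as a `transformers` `TrainingArguments` configuration. A minimal sketch (the `output_dir` is a hypothetical placeholder):

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the configuration listed above; the
# actual training script is not included in this repository.
args = TrainingArguments(
    output_dir="creation-v1",        # assumed output path
    learning_rate=1e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    optim="adamw_torch",             # betas=(0.9, 0.999) and eps=1e-08 are the defaults
    lr_scheduler_type="cosine_with_restarts",
    warmup_ratio=0.15,
    num_train_epochs=30,
)
```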

Training results

| Training Loss | Epoch | Step | Validation Loss |
|---------------|-------|------|-----------------|
| No log        | 1.0   | 3    | 4.8549          |
| No log        | 2.0   | 6    | 4.7060          |
| No log        | 3.0   | 9    | 4.4868          |
| 4.3847        | 4.0   | 12   | 4.2240          |
| 4.3847        | 5.0   | 15   | 3.9478          |
| 4.3847        | 6.0   | 18   | 3.6598          |
| 3.1657        | 7.0   | 21   | 3.4035          |
| 3.1657        | 8.0   | 24   | 3.4113          |
| 3.1657        | 9.0   | 27   | 3.2417          |
| 1.6627        | 10.0  | 30   | 3.2186          |
| 1.6627        | 11.0  | 33   | 3.1794          |
| 1.6627        | 12.0  | 36   | 3.1747          |
| 1.6627        | 13.0  | 39   | 3.2023          |
| 0.9535        | 14.0  | 42   | 3.2163          |
| 0.9535        | 15.0  | 45   | 3.2625          |
| 0.9535        | 16.0  | 48   | 3.3318          |
| 0.7181        | 17.0  | 51   | 3.3887          |
| 0.7181        | 18.0  | 54   | 3.3620          |
| 0.7181        | 19.0  | 57   | 3.3598          |
| 0.498         | 20.0  | 60   | 3.3226          |
| 0.498         | 21.0  | 63   | 3.3073          |
| 0.498         | 22.0  | 66   | 3.2882          |
| 0.498         | 23.0  | 69   | 3.2865          |
| 0.4124        | 24.0  | 72   | 3.2841          |
| 0.4124        | 25.0  | 75   | 3.2844          |
| 0.4124        | 26.0  | 78   | 3.2786          |
| 0.4129        | 27.0  | 81   | 3.2787          |
| 0.4129        | 28.0  | 84   | 3.2782          |
| 0.4129        | 29.0  | 87   | 3.2785          |
| 0.4603        | 30.0  | 90   | 3.2783          |
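Validation loss bottoms out around epoch 12 and then drifts back up while training loss keeps falling, a typical overfitting pattern. A quick scan over the (epoch, validation loss) pairs transcribed from the table confirms the minimum:

```python
# (epoch, validation loss) pairs transcribed from the table above
val_losses = [
    (1, 4.8549), (2, 4.7060), (3, 4.4868), (4, 4.2240), (5, 3.9478),
    (6, 3.6598), (7, 3.4035), (8, 3.4113), (9, 3.2417), (10, 3.2186),
    (11, 3.1794), (12, 3.1747), (13, 3.2023), (14, 3.2163), (15, 3.2625),
    (16, 3.3318), (17, 3.3887), (18, 3.3620), (19, 3.3598), (20, 3.3226),
    (21, 3.3073), (22, 3.2882), (23, 3.2865), (24, 3.2841), (25, 3.2844),
    (26, 3.2786), (27, 3.2787), (28, 3.2782), (29, 3.2785), (30, 3.2783),
]
best_epoch, best_loss = min(val_losses, key=lambda pair: pair[1])
print(best_epoch, best_loss)  # → 12 3.1747
```

If this matters for your use case, an earlier checkpoint (around epoch 12) may generalize better than the final one.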

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • Pytorch 2.7.0
  • Datasets 3.5.1
  • Tokenizers 0.21.1
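Since this repository is a PEFT adapter on Qwen/Qwen3-0.6B, it can be loaded with the versions listed above via PEFT's auto classes. A minimal usage sketch (the prompt and generation settings are illustrative, not from the card; downloading the weights requires network access):

```python
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM

# Loads the Qwen/Qwen3-0.6B base weights plus this adapter on top.
model = AutoPeftModelForCausalLM.from_pretrained("michelangelo-ai/creation-v1")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B")

inputs = tokenizer("Hello", return_tensors="pt")  # illustrative prompt
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```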