roberta-mlm-model-v1

This model is a fine-tuned version of roberta-mlm-model-v1/checkpoint-60000 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: nan
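
For reference, a minimal usage sketch for a RoBERTa-style masked-language model. The model ID `roberta-mlm-model-v1` is a hypothetical placeholder for wherever this checkpoint is published, and the standard RoBERTa `<mask>` token is assumed; given the NaN evaluation loss above, outputs may not be meaningful.

```python
from transformers import pipeline

# Hypothetical model ID; substitute the actual repository path.
fill_mask = pipeline("fill-mask", model="roberta-mlm-model-v1")

# RoBERTa tokenizers use "<mask>" as the mask token.
print(fill_mask("The quick brown fox jumps over the lazy <mask>."))
```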

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them onto TrainingArguments follows the list):

  • learning_rate: 0.0005
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 1000
  • num_epochs: 4
  • mixed_precision_training: Native AMP
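
As referenced above, a minimal sketch of how these values map onto `transformers.TrainingArguments`. The `output_dir` is a placeholder, and `fp16=True` is assumed to be what "Native AMP" corresponds to here.

```python
from transformers import TrainingArguments

# Sketch of the reported hyperparameters; output_dir is a placeholder.
training_args = TrainingArguments(
    output_dir="roberta-mlm-model-v1",
    learning_rate=5e-4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    optim="adamw_torch",        # AdamW via torch
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=1000,
    num_train_epochs=4,
    fp16=True,                  # Native AMP mixed-precision training (assumed)
)
```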

Training results

Training Loss   Epoch    Step      Validation Loss
0.0             0.1242   5000      nan
0.0             0.2483   10000     nan
0.0             0.3725   15000     nan
0.0             0.4966   20000     nan
0.0             0.6208   25000     nan
0.0             0.7449   30000     nan
0.0             0.8691   35000     nan
0.0             0.9932   40000     nan
0.0             1.1174   45000     nan
0.0             1.2416   50000     nan
0.0             1.3657   55000     nan
0.0             1.4899   60000     nan
0.0             1.6140   65000     nan
0.0             1.7382   70000     nan
0.0             1.8623   75000     nan
0.0             1.9865   80000     nan
0.0             2.1106   85000     nan
0.0             2.2348   90000     nan
0.0             2.3590   95000     nan
0.0             2.4831   100000    nan
0.0             2.6073   105000    nan
0.0             2.7314   110000    nan
0.0             2.8556   115000    nan
0.0             2.9797   120000    nan
0.0             3.1039   125000    nan
0.0             3.2280   130000    nan
0.0             3.3522   135000    nan
0.0             3.4764   140000    nan
0.0             3.6005   145000    nan
0.0             3.7247   150000    nan
0.0             3.8488   155000    nan
0.0             3.9730   160000    nan

Note: a constant training loss of 0.0 together with a NaN validation loss at every logging step indicates the run diverged numerically rather than converged; a learning rate of 5e-4 under mixed precision is a plausible cause. The checkpoint should be sanity-checked before use (see the snippet below).
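
As a quick sanity check against the NaN losses reported above, a short sketch that loads the checkpoint and verifies its parameters are finite. The model ID is again a hypothetical placeholder.

```python
import torch
from transformers import AutoModelForMaskedLM

# Hypothetical model path; substitute the actual checkpoint location.
model = AutoModelForMaskedLM.from_pretrained("roberta-mlm-model-v1")

# List any parameter tensors containing NaN or Inf values.
bad = [name for name, p in model.named_parameters() if not torch.isfinite(p).all()]
print("Non-finite parameter tensors:", bad if bad else "none")
```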

Framework versions

  • Transformers 4.50.0
  • Pytorch 2.6.0+cu124
  • Datasets 3.4.1
  • Tokenizers 0.21.1
Model details

  • Format: Safetensors
  • Parameters: 160M
  • Tensor type: F32