_Nougat_Base_Edv_En_De_01

This model is a fine-tuned version of facebook/nougat-base on an unknown dataset. It achieves the following results on the evaluation set:

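  • Loss: 1700.0442 (final evaluation at step 820; the lowest validation loss in the training log, 1.9061, was recorded at step 42)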

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 10
  • total_train_batch_size: 80
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 20
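
The relationship between these values can be sketched in plain Python. This is a minimal illustration, not the training script: it assumes a single device (consistent with 8 × 10 = 80), assumes zero warmup steps because none are listed, and takes the total of 820 optimizer steps from the last row of the training log. The helper names `effective_batch_size` and `linear_lr` are hypothetical.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 1e-4
TRAIN_BATCH_SIZE = 8
GRADIENT_ACCUMULATION_STEPS = 10

# Assumption: total optimizer steps taken from the last logged step.
TOTAL_STEPS = 820
# Assumption: no warmup, since the card does not list any.
WARMUP_STEPS = 0


def effective_batch_size(per_device: int, accumulation: int, num_devices: int = 1) -> int:
    """Samples contributing to one optimizer update."""
    return per_device * accumulation * num_devices


def linear_lr(step: int,
              base_lr: float = LEARNING_RATE,
              total_steps: int = TOTAL_STEPS,
              warmup_steps: int = WARMUP_STEPS) -> float:
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print(effective_batch_size(TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS))
    for step in (0, 410, 820):
        print(step, linear_lr(step))
```

Under these assumptions the learning rate starts at 0.0001, is roughly halved by step 410, and reaches zero at step 820.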

Training results

Training Loss   Epoch     Step   Validation Loss
No log          1.0       42     1.9061
22.7823         2.0       84     7.8791
68.1139         3.0       126    88.5974
829.2286        4.0       168    220.7463
2312.5455       5.0       210    283.3206
3190.685        6.0       252    408.0313
3190.685        7.0       294    772.9317
5969.3912       8.0       336    1039.8457
9675.3725       9.0       378    1248.5635
12223.905       10.0      420    1398.6156
14010.6213      11.0      462    1496.1117
15064.8962      12.0      504    1564.3519
15064.8962      13.0      546    1613.9442
15790.6938      14.0      588    1647.9702
16285.0438      15.0      630    1669.3680
16505.9075      16.0      672    1684.0092
16716.44        17.0      714    1693.2450
16915.6488      18.0      756    1698.5161
16915.6488      19.0      798    1699.9650
16804.3288      19.5251   820    1700.0442
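
The validation losses above can be inspected with a short script. The following is a minimal sketch that copies the (epoch, step, validation loss) values from the table and reports which evaluation had the lowest loss and how many evaluations were worse than the one before. The helper names `best_evaluation` and `count_increases` are hypothetical and not part of any library.

```python
# (epoch, step, validation_loss) rows copied from the training results table.
EVAL_LOG = [
    (1.0, 42, 1.9061),
    (2.0, 84, 7.8791),
    (3.0, 126, 88.5974),
    (4.0, 168, 220.7463),
    (5.0, 210, 283.3206),
    (6.0, 252, 408.0313),
    (7.0, 294, 772.9317),
    (8.0, 336, 1039.8457),
    (9.0, 378, 1248.5635),
    (10.0, 420, 1398.6156),
    (11.0, 462, 1496.1117),
    (12.0, 504, 1564.3519),
    (13.0, 546, 1613.9442),
    (14.0, 588, 1647.9702),
    (15.0, 630, 1669.3680),
    (16.0, 672, 1684.0092),
    (17.0, 714, 1693.2450),
    (18.0, 756, 1698.5161),
    (19.0, 798, 1699.9650),
    (19.5251, 820, 1700.0442),
]


def best_evaluation(log):
    """Return the row with the lowest validation loss."""
    return min(log, key=lambda row: row[2])


def count_increases(log):
    """Count evaluations whose loss is higher than the previous one."""
    losses = [row[2] for row in log]
    return sum(1 for prev, cur in zip(losses, losses[1:]) if cur > prev)


if __name__ == "__main__":
    epoch, step, loss = best_evaluation(EVAL_LOG)
    print(f"lowest validation loss {loss} at epoch {epoch}, step {step}")
    print(f"{count_increases(EVAL_LOG)} of {len(EVAL_LOG) - 1} later evaluations were worse than the previous one")
```

For this log the lowest validation loss is the first evaluation (step 42), and every later evaluation is higher than the one before it.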

Framework versions

  • Transformers 4.47.1
  • Pytorch 2.5.1+cu121
  • Tokenizers 0.21.0

Safetensors

  • Model size: 349M params
  • Tensor type: I64, BF16

Model tree for bustamiyusoef/_Nougat_Base_Edv_En_De_01

  • Base model: facebook/nougat-base (this model is a fine-tune)