_finetunned_nougat_AHR_jawi

This model is a fine-tuned version of bustamiyusoef/_base_nougat_AHR on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2215

Model description

More information needed

Intended uses & limitations

More information needed
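
While the intended uses are not documented, the checkpoint can be loaded for inference like any Nougat-style VisionEncoderDecoder model. The sketch below is a minimal, unofficial example, not the author's documented usage: the repository name is taken from this card, and `page.png` is a hypothetical input image. If processor files are not bundled with this repository, loading the processor from the base checkpoint (`bustamiyusoef/_base_nougat_AHR`, or `facebook/nougat-base`) is a reasonable fallback.

```python
# Minimal inference sketch for a Nougat-style checkpoint (unofficial).
# Assumption: the repo bundles processor files; if not, load the processor
# from the base checkpoint instead.
import torch
from PIL import Image
from transformers import NougatProcessor, VisionEncoderDecoderModel

model_id = "bustamiyusoef/_finetunned_nougat_AHR_jawi"
processor = NougatProcessor.from_pretrained(model_id)
model = VisionEncoderDecoderModel.from_pretrained(model_id)

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

# "page.png" is a hypothetical scanned page; replace with a real image.
image = Image.open("page.png").convert("RGB")
pixel_values = processor(image, return_tensors="pt").pixel_values.to(device)

outputs = model.generate(pixel_values, max_new_tokens=512)
text = processor.batch_decode(outputs, skip_special_tokens=True)[0]
print(text)
```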

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 6
  • total_train_batch_size: 48
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • num_epochs: 30
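
For reference, the hyperparameters above translate roughly into the `Seq2SeqTrainingArguments` sketch below. This is a reconstruction, not the author's actual training script: the output directory is a placeholder, the per-epoch evaluation cadence is inferred from the results table that follows, and dataset loading and collation are omitted because the training data is undocumented.

```python
# Hedged reconstruction of the reported configuration (not the original script).
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="nougat_ahr_jawi",     # placeholder directory
    learning_rate=1e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=6,    # effective train batch size: 8 * 6 = 48
    num_train_epochs=30,
    lr_scheduler_type="linear",
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=42,
    eval_strategy="epoch",            # inferred from the per-epoch results table
    logging_strategy="epoch",
)
```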

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|---------------|--------|------|-----------------|
| 1.4421        | 0.996  | 83   | 1.3987          |
| 0.9271        | 1.992  | 166  | 0.7992          |
| 0.6591        | 3.0    | 250  | 0.5874          |
| 0.5797        | 3.996  | 333  | 0.4766          |
| 0.5308        | 4.992  | 416  | 0.4747          |
| 0.4441        | 6.0    | 500  | 0.3578          |
| 0.3555        | 6.996  | 583  | 0.3452          |
| 0.319         | 7.992  | 666  | 0.3231          |
| 0.2794        | 9.0    | 750  | 0.3032          |
| 0.2817        | 9.996  | 833  | 0.2769          |
| 0.2337        | 10.992 | 916  | 0.2796          |
| 0.2344        | 12.0   | 1000 | 0.2497          |
| 0.1966        | 12.996 | 1083 | 0.2579          |
| 0.2119        | 13.992 | 1166 | 0.2385          |
| 0.1604        | 15.0   | 1250 | 0.2364          |
| 0.1207        | 15.996 | 1333 | 0.2336          |
| 0.1452        | 16.992 | 1416 | 0.2250          |
| 0.1255        | 18.0   | 1500 | 0.2282          |
| 0.1294        | 18.996 | 1583 | 0.2304          |
| 0.1016        | 19.992 | 1666 | 0.2202          |
| 0.1195        | 21.0   | 1750 | 0.2245          |
| 0.1382        | 21.996 | 1833 | 0.2202          |
| 0.1201        | 22.992 | 1916 | 0.2178          |
| 0.1185        | 24.0   | 2000 | 0.2150          |
| 0.1149        | 24.996 | 2083 | 0.2264          |
| 0.1011        | 25.992 | 2166 | 0.2190          |
| 0.1046        | 27.0   | 2250 | 0.2215          |

Framework versions

  • Transformers 4.46.3
  • PyTorch 2.5.1+cu121
  • Datasets 3.2.0
  • Tokenizers 0.20.3