---
library_name: transformers
license: mit
base_model: openai-community/gpt2
tags:
- generated_from_trainer
model-index:
- name: arabic-nano-gpt-v2
  results: []
---

# arabic-nano-gpt-v2

This model is a fine-tuned version of [openai-community/gpt2](https://huggingface.co/openai-community/gpt2) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 3.2532
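A minimal inference sketch, assuming the model is published on the Hub under the repo id `e-hossam96/arabic-nano-gpt-v2` shown on this page (the prompt and generation settings are illustrative, not from the original run):

```python
from transformers import pipeline

# Repo id assumed from this card; adjust if the model lives elsewhere.
model_id = "e-hossam96/arabic-nano-gpt-v2"

generator = pipeline("text-generation", model=model_id)

# Illustrative Arabic prompt ("The capital of Egypt is").
outputs = generator("عاصمة مصر هي", max_new_tokens=30)
print(outputs[0]["generated_text"])
```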

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.0001
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 256
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.01
- num_epochs: 8
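As a sketch, the hyperparameters above map onto `transformers.TrainingArguments` roughly as follows (the output directory name is illustrative, and the effective batch size of 256 falls out of 32 per device × 8 accumulation steps):

```python
from transformers import TrainingArguments

# Sketch only: reproduces the hyperparameters listed above.
# "out" is an illustrative output directory, not from the original run.
args = TrainingArguments(
    output_dir="out",
    learning_rate=1e-4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    gradient_accumulation_steps=8,  # 32 * 8 = 256 effective train batch size
    lr_scheduler_type="linear",
    warmup_ratio=0.01,
    num_train_epochs=8,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```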

### Training results

| Training Loss | Epoch  | Step   | Validation Loss |
|:-------------:|:------:|:------:|:---------------:|
| 4.9097        | 0.2924 | 5000   | 4.3161          |
| 4.0426        | 0.5849 | 10000  | 3.8633          |
| 3.8791        | 0.8773 | 15000  | 3.6969          |
| 3.7452        | 1.1698 | 20000  | 3.6052          |
| 3.6927        | 1.4622 | 25000  | 3.5420          |
| 3.6348        | 1.7547 | 30000  | 3.4976          |
| 3.6038        | 2.0471 | 35000  | 3.4622          |
| 3.562         | 2.3396 | 40000  | 3.4329          |
| 3.5374        | 2.6320 | 45000  | 3.4098          |
| 3.5216        | 2.9245 | 50000  | 3.3897          |
| 3.4918        | 3.2169 | 55000  | 3.3743          |
| 3.4805        | 3.5094 | 60000  | 3.3585          |
| 3.4724        | 3.8018 | 65000  | 3.3445          |
| 3.4519        | 4.0943 | 70000  | 3.3337          |
| 3.4422        | 4.3867 | 75000  | 3.3224          |
| 3.4376        | 4.6791 | 80000  | 3.3133          |
| 3.4316        | 4.9716 | 85000  | 3.3042          |
| 3.4123        | 5.2640 | 90000  | 3.2972          |
| 3.4076        | 5.5565 | 95000  | 3.2897          |
| 3.4018        | 5.8489 | 100000 | 3.2823          |
| 3.3943        | 6.1414 | 105000 | 3.2772          |
| 3.3891        | 6.4338 | 110000 | 3.2720          |
| 3.3805        | 6.7263 | 115000 | 3.2661          |
| 3.3786        | 7.0187 | 120000 | 3.2625          |
| 3.3713        | 7.3112 | 125000 | 3.2587          |
| 3.3662        | 7.6036 | 130000 | 3.2553          |
| 3.365         | 7.8961 | 135000 | 3.2532          |
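Since these are natural-log cross-entropy losses, perplexity is `exp(loss)`; the derivation below (not a metric reported by the original run) shows validation perplexity dropping from roughly 75 to roughly 26 over training:

```python
import math

# Perplexity = exp(cross-entropy loss). Values taken from the table above.
first_eval_loss = 4.3161  # step 5000
final_eval_loss = 3.2532  # step 135000

print(round(math.exp(first_eval_loss), 1))  # ≈ 74.9
print(round(math.exp(final_eval_loss), 1))  # ≈ 25.9
```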

### Framework versions

- Transformers 4.45.2
- Pytorch 2.5.0
- Datasets 3.0.1
- Tokenizers 0.20.1