---
library_name: transformers
license: mit
base_model: openai-community/gpt2
tags:
- generated_from_trainer
model-index:
- name: arabic-nano-gpt-v2
  results: []
---
# arabic-nano-gpt-v2
This model is a fine-tuned version of [openai-community/gpt2](https://huggingface.co/openai-community/gpt2) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 3.2532
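
For quick experimentation, the checkpoint can be loaded like any GPT-2-style causal language model. The snippet below is a minimal sketch; the Hub namespace is not stated in this card, so `your-namespace/arabic-nano-gpt-v2` is a placeholder repo id.

```python
from transformers import pipeline

# Placeholder repo id: substitute the actual Hub namespace hosting this checkpoint.
generator = pipeline("text-generation", model="your-namespace/arabic-nano-gpt-v2")

# GPT-2-style causal LM: complete an Arabic prompt.
outputs = generator("اللغة العربية", max_new_tokens=50, do_sample=True, top_p=0.95)
print(outputs[0]["generated_text"])
```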
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 256
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.01
- num_epochs: 8
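
These settings map roughly onto the Hugging Face `Trainer` configuration below. This is a sketch for reference only: the output directory is an assumption, and the dataset, tokenizer, and data collator used for training are not specified in this card.

```python
from transformers import TrainingArguments

# Mirror of the hyperparameters listed above; output_dir is illustrative.
training_args = TrainingArguments(
    output_dir="arabic-nano-gpt-v2",
    learning_rate=1e-4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    gradient_accumulation_steps=8,  # effective train batch size: 32 * 8 = 256
    num_train_epochs=8,
    lr_scheduler_type="linear",
    warmup_ratio=0.01,
    seed=42,
    # Adam betas=(0.9, 0.999) and epsilon=1e-08 are the optimizer defaults.
)
```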
### Training results
| Training Loss | Epoch  | Step   | Validation Loss |
|:-------------:|:------:|:------:|:---------------:|
| 4.9097        | 0.2924 | 5000   | 4.3161          |
| 4.0426        | 0.5849 | 10000  | 3.8633          |
| 3.8791        | 0.8773 | 15000  | 3.6969          |
| 3.7452        | 1.1698 | 20000  | 3.6052          |
| 3.6927        | 1.4622 | 25000  | 3.5420          |
| 3.6348        | 1.7547 | 30000  | 3.4976          |
| 3.6038        | 2.0471 | 35000  | 3.4622          |
| 3.562         | 2.3396 | 40000  | 3.4329          |
| 3.5374        | 2.6320 | 45000  | 3.4098          |
| 3.5216        | 2.9245 | 50000  | 3.3897          |
| 3.4918        | 3.2169 | 55000  | 3.3743          |
| 3.4805        | 3.5094 | 60000  | 3.3585          |
| 3.4724        | 3.8018 | 65000  | 3.3445          |
| 3.4519        | 4.0943 | 70000  | 3.3337          |
| 3.4422        | 4.3867 | 75000  | 3.3224          |
| 3.4376        | 4.6791 | 80000  | 3.3133          |
| 3.4316        | 4.9716 | 85000  | 3.3042          |
| 3.4123        | 5.2640 | 90000  | 3.2972          |
| 3.4076        | 5.5565 | 95000  | 3.2897          |
| 3.4018        | 5.8489 | 100000 | 3.2823          |
| 3.3943        | 6.1414 | 105000 | 3.2772          |
| 3.3891        | 6.4338 | 110000 | 3.2720          |
| 3.3805        | 6.7263 | 115000 | 3.2661          |
| 3.3786        | 7.0187 | 120000 | 3.2625          |
| 3.3713        | 7.3112 | 125000 | 3.2587          |
| 3.3662        | 7.6036 | 130000 | 3.2553          |
| 3.365         | 7.8961 | 135000 | 3.2532          |
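
Assuming the reported losses are the Trainer's usual per-token cross-entropy (in nats), the final validation loss translates to a perplexity of roughly exp(3.2532) ≈ 25.9:

```python
import math

final_eval_loss = 3.2532          # final validation loss from the table above
print(math.exp(final_eval_loss))  # ~25.9 validation perplexity
```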
### Framework versions
- Transformers 4.45.2
- Pytorch 2.5.0
- Datasets 3.0.1
- Tokenizers 0.20.1