---
library_name: transformers
license: mit
base_model: openai-community/gpt2
tags:
- generated_from_trainer
model-index:
- name: arabic-nano-gpt
  results: []
datasets:
- wikimedia/wikipedia
language:
- ar
---

# arabic-nano-gpt

This model is a fine-tuned version of [openai-community/gpt2](https://huggingface.co/openai-community/gpt2) on the Arabic subset of the [wikimedia/wikipedia](https://huggingface.co/datasets/wikimedia/wikipedia) dataset.
It achieves the following result on the held-out test set:
- Loss: 3.28796

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

The model was trained and evaluated on the Arabic portion of the [wikimedia/wikipedia](https://huggingface.co/datasets/wikimedia/wikipedia) dataset, as declared in the card metadata.

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a rough `TrainingArguments` sketch is given at the end of this card):
- learning_rate: 0.001
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 256
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.01
- num_epochs: 24

### Training Loss

![image/png](https://cdn-uploads.huggingface.co/production/uploads/63ccee86374057a338e03c1e/970nr9bptjHSMsjLDHfaY.png)

### Validation Loss

![image/png](https://cdn-uploads.huggingface.co/production/uploads/63ccee86374057a338e03c1e/GUbnak7yV02vd0NZhbeEO.png)

### Framework versions

- Transformers 4.45.2
- Pytorch 2.5.0
- Datasets 3.0.1
- Tokenizers 0.20.1
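
## Usage example

The snippet below is a minimal sketch of how the checkpoint can be loaded for Arabic text generation with the `transformers` pipeline API. The repository id `arabic-nano-gpt` is a placeholder; replace it with the actual Hub id of this model (e.g. `<username>/arabic-nano-gpt`), and adjust the sampling settings to taste.

```python
from transformers import pipeline

# Placeholder repository id: replace with the actual Hub id of this model.
model_id = "arabic-nano-gpt"

# Build a text-generation pipeline on top of the fine-tuned GPT-2 checkpoint.
generator = pipeline("text-generation", model=model_id)

# Generate a short Arabic continuation for a prompt ("Cairo is the capital of ...").
outputs = generator(
    "القاهرة هي عاصمة",
    max_new_tokens=50,
    do_sample=True,
    top_p=0.95,
    temperature=0.8,
)
print(outputs[0]["generated_text"])
```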
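
## Training configuration sketch

For reference, the hyperparameters reported above map roughly onto the following `TrainingArguments` configuration. This is a hedged reconstruction rather than the original training script: the output directory is a placeholder, and settings not reported in the card (logging, evaluation, and checkpointing cadence) are omitted.

```python
from transformers import TrainingArguments

# Approximate reconstruction of the reported hyperparameters.
training_args = TrainingArguments(
    output_dir="arabic-nano-gpt",       # placeholder output directory
    learning_rate=1e-3,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    gradient_accumulation_steps=4,      # effective train batch size of 256
    num_train_epochs=24,
    lr_scheduler_type="linear",
    warmup_ratio=0.01,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=42,
)
```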