---
library_name: transformers
license: mit
base_model: openai-community/gpt2
tags:
- generated_from_trainer
model-index:
- name: arabic-nano-gpt
  results: []
datasets:
- wikimedia/wikipedia
language:
- ar
---

# arabic-nano-gpt

This model is a fine-tuned version of [openai-community/gpt2](https://huggingface.co/openai-community/gpt2) on the Arabic subset of the [wikimedia/wikipedia](https://huggingface.co/datasets/wikimedia/wikipedia) dataset.

Repository on GitHub: [e-hossam96/arabic-nano-gpt](https://github.com/e-hossam96/arabic-nano-gpt.git)

The model achieves the following result on the held-out test set:

- Loss: 3.28796

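
For readers who prefer perplexity, the loss above can be converted with a one-line calculation. This is a minimal sketch that assumes the reported value is the mean per-token cross-entropy in nats; the card does not state this explicitly. Under that assumption the result is about 26.8.

```python
import math

test_loss = 3.28796  # held-out test loss reported in this card

# Assumption: the loss is the mean per-token cross-entropy measured in nats,
# so perplexity is simply its exponential.
perplexity = math.exp(test_loss)

print(f"perplexity = {perplexity:.2f}")
```
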
## How to Use

```python
import torch
from transformers import pipeline

model_ckpt = "e-hossam96/arabic-nano-gpt-v0"
# Use the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Build a text-generation pipeline around the checkpoint.
lm = pipeline(task="text-generation", model=model_ckpt, device=device)

# Arabic prompt about jet engines; the trailing backslashes join the lines into one string.
prompt = """المحرك النفاث هو محرك ينفث الموائع (الماء أو الهواء) بسرعة فائقة \
لينتج قوة دافعة اعتمادا على مبدأ قانون نيوتن الثالث للحركة. \
هذا التعريف الواسع للمحركات النفاثة يتضمن أيضا"""

# Generate up to 128 new tokens after the prompt.
output = lm(prompt, max_new_tokens=128)

print(output[0]["generated_text"])
```

## Model description

- Embedding Size: 256
- Attention Heads: 4
- Attention Layers: 4

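
To give a feel for how small this architecture is, the sketch below derives the per-head dimension and a rough count of non-embedding parameters from the three values above. It assumes a standard GPT-2 block layout with a feed-forward inner size of 4 × the embedding size, which this card does not confirm. The vocabulary size and context length are not listed here, so embedding parameters are left out. The names `block_params` and `n_inner` are illustrative only.

```python
# Values taken from the model description.
n_embd = 256   # embedding size
n_head = 4     # attention heads per layer
n_layer = 4    # attention layers (transformer blocks)

# Assumption: the feed-forward inner size is 4 * n_embd (the GPT-2 default).
n_inner = 4 * n_embd

# Each head attends over an equal slice of the embedding.
head_dim = n_embd // n_head


def block_params(d: int, inner: int) -> int:
    """Count weights and biases in one GPT-2 style block."""
    attn = (d * 3 * d + 3 * d) + (d * d + d)      # fused QKV projection + output projection
    mlp = (d * inner + inner) + (inner * d + d)   # two feed-forward layers
    norms = 2 * (2 * d)                           # two LayerNorms, each with scale and shift
    return attn + mlp + norms


# All blocks plus the final LayerNorm; token and position embeddings are excluded.
non_embedding_params = n_layer * block_params(n_embd, n_inner) + 2 * n_embd

print(f"head_dim = {head_dim}")
print(f"non-embedding parameters = {non_embedding_params:,}")
```
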
## Training and evaluation data

The entire Arabic Wikipedia dataset was split into training, validation, and test sets in a 90-5-5 ratio.

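
A 90-5-5 split like the one described above could be produced along the following lines. This is only a sketch: the card does not say how the split was made, so the shuffling, the seed, and the helper name `split_indices` are assumptions rather than the author's procedure.

```python
import random


def split_indices(n: int, percents=(90, 5, 5), seed: int = 42):
    """Shuffle the indices 0..n-1 and cut them into train/validation/test parts."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible

    # Integer arithmetic avoids floating-point rounding surprises.
    n_train = n * percents[0] // 100
    n_val = n * percents[1] // 100

    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]      # whatever remains goes to the test set
    return train, val, test


# Hypothetical dataset of 1,000 articles.
train_idx, val_idx, test_idx = split_indices(1000)
print(len(train_idx), len(val_idx), len(test_idx))
```
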
## Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.001
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 256
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.01
- num_epochs: 24

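
Two of the values above follow from the others, and the sketch below makes that relationship explicit. It assumes training on a single device, so that the total batch size is the per-step batch size times the accumulation steps, and it models the linear schedule as a ramp from zero over the warmup steps followed by a linear decay to zero. The total step count is not given in this card, so `total_steps` is a hypothetical value and `linear_lr` is an illustrative helper.

```python
import math

learning_rate = 1e-3
train_batch_size = 64
gradient_accumulation_steps = 4
warmup_ratio = 0.01

# Assumption: one device, so each optimizer step sees 64 * 4 examples.
total_train_batch_size = train_batch_size * gradient_accumulation_steps


def linear_lr(step: int, total_steps: int) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Ramp up from 0 to the peak learning rate.
        return learning_rate * step / max(1, warmup_steps)
    # Decay linearly from the peak to 0 at the final step.
    remaining = (total_steps - step) / max(1, total_steps - warmup_steps)
    return learning_rate * max(0.0, remaining)


total_steps = 10_000  # hypothetical; the real value depends on dataset size
print(f"total_train_batch_size = {total_train_batch_size}")
for step in (0, 50, 100, 5_050, 10_000):
    print(step, linear_lr(step, total_steps))
```
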
## Training Loss

![Training Loss](assets/arabic-nano-gpt-v0-train-loss.png)

## Validation Loss

![Validation Loss](assets/arabic-nano-gpt-v0-eval-loss.png)

## Framework versions

- Transformers 4.45.2
- PyTorch 2.5.0
- Datasets 3.0.1
- Tokenizers 0.20.1