e-hossam96 committed on
Commit
b331cd3
1 Parent(s): 00adddb

added more details to model

README.md CHANGED
@@ -3,41 +3,61 @@ library_name: transformers
3
  license: mit
4
  base_model: openai-community/gpt2
5
  tags:
6
- - generated_from_trainer
7
  model-index:
8
- - name: arabic-nano-gpt
9
- results: []
10
  datasets:
11
- - wikimedia/wikipedia
12
  language:
13
- - ar
14
  ---
15
 
16
-
17
  # arabic-nano-gpt
18
 
19
- This model is a fine-tuned version of [openai-community/gpt2](https://huggingface.co/openai-community/gpt2) on an unknown dataset.
20
- It achieves the following results on the held-out test set:
 
 
 
 
21
  - Loss: 3.28796
22
 
 
23
 
24
- ## Model description
 
 
25
 
26
- More information needed
 
27
 
28
- ## Intended uses & limitations
29
 
30
- More information needed
31
 
32
- ## Training and evaluation data
 
 
33
 
34
- More information needed
35
 
36
- ## Training procedure
 
 
 
37
 
38
- ### Training hyperparameters
 
 
 
 
 
 
 
 
39
 
40
  The following hyperparameters were used during training:
 
41
  - learning_rate: 0.001
42
  - train_batch_size: 64
43
  - eval_batch_size: 64
@@ -49,175 +69,17 @@ The following hyperparameters were used during training:
49
  - lr_scheduler_warmup_ratio: 0.01
50
  - num_epochs: 24
51
 
52
- <!-- ### Training results -->
53
-
54
- <!-- | Training Loss | Epoch | Step | Validation Loss |
55
- |:-------------:|:------:|:------:|:---------------:|
56
- | 5.62 | 0.0585 | 1000 | 5.3754 |
57
- | 4.6527 | 0.1170 | 2000 | 4.4918 |
58
- | 4.2818 | 0.1755 | 3000 | 4.1137 |
59
- | 4.1289 | 0.2340 | 4000 | 3.9388 |
60
- | 4.0021 | 0.2924 | 5000 | 3.8274 |
61
- | 3.9301 | 0.3509 | 6000 | 3.7534 |
62
- | 3.8822 | 0.4094 | 7000 | 3.6986 |
63
- | 3.8375 | 0.4679 | 8000 | 3.6557 |
64
- | 3.7918 | 0.5264 | 9000 | 3.6266 |
65
- | 3.7723 | 0.5849 | 10000 | 3.5994 |
66
- | 3.7549 | 0.6434 | 11000 | 3.5787 |
67
- | 3.7324 | 0.7019 | 12000 | 3.5612 |
68
- | 3.7249 | 0.7604 | 13000 | 3.5436 |
69
- | 3.6989 | 0.8188 | 14000 | 3.5323 |
70
- | 3.7003 | 0.8773 | 15000 | 3.5169 |
71
- | 3.6919 | 0.9358 | 16000 | 3.5055 |
72
- | 3.6717 | 0.9943 | 17000 | 3.4966 |
73
- | 3.6612 | 1.0528 | 18000 | 3.4868 |
74
- | 3.6467 | 1.1113 | 19000 | 3.4787 |
75
- | 3.6497 | 1.1698 | 20000 | 3.4707 |
76
- | 3.6193 | 1.2283 | 21000 | 3.4639 |
77
- | 3.6302 | 1.2868 | 22000 | 3.4572 |
78
- | 3.6225 | 1.3452 | 23000 | 3.4516 |
79
- | 3.635 | 1.4037 | 24000 | 3.4458 |
80
- | 3.6115 | 1.4622 | 25000 | 3.4416 |
81
- | 3.6162 | 1.5207 | 26000 | 3.4348 |
82
- | 3.6142 | 1.5792 | 27000 | 3.4329 |
83
- | 3.5956 | 1.6377 | 28000 | 3.4293 |
84
- | 3.5885 | 1.6962 | 29000 | 3.4226 |
85
- | 3.603 | 1.7547 | 30000 | 3.4195 |
86
- | 3.5947 | 1.8132 | 31000 | 3.4142 |
87
- | 3.588 | 1.8716 | 32000 | 3.4113 |
88
- | 3.5803 | 1.9301 | 33000 | 3.4065 |
89
- | 3.5891 | 1.9886 | 34000 | 3.4044 |
90
- | 3.5801 | 2.0471 | 35000 | 3.4032 |
91
- | 3.5739 | 2.1056 | 36000 | 3.3988 |
92
- | 3.5661 | 2.1641 | 37000 | 3.3981 |
93
- | 3.5657 | 2.2226 | 38000 | 3.3934 |
94
- | 3.5727 | 2.2811 | 39000 | 3.3907 |
95
- | 3.5617 | 2.3396 | 40000 | 3.3885 |
96
- | 3.5579 | 2.3980 | 41000 | 3.3855 |
97
- | 3.5553 | 2.4565 | 42000 | 3.3816 |
98
- | 3.5647 | 2.5150 | 43000 | 3.3803 |
99
- | 3.5531 | 2.5735 | 44000 | 3.3799 |
100
- | 3.5494 | 2.6320 | 45000 | 3.3777 |
101
- | 3.5525 | 2.6905 | 46000 | 3.3759 |
102
- | 3.5487 | 2.7490 | 47000 | 3.3725 |
103
- | 3.5551 | 2.8075 | 48000 | 3.3711 |
104
- | 3.5511 | 2.8660 | 49000 | 3.3681 |
105
- | 3.5463 | 2.9244 | 50000 | 3.3695 |
106
- | 3.5419 | 2.9829 | 51000 | 3.3660 |
107
- | 3.5414 | 3.0414 | 52000 | 3.3648 |
108
- | 3.5388 | 3.0999 | 53000 | 3.3605 |
109
- | 3.5333 | 3.1584 | 54000 | 3.3619 |
110
- | 3.525 | 3.2169 | 55000 | 3.3588 |
111
- | 3.5361 | 3.2754 | 56000 | 3.3572 |
112
- | 3.5302 | 3.3339 | 57000 | 3.3540 |
113
- | 3.5355 | 3.3924 | 58000 | 3.3553 |
114
- | 3.5391 | 3.4508 | 59000 | 3.3504 |
115
- | 3.531 | 3.5093 | 60000 | 3.3495 |
116
- | 3.5293 | 3.5678 | 61000 | 3.3483 |
117
- | 3.5269 | 3.6263 | 62000 | 3.3489 |
118
- | 3.5181 | 3.6848 | 63000 | 3.3494 |
119
- | 3.5205 | 3.7433 | 64000 | 3.3480 |
120
- | 3.5237 | 3.8018 | 65000 | 3.3440 |
121
- | 3.5316 | 3.8603 | 66000 | 3.3417 |
122
- | 3.5222 | 3.9188 | 67000 | 3.3433 |
123
- | 3.5174 | 3.9772 | 68000 | 3.3418 |
124
- | 3.518 | 4.0357 | 69000 | 3.3414 |
125
- | 3.5036 | 4.0942 | 70000 | 3.3365 |
126
- | 3.5101 | 4.1527 | 71000 | 3.3367 |
127
- | 3.5145 | 4.2112 | 72000 | 3.3361 |
128
- | 3.5053 | 4.2697 | 73000 | 3.3355 |
129
- | 3.5153 | 4.3282 | 74000 | 3.3334 |
130
- | 3.5003 | 4.3867 | 75000 | 3.3334 |
131
- | 3.5001 | 4.4452 | 76000 | 3.3326 |
132
- | 3.5114 | 4.5036 | 77000 | 3.3298 |
133
- | 3.5108 | 4.5621 | 78000 | 3.3292 |
134
- | 3.4985 | 4.6206 | 79000 | 3.3288 |
135
- | 3.497 | 4.6791 | 80000 | 3.3303 |
136
- | 3.4982 | 4.7376 | 81000 | 3.3291 |
137
- | 3.5068 | 4.7961 | 82000 | 3.3272 |
138
- | 3.4915 | 4.8546 | 83000 | 3.3244 |
139
- | 3.5036 | 4.9131 | 84000 | 3.3214 |
140
- | 3.5027 | 4.9716 | 85000 | 3.3214 |
141
- | 3.5078 | 5.0300 | 86000 | 3.3225 |
142
- | 3.5112 | 5.0885 | 87000 | 3.3243 |
143
- | 3.5049 | 5.1470 | 88000 | 3.3216 |
144
- | 3.4917 | 5.2055 | 89000 | 3.3192 |
145
- | 3.4802 | 5.2640 | 90000 | 3.3188 |
146
- | 3.4971 | 5.3225 | 91000 | 3.3201 |
147
- | 3.4941 | 5.3810 | 92000 | 3.3175 |
148
- | 3.4998 | 5.4395 | 93000 | 3.3179 |
149
- | 3.5011 | 5.4980 | 94000 | 3.3164 |
150
- | 3.4912 | 5.5564 | 95000 | 3.3180 |
151
- | 3.4961 | 5.6149 | 96000 | 3.3168 |
152
- | 3.4833 | 5.6734 | 97000 | 3.3148 |
153
- | 3.498 | 5.7319 | 98000 | 3.3133 |
154
- | 3.4892 | 5.7904 | 99000 | 3.3142 |
155
- | 3.4967 | 5.8489 | 100000 | 3.3142 |
156
- | 3.4847 | 5.9074 | 101000 | 3.3094 |
157
- | 3.4899 | 5.9659 | 102000 | 3.3102 |
158
- | 3.4774 | 6.0244 | 103000 | 3.3110 |
159
- | 3.4854 | 6.0828 | 104000 | 3.3106 |
160
- | 3.4873 | 6.1413 | 105000 | 3.3087 |
161
- | 3.4869 | 6.1998 | 106000 | 3.3102 |
162
- | 3.4833 | 6.2583 | 107000 | 3.3063 |
163
- | 3.491 | 6.3168 | 108000 | 3.3082 |
164
- | 3.4776 | 6.3753 | 109000 | 3.3075 |
165
- | 3.4924 | 6.4338 | 110000 | 3.3068 |
166
- | 3.4804 | 6.4923 | 111000 | 3.3050 |
167
- | 3.4805 | 6.5508 | 112000 | 3.3041 |
168
- | 3.4892 | 6.6093 | 113000 | 3.3031 |
169
- | 3.4775 | 6.6677 | 114000 | 3.3032 |
170
- | 3.481 | 6.7262 | 115000 | 3.3036 |
171
- | 3.4782 | 6.7847 | 116000 | 3.3025 |
172
- | 3.4804 | 6.8432 | 117000 | 3.3017 |
173
- | 3.4841 | 6.9017 | 118000 | 3.2999 |
174
- | 3.4784 | 6.9602 | 119000 | 3.3008 |
175
- | 3.4821 | 7.0187 | 120000 | 3.3001 |
176
- | 3.4671 | 7.0772 | 121000 | 3.3008 |
177
- | 3.485 | 7.1357 | 122000 | 3.2976 |
178
- | 3.4737 | 7.1941 | 123000 | 3.2985 |
179
- | 3.4793 | 7.2526 | 124000 | 3.2979 |
180
- | 3.4651 | 7.3111 | 125000 | 3.2968 |
181
- | 3.4847 | 7.3696 | 126000 | 3.2974 |
182
- | 3.474 | 7.4281 | 127000 | 3.2973 |
183
- | 3.4769 | 7.4866 | 128000 | 3.2955 |
184
- | 3.486 | 7.5451 | 129000 | 3.2953 |
185
- | 3.4684 | 7.6036 | 130000 | 3.2944 |
186
- | 3.4826 | 7.6621 | 131000 | 3.2949 |
187
- | 3.4685 | 7.7205 | 132000 | 3.2944 |
188
- | 3.4608 | 7.7790 | 133000 | 3.2931 |
189
- | 3.4655 | 7.8375 | 134000 | 3.2953 |
190
- | 3.4648 | 7.8960 | 135000 | 3.2928 |
191
- | 3.4632 | 7.9545 | 136000 | 3.2936 |
192
- | 3.4666 | 8.0130 | 137000 | 3.2902 |
193
- | 3.4663 | 8.0715 | 138000 | 3.2939 |
194
- | 3.4713 | 8.1300 | 139000 | 3.2904 |
195
- | 3.4654 | 8.1885 | 140000 | 3.2917 |
196
- | 3.466 | 8.2469 | 141000 | 3.2913 |
197
- | 3.4724 | 8.3054 | 142000 | 3.2889 |
198
- | 3.4695 | 8.3639 | 143000 | 3.2890 |
199
- | 3.4729 | 8.4224 | 144000 | 3.2876 |
200
- | 3.4551 | 8.4809 | 145000 | 3.2898 |
201
- | 3.4652 | 8.5394 | 146000 | 3.2885 |
202
- | 3.4689 | 8.5979 | 147000 | 3.2854 |
203
- | 3.4647 | 8.6564 | 148000 | 3.2857 |
204
- | 3.4653 | 8.7149 | 149000 | 3.2857 |
205
- | 3.4552 | 8.7733 | 150000 | 3.2861 |
206
- | 3.47 | 8.8318 | 151000 | 3.2868 |
207
- | 3.4627 | 8.8903 | 152000 | 3.2854 | -->
208
-
209
- ### Training Loss
210
-
211
- ![image/png](https://cdn-uploads.huggingface.co/production/uploads/63ccee86374057a338e03c1e/970nr9bptjHSMsjLDHfaY.png)
212
 
213
- ## Validation Loss
214
 
215
- ![image/png](https://cdn-uploads.huggingface.co/production/uploads/63ccee86374057a338e03c1e/GUbnak7yV02vd0NZhbeEO.png)
216
 
 
217
 
218
- ### Framework versions
219
 
220
  - Transformers 4.45.2
221
  - Pytorch 2.5.0
222
  - Datasets 3.0.1
223
- - Tokenizers 0.20.1
 
3
  license: mit
4
  base_model: openai-community/gpt2
5
  tags:
6
+ - generated_from_trainer
7
  model-index:
8
+ - name: arabic-nano-gpt
9
+ results: []
10
  datasets:
11
+ - wikimedia/wikipedia
12
  language:
13
+ - ar
14
  ---
15
 
 
16
  # arabic-nano-gpt
17
 
18
+ This model is a fine-tuned version of [openai-community/gpt2](https://huggingface.co/openai-community/gpt2) on the Arabic [wikimedia/wikipedia](https://huggingface.co/datasets/wikimedia/wikipedia) dataset.
19
+
20
+ Repository on GitHub: [e-hossam96/arabic-nano-gpt](https://github.com/e-hossam96/arabic-nano-gpt.git)
21
+
22
+ The model achieves the following results on the held-out test set:
23
+
24
  - Loss: 3.28796
25
 
26
+ ## How to Use
27
 
28
+ ```python
29
+ import torch
30
+ from transformers import pipeline
31
 
32
+ model_ckpt = "e-hossam96/arabic-nano-gpt-v0"
33
+ device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
34
 
 
35
 
36
+ lm = pipeline(task="text-generation", model=model_ckpt, device=device)
37
 
38
+ prompt = """المحرك النفاث هو محرك ينفث الموائع (الماء أو الهواء) بسرعة فائقة \
39
+ لينتج قوة دافعة اعتمادا على مبدأ قانون نيوتن الثالث للحركة. \
40
+ هذا التعريف الواسع للمحركات النفاثة يتضمن أيضا"""
41
 
42
+ output = lm(prompt, max_new_tokens=128)
43
 
44
+ print(output[0]["generated_text"])
45
+ ```
46
+
47
+ ## Model description
48
 
49
+ - Embedding Size: 256
50
+ - Attention Heads: 4
51
+ - Attention Layers: 4
52
+
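To put the sizes listed above in perspective, the sketch below estimates how many parameters a GPT-2-style decoder has for a given embedding size and layer count. It assumes the standard GPT-2 block layout (fused QKV projection, 4x MLP expansion, tied input/output embeddings), which the model card does not confirm. The helper names `gpt2_block_params` and `gpt2_param_count` are hypothetical. The vocabulary and context sizes of this model are not stated, so only the non-embedding count is computed for it.

```python
def gpt2_block_params(n_embd: int) -> int:
    """Parameters in one GPT-2-style transformer block (assumed standard layout)."""
    d = n_embd
    attn = (d * 3 * d + 3 * d) + (d * d + d)      # fused QKV projection + output projection
    mlp = (d * 4 * d + 4 * d) + (4 * d * d + d)   # 4x expansion + projection back to d
    layer_norms = 2 * (2 * d)                     # two LayerNorms, weight + bias each
    return attn + mlp + layer_norms


def gpt2_param_count(n_embd: int, n_layer: int, vocab_size: int, n_positions: int) -> int:
    """Total parameters, with the output head tied to the token embeddings."""
    embeddings = vocab_size * n_embd + n_positions * n_embd
    final_layer_norm = 2 * n_embd
    return embeddings + n_layer * gpt2_block_params(n_embd) + final_layer_norm


# Values from the model card: embedding size 256, 4 heads, 4 layers.
head_dim = 256 // 4                                   # size of each attention head
non_embedding = 4 * gpt2_block_params(256) + 2 * 256  # blocks + final LayerNorm
print(head_dim)       # 64
print(non_embedding)  # 3159552
```

The number of attention heads does not change the parameter count; it only sets the per-head dimension. As a sanity check on the formula, `gpt2_param_count(768, 12, 50257, 1024)` gives 124,439,808, which matches the size of the base GPT-2 checkpoint.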
53
+ ## Training and evaluation data
54
+
55
+ The entire Arabic Wikipedia dataset was split into training, validation, and test sets in a 90-5-5 ratio.
56
+
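A 90-5-5 split like the one described above could be produced along the following lines. This is a minimal sketch, not the author's actual procedure. The helper `split_indices`, the shuffling step, the seed, and the naming of the three parts as train/validation/test are all assumptions made for illustration.

```python
import random


def split_indices(n_items, percents=(90, 5, 5), seed=0):
    """Shuffle item indices and cut them into three consecutive splits by percentage."""
    assert sum(percents) == 100
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)       # seeded shuffle (assumed, for reproducibility)
    n_train = n_items * percents[0] // 100     # integer arithmetic avoids float rounding
    n_valid = n_items * percents[1] // 100
    train = indices[:n_train]
    valid = indices[n_train:n_train + n_valid]
    test = indices[n_train + n_valid:]         # the remainder goes to the test split
    return train, valid, test


train_idx, valid_idx, test_idx = split_indices(1000)
print(len(train_idx), len(valid_idx), len(test_idx))  # 900 50 50
```

The returned index lists are disjoint and cover every item, so they can be used to select rows from any indexable dataset.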
57
+ ## Training hyperparameters
58
 
59
  The following hyperparameters were used during training:
60
+
61
  - learning_rate: 0.001
62
  - train_batch_size: 64
63
  - eval_batch_size: 64
 
69
  - lr_scheduler_warmup_ratio: 0.01
70
  - num_epochs: 24
71
 
72
+ ## Training Loss
73
 
74
+ ![Training Loss](assets/arabic-nano-gpt-v0-train-loss.png)
75
 
76
+ ## Validation Loss
77
 
78
+ ![Validation Loss](assets/arabic-nano-gpt-v0-eval-loss.png)
79
 
80
+ ## Framework versions
81
 
82
  - Transformers 4.45.2
83
  - Pytorch 2.5.0
84
  - Datasets 3.0.1
85
+ - Tokenizers 0.20.1
assets/arabic-nano-gpt-v0-eval-loss.png ADDED
assets/arabic-nano-gpt-v0-train-loss.png ADDED