e-hossam96/arabic-nano-gpt-v0

e-hossam96 committed on Oct 23, 2024

Commit b225392 (verified) · 1 Parent(s): b19a398

End of training

Files changed (3)

README.md +3 -20
model.safetensors +1 -1
training_args.bin +1 -1

README.md CHANGED

@@ -15,8 +15,6 @@ should probably proofread and complete it, then remove this comment. -->
 # arabic-nano-gpt
 This model is a fine-tuned version of [openai-community/gpt2](https://huggingface.co/openai-community/gpt2) on an unknown dataset.
-It achieves the following results on the evaluation set:
-- Loss: 4.7743
 ## Model description
@@ -37,32 +35,17 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 0.001
 - train_batch_size: 32
-- eval_batch_size: 32
+- eval_batch_size: 64
 - seed: 42
 - gradient_accumulation_steps: 8
 - total_train_batch_size: 256
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs: 2
+- lr_scheduler_warmup_ratio: 0.01
+- num_epochs: 60
 ### Training results
-| Training Loss | Epoch  | Step | Validation Loss |
-|:-------------:|:------:|:----:|:---------------:|
-| 7.3768        | 0.1422 | 100  | 6.7464          |
-| 6.5028        | 0.2844 | 200  | 6.2504          |
-| 6.0865        | 0.4266 | 300  | 5.8534          |
-| 5.7563        | 0.5688 | 400  | 5.5491          |
-| 5.5138        | 0.7110 | 500  | 5.3476          |
-| 5.3615        | 0.8532 | 600  | 5.2035          |
-| 5.2481        | 0.9954 | 700  | 5.0965          |
-| 5.1406        | 1.1376 | 800  | 5.0118          |
-| 5.0665        | 1.2798 | 900  | 4.9467          |
-| 5.015         | 1.4220 | 1000 | 4.8900          |
-| 4.9767        | 1.5642 | 1100 | 4.8476          |
-| 4.9434        | 1.7064 | 1200 | 4.8122          |
-| 4.9137        | 1.8486 | 1300 | 4.7864          |
-| 4.9069        | 1.9908 | 1400 | 4.7743          |
 ### Framework versions
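
The README values in this diff can be sanity-checked with a little arithmetic. The sketch below is not taken from the repository; the function names are hypothetical. It assumes a single training device, a loss reported as mean per-token cross-entropy in nats, and a linear schedule that warms up from zero and then decays to zero. The figure of roughly 703 optimizer steps per epoch is inferred from the removed results table (step 1400 at epoch 1.9908) and only holds if the 60-epoch run used the same dataset and batch settings.

```python
import math


def effective_batch_size(per_device: int, grad_accum: int, num_devices: int = 1) -> int:
    """Samples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device * grad_accum * num_devices


def perplexity(cross_entropy_loss: float) -> float:
    """Perplexity for a mean per-token cross-entropy loss measured in nats."""
    return math.exp(cross_entropy_loss)


def linear_lr(step: int, total_steps: int, base_lr: float = 1e-3,
              warmup_ratio: float = 0.01) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: ramp linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# train_batch_size=32 with gradient_accumulation_steps=8, as listed in the README.
print("total_train_batch_size:", effective_batch_size(32, 8))

# Final validation loss from the removed 2-epoch results table.
print("perplexity at loss 4.7743:", round(perplexity(4.7743), 1))

# Inferred, not stated in the source: steps per epoch from the removed table.
steps_per_epoch = round(1400 / 1.9908)
total_steps = steps_per_epoch * 60  # num_epochs: 60 in the new README
print("estimated warmup steps:", math.ceil(total_steps * 0.01))
print("lr halfway through training:", linear_lr(total_steps // 2, total_steps))
```

With a warmup ratio of 0.01 the warmup covers only about 1% of the run, so nearly all of the 60 epochs are spent on the linear decay from the 0.001 peak.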

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:4ef03bee4cdac781668e18b4eb3ac55e84d6086f3b666d99a7211dc1499c90b7
+oid sha256:d7cc702d7abb8ff0819b4a0e3c4f6cee9201e0cbfee91c8c23defce2af264894
 size 22080496

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:12e57a589058865d432b375c24c25ef67764e80778f2e51d8c95b8b5339317dc
+oid sha256:b41545602c6b409f6904cbad620cda90ef2f7225d3f2d6604aee1fe37d86e48e
 size 5240
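
Both binary files are stored as Git LFS pointers, so the diff shows only a changed `oid` while `size` stays the same. A downloaded file can be checked against such a pointer by comparing its byte size and SHA-256 digest. The following is a minimal sketch using only the Python standard library; `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the local path in the usage comment is an assumption.

```python
import hashlib
from pathlib import Path

# New pointer for training_args.bin, copied from the diff above.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:b41545602c6b409f6904cbad620cda90ef2f7225d3f2d6604aee1fe37d86e48e
size 5240
"""


def parse_lfs_pointer(text: str) -> dict:
    """Parse the `key value` lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid field has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "oid": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict) -> bool:
    """Return True if the file at `path` has the pointer's size and digest."""
    file_path = Path(path)
    if file_path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with file_path.open("rb") as handle:
        # Hash in 1 MiB chunks so large weight files are not loaded at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["oid"]


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algo"], pointer["size"])
# Usage, assuming the file was downloaded to the working directory:
# matches_pointer("training_args.bin", pointer)
```

An unchanged `size` with a new `oid` is consistent with the same architecture being saved with different weights, although the diff alone does not confirm that.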