|
---
license: mit
tags:
- generated_from_keras_callback
model-index:
- name: Ashraf-kasem/custom_gpt2_frames_text_continue
  results: []
---
|
|
|
|
|
|
# Ashraf-kasem/custom_gpt2_frames_text_continue |
|
|
|
This model was produced by continuing the fine-tuning of [Ashraf-kasem/custom_gpt2_frames_text_continue](https://huggingface.co/Ashraf-kasem/custom_gpt2_frames_text_continue), i.e. training was resumed from this repository's own earlier checkpoint, on an unspecified dataset.
|
It achieves the following results at the final training epoch:
|
- Train Loss: 0.6337 |
|
- Validation Loss: 2.3028 |
|
- Epoch: 99 |
|
|
|
## Model description |
|
|
|
More information needed |
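
Pending a fuller description, here is a minimal inference sketch (not part of the original card), assuming the repository hosts a standard TF GPT-2 checkpoint loadable through the Transformers auto classes, with the pinned versions listed under Framework versions below:

```python
# Minimal generation example; the prompt string is an arbitrary placeholder.
from transformers import AutoTokenizer, TFAutoModelForCausalLM

model_id = "Ashraf-kasem/custom_gpt2_frames_text_continue"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = TFAutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("example prompt", return_tensors="tf")
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```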
|
|
|
## Intended uses & limitations |
|
|
|
More information needed. One limitation is visible in the training log below: validation loss climbs steadily (from about 2.08 at epoch 0 to about 2.30 at epoch 99) while training loss falls, which suggests the final checkpoint is overfit to its training data; an earlier epoch's checkpoint may generalize better.
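
If the model is retrained, a standard Keras guard against this kind of divergence (a sketch assuming a typical `model.fit` loop; not part of the original pipeline) is early stopping on validation loss:

```python
import tensorflow as tf

# Stop training once validation loss stops improving and roll back to the
# best epoch; the patience value here is an arbitrary illustration.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=3, restore_best_weights=True
)
# model.fit(train_ds, validation_data=val_ds, callbacks=[early_stop])
```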
|
|
|
## Training and evaluation data |
|
|
|
More information needed |
|
|
|
## Training procedure |
|
|
|
### Training hyperparameters |
|
|
|
The following hyperparameters were used during training: |
|
- optimizer: Adam (beta_1: 0.9, beta_2: 0.999, epsilon: 1e-07, decay: 0.0, amsgrad: False)
- learning-rate schedule: linear warmup from 0 to 5e-05 over the first 125,780 steps, followed by a linear (polynomial, power 1.0, no cycling) decay from 5e-05 to 0.0 over 628,900 decay steps (a reconstruction sketch follows this list)
|
- training_precision: mixed_float16 |
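
The following is a reconstruction sketch of that configuration, not the original training script: it uses plain `tf.keras`, whereas the `LinearWarmup` class name in the config dump suggests the run used a library helper (for example TF Model Garden's optimization package).

```python
import tensorflow as tf

# Matches training_precision: mixed_float16.
tf.keras.mixed_precision.set_global_policy("mixed_float16")


class LinearWarmup(tf.keras.optimizers.schedules.LearningRateSchedule):
    """Ramp the LR linearly from `warmup_lr` up to the post-warmup
    schedule's starting value, then hand over to that schedule."""

    def __init__(self, after_warmup, warmup_steps, warmup_lr=0.0):
        self.after_warmup = after_warmup
        self.warmup_steps = warmup_steps
        self.warmup_lr = warmup_lr

    def __call__(self, step):
        step = tf.cast(step, tf.float32)
        warmup_steps = tf.cast(self.warmup_steps, tf.float32)
        target = self.after_warmup(0)  # LR reached at the end of warmup
        ramp = self.warmup_lr + (target - self.warmup_lr) * (step / warmup_steps)
        # NOTE: whether the decay clock starts at step 0 or at the end of
        # warmup differs between implementations; this sketch starts it
        # after warmup.
        return tf.cond(
            step < warmup_steps,
            lambda: ramp,
            lambda: self.after_warmup(step - warmup_steps),
        )


# Linear (power=1.0, no cycling) decay from 5e-05 to 0.0 over 628,900 steps.
decay = tf.keras.optimizers.schedules.PolynomialDecay(
    initial_learning_rate=5e-05,
    decay_steps=628_900,
    end_learning_rate=0.0,
    power=1.0,
    cycle=False,
)
schedule = LinearWarmup(decay, warmup_steps=125_780, warmup_lr=0.0)

optimizer = tf.keras.optimizers.Adam(
    learning_rate=schedule,
    beta_1=0.9,
    beta_2=0.999,
    epsilon=1e-07,
    amsgrad=False,
)
```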
|
|
|
### Training results |
|
|
|
| Train Loss | Validation Loss | Epoch |
|:----------:|:---------------:|:-----:|
| 1.0060 | 2.0768 | 0 |
| 1.0147 | 2.0771 | 1 |
| 1.0238 | 2.0821 | 2 |
| 1.0331 | 2.0851 | 3 |
| 1.0422 | 2.0870 | 4 |
| 1.0525 | 2.0945 | 5 |
| 1.0618 | 2.1005 | 6 |
| 1.0718 | 2.1014 | 7 |
| 1.0823 | 2.1056 | 8 |
| 1.0921 | 2.1099 | 9 |
| 1.1028 | 2.1106 | 10 |
| 1.1127 | 2.1127 | 11 |
| 1.1230 | 2.1183 | 12 |
| 1.1329 | 2.1207 | 13 |
| 1.1423 | 2.1270 | 14 |
| 1.1521 | 2.1234 | 15 |
| 1.1614 | 2.1283 | 16 |
| 1.1700 | 2.1236 | 17 |
| 1.1784 | 2.1320 | 18 |
| 1.1864 | 2.1359 | 19 |
| 1.1873 | 2.1272 | 20 |
| 1.1766 | 2.1250 | 21 |
| 1.1652 | 2.1260 | 22 |
| 1.1537 | 2.1224 | 23 |
| 1.1415 | 2.1278 | 24 |
| 1.1296 | 2.1254 | 25 |
| 1.1178 | 2.1213 | 26 |
| 1.1059 | 2.1301 | 27 |
| 1.0950 | 2.1253 | 28 |
| 1.0838 | 2.1264 | 29 |
| 1.0729 | 2.1273 | 30 |
| 1.0625 | 2.1355 | 31 |
| 1.0519 | 2.1345 | 32 |
| 1.0414 | 2.1364 | 33 |
| 1.0317 | 2.1324 | 34 |
| 1.0217 | 2.1410 | 35 |
| 1.0126 | 2.1428 | 36 |
| 1.0027 | 2.1427 | 37 |
| 0.9936 | 2.1494 | 38 |
| 0.9846 | 2.1502 | 39 |
| 0.9752 | 2.1490 | 40 |
| 0.9665 | 2.1501 | 41 |
| 0.9582 | 2.1552 | 42 |
| 0.9497 | 2.1533 | 43 |
| 0.9411 | 2.1621 | 44 |
| 0.9331 | 2.1618 | 45 |
| 0.9248 | 2.1655 | 46 |
| 0.9172 | 2.1755 | 47 |
| 0.9093 | 2.1759 | 48 |
| 0.9014 | 2.1751 | 49 |
| 0.8942 | 2.1813 | 50 |
| 0.8867 | 2.1831 | 51 |
| 0.8795 | 2.1856 | 52 |
| 0.8723 | 2.1909 | 53 |
| 0.8651 | 2.1950 | 54 |
| 0.8581 | 2.1955 | 55 |
| 0.8511 | 2.2007 | 56 |
| 0.8444 | 2.2002 | 57 |
| 0.8380 | 2.2078 | 58 |
| 0.8312 | 2.2077 | 59 |
| 0.8246 | 2.2161 | 60 |
| 0.8186 | 2.2103 | 61 |
| 0.8120 | 2.2180 | 62 |
| 0.8053 | 2.2202 | 63 |
| 0.7994 | 2.2232 | 64 |
| 0.7934 | 2.2290 | 65 |
| 0.7872 | 2.2301 | 66 |
| 0.7816 | 2.2327 | 67 |
| 0.7757 | 2.2369 | 68 |
| 0.7698 | 2.2408 | 69 |
| 0.7640 | 2.2439 | 70 |
| 0.7582 | 2.2451 | 71 |
| 0.7528 | 2.2505 | 72 |
| 0.7475 | 2.2524 | 73 |
| 0.7420 | 2.2520 | 74 |
| 0.7366 | 2.2561 | 75 |
| 0.7313 | 2.2616 | 76 |
| 0.7260 | 2.2628 | 77 |
| 0.7211 | 2.2654 | 78 |
| 0.7158 | 2.2701 | 79 |
| 0.7107 | 2.2704 | 80 |
| 0.7061 | 2.2743 | 81 |
| 0.7008 | 2.2749 | 82 |
| 0.6962 | 2.2769 | 83 |
| 0.6916 | 2.2813 | 84 |
| 0.6869 | 2.2838 | 85 |
| 0.6823 | 2.2853 | 86 |
| 0.6780 | 2.2867 | 87 |
| 0.6737 | 2.2883 | 88 |
| 0.6691 | 2.2921 | 89 |
| 0.6651 | 2.2931 | 90 |
| 0.6608 | 2.2946 | 91 |
| 0.6568 | 2.2957 | 92 |
| 0.6533 | 2.2984 | 93 |
| 0.6494 | 2.2981 | 94 |
| 0.6459 | 2.2994 | 95 |
| 0.6425 | 2.3006 | 96 |
| 0.6395 | 2.3019 | 97 |
| 0.6363 | 2.3026 | 98 |
| 0.6337 | 2.3028 | 99 |
|
|
|
|
|
### Framework versions |
|
|
|
- Transformers 4.25.1 |
|
- TensorFlow 2.9.0 |
|
- Datasets 2.8.0 |
|
- Tokenizers 0.13.2 |
|
|