Training in progress epoch 0

Files changed (10)

README.md +6 -35
checkpoint/extra_data.pickle +1 -1
checkpoint/weights.h5 +1 -1
config.json +1 -1
logs/train/events.out.tfevents.1674661877.ip-10-39-226-229.afrd.expertcity.com.6893.0.v2 +2 -2
logs/train/events.out.tfevents.1674736212.ip-10-39-226-229.afrd.expertcity.com.7323.0.v2 +3 -0
logs/validation/events.out.tfevents.1674663631.ip-10-39-226-229.afrd.expertcity.com.6893.1.v2 +2 -2
logs/validation/events.out.tfevents.1674737933.ip-10-39-226-229.afrd.expertcity.com.7323.1.v2 +3 -0
tf_model.h5 +1 -1
tokenizer_config.json +1 -1

README.md CHANGED

@@ -12,11 +12,11 @@ probably proofread and complete it, then remove this comment. -->
 # Ashraf-kasem/custom_gpt2_frames_text_continue
-This model is a fine-tuned version of [Ashraf-kasem/custom_gpt2_frames_text](https://huggingface.co/Ashraf-kasem/custom_gpt2_frames_text) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Train Loss: 1.0092
-- Validation Loss: 2.0758
-- Epoch: 29
 ## Model description
@@ -35,43 +35,14 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- optimizer: {'name': 'Adam', 'learning_rate': {'class_name': 'LinearWarmup', 'config': {'after_warmup_lr_sched': {'initial_learning_rate': 5e-05, 'decay_steps': 188670, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, 'warmup_steps': 18867, 'warmup_learning_rate': 0, 'name': None}}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False}
 - training_precision: mixed_float16
 ### Training results
 | Train Loss | Validation Loss | Epoch |
 |:----------:|:---------------:|:-----:|
-| 1.4198     | 2.1045          | 0     |
-| 1.5117     | 2.1357          | 1     |
-| 1.6065     | 2.1713          | 2     |
-| 1.6491     | 2.1594          | 3     |
-| 1.6151     | 2.1483          | 4     |
-| 1.5789     | 2.1319          | 5     |
-| 1.5417     | 2.1198          | 6     |
-| 1.5061     | 2.1073          | 7     |
-| 1.4719     | 2.1122          | 8     |
-| 1.4392     | 2.1024          | 9     |
-| 1.4078     | 2.0968          | 10    |
-| 1.3780     | 2.0914          | 11    |
-| 1.3493     | 2.0822          | 12    |
-| 1.3218     | 2.0823          | 13    |
-| 1.2953     | 2.0823          | 14    |
-| 1.2703     | 2.0777          | 15    |
-| 1.2454     | 2.0783          | 16    |
-| 1.2220     | 2.0789          | 17    |
-| 1.1994     | 2.0747          | 18    |
-| 1.1775     | 2.0737          | 19    |
-| 1.1565     | 2.0732          | 20    |
-| 1.1367     | 2.0730          | 21    |
-| 1.1170     | 2.0765          | 22    |
-| 1.0987     | 2.0750          | 23    |
-| 1.0813     | 2.0774          | 24    |
-| 1.0646     | 2.0732          | 25    |
-| 1.0483     | 2.0753          | 26    |
-| 1.0339     | 2.0769          | 27    |
-| 1.0207     | 2.0754          | 28    |
-| 1.0092     | 2.0758          | 29    |
 ### Framework versions

 # Ashraf-kasem/custom_gpt2_frames_text_continue
+This model is a fine-tuned version of [Ashraf-kasem/custom_gpt2_frames_text_continue](https://huggingface.co/Ashraf-kasem/custom_gpt2_frames_text_continue) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Train Loss: 1.0060
+- Validation Loss: 2.0768
+- Epoch: 0
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- optimizer: {'name': 'Adam', 'learning_rate': {'class_name': 'LinearWarmup', 'config': {'after_warmup_lr_sched': {'initial_learning_rate': 5e-05, 'decay_steps': 628900, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, 'warmup_steps': 125780, 'warmup_learning_rate': 0, 'name': None}}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False}
 - training_precision: mixed_float16
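The optimizer entry above nests a linear-decay schedule (`power: 1.0`, `end_learning_rate: 0.0`) inside a `LinearWarmup` wrapper. The following is a minimal pure-Python sketch of how such a schedule could evaluate, using the values from the new configuration. It assumes the warmup ramps linearly from `warmup_learning_rate` to whatever the decay schedule yields at `warmup_steps`, and that the decay is indexed by the global step; the function names are hypothetical and this is not the training code used for this commit.

```python
# Values copied from the new optimizer config in this diff.
INITIAL_LR = 5e-05
DECAY_STEPS = 628900
END_LR = 0.0
POWER = 1.0
WARMUP_STEPS = 125780
WARMUP_LR = 0.0


def polynomial_decay(step: int) -> float:
    """Polynomial decay without cycling; with POWER == 1.0 this is a straight line."""
    step = min(step, DECAY_STEPS)
    remaining = 1.0 - step / DECAY_STEPS
    return (INITIAL_LR - END_LR) * remaining ** POWER + END_LR


def linear_warmup(step: int) -> float:
    """Assumed warmup behaviour: ramp linearly to the decay value at WARMUP_STEPS."""
    if step < WARMUP_STEPS:
        target = polynomial_decay(WARMUP_STEPS)
        return WARMUP_LR + (step / WARMUP_STEPS) * (target - WARMUP_LR)
    return polynomial_decay(step)


peak_lr = linear_warmup(WARMUP_STEPS)
final_lr = linear_warmup(DECAY_STEPS)
```

Under these assumptions the warmup covers the first 20% of the schedule (125780 / 628900), so the learning rate would peak below the nominal `initial_learning_rate` before decaying to zero.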
 ### Training results
 | Train Loss | Validation Loss | Epoch |
 |:----------:|:---------------:|:-----:|
+| 1.0060     | 2.0768          | 0     |
 ### Framework versions

checkpoint/extra_data.pickle CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:92be69d263591a1180f89b16a946e2b41150fb8dcc896d123cd90071ea181639
 size 748191129

 version https://git-lfs.github.com/spec/v1
+oid sha256:7b4b7b8f3a14f569b80d50d6c156fcf32032e43828d63b1f4f1461981b69c085
 size 748191129

checkpoint/weights.h5 CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:223df2fd46d0cb573ebcfd781c618c615e86dd51bca8ec63b96e96b8ae707557
 size 374265936

 version https://git-lfs.github.com/spec/v1
+oid sha256:ef8b15e4e620394ad4165ef7dedff287675f33cf8a2a717e01c00f82e6e604cc
 size 374265936

config.json CHANGED

@@ -1,5 +1,5 @@
 {
-  "_name_or_path": "Ashraf-kasem/custom_gpt2_frames_text",
   "activation_function": "gelu_new",
   "architectures": [
     "GPT2LMHeadModel"

 {
+  "_name_or_path": "Ashraf-kasem/custom_gpt2_frames_text_continue",
   "activation_function": "gelu_new",
   "architectures": [
     "GPT2LMHeadModel"

logs/train/events.out.tfevents.1674661877.ip-10-39-226-229.afrd.expertcity.com.6893.0.v2 CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f867aeb52a7dc5ad922077a0b1974c6a5f7d77e9f962e184be018f0816c13b35
-size 4870291

 version https://git-lfs.github.com/spec/v1
+oid sha256:a81cf23c00c318ca1004032f0716d1e1d55e6229365e8bc77ceb0ac10092f403
+size 4995163

logs/train/events.out.tfevents.1674736212.ip-10-39-226-229.afrd.expertcity.com.7323.0.v2 ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:36414f485dffac7f316ee6a2b2308292cdd2e903f9de680d37cec1964cbd913e
+size 1249303

logs/validation/events.out.tfevents.1674663631.ip-10-39-226-229.afrd.expertcity.com.6893.1.v2 CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:a8c22270c65283edcc50ab21e090ec10c2bbb0a3aa06c5f887104de688919f14
-size 4678

 version https://git-lfs.github.com/spec/v1
+oid sha256:ef55d729af5843305dd77b375eafb470d1a72942ab3c3a8d9341434a27fff684
+size 4746

logs/validation/events.out.tfevents.1674737933.ip-10-39-226-229.afrd.expertcity.com.7323.1.v2 ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f92a7cada68d0d4c634e94e2b1ead7938cf2a677574dba72046f8e904d878246
+size 128

tf_model.h5 CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:223df2fd46d0cb573ebcfd781c618c615e86dd51bca8ec63b96e96b8ae707557
 size 374265936

 version https://git-lfs.github.com/spec/v1
+oid sha256:ef8b15e4e620394ad4165ef7dedff287675f33cf8a2a717e01c00f82e6e604cc
 size 374265936
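The binary files in this commit are stored as Git LFS pointers: three `key value` lines giving the spec version, a `sha256` object ID, and the size in bytes. As a rough illustration, the sketch below parses such a pointer and compares the new `tf_model.h5` pointer with the new `checkpoint/weights.h5` pointer from this diff. The helper name `parse_lfs_pointer` is hypothetical, not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# New-side pointers copied from this diff.
weights = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:ef8b15e4e620394ad4165ef7dedff287675f33cf8a2a717e01c00f82e6e604cc\n"
    "size 374265936\n"
)
tf_model = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:ef8b15e4e620394ad4165ef7dedff287675f33cf8a2a717e01c00f82e6e604cc\n"
    "size 374265936\n"
)

# Identical object ID and size indicate both paths point at the same stored blob.
same_blob = weights == tf_model
```

Because the object ID and size match before and after the change for both paths, `tf_model.h5` and `checkpoint/weights.h5` appear to reference the same weights file in this commit.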

tokenizer_config.json CHANGED

@@ -2,7 +2,7 @@
   "bos_token": "<EOS>",
   "eos_token": "<EOS>",
   "model_max_length": 1000000000000000019884624838656,
-  "name_or_path": "Ashraf-kasem/custom_gpt2_frames_text",
   "special_tokens_map_file": "frames_text_cleaned_dataset_tokenizer/special_tokens_map.json",
   "tokenizer_class": "PreTrainedTokenizerFast"
 }

   "bos_token": "<EOS>",
   "eos_token": "<EOS>",
   "model_max_length": 1000000000000000019884624838656,
+  "name_or_path": "Ashraf-kasem/custom_gpt2_frames_text_continue",
   "special_tokens_map_file": "frames_text_cleaned_dataset_tokenizer/special_tokens_map.json",
   "tokenizer_class": "PreTrainedTokenizerFast"
 }
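The `model_max_length` value in `tokenizer_config.json` looks arbitrary but is what `int(1e30)` evaluates to once the float is converted exactly. It most likely acts as a "no limit configured" sentinel rather than a real sequence length. A small sketch, assuming that interpretation (the helper name is hypothetical):

```python
# 1e30 is not exactly representable as a double, so converting it to int
# exposes the rounding error in the low-order digits.
NO_LIMIT_SENTINEL = int(1e30)


def has_explicit_max_length(tokenizer_config: dict) -> bool:
    """Return True if the config sets a real limit rather than the sentinel."""
    value = tokenizer_config.get("model_max_length", NO_LIMIT_SENTINEL)
    return value < NO_LIMIT_SENTINEL


# Value copied from the tokenizer_config.json shown in this diff.
config = {"model_max_length": 1000000000000000019884624838656}
explicit = has_explicit_max_length(config)
```

If that reading is right, callers would need to pass an explicit maximum length when truncating inputs for this tokenizer.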