osei1819 committed on
Commit 7e37eac · verified · 1 Parent(s): a79ec7a

End of training

Files changed (1)
  1. README.md +13 -8
README.md CHANGED
@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [microsoft/phi-1_5](https://huggingface.co/microsoft/phi-1_5) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 3.8786
+- Loss: 0.3498
 
 ## Model description
 
@@ -41,21 +41,26 @@ The following hyperparameters were used during training:
 - seed: 42
 - gradient_accumulation_steps: 16
 - total_train_batch_size: 64
-- optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
+- optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 500
-- num_epochs: 5
+- num_epochs: 10
 - mixed_precision_training: Native AMP
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
-| No log | 0.9275 | 4 | 4.5963 |
-| No log | 1.9275 | 8 | 4.5550 |
-| 5.207 | 2.9275 | 12 | 4.4240 |
-| 5.207 | 3.9275 | 16 | 4.1611 |
-| 4.7183 | 4.9275 | 20 | 3.8786 |
+| No log | 0.9275 | 4 | 5.3873 |
+| No log | 1.9275 | 8 | 5.1770 |
+| 6.4196 | 2.9275 | 12 | 4.5702 |
+| 6.4196 | 3.9275 | 16 | 3.1211 |
+| 4.6448 | 4.9275 | 20 | 2.0151 |
+| 4.6448 | 5.9275 | 24 | 0.5937 |
+| 4.6448 | 6.9275 | 28 | 0.4527 |
+| 0.9737 | 7.9275 | 32 | 0.4155 |
+| 0.9737 | 8.9275 | 36 | 0.3759 |
+| 0.4308 | 9.9275 | 40 | 0.3498 |
 
 
 ### Framework versions
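
Review note: the hyperparameters and results in this diff can be sanity-checked with a little arithmetic. The Python sketch below is illustrative only and is not part of the commit. The helper names `micro_batch_size`, `linear_warmup_factor`, and `perplexity` are hypothetical. It assumes that the total batch size is the per-pass batch multiplied by the accumulation steps, that the linear scheduler ramps the learning rate proportionally to the step count during warmup, and that the reported loss is a mean cross-entropy in nats.

```python
import math

# Values copied from the README diff on this page.
GRAD_ACCUM_STEPS = 16
TOTAL_TRAIN_BATCH_SIZE = 64
WARMUP_STEPS = 500
FINAL_STEP = 40          # last row of the new training-results table
EVAL_LOSS_OLD = 3.8786   # removed line
EVAL_LOSS_NEW = 0.3498   # added line


def micro_batch_size(total_batch: int, accum_steps: int) -> int:
    """Samples per forward/backward pass, summed over all devices.

    Assumes total_batch = micro_batch * accum_steps.
    """
    if total_batch % accum_steps != 0:
        raise ValueError("total batch size is not a multiple of accumulation steps")
    return total_batch // accum_steps


def linear_warmup_factor(step: int, warmup_steps: int) -> float:
    """Fraction of the peak learning rate at `step` during linear warmup.

    Only the warmup phase is modelled; the decay after warmup is not.
    """
    return min(1.0, step / max(1, warmup_steps))


def perplexity(loss: float) -> float:
    """Perplexity, assuming `loss` is a mean cross-entropy in nats."""
    return math.exp(loss)


if __name__ == "__main__":
    print("micro-batch size:", micro_batch_size(TOTAL_TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    print("warmup factor at final step:", linear_warmup_factor(FINAL_STEP, WARMUP_STEPS))
    print("perplexity before:", round(perplexity(EVAL_LOSS_OLD), 2))
    print("perplexity after:", round(perplexity(EVAL_LOSS_NEW), 2))
```

If these assumptions hold, the run stops at step 40 of a 500-step warmup, so the learning rate would still be ramping up when training ends. That may be worth checking against the actual training configuration before reading too much into the loss curve.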