temporary0-0name committed on
Commit 96fa43c · 1 parent: fb499d0

End of training

Files changed (1): README.md +23 -23
README.md CHANGED
@@ -17,7 +17,7 @@ should probably proofread and complete it, then remove this comment. -->
 
  This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) on the wikitext dataset.
  It achieves the following results on the evaluation set:
- - Loss: 0.1400
+ - Loss: 0.2449
 
  ## Model description
 
@@ -37,38 +37,38 @@ More information needed
 
  The following hyperparameters were used during training:
  - learning_rate: 0.0001
- - train_batch_size: 64
- - eval_batch_size: 64
+ - train_batch_size: 32
+ - eval_batch_size: 32
  - seed: 42
  - gradient_accumulation_steps: 8
- - total_train_batch_size: 512
+ - total_train_batch_size: 256
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: cosine
  - lr_scheduler_warmup_steps: 100
- - num_epochs: 10
+ - num_epochs: 5
 
  ### Training results
 
  | Training Loss | Epoch | Step | Validation Loss |
  |:-------------:|:-----:|:----:|:---------------:|
- | 9.1576 | 0.55 | 50 | 7.9518 |
- | 7.1585 | 1.1 | 100 | 6.6554 |
- | 6.4795 | 1.65 | 150 | 6.2877 |
- | 6.1004 | 2.19 | 200 | 5.8841 |
- | 5.3975 | 2.74 | 250 | 4.3378 |
- | 3.2884 | 3.29 | 300 | 2.1826 |
- | 1.7833 | 3.84 | 350 | 1.1134 |
- | 1.0053 | 4.39 | 400 | 0.6347 |
- | 0.6362 | 4.94 | 450 | 0.4108 |
- | 0.4388 | 5.49 | 500 | 0.2961 |
- | 0.3388 | 6.04 | 550 | 0.2316 |
- | 0.2713 | 6.58 | 600 | 0.1930 |
- | 0.235 | 7.13 | 650 | 0.1695 |
- | 0.2103 | 7.68 | 700 | 0.1550 |
- | 0.1953 | 8.23 | 750 | 0.1466 |
- | 0.1876 | 8.78 | 800 | 0.1422 |
- | 0.1834 | 9.33 | 850 | 0.1403 |
- | 0.1812 | 9.88 | 900 | 0.1400 |
+ | 9.1704 | 0.27 | 50 | 8.0057 |
+ | 7.2118 | 0.55 | 100 | 6.6834 |
+ | 6.5244 | 0.82 | 150 | 6.3491 |
+ | 6.2201 | 1.1 | 200 | 6.0229 |
+ | 5.7189 | 1.37 | 250 | 5.1311 |
+ | 4.1268 | 1.65 | 300 | 2.9582 |
+ | 2.4963 | 1.92 | 350 | 1.7429 |
+ | 1.5611 | 2.2 | 400 | 1.0743 |
+ | 1.0537 | 2.47 | 450 | 0.7155 |
+ | 0.7665 | 2.75 | 500 | 0.5189 |
+ | 0.5947 | 3.02 | 550 | 0.4061 |
+ | 0.4782 | 3.29 | 600 | 0.3396 |
+ | 0.4161 | 3.57 | 650 | 0.2976 |
+ | 0.3785 | 3.84 | 700 | 0.2718 |
+ | 0.3491 | 4.12 | 750 | 0.2567 |
+ | 0.3319 | 4.39 | 800 | 0.2488 |
+ | 0.3286 | 4.67 | 850 | 0.2455 |
+ | 0.326 | 4.94 | 900 | 0.2449 |
 
 
  ### Framework versions
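The diff halves the per-device batch size (64 → 32), which with the unchanged `gradient_accumulation_steps: 8` drops the effective batch from 512 to 256, matching the changed `total_train_batch_size` lines. It also halves `num_epochs` (10 → 5), which is why both runs end near step 900: half the batch means twice the steps per epoch. A quick sanity check of these derived numbers, assuming single-device training and that the reported validation loss is a mean token-level cross-entropy (so perplexity = exp(loss)):

```python
import math

# Effective batch size = per-device train batch size x gradient accumulation
# steps (single-device training assumed; multiply by device count otherwise).
old_effective = 64 * 8  # previous run: 512, matches old total_train_batch_size
new_effective = 32 * 8  # this run: 256, matches new total_train_batch_size

# If the reported validation loss is a mean cross-entropy per token,
# perplexity is simply its exponential.
old_ppl = math.exp(0.1400)  # previous final eval loss
new_ppl = math.exp(0.2449)  # new final eval loss

print(old_effective, new_effective)          # 512 256
print(round(old_ppl, 2), round(new_ppl, 2))  # 1.15 1.28
```

Note the shorter run stops at a higher validation loss (0.2449 vs. 0.1400); by step 900 the old schedule had already trained on twice as many tokens.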