End of training

Files changed (4)

README.md +21 -27
model-00001-of-00003.safetensors +1 -1
model-00002-of-00003.safetensors +1 -1
model-00003-of-00003.safetensors +1 -1

README.md CHANGED

@@ -21,8 +21,8 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [mistralai/Mistral-7B-Instruct-v0.1](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1) on the super_glue dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.3115
-- Accuracy: 0.8735
 ## Model description
@@ -44,7 +44,7 @@ The following hyperparameters were used during training:
 - learning_rate: 1e-05
 - train_batch_size: 2
 - eval_batch_size: 4
-- seed: 1
 - distributed_type: multi-GPU
 - num_devices: 2
 - gradient_accumulation_steps: 2
@@ -58,30 +58,24 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss | Accuracy |
 |:-------------:|:-----:|:----:|:---------------:|:--------:|
-| 0.369         | 0.05  | 50   | 0.4328          | 0.8120   |
-| 0.2708        | 0.1   | 100  | 0.4051          | 0.8283   |
-| 0.6276        | 0.15  | 150  | 0.4020          | 0.8452   |
-| 0.4395        | 0.2   | 200  | 0.3671          | 0.8452   |
-| 0.3282        | 0.25  | 250  | 0.3746          | 0.8445   |
-| 0.2967        | 0.3   | 300  | 0.3557          | 0.8523   |
-| 0.2483        | 0.35  | 350  | 0.3862          | 0.8622   |
-| 0.384         | 0.4   | 400  | 0.3765          | 0.8565   |
-| 0.334         | 0.45  | 450  | 0.3628          | 0.8601   |
-| 0.2671        | 0.5   | 500  | 0.3290          | 0.8664   |
-| 0.2478        | 0.55  | 550  | 0.3421          | 0.8650   |
-| 0.1814        | 0.6   | 600  | 0.3233          | 0.8693   |
-| 0.3332        | 0.65  | 650  | 0.3451          | 0.8728   |
-| 0.2063        | 0.7   | 700  | 0.3709          | 0.8678   |
-| 0.2614        | 0.75  | 750  | 0.3530          | 0.8763   |
-| 0.4273        | 0.8   | 800  | 0.3383          | 0.8721   |
-| 0.1319        | 0.85  | 850  | 0.3360          | 0.8735   |
-| 0.196         | 0.9   | 900  | 0.3096          | 0.8735   |
-| 0.3564        | 0.95  | 950  | 0.3354          | 0.8770   |
-| 0.3145        | 1.0   | 1000 | 0.3421          | 0.8784   |
-| 0.1344        | 1.05  | 1050 | 0.4273          | 0.8735   |
-| 0.4227        | 1.1   | 1100 | 0.3555          | 0.8707   |
-| 0.1696        | 1.15  | 1150 | 0.3399          | 0.8742   |
-| 0.5423        | 1.2   | 1200 | 0.3405          | 0.8813   |
 ### Framework versions

 This model is a fine-tuned version of [mistralai/Mistral-7B-Instruct-v0.1](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1) on the super_glue dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.3297
+- Accuracy: 0.8700
 ## Model description
 - learning_rate: 1e-05
 - train_batch_size: 2
 - eval_batch_size: 4
+- seed: 2
 - distributed_type: multi-GPU
 - num_devices: 2
 - gradient_accumulation_steps: 2
 | Training Loss | Epoch | Step | Validation Loss | Accuracy |
 |:-------------:|:-----:|:----:|:---------------:|:--------:|
+| 0.4632        | 0.05  | 50   | 0.4840          | 0.7958   |
+| 0.3453        | 0.1   | 100  | 0.3888          | 0.8226   |
+| 0.2722        | 0.15  | 150  | 0.3590          | 0.8396   |
+| 0.3266        | 0.2   | 200  | 0.3811          | 0.8459   |
+| 0.3699        | 0.25  | 250  | 0.3534          | 0.8438   |
+| 0.3554        | 0.3   | 300  | 0.3378          | 0.8565   |
+| 0.1229        | 0.35  | 350  | 0.3368          | 0.8643   |
+| 0.3522        | 0.4   | 400  | 0.3424          | 0.8643   |
+| 0.2548        | 0.45  | 450  | 0.3467          | 0.8664   |
+| 0.2119        | 0.5   | 500  | 0.3439          | 0.8714   |
+| 0.2113        | 0.55  | 550  | 0.3518          | 0.8657   |
+| 0.2122        | 0.6   | 600  | 0.3110          | 0.8770   |
+| 0.3251        | 0.65  | 650  | 0.3323          | 0.8728   |
+| 0.2904        | 0.7   | 700  | 0.3152          | 0.8792   |
+| 0.6366        | 0.75  | 750  | 0.3502          | 0.8763   |
+| 0.4161        | 0.8   | 800  | 0.3250          | 0.8806   |
+| 0.1605        | 0.85  | 850  | 0.3258          | 0.8834   |
+| 0.271         | 0.9   | 900  | 0.3330          | 0.8848   |
 ### Framework versions
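
For reference, the hyperparameters shown in this diff imply an effective training batch size. The following is a minimal sketch, assuming the common Hugging Face Trainer convention that `train_batch_size` is per device and that the effective batch is per-device batch × devices × accumulation steps. The variable names are illustrative, and the examples-per-epoch figure is only an estimate read from the seed-1 results table, where step 1000 is logged at epoch 1.0.

```python
# Values copied from the hyperparameter list in this diff.
train_batch_size = 2             # assumed to be the per-device batch size
num_devices = 2
gradient_accumulation_steps = 2

# Assumption: effective batch = per-device batch * devices * accumulation steps.
effective_batch_size = train_batch_size * num_devices * gradient_accumulation_steps

# Estimate from the seed-1 table: step 1000 is logged at epoch 1.0.
steps_per_epoch = round(1000 / 1.0)
approx_examples_per_epoch = steps_per_epoch * effective_batch_size

print(effective_batch_size)
print(approx_examples_per_epoch)
```

Because the epoch column is rounded to two decimals, the per-epoch example count should be read as approximate.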

model-00001-of-00003.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:e019a5c835a08dc82fc4526058a1a1f832185a6aa75814f44319b088de4dcdab
 size 4943163992

 version https://git-lfs.github.com/spec/v1
+oid sha256:090ecb4822b1506c0dcb96ea17376b9a148335897194ddc19e9c7c9f732d73c1
 size 4943163992

model-00002-of-00003.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:2ec161cb1d01a097e6c7c6655e1323d9fe2ac8f828151c9fce8cec20189dd520
 size 4999821144

 version https://git-lfs.github.com/spec/v1
+oid sha256:3a8470935908d2fcec9ea4a4424652af2665b2a2aed595715fa77c625bfd628d
 size 4999821144

model-00003-of-00003.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:6e06a98b0bb7e099fc353489ada758a39a18535ec8f1278a2cddbc763d77e039
 size 4540517840

 version https://git-lfs.github.com/spec/v1
+oid sha256:752e20ee5f66e06138cd9dd583d0cc7f565868f05e8ded361ecfb5ce861e5add
 size 4540517840
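
The three `.safetensors` entries above are Git LFS pointer files: only the `oid sha256:` line changes, while `size` stays the same. As a hedged illustration of how a downloaded shard could be checked against such a pointer, here is a small sketch. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and not part of any library; the sample pointer text is copied from the new side of the first shard's diff.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and SHA-256 in `pointer`."""
    if pointer["algo"] != "sha256":
        raise ValueError("unsupported hash algorithm: " + pointer["algo"])
    hasher = hashlib.sha256()
    size = 0
    with open(path, "rb") as f:
        # Read in chunks so multi-gigabyte shards are not loaded into memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


# Pointer for model-00001-of-00003.safetensors after this commit.
pointer = parse_lfs_pointer(
    """version https://git-lfs.github.com/spec/v1
oid sha256:090ecb4822b1506c0dcb96ea17376b9a148335897194ddc19e9c7c9f732d73c1
size 4943163992"""
)
```

Calling `matches_pointer("model-00001-of-00003.safetensors", pointer)` on a local copy would then report whether that file corresponds to this commit's shard; the local path is an assumption about where the file was saved.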