End of training

Files changed (4)

README.md +27 -28
model-00001-of-00003.safetensors +1 -1
model-00002-of-00003.safetensors +1 -1
model-00003-of-00003.safetensors +1 -1

README.md CHANGED

@@ -21,8 +21,8 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [mistralai/Mistral-7B-Instruct-v0.1](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1) on the super_glue dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.3093
-- Accuracy: 0.8707
 ## Model description
@@ -44,7 +44,7 @@ The following hyperparameters were used during training:
 - learning_rate: 1e-05
 - train_batch_size: 2
 - eval_batch_size: 4
-- seed: 0
 - distributed_type: multi-GPU
 - num_devices: 2
 - gradient_accumulation_steps: 2
@@ -58,31 +58,30 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss | Accuracy |
 |:-------------:|:-----:|:----:|:---------------:|:--------:|
-| 0.5295        | 0.05  | 50   | 0.4429          | 0.8134   |
-| 0.3797        | 0.1   | 100  | 0.4171          | 0.8269   |
-| 0.1387        | 0.15  | 150  | 0.4193          | 0.8325   |
-| 0.3203        | 0.2   | 200  | 0.3964          | 0.8417   |
-| 0.2488        | 0.25  | 250  | 0.3915          | 0.8389   |
-| 0.2149        | 0.3   | 300  | 0.3729          | 0.8438   |
-| 0.2169        | 0.35  | 350  | 0.3606          | 0.8601   |
-| 0.4027        | 0.4   | 400  | 0.3680          | 0.8657   |
-| 0.3494        | 0.45  | 450  | 0.4100          | 0.8601   |
-| 0.1724        | 0.5   | 500  | 0.3634          | 0.8643   |
-| 0.346         | 0.55  | 550  | 0.4234          | 0.8671   |
-| 0.2071        | 0.6   | 600  | 0.3717          | 0.8629   |
-| 0.2983        | 0.65  | 650  | 0.3482          | 0.8622   |
-| 0.1517        | 0.7   | 700  | 0.3576          | 0.8728   |
-| 0.4414        | 0.75  | 750  | 0.3222          | 0.8650   |
-| 0.4264        | 0.8   | 800  | 0.3564          | 0.8820   |
-| 0.1045        | 0.85  | 850  | 0.3393          | 0.8721   |
-| 0.4414        | 0.9   | 900  | 0.3743          | 0.8671   |
-| 0.4848        | 0.95  | 950  | 0.3192          | 0.8657   |
-| 0.3788        | 1.0   | 1000 | 0.3339          | 0.8756   |
-| 0.6698        | 1.05  | 1050 | 0.3449          | 0.8728   |
-| 0.1779        | 1.1   | 1100 | 0.3368          | 0.8792   |
-| 0.397         | 1.15  | 1150 | 0.3678          | 0.8792   |
-| 0.0648        | 1.2   | 1200 | 0.3712          | 0.8749   |
-| 0.2106        | 1.25  | 1250 | 0.3511          | 0.8721   |
 ### Framework versions

 This model is a fine-tuned version of [mistralai/Mistral-7B-Instruct-v0.1](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1) on the super_glue dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.3115
+- Accuracy: 0.8735
 ## Model description
 - learning_rate: 1e-05
 - train_batch_size: 2
 - eval_batch_size: 4
+- seed: 1
 - distributed_type: multi-GPU
 - num_devices: 2
 - gradient_accumulation_steps: 2
 | Training Loss | Epoch | Step | Validation Loss | Accuracy |
 |:-------------:|:-----:|:----:|:---------------:|:--------:|
+| 0.369         | 0.05  | 50   | 0.4328          | 0.8120   |
+| 0.2708        | 0.1   | 100  | 0.4051          | 0.8283   |
+| 0.6276        | 0.15  | 150  | 0.4020          | 0.8452   |
+| 0.4395        | 0.2   | 200  | 0.3671          | 0.8452   |
+| 0.3282        | 0.25  | 250  | 0.3746          | 0.8445   |
+| 0.2967        | 0.3   | 300  | 0.3557          | 0.8523   |
+| 0.2483        | 0.35  | 350  | 0.3862          | 0.8622   |
+| 0.384         | 0.4   | 400  | 0.3765          | 0.8565   |
+| 0.334         | 0.45  | 450  | 0.3628          | 0.8601   |
+| 0.2671        | 0.5   | 500  | 0.3290          | 0.8664   |
+| 0.2478        | 0.55  | 550  | 0.3421          | 0.8650   |
+| 0.1814        | 0.6   | 600  | 0.3233          | 0.8693   |
+| 0.3332        | 0.65  | 650  | 0.3451          | 0.8728   |
+| 0.2063        | 0.7   | 700  | 0.3709          | 0.8678   |
+| 0.2614        | 0.75  | 750  | 0.3530          | 0.8763   |
+| 0.4273        | 0.8   | 800  | 0.3383          | 0.8721   |
+| 0.1319        | 0.85  | 850  | 0.3360          | 0.8735   |
+| 0.196         | 0.9   | 900  | 0.3096          | 0.8735   |
+| 0.3564        | 0.95  | 950  | 0.3354          | 0.8770   |
+| 0.3145        | 1.0   | 1000 | 0.3421          | 0.8784   |
+| 0.1344        | 1.05  | 1050 | 0.4273          | 0.8735   |
+| 0.4227        | 1.1   | 1100 | 0.3555          | 0.8707   |
+| 0.1696        | 1.15  | 1150 | 0.3399          | 0.8742   |
+| 0.5423        | 1.2   | 1200 | 0.3405          | 0.8813   |
 ### Framework versions

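The hyperparameters in the README hunk above can be combined into an effective batch size. The sketch below is a minimal illustration, assuming the usual convention that the effective batch is the per-device batch size times the number of devices times the gradient accumulation steps. The diff does not show a total batch size line, so the result is an inference rather than a value reported by the author. The variable names `effective_batch_size`, `steps_per_epoch`, and `approx_examples_per_epoch` are illustrative only.

```python
# Values copied from the hyperparameter list in the README diff.
train_batch_size = 2               # per-device batch size
num_devices = 2                    # distributed_type: multi-GPU
gradient_accumulation_steps = 2

# Assumption: effective batch = per-device batch * devices * accumulation steps.
effective_batch_size = train_batch_size * num_devices * gradient_accumulation_steps

# From the results table: step 1000 is logged at epoch 1.0 (epochs are rounded
# to two decimals there, so this is approximate).
steps_per_epoch = 1000
approx_examples_per_epoch = effective_batch_size * steps_per_epoch

print(f"effective batch size: {effective_batch_size}")
print(f"approx. training examples per epoch: {approx_examples_per_epoch}")
```

Under these assumptions, each optimizer step consumes 8 examples, and one epoch covers roughly 8000 examples. The only hyperparameter visibly changed between the two runs is the seed (0 to 1), so the batch arithmetic applies to both.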
model-00001-of-00003.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:77bd6c46d43a529cb0b02f18114a54f915db6dc45007707cd70118305bf38f3c
 size 4943163992

 version https://git-lfs.github.com/spec/v1
+oid sha256:e019a5c835a08dc82fc4526058a1a1f832185a6aa75814f44319b088de4dcdab
 size 4943163992

model-00002-of-00003.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f96b5186ae311378d272f07dc7cd6306cc4b5d971dcad6f8d29df5a9c5502816
 size 4999821144

 version https://git-lfs.github.com/spec/v1
+oid sha256:2ec161cb1d01a097e6c7c6655e1323d9fe2ac8f828151c9fce8cec20189dd520
 size 4999821144

model-00003-of-00003.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:54ae1cabb63f4fc65fb566649fda211fe62c64bdfe410c7ece8035dfc4cd0d04
 size 4540517840

 version https://git-lfs.github.com/spec/v1
+oid sha256:6e06a98b0bb7e099fc353489ada758a39a18535ec8f1278a2cddbc763d77e039
 size 4540517840
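
Each safetensors change above only swaps the `oid sha256:` line of a Git LFS pointer while the `size` stays the same, which is consistent with retrained weights of identical shape. The sketch below shows one way a downloaded shard could be checked against such a pointer. It is a minimal sketch, not part of this repository: the helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and it assumes the pointer text has been extracted without the diff's `+`/`-` prefixes.

```python
import hashlib
import os


def parse_lfs_pointer(text):
    """Parse the key/value lines of a Git LFS pointer file (spec v1)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid line has the form "<algorithm>:<hex digest>", e.g. "sha256:ab12...".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the pointer's size and digest."""
    # Cheap check first: a size mismatch means the hash cannot match either.
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as f:
        # Read in chunks so multi-gigabyte shards are never loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer for the first shard after this commit, copied from the diff above.
pointer = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:e019a5c835a08dc82fc4526058a1a1f832185a6aa75814f44319b088de4dcdab
size 4943163992
""")
print(pointer["algo"], pointer["size"])
```

A call such as `matches_pointer("model-00001-of-00003.safetensors", pointer)` would then report whether a local copy corresponds to the post-commit weights, assuming the file sits at that path.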