shawgpt-ft-lr0.0002-wd0.05

Files changed (3)

README.md CHANGED

@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [TheBloke/Mistral-7B-Instruct-v0.2-GPTQ](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GPTQ) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 3.2816
 ## Model description
@@ -51,22 +51,22 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch  | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
-| 8.5018        | 0.5714 | 1    | 4.2401          |
-| 8.538         | 1.5714 | 2    | 4.1612          |
-| 8.0957        | 2.5714 | 3    | 3.9772          |
-| 7.788         | 3.5714 | 4    | 3.8079          |
-| 7.4088        | 4.5714 | 5    | 3.6647          |
-| 7.1719        | 5.5714 | 6    | 3.5412          |
-| 6.9592        | 6.5714 | 7    | 3.4397          |
-| 6.7737        | 7.5714 | 8    | 3.3620          |
-| 6.6492        | 8.5714 | 9    | 3.3086          |
-| 3.2214        | 9.5714 | 10   | 3.2816          |
 ### Framework versions
 - PEFT 0.14.0
-- Transformers 4.48.3
-- Pytorch 2.5.1+cu124
 - Datasets 3.3.1
 - Tokenizers 0.21.0

 This model is a fine-tuned version of [TheBloke/Mistral-7B-Instruct-v0.2-GPTQ](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GPTQ) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 3.2789
 ## Model description
 | Training Loss | Epoch  | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
+| 25.5434       | 0.5714 | 1    | 4.2401          |
+| 25.7032       | 1.5714 | 2    | 4.1611          |
+| 24.6801       | 2.5714 | 3    | 3.9766          |
+| 23.5885       | 3.5714 | 4    | 3.8072          |
+| 22.457        | 4.5714 | 5    | 3.6633          |
+| 21.6522       | 5.5714 | 6    | 3.5395          |
+| 20.926        | 6.5714 | 7    | 3.4376          |
+| 20.4754       | 7.5714 | 8    | 3.3595          |
+| 20.0631       | 8.5714 | 9    | 3.3062          |
+| 12.9153       | 9.5714 | 10   | 3.2789          |
 ### Framework versions
 - PEFT 0.14.0
+- Transformers 4.47.1
+- Pytorch 2.5.1+cu121
 - Datasets 3.3.1
 - Tokenizers 0.21.0
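
In the README.md tables above, the validation loss barely moves between the removed and added rows, while the logged training loss is considerably larger in the added rows. The sketch below recomputes those gaps from the table values. It is only a convenience check: the variable names are illustrative and not part of the model card, and it does not attempt to explain why the logged training loss changed.

```python
# Values copied from the removed (-) and added (+) table rows in the README.md diff.
old_val = [4.2401, 4.1612, 3.9772, 3.8079, 3.6647, 3.5412, 3.4397, 3.3620, 3.3086, 3.2816]
new_val = [4.2401, 4.1611, 3.9766, 3.8072, 3.6633, 3.5395, 3.4376, 3.3595, 3.3062, 3.2789]
old_train = [8.5018, 8.538, 8.0957, 7.788, 7.4088, 7.1719, 6.9592, 6.7737, 6.6492, 3.2214]
new_train = [25.5434, 25.7032, 24.6801, 23.5885, 22.457, 21.6522, 20.926, 20.4754, 20.0631, 12.9153]

# Per-step drop in validation loss (old minus new), rounded to the table's precision.
val_deltas = [round(o - n, 4) for o, n in zip(old_val, new_val)]

# Per-step ratio of the newly logged training loss to the previously logged one.
train_ratios = [round(n / o, 2) for o, n in zip(old_train, new_train)]

max_val_delta = max(val_deltas)

for step, (d, r) in enumerate(zip(val_deltas, train_ratios), start=1):
    print(f"step {step:2d}: val loss delta = {d:.4f}, train loss ratio = {r:.2f}")
print(f"largest validation loss delta: {max_val_delta:.4f}")
```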

runs/Feb18_19-27-42_9a3887f9873e/events.out.tfevents.1739906863.9a3887f9873e.3474.12 ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:e9b6a43bccc82987d3d3da3820b380dd6d557966975eee511f6bc23f819fda57
+size 10847

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:4689191036ea0e86a2c3a495d625281bf1f49b0bfaddd157b6d58c0eb9186f76
 size 5368

 version https://git-lfs.github.com/spec/v1
+oid sha256:084ef5691a22a82309fb0d7f6919c0117eac5243b654fcf197bc8f1d23246e2b
 size 5368
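
The `training_args.bin` and event-file entries above are Git LFS pointer files: three `key value` lines giving the spec version, a SHA-256 object id, and the size in bytes. As a minimal sketch, assuming the pointer text is available as a plain string without diff markers, the helpers below parse a pointer and check a downloaded blob against it. `parse_lfs_pointer` and `matches_pointer` are hypothetical names introduced for illustration, not part of any library.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid has the form '<algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data, pointer):
    """Return True if the bytes match the pointer's size and SHA-256 digest.

    Assumes the pointer uses sha256, as the pointers in this diff do.
    """
    if len(data) != pointer["size"]:
        return False
    return hashlib.sha256(data).hexdigest() == pointer["digest"]


# Pointer text taken from the added side of training_args.bin in the diff.
pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:084ef5691a22a82309fb0d7f6919c0117eac5243b654fcf197bc8f1d23246e2b
size 5368
"""

pointer = parse_lfs_pointer(pointer_text)
print(pointer["algo"], pointer["size"])
```

To verify a real download, read the file in binary mode and pass its bytes to `matches_pointer` along with the parsed pointer.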