codegood/Mistral_instruct_QA/

Files changed (4)

README.md CHANGED

@@ -15,7 +15,12 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [bn22/Mistral-7B-Instruct-v0.1-sharded](https://huggingface.co/bn22/Mistral-7B-Instruct-v0.1-sharded) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.9447
+- eval_loss: 0.5257
+- eval_runtime: 448.2658
+- eval_samples_per_second: 2.231
+- eval_steps_per_second: 0.558
+- epoch: 2.32
+- step: 3000
 ## Model description
@@ -41,19 +46,7 @@ The following hyperparameters were used during training:
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_ratio: 0.05
-- training_steps: 600
-### Training results
-| Training Loss | Epoch | Step | Validation Loss |
-|:-------------:|:-----:|:----:|:---------------:|
-| 2.4418        | 0.07  | 100  | 2.2922          |
-| 2.1884        | 0.15  | 200  | 2.1105          |
-| 2.0871        | 0.22  | 300  | 2.0371          |
-| 2.011         | 0.29  | 400  | 1.9835          |
-| 1.9346        | 0.37  | 500  | 1.9517          |
-| 1.9721        | 0.44  | 600  | 1.9447          |
+- num_epochs: 3.0
 ### Framework versions
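
Note on the new evaluation block: the reported runtime and throughput figures are enough to estimate the size of the evaluation set. The sketch below multiplies `eval_runtime` by the two rates from the README. The function name `estimate_eval_size` is hypothetical, and because the reported rates are rounded, the results are estimates. The implied batch size is an inference, not a value stated in the diff.

```python
# Values copied from the updated README.md.
EVAL_RUNTIME_S = 448.2658
EVAL_SAMPLES_PER_SECOND = 2.231
EVAL_STEPS_PER_SECOND = 0.558


def estimate_eval_size(runtime_s, samples_per_s, steps_per_s):
    """Estimate eval sample count, step count, and samples per step.

    The rates in the README are rounded, so the products are rounded
    to the nearest integer before dividing.
    """
    samples = round(runtime_s * samples_per_s)
    steps = round(runtime_s * steps_per_s)
    return samples, steps, samples / steps


samples, steps, batch = estimate_eval_size(
    EVAL_RUNTIME_S, EVAL_SAMPLES_PER_SECOND, EVAL_STEPS_PER_SECOND
)
print(f"~{samples} eval samples over ~{steps} steps (~{batch:g} samples per step)")
```

Under these assumptions the evaluation set holds roughly 1000 samples processed in about 250 steps, which suggests an effective evaluation batch size of about 4.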

adapter_config.json CHANGED

@@ -19,11 +19,11 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
-    "k_proj",
+    "gate_proj",
     "q_proj",
-    "o_proj",
+    "k_proj",
     "v_proj",
-    "gate_proj"
+    "o_proj"
   ],
   "task_type": "CAUSAL_LM"
 }
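
Note on this hunk: the removed and added `target_modules` entries name the same five projection layers, so the change appears to be a reordering only. A minimal check, with both lists copied from the diff, is sketched below. That the reordering comes from an unordered collection being serialized is an assumption, not something the diff confirms.

```python
# target_modules before and after the commit, copied from the diff.
old_modules = ["k_proj", "q_proj", "o_proj", "v_proj", "gate_proj"]
new_modules = ["gate_proj", "q_proj", "k_proj", "v_proj", "o_proj"]

# Compare as sets so that ordering is ignored.
same_modules = set(old_modules) == set(new_modules)
added = sorted(set(new_modules) - set(old_modules))
removed = sorted(set(old_modules) - set(new_modules))

print(f"same modules: {same_modules}, added: {added}, removed: {removed}")
```

If the sets are equal, the adapter targets the same layers before and after the commit, which is consistent with `adapter_model.safetensors` keeping the same byte size.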

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:61965e4bd4a997696781f46af81b9860bd3b6f9c9dbc2d790a3a26efdf4cc4ba
+oid sha256:d2e427316d0fff899f219be68978450b7eb2aaddbea4e631acf123d4c0efb22e
 size 369142184
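
Note on the pointer file: only the Git LFS pointer changes here, with a new `oid` and an identical `size`. A downloaded `adapter_model.safetensors` can be checked against the pointer by hashing it. The sketch below is one way to do that. The helper names `parse_lfs_pointer` and `file_matches_pointer` are hypothetical, and the local file path is left to the caller.

```python
import hashlib

# New pointer for adapter_model.safetensors, copied from the diff.
NEW_ADAPTER_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:d2e427316d0fff899f219be68978450b7eb2aaddbea4e631acf123d4c0efb22e
size 369142184
"""


def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into its version, hash algorithm, oid, and size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "oid": digest,
        "size": int(fields["size"]),
    }


def file_matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and digest named by the pointer."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with open(path, "rb") as f:
        # Read in chunks so large weight files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["oid"]


pointer = parse_lfs_pointer(NEW_ADAPTER_POINTER)
print(pointer["algo"], pointer["size"])
```

Calling `file_matches_pointer("adapter_model.safetensors", pointer)` on a local copy would then confirm whether the download corresponds to this commit.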

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:1c78c6d371e675608a6248617f3f15cef6616718c149eb7f6252e5188c29580a
+oid sha256:34bfb87bcd63286e412b8c4a791e932f8a87efc6d0850f0af666080c87ba8468
 size 4091