santhoshmlops/Skai_Mistral-7B-Instruct-v0.2-SFT
README.md
CHANGED

@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 # Mistral-7B-Instruct-v0.2-SFT
 
-This model is a fine-tuned version of [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) on
+This model is a fine-tuned version of [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) on an unknown dataset.
 
 ## Model description
 
@@ -36,13 +36,15 @@ More information needed
 
 The following hyperparameters were used during training:
 - learning_rate: 0.0002
-- train_batch_size:
+- train_batch_size: 3
 - eval_batch_size: 8
 - seed: 42
+- gradient_accumulation_steps: 3
+- total_train_batch_size: 9
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_ratio: 0.03
-- training_steps:
+- training_steps: 500
 - mixed_precision_training: Native AMP
 
 ### Training results
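Pieced together, the updated hyperparameters map onto a `transformers` `TrainingArguments` roughly as below. This is a minimal sketch assuming the standard `Trainer` stack; `output_dir` is a placeholder, and the optimizer line in the card is just the library default (Adam with betas=(0.9, 0.999), epsilon=1e-8), so it needs no explicit argument:

```python
# Minimal sketch reproducing the README hyperparameters.
# output_dir is a placeholder; dataset/model wiring is assumed, not shown in the card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mistral-7b-sft",          # placeholder
    learning_rate=2e-4,                   # learning_rate: 0.0002
    per_device_train_batch_size=3,        # train_batch_size: 3
    per_device_eval_batch_size=8,         # eval_batch_size: 8
    seed=42,                              # seed: 42
    gradient_accumulation_steps=3,        # gradient_accumulation_steps: 3
    lr_scheduler_type="cosine",           # lr_scheduler_type: cosine
    warmup_ratio=0.03,                    # lr_scheduler_warmup_ratio: 0.03
    max_steps=500,                        # training_steps: 500
    fp16=True,                            # mixed_precision_training: Native AMP
)
```

The total_train_batch_size of 9 in the card is derived, not set directly: per-device batch size 3 × gradient accumulation steps 3 on a single device.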
adapter_config.json
CHANGED

@@ -19,14 +19,14 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
-    "
-    "up_proj",
+    "q_proj",
     "lm_head",
-    "
+    "up_proj",
+    "k_proj",
     "v_proj",
+    "down_proj",
     "o_proj",
-    "
-    "k_proj"
+    "gate_proj"
   ],
   "task_type": "CAUSAL_LM",
   "use_dora": false,
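The rewritten target_modules list now covers every attention projection (q/k/v/o), all three MLP projections (gate/up/down), and lm_head. As a hedged sketch, a PEFT LoraConfig that would serialize to this list looks like the following; r, lora_alpha, and lora_dropout are illustrative placeholders, since this hunk does not show them:

```python
# Sketch of a peft LoraConfig matching the target_modules in this diff.
# r, lora_alpha, and lora_dropout are assumed values; the hunk does not include them.
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,                                         # assumed rank
    lora_alpha=32,                                # assumed scaling
    lora_dropout=0.05,                            # assumed dropout
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",   # attention projections
        "gate_proj", "up_proj", "down_proj",      # MLP projections
        "lm_head",                                # output head
    ],
    task_type="CAUSAL_LM",                        # matches "task_type" above
    use_dora=False,                               # matches "use_dora": false
)
```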
adapter_model.safetensors
CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:9113246da93294d32732f22a0ccf25a598f31edcae1973ce78fc70a8549e40ba
 size 694431312
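Only the Git LFS pointer changes here; the ~694 MB adapter itself lives in LFS storage. To check that a downloaded adapter_model.safetensors matches the new pointer, a small verification sketch (the repo-relative file path is assumed):

```python
# Hash a downloaded file and compare it to the sha256 in the LFS pointer.
import hashlib

def lfs_sha256(path: str, chunk_size: int = 1 << 20) -> str:
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

expected = "9113246da93294d32732f22a0ccf25a598f31edcae1973ce78fc70a8549e40ba"
assert lfs_sha256("adapter_model.safetensors") == expected, "hash mismatch"
```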
runs/Mar17_17-19-42_a9b8762ddd23/events.out.tfevents.1710696178.a9b8762ddd23.1374.0
ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5fc9304deea83e102719476c936f4231574f35df8a3a2fed8642cd855888c92a
+size 7526
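This new file is the TensorBoard event log from the training run, also stored as an LFS pointer. Once pulled locally, it can be inspected with TensorBoard's event reader; a sketch, with the scalar tag names assumed since they depend on what the Trainer logged:

```python
# Read scalar tags from the downloaded tfevents log with TensorBoard's reader.
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

acc = EventAccumulator("runs/Mar17_17-19-42_a9b8762ddd23")
acc.Reload()
print(acc.Tags()["scalars"])  # e.g. train/loss, train/learning_rate (assumed names)
```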
tokenizer.json
CHANGED

@@ -2,7 +2,7 @@
   "version": "1.0",
   "truncation": {
     "direction": "Right",
-    "max_length":
+    "max_length": 1024,
     "strategy": "LongestFirst",
     "stride": 0
   },
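The truncation block this hunk edits is what the `tokenizers` library writes when truncation is enabled; the same four fields can be set programmatically, which should be equivalent to the JSON above:

```python
# Recreate the truncation settings from tokenizer.json via the tokenizers API.
from tokenizers import Tokenizer

tok = Tokenizer.from_file("tokenizer.json")
tok.enable_truncation(
    max_length=1024,           # "max_length": 1024
    stride=0,                  # "stride": 0
    strategy="longest_first",  # "strategy": "LongestFirst"
    direction="right",         # "direction": "Right"
)
```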
training_args.bin
CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:c83325fed57d2316c032abc70558b09bab1ae790377ebd3b4fa1934acccd2ef3
 size 4920
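training_args.bin is the pickled TrainingArguments object that the Trainer saves next to its checkpoints. It can be inspected as below, with the usual caveat that unpickling executes arbitrary code, so only load it from repos you trust:

```python
# Inspect the saved TrainingArguments (a pickle -- trusted sources only).
import torch

args = torch.load("training_args.bin", weights_only=False)
print(args.learning_rate, args.max_steps, args.lr_scheduler_type)
```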