fine-tuning-Phi2-with-webglm-qa-with-lora
Browse files
- README.md +23 -8
- adapter_config.json +4 -4
- adapter_model.safetensors +1 -1
- training_args.bin +1 -1
README.md
CHANGED
@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [microsoft/phi-2](https://huggingface.co/microsoft/phi-2) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.
+- Loss: 0.2084
 
 ## Model description
 
@@ -43,19 +43,34 @@ The following hyperparameters were used during training:
 - total_train_batch_size: 10
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- lr_scheduler_warmup_steps:
-- training_steps:
+- lr_scheduler_warmup_steps: 20
+- training_steps: 200
 - mixed_precision_training: Native AMP
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-
-
-
-| 0.
-| 0.
+| No log        | 0.2   | 10   | 7.3955          |
+| 7.0321        | 0.4   | 20   | 2.6772          |
+| 7.0321        | 0.6   | 30   | 0.5779          |
+| 0.751         | 0.8   | 40   | 0.5021          |
+| 0.751         | 1.0   | 50   | 0.4501          |
+| 0.3719        | 1.2   | 60   | 0.4067          |
+| 0.3719        | 1.39  | 70   | 0.3655          |
+| 0.3398        | 1.59  | 80   | 0.3302          |
+| 0.3398        | 1.79  | 90   | 0.3029          |
+| 0.2285        | 1.99  | 100  | 0.2831          |
+| 0.2285        | 2.19  | 110  | 0.2666          |
+| 0.2156        | 2.39  | 120  | 0.2549          |
+| 0.2156        | 2.59  | 130  | 0.2435          |
+| 0.2049        | 2.79  | 140  | 0.2324          |
+| 0.2049        | 2.99  | 150  | 0.2246          |
+| 0.177         | 3.19  | 160  | 0.2197          |
+| 0.177         | 3.39  | 170  | 0.2149          |
+| 0.1745        | 3.59  | 180  | 0.2112          |
+| 0.1745        | 3.78  | 190  | 0.2091          |
+| 0.1742        | 3.98  | 200  | 0.2084          |
 
 
 ### Framework versions
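
For context, the new table reports validation loss every 10 steps, and one epoch spans 50 steps; at a total train batch size of 10 that implies roughly 500 training examples per epoch. Below is a minimal sketch of a `TrainingArguments` setup matching the hyperparameters in the hunk above. It is not the author's actual script: the output directory, the eval/logging cadence, and anything not shown in the diff (such as the learning rate, left at its default here) are assumptions.

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="fine-tuning-Phi2-with-webglm-qa-with-lora",  # assumed output path
    per_device_train_batch_size=10,  # total_train_batch_size: 10, assuming a single device
    lr_scheduler_type="linear",      # lr_scheduler_type: linear
    warmup_steps=20,                 # lr_scheduler_warmup_steps: 20
    max_steps=200,                   # training_steps: 200
    fp16=True,                       # mixed_precision_training: Native AMP (requires a GPU)
    adam_beta1=0.9,                  # optimizer: Adam with betas=(0.9,0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-08,              # and epsilon=1e-08
    evaluation_strategy="steps",     # assumed: the table shows eval loss every 10 steps
    eval_steps=10,
    logging_steps=20,                # assumed: "No log" at step 10 suggests a coarser logging cadence
)
# `args` would then be passed to a transformers.Trainer together with the
# peft-wrapped phi-2 model and the tokenized train/eval splits.
```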
adapter_config.json
CHANGED
@@ -19,12 +19,12 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
+    "q_proj",
+    "fc1",
     "k_proj",
     "v_proj",
-    "fc1",
-    "fc2",
-    "q_proj",
-    "dense"
+    "dense",
+    "fc2"
   ],
   "task_type": "CAUSAL_LM"
 }
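
The target_modules change above reads as pure reordering: the same six attention and MLP projections of phi-2 appear on both sides, and peft normalizes target_modules to a set, so the serialized order can shift between saves. A minimal sketch of the corresponding LoRA setup follows; rank, alpha, and dropout are not visible in this hunk, so those values are placeholders.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

lora_config = LoraConfig(
    target_modules=["q_proj", "k_proj", "v_proj", "dense", "fc1", "fc2"],  # from the diff
    task_type="CAUSAL_LM",  # from the diff
    r=16,                   # placeholder: rank is not shown in this hunk
    lora_alpha=32,          # placeholder
    lora_dropout=0.05,      # placeholder
)

model = AutoModelForCausalLM.from_pretrained("microsoft/phi-2")
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # sanity check: only the LoRA weights should be trainable
```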
adapter_model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:642bc97daf234a6c3585b392cde64c46485745914a0c28276e04be19cc11c18c
 size 94422368
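
The adapter_model.safetensors entry above and the training_args.bin entry below are Git LFS pointer files: the repo tracks only a sha256 and a byte size, while the roughly 94 MB of adapter weights live in LFS storage. A minimal sketch of loading the published adapter on top of the base model; the repo id is inferred from the page title and likely needs a namespace prefix.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("microsoft/phi-2")
# Assumed repo id; replace with the full "user/repo" path of this model page.
model = PeftModel.from_pretrained(base, "fine-tuning-Phi2-with-webglm-qa-with-lora")
tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-2")
```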
training_args.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:7a033dae6c1f16aa354b5a33bbad2a7b0332a8a83d882ff79922e8195f69e1e9
 size 4283