fine-tuning-Phi2-with-webglm-qa-with-lora

Files changed (4)

README.md +33 -23
adapter_config.json +3 -3
adapter_model.safetensors +1 -1
training_args.bin +1 -1

README.md

@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [microsoft/phi-2](https://huggingface.co/microsoft/phi-2) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.2084
 ## Model description
@@ -43,34 +43,44 @@ The following hyperparameters were used during training:
 - total_train_batch_size: 10
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- lr_scheduler_warmup_steps: 20
-- training_steps: 200
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| No log        | 0.2   | 10   | 7.3955          |
-| 7.0321        | 0.4   | 20   | 2.6772          |
-| 7.0321        | 0.6   | 30   | 0.5779          |
-| 0.751         | 0.8   | 40   | 0.5021          |
-| 0.751         | 1.0   | 50   | 0.4501          |
-| 0.3719        | 1.2   | 60   | 0.4067          |
-| 0.3719        | 1.39  | 70   | 0.3655          |
-| 0.3398        | 1.59  | 80   | 0.3302          |
-| 0.3398        | 1.79  | 90   | 0.3029          |
-| 0.2285        | 1.99  | 100  | 0.2831          |
-| 0.2285        | 2.19  | 110  | 0.2666          |
-| 0.2156        | 2.39  | 120  | 0.2549          |
-| 0.2156        | 2.59  | 130  | 0.2435          |
-| 0.2049        | 2.79  | 140  | 0.2324          |
-| 0.2049        | 2.99  | 150  | 0.2246          |
-| 0.177         | 3.19  | 160  | 0.2197          |
-| 0.177         | 3.39  | 170  | 0.2149          |
-| 0.1745        | 3.59  | 180  | 0.2112          |
-| 0.1745        | 3.78  | 190  | 0.2091          |
-| 0.1742        | 3.98  | 200  | 0.2084          |
 ### Framework versions

 This model is a fine-tuned version of [microsoft/phi-2](https://huggingface.co/microsoft/phi-2) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.1475
 ## Model description
 - total_train_batch_size: 10
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
+- lr_scheduler_warmup_steps: 30
+- training_steps: 300
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| No log        | 0.2   | 10   | 7.7121          |
+| 7.4808        | 0.4   | 20   | 4.3398          |
+| 7.4808        | 0.6   | 30   | 0.6362          |
+| 1.5296        | 0.8   | 40   | 0.5285          |
+| 1.5296        | 1.0   | 50   | 0.4668          |
+| 0.3883        | 1.2   | 60   | 0.4194          |
+| 0.3883        | 1.39  | 70   | 0.3737          |
+| 0.3482        | 1.59  | 80   | 0.3338          |
+| 0.3482        | 1.79  | 90   | 0.3036          |
+| 0.2296        | 1.99  | 100  | 0.2802          |
+| 0.2296        | 2.19  | 110  | 0.2595          |
+| 0.212         | 2.39  | 120  | 0.2452          |
+| 0.212         | 2.59  | 130  | 0.2307          |
+| 0.1943        | 2.79  | 140  | 0.2145          |
+| 0.1943        | 2.99  | 150  | 0.2031          |
+| 0.1635        | 3.19  | 160  | 0.1957          |
+| 0.1635        | 3.39  | 170  | 0.1857          |
+| 0.1543        | 3.59  | 180  | 0.1788          |
+| 0.1543        | 3.78  | 190  | 0.1732          |
+| 0.1492        | 3.98  | 200  | 0.1687          |
+| 0.1492        | 4.18  | 210  | 0.1650          |
+| 0.1327        | 4.38  | 220  | 0.1632          |
+| 0.1327        | 4.58  | 230  | 0.1597          |
+| 0.1359        | 4.78  | 240  | 0.1552          |
+| 0.1359        | 4.98  | 250  | 0.1522          |
+| 0.1367        | 5.18  | 260  | 0.1506          |
+| 0.1367        | 5.38  | 270  | 0.1495          |
+| 0.1204        | 5.58  | 280  | 0.1484          |
+| 0.1204        | 5.78  | 290  | 0.1477          |
+| 0.125         | 5.98  | 300  | 0.1475          |
 ### Framework versions
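
The new hyperparameters pair `lr_scheduler_type: linear` with 30 warmup steps and 300 training steps. As a rough illustration, the sketch below computes the learning-rate multiplier such a schedule would apply at a given step, assuming the usual shape (linear ramp from 0 to 1 during warmup, then linear decay to 0 at the last step). The function name `linear_warmup_factor` is hypothetical, and the base learning rate is not visible in this hunk, so only the multiplier is shown.

```python
def linear_warmup_factor(step: int, warmup_steps: int = 30, total_steps: int = 300) -> float:
    """Multiplier applied to the base learning rate at a given optimizer step.

    Assumption: linear ramp-up over `warmup_steps`, then linear decay to zero
    at `total_steps`. Defaults are the values listed in the updated README.
    """
    if step < warmup_steps:
        # Warmup phase: the factor grows linearly from 0 toward 1.
        return step / max(1, warmup_steps)
    # Decay phase: the factor shrinks linearly from 1 to 0.
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))


if __name__ == "__main__":
    for step in (0, 15, 30, 165, 300):
        print(step, round(linear_warmup_factor(step), 3))
```

Under these assumptions the peak learning rate is reached at step 30, and the previous run (20 warmup steps, 200 training steps) can be compared by passing `warmup_steps=20, total_steps=200`.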

adapter_config.json

@@ -19,12 +19,12 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
-    "q_proj",
-    "fc1",
     "k_proj",
     "v_proj",
     "dense",
-    "fc2"
   ],
   "task_type": "CAUSAL_LM"
 }

   "rank_pattern": {},
   "revision": null,
   "target_modules": [
     "k_proj",
+    "q_proj",
     "v_proj",
+    "fc2",
     "dense",
+    "fc1"
   ],
   "task_type": "CAUSAL_LM"
 }
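
The `target_modules` hunk removes and re-adds `q_proj`, `fc1`, and `fc2`, so it may only be a reordering. A minimal check, with both lists copied from the two sides of the diff above (the helper name `same_modules` is hypothetical):

```python
# Lists transcribed from the old and new sides of the adapter_config.json diff.
old_target_modules = ["q_proj", "fc1", "k_proj", "v_proj", "dense", "fc2"]
new_target_modules = ["k_proj", "q_proj", "v_proj", "fc2", "dense", "fc1"]


def same_modules(old: list, new: list) -> bool:
    """Return True when both lists name the same modules, ignoring order."""
    return len(old) == len(new) and set(old) == set(new)


if __name__ == "__main__":
    print(same_modules(old_target_modules, new_target_modules))
```

Both sides name the same six modules, so the check returns `True`: the change affects ordering only, not which layers receive LoRA adapters. One plausible cause, not confirmed by the diff, is that the list was serialized from an unordered collection.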

adapter_model.safetensors

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:642bc97daf234a6c3585b392cde64c46485745914a0c28276e04be19cc11c18c
 size 94422368

 version https://git-lfs.github.com/spec/v1
+oid sha256:68d52b9ee5f1f1b896b6562f771ede7ebf1806ad405654c4f70b3cd896ce5c4f
 size 94422368
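
The file above is a Git LFS pointer rather than the weights themselves: three `key value` lines giving the spec version, the SHA-256 of the stored object, and its size in bytes. The sketch below parses the two pointer versions from this diff and compares them; `parse_lfs_pointer` is a hypothetical helper, not part of any LFS tooling.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of a Git LFS pointer into a dict entry."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    return fields


# Pointer contents transcribed from the old and new sides of the diff.
old_pointer = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:642bc97daf234a6c3585b392cde64c46485745914a0c28276e04be19cc11c18c
size 94422368
""")
new_pointer = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:68d52b9ee5f1f1b896b6562f771ede7ebf1806ad405654c4f70b3cd896ce5c4f
size 94422368
""")

if __name__ == "__main__":
    print("oid changed:", old_pointer["oid"] != new_pointer["oid"])
    print("size unchanged:", old_pointer["size"] == new_pointer["size"])
```

A changed `oid` with an identical `size` is consistent with retrained adapter weights of the same shape: the tensor values differ while the tensor layout, and therefore the byte count, stays the same.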

training_args.bin

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:7a033dae6c1f16aa354b5a33bbad2a7b0332a8a83d882ff79922e8195f69e1e9
 size 4283

 version https://git-lfs.github.com/spec/v1
+oid sha256:d4e6f08bbc3be4f8ef90b14fec37406472b33edd35f297e638593e1cbf0bbb83
 size 4283