fine-tuning-Phi2-with-webglm-qa-with-lora

Files changed:
- README.md +53 -33
- adapter_config.json +3 -3
- adapter_model.safetensors +1 -1
- training_args.bin +1 -1
README.md
CHANGED

@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [microsoft/phi-2](https://huggingface.co/microsoft/phi-2) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.1008
 
 ## Model description
 
@@ -43,44 +43,64 @@ The following hyperparameters were used during training:
 - total_train_batch_size: 10
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
+- lr_scheduler_warmup_steps: 50
+- training_steps: 500
 - mixed_precision_training: Native AMP
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| No log | 0.2 | 10 | 7.9734 |
+| No log | 0.4 | 20 | 6.2577 |
+| No log | 0.6 | 30 | 2.8966 |
+| No log | 0.8 | 40 | 0.5650 |
+| 4.4766 | 1.0 | 50 | 0.5106 |
+| 4.4766 | 1.2 | 60 | 0.4513 |
+| 4.4766 | 1.39 | 70 | 0.4008 |
+| 4.4766 | 1.59 | 80 | 0.3539 |
+| 4.4766 | 1.79 | 90 | 0.3156 |
+| 0.3251 | 1.99 | 100 | 0.2877 |
+| 0.3251 | 2.19 | 110 | 0.2631 |
+| 0.3251 | 2.39 | 120 | 0.2464 |
+| 0.3251 | 2.59 | 130 | 0.2303 |
+| 0.3251 | 2.79 | 140 | 0.2117 |
+| 0.1953 | 2.99 | 150 | 0.1982 |
+| 0.1953 | 3.19 | 160 | 0.1892 |
+| 0.1953 | 3.39 | 170 | 0.1767 |
+| 0.1953 | 3.59 | 180 | 0.1687 |
+| 0.1953 | 3.78 | 190 | 0.1616 |
+| 0.1469 | 3.98 | 200 | 0.1559 |
+| 0.1469 | 4.18 | 210 | 0.1507 |
+| 0.1469 | 4.38 | 220 | 0.1484 |
+| 0.1469 | 4.58 | 230 | 0.1421 |
+| 0.1469 | 4.78 | 240 | 0.1353 |
+| 0.1212 | 4.98 | 250 | 0.1309 |
+| 0.1212 | 5.18 | 260 | 0.1292 |
+| 0.1212 | 5.38 | 270 | 0.1267 |
+| 0.1212 | 5.58 | 280 | 0.1231 |
+| 0.1212 | 5.78 | 290 | 0.1218 |
+| 0.1059 | 5.98 | 300 | 0.1177 |
+| 0.1059 | 6.18 | 310 | 0.1154 |
+| 0.1059 | 6.37 | 320 | 0.1151 |
+| 0.1059 | 6.57 | 330 | 0.1144 |
+| 0.1059 | 6.77 | 340 | 0.1114 |
+| 0.0936 | 6.97 | 350 | 0.1098 |
+| 0.0936 | 7.17 | 360 | 0.1093 |
+| 0.0936 | 7.37 | 370 | 0.1071 |
+| 0.0936 | 7.57 | 380 | 0.1063 |
+| 0.0936 | 7.77 | 390 | 0.1060 |
+| 0.0881 | 7.97 | 400 | 0.1049 |
+| 0.0881 | 8.17 | 410 | 0.1042 |
+| 0.0881 | 8.37 | 420 | 0.1035 |
+| 0.0881 | 8.57 | 430 | 0.1032 |
+| 0.0881 | 8.76 | 440 | 0.1028 |
+| 0.0819 | 8.96 | 450 | 0.1019 |
+| 0.0819 | 9.16 | 460 | 0.1014 |
+| 0.0819 | 9.36 | 470 | 0.1012 |
+| 0.0819 | 9.56 | 480 | 0.1010 |
+| 0.0819 | 9.76 | 490 | 0.1008 |
+| 0.079 | 9.96 | 500 | 0.1008 |
 
 
 ### Framework versions
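For reference, a minimal sketch (not the author's actual training script) of `TrainingArguments` matching the hyperparameters listed above: total train batch size 10, Adam with betas=(0.9, 0.999) and epsilon=1e-08, a linear scheduler with 50 warmup steps, 500 training steps, and native AMP. The output directory, per-device batch split, and evaluation cadence are assumptions not fully recorded in the visible hunks.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="phi2-webglm-qa-lora",   # hypothetical output path
    per_device_train_batch_size=2,      # assumption: 2 x 5 accumulation = total batch size 10
    gradient_accumulation_steps=5,
    lr_scheduler_type="linear",
    warmup_steps=50,
    max_steps=500,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    fp16=True,                          # "Native AMP" mixed-precision training
    evaluation_strategy="steps",
    eval_steps=10,                      # matches the 10-step cadence in the results table
    logging_steps=50,
)
```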
adapter_config.json
CHANGED

@@ -21,10 +21,10 @@
   "target_modules": [
     "k_proj",
     "q_proj",
-    "v_proj",
     "fc2",
-    "fc1"
+    "v_proj",
+    "fc1",
+    "dense"
   ],
   "task_type": "CAUSAL_LM"
 }
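A hedged sketch of how the updated `target_modules` list would be expressed as a PEFT `LoraConfig`; the rank, alpha, and dropout values below are placeholders, since this hunk only shows the module list and task type.

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,            # placeholder: rank not visible in this diff
    lora_alpha=32,   # placeholder
    lora_dropout=0.05,  # placeholder
    target_modules=["k_proj", "q_proj", "fc2", "v_proj", "fc1", "dense"],
    task_type="CAUSAL_LM",
)
```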
adapter_model.safetensors
CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:0893ceadbbb8b438bd07e6464913f985f2fa6107ff3d249ce2e1a812c4cd4e1c
 size 94422368
training_args.bin
CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:30bdac43ad19d93ba5e682a664188d3572cb9c34d72d5639b1b67e5719607246
 size 4283
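A hedged usage sketch for the uploaded adapter weights: load the `microsoft/phi-2` base model and attach this LoRA adapter with PEFT. The repo id below is a placeholder for wherever this adapter is hosted, and the prompt format is illustrative only.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model and tokenizer.
base = AutoModelForCausalLM.from_pretrained(
    "microsoft/phi-2", torch_dtype=torch.float16, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-2")

# Attach the LoRA adapter (placeholder repo id).
model = PeftModel.from_pretrained(base, "<user>/fine-tuning-Phi2-with-webglm-qa-with-lora")

prompt = "Question: What does LoRA fine-tuning change in a model?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```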