End of training

Files changed (4)

README.md CHANGED

@@ -18,10 +18,10 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [meta-llama/Llama-3.2-3B](https://huggingface.co/meta-llama/Llama-3.2-3B) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.9166
-- Model Preparation Time: 0.0071
-- Accuracy: 0.7913
-- F1 Macro: 0.8020
 ## Model description
@@ -47,17 +47,17 @@ The following hyperparameters were used during training:
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_ratio: 0.0960656012161834
-- num_epochs: 6
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Accuracy | F1 Macro |
 |:-------------:|:-----:|:----:|:---------------:|:----------------------:|:--------:|:--------:|
-| 0.4157        | 1.0   | 1035 | 0.5256          | 0.0071                 | 0.7787   | 0.7822   |
-| 0.3687        | 2.0   | 2070 | 0.4594          | 0.0071                 | 0.8135   | 0.8205   |
-| 0.3644        | 3.0   | 3105 | 0.5025          | 0.0071                 | 0.8043   | 0.8130   |
-| 0.1143        | 4.0   | 4140 | 0.7857          | 0.0071                 | 0.8116   | 0.8213   |
 ### Framework versions

 This model is a fine-tuned version of [meta-llama/Llama-3.2-3B](https://huggingface.co/meta-llama/Llama-3.2-3B) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.4762
+- Model Preparation Time: 0.0067
+- Accuracy: 0.8388
+- F1 Macro: 0.8459
 ## Model description
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_ratio: 0.0960656012161834
+- num_epochs: 7
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Accuracy | F1 Macro |
 |:-------------:|:-----:|:----:|:---------------:|:----------------------:|:--------:|:--------:|
+| 0.4458        | 1.0   | 735  | 0.4802          | 0.0067                 | 0.8034   | 0.8130   |
+| 0.3723        | 2.0   | 1470 | 0.4281          | 0.0067                 | 0.8184   | 0.8270   |
+| 0.2744        | 3.0   | 2205 | 0.4820          | 0.0067                 | 0.8218   | 0.8305   |
+| 0.1358        | 4.0   | 2940 | 0.6884          | 0.0067                 | 0.8156   | 0.8254   |
 ### Framework versions
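
Note on the two training-results tables: in both runs the validation loss bottoms out at epoch 2 and rises afterwards while the training loss keeps falling. The sketch below copies the rows from the tables above and picks the best epoch per run. The tuple layout and helper names are illustrative only and are not part of the training code.

```python
# Rows copied from the README diff: (epoch, step, val_loss, accuracy, f1_macro).
old_run = [
    (1.0, 1035, 0.5256, 0.7787, 0.7822),
    (2.0, 2070, 0.4594, 0.8135, 0.8205),
    (3.0, 3105, 0.5025, 0.8043, 0.8130),
    (4.0, 4140, 0.7857, 0.8116, 0.8213),
]
new_run = [
    (1.0, 735, 0.4802, 0.8034, 0.8130),
    (2.0, 1470, 0.4281, 0.8184, 0.8270),
    (3.0, 2205, 0.4820, 0.8218, 0.8305),
    (4.0, 2940, 0.6884, 0.8156, 0.8254),
]


def best_by_val_loss(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[2])


def best_by_f1(rows):
    """Return the row with the highest macro F1."""
    return max(rows, key=lambda row: row[4])


for name, rows in (("old", old_run), ("new", new_run)):
    loss_row = best_by_val_loss(rows)
    f1_row = best_by_f1(rows)
    print(f"{name}: lowest val loss {loss_row[2]} at epoch {loss_row[0]}, "
          f"highest F1 macro {f1_row[4]} at epoch {f1_row[0]}")
```

The headline metrics in the new README (Loss 0.4762, Accuracy 0.8388, F1 Macro 0.8459) do not match any row of the new table, so they presumably come from a separate evaluation pass; the diff does not say which checkpoint was used.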

adapter_config.json CHANGED

@@ -26,13 +26,13 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
     "k_proj",
     "gate_proj",
-    "v_proj",
-    "down_proj",
-    "q_proj",
-    "o_proj",
-    "up_proj"
   ],
   "task_type": "SEQ_CLS",
   "use_dora": false,

   "rank_pattern": {},
   "revision": null,
   "target_modules": [
+    "o_proj",
+    "q_proj",
+    "v_proj",
     "k_proj",
+    "up_proj",
     "gate_proj",
+    "down_proj"
   ],
   "task_type": "SEQ_CLS",
   "use_dora": false,
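
Note on the `target_modules` hunk: the old and new lists contain the same seven projection names in a different order. A minimal check, with both lists copied from the diff above (the variable names are illustrative):

```python
# target_modules as listed on the old and new sides of the diff.
old_modules = ["k_proj", "gate_proj", "v_proj", "down_proj",
               "q_proj", "o_proj", "up_proj"]
new_modules = ["o_proj", "q_proj", "v_proj", "k_proj",
               "up_proj", "gate_proj", "down_proj"]

# Compare as sets so that ordering is ignored.
added = set(new_modules) - set(old_modules)
removed = set(old_modules) - set(new_modules)
same_modules = not added and not removed

print("same modules:", same_modules)
print("added:", sorted(added), "removed:", sorted(removed))
```

If the order carries no meaning for the adapter, this hunk is a reordering only and not a configuration change. That reading is an assumption; the diff itself does not state why the order differs.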

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:6c07fffc4faa44ec8d689f29ef7ac944872e9b968cd7c4fb7924cd3911a5b201
 size 97356792

 version https://git-lfs.github.com/spec/v1
+oid sha256:2352d8ac9d5c1991e9424f539c8ebf3e21cf59ca537d7affdf7212ed82e2d42e
 size 97356792

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:2092158b350f240b143e5b0dc6b9f31bab436f9f51cafd8826cc29e751663f89
 size 5304

 version https://git-lfs.github.com/spec/v1
+oid sha256:bbe76161fbe25f804e9304c0c2942ba05b7a4e0feaf3a51c72f5698a592fc6c0
 size 5304
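
Note on the two Git LFS hunks: for both binary files only the `oid` line changes and the `size` line stays the same, so the contents differ but the byte length does not. The sketch below parses the `adapter_model.safetensors` pointer text from the diff above into key/value pairs and compares the two sides. `parse_lfs_pointer` is a hypothetical helper written for this note, not part of Git LFS.

```python
OLD_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:6c07fffc4faa44ec8d689f29ef7ac944872e9b968cd7c4fb7924cd3911a5b201
size 97356792
"""

NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:2352d8ac9d5c1991e9424f539c8ebf3e21cf59ca537d7affdf7212ed82e2d42e
size 97356792
"""


def parse_lfs_pointer(text):
    """Split each 'key value' line of a pointer file into a dict entry."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    return fields


old_ptr = parse_lfs_pointer(OLD_POINTER)
new_ptr = parse_lfs_pointer(NEW_POINTER)

# Collect the keys whose values differ between the two pointers.
changed_keys = sorted(k for k in old_ptr if old_ptr[k] != new_ptr.get(k))
print("changed fields:", changed_keys)
print("size in bytes:", int(new_ptr["size"]))
```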