DLingo/qwen2-2b-instruct-trl-sft-mrg

@@ -18,7 +18,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [Qwen/Qwen2-VL-2B-Instruct](https://huggingface.co/Qwen/Qwen2-VL-2B-Instruct) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.6032
 ## Model description
@@ -38,10 +38,10 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 0.0001
-- train_batch_size: 2
-- eval_batch_size: 2
 - seed: 42
-- gradient_accumulation_steps: 16
 - total_train_batch_size: 32
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: constant
@@ -52,25 +52,25 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch   | Step | Validation Loss |
 |:-------------:|:-------:|:----:|:---------------:|
-| 2.5999        | 0.7729  | 50   | 1.5728          |
-| 1.2619        | 1.5459  | 100  | 1.0742          |
-| 0.9938        | 2.3188  | 150  | 0.8737          |
-| 0.8441        | 3.0918  | 200  | 0.7896          |
-| 0.7839        | 3.8647  | 250  | 0.7490          |
-| 0.7418        | 4.6377  | 300  | 0.7199          |
-| 0.7067        | 5.4106  | 350  | 0.6964          |
-| 0.6922        | 6.1836  | 400  | 0.6798          |
-| 0.6541        | 6.9565  | 450  | 0.6670          |
-| 0.6602        | 7.7295  | 500  | 0.6553          |
-| 0.6152        | 8.5024  | 550  | 0.6435          |
-| 0.6274        | 9.2754  | 600  | 0.6379          |
-| 0.6154        | 10.0483 | 650  | 0.6297          |
-| 0.6104        | 10.8213 | 700  | 0.6244          |
-| 0.5662        | 11.5942 | 750  | 0.6147          |
-| 0.5905        | 12.3671 | 800  | 0.6137          |
-| 0.5673        | 13.1401 | 850  | 0.6109          |
-| 0.5703        | 13.9130 | 900  | 0.6005          |
-| 0.557         | 14.6860 | 950  | 0.6032          |
 ### Framework versions

 This model is a fine-tuned version of [Qwen/Qwen2-VL-2B-Instruct](https://huggingface.co/Qwen/Qwen2-VL-2B-Instruct) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 1.2108
 ## Model description
 The following hyperparameters were used during training:
 - learning_rate: 0.0001
+- train_batch_size: 4
+- eval_batch_size: 4
 - seed: 42
+- gradient_accumulation_steps: 8
 - total_train_batch_size: 32
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: constant
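
The two runs in this diff trade per-device batch size against gradient accumulation (2 × 16 before, 4 × 8 after) while `total_train_batch_size` stays at 32. A minimal sketch of that arithmetic, assuming a single training device (the device count is not stated in the card; `effective_batch_size` is a hypothetical helper, not part of any library):

```python
def effective_batch_size(per_device_batch_size, grad_accum_steps, num_devices=1):
    """Number of samples contributing to one optimizer step.

    num_devices=1 is an assumption; the model card does not state it.
    """
    return per_device_batch_size * grad_accum_steps * num_devices


# Values taken from the removed (-) and added (+) hyperparameter lines.
old_run = effective_batch_size(per_device_batch_size=2, grad_accum_steps=16)
new_run = effective_batch_size(per_device_batch_size=4, grad_accum_steps=8)

print(old_run, new_run)
```

Both configurations produce the same number of samples per optimizer step, so the change mainly affects memory use and throughput per forward pass rather than the optimization schedule.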
 | Training Loss | Epoch   | Step | Validation Loss |
 |:-------------:|:-------:|:----:|:---------------:|
+| 2.1076        | 0.7722  | 50   | 2.0130          |
+| 1.7006        | 1.5444  | 100  | 1.6928          |
+| 1.5932        | 2.3166  | 150  | 1.5687          |
+| 1.5092        | 3.0888  | 200  | 1.4995          |
+| 1.4633        | 3.8610  | 250  | 1.4468          |
+| 1.3849        | 4.6332  | 300  | 1.4023          |
+| 1.3616        | 5.4054  | 350  | 1.3673          |
+| 1.361         | 6.1776  | 400  | 1.3386          |
+| 1.3253        | 6.9498  | 450  | 1.3159          |
+| 1.3204        | 7.7220  | 500  | 1.2976          |
+| 1.1944        | 8.4942  | 550  | 1.2814          |
+| 1.2286        | 9.2664  | 600  | 1.2703          |
+| 1.3097        | 10.0386 | 650  | 1.2532          |
+| 1.263         | 10.8108 | 700  | 1.2466          |
+| 1.1474        | 11.5830 | 750  | 1.2374          |
+| 1.191         | 12.3552 | 800  | 1.2298          |
+| 1.09          | 13.1274 | 850  | 1.2246          |
+| 1.1622        | 13.8996 | 900  | 1.2130          |
+| 1.1883        | 14.6718 | 950  | 1.2108          |
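
The Epoch and Step columns of the new table can be used to estimate how many optimizer steps make up one epoch and, from that, a rough training-set size. This is only an estimate: it assumes the effective batch size of 32 listed in the hyperparameters and ignores any partial final batch.

```python
# (step, epoch) pairs copied from the first and last rows of the new table.
rows = [(50, 0.7722), (950, 14.6718)]

# Optimizer steps per epoch implied by each row.
steps_per_epoch = [step / epoch for step, epoch in rows]

# Assumption: every optimizer step consumes total_train_batch_size = 32 samples.
total_train_batch_size = 32
approx_train_samples = round(steps_per_epoch[-1] * total_train_batch_size)

print(steps_per_epoch, approx_train_samples)
```

Both rows imply roughly the same steps-per-epoch figure, which is what one would expect when the logged epoch value is simply the step count divided by a fixed number of steps per epoch.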
 ### Framework versions

adapter_model.safetensors

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:460df92e38b2b6cd3f0df685b95dbeba171a6fca99c36b3acaa46fdd386e0255
 size 4373064

 version https://git-lfs.github.com/spec/v1
+oid sha256:0d8782e5d43ef2cf59856effcb5935892ccf2a1f568969c95663b76633a42d05
 size 4373064
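
The `adapter_model.safetensors` entries above are Git LFS pointer files, not the weights themselves: each holds a spec version, a SHA-256 object ID, and the object size in bytes. A small sketch of reading those fields and comparing the two pointers (`parse_lfs_pointer` is a hypothetical helper written for illustration, not a Git LFS API):

```python
def parse_lfs_pointer(text):
    """Parse the key/value lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    hash_algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": hash_algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer contents copied from the removed (-) and added (+) sides of the diff.
old_pointer = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:460df92e38b2b6cd3f0df685b95dbeba171a6fca99c36b3acaa46fdd386e0255
size 4373064
""")
new_pointer = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:0d8782e5d43ef2cf59856effcb5935892ccf2a1f568969c95663b76633a42d05
size 4373064
""")

same_size = old_pointer["size"] == new_pointer["size"]
same_content = old_pointer["digest"] == new_pointer["digest"]
print(same_size, same_content)
```

The size is unchanged while the digest differs, which is consistent with the adapter keeping the same tensor shapes but holding retrained weights.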