Model save

Files changed (3)

README.md +23 -15
model-00001-of-00002.safetensors +1 -1
runs/May26_06-08-21_ae63705f58eb/events.out.tfevents.1716703706.ae63705f58eb.59214.0 +2 -2

README.md CHANGED

@@ -13,16 +13,12 @@ model-index:
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
-[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/statking/huggingface/runs/vajvfidu)
-[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/statking/huggingface/runs/vajvfidu)
-[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/statking/huggingface/runs/vajvfidu)
-[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/statking/huggingface/runs/vajvfidu)
-[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/statking/huggingface/runs/vajvfidu)
 # paligemma-vqa
 This model is a fine-tuned version of [google/paligemma-3b-pt-224](https://huggingface.co/google/paligemma-3b-pt-224) on the vq_av2 dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.0003
 ## Model description
@@ -41,22 +37,34 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 0.02
-- train_batch_size: 32
-- eval_batch_size: 8
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- lr_scheduler_warmup_steps: 2500
 - num_epochs: 1
 ### Training results
-| Training Loss | Epoch  | Step  | Validation Loss |
-|:-------------:|:------:|:-----:|:---------------:|
-| 0.0003        | 0.3205 | 4000  | 0.0007          |
-| 0.0003        | 0.6410 | 8000  | 0.0004          |
-| 0.0003        | 0.9615 | 12000 | 0.0003          |
 ### Framework versions

 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/statking/huggingface/runs/5b5skvtb)
 # paligemma-vqa
 This model is a fine-tuned version of [google/paligemma-3b-pt-224](https://huggingface.co/google/paligemma-3b-pt-224) on the vq_av2 dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.9226
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- learning_rate: 8e-06
+- train_batch_size: 16
+- eval_batch_size: 16
 - seed: 42
+- gradient_accumulation_steps: 4
+- total_train_batch_size: 64
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
+- lr_scheduler_warmup_steps: 1200
 - num_epochs: 1
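
Review note: the new `total_train_batch_size: 64` is consistent with `train_batch_size: 16` times `gradient_accumulation_steps: 4`. The sketch below reproduces that arithmetic and the shape of a linear warmup schedule. It assumes a single device (the card does not state the device count; `num_devices` is a hypothetical parameter), assumes the learning rate ramps linearly from 0 to the peak and then decays linearly to 0, and uses a total step count that is only an estimate derived from the last logged row of the new results table (step 6500 at epoch 0.9566).

```python
def effective_batch_size(per_device_batch: int, grad_accum_steps: int, num_devices: int = 1) -> int:
    """Samples consumed per optimizer step."""
    return per_device_batch * grad_accum_steps * num_devices


def linear_schedule_lr(step: int, base_lr: float, warmup_steps: int, total_steps: int) -> float:
    """Linear warmup from 0 to base_lr, then linear decay to 0 (assumed shape)."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Estimate only: the card does not list the total number of optimizer steps.
TOTAL_STEPS_ESTIMATE = round(6500 / 0.9566)

print(effective_batch_size(16, 4))
for step in (0, 600, 1200, 4000, TOTAL_STEPS_ESTIMATE):
    print(step, linear_schedule_lr(step, 8e-06, 1200, TOTAL_STEPS_ESTIMATE))
```

Under these assumptions the peak rate of 8e-06 is reached at step 1200, roughly 18% of the way through the single epoch.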
 ### Training results
+| Training Loss | Epoch  | Step | Validation Loss |
+|:-------------:|:------:|:----:|:---------------:|
+| 20.2791       | 0.0736 | 500  | 19.3371         |
+| 5.4004        | 0.1472 | 1000 | 4.8792          |
+| 1.5853        | 0.2207 | 1500 | 1.4809          |
+| 1.091         | 0.2943 | 2000 | 1.0661          |
+| 0.9667        | 0.3679 | 2500 | 0.9655          |
+| 0.9449        | 0.4415 | 3000 | 0.9356          |
+| 0.9241        | 0.5151 | 3500 | 0.9270          |
+| 0.9295        | 0.5886 | 4000 | 0.9238          |
+| 0.922         | 0.6622 | 4500 | 0.9228          |
+| 0.9103        | 0.7358 | 5000 | 0.9229          |
+| 0.9225        | 0.8094 | 5500 | 0.9225          |
+| 0.9159        | 0.8830 | 6000 | 0.9223          |
+| 0.934         | 0.9566 | 6500 | 0.9226          |
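
Review note: the validation loss in the added table drops steeply during warmup and then flattens near 0.92. A minimal sketch for locating that plateau from the logged values follows. The values are copied from the table above; the helper name `first_plateau_step` and the 0.01 tolerance are illustrative assumptions, not part of the training setup.

```python
# (step, validation loss) pairs copied from the added results table.
val_loss = [
    (500, 19.3371), (1000, 4.8792), (1500, 1.4809), (2000, 1.0661),
    (2500, 0.9655), (3000, 0.9356), (3500, 0.9270), (4000, 0.9238),
    (4500, 0.9228), (5000, 0.9229), (5500, 0.9225), (6000, 0.9223),
    (6500, 0.9226),
]


def first_plateau_step(history, tol=0.01):
    """Return the first step whose improvement over the previous eval is below tol."""
    for (_, prev_loss), (step, loss) in zip(history, history[1:]):
        if prev_loss - loss < tol:
            return step
    return None


best_step, best_loss = min(val_loss, key=lambda pair: pair[1])
print("plateau starts at step", first_plateau_step(val_loss))
print("best validation loss", best_loss, "at step", best_step)
```

With this tolerance the plateau begins at step 3500, and the lowest logged validation loss is 0.9223 at step 6000, slightly below the final 0.9226 reported in the card.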
 ### Framework versions

model-00001-of-00002.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d389e25d4e3bba678ae408a124be3f9025927b49a7bdbcb73bee8a433dcb86bf
 size 4985044392

 version https://git-lfs.github.com/spec/v1
+oid sha256:f37fc24dcff6177b9fe739f3b9f243cec8fa0270f95b9f1a680dd840f99ef44a
 size 4985044392
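
Review note: only the `oid` line of this pointer changes while `size` stays at 4985044392 bytes, which is what one would expect when weights of the same shapes and dtypes are overwritten with new values. The sketch below parses the two pointer texts shown above and compares them. `parse_lfs_pointer` is a hypothetical helper written for this note, not part of Git LFS or any library.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = dict(line.strip().split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


old = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:d389e25d4e3bba678ae408a124be3f9025927b49a7bdbcb73bee8a433dcb86bf
size 4985044392
""")
new = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:f37fc24dcff6177b9fe739f3b9f243cec8fa0270f95b9f1a680dd840f99ef44a
size 4985044392
""")

print("content changed:", old["digest"] != new["digest"])
print("size unchanged:", old["size"] == new["size"])
print("size in GB:", round(new["size"] / 1e9, 2))
```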

runs/May26_06-08-21_ae63705f58eb/events.out.tfevents.1716703706.ae63705f58eb.59214.0 CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:823048fba9991b5566ec585457aa2fdc2145c458691b0d3af5070f4aedf9dbf4
-size 21043

 version https://git-lfs.github.com/spec/v1
+oid sha256:b5de8fa4549a6c93ab148958a1faa4a99a4388741768bba90874f2daab6ae5ba
+size 23145