statking committed on
Commit ef93af4 · verified · 1 Parent(s): fb8db17

Model save

README.md CHANGED
@@ -13,16 +13,12 @@ model-index:
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
  should probably proofread and complete it, then remove this comment. -->
 
- [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/statking/huggingface/runs/vajvfidu)
- [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/statking/huggingface/runs/vajvfidu)
- [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/statking/huggingface/runs/vajvfidu)
- [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/statking/huggingface/runs/vajvfidu)
- [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/statking/huggingface/runs/vajvfidu)
+ [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/statking/huggingface/runs/5b5skvtb)
  # paligemma-vqa
 
  This model is a fine-tuned version of [google/paligemma-3b-pt-224](https://huggingface.co/google/paligemma-3b-pt-224) on the vq_av2 dataset.
  It achieves the following results on the evaluation set:
- - Loss: 0.0003
+ - Loss: 0.9226
 
  ## Model description
@@ -41,22 +37,34 @@ More information needed
  ### Training hyperparameters
 
  The following hyperparameters were used during training:
- - learning_rate: 0.02
- - train_batch_size: 32
- - eval_batch_size: 8
+ - learning_rate: 8e-06
+ - train_batch_size: 16
+ - eval_batch_size: 16
  - seed: 42
+ - gradient_accumulation_steps: 4
+ - total_train_batch_size: 64
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: linear
- - lr_scheduler_warmup_steps: 2500
+ - lr_scheduler_warmup_steps: 1200
  - num_epochs: 1
 
  ### Training results
 
- | Training Loss | Epoch | Step | Validation Loss |
- |:-------------:|:------:|:-----:|:---------------:|
- | 0.0003 | 0.3205 | 4000 | 0.0007 |
- | 0.0003 | 0.6410 | 8000 | 0.0004 |
- | 0.0003 | 0.9615 | 12000 | 0.0003 |
+ | Training Loss | Epoch | Step | Validation Loss |
+ |:-------------:|:------:|:----:|:---------------:|
+ | 20.2791 | 0.0736 | 500 | 19.3371 |
+ | 5.4004 | 0.1472 | 1000 | 4.8792 |
+ | 1.5853 | 0.2207 | 1500 | 1.4809 |
+ | 1.091 | 0.2943 | 2000 | 1.0661 |
+ | 0.9667 | 0.3679 | 2500 | 0.9655 |
+ | 0.9449 | 0.4415 | 3000 | 0.9356 |
+ | 0.9241 | 0.5151 | 3500 | 0.9270 |
+ | 0.9295 | 0.5886 | 4000 | 0.9238 |
+ | 0.922 | 0.6622 | 4500 | 0.9228 |
+ | 0.9103 | 0.7358 | 5000 | 0.9229 |
+ | 0.9225 | 0.8094 | 5500 | 0.9225 |
+ | 0.9159 | 0.8830 | 6000 | 0.9223 |
+ | 0.934 | 0.9566 | 6500 | 0.9226 |
 
 
  ### Framework versions
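The updated hyperparameters combine in a way the card leaves implicit: the reported total_train_batch_size is the per-device batch size times gradient_accumulation_steps, and the "linear" scheduler warms up to the peak learning rate over 1200 steps before decaying. A minimal sketch of both relations — the single-device count and the ~6795 total optimizer steps (read off the training table, one epoch) are assumptions, not stated in the card:

```python
# Hyperparameters from the updated model card
train_batch_size = 16            # per-device batch size
gradient_accumulation_steps = 4
num_devices = 1                  # assumption: single GPU

# Effective batch size the Trainer reports as total_train_batch_size
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
print(total_train_batch_size)    # 64

def linear_lr(step, base_lr=8e-6, warmup_steps=1200, total_steps=6795):
    """Linear warmup to base_lr, then linear decay to 0 (the shape of
    the "linear" scheduler; total_steps here is an estimate)."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))

print(linear_lr(600))    # halfway through warmup: 4e-06
print(linear_lr(1200))   # peak learning rate: 8e-06
```

This also matches the loss table: validation loss drops steeply during the warmup phase (steps 500–1500) and flattens near 0.92 once the learning rate begins decaying.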
model-00001-of-00002.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:d389e25d4e3bba678ae408a124be3f9025927b49a7bdbcb73bee8a433dcb86bf
+ oid sha256:f37fc24dcff6177b9fe739f3b9f243cec8fa0270f95b9f1a680dd840f99ef44a
  size 4985044392
runs/May26_06-08-21_ae63705f58eb/events.out.tfevents.1716703706.ae63705f58eb.59214.0 CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:823048fba9991b5566ec585457aa2fdc2145c458691b0d3af5070f4aedf9dbf4
- size 21043
+ oid sha256:b5de8fa4549a6c93ab148958a1faa4a99a4388741768bba90874f2daab6ae5ba
+ size 23145