Commit ae590b7 (verified), committed by statking
1 parent: 78ca13e

Model save

README.md CHANGED
@@ -13,12 +13,16 @@ model-index:
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
 
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/statking/huggingface/runs/vajvfidu)
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/statking/huggingface/runs/vajvfidu)
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/statking/huggingface/runs/vajvfidu)
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/statking/huggingface/runs/vajvfidu)
 [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/statking/huggingface/runs/vajvfidu)
 # paligemma-vqa
 
 This model is a fine-tuned version of [google/paligemma-3b-pt-224](https://huggingface.co/google/paligemma-3b-pt-224) on the vq_av2 dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.0000
+- Loss: 0.0003
 
 ## Model description
 
@@ -37,24 +41,22 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- learning_rate: 2e-05
-- train_batch_size: 16
+- learning_rate: 0.02
+- train_batch_size: 32
 - eval_batch_size: 8
 - seed: 42
-- gradient_accumulation_steps: 4
-- total_train_batch_size: 64
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- lr_scheduler_warmup_steps: 2
-- num_epochs: 2
+- lr_scheduler_warmup_steps: 2500
+- num_epochs: 1
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:------:|:-----:|:---------------:|
-| 0.0002 | 0.6410 | 4000 | 0.0000 |
-| 0.0 | 1.2819 | 8000 | 0.0000 |
-| 0.0 | 1.9229 | 12000 | 0.0000 |
+| 0.0003 | 0.3205 | 4000 | 0.0007 |
+| 0.0003 | 0.6410 | 8000 | 0.0004 |
+| 0.0003 | 0.9615 | 12000 | 0.0003 |
 
 
 ### Framework versions
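
Review note on the hyperparameter hunk: the figures in the two versions of the card can be cross-checked with a little arithmetic. The sketch below is not part of the commit. It assumes a single training device (consistent with the old card's 16 × 4 = 64) and assumes that the missing `gradient_accumulation_steps` line in the new card means the default of 1. The helper names are hypothetical, and `total_steps` is an estimate derived from the table, not a value stated in the card. The schedule function follows the usual linear-warmup-then-linear-decay shape and is meant as an illustration, not as the exact scheduler the run used.

```python
# Hypothetical helpers for sanity-checking the model card numbers.

def effective_batch_size(per_device: int, grad_accum: int = 1, num_devices: int = 1) -> int:
    """Examples consumed per optimizer step."""
    return per_device * grad_accum * num_devices


def steps_per_epoch(step: int, epoch: float) -> float:
    """Infer optimizer steps per epoch from one (step, epoch) table row."""
    return step / epoch


def linear_schedule_lr(step: int, base_lr: float, warmup_steps: int, total_steps: int) -> float:
    """Linear warmup from 0 to base_lr, then linear decay to 0 at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Old card: train_batch_size 16 with gradient_accumulation_steps 4.
old_total = effective_batch_size(16, grad_accum=4)
# New card: train_batch_size 32, accumulation assumed to be 1.
new_total = effective_batch_size(32)

# First table row of each version: step 4000 at epoch 0.6410 (old) and 0.3205 (new).
old_spe = steps_per_epoch(4000, 0.6410)
new_spe = steps_per_epoch(4000, 0.3205)

# Both versions should imply roughly the same number of training examples.
old_examples = old_spe * old_total
new_examples = new_spe * new_total

# Estimated length of the new one-epoch run, used only for the schedule sketch.
total_steps = round(new_spe)
lr_mid_warmup = linear_schedule_lr(1250, base_lr=0.02, warmup_steps=2500, total_steps=total_steps)
lr_peak = linear_schedule_lr(2500, base_lr=0.02, warmup_steps=2500, total_steps=total_steps)

print(old_total, new_total)
print(round(old_examples), round(new_examples))
print(lr_mid_warmup, lr_peak)
```

Under these assumptions both versions imply a training set of roughly 399k examples, so the halved epoch values in the new table follow from the halved effective batch size rather than from a change in the data.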
model-00001-of-00002.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:1de2cb4929fca6b3864aa16e1801b11d77ee4f6f61eaa7d60195259bf106b5f5
+oid sha256:d389e25d4e3bba678ae408a124be3f9025927b49a7bdbcb73bee8a433dcb86bf
 size 4985044392
model-00002-of-00002.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f70825e8b0d43eb4c81ff90f02c610813904ddf6d00e76bbc0c15f36d55a088b
+oid sha256:9091eb9dc19fdf0d547322d917b589e1359b7b7b6605f1623f91fec791a32d00
 size 861970608
runs/May25_12-56-47_ae63705f58eb/events.out.tfevents.1716641808.ae63705f58eb.46359.5 CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:2198d81a5ff3fe72d81d0b574e630577914e4a7734c4b1a1f8d5ee45321aafb8
+oid sha256:a4a6992dff36e19e1617ea28a39cc61c2982755497e61e084f08439935873378
+size 32461
-size 31263
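
Review note on the three binary files: what is diffed here is the Git LFS pointer, not the payload. A pointer records the spec version, the SHA-256 of the file contents (`oid`), and the byte length (`size`). The sketch below shows how such a pointer can be built and parsed. The function names are hypothetical and nothing here is taken from the repository's tooling.

```python
import hashlib


def lfs_pointer(data: bytes) -> str:
    """Build the text of a Git LFS pointer for the given file contents."""
    oid = hashlib.sha256(data).hexdigest()
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{oid}\n"
        f"size {len(data)}\n"
    )


def parse_lfs_pointer(text: str) -> dict:
    """Split a pointer into its fields; the key and value are separated by one space."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line.strip())
    return {
        "version": fields["version"],
        "oid": fields["oid"].split(":", 1)[1],
        "size": int(fields["size"]),
    }


# Two payloads of equal length but different content give pointers that
# differ only in the oid line, as with the two safetensors shards above.
before = parse_lfs_pointer(lfs_pointer(b"weights-v1"))
after = parse_lfs_pointer(lfs_pointer(b"weights-v2"))
print(before["size"] == after["size"], before["oid"] == after["oid"])
```

This matches what the diff shows: both safetensors shards keep their size while the `oid` changes, which is what retrained weights of the same shape and dtype would produce, whereas the TensorBoard event file grows from 31263 to 32461 bytes.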