FatCat87/test-task-2025-01-06

@@ -1,10 +1,11 @@
 ---
 library_name: peft
 tags:
 - generated_from_trainer
 base_model: mhenrichsen/gemma-7b
 model-index:
-- name: outputs/out
   results: []
 ---
@@ -36,6 +37,7 @@ fsdp_config: null
 gradient_accumulation_steps: 3
 gradient_checkpointing: true
 group_by_length: false
 learning_rate: 0.0002
 load_in_4bit: true
 load_in_8bit: false
@@ -62,11 +64,12 @@ tf32: false
 tokenizer_type: AutoTokenizer
 train_on_inputs: false
 val_set_size: 0.1
-wandb_entity: null
 wandb_log_model: null
-wandb_name: test-task
-wandb_project: null
-wandb_runid: test-task
 wandb_watch: null
 warmup_ratio: 0.1
 weight_decay: 0.0
@@ -76,11 +79,12 @@ xformers_attention: null
 </details><br>
-# outputs/out
 This model is a fine-tuned version of [mhenrichsen/gemma-7b](https://huggingface.co/mhenrichsen/gemma-7b) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.0412
 ## Model description
@@ -103,33 +107,31 @@ The following hyperparameters were used during training:
 - train_batch_size: 2
 - eval_batch_size: 2
 - seed: 42
-- distributed_type: multi-GPU
-- num_devices: 2
 - gradient_accumulation_steps: 3
-- total_train_batch_size: 12
-- total_eval_batch_size: 4
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
-- lr_scheduler_warmup_steps: 2
 - num_epochs: 4
 ### Training results
-| Training Loss | Epoch  | Step | Validation Loss |
-|:-------------:|:------:|:----:|:---------------:|
-| 1.0695        | 0.1579 | 1    | 1.1912          |
-| 1.1142        | 0.3158 | 2    | 1.1076          |
-| 1.0936        | 0.6316 | 4    | 1.0834          |
-| 1.058         | 0.9474 | 6    | 1.0378          |
-| 0.9794        | 1.1579 | 8    | 1.0479          |
-| 0.9632        | 1.4737 | 10   | 1.0377          |
-| 0.951         | 1.7895 | 12   | 1.0467          |
-| 1.0219        | 2.1053 | 14   | 1.0463          |
-| 0.9345        | 2.3158 | 16   | 1.0417          |
-| 0.9314        | 2.6316 | 18   | 1.0434          |
-| 0.9108        | 2.9474 | 20   | 1.0363          |
-| 0.894         | 3.1579 | 22   | 1.0348          |
-| 0.8835        | 3.4737 | 24   | 1.0412          |
 ### Framework versions

 ---
 library_name: peft
 tags:
+- axolotl
 - generated_from_trainer
 base_model: mhenrichsen/gemma-7b
 model-index:
+- name: test-task-2025-01-06
   results: []
 ---
 gradient_accumulation_steps: 3
 gradient_checkpointing: true
 group_by_length: false
+hub_model_id: FatCat87/test-task-2025-01-06
 learning_rate: 0.0002
 load_in_4bit: true
 load_in_8bit: false
 tokenizer_type: AutoTokenizer
 train_on_inputs: false
 val_set_size: 0.1
+wandb_entity: fatcat87-taopanda
 wandb_log_model: null
+wandb_mode: online
+wandb_name: test-task-2025-01-06
+wandb_project: subnet56
+wandb_runid: test-task-2025-01-06
 wandb_watch: null
 warmup_ratio: 0.1
 weight_decay: 0.0
 </details><br>
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/fatcat87-taopanda/subnet56/runs/p0rc3cvq)
+# test-task-2025-01-06
 This model is a fine-tuned version of [mhenrichsen/gemma-7b](https://huggingface.co/mhenrichsen/gemma-7b) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 1.0913
 ## Model description
 - train_batch_size: 2
 - eval_batch_size: 2
 - seed: 42
 - gradient_accumulation_steps: 3
+- total_train_batch_size: 6
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
+- lr_scheduler_warmup_steps: 5
 - num_epochs: 4
 ### Training results
+| Training Loss | Epoch | Step | Validation Loss |
+|:-------------:|:-----:|:----:|:---------------:|
+| 1.046         | 0.075 | 1    | 1.1912          |
+| 1.1095        | 0.3   | 4    | 1.1067          |
+| 1.0619        | 0.6   | 8    | 1.0441          |
+| 1.0547        | 0.9   | 12   | 1.0446          |
+| 0.931         | 1.15  | 16   | 1.0528          |
+| 0.8836        | 1.45  | 20   | 1.0399          |
+| 0.8958        | 1.75  | 24   | 1.0419          |
+| 0.9922        | 2.05  | 28   | 1.0361          |
+| 0.7736        | 2.3   | 32   | 1.0851          |
+| 0.7437        | 2.6   | 36   | 1.0840          |
+| 0.7552        | 2.9   | 40   | 1.0769          |
+| 0.6623        | 3.15  | 44   | 1.0870          |
+| 0.7173        | 3.45  | 48   | 1.0946          |
+| 0.7122        | 3.75  | 52   | 1.0913          |
 ### Framework versions
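
The hyperparameter and results changes above can be sanity-checked with a little arithmetic. The sketch below assumes the usual relation, total batch size = per-device batch size × gradient accumulation steps × number of devices, and assumes the added side ran on a single device, because its `num_devices` line is gone and its `total_train_batch_size` is 6. The helper name `effective_batch_size` is hypothetical. The validation-loss rows are copied from the added table.

```python
def effective_batch_size(per_device: int, grad_accum: int, num_devices: int = 1) -> int:
    """Samples that contribute to one optimizer step (assumed relation)."""
    return per_device * grad_accum * num_devices


# Removed side: train_batch_size=2, gradient_accumulation_steps=3, num_devices=2
old_total = effective_batch_size(2, 3, num_devices=2)
# Added side: same per-device settings; a single device is an assumption
new_total = effective_batch_size(2, 3, num_devices=1)

# (step, validation loss) pairs copied from the added training-results table
val_loss = [
    (1, 1.1912), (4, 1.1067), (8, 1.0441), (12, 1.0446), (16, 1.0528),
    (20, 1.0399), (24, 1.0419), (28, 1.0361), (32, 1.0851), (36, 1.0840),
    (40, 1.0769), (44, 1.0870), (48, 1.0946), (52, 1.0913),
]

# Lowest validation loss and the last logged one
best_step, best_loss = min(val_loss, key=lambda row: row[1])
final_step, final_loss = val_loss[-1]

print(old_total, new_total)    # 12 6
print(best_step, best_loss)    # 28 1.0361
print(final_step, final_loss)  # 52 1.0913
```

Under these assumptions the two totals match the card values (12 before, 6 after). In the added table the lowest validation loss is at step 28, while the reported `Loss: 1.0913` is the last logged row at step 52.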

adapter_model.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:ab93f1af6b30528b09ddf4f5826c4a502dd62cb47134a77f138d4a97e8d27c9b
 size 200157610

 version https://git-lfs.github.com/spec/v1
+oid sha256:c05c1b8dba8f8c380d8615f6e36c46a7b81f5153ea5ac58052c3feac0713a74e
 size 200157610
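
A Git LFS pointer is a small text file of `key value` lines, so the change above can be read mechanically. The sketch below parses the two pointers shown in this diff and compares them. The helper name `parse_lfs_pointer` is hypothetical and is not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split each non-empty 'key value' line of an LFS pointer into a dict entry."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    return fields


# Pointer contents copied from the removed and added sides of the diff
old = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:ab93f1af6b30528b09ddf4f5826c4a502dd62cb47134a77f138d4a97e8d27c9b
size 200157610
""")
new = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:c05c1b8dba8f8c380d8615f6e36c46a7b81f5153ea5ac58052c3feac0713a74e
size 200157610
""")

same_size = old["size"] == new["size"]
same_content = old["oid"] == new["oid"]
print(same_size, same_content)  # True False
```

The byte size is identical while the SHA-256 differs, so the file contents changed without changing length. That would be consistent with retrained adapter weights of the same shape, though the pointer alone cannot confirm what changed.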