Tippawan/proof-reading-SeaLLM3-7B-Chat-3090-v3

@@ -6,7 +6,7 @@ tags:
 - axolotl
 - generated_from_trainer
 model-index:
-- name: proof-reading-SeaLLM3-7B-Chat-3090-v2
   results: []
 ---
@@ -26,7 +26,7 @@ load_in_4bit: true
 strict: false
 datasets:
-  - path: Tippawan/proof-reading-SeaLLM3-7B-Chat-3090-v2
     type: sharegpt
     conversation: chatml
     field_messages: messages
@@ -41,7 +41,7 @@ eval_sample_packing: false
 pad_to_sequence_len: false
 push_to_hub: true
-hub_model_id: Tippawan/proof-reading-SeaLLM3-7B-Chat-3090-v2  # Replace with your Hugging Face repo ID
 use_auth_token: true  # Ensure you have set your Hugging Face API token in the environment
 hub_private_repo: true  # Set to true if you want the repository to be private
 hub_strategy: all_checkpoints
@@ -56,14 +56,14 @@ lora_dropout: 0.05
 lora_target_linear: true
 lora_fan_in_fan_out:
-wandb_project: proof-reading-SeaLLM3-7B-Chat-3090-v2
 wandb_entity:
 wandb_watch:
 wandb_name:
 wandb_log_model:
 gradient_accumulation_steps: 4
-micro_batch_size: 4
 num_epochs: 1 #editted 3
 optimizer: adamw_torch
 lr_scheduler: cosine
@@ -96,7 +96,7 @@ special_tokens:
 </details><br>
-# proof-reading-SeaLLM3-7B-Chat-3090-v2
 This model is a fine-tuned version of [SeaLLMs/SeaLLM3-7B-Chat](https://huggingface.co/SeaLLMs/SeaLLM3-7B-Chat) on the None dataset.
@@ -118,11 +118,11 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 0.0002
-- train_batch_size: 4
-- eval_batch_size: 4
 - seed: 42
 - gradient_accumulation_steps: 4
-- total_train_batch_size: 16
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_steps: 10

 - axolotl
 - generated_from_trainer
 model-index:
+- name: proof-reading-SeaLLM3-7B-Chat-3090-v3
   results: []
 ---
 strict: false
 datasets:
+  - path: Tippawan/pr-v3
     type: sharegpt
     conversation: chatml
     field_messages: messages
 pad_to_sequence_len: false
 push_to_hub: true
+hub_model_id: Tippawan/proof-reading-SeaLLM3-7B-Chat-3090-v3  # Replace with your Hugging Face repo ID
 use_auth_token: true  # Ensure you have set your Hugging Face API token in the environment
 hub_private_repo: true  # Set to true if you want the repository to be private
 hub_strategy: all_checkpoints
 lora_target_linear: true
 lora_fan_in_fan_out:
+wandb_project: proof-reading-SeaLLM3-7B-Chat-3090-v3
 wandb_entity:
 wandb_watch:
 wandb_name:
 wandb_log_model:
 gradient_accumulation_steps: 4
+micro_batch_size: 2
 num_epochs: 1 #editted 3
 optimizer: adamw_torch
 lr_scheduler: cosine
 </details><br>
+# proof-reading-SeaLLM3-7B-Chat-3090-v3
 This model is a fine-tuned version of [SeaLLMs/SeaLLM3-7B-Chat](https://huggingface.co/SeaLLMs/SeaLLM3-7B-Chat) on the None dataset.
 The following hyperparameters were used during training:
 - learning_rate: 0.0002
+- train_batch_size: 2
+- eval_batch_size: 2
 - seed: 42
 - gradient_accumulation_steps: 4
+- total_train_batch_size: 8
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_steps: 10
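
The batch-size change in this diff follows from simple arithmetic: halving `micro_batch_size` from 4 to 2 while keeping `gradient_accumulation_steps` at 4 halves `total_train_batch_size` from 16 to 8. A minimal sketch of that relationship, assuming a single GPU (the `num_devices` parameter and the function name are illustrative and do not appear in the config):

```python
def effective_batch_size(micro_batch_size: int,
                         gradient_accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Samples contributing to one optimizer step.

    num_devices=1 is an assumption; the diff does not state the device count.
    """
    return micro_batch_size * gradient_accumulation_steps * num_devices


# Values taken from the diff: v2 used micro_batch_size 4, v3 uses 2.
v2_total = effective_batch_size(micro_batch_size=4, gradient_accumulation_steps=4)
v3_total = effective_batch_size(micro_batch_size=2, gradient_accumulation_steps=4)
print(v2_total, v3_total)
```

With the learning rate unchanged at 0.0002, each optimizer step in v3 averages gradients over half as many samples as in v2.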

adapter_model.bin

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:7d3e8ac4f51f2835203d41cba71d7bed76a4561b123311a9761500dc0507ea23
 size 161621802

 version https://git-lfs.github.com/spec/v1
+oid sha256:f765a256c9414c1bf99761e3dd85351874da0d6d3d502c6dd867c7e6242eb189
 size 161621802
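
The pointer diff above shows a new `oid` with an unchanged `size`, which is what one would expect when adapter weights are retrained with the same LoRA shapes. A small sketch for comparing two Git LFS pointer files, assuming the plain `key value` line layout shown in the diff (the helper name `parse_lfs_pointer` is hypothetical):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its version, hash algorithm, digest, and size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# The two pointers from the diff (old side, then new side).
old = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:7d3e8ac4f51f2835203d41cba71d7bed76a4561b123311a9761500dc0507ea23
size 161621802
""")
new = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:f765a256c9414c1bf99761e3dd85351874da0d6d3d502c6dd867c7e6242eb189
size 161621802
""")

same_size = old["size"] == new["size"]
content_changed = old["digest"] != new["digest"]
print(same_size, content_changed)
```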