bilkultheek committed
Model save
- README.md +6 -13
- adapter_model.safetensors +1 -1
- tokenizer.json +1 -3
- training_args.bin +1 -1
README.md CHANGED
```diff
@@ -6,23 +6,16 @@ tags:
 - sft
 - generated_from_trainer
 model-index:
-- name: Cold-
+- name: Cold-Data-LLama-2-7B
   results: []
 ---
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
 
-# Cold-
+# Cold-Data-LLama-2-7B
 
 This model is a fine-tuned version of [NousResearch/Llama-2-7b-hf](https://huggingface.co/NousResearch/Llama-2-7b-hf) on the None dataset.
-It achieves the following results on the evaluation set:
-- eval_loss: 1.3661
-- eval_runtime: 90.0594
-- eval_samples_per_second: 1.11
-- eval_steps_per_second: 0.044
-- epoch: 5.76
-- step: 36
 
 ## Model description
 
@@ -41,16 +34,16 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- learning_rate:
+- learning_rate: 2e-05
 - train_batch_size: 16
-- eval_batch_size:
+- eval_batch_size: 16
 - seed: 42
 - gradient_accumulation_steps: 4
 - total_train_batch_size: 64
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type:
+- lr_scheduler_type: cosine
 - lr_scheduler_warmup_ratio: 0.03
-- num_epochs:
+- num_epochs: 2
 
 ### Framework versions
 
```
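The updated hyperparameters map directly onto the standard `transformers` `TrainingArguments`. A minimal sketch of the equivalent configuration, assuming a single-device run with the usual `Trainer` setup; the output directory is a placeholder, not taken from this commit:

```python
# Hedged sketch: TrainingArguments matching the hyperparameters in the card.
# output_dir is a placeholder; everything else mirrors the README values.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="cold-data-llama-2-7b",  # placeholder, not from the commit
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=4,      # 16 * 4 = 64 effective batch size
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    num_train_epochs=2,
    seed=42,
)
```

On a single device, `total_train_batch_size: 64` follows from 16 × 4 (per-device batch size times gradient accumulation steps).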
adapter_model.safetensors CHANGED
```diff
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:2f4e69029b729a2ecafb8eed45797844a0b795161a8d3946272bdc9607a7b17d
 size 134267920
```
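At about 134 MB, `adapter_model.safetensors` holds a PEFT adapter (e.g. LoRA weights) rather than the full 7B-parameter model. A minimal loading sketch, assuming the `peft` library; the repo id is inferred from the commit author and model name, not stated anywhere in the diff:

```python
# Hedged sketch: attach the saved adapter to the base model with peft.
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("NousResearch/Llama-2-7b-hf")
# The repo id below is an assumption (author + model name), not from the commit.
model = PeftModel.from_pretrained(base, "bilkultheek/Cold-Data-LLama-2-7B")
```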
tokenizer.json CHANGED
```diff
@@ -7,9 +7,7 @@
     "stride": 0
   },
   "padding": {
-    "strategy": {
-      "Fixed": 512
-    },
+    "strategy": "BatchLongest",
     "direction": "Left",
     "pad_to_multiple_of": null,
     "pad_id": 0,
```
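This change switches the serialized padding strategy from `Fixed(512)` (always pad to 512 tokens) to `BatchLongest` (pad only to the longest sequence in each batch). In the `tokenizers` library, the two variants correspond to calling `enable_padding` with and without a `length` argument; a sketch, with the pad token assumed to be Llama's `<unk>` (id 0, matching `"pad_id": 0` above):

```python
# Hedged sketch: the two padding strategies via the tokenizers API.
from tokenizers import Tokenizer

tok = Tokenizer.from_file("tokenizer.json")

# Old behaviour: pad every sequence to a fixed length of 512.
tok.enable_padding(direction="left", pad_id=0, pad_token="<unk>", length=512)

# New behaviour: length=None serializes as "BatchLongest".
tok.enable_padding(direction="left", pad_id=0, pad_token="<unk>")
```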
training_args.bin CHANGED
```diff
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:01e5f113bdcf55e4178ab97c72be49d141dd46dac20be7319363dbbf09d8d697
 size 5432
```
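`training_args.bin` is the pickled `TrainingArguments` object that the `Trainer` writes alongside the weights, so the new hash simply reflects the changed hyperparameters. A sketch of inspecting it, assuming `torch` and `transformers` are installed:

```python
# Hedged sketch: training_args.bin is a pickle, not a tensor file, so
# newer torch versions need weights_only=False to deserialize it.
import torch

args = torch.load("training_args.bin", weights_only=False)
print(args.learning_rate, args.lr_scheduler_type, args.num_train_epochs)
```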