Kareem Amr committed
Commit f9ee3b2 (verified)
Parent: 55e1f6e

End of training

Files changed (2)
  1. README.md +27 -25
  2. adapter_model.bin +2 -2
README.md CHANGED
@@ -2,10 +2,11 @@
  license: apache-2.0
  library_name: peft
  tags:
+ - axolotl
  - generated_from_trainer
  base_model: TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T
  model-index:
- - name: outputs/lora-out
+ - name: tinyllama-1.1B_dolly-4.5k_lora
    results: []
  ---

@@ -39,21 +40,23 @@ pad_to_sequence_len: true

  adapter: lora
  lora_model_dir:
- lora_r: 4
+ lora_r: 16
  lora_alpha: 16
- lora_dropout: 0.8
+ lora_dropout: 0.5
  lora_target_linear: true
  lora_fan_in_fan_out:

- wandb_project: tinyllama-dolly-axolotl
- wandb_entity: kamr54
+ # wandb_project: tinyllama-dolly-axolotl
+ # wandb_entity: kamr54
+
+ hub_model_id: kareemamrr/tinyllama-1.1B_dolly-4.5k_lora

  gradient_accumulation_steps: 4
  micro_batch_size: 2
  num_epochs: 4
  optimizer: adamw_bnb_8bit
  lr_scheduler:
- learning_rate: 0.0008
+ learning_rate: 0.0004

  train_on_inputs: false
  group_by_length: false
@@ -78,16 +81,15 @@ weight_decay: 0.0
  fsdp:
  fsdp_config:
  special_tokens:
-
  ```

  </details><br>

- # outputs/lora-out
+ # tinyllama-1.1B_dolly-4.5k_lora

  This model is a fine-tuned version of [TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T](https://huggingface.co/TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T) on the None dataset.
  It achieves the following results on the evaluation set:
- - Loss: 1.7742
+ - Loss: 1.7650

  ## Model description

@@ -106,7 +108,7 @@ More information needed
  ### Training hyperparameters

  The following hyperparameters were used during training:
- - learning_rate: 0.0008
+ - learning_rate: 0.0004
  - train_batch_size: 2
  - eval_batch_size: 2
  - seed: 42
@@ -122,21 +124,21 @@ The following hyperparameters were used during training:
  | Training Loss | Epoch | Step | Validation Loss |
  |:-------------:|:------:|:----:|:---------------:|
  | 1.8146 | 0.0317 | 1 | 2.1074 |
- | 1.767 | 0.2540 | 8 | 1.8264 |
- | 2.0003 | 0.5079 | 16 | 1.7782 |
- | 1.7691 | 0.7619 | 24 | 1.7640 |
- | 1.8407 | 1.0159 | 32 | 1.7693 |
- | 1.7637 | 1.2460 | 40 | 1.7650 |
- | 1.7748 | 1.5 | 48 | 1.7724 |
- | 1.773 | 1.7540 | 56 | 1.7669 |
- | 1.7533 | 2.0079 | 64 | 1.7599 |
- | 1.5889 | 2.2381 | 72 | 1.7678 |
- | 1.591 | 2.4921 | 80 | 1.7741 |
- | 1.7303 | 2.7460 | 88 | 1.7669 |
- | 1.5035 | 3.0 | 96 | 1.7666 |
- | 1.4954 | 3.2222 | 104 | 1.7715 |
- | 1.6623 | 3.4762 | 112 | 1.7708 |
- | 1.7277 | 3.7302 | 120 | 1.7742 |
+ | 1.7728 | 0.2540 | 8 | 1.8290 |
+ | 1.9975 | 0.5079 | 16 | 1.7875 |
+ | 1.7685 | 0.7619 | 24 | 1.7717 |
+ | 1.8368 | 1.0159 | 32 | 1.7684 |
+ | 1.768 | 1.2460 | 40 | 1.7622 |
+ | 1.7774 | 1.5 | 48 | 1.7655 |
+ | 1.7727 | 1.7540 | 56 | 1.7565 |
+ | 1.7453 | 2.0079 | 64 | 1.7502 |
+ | 1.5904 | 2.2381 | 72 | 1.7644 |
+ | 1.5978 | 2.4921 | 80 | 1.7628 |
+ | 1.7305 | 2.7460 | 88 | 1.7600 |
+ | 1.4956 | 3.0 | 96 | 1.7582 |
+ | 1.503 | 3.2222 | 104 | 1.7603 |
+ | 1.6659 | 3.4762 | 112 | 1.7634 |
+ | 1.734 | 3.7302 | 120 | 1.7650 |


  ### Framework versions
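Not part of the commit, but for context: a minimal sketch of how the pushed adapter could be loaded, assuming the `hub_model_id` from the config above and the stated TinyLlama base model; the prompt and variable names are illustrative only.

```python
# Minimal sketch (not part of this commit): load the LoRA adapter named by
# hub_model_id on top of the TinyLlama base model using transformers + peft.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T"
adapter_id = "kareemamrr/tinyllama-1.1B_dolly-4.5k_lora"  # hub_model_id from the config

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(base_model, adapter_id)  # applies the adapter weights

prompt = "Explain what a LoRA adapter is in one sentence."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```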
adapter_model.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:33dc6dba5051843c44397d04eda3bb5eaae0acfc6c7af6a8100ef6c0141a5637
- size 12726362
+ oid sha256:be606f7dabf57a40b0d23e7b10b9ef2863b08dd5b69d56cb78ae4637056551c2
+ size 50573530
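A rough sanity check on the `adapter_model.bin` size change (an inference, not stated in the commit): LoRA adds an `r x d_in` and a `d_out x r` matrix per targeted linear layer, so adapter parameters scale linearly with `lora_r`; raising `lora_r` from 4 to 16 should roughly quadruple the file, which matches the jump from ~12.7 MB to ~50.6 MB.

```python
# Rough sanity check (illustrative, not from the commit): LoRA parameter count
# per targeted linear layer is r * (d_in + d_out), i.e. linear in r.
def lora_params(d_in: int, d_out: int, r: int) -> int:
    """Parameters in the A (r x d_in) and B (d_out x r) low-rank matrices."""
    return r * d_in + d_out * r

# The r=16 / r=4 ratio is 4.0 regardless of the layer shape chosen here.
print(lora_params(2048, 2048, 16) / lora_params(2048, 2048, 4))  # 4.0
# Observed change in adapter_model.bin: ~3.97x
print(50573530 / 12726362)
```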