jtatman committed on
Commit
1643cc0
1 Parent(s): b0e6c9c

End of training

Files changed (4)
  1. README.md +6 -3
  2. adapter_model.safetensors +2 -2
  3. config.json +1 -1
  4. pytorch_model.bin +1 -1
README.md CHANGED
@@ -61,7 +61,7 @@ output_dir: ./outputs/lora-alpaca-pythia-160m-storytelling
  gradient_accumulation_steps: 16
  micro_batch_size: 1
  num_epochs: 3
- learning_rate: 0.001
+ learning_rate: 0.004
  lr_scheduler: cosine_with_restarts
  #cosine_min_lr_ratio: 0.1
  train_on_inputs: false
@@ -98,7 +98,7 @@ tokens:
 
  This model is a fine-tuned version of [EleutherAI/pythia-160m-deduped](https://huggingface.co/EleutherAI/pythia-160m-deduped) on the None dataset.
  It achieves the following results on the evaluation set:
- - Loss: 7.3539
+ - Loss: 5.0097
 
  ## Model description
 
@@ -117,7 +117,7 @@ More information needed
  ### Training hyperparameters
 
  The following hyperparameters were used during training:
- - learning_rate: 0.001
+ - learning_rate: 0.004
  - train_batch_size: 1
  - eval_batch_size: 1
  - seed: 42
@@ -139,6 +139,9 @@ The following hyperparameters were used during training:
  | 8.1159 | 0.9391 | 800 | 8.4966 |
  | 6.7656 | 1.1739 | 1000 | 7.1575 |
  | 7.0548 | 1.4087 | 1200 | 7.3539 |
+ | 5.9982 | 1.6445 | 1400 | 5.9954 |
+ | 5.7662 | 1.8792 | 1600 | 6.0222 |
+ | 4.8094 | 2.1140 | 1800 | 5.0097 |
 
 
  ### Framework versions
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:beed07264889070614923e60987914dfc89e9442eeeac7fc7e983c8338de1f46
- size 159266376
+ oid sha256:e44ce263e6fd885f50d82ca515b9325375b43ee36ededb75acf161ce88bc2e41
+ size 48
config.json CHANGED
@@ -22,7 +22,7 @@
  "rotary_emb_base": 10000,
  "rotary_pct": 0.25,
  "tie_word_embeddings": false,
- "torch_dtype": "float16",
+ "torch_dtype": "bfloat16",
  "transformers_version": "4.41.2",
  "use_cache": false,
  "use_parallel_residual": true,
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:9dec837ce83e8a9edef8bb8c740c7c5826b1c3b830849b2c1c30a0e610b54bc6
+ oid sha256:0144df65701734c02c9a9a78f6dcaf1b3b5f4ec06aaa3866efa3dc31c90aafc6
  size 324696090
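
The adapter_model.safetensors and pytorch_model.bin entries in this commit are Git LFS pointer files, not the weights themselves: each is three `key value` lines giving the spec version, the content hash, and the size in bytes. As a minimal sketch for reading those fields programmatically, the helper below (the name `parse_lfs_pointer` is hypothetical, not part of any library) splits a pointer into its parts. It assumes the three-line layout shown in the diff and does no validation beyond that.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into version, hash algorithm, oid and size.

    Assumes the simple layout seen in this commit: one "key value" pair per line.
    """
    fields = {}
    for line in text.strip().splitlines():
        # Split on the first space only; the value may contain ':' or '/'.
        key, _, value = line.strip().partition(" ")
        fields[key] = value

    # The oid is written as "<algorithm>:<hex digest>", e.g. "sha256:0144...".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "oid": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


# Pointer contents copied from the new pytorch_model.bin entry in this commit.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:0144df65701734c02c9a9a78f6dcaf1b3b5f4ec06aaa3866efa3dc31c90aafc6
size 324696090
"""

info = parse_lfs_pointer(pointer)
print(info["hash_algo"], info["size"])
```

Comparing the parsed `size` and `oid` across two revisions is a quick way to see whether a tracked binary actually changed; here pytorch_model.bin keeps its size while the hash differs.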