bilkultheek committed
Commit cf6813b
1 Parent(s): 797c227

End of training

Files changed (1)
  1. README.md +10 -10
README.md CHANGED
@@ -1,6 +1,7 @@
 ---
-base_model: NousResearch/Llama-2-7b-hf
+base_model: ahxt/LiteLlama-460M-1T
 library_name: peft
+license: mit
 tags:
 - trl
 - sft
@@ -15,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 # YaHaHamaraLlama
 
-This model is a fine-tuned version of [NousResearch/Llama-2-7b-hf](https://huggingface.co/NousResearch/Llama-2-7b-hf) on the None dataset.
+This model is a fine-tuned version of [ahxt/LiteLlama-460M-1T](https://huggingface.co/ahxt/LiteLlama-460M-1T) on the None dataset.
 
 ## Model description
 
@@ -36,15 +37,14 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 0.0002
 - train_batch_size: 8
-- eval_batch_size: 6
+- eval_batch_size: 8
 - seed: 42
-- gradient_accumulation_steps: 8
-- total_train_batch_size: 64
+- gradient_accumulation_steps: 4
+- total_train_batch_size: 32
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type: cosine
+- lr_scheduler_type: linear
 - lr_scheduler_warmup_ratio: 0.03
-- num_epochs: 3
-- mixed_precision_training: Native AMP
+- num_epochs: 5
 
 ### Training results
 
@@ -53,7 +53,7 @@ The following hyperparameters were used during training:
 ### Framework versions
 
 - PEFT 0.12.0
-- Transformers 4.42.4
+- Transformers 4.43.3
 - Pytorch 2.3.1+cu121
-- Datasets 2.20.0
+- Datasets 2.17.0
 - Tokenizers 0.19.1
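For context, the updated hyperparameters correspond to a standard TRL supervised fine-tuning run with a PEFT adapter. The sketch below is a hedged reconstruction, not the authors' script: only the numeric values and the base model name come from the updated card. The dataset file (`train.jsonl`), the `text` column, the LoRA settings, the sequence length, and the use of TRL 0.8/0.9-era `SFTTrainer` keyword arguments (`dataset_text_field`, `max_seq_length`, which newer TRL moves into `SFTConfig`) are all assumptions.

```python
# Hedged reconstruction of the training setup -- NOT the repository's actual script.
# Only the hyperparameter values and the base model mirror the updated card;
# dataset, text column, LoRA settings, and sequence length are placeholders.
from datasets import load_dataset
from peft import LoraConfig
from transformers import TrainingArguments
from trl import SFTTrainer

# Placeholder dataset: the card does not name the dataset it was trained on.
train_dataset = load_dataset("json", data_files="train.jsonl", split="train")

# Assumed LoRA adapter config; the card only records that PEFT 0.12.0 was used.
peft_config = LoraConfig(task_type="CAUSAL_LM", r=16, lora_alpha=32, lora_dropout=0.05)

training_args = TrainingArguments(
    output_dir="YaHaHamaraLlama",
    learning_rate=2e-4,               # learning_rate: 0.0002
    per_device_train_batch_size=8,    # train_batch_size: 8
    per_device_eval_batch_size=8,     # eval_batch_size: 8
    gradient_accumulation_steps=4,    # 8 x 4 = total_train_batch_size 32
    num_train_epochs=5,               # num_epochs: 5
    lr_scheduler_type="linear",       # lr_scheduler_type: linear
    warmup_ratio=0.03,                # lr_scheduler_warmup_ratio: 0.03
    seed=42,                          # seed: 42
)

trainer = SFTTrainer(
    model="ahxt/LiteLlama-460M-1T",   # base_model from the updated card
    args=training_args,
    train_dataset=train_dataset,
    peft_config=peft_config,
    dataset_text_field="text",        # assumed column name
    max_seq_length=1024,              # assumed; not recorded in the card
)
trainer.train()
```

Note that the optimizer line in the card (Adam with betas=(0.9,0.999), epsilon=1e-08) matches the `TrainingArguments` default, so it is not set explicitly above.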