bilkultheek committed
Commit 46587a0
Parent(s): 7c0bce9

Model save

Files changed (1): README.md +5 -10
README.md CHANGED
@@ -17,8 +17,6 @@ should probably proofread and complete it, then remove this comment. -->
 # Cold-Data-LLama-2-7B
 
 This model is a fine-tuned version of [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) on the None dataset.
-It achieves the following results on the evaluation set:
-- Loss: 0.4892
 
 ## Model description
 
@@ -37,12 +35,12 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- learning_rate: 2e-05
-- train_batch_size: 8
-- eval_batch_size: 8
+- learning_rate: 0.01
+- train_batch_size: 24
+- eval_batch_size: 24
 - seed: 42
-- gradient_accumulation_steps: 8
-- total_train_batch_size: 64
+- gradient_accumulation_steps: 24
+- total_train_batch_size: 576
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_ratio: 0.03
@@ -51,9 +49,6 @@ The following hyperparameters were used during training:
 
 ### Training results
 
-| Training Loss | Epoch | Step | Validation Loss |
-|:-------------:|:------:|:----:|:---------------:|
-| 0.4494 | 4.0404 | 100 | 0.4892 |
 
 
 ### Framework versions
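For context on the new hyperparameters, below is a minimal sketch of how they would typically be expressed as `transformers.TrainingArguments` for the Hugging Face `Trainer`. The field names are standard `TrainingArguments` parameters, but the mapping is an assumption; the commit does not include the actual training script, and `output_dir` is a hypothetical placeholder.

```python
from transformers import TrainingArguments

# Sketch of the updated hyperparameters as TrainingArguments fields.
# Assumes a single device: 24 (per-device batch) x 24 (accumulation
# steps) x 1 (device) = 576, matching total_train_batch_size above.
training_args = TrainingArguments(
    output_dir="Cold-Data-LLama-2-7B",  # hypothetical output path
    learning_rate=0.01,
    per_device_train_batch_size=24,
    per_device_eval_batch_size=24,
    gradient_accumulation_steps=24,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
)
```

Note the arithmetic behind `total_train_batch_size`: it is `per_device_train_batch_size` times `gradient_accumulation_steps` times the number of devices, so 576 here implies training on one device.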
 
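Once saved, the checkpoint can be loaded like any other causal LM on the Hub. A minimal usage sketch, assuming the repo id is `bilkultheek/Cold-Data-LLama-2-7B` (inferred from the author and model name on this page, not stated in the diff):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo id: author name + model name from this commit page.
repo_id = "bilkultheek/Cold-Data-LLama-2-7B"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

# Generate a short completion to sanity-check the checkpoint.
inputs = tokenizer("The quick brown fox", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```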