bilkultheek committed 056c888 (verified) · Parent(s): acda56f

End of training

README.md CHANGED
@@ -6,18 +6,23 @@ tags:
 - sft
 - generated_from_trainer
 model-index:
-- name: Cold-Data-LLama-2-7B
+- name: Cold-Again-LLama-2-7B
   results: []
 ---
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
 
-# Cold-Data-LLama-2-7B
+# Cold-Again-LLama-2-7B
 
 This model is a fine-tuned version of [NousResearch/Llama-2-7b-hf](https://huggingface.co/NousResearch/Llama-2-7b-hf) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.0526
+- eval_loss: 1.3661
+- eval_runtime: 90.0594
+- eval_samples_per_second: 1.11
+- eval_steps_per_second: 0.044
+- epoch: 5.76
+- step: 36
 
 ## Model description
 
@@ -36,7 +41,7 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- learning_rate: 0.002
+- learning_rate: 0.0001
 - train_batch_size: 16
 - eval_batch_size: 32
 - seed: 42
@@ -47,17 +52,6 @@ The following hyperparameters were used during training:
 - lr_scheduler_warmup_ratio: 0.03
 - num_epochs: 10
 
-### Training results
-
-| Training Loss | Epoch | Step | Validation Loss |
-|:-------------:|:-----:|:----:|:---------------:|
-| 0.1019        | 1.992 | 249  | 0.1022          |
-| 0.0542        | 3.984 | 498  | 0.0540          |
-| 0.0508        | 5.976 | 747  | 0.0513          |
-| 0.0479        | 7.968 | 996  | 0.0515          |
-| 0.0472        | 9.96  | 1245 | 0.0537          |
-
-
 ### Framework versions
 
 - PEFT 0.12.0
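Since the card lists PEFT 0.12.0 and the base model NousResearch/Llama-2-7b-hf, here is a minimal sketch of loading the adapter in this repo for inference. The repo id `bilkultheek/Cold-Again-LLama-2-7B` is an assumption inferred from the commit author and the model name in the card, not confirmed by the diff.

```python
# Minimal sketch: load the LoRA adapter from this repo onto its base model.
import torch
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM

adapter_id = "bilkultheek/Cold-Again-LLama-2-7B"  # assumed repo id

# AutoPeftModelForCausalLM reads adapter_config.json, fetches the base
# model (NousResearch/Llama-2-7b-hf), and attaches the adapter weights.
model = AutoPeftModelForCausalLM.from_pretrained(
    adapter_id,
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("NousResearch/Llama-2-7b-hf")

inputs = tokenizer("Hello, world:", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```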
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:38cd1ccd5681e8b6442697ddc6b05aa049d662f3fe973d76419114b54a30c5a7
+oid sha256:eaae29d5c08c301696d3b97088a18cf91eeb0e129284ac2ffc2e336e18e0807c
 size 134235048
runs/Aug20_23-57-27_fastgpuserv/events.out.tfevents.1724193656.fastgpuserv.1412094.1 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0dd21beb2705c281008c444c908b3bbc99f8b40a7608ab8cc9afd0c2307affc3
+size 7406
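The two binary entries above are Git LFS pointer files: the repository stores only a version line, an `oid sha256:` digest, and a `size`, while the actual bytes live in LFS storage. A minimal sketch, assuming `adapter_model.safetensors` has been downloaded locally, of checking a file against the digest in its pointer:

```python
# Minimal sketch: verify a downloaded file against the sha256 oid
# recorded in its Git LFS pointer.
import hashlib

def sha256_of(path: str) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

# oid from the adapter_model.safetensors pointer after this commit
expected = "eaae29d5c08c301696d3b97088a18cf91eeb0e129284ac2ffc2e336e18e0807c"
print(sha256_of("adapter_model.safetensors") == expected)
```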