nayohan committed on
Commit b685114 · 1 Parent(s): ae87ae7

Update README.md

Files changed (1): README.md (+5 -5)
README.md CHANGED
@@ -12,22 +12,22 @@ tags:
 base_model: EleutherAI/polyglot-ko-12.8b
 ---
 
-This model is a instruct-tuned poylglot-ko-12.8b model, using 10% [Kullm, OIG, KoAlpaca] Instruction dataset.
+This model is a instruct-tuned poylglot-ko-12.8b model, using 10% [Kullm, OIG, KoAlpaca] Instruction dataset. -> 29step
 
 ## Training hyperparameters
 - learning_rate: 5e-5
 - seed: 42
-- distributed_type: multi-GPU (A100 80G)
+- distributed_type: multi-GPU (A100 40G) + CPU offloading (512GB)
+- num_devices: 1
 - train_batch_size: 4
-- num_devices: 4
-- gradient_accumulation_steps: 4
+- gradient_accumulation_steps: 16
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - num_epochs: 2.0
 
 ## Framework versions
 - Transformers 4.35.0
-- Pytorch 2.0.1+cu118
+- Pytorch 2.0.1+cu117
 - Datasets 2.14.6
 - deepspeed 0.11.1
 - accelerate 0.24.1
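The hyperparameter changes in this diff are self-consistent: dropping from 4 devices to 1 while raising gradient_accumulation_steps from 4 to 16 keeps the effective global batch size unchanged. A minimal sketch of that arithmetic, assuming the usual Transformers/DeepSpeed definition (num_devices × per-device batch × accumulation steps); the helper function name is hypothetical, not from the commit:

```python
# Hypothetical helper illustrating the effective-batch-size arithmetic
# implied by the diff; the definition below is the standard one used
# with Transformers/DeepSpeed, not code from this repository.
def effective_batch_size(num_devices: int,
                         train_batch_size: int,
                         gradient_accumulation_steps: int) -> int:
    """Global number of samples consumed per optimizer step."""
    return num_devices * train_batch_size * gradient_accumulation_steps

# Before this commit: 4 GPUs, per-device batch 4, 4 accumulation steps.
old = effective_batch_size(4, 4, 4)    # 64
# After: 1 GPU (A100 40G + CPU offloading), per-device batch 4, 16 steps.
new = effective_batch_size(1, 4, 16)   # 64

# The single-GPU + CPU-offloading setup trades throughput for memory
# while preserving the same effective batch size of 64.
assert old == new
```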