jekunz committed on
Commit 28f5516 · verified · 1 Parent(s): 31dd026

Update README.md

  1. README.md +2 -2
README.md CHANGED
@@ -18,7 +18,7 @@ Training:
  - LR scheduler: Cosine
  - Warmup ratio: 0.05
  - Batch size: 1
- - 4 A100 (80GB) GPUs
+ - 8 A100 (80GB) GPUs
  - Gradient accumulation steps: 32
- - Effective batch size: 128
+ - Effective batch size: 256
  - Max. context length: 8192 tokens
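
The two changed values move together: the effective batch size appears to be the product of the batch size, the GPU count, and the gradient accumulation steps. The sketch below checks that arithmetic for both sides of the diff. It assumes "Batch size: 1" is a per-GPU value and that training is data-parallel across all GPUs; the function and parameter names are hypothetical and not taken from the README.

```python
def effective_batch_size(per_gpu_batch_size: int, num_gpus: int, grad_accum_steps: int) -> int:
    """Number of sequences contributing to one optimizer step.

    Assumes plain data parallelism: every GPU processes its own
    micro-batch, and gradients are accumulated over several steps
    before each optimizer update.
    """
    return per_gpu_batch_size * num_gpus * grad_accum_steps


# Values before the change: 4 GPUs
old = effective_batch_size(per_gpu_batch_size=1, num_gpus=4, grad_accum_steps=32)

# Values after the change: 8 GPUs
new = effective_batch_size(per_gpu_batch_size=1, num_gpus=8, grad_accum_steps=32)

print(old, new)
```

Under these assumptions, doubling the GPU count from 4 to 8 with the other settings unchanged doubles the effective batch size from 128 to 256, which matches the updated README line.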