Filip committed
Commit 655bf72 · 1 Parent(s): afc14cf
Files changed (1): README.md +5 -5

README.md CHANGED
@@ -30,11 +30,11 @@ Quantization method: float16
 ### Hyperparameters
 
 Both models used the same hyperparameters during training.
-`per_device_train_batch_size = 2`
-`gradient_accumulation_steps=4`
-`learning_rate=2e-4`
-`optim="adamw_8bit"`
-`weight_decay=0.01`
+`per_device_train_batch_size = 2`\
+`gradient_accumulation_steps=4`\
+`learning_rate=2e-4`\
+`optim="adamw_8bit"`\
+`weight_decay=0.01`\
 `lr_scheduler_type="linear"`
 
 We chose float16 as the quantization method as it has the fastest conversion and retains 100% accuracy. However, it is slow and memory hungry which is a disadvantage.
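For context, the hyperparameter names in the diff match Hugging Face `transformers` `TrainingArguments` fields, though the commit itself only lists the values, not the trainer setup, so that mapping is an assumption. A minimal sketch collecting them into one config and computing the effective batch size:

```python
# Hyperparameters as listed in the README diff, gathered into one dict.
# Field names follow transformers' TrainingArguments (an assumption --
# the commit shows only the values, not the training code).
hyperparameters = {
    "per_device_train_batch_size": 2,
    "gradient_accumulation_steps": 4,
    "learning_rate": 2e-4,
    "optim": "adamw_8bit",
    "weight_decay": 0.01,
    "lr_scheduler_type": "linear",
}

# Effective batch size per device: micro-batch size times the number
# of gradient accumulation steps before each optimizer update.
effective_batch_size = (
    hyperparameters["per_device_train_batch_size"]
    * hyperparameters["gradient_accumulation_steps"]
)
print(effective_batch_size)  # 8
```

With accumulation, each optimizer step therefore sees 2 × 4 = 8 samples per device, at the memory cost of only a 2-sample micro-batch.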
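The float16 trade-off described above can be illustrated at the level of a single value using only the standard library: Python's `struct` module supports the IEEE 754 half-precision `'e'` format. This is an illustrative sketch of per-weight rounding, not the model's actual conversion code.

```python
import struct

def to_float16(x: float) -> float:
    """Round-trip a Python float through IEEE 754 half precision,
    showing what a float16 cast does to an individual weight."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

# Values exactly representable in half precision survive unchanged...
print(to_float16(0.5))   # 0.5

# ...while others are rounded to the nearest float16, with a small error
# (float16 keeps only a 10-bit mantissa, roughly 3 decimal digits).
print(to_float16(2e-4))
```

Per-weight rounding like this is tiny, which is consistent with the README's claim that the float16 conversion preserved accuracy; the memory cost comes from float16 still using 16 bits per weight, versus 8 or fewer for integer quantization schemes.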