Commit ac5a1bf (parent 61046e0) · Filip committed: update
README.md CHANGED

@@ -44,7 +44,7 @@ These hyperparameters are [suggested as default](https://docs.unsloth.ai/tutoria
 `dropout=0.3`\
 `per_device_train_batch_size=20`\
 `gradient_accumulation_steps=40`\
-`learning_rate=2e-2
+`learning_rate=2e-2`

 The effects of this were evident. One step took around 10 minutes due to the increased `gradient_accumulation_steps`, and it required a significant amount of GPU memory due to `per_device_train_batch_size=20`. It also overfitted in just 15 steps, reaching `loss=0`, due to the high learning rate. We wanted to see whether dropout could prevent overfitting while keeping the high learning rate, but it could not.
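For reference, the hyperparameter combination above can be sketched as a plain config dict. The argument names follow the usual `transformers.TrainingArguments` / `peft.LoraConfig` conventions, and mapping the README's `dropout` to LoRA's `lora_dropout` is an assumption (the README abbreviates the name); this is not the exact training script:

```python
# Hypothetical sketch of the hyperparameter combination from the README diff.
# Names follow transformers.TrainingArguments / peft.LoraConfig conventions;
# `lora_dropout` is assumed to be what the README calls `dropout`.
hyperparams = {
    "lora_dropout": 0.3,
    "per_device_train_batch_size": 20,
    "gradient_accumulation_steps": 40,
    "learning_rate": 2e-2,
}

# Effective batch size per optimizer step (single GPU): with gradient
# accumulation, one step processes batch_size * accumulation_steps samples,
# which is why a single step took around 10 minutes.
effective_batch = (
    hyperparams["per_device_train_batch_size"]
    * hyperparams["gradient_accumulation_steps"]
)
print(effective_batch)  # 800 samples per optimizer step
```

The large effective batch explains the slow steps, while the high `learning_rate=2e-2` drove the rapid overfitting the paragraph describes.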