Update README.md
README.md
CHANGED
@@ -56,7 +56,7 @@ Here is the table summarizing the architecture used for training, along with the
 | optimize | AdamW |
 | betas | 0.9, 0.999 |
 | AMSGrad | True |
-| learning rate | 5e-
+| learning rate | 5e-4 |
 | anneal strategy | cos |
 | div factor | 100 |
 | final div factor | 0.1 |
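
The row names in this hunk (`anneal strategy`, `div factor`, `final div factor`) resemble the parameters of PyTorch's `torch.optim.lr_scheduler.OneCycleLR`, although the README does not name the scheduler. As a rough sketch under that assumption, the pure-Python function below shows what learning-rate curve the updated value of 5e-4 would imply if it is the peak rate of a two-phase cosine one-cycle schedule. `PCT_START`, `one_cycle_lr`, `cosine_anneal`, and the step counts are hypothetical and not taken from the README.

```python
import math

# Values taken from the table rows in the diff.
MAX_LR = 5e-4            # "learning rate" (assumed here to be the peak rate)
DIV_FACTOR = 100         # "div factor"
FINAL_DIV_FACTOR = 0.1   # "final div factor"

# Assumption: the table does not give a warm-up fraction; 0.3 is a placeholder.
PCT_START = 0.3


def cosine_anneal(start: float, end: float, pct: float) -> float:
    """Cosine interpolation from `start` (pct=0) to `end` (pct=1)."""
    return end + (start - end) / 2.0 * (math.cos(math.pi * pct) + 1.0)


def one_cycle_lr(step: int, total_steps: int) -> float:
    """Learning rate at `step` for a two-phase cosine one-cycle schedule."""
    initial_lr = MAX_LR / DIV_FACTOR          # rate at step 0
    min_lr = initial_lr / FINAL_DIV_FACTOR    # rate at the last step
    warmup_end = PCT_START * total_steps - 1  # last step of the warm-up phase
    if step <= warmup_end:
        # Phase 1: rise from initial_lr to MAX_LR.
        return cosine_anneal(initial_lr, MAX_LR, step / warmup_end)
    # Phase 2: fall from MAX_LR to min_lr.
    pct = (step - warmup_end) / (total_steps - 1 - warmup_end)
    return cosine_anneal(MAX_LR, min_lr, pct)


if __name__ == "__main__":
    total = 1000  # hypothetical number of optimizer steps
    for s in (0, 299, 999):
        print(s, one_cycle_lr(s, total))
```

Under these assumptions the rate would start at 5e-4 / 100 = 5e-6, peak at 5e-4, and end at 5e-6 / 0.1 = 5e-5. Because the final div factor is below 1, the final rate ends up higher than the starting rate.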