umarbutler
committed on
Commit
•
01dc13a
1
Parent(s):
b19ae6f
Update README.md
README.md
CHANGED
@@ -82,7 +82,7 @@ The model was trained with the following hyperparameters for the first 100,290 s
 | Weight decay | 0.01 |
 | Warmup ratio | 0.06 |

-After training on two RTX A6000s for \~120,050 steps over a period of 91 hours, the [vast.ai](https://vast.ai) instance hosting the model crashed. Fortunately, a checkpoint had been saved at step 100,290 (
+After training on two RTX A6000s for \~120,050 steps over a period of 91 hours, the [vast.ai](https://vast.ai) instance hosting the model crashed. Fortunately, a checkpoint had been saved at step 100,290 (\~60% of an epoch), although the optimiser's state was mistakenly not downloaded. The model was subsequently moved to a new instance where it was trained on an L40 for a further 133,711 steps (\~40% of an epoch) with the following hyperparameters (changes emphasised):
 | Hyperparameter | Value |
 | --- | --- |
 | Sequence length | 512 |
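
The step counts quoted in the changed paragraph can be sanity-checked with a little arithmetic. The sketch below is illustrative only: the constant and function names are hypothetical, and the epoch fractions are the approximate (`~`) figures from the README, so the derived steps-per-epoch values are rough. The two phases imply different steps per epoch, which would be consistent with a hyperparameter such as batch size changing between them, though the visible hunk does not show which values changed.

```python
# Figures quoted in the README paragraph changed by this commit.
STEPS_BEFORE_CRASH = 120_050  # approximate steps completed when the instance crashed
CHECKPOINT_STEP = 100_290     # step of the last checkpoint that was saved
RESUMED_STEPS = 133_711       # further steps trained on the new instance

# Approximate epoch fractions quoted in the README (rough values only).
FIRST_PHASE_EPOCH_FRACTION = 0.60
SECOND_PHASE_EPOCH_FRACTION = 0.40


def steps_lost(steps_before_crash: int, checkpoint_step: int) -> int:
    """Steps trained after the last checkpoint and therefore lost in the crash."""
    return steps_before_crash - checkpoint_step


def implied_steps_per_epoch(steps: int, epoch_fraction: float) -> float:
    """Steps a full epoch would take if `steps` covered `epoch_fraction` of it."""
    return steps / epoch_fraction


lost = steps_lost(STEPS_BEFORE_CRASH, CHECKPOINT_STEP)
total_steps = CHECKPOINT_STEP + RESUMED_STEPS
first_phase = implied_steps_per_epoch(CHECKPOINT_STEP, FIRST_PHASE_EPOCH_FRACTION)
second_phase = implied_steps_per_epoch(RESUMED_STEPS, SECOND_PHASE_EPOCH_FRACTION)

print(f"Steps lost to the crash: {lost:,}")
print(f"Total steps kept across both phases: {total_steps:,}")
print(f"Implied steps per epoch, first phase: {first_phase:,.0f}")
print(f"Implied steps per epoch, second phase: {second_phase:,.0f}")
```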