Update README.md
README.md
CHANGED
@@ -120,7 +120,7 @@ for split in ("random", "stepwise", "gaussian"):
 
 We then used the same setup as Liu et al. (2019) but trained only for half the steps (250k) on a sequence length of 128. Then, we continued training the most promising model for 25k more on sequence length 512.
 
-**MENTION TWO WAYS TO CONTINUE TRAINING ON 512 AND SHOW DIFFERENCE IN PERFORMANCE, DO WE HAVE A GRAPH FOR THIS?**
+**MENTION TWO WAYS TO CONTINUE TRAINING ON 512 AND SHOW DIFFERENCE IN PERFORMANCE, DO WE HAVE A GRAPH FOR THIS?** .
 
 ## Results
 
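
For reference, the two-phase schedule described in the README context line (250k steps at sequence length 128, then 25k more at sequence length 512) could be written down as a small lookup like the one below. This is only a sketch: the step counts and sequence lengths come from the README text, while `Phase`, `SCHEDULE`, and `phase_at` are hypothetical names that do not appear in the repository. It does not model the "two ways to continue training on 512" that the TODO line asks about.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    seq_len: int  # maximum sequence length in tokens
    steps: int    # number of training steps spent in this phase


# Values taken from the README text; the names are illustrative assumptions.
SCHEDULE = [
    Phase(seq_len=128, steps=250_000),
    Phase(seq_len=512, steps=25_000),
]


def phase_at(step, schedule=SCHEDULE):
    """Return the phase that a 0-indexed global training step falls into."""
    start = 0
    for phase in schedule:
        if step < start + phase.steps:
            return phase
        start += phase.steps
    raise ValueError(f"step {step} is past the end of the schedule")


total_steps = sum(phase.steps for phase in SCHEDULE)
```

Here `phase_at(step).seq_len` gives the sequence length a data loader would use at a given step, and `total_steps` is the combined length of both phases.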