Update README.md
README.md
CHANGED
@@ -37,7 +37,7 @@ The model was trained on 3 corpora, which were hot-swapped during the training.
 <img src="figures/tloss_full.png" width="900"/>
 Figure 1: Training loss.
 <img src="figures/tloss_closeup.png" width="900"/>
-Figure 2: Training loss closeup. We mark two hotswap places, where the training corpus #1 was switched for internal-corpus #2 and internal-corpus #2.1 respectively.
+Figure 2: Training loss closeup. We mark two hotswap places, where the training corpus #1 was switched for internal-corpus #2 and internal-corpus #2.1 respectively. The flat region between 112k steps and 119.5k steps is caused by missing data---due to an accident, we lost these logs.
 
 Additionaly, we perform two ablations:
 