Update README.md
README.md
CHANGED
@@ -148,7 +148,7 @@ The adaptation procedure is explained in [this blog post](https://medium.com/@mp
 
 The training corpus consists of 26B tokens of several corpora gathered from web crawlings and public domain data.
 
-| Dataset | Language |
+| Dataset | Language | Words (per-epoch) | Epochs |
 |---------------------|----------|--------------------|--------------|
 | Wikipedia | en | 2169.97M | 1.428144485 |
 | C4_es | es | 53709.80M | 0.1049686196 |
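
With the restored header, each row pairs a per-epoch word count with the number of epochs. A minimal sketch of how the two columns might combine, assuming that words seen during training equal words per epoch multiplied by epochs (an interpretation, not something the README states); `ROWS` and `effective_words_m` are hypothetical names used only for illustration:

```python
# Rows copied from the table in the diff; word counts are in millions (M).
ROWS = [
    ("Wikipedia", "en", 2169.97, 1.428144485),
    ("C4_es", "es", 53709.80, 0.1049686196),
]


def effective_words_m(words_per_epoch_m: float, epochs: float) -> float:
    """Assumed reading of the table: words seen = words per epoch * epochs."""
    return words_per_epoch_m * epochs


for name, lang, words_m, epochs in ROWS:
    seen = effective_words_m(words_m, epochs)
    print(f"{name} ({lang}): {seen:.2f}M words seen")
```

Under this reading, a fractional epoch count such as the one for C4_es means only part of that dataset was sampled, while a value above 1 means the dataset was repeated.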