Update README.md
README.md
CHANGED
@@ -169,7 +169,7 @@ The training corpus consists 26B tokens of several corpora gathered from web cra
 
 The dataset has the following language distribution:
 
-|Language
+|Language|Percentage|
 |---|---|
 |En|16.84%|
 |Es|41.38%|
@@ -195,8 +195,8 @@ The training lasted a total of 320 hours on 8 NVIDIA H100 GPUs with 80GB RAM.
 - total_train_batch_size: 8
 - total_eval_batch_size: 8
 - optimizer: Adam
-- betas
-- epsilon
+- betas: (0.9,0.999)
+- epsilon: 1e-08
 - learning_rate: 5e-05
 - lr_scheduler_type: linear
 - num_epochs: 1.0
@@ -226,7 +226,7 @@ Copyright(c) 2023 by Language Technologies Unit, Barcelona Supercomputing Center
 ### Funding
 This work was partially funded by:
 - The [Departament de la Vicepresidència i de Polítiques Digitals i Territori de la Generalitat de Catalunya](https://politiquesdigitals.gencat.cat/ca/inici/index.html#googtrans(ca|en) within the framework of [Projecte AINA](https://politiquesdigitals.gencat.cat/ca/economia/catalonia-ai/aina).
-- The [Spanish State Secretariat for Digitalization and Artificial Intelligence
+- The [Spanish State Secretariat for Digitalization and Artificial Intelligence](https://portal.mineco.gob.es/en-us/digitalizacionIA/Pages/sedia.aspx) within the framework of the [Plan de Impulso de las Tecnologías del Lenguaje](https://plantl.mineco.gob.es/Paginas/index.aspx).
 
 ### Disclaimer
 
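
To make the hyperparameters filled in by this change concrete, here is a minimal sketch of a textbook Adam update and a linear learning-rate decay, using the values from the README hunk (betas (0.9,0.999), epsilon 1e-08, learning_rate 5e-05). It is an illustration only, not the project's training code: the helper names `adam_step` and `linear_lr` are hypothetical, the schedule assumes no warmup, and `total_steps` is a made-up parameter because the README excerpt does not state the number of steps.

```python
import math

# Values copied from the README hunk; everything else below is an assumption.
LEARNING_RATE = 5e-05
BETAS = (0.9, 0.999)
EPSILON = 1e-08


def adam_step(param, grad, m, v, step, lr=LEARNING_RATE, betas=BETAS, eps=EPSILON):
    """Apply one textbook Adam update to a scalar parameter.

    `step` is 1-based. Returns the updated (param, m, v).
    """
    beta1, beta2 = betas
    # Exponential moving averages of the gradient and the squared gradient.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    # Bias correction for the zero-initialised moving averages.
    m_hat = m / (1 - beta1 ** step)
    v_hat = v / (1 - beta2 ** step)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


def linear_lr(step, total_steps, base_lr=LEARNING_RATE):
    """Linear decay from base_lr to 0 (hypothetical: no warmup, total_steps assumed)."""
    return base_lr * max(0.0, 1.0 - step / total_steps)
```

On the first step the bias-corrected update has magnitude close to the learning rate regardless of the gradient's scale, which is why epsilon only matters when gradients are very small.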