BSC-LT/roberta-base-biomedical-es

ccasimiro committed on Sep 20, 2021

Commit ae3116f • 1 Parent(s): bf095a4

Update README.md

Files changed (1)

README.md +31 -13

@@ -16,19 +16,6 @@ widget:
 # Biomedical language model for Spanish
 Biomedical pretrained language model for Spanish. For more details about the corpus, the pretraining and the evaluation, read the paper "_Carrino, C. P., Armengol-Estapé, J., Gutiérrez-Fandiño, A., Llop-Palao, J., Pàmies, M., Gonzalez-Agirre, A., & Villegas, M. (2021). Biomedical and Clinical Language Models for Spanish: On the Benefits of Domain-Specific Pretraining in a Mid-Resource Scenario._"
-## BibTeX  citation
-If you use any of these resources (datasets or models) in your work, please cite our latest paper:
-```bibtex
-@misc{carrino2021biomedical,
-      title={Biomedical and Clinical Language Models for Spanish: On the Benefits of Domain-Specific Pretraining in a Mid-Resource Scenario},
-      author={Casimiro Pio Carrino and Jordi Armengol-Estapé and Asier Gutiérrez-Fandiño and Joan Llop-Palao and Marc Pàmies and Aitor Gonzalez-Agirre and Marta Villegas},
-      year={2021},
-      eprint={2109.03570},
-      archivePrefix={arXiv},
-      primaryClass={cs.CL}
-}
-```
 ## Tokenization and model pretraining
@@ -92,6 +79,37 @@ The model is ready-to-use only for masked language modelling to perform the Fill
 However, the model is intended to be fine-tuned on downstream tasks such as Named Entity Recognition or Text Classification.
+## Cite
+If you use our models, please cite our latest preprint:
+```bibtex
+@misc{carrino2021biomedical,
+      title={Biomedical and Clinical Language Models for Spanish: On the Benefits of Domain-Specific Pretraining in a Mid-Resource Scenario},
+      author={Casimiro Pio Carrino and Jordi Armengol-Estapé and Asier Gutiérrez-Fandiño and Joan Llop-Palao and Marc Pàmies and Aitor Gonzalez-Agirre and Marta Villegas},
+      year={2021},
+      eprint={2109.03570},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL}
+}
+```
+If you use our Medical Crawler corpus, please cite the preprint:
+```bibtex
+@misc{carrino2021spanish,
+      title={Spanish Biomedical Crawled Corpus: A Large, Diverse Dataset for Spanish Biomedical Language Models},
+      author={Casimiro Pio Carrino and Jordi Armengol-Estapé and Ona de Gibert Bonet and Asier Gutiérrez-Fandiño and Aitor Gonzalez-Agirre and Martin Krallinger and Marta Villegas},
+      year={2021},
+      eprint={2109.07765},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL}
+}
+```
 ---
 ## How to use