---
language: "pt"
widget:
- text: "O principal [MASK] da COVID-19 é tosse seca."
- text: "O vírus da gripe apresenta um [MASK] constituído por segmentos de ácido ribonucleico."
datasets:
- biomedical literature from Scielo and Pubmed
thumbnail: "https://raw.githubusercontent.com/HAILab-PUCPR/BioBERTpt/master/logo-biobertpr1.png"
---

# BioBERTpt - Portuguese Clinical and Biomedical BERT

The paper [BioBERTpt - A Portuguese Neural Language Model for Clinical Named Entity Recognition](https://www.aclweb.org/anthology/2020.clinicalnlp-1.7/) introduces clinical and biomedical BERT-based models for Portuguese, initialized from BERT-Multilingual-Cased and trained on clinical notes and biomedical literature.

This model card describes BioBERTpt(bio), the biomedical version of BioBERTpt, trained on Portuguese biomedical literature: scientific papers from PubMed and SciELO.

## How to use the model

Load the model via the transformers library:

```
from transformers import AutoTokenizer, AutoModel

# Download (or load from the local cache) the tokenizer and model weights.
tokenizer = AutoTokenizer.from_pretrained("pucpr/biobertpt-bio")
model = AutoModel.from_pretrained("pucpr/biobertpt-bio")
```

## More Information

Refer to the original paper, [BioBERTpt - A Portuguese Neural Language Model for Clinical Named Entity Recognition](https://www.aclweb.org/anthology/2020.clinicalnlp-1.7/), for additional details and performance on Portuguese NER tasks.

## Questions?

Post a GitHub issue on the [BioBERTpt repo](https://github.com/HAILab-PUCPR/BioBERTpt).
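
## Example: filling a masked token

The widget sentences in this card's metadata contain a `[MASK]` token, so one natural way to try the model is masked-token prediction. The following is a minimal sketch rather than an official recipe. It assumes the `pucpr/biobertpt-bio` checkpoint ships its masked-language-modeling head (suggested by the widget examples, but not stated in this card), and the helper names `predict_mask` and `summarize` are hypothetical, introduced only for illustration.

```python
MODEL_NAME = "pucpr/biobertpt-bio"  # checkpoint name taken from this model card


def summarize(predictions):
    """Reduce fill-mask output dicts to (token, rounded score) pairs."""
    return [(p["token_str"], round(p["score"], 3)) for p in predictions]


def predict_mask(text, top_k=5):
    """Return the top_k candidate fillers for the single [MASK] in `text`.

    Assumption: the checkpoint includes a masked-LM head. If it does not,
    transformers initializes that head randomly and the predictions are
    not meaningful.
    """
    # Imported here so that importing this module does not require
    # transformers to be installed until a prediction is requested.
    from transformers import pipeline

    fill_mask = pipeline("fill-mask", model=MODEL_NAME)
    return summarize(fill_mask(text, top_k=top_k))
```

Calling `predict_mask("O principal [MASK] da COVID-19 é tosse seca.")` downloads the checkpoint on first use and returns a list of `(token, score)` pairs ordered from most to least likely. The input must contain exactly one `[MASK]` for the result to have this flat shape.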