Danish medical BERT

MeDa-BERT was initialized with weights from a pretrained Danish BERT model and then pretrained for 48 epochs with the masked language modeling (MLM) objective on a Danish medical corpus of 123M tokens.
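To illustrate what the MLM objective asks of the model, here is a minimal sketch of token masking in plain Python. The 15% default masking rate follows the original BERT recipe and is an assumption here, not a setting confirmed for MeDa-BERT. The helper name `mask_tokens` and the example sentence are hypothetical, and the sketch skips BERT's random-replacement and keep-unchanged variants for brevity.

```python
import random

MASK_TOKEN = "[MASK]"  # mask token used by BERT-style tokenizers


def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """Replace a random subset of tokens with [MASK].

    Returns (masked_tokens, labels). labels[i] holds the original token
    at masked positions and None elsewhere, so a training loss would be
    computed only on the masked positions.
    """
    rng = random.Random(seed)  # seeded for reproducibility
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(MASK_TOKEN)
            labels.append(tok)
        else:
            masked.append(tok)
            labels.append(None)
    return masked, labels


# Illustrative whitespace-tokenized Danish sentence (not from the corpus).
tokens = "Patienten fik ordineret antibiotika mod lungebetændelse".split()
masked, labels = mask_tokens(tokens, mask_prob=0.3, seed=1)
print(masked)
print(labels)
```

During pretraining the model sees the masked sequence and is trained to predict the original tokens at the masked positions.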

The development of the corpus and model is described further in the paper cited below.

Here is an example of how to load the model in PyTorch using the 🤗 Transformers library:

from transformers import AutoTokenizer, AutoModelForMaskedLM
tokenizer = AutoTokenizer.from_pretrained("jannikskytt/MeDa-Bert")
model = AutoModelForMaskedLM.from_pretrained("jannikskytt/MeDa-Bert")
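
Once loaded, the model can be queried for masked-token predictions. The following is a minimal sketch using the Transformers `fill-mask` pipeline. The helper names `build_fill_mask` and `top_predictions` and the Danish example sentence are hypothetical and chosen only for illustration.

```python
def build_fill_mask(model_id="jannikskytt/MeDa-Bert"):
    """Create a fill-mask pipeline (downloads the model on first use)."""
    from transformers import pipeline  # imported lazily; requires transformers
    return pipeline("fill-mask", model=model_id)


def top_predictions(fill_mask, text, k=5):
    """Return the k most likely fillers for the single mask token in text.

    fill_mask is a fill-mask pipeline; for one mask it returns a list of
    dicts that include the keys "token_str" and "score".
    """
    results = fill_mask(text, top_k=k)
    return [(r["token_str"], round(r["score"], 4)) for r in results]


# Example usage (commented out because it downloads the model weights):
# fill_mask = build_fill_mask()
# text = f"Patienten har ondt i {fill_mask.tokenizer.mask_token}."
# for token, score in top_predictions(fill_mask, text):
#     print(token, score)
```

Reading the mask token from `fill_mask.tokenizer.mask_token` avoids hard-coding `[MASK]`, so the same helper works if the tokenizer uses a different mask string.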

Citing

@inproceedings{pedersen-etal-2023-meda,
    title = "{M}e{D}a-{BERT}: A medical {D}anish pretrained transformer model",
    author = "Pedersen, Jannik  and
      Laursen, Martin  and
      Vinholt, Pernille  and
      Savarimuthu, Thiusius Rajeeth",
    booktitle = "Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa)",
    month = may,
    year = "2023",
    address = "T{\'o}rshavn, Faroe Islands",
    publisher = "University of Tartu Library",
    url = "https://aclanthology.org/2023.nodalida-1.31",
    pages = "301--307",
}