metadata
language: de
tags:
  - negation
  - speculation
  - cross-lingual
  - bert
  - clinical/medical
  - text-classification
extra_gated_prompt: >-
  You agree not to use the model to conduct experiments that cause harm to human
  subjects, i.e. attempting to misuse clinical data or re-identify any sensitive
  data.
extra_gated_fields:
  Company: text
  Country: text
  Name: text
  Email: text
  I agree to use this model for non-commercial use ONLY: checkbox
pipeline_tag: text-classification

FactualMedBERT-DE: Clinical Factuality Detection BERT model for German language

Model description

FactualMedBERT-DE is the first pre-trained language model to address the factuality/assertion detection problem in German clinical texts (primarily discharge summaries). It was introduced in the paper Factuality Detection using Machine Translation - a Use Case for German Clinical Text. The model classifies tagged medical conditions by their factuality value, assigning one of three labels: Affirmed, Negated, or Possible.

It was initialized from the smanjil/German-MedBERT German language model and trained on a translated subset of the data from the 2010 i2b2/VA assertion challenge.

How to use the model

  • You might need to authenticate and log in before you can download the model (see more here)
  • Get the model using the transformers library
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("binsumait/factual-med-bert-de")
model = AutoModelForSequenceClassification.from_pretrained("binsumait/factual-med-bert-de")
  • Predict an instance by pre-tagging the factuality target (ideally a medical condition) with the [unused1] special token on both sides:
from transformers import TextClassificationPipeline
instance = "Der Patient hat vielleicht [unused1] Fieber [unused1]"

factuality_pipeline = TextClassificationPipeline(model=model, tokenizer=tokenizer)

print(factuality_pipeline(instance))

which should output: [{'label': 'possible', 'score': 0.9744388461112976}]
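
The pre-tagging step can also be done programmatically when the character offsets of a medical condition are already known, for example from an upstream entity recognizer. The following is a minimal sketch, not part of the released model code: the helper names `tag_target` and `build_instances` are hypothetical, and the spacing around the marker simply mirrors the example instance above. How the tokenizer treats a marker directly followed by punctuation has not been verified here.

```python
MARKER = "[unused1]"  # special token named in the usage instructions above


def tag_target(text: str, start: int, end: int, marker: str = MARKER) -> str:
    """Wrap text[start:end] in the marker token, separated by single spaces.

    Hypothetical helper: it assumes the model expects the same layout as
    the example instance, i.e. "<marker> target <marker>".
    """
    if not 0 <= start < end <= len(text):
        raise ValueError("start/end must delimit a non-empty span inside text")
    prefix, target, suffix = text[:start], text[start:end], text[end:]
    # Add a separating space only if the prefix does not already end with one.
    if prefix and not prefix[-1].isspace():
        prefix += " "
    return f"{prefix}{marker} {target} {marker}{suffix}"


def build_instances(text: str, spans: list[tuple[int, int]]) -> list[str]:
    """Return one tagged copy of `text` per (start, end) span.

    Each copy marks exactly one target, so every condition in a sentence
    is classified as a separate instance.
    """
    return [tag_target(text, start, end) for start, end in spans]


sentence = "Der Patient hat vielleicht Fieber"
start = sentence.index("Fieber")
tagged = tag_target(sentence, start, start + len("Fieber"))
print(tagged)  # Der Patient hat vielleicht [unused1] Fieber [unused1]
```

Each string returned by `tag_target` or `build_instances` can then be passed to `factuality_pipeline` in the same way as `instance` above. Tagging one target per copy is an assumption based on the single-target example; the source does not state how sentences with several conditions should be handled.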

Cite

If you use our model, please cite our paper as follows:

@inproceedings{bin_sumait_2023,
  title={Factuality Detection using Machine Translation - a Use Case for German Clinical Text},
  author={Bin Sumait, Mohammed and Gabryszak, Aleksandra and Hennig, Leonhard and Roller, Roland},
  booktitle={Proceedings of the 19th Conference on Natural Language Processing (KONVENS 2023)},
  year={2023}
}