---
language: de
tags:
- negation
- speculation
- cross-lingual
- bert
- clinical/medical
- text-classification
extra_gated_prompt: >-
  You agree not to use the model to conduct experiments that cause harm to human
  subjects, i.e. attempting to misuse clinical data or re-identify any sensitive
  data.
extra_gated_fields:
  Company: text
  Country: text
  Name: text
  Email: text
  I agree to use this model for non-commercial use ONLY: checkbox
pipeline_tag: text-classification
---
# FactualMedBERT-DE: Clinical Factuality Detection BERT model for the German language
## Model description
FactualMedBERT-DE is the first pre-trained language model to address the factuality/assertion detection problem in German clinical texts (primarily discharge summaries).
It was introduced in the paper [Factuality Detection using Machine Translation - a Use Case for German Clinical Text](https://arxiv.org/abs/2308.08827). The model classifies tagged medical conditions according to
their factuality value, assigning one of three labels: `Affirmed`, `Negated` or `Possible`.
It was initialized from the [smanjil/German-MedBERT](https://huggingface.co/smanjil/German-MedBERT) German language model and
was trained on a translated subset of the data from [the 2010 i2b2/VA assertion challenge](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3168320/).
## How to use the model
- You might need to authenticate and log in before you can download the model (see more [here](https://huggingface.co/docs/huggingface_hub/quick-start)).
- Load the model using the transformers library:
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("binsumait/factual-med-bert-de")
model = AutoModelForSequenceClassification.from_pretrained("binsumait/factual-med-bert-de")
```
- Predict an instance by pre-tagging the factuality target (ideally a medical condition) with the `[unused1]` special token on both sides:
```python
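from transformers import TextClassificationPipeline

# The factuality target ("Fieber") is wrapped in [unused1] markers.
instance = "Der Patient hat vielleicht [unused1] Fieber [unused1]"

# Reuses the `model` and `tokenizer` objects loaded in the previous snippet.
factuality_pipeline = TextClassificationPipeline(model=model, tokenizer=tokenizer)
print(factuality_pipeline(instance))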
```
which should output:
`[{'label': 'possible', 'score': 0.9744388461112976}]`
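
The pre-tagging step above can be automated when the character offsets of the target are already known, for example from an upstream entity recognizer. The following is a minimal sketch, not part of the released model code: the helper name `tag_target` is hypothetical, and the single space between each marker and the target is an assumption copied from the example instance above.

```python
MARKER = "[unused1]"  # special token used above to delimit the factuality target


def tag_target(text: str, start: int, end: int, marker: str = MARKER) -> str:
    """Wrap the character span text[start:end] in marker tokens.

    Hypothetical helper: the spacing around the markers mirrors the
    example instance in this README and is an assumption.
    """
    if not 0 <= start < end <= len(text):
        raise ValueError("start/end must describe a non-empty span inside text")
    target = text[start:end]
    return f"{text[:start]}{marker} {target} {marker}{text[end:]}"


sentence = "Der Patient hat vielleicht Fieber"
start = sentence.find("Fieber")
instance = tag_target(sentence, start, start + len("Fieber"))
print(instance)
# Der Patient hat vielleicht [unused1] Fieber [unused1]
```

The resulting string can be passed to `factuality_pipeline` exactly like the hand-written instance. For a sentence with several conditions, one tagged copy per target is the natural fit, since each instance in the example carries a single marked target.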
## Cite
If you use our model, please cite our paper as follows:
```bibtex
@inproceedings{bin_sumait_2023,
title={Factuality Detection using Machine Translation - a Use Case for German Clinical Text},
author={Bin Sumait, Mohammed and Gabryszak, Aleksandra and Hennig, Leonhard and Roller, Roland},
booktitle={Proceedings of the 19th Conference on Natural Language Processing (KONVENS 2023)},
year={2023}
}
```