|
--- |
|
language: de |
|
tags: |
|
- negation |
|
- speculation |
|
- cross-lingual |
|
- bert |
|
- clinical/medical |
|
- text-classification |
|
extra_gated_prompt: >- |
|
You agree to not use the model to conduct experiments that cause harm to human |
|
subjects, e.g. by attempting to misuse clinical data or re-identify any sensitive
|
data. |
|
extra_gated_fields: |
|
Company: text |
|
Country: text |
|
Name: text |
|
Email: text |
|
I agree to use this model for non-commercial use ONLY: checkbox |
|
pipeline_tag: text-classification |
|
--- |
|
|
|
# FactualMedBERT-DE: Clinical Factuality Detection BERT model for the German language
|
|
|
## Model description |
|
|
|
FactualMedBERT-DE is the first pre-trained language model to address the factuality/assertion detection problem in German clinical texts (primarily discharge summaries).
It was introduced in the paper [Factuality Detection using Machine Translation - a Use Case for German Clinical Text](https://arxiv.org/abs/2308.08827). The model classifies tagged medical conditions
by their factuality value and supports three labels: `Affirmed`, `Negated`, and `Possible`.
|
|
|
It was initialized from the [smanjil/German-MedBERT](https://huggingface.co/smanjil/German-MedBERT) German language model and
trained on a translated subset of the data from [the 2010 i2b2/VA assertion challenge](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3168320/).
|
|
|
## How to use the model |
|
|
|
- You might need to authenticate and log in before you can download the model (see the [Hugging Face Hub quick start](https://huggingface.co/docs/huggingface_hub/quick-start)); a minimal login sketch follows the code block below
|
- Get the model using the transformers library |
|
|
|
```python |
|
from transformers import AutoTokenizer, AutoModelForSequenceClassification |
|
tokenizer = AutoTokenizer.from_pretrained("binsumait/factual-med-bert-de") |
|
model = AutoModelForSequenceClassification.from_pretrained("binsumait/factual-med-bert-de") |
|
``` |
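
If the download fails because the model repository is gated, you may need to log in first. A minimal sketch using the `huggingface_hub` library (the token value below is a placeholder for your own access token):

```python
from huggingface_hub import login

# Authenticate with a personal Hugging Face access token (placeholder value);
# alternatively, run `huggingface-cli login` in a terminal.
login(token="hf_xxx")
```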
|
|
|
- Predict an instance by pre-tagging the factuality target (ideally a medical condition) with the `[unused1]` special token:
|
|
|
```python |
|
from transformers import TextClassificationPipeline |
|
instance = "Der Patient hat vielleicht [unused1] Fieber [unused1]" |
|
|
|
factuality_pipeline = TextClassificationPipeline(model=model, tokenizer=tokenizer) |
|
|
|
print(factuality_pipeline(instance)) |
|
``` |
|
|
|
which should output: |
|
`[{'label': 'possible', 'score': 0.9744388461112976}]` |
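
If the condition is not yet tagged in the raw text, a simple string replacement can add the markers. The helper below is only an illustrative sketch and assumes the condition mention appears verbatim in the sentence:

```python
def tag_condition(sentence: str, condition: str) -> str:
    # Wrap the first occurrence of the condition with the [unused1] markers
    # that the model expects around the factuality target.
    return sentence.replace(condition, f"[unused1] {condition} [unused1]", 1)

# "Der Patient hat kein Fieber" should come out as negated.
print(factuality_pipeline(tag_condition("Der Patient hat kein Fieber", "Fieber")))
```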
|
|
|
## Cite |
|
|
|
If you use our model, please cite our paper as follows:
|
|
|
```bibtex |
|
@inproceedings{bin_sumait_2023, |
|
title={Factuality Detection using Machine Translation - a Use Case for German Clinical Text}, |
|
author={Bin Sumait, Mohammed and Gabryszak, Aleksandra and Hennig, Leonhard and Roller, Roland}, |
|
booktitle={Proceedings of the 19th Conference on Natural Language Processing (KONVENS 2023)}, |
|
year={2023} |
|
} |
|
``` |