bvanaken committed on
Commit 5c5b942 · 1 Parent(s): 74dfeb6

Create README

Files changed (1)
  1. README.md +70 -0
README.md ADDED
@@ -0,0 +1,70 @@
+ ---
+ language: "en"
+ tags:
+ - bert
+ - medical
+ - clinical
+ - diagnosis
+ thumbnail: "https://core.app.datexis.com/static/paper.png"
+ ---
+
+ # CORe Model - Clinical Diagnosis Prediction
+
+ ## Model description
+
+ The CORe (_Clinical Outcome Representations_) model was introduced in the paper [Clinical Outcome Prediction from Admission Notes using Self-Supervised Knowledge Integration](https://www.aclweb.org/anthology/2021.eacl-main.75.pdf).
+ It is based on BioBERT and further pre-trained on clinical notes, disease descriptions, and medical articles with a specialised _Clinical Outcome Pre-Training_ objective.
+
+ This model checkpoint is **fine-tuned on the task of diagnosis prediction**.
+ The model expects patient admission notes as input and outputs multi-label ICD9 code predictions.
+
+ #### Model Predictions
+ The model makes predictions over a total of 9237 labels. These comprise 3- and 4-digit ICD9 codes and textual descriptions of these codes. The 4-digit codes and textual descriptions help to incorporate further topical and hierarchical information into the model during training (see Section 4.2 _ICD+: Incorporation of ICD Hierarchy_ in our paper). We recommend using only the **3-digit code predictions at inference time**, because only those have been evaluated in our work.
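
As a minimal sketch of that recommendation, the helper below keeps only labels that look like 3-digit ICD9 codes. The exact label strings stored in `model.config.id2label` are not documented here, so the pattern (exactly three characters: a digit, `V`, or `E`, followed by two digits) and the name `keep_three_digit_codes` are assumptions; check them against the actual label set before relying on this.

```python
import re

# ASSUMPTION: 3-digit codes appear as bare three-character strings such as
# "401" or "V10". Longer codes (e.g. "4019") and textual descriptions
# do not match this pattern.
THREE_DIGIT_ICD9 = re.compile(r"[0-9EV][0-9]{2}")


def keep_three_digit_codes(labels):
    """Return only the labels that match the assumed 3-digit ICD9 pattern."""
    return [label for label in labels if THREE_DIGIT_ICD9.fullmatch(label)]


# Hypothetical mix of label kinds, for illustration only.
example = ["401", "4019", "essential hypertension", "V10", "430"]
print(keep_three_digit_codes(example))
```

The same filter can be applied to the `predicted_labels` list produced by the inference example.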
+
+ #### How to use CORe Diagnosis Prediction
+
+ You can load the model via the transformers library:
+ ```python
+ from transformers import AutoTokenizer, AutoModelForSequenceClassification
+ tokenizer = AutoTokenizer.from_pretrained("bvanaken/CORe-clinical-diagnosis-prediction")
+ model = AutoModelForSequenceClassification.from_pretrained("bvanaken/CORe-clinical-diagnosis-prediction")
+ ```
+
+ The following code shows an inference example:
+
+ ```python
+ import torch
+
+ # Example admission note (free text).
+ text = "CHIEF COMPLAINT: Headaches\n\nPRESENT ILLNESS: 58yo man w/ hx of hypertension, AFib on coumadin presented to ED with the worst headache of his life."
+
+ tokenized_input = tokenizer(text, return_tensors="pt")
+ with torch.no_grad():  # inference only, no gradients needed
+     output = model(**tokenized_input)
+
+ # Multi-label task: apply a sigmoid to each logit independently.
+ predictions = torch.sigmoid(output.logits)
+ # Keep every label whose probability exceeds the threshold.
+ predicted_labels = [model.config.id2label[_id] for _id in (predictions > 0.3).nonzero()[:, 1].tolist()]
+ ```
+ Note: For the best performance, we recommend determining the threshold (0.3 in this example) individually for each label.
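
One way to apply per-label thresholds is sketched below in plain Python. The function name `select_labels` and the toy values are hypothetical; in practice the probabilities would come from `predictions[0].tolist()`, and the thresholds would have to be tuned per label on held-out data (how to tune them is not specified here).

```python
def select_labels(probabilities, thresholds, id2label):
    """Return labels whose probability exceeds that label's own threshold."""
    if len(probabilities) != len(thresholds):
        raise ValueError("need exactly one threshold per label")
    return [
        id2label[i]
        for i, (p, t) in enumerate(zip(probabilities, thresholds))
        if p > t
    ]


# Toy stand-ins for model.config.id2label and one row of sigmoid outputs.
id2label = {0: "401", 1: "427", 2: "430"}
probabilities = [0.35, 0.35, 0.10]
thresholds = [0.30, 0.50, 0.05]  # hypothetical values, one per label

print(select_labels(probabilities, thresholds, id2label))
```

With tensors, the same comparison can be written as `predictions > thresholds`, where `thresholds` is a tensor of shape `(num_labels,)` that broadcasts across the batch dimension.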
+
+
+ ### More Information
+
+ For all the details about CORe and contact info, please visit [CORe.app.datexis.com](http://core.app.datexis.com/).
+
+ ### Cite
+
+ ```bibtex
+ @inproceedings{vanaken21,
+   author    = {Betty van Aken and
+                Jens-Michalis Papaioannou and
+                Manuel Mayrdorfer and
+                Klemens Budde and
+                Felix A. Gers and
+                Alexander Löser},
+   title     = {Clinical Outcome Prediction from Admission Notes using Self-Supervised
+                Knowledge Integration},
+   booktitle = {Proceedings of the 16th Conference of the European Chapter of the
+                Association for Computational Linguistics: Main Volume, {EACL} 2021,
+                Online, April 19 - 23, 2021},
+   publisher = {Association for Computational Linguistics},
+   year      = {2021},
+ }
+ ```