UMCU committed
Commit cb98be4 · verified · 1 Parent(s): 8350d1c

Create README.md

README.md ADDED (+113 -0)
@@ -0,0 +1,113 @@
+ ---
+ tags:
+ - echocardiogram
+ - arxiv:2408.06930
+ - medical
+ language:
+ - nl
+ license: gpl-3.0
+ model-index:
+ - name: Echocardiogram_TricuspidRegurgitation_reduced
+   results:
+   - task:
+       type: text-classification
+     dataset:
+       type: test
+       name: internal test set
+     metrics:
+     - name: Macro f1
+       type: f1
+       value: 0.967
+       verified: false
+     - name: Macro precision
+       type: precision
+       value: 0.972
+       verified: false
+     - name: Macro recall
+       type: recall
+       value: 0.962
+       verified: false
+ pipeline_tag: text-classification
+ metrics:
+ - f1
+ - precision
+ - recall
+ ---
+
+ # Description
+ This model is a [MedRoBERTa.nl](https://huggingface.co/CLTL/MedRoBERTa.nl) model fine-tuned on Dutch echocardiogram reports sourced from electronic health records.
+ The publication associated with the span classification task can be found at https://arxiv.org/abs/2408.06930.
+ The config file for training the model can be found at https://github.com/umcu/echolabeler.
+
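+ # Minimum working example
+ ```python
+ # Requires the Hugging Face `transformers` package
+ from transformers import pipeline
+ ```
+ ```python
+ # Load the fine-tuned classifier from the Hugging Face Hub
+ le_pipe = pipeline(model="UMCU/Echocardiogram_TricuspidRegurgitation_reduced")
+ # Placeholder text; replace with a Dutch echocardiogram report
+ document = "Lorem ipsum"
+ results = le_pipe(document)
+ ```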
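As a rough illustration of how `results` from the example above might be consumed, the sketch below assumes the usual `text-classification` pipeline output shape: a list of dictionaries with `label` and `score` keys. The helper `summarize`, the `threshold` parameter, and the sample input are hypothetical and not part of the repository; the sample is hand-written and is not real model output. The label strings are taken from the label scheme in this README, and the model's own `id2label` mapping may spell them differently.

```python
from typing import Any, Dict, List

# Labels as listed in this README's label scheme (assumption: the model's
# id2label mapping uses the same strings).
LABELS = ("No label", "Normal", "Not Normal")


def summarize(results: List[Dict[str, Any]], threshold: float = 0.5) -> Dict[str, Any]:
    """Pick the highest-scoring prediction and flag low-confidence ones.

    `results` is assumed to look like [{"label": str, "score": float}, ...].
    `threshold` is an illustrative cut-off, not a value from the paper.
    """
    best = max(results, key=lambda r: r["score"])
    return {
        "label": best["label"],
        "score": best["score"],
        "confident": best["score"] >= threshold,
    }


# Hand-written stand-in for pipeline output; NOT produced by the model.
example = [{"label": "Not Normal", "score": 0.91}]
summary = summarize(example)
print(summary["label"], summary["confident"])
```

With the real pipeline, `summarize(le_pipe(document))` would be the analogous call. A threshold like this is one simple way to route uncertain reports to manual review.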
+
+ # Label Scheme
+
+ <details>
+
+ <summary>View label scheme</summary>
+
+ | Component | Labels |
+ | --- | --- |
+ | **`reduced`** | `No label`, `Normal`, `Not Normal` |
+ </details>
+
+ For the reduced labels, `Present` means that at least one of the pathologies has a positive result.
+
+ The pathologies are:
+
+ <details>
+
+ <summary>View pathologies</summary>
+
+ | Annotation | Pathology |
+ | --- | --- |
+ | pe | Pericardial Effusion |
+ | wma | Wall Motion Abnormality |
+ | lv_dil | Left Ventricle Dilation |
+ | rv_dil | Right Ventricle Dilation |
+ | lv_syst_func | Left Ventricle Systolic Dysfunction |
+ | rv_syst_func | Right Ventricle Systolic Dysfunction |
+ | lv_dias_func | Diastolic Dysfunction |
+ | aortic_valve_native_stenosis | Aortic Stenosis |
+ | mitral_valve_native_regurgitation | Mitral Valve Regurgitation |
+ | tricuspid_valve_native_regurgitation | Tricuspid Regurgitation |
+ | aortic_valve_native_regurgitation | Aortic Regurgitation |
+ </details>
+
+ Note: `lv_dias_func` should have been `dias_func`.
+
+ # Intended use
+ The model was developed for *document* classification of Dutch clinical echocardiogram reports.
+ Because it is a domain-specific model trained on medical data, it is **only** meant to be used for medical NLP tasks on *Dutch echocardiogram reports*.
+
+ # Data
+ The model was trained on approximately 4,000 manually annotated echocardiogram reports from the University Medical Centre Utrecht.
+ The data was anonymized before training.
+
+ | Feature | Description |
+ | --- | --- |
+ | **Name** | `Echocardiogram_TricuspidRegurgitation_reduced` |
+ | **Version** | `1.0.0` |
+ | **transformers** | `>=4.40.0` |
+ | **Default Pipeline** | `pipeline`, `text-classification` |
+ | **Components** | `RobertaForSequenceClassification` |
+ | **License** | `cc-by-sa-4.0` |
+ | **Author** | Bram van Es |
+
+ # Contact
+ If you have problems with this model, please open an issue on our GitHub repository: https://github.com/umcu/echolabeler/issues
+
+ # Usage
+ If you use the model in your work, please cite the following reference: https://doi.org/10.48550/arXiv.2408.06930
+
+ # References
+ Paper: Bauke Arends, Melle Vessies, Dirk van Osch, Arco Teske, Pim van der Harst, René van Es, Bram van Es (2024): Diagnosis extraction from unstructured Dutch echocardiogram reports using span- and document-level characteristic classification, arXiv, https://arxiv.org/abs/2408.06930