ML Wong committed on
Commit
cb9f0a7
·
1 Parent(s): 1ee1ecc

Update README.md

Files changed (1):
  README.md +58 -2
README.md CHANGED
@@ -13,6 +13,62 @@ widget:
  example_title: "Example 3"
  ---
  # Intro
- This model was built on Microsoft's BERT trained on PubMed uncased database (`microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext`). I have extracted > 400 radiology reports for staging nasopharyngeal carcinoma (NPC). To focus on NPC, incidental findings and unrelated observations are removed.
 
- A tokenizer was trained based on the original PubMed version, and the radiology reports were used to fine tune the PubMedBert. This fine tuned model has the weakness of unable to identify phrase or multi-word nouns, e.g. "nodal metastatases" is considered two separate words such that the BERT module tends to fill "nodes" when these two words are masked.
+ This model was built on Microsoft's BERT pretrained on the uncased PubMed corpus (`microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext`). I extracted more than 400 radiology reports for staging nasopharyngeal carcinoma (NPC). To keep the focus on NPC, incidental findings and unrelated observations were removed prior to training. In addition, abbreviations for anatomical structures were replaced with the full words to help the model learn suffixes and prefixes that may indicate anatomical locations (e.g. L neck -> left neck, IJC -> internal jugular chain).
 
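The abbreviation expansion described above amounts to a substitution pass over each report before training. A minimal sketch, assuming a hand-curated mapping; only the two entries shown in the text are real, and `expand_abbreviations` is a hypothetical helper name:

```python
import re

# Assumed abbreviation map. The two entries below are the examples given
# in the text; a real map would be curated from the report corpus.
ABBREVIATIONS = {
    r"\bL neck\b": "left neck",
    r"\bIJC\b": "internal jugular chain",
}

def expand_abbreviations(report: str) -> str:
    """Replace structure abbreviations with their full words."""
    for pattern, full in ABBREVIATIONS.items():
        report = re.sub(pattern, full, report)
    return report

print(expand_abbreviations("Enlarged node in L neck along the IJC."))
# -> Enlarged node in left neck along the internal jugular chain.
```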
+ A tokenizer was trained based on the original PubMed version, and the radiology reports were used to fine-tune PubMedBERT. The fine-tuned model has a known weakness: it cannot identify multi-word nouns as single phrases, e.g. "nodal metastases" is treated as two separate words, so the model tends to fill in "nodes" when these two words are masked.
+
+ This model serves as a pilot analysis of whether a transformer-based deep learning approach can be adopted for a radiology report corpus of NPC.
+
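One possible mitigation for the multi-word-noun weakness (not applied in this model) is to join known multi-word terms into single tokens before tokenizer training, so a phrase like "nodal metastases" is masked and predicted as a unit. A minimal sketch, with a hypothetical term list and helper name:

```python
# Hypothetical preprocessing: merge known multi-word terms with an
# underscore so the tokenizer keeps them as one unit. The term list
# is illustrative, not part of the released model.
MULTI_WORD_TERMS = ["nodal metastases"]

def merge_phrases(text: str) -> str:
    """Join each known multi-word term into a single token."""
    for term in MULTI_WORD_TERMS:
        text = text.replace(term, term.replace(" ", "_"))
    return text

print(merge_phrases("suspicious for nodal metastases"))
# -> suspicious for nodal_metastases
```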
+ # Training Losses
+ | Epoch | Training Loss | Validation Loss |
+ |-------|---------------|-----------------|
+ | 1 | No log | 3.474347 |
+ | 2 | No log | 3.174083 |
+ | 3 | No log | 2.944307 |
+ | 4 | No log | 2.674384 |
+ | 5 | No log | 2.574261 |
+ | 6 | No log | 2.390012 |
+ | 7 | No log | 2.209419 |
+ | 8 | 2.464700 | 2.107448 |
+ | 9 | 2.464700 | 1.974744 |
+ | 10 | 2.464700 | 1.841606 |
+ | 11 | 2.464700 | 1.783265 |
+ | 12 | 2.464700 | 1.674914 |
+ | 13 | 2.464700 | 1.572721 |
+ | 14 | 2.464700 | 1.546106 |
+ | 15 | 2.464700 | 1.507173 |
+ | 16 | 1.153500 | 1.445264 |
+ | 17 | 1.153500 | 1.394671 |
+ | 18 | 1.153500 | 1.345976 |
+ | 19 | 1.153500 | 1.312650 |
+ | 20 | 1.153500 | 1.256743 |
+ | 21 | 1.153500 | 1.233211 |
+ | 22 | 1.153500 | 1.213525 |
+ | 23 | 1.153500 | 1.182824 |
+ | 24 | 0.681100 | 1.164411 |
+ | 25 | 0.681100 | 1.128899 |
+ | 26 | 0.681100 | 1.145166 |
+ | 27 | 0.681100 | 1.079617 |
+ | 28 | 0.681100 | 1.087909 |
+ | 29 | 0.681100 | 1.102839 |
+ | 30 | 0.681100 | 1.066386 |
+ | 31 | 0.681100 | 1.094807 |
+ | 32 | 0.478400 | 1.060072 |
+ | 33 | 0.478400 | 1.016879 |
+ | 34 | 0.478400 | 0.999808 |
+ | 35 | 0.478400 | 0.987576 |
+ | 36 | 0.478400 | 1.011713 |
+ | 37 | 0.478400 | 0.996884 |
+ | 38 | 0.478400 | 1.018533 |
+ | 39 | 0.478400 | 1.015250 |
+ | 40 | 0.378400 | 0.945075 |
+ | 41 | 0.378400 | 0.950782 |
+ | 42 | 0.378400 | 1.004242 |
+ | 43 | 0.378400 | 0.984930 |
+ | 44 | 0.378400 | 0.966999 |
+ | 45 | 0.378400 | 0.988593 |
+ | 46 | 0.378400 | 0.970504 |
+ | 47 | 0.378400 | 0.976804 |
+ | 48 | 0.339400 | 1.001518 |
+ | 49 | 0.339400 | 0.986024 |
+ | 50 | 0.339400 | 0.987911 |
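Validation loss reaches its minimum at epoch 40 (0.945075) and only fluctuates afterwards, so later epochs add little. A quick sketch for locating the best epoch from logged (epoch, validation loss) pairs, abridged here to a few rows of the table:

```python
# (epoch, validation loss) pairs taken from the table above,
# abridged to the rows around the minimum.
val_losses = [
    (39, 1.015250),
    (40, 0.945075),
    (41, 0.950782),
    (50, 0.987911),
]

# Pick the epoch with the lowest validation loss.
best_epoch, best_loss = min(val_losses, key=lambda pair: pair[1])
print(best_epoch, best_loss)  # -> 40 0.945075
```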