Text Classification
English
medical
Commit d813a66 by yip4002 (verified) · 1 Parent(s): 4041413

Update README.md

Files changed (1)
  1. README.md +7 -3
README.md CHANGED
@@ -11,11 +11,12 @@ tags:


  # RadReportX
+
  ### Model description
- Llama3.1-8B-instruct model fine tuned on synthetic data. There are two tasks that this model can achieve. The first task is an open-ended question, which is to detect phrases in a radiology report that represents an ICD-10 code. There is no restriction about the underlying disease. The second task is to detect disease out of 13 candidates from a radiology report. The candidate diseases are [*Atelectasis, Cardiomegaly, Consolidation, Edema, Enlarged Cardiomediastinum, Fracture, Lung Lesion, Lung Opacity, Pleural Effusion, Pleural Other, Pneumonia, Pneumothorax, Support Devices*]. When there are no diseases out of the candidates, the model will output 'Normal'.
+ Llama3.1-8B-instruct model fine-tuned on the synthetic data. This model can achieve two tasks. The first task is an open-ended question, which is to detect phrases in a radiology report that represents an ICD-10 code. There is no restriction about the underlying disease. The second task is to detect disease out of 13 candidates from a radiology report. The candidate diseases are *Atelectasis, Cardiomegaly, Consolidation, Edema, Enlarged Cardiomediastinum, Fracture, Lung Lesion, Lung Opacity, Pleural Effusion, Pleural Other, Pneumonia, Pneumothorax, Support Devices*. When there are no diseases out of the candidates, the model will output '*Normal*'.

  ### Training set and training process
- There are two sources of training data. The first set is generated by GPT4o. The second source comes from MIMIC-CXR dataset (https://arxiv.org/pdf/1901.07042), with labels being extracted by Negbio algorithm. The training is conducted using torchtune framework (https://github.com/pytorch/torchtune). For details, please refer to our paper listed below.
+ There are two sources of training data. The first set is generated by GPT-4o. The second source comes from the MIMIC-CXR dataset (https://arxiv.org/pdf/1901.07042), with labels being extracted by Negbio. The training is conducted using torchtune framework (https://github.com/pytorch/torchtune). For details, please refer to our paper listed below.

  ### How to use
  Please refer to https://github.com/bionlplab/RadReportX
@@ -32,4 +33,7 @@ https://arxiv.org/pdf/2409.16563
  }

  ### Disclaimer
- This tool shows the results of research conducted in the Computational Biology Branch, NCBI. The information produced on this website is not intended for direct diagnostic use or medical decision-making without review and oversight by a clinical professional. Individuals should not change their health behavior solely on the basis of information produced on this website. NIH does not independently verify the validity or utility of the information produced by this tool. If you have questions about the information produced on this website, please see a health care professional. More information about NCBI's disclaimer policy is available.
+ The information produced on this website is not intended for direct diagnostic use or medical decision-making without review and oversight by a clinical professional. Individuals should not change their health behavior solely on the basis of information produced on this website. We do not independently verify the validity or utility of the information produced by this tool. If you have questions about the information produced on this website, please see a health care professional.
+
+ ### Acknowledgment
+ This work was supported by the National Science Foundation Faculty Early Career Development (CAREER) award number 2145640, the Intramural Research Program of the National Institutes of Health, and the Amazon Research Award. The Medical Imaging and Data Resource Center (MIDRC) is funded by the National Institute of Biomedical Imaging and Bioengineering (NIBIB) of the National Institutes of Health under contract 75N92020D00021 and through The Advanced Research Projects Agency for Health (ARPA-H).