priyanka17 committed on
Commit
03b551c
1 Parent(s): 7e82cd1

Update README.md

Files changed (1)
  1. README.md +14 -2
README.md CHANGED
@@ -118,12 +118,24 @@ The dataset was compiled from https://huggingface.co/datasets/duxprajapati/sympt
118
  which was then processed in terms of data-labeling using Smabbler's QueryLab platform, ensuring an accurate representation of data-labels for common and rare diseases.
119
 
120
 
121
- Pre-processing:
122
 
123
- Data was pre-processed to ensure consistency and accuracy. This involved cleaning the data, handling missing values, and normalizing the binary encoding.
124
  Each symptom was converted into a binary feature (0 or 1), indicating its absence or presence respectively.
125
  The labels were mapped to specific diseases using a detailed mapping file to ensure accurate representation.
126
 
127
  Label Mapping:
128
 
129
  The labels in the dataset correspond to various diseases. A mapping file (mapping.json) was used to translate encoded labels to human-readable disease names.
 
118
  which was then processed in terms of data-labeling using Smabbler's QueryLab platform, ensuring an accurate representation of data-labels for common and rare diseases.
119
 
120
 
121
+ ### Pre-processing:
122
 
123
+ Pre-processing is crucial to building an accurate machine learning model and to making it reliable enough for use in the medical domain.
124
+ It involves data cleaning, which is labor-intensive: extensive manual consistency checks and iterative validation are needed to keep the final dataset at high quality.
125
+ These steps are particularly complex for medical data.
126
+
127
+ Here the data was pre-processed to ensure consistency and accuracy. This involved cleaning the data, handling missing values, and normalizing the binary encoding.
128
  Each symptom was converted into a binary feature (0 or 1), indicating its absence or presence respectively.
129
  The labels were mapped to specific diseases using a detailed mapping file to ensure accurate representation.
130
 
131
+ Smabbler simplified pre-processing by providing automated labeling, which reduced manual effort, ensured consistency,
132
+ and maintained high accuracy in the pre-processed dataset,
133
+ making it a crucial asset in building a reliable disease diagnostic model.
134
+
135
+ The data cleaning process, which would have been labor-intensive and time-consuming, was significantly expedited by Smabbler's tools and features. The platform's automation,
136
+ standardization, and validation capabilities ensured that the pre-processing was not only quicker but also more reliable and accurate.
137
+
138
+
139
  Label Mapping:
140
 
141
  The labels in the dataset correspond to various diseases. A mapping file (mapping.json) was used to translate encoded labels to human-readable disease names.
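
  The binary encoding and label mapping described above could look roughly like the following Python sketch. It is illustrative only, not the code used to build the dataset: the symptom column names and all three helper functions are hypothetical, and the layout of mapping.json (a JSON object mapping each encoded label to a disease name) is an assumption, since the README does not show the file's structure.

```python
import json

# Assumption: hypothetical symptom columns; the real dataset defines its own.
SYMPTOM_COLUMNS = ["fever", "cough", "headache", "nausea"]


def encode_symptoms(present, columns=SYMPTOM_COLUMNS):
    """Return one 0/1 value per column: 1 if the symptom is present, else 0."""
    present_set = {s.strip().lower() for s in present}
    return [1 if col in present_set else 0 for col in columns]


def load_label_mapping(path):
    """Load mapping.json, assumed here to map encoded label -> disease name."""
    with open(path, encoding="utf-8") as fh:
        raw = json.load(fh)
    # JSON object keys are always strings, so convert them to integers.
    return {int(key): name for key, name in raw.items()}


def decode_label(label, mapping):
    """Translate an encoded label to a disease name, or None if unknown."""
    return mapping.get(int(label))
```

  With a mapping loaded via `load_label_mapping("mapping.json")`, a model's predicted label can be passed to `decode_label` to obtain a readable disease name. If the actual file maps names to labels instead, the dictionary would need to be inverted first.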