added the dataset table
- pages/Dataset.py +32 -40
- training/training.ipynb +1 -1
pages/Dataset.py
CHANGED
@@ -4,6 +4,7 @@ import os
 from PIL import Image
 import matplotlib.pyplot as plt
 import seaborn as sns
+import numpy as np
 
 st.set_page_config(layout="wide")
 st.title("🩺 Diabetic Retinopathy Project")
@@ -21,16 +22,15 @@ with tab1:
 **Dataset Description:**
 
 The DDR dataset contains **13,673 fundus images** from **147 hospitals** across **23 provinces in China**. The images are labeled into 5 classes based on DR severity:
-
 - **No_DR**
 - **Mild**
 - **Moderate**
 - **Severe**
 - **Proliferative_DR**
 
-Poor-quality images were removed, and black backgrounds were deleted.
+Poor-quality images were removed, and black backgrounds were deleted. **12,521 images left**
 [📎 Dataset source](https://www.kaggle.com/datasets/mariaherrerot/ddrdataset)
-
+
 ### 🧪 Data Preparation & Splitting
 
 - All images resized to **224x224**
@@ -60,7 +60,11 @@ with tab2:
 }
 df['label'] = df['diagnosis'].map(label_map)
 
-# --- Metric 1:
+# --- Metric 1: Full Dataset Table ---
+st.subheader("3️⃣ Full Dataset Table")
+st.dataframe(df, use_container_width=True)
+
+# --- Metric 2: Class Distribution ---
 st.subheader("1️⃣ Class Distribution")
 class_counts = df['label'].value_counts().reset_index()
 class_counts.columns = ['Class', 'Count']
@@ -70,7 +74,7 @@ with tab2:
 ax1.set_title("Class Distribution")
 st.pyplot(fig1)
 
-# --- Metric
+# --- Metric 3: Sample Images Per Class ---
 st.subheader("2️⃣ Sample Images Per Class")
 
 cols = st.columns(len(class_counts))
@@ -82,36 +86,6 @@ with tab2:
     cols[i].image(image, caption=label, use_container_width=True)
 else:
     cols[i].write(f"Image not found: {sample_row['id_code']}")
-
-# --- Metric 3: Image Size Distribution ---
-st.subheader("3️⃣ Image Size Distribution")
-
-image_sizes = []
-
-# Check a few images per class for speed
-for label in class_counts['Class']:
-    sample_paths = df[df['label'] == label]['id_code'][:5]  # 5 images per class
-    for img_code in sample_paths:
-        img_path = os.path.join(IMG_FOLDER, str(img_code))  # Assuming image filenames are id_code.png
-        if os.path.exists(img_path):
-            try:
-                with Image.open(img_path) as img:
-                    image_sizes.append(img.size)
-            except Exception as e:
-                st.warning(f"Error loading image {img_code}: {e}")
-                pass
-
-if image_sizes:
-    widths, heights = zip(*image_sizes)
-    fig2, ax2 = plt.subplots()
-    sns.histplot(widths, kde=True, label="Width", color="blue")
-    sns.histplot(heights, kde=True, label="Height", color="green")
-    ax2.legend()
-    ax2.set_title("Image Size Distribution")
-    st.pyplot(fig2)
-else:
-    st.info("No image size data available. Check your paths.")
-
 # =============================
 # Tab 3: Algorithm Used
 # =============================
@@ -119,14 +93,32 @@ with tab3:
 st.markdown("""
 ### 🤖 Model and Algorithm
 
-We used **Transfer Learning** with **
+We used **Transfer Learning** with **DenseNet121** for DR classification.
 
 #### 🏗️ Model Details:
+- Model: **DenseNet121** (pretrained on **ImageNet**)
 - Input Image Size: **224x224**
--
-- Optimizer: **
+- Batch Size: **32**
+- Optimizer: **AdamW** (learning rate = **1e-3**)
 - Loss Function: **Categorical Crossentropy**
 - Evaluation Metrics: **Accuracy**, **Precision**, **Recall**
 
-
-
+#### 📊 Evaluation Results:
+- **Top-1 Accuracy:** 85.0%
+- **Top-2 Accuracy:** 84.9%
+- **Top-3 Accuracy:** 84.6%
+
+#### 🖥️ Training Environment:
+- **Operating System:** Windows
+- **Hardware:** CPU only (no GPU)
+- **Epochs:** 15
+- **Training Time:** ~41 minutes per epoch
+
+Since the training was done on a CPU, it was slower compared to using a GPU.
+Because of this, we only trained for 15 epochs to save time.
+
+DenseNet121 was selected because it passes features directly to deeper layers,
+which helps improve learning and reduces overfitting — especially useful in medical images like eye scans.
+https://www.researchgate.net/publication/373171778_Deep_learning-enhanced_diabetic_retinopathy_image_classification
+""")
+
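For reference, a minimal sketch of the transfer-learning setup listed in the Model Details above (DenseNet121 pretrained on ImageNet, 224x224 inputs, AdamW at 1e-3, categorical crossentropy, accuracy/precision/recall). The commit itself does not include the training code, so the framework choice (TensorFlow/Keras) and the `train_ds`/`val_ds` dataset names are assumptions, not the repo's actual code.

```python
# Sketch only: DenseNet121 transfer learning as described in the page text.
# Assumes TensorFlow >= 2.11 (for tf.keras.optimizers.AdamW) and two tf.data
# datasets, train_ds and val_ds, yielding (image, one-hot label) batches of 32.
import tensorflow as tf

NUM_CLASSES = 5  # No_DR, Mild, Moderate, Severe, Proliferative_DR

# ImageNet-pretrained backbone, classifier head removed
base = tf.keras.applications.DenseNet121(
    weights="imagenet", include_top=False, input_shape=(224, 224, 3)
)
base.trainable = False  # keep pretrained features frozen for transfer learning

inputs = tf.keras.Input(shape=(224, 224, 3))
x = tf.keras.applications.densenet.preprocess_input(inputs)
x = base(x, training=False)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(x)
model = tf.keras.Model(inputs, outputs)

model.compile(
    optimizer=tf.keras.optimizers.AdamW(learning_rate=1e-3),
    loss="categorical_crossentropy",
    metrics=[
        "accuracy",
        tf.keras.metrics.Precision(name="precision"),
        tf.keras.metrics.Recall(name="recall"),
    ],
)

# model.fit(train_ds, validation_data=val_ds, epochs=15)
```

Freezing the backbone keeps the per-epoch cost low, which fits the CPU-only, 15-epoch training budget described above.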
training/training.ipynb
CHANGED
@@ -271,7 +271,7 @@
 "id": "1e34f571",
 "metadata": {},
 "source": [
-"#### For the ESRGAN if applicable"
+"#### For the ESRGAN if applicable (Future)"
 ]
 },
 {