new inference utilities

Files changed (6)

README.md +110 -2
notebooks/0_preprocess.ipynb +22 -8
setup.py +33 -0
vascx_models/cli.py +198 -0
vascx_models/inference.py +269 -0
vascx_models/utils.py +160 -0

README.md CHANGED

@@ -6,7 +6,7 @@ tags:
 - biology
 ---
-## VascX models
 This repository contains the instructions for using the VascX models from the paper [VascX Models: Model Ensembles for Retinal Vascular Analysis from Color Fundus Images](https://arxiv.org/abs/2409.16016).
@@ -18,7 +18,7 @@ The model weights are in [huggingface](https://huggingface.co/Eyened/vascx).
 <img src="imgs/HRF_04_g_rgb.png" width="240" height="240" style="display:inline"><img src="imgs/HRF_04_g.png" width="240" height="240" style="display:inline">
-### Installation
 To install the entire fundus analysis pipeline including fundus preprocessing, model inference code and vascular biomarker extraction:
@@ -26,8 +26,116 @@ To install the entire fundus analysis pipeline including fundus preprocessing, m
 2. Install the [rtnls_inference package](https://github.com/Eyened/retinalysis-inference).
 ### Usage
 To speed up re-execution of vascx we recommend running the preprocessing and segmentation steps separately:
 1. Preprocessing. See [this notebook](./notebooks/0_preprocess.ipynb). This step is CPU-heavy and benefits from parallelization (see notebook).

 - biology
 ---
+# VascX models
 This repository contains the instructions for using the VascX models from the paper [VascX Models: Model Ensembles for Retinal Vascular Analysis from Color Fundus Images](https://arxiv.org/abs/2409.16016).
 <img src="imgs/HRF_04_g_rgb.png" width="240" height="240" style="display:inline"><img src="imgs/HRF_04_g.png" width="240" height="240" style="display:inline">
+## Installation
 To install the entire fundus analysis pipeline including fundus preprocessing, model inference code and vascular biomarker extraction:
 2. Install the [rtnls_inference package](https://github.com/Eyened/retinalysis-inference).
+## `vascx run` Command
+The `run` command runs the full pipeline on fundus images: preprocessing, quality assessment, vessel and artery-vein segmentation, optic disc segmentation, fovea detection, and overlay visualization.
 ### Usage
+```bash
+vascx run DATA_PATH OUTPUT_PATH [OPTIONS]
+```
+### Arguments
+- `DATA_PATH`: Path to input data. Can be either:
+  - A directory containing fundus images
+  - A CSV file with a 'path' column containing paths to images
+- `OUTPUT_PATH`: Directory where processed results will be stored
+### Options
+| Option | Default | Description |
+|--------|---------|-------------|
+| `--preprocess/--no-preprocess` | `--preprocess` | Run preprocessing to standardize images for model input |
+| `--vessels/--no-vessels` | `--vessels` | Run vessel segmentation and artery-vein classification |
+| `--disc/--no-disc` | `--disc` | Run optic disc segmentation |
+| `--quality/--no-quality` | `--quality` | Run image quality assessment |
+| `--fovea/--no-fovea` | `--fovea` | Run fovea detection |
+| `--overlay/--no-overlay` | `--overlay` | Create visualization overlays combining all results |
+| `--n_jobs` | `4` | Number of preprocessing workers for parallel processing |
+### Output Structure
+When run with default options, the command creates the following structure in `OUTPUT_PATH`:
+```
+OUTPUT_PATH/
+├── preprocessed_rgb/     # Standardized fundus images
+├── vessels/              # Vessel segmentation results
+├── artery_vein/          # Artery-vein classification
+├── disc/                 # Optic disc segmentation
+├── overlays/             # Visualization images
+├── bounds.csv            # Image boundary information
+├── quality.csv           # Image quality scores
+└── fovea.csv             # Fovea coordinates
+```
+### Processing Stages
+1. **Preprocessing**:
+   - Standardizes input images for consistent analysis
+   - Outputs preprocessed images and boundary information
+2. **Quality Assessment**:
+   - Evaluates image quality with three quality metrics (q1, q2, q3)
+   - Higher scores indicate better image quality
+3. **Vessel Segmentation and Artery-Vein Classification**:
+   - Identifies blood vessels in the retina
+   - Classifies vessels as arteries (1) or veins (2), and marks artery-vein intersections with a third label (3)
+4. **Optic Disc Segmentation**:
+   - Identifies the optic disc location and boundaries
+5. **Fovea Detection**:
+   - Determines the coordinates of the fovea (center of vision)
+6. **Visualization Overlays**:
+   - Creates color-coded images showing:
+     - Arteries in red
+     - Veins in blue
+     - Optic disc in white
+     - Fovea marked with yellow X
+### Examples
+**Process a directory of images with all analyses:**
+```bash
+vascx run /path/to/images /path/to/output
+```
+**Process specific images listed in a CSV:**
+```bash
+vascx run /path/to/image_list.csv /path/to/output
+```
+**Only run preprocessing and vessel segmentation:**
+```bash
+vascx run /path/to/images /path/to/output --no-disc --no-quality --no-fovea --no-overlay
+```
+**Skip preprocessing on already preprocessed images:**
+```bash
+vascx run /path/to/preprocessed/images /path/to/output --no-preprocess
+```
+**Increase parallel processing workers:**
+```bash
+vascx run /path/to/images /path/to/output --n_jobs 8
+```
+### Notes
+- The CSV input must contain a 'path' column with image file paths
+- If the CSV includes an 'id' column, these IDs will be used instead of filenames
+- When `--no-preprocess` is used, input images must already be in the proper format
+- The overlay visualization requires at least one analysis component to be enabled
+###
 To speed up re-execution of vascx we recommend running the preprocessing and segmentation steps separately:
 1. Preprocessing. See [this notebook](./notebooks/0_preprocess.ipynb). This step is CPU-heavy and benefits from parallelization (see notebook).
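
The CSV form of `DATA_PATH` documented in the README above (a required `path` column and an optional `id` column) could be generated with a short script along these lines. This is a minimal sketch, not part of the PR: the helper name `write_image_list` and the choice of the file stem as `id` are assumptions made for illustration.

```python
import csv
from pathlib import Path
from typing import Iterable


def write_image_list(image_paths: Iterable[Path], csv_path: Path) -> int:
    """Write a CSV with 'path' and 'id' columns, one row per image.

    Hypothetical helper: uses each file's stem as its id and
    returns the number of rows written.
    """
    rows = [{"path": str(p), "id": Path(p).stem} for p in image_paths]
    with open(csv_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=["path", "id"])
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

For example, `write_image_list(sorted(Path("/path/to/images").glob("*.jpg")), Path("image_list.csv"))` would produce a file that can be passed as the first argument to `vascx run`.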

notebooks/0_preprocess.ipynb CHANGED

@@ -10,7 +10,7 @@
     "\n",
     "import pandas as pd\n",
     "\n",
-    "from rtnls_fundusprep.utils import preprocess_for_inference"
    ]
   },
   {
@@ -58,16 +58,30 @@
      "output_type": "stream",
      "text": [
       "0it [00:00, ?it/s][Parallel(n_jobs=4)]: Using backend LokyBackend with 4 concurrent workers.\n",
-      "6it [00:00, 143.58it/s]\n",
-      "[Parallel(n_jobs=4)]: Done   2 out of   6 | elapsed:    2.1s remaining:    4.2s\n",
-      "[Parallel(n_jobs=4)]: Done   3 out of   6 | elapsed:    2.1s remaining:    2.1s\n",
-      "[Parallel(n_jobs=4)]: Done   4 out of   6 | elapsed:    2.9s remaining:    1.4s\n",
-      "[Parallel(n_jobs=4)]: Done   6 out of   6 | elapsed:    4.3s finished\n"
      ]
     }
    ],
    "source": [
-    "bounds = preprocess_for_inference(\n",
     "    files,  # List of image files\n",
     "    rgb_path=ds_path / \"rgb\",  # Output path for RGB images\n",
     "    ce_path=ds_path / \"ce\",  # Output path for Contrast Enhanced images\n",
@@ -102,7 +116,7 @@
  ],
  "metadata": {
   "kernelspec": {
-   "display_name": "base",
    "language": "python",
    "name": "python3"
   },

     "\n",
     "import pandas as pd\n",
     "\n",
+    "from rtnls_fundusprep.preprocessor import parallel_preprocess"
    ]
   },
   {
      "output_type": "stream",
      "text": [
       "0it [00:00, ?it/s][Parallel(n_jobs=4)]: Using backend LokyBackend with 4 concurrent workers.\n",
+      "6it [00:00, 154.80it/s]\n"
+     ]
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Error with image ../samples/fundus/original/HRF_07_dr.jpg\n",
+      "Error with image ../samples/fundus/original/HRF_04_g.jpg\n"
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "[Parallel(n_jobs=4)]: Done   2 out of   6 | elapsed:    0.9s remaining:    1.8s\n",
+      "[Parallel(n_jobs=4)]: Done   3 out of   6 | elapsed:    1.5s remaining:    1.5s\n",
+      "[Parallel(n_jobs=4)]: Done   4 out of   6 | elapsed:    1.5s remaining:    0.8s\n",
+      "[Parallel(n_jobs=4)]: Done   6 out of   6 | elapsed:    1.6s finished\n"
      ]
     }
    ],
    "source": [
+    "bounds = parallel_preprocess(\n",
     "    files,  # List of image files\n",
     "    rgb_path=ds_path / \"rgb\",  # Output path for RGB images\n",
     "    ce_path=ds_path / \"ce\",  # Output path for Contrast Enhanced images\n",
  ],
  "metadata": {
   "kernelspec": {
+   "display_name": "retinalysis",
    "language": "python",
    "name": "python3"
   },

setup.py ADDED

	@@ -0,0 +1,33 @@

+from setuptools import find_packages, setup
+with open("README.md", "r", encoding="utf-8") as fh:
+    long_description = fh.read()
+setup(
+    name="vascx_models",
+    # using versioneer for versioning using git tags
+    # https://github.com/python-versioneer/python-versioneer/blob/master/INSTALL.md
+    # version=versioneer.get_version(),
+    # cmdclass=versioneer.get_cmdclass(),
+    author="Jose Vargas",
+    author_email="[email protected]",
+    description="Retinal analysis toolbox for Python",
+    long_description=long_description,
+    long_description_content_type="text/markdown",
+    packages=find_packages(),
+    include_package_data=True,
+    zip_safe=False,
+    entry_points={
+        "console_scripts": [
+            "vascx = vascx_models.cli:cli",
+        ]
+    },
+    install_requires=[
+        "numpy == 1.*",
+        "pandas == 2.*",
+        "tqdm == 4.*",
+        "Pillow == 9.*",
+        "click==8.*",
+    ],
+    python_requires=">=3.10, <3.11",
+)
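
The `console_scripts` entry `vascx = vascx_models.cli:cli` asks the installer to create a `vascx` executable that imports `vascx_models.cli` and calls its `cli` object. As a rough illustration of how such a `module:attribute` string is resolved, the sketch below uses the standard library's `importlib.metadata.EntryPoint`. It points at `json:dumps` as a stand-in target, which is an assumption made so the example runs without `vascx_models` installed.

```python
from importlib.metadata import EntryPoint

# Stand-in entry point: same "name = module:attr" shape as the vascx one,
# but targeting a stdlib function so the example runs anywhere.
demo_entry = EntryPoint(name="demo", value="json:dumps", group="console_scripts")

# .load() imports the module part and returns the named attribute.
resolved = demo_entry.load()
print(resolved({"ok": True}))
```

After `pip install -e .`, the real entry point should resolve in the same way, with `vascx_models.cli` as the module and the click group `cli` as the attribute.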

vascx_models/cli.py ADDED

	@@ -0,0 +1,198 @@

+from pathlib import Path
+import click
+import pandas as pd
+import torch
+from rtnls_fundusprep.cli import _run_preprocessing
+from .inference import (
+    run_fovea_detection,
+    run_quality_estimation,
+    run_segmentation_disc,
+    run_segmentation_vessels_and_av,
+)
+from .utils import batch_create_overlays
+@click.group(name="vascx")
+def cli():
+    pass
+@cli.command()
+@click.argument("data_path", type=click.Path(exists=True))
+@click.argument("output_path", type=click.Path())
+@click.option(
+    "--preprocess/--no-preprocess",
+    default=True,
+    help="Run preprocessing or use preprocessed images",
+)
+@click.option(
+    "--vessels/--no-vessels", default=True, help="Run vessels and AV segmentation"
+)
+@click.option("--disc/--no-disc", default=True, help="Run optic disc segmentation")
+@click.option(
+    "--quality/--no-quality", default=True, help="Run image quality estimation"
+)
+@click.option("--fovea/--no-fovea", default=True, help="Run fovea detection")
+@click.option(
+    "--overlay/--no-overlay", default=True, help="Create visualization overlays"
+)
+@click.option("--n_jobs", type=int, default=4, help="Number of preprocessing workers")
+def run(
+    data_path, output_path, preprocess, vessels, disc, quality, fovea, overlay, n_jobs
+):
+    """Run the complete inference pipeline on fundus images.
+    DATA_PATH is either a directory containing images or a CSV file with 'path' column.
+    OUTPUT_PATH is the directory where results will be stored.
+    """
+    output_path = Path(output_path)
+    output_path.mkdir(exist_ok=True, parents=True)
+    # Setup output directories
+    preprocess_rgb_path = output_path / "preprocessed_rgb"
+    vessels_path = output_path / "vessels"
+    av_path = output_path / "artery_vein"
+    disc_path = output_path / "disc"
+    overlay_path = output_path / "overlays"
+    # Create required directories
+    if preprocess:
+        preprocess_rgb_path.mkdir(exist_ok=True, parents=True)
+    if vessels:
+        av_path.mkdir(exist_ok=True, parents=True)
+        vessels_path.mkdir(exist_ok=True, parents=True)
+    if disc:
+        disc_path.mkdir(exist_ok=True, parents=True)
+    if overlay:
+        overlay_path.mkdir(exist_ok=True, parents=True)
+    bounds_path = output_path / "bounds.csv" if preprocess else None
+    quality_path = output_path / "quality.csv" if quality else None
+    fovea_path = output_path / "fovea.csv" if fovea else None
+    # Determine if input is a folder or CSV file
+    data_path = Path(data_path)
+    is_csv = data_path.suffix.lower() == ".csv"
+    # Get files to process
+    files = []
+    ids = None
+    if is_csv:
+        click.echo(f"Reading file paths from CSV: {data_path}")
+        try:
+            df = pd.read_csv(data_path)
+            if "path" not in df.columns:
+                click.echo("Error: CSV must contain a 'path' column")
+                return
+            # Get file paths and convert to Path objects
+            files = [Path(p) for p in df["path"]]
+            if "id" in df.columns:
+                ids = df["id"].tolist()
+                click.echo("Using IDs from CSV 'id' column")
+        except Exception as e:
+            click.echo(f"Error reading CSV file: {e}")
+            return
+    else:
+        click.echo(f"Finding files in directory: {data_path}")
+        files = list(data_path.glob("*"))
+        ids = [f.stem for f in files]
+    if not files:
+        click.echo("No files found to process")
+        return
+    click.echo(f"Found {len(files)} files to process")
+    # Step 1: Preprocess images if requested
+    if preprocess:
+        click.echo("Running preprocessing...")
+        _run_preprocessing(
+            files=files,
+            ids=ids,
+            rgb_path=preprocess_rgb_path,
+            bounds_path=bounds_path,
+            n_jobs=n_jobs,
+        )
+        # Use the preprocessed images for subsequent steps
+        preprocessed_files = list(preprocess_rgb_path.glob("*.png"))
+    else:
+        # Use the input files directly
+        preprocessed_files = files
+    ids = [f.stem for f in preprocessed_files]
+    # Set up GPU device
+    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
+    click.echo(f"Using device: {device}")
+    # Step 2: Run quality estimation if requested
+    if quality:
+        click.echo("Running quality estimation...")
+        df_quality = run_quality_estimation(
+            fpaths=preprocessed_files, ids=ids, device=device
+        )
+        df_quality.to_csv(quality_path)
+        click.echo(f"Quality results saved to {quality_path}")
+    # Step 3: Run vessels and AV segmentation if requested
+    if vessels:
+        click.echo("Running vessels and AV segmentation...")
+        run_segmentation_vessels_and_av(
+            rgb_paths=preprocessed_files,
+            ids=ids,
+            av_path=av_path,
+            vessels_path=vessels_path,
+            device=device,
+        )
+        click.echo(f"Vessel segmentation saved to {vessels_path}")
+        click.echo(f"AV segmentation saved to {av_path}")
+    # Step 4: Run optic disc segmentation if requested
+    if disc:
+        click.echo("Running optic disc segmentation...")
+        run_segmentation_disc(
+            rgb_paths=preprocessed_files, ids=ids, output_path=disc_path, device=device
+        )
+        click.echo(f"Disc segmentation saved to {disc_path}")
+    # Step 5: Run fovea detection if requested
+    df_fovea = None
+    if fovea:
+        click.echo("Running fovea detection...")
+        df_fovea = run_fovea_detection(
+            rgb_paths=preprocessed_files, ids=ids, device=device
+        )
+        df_fovea.to_csv(fovea_path)
+        click.echo(f"Fovea detection results saved to {fovea_path}")
+    # Step 6: Create overlays if requested
+    if overlay:
+        click.echo("Creating visualization overlays...")
+        # Prepare fovea data if available
+        fovea_data = None
+        if df_fovea is not None:
+            fovea_data = {
+                idx: (row["x_fovea"], row["y_fovea"])
+                for idx, row in df_fovea.iterrows()
+            }
+        # Create visualization overlays
+        batch_create_overlays(
+            rgb_dir=preprocess_rgb_path if preprocess else data_path,
+            output_dir=overlay_path,
+            av_dir=av_path,
+            disc_dir=disc_path,
+            fovea_data=fovea_data,
+        )
+        click.echo(f"Visualization overlays saved to {overlay_path}")
+    click.echo(f"All requested processing complete. Results saved to {output_path}")
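
In the directory branch of `run`, `data_path.glob("*")` returns every entry in the folder, including subdirectories and non-image files. One possible refinement, sketched here under the assumption that inputs can be recognised by file extension, is to filter on a list of suffixes before preprocessing. The function name and the suffix list are hypothetical and not part of this PR.

```python
from pathlib import Path
from typing import List, Tuple

# Assumed image extensions; the PR itself does not define such a list.
IMAGE_SUFFIXES = (".jpg", ".jpeg", ".png", ".tif", ".tiff", ".bmp")


def collect_image_files(data_dir: Path) -> Tuple[List[Path], List[str]]:
    """Return (files, ids) for image files directly inside data_dir.

    Subdirectories and files with other extensions are skipped;
    ids are the file stems, mirroring the directory branch of `run`.
    """
    files = sorted(
        f
        for f in Path(data_dir).glob("*")
        if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
    )
    ids = [f.stem for f in files]
    return files, ids
```

Sorting the result also makes the processing order deterministic, which `glob` alone does not guarantee.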

vascx_models/inference.py ADDED

	@@ -0,0 +1,269 @@

+import os
+from pathlib import Path
+from typing import List, Optional
+import numpy as np
+import pandas as pd
+import torch
+from PIL import Image
+from tqdm import tqdm
+from rtnls_inference.ensembles.ensemble_classification import ClassificationEnsemble
+from rtnls_inference.ensembles.ensemble_heatmap_regression import (
+    HeatmapRegressionEnsemble,
+)
+from rtnls_inference.ensembles.ensemble_segmentation import SegmentationEnsemble
+from rtnls_inference.utils import decollate_batch, extract_keypoints_from_heatmaps
+def run_quality_estimation(fpaths, ids, device: torch.device):
+    ensemble_quality = ClassificationEnsemble.from_release("quality.pt").to(device)
+    dataloader = ensemble_quality._make_inference_dataloader(
+        fpaths,
+        ids=ids,
+        num_workers=8,
+        preprocess=False,
+        batch_size=16,
+    )
+    output_ids, outputs = [], []
+    with torch.no_grad():
+        for batch in tqdm(dataloader):
+            if len(batch) == 0:
+                continue
+            im = batch["image"].to(device)
+            # QUALITY
+            quality = ensemble_quality.predict_step(im)
+            quality = torch.mean(quality, dim=0)
+            items = {"id": batch["id"], "quality": quality}
+            items = decollate_batch(items)
+            for item in items:
+                output_ids.append(item["id"])
+                outputs.append(item["quality"].tolist())
+    return pd.DataFrame(
+        outputs,
+        index=output_ids,
+        columns=["q1", "q2", "q3"],
+    )
+def run_segmentation_vessels_and_av(
+    rgb_paths: List[Path],
+    ce_paths: Optional[List[Path]] = None,
+    ids: Optional[List[str]] = None,
+    av_path: Optional[Path] = None,
+    vessels_path: Optional[Path] = None,
+    device: torch.device = torch.device(
+        "cuda:0" if torch.cuda.is_available() else "cpu"
+    ),
+) -> None:
+    """
+    Run AV and vessel segmentation on the provided images.
+    Args:
+        rgb_paths: List of paths to RGB fundus images
+        ce_paths: Optional list of paths to contrast enhanced images
+        ids: Optional list of ids to pass to _make_inference_dataloader
+        av_path: Folder where to store output AV segmentations
+        vessels_path: Folder where to store output vessel segmentations
+        device: Device to run inference on
+    """
+    # Create output directories if they don't exist
+    if av_path is not None:
+        av_path.mkdir(exist_ok=True, parents=True)
+    if vessels_path is not None:
+        vessels_path.mkdir(exist_ok=True, parents=True)
+    # Load models
+    ensemble_av = SegmentationEnsemble.from_release("av_july24.pt").to(device).eval()
+    ensemble_vessels = (
+        SegmentationEnsemble.from_release("vessels_july24.pt").to(device).eval()
+    )
+    # Prepare input paths
+    if ce_paths is None:
+        # If CE paths are not provided, use RGB paths for both inputs
+        fpaths = rgb_paths
+    else:
+        # If CE paths are provided, pair them with RGB paths
+        if len(rgb_paths) != len(ce_paths):
+            raise ValueError("rgb_paths and ce_paths must have the same length")
+        fpaths = list(zip(rgb_paths, ce_paths))
+    # Create dataloader
+    dataloader = ensemble_av._make_inference_dataloader(
+        fpaths,
+        ids=ids,
+        num_workers=8,
+        preprocess=False,
+        batch_size=8,
+    )
+    # Run inference
+    with torch.no_grad():
+        for batch in tqdm(dataloader):
+            # AV segmentation
+            if av_path is not None:
+                with torch.autocast(device_type=device.type):
+                    proba = ensemble_av.forward(batch["image"].to(device))
+                proba = torch.mean(proba, dim=0)  # average over models
+                proba = torch.permute(proba, (0, 2, 3, 1))  # NCHW -> NHWC
+                proba = torch.nn.functional.softmax(proba, dim=-1)
+                items = {
+                    "id": batch["id"],
+                    "image": proba,
+                }
+                items = decollate_batch(items)
+                for i, item in enumerate(items):
+                    fpath = os.path.join(av_path, f"{item['id']}.png")
+                    mask = np.argmax(item["image"], -1)
+                    Image.fromarray(mask.squeeze().astype(np.uint8)).save(fpath)
+            # Vessel segmentation
+            if vessels_path is not None:
+                with torch.autocast(device_type=device.type):
+                    proba = ensemble_vessels.forward(batch["image"].to(device))
+                proba = torch.mean(proba, dim=0)  # average over models
+                proba = torch.permute(proba, (0, 2, 3, 1))  # NCHW -> NHWC
+                proba = torch.nn.functional.softmax(proba, dim=-1)
+                items = {
+                    "id": batch["id"],
+                    "image": proba,
+                }
+                items = decollate_batch(items)
+                for i, item in enumerate(items):
+                    fpath = os.path.join(vessels_path, f"{item['id']}.png")
+                    mask = np.argmax(item["image"], -1)
+                    Image.fromarray(mask.squeeze().astype(np.uint8)).save(fpath)
+def run_segmentation_disc(
+    rgb_paths: List[Path],
+    ce_paths: Optional[List[Path]] = None,
+    ids: Optional[List[str]] = None,
+    output_path: Optional[Path] = None,
+    device: torch.device = torch.device(
+        "cuda:0" if torch.cuda.is_available() else "cpu"
+    ),
+) -> None:
+    ensemble_disc = (
+        SegmentationEnsemble.from_release("disc_july24.pt").to(device).eval()
+    )
+    # Prepare input paths
+    if ce_paths is None:
+        # If CE paths are not provided, use RGB paths for both inputs
+        fpaths = rgb_paths
+    else:
+        # If CE paths are provided, pair them with RGB paths
+        if len(rgb_paths) != len(ce_paths):
+            raise ValueError("rgb_paths and ce_paths must have the same length")
+        fpaths = list(zip(rgb_paths, ce_paths))
+    dataloader = ensemble_disc._make_inference_dataloader(
+        fpaths,
+        ids=ids,
+        num_workers=8,
+        preprocess=False,
+        batch_size=8,
+    )
+    with torch.no_grad():
+        for batch in tqdm(dataloader):
+            # Optic disc segmentation
+            with torch.autocast(device_type=device.type):
+                proba = ensemble_disc.forward(batch["image"].to(device))
+            proba = torch.mean(proba, dim=0)  # average over models
+            proba = torch.permute(proba, (0, 2, 3, 1))  # NCHW -> NHWC
+            proba = torch.nn.functional.softmax(proba, dim=-1)
+            items = {
+                "id": batch["id"],
+                "image": proba,
+            }
+            items = decollate_batch(items)
+            items = [dataloader.dataset.transform.undo_item(item) for item in items]
+            for i, item in enumerate(items):
+                fpath = os.path.join(output_path, f"{item['id']}.png")
+                mask = np.argmax(item["image"], -1)
+                Image.fromarray(mask.squeeze().astype(np.uint8)).save(fpath)
+def run_fovea_detection(
+    rgb_paths: List[Path],
+    ce_paths: Optional[List[Path]] = None,
+    ids: Optional[List[str]] = None,
+    device: torch.device = torch.device(
+        "cuda:0" if torch.cuda.is_available() else "cpu"
+    ),
+) -> pd.DataFrame:
+    """Detect the fovea and return a DataFrame indexed by image id with columns x_fovea and y_fovea."""
+    ensemble_fovea = HeatmapRegressionEnsemble.from_release("fovea_july24.pt").to(
+        device
+    )
+    # Prepare input paths
+    if ce_paths is None:
+        # If CE paths are not provided, use RGB paths for both inputs
+        fpaths = rgb_paths
+    else:
+        # If CE paths are provided, pair them with RGB paths
+        if len(rgb_paths) != len(ce_paths):
+            raise ValueError("rgb_paths and ce_paths must have the same length")
+        fpaths = list(zip(rgb_paths, ce_paths))
+    dataloader = ensemble_fovea._make_inference_dataloader(
+        fpaths,
+        ids=ids,
+        num_workers=8,
+        preprocess=False,
+        batch_size=8,
+    )
+    output_ids, outputs = [], []
+    with torch.no_grad():
+        for batch in tqdm(dataloader):
+            if len(batch) == 0:
+                continue
+            im = batch["image"].to(device)
+            # FOVEA DETECTION
+            with torch.autocast(device_type=device.type):
+                heatmap = ensemble_fovea.forward(im)
+            keypoints = extract_keypoints_from_heatmaps(heatmap)
+            kp_fovea = torch.mean(keypoints, dim=0)  # average over models
+            items = {
+                "id": batch["id"],
+                "keypoints": kp_fovea,
+                "metadata": batch["metadata"],
+            }
+            items = decollate_batch(items)
+            items = [dataloader.dataset.transform.undo_item(item) for item in items]
+            for item in items:
+                output_ids.append(item["id"])
+                outputs.append(
+                    [
+                        *item["keypoints"][0].tolist(),
+                    ]
+                )
+    return pd.DataFrame(
+        outputs,
+        index=output_ids,
+        columns=["x_fovea", "y_fovea"],
+    )
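
The segmentation functions above share one post-processing pattern: average the per-model outputs, move channels last, apply softmax, and take the per-pixel argmax to obtain a label mask. The NumPy sketch below reproduces those steps on random data. It assumes the ensemble output has shape `(models, batch, classes, H, W)`, as the `# average over models` and `# NCHW -> NHWC` comments suggest; the function name is hypothetical and the real code operates on torch tensors.

```python
import numpy as np


def ensemble_logits_to_mask(logits: np.ndarray) -> np.ndarray:
    """Convert (M, N, C, H, W) ensemble outputs to (N, H, W) uint8 label masks."""
    mean_logits = logits.mean(axis=0)  # average over models -> (N, C, H, W)
    nhwc = np.transpose(mean_logits, (0, 2, 3, 1))  # NCHW -> NHWC
    # Numerically stable softmax over the class axis
    shifted = nhwc - nhwc.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    proba = exp / exp.sum(axis=-1, keepdims=True)
    # Most probable class per pixel
    return proba.argmax(axis=-1).astype(np.uint8)


# Demo: 3 models, batch of 2, 4 classes, 5x5 images
rng = np.random.default_rng(0)
demo_logits = rng.normal(size=(3, 2, 4, 5, 5))
demo_mask = ensemble_logits_to_mask(demo_logits)
```

Because softmax is monotonic, the argmax of the probabilities equals the argmax of the averaged outputs, so the softmax step only matters if the probabilities themselves are needed.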

vascx_models/utils.py ADDED

	@@ -0,0 +1,160 @@

+from pathlib import Path
+from typing import Dict, Optional, Tuple
+import numpy as np
+from PIL import Image, ImageDraw
+def create_fundus_overlay(
+    rgb_path: str,
+    av_path: Optional[str] = None,
+    disc_path: Optional[str] = None,
+    fovea_location: Optional[Tuple[int, int]] = None,
+    output_path: Optional[str] = None,
+) -> np.ndarray:
+    """
+    Create a visualization of a fundus image with overlaid segmentations and markers.
+    Args:
+        rgb_path: Path to the RGB fundus image
+        av_path: Optional path to artery-vein segmentation (1=artery, 2=vein, 3=intersection)
+        disc_path: Optional path to binary disc segmentation
+        fovea_location: Optional (x,y) tuple indicating the location of the fovea
+        output_path: Optional path to save the visualization image
+    Returns:
+        Numpy array containing the visualization image
+    """
+    print(rgb_path, av_path, disc_path, fovea_location, output_path)
+    # Load RGB image
+    rgb_img = np.array(Image.open(rgb_path))
+    # Create output image starting with the RGB image
+    output_img = rgb_img.copy()
+    # Load and overlay AV segmentation if provided
+    if av_path:
+        av_mask = np.array(Image.open(av_path))
+        # Create masks for arteries (1), veins (2) and intersections (3)
+        artery_mask = av_mask == 1
+        vein_mask = av_mask == 2
+        intersection_mask = av_mask == 3
+        # Combine artery and intersection for visualization
+        artery_combined = np.logical_or(artery_mask, intersection_mask)
+        vein_combined = np.logical_or(vein_mask, intersection_mask)
+        # Apply colors: red for arteries, blue for veins
+        # Set artery pixels (including intersections) to pure red
+        output_img[artery_combined, 0] = 255
+        output_img[artery_combined, 1] = 0
+        output_img[artery_combined, 2] = 0
+        # Set vein pixels to pure blue; intersections are overwritten and end up blue
+        output_img[vein_combined, 0] = 0
+        output_img[vein_combined, 1] = 0
+        output_img[vein_combined, 2] = 255
+    # Load and overlay optic disc segmentation if provided
+    if disc_path:
+        disc_mask = np.array(Image.open(disc_path)) > 0
+        # Apply white color for disc
+        output_img[disc_mask, :] = [255, 255, 255]  # White
+    # Convert to PIL image for drawing the fovea marker
+    pil_img = Image.fromarray(output_img)
+    # Add fovea marker if provided
+    if fovea_location:
+        draw = ImageDraw.Draw(pil_img)
+        x, y = fovea_location
+        marker_size = (
+            min(pil_img.width, pil_img.height) // 50
+        )  # Scale marker with image
+        # Draw yellow X at fovea location
+        draw.line(
+            [(x - marker_size, y - marker_size), (x + marker_size, y + marker_size)],
+            fill=(255, 255, 0),
+            width=2,
+        )
+        draw.line(
+            [(x - marker_size, y + marker_size), (x + marker_size, y - marker_size)],
+            fill=(255, 255, 0),
+            width=2,
+        )
+    # Convert back to numpy array
+    output_img = np.array(pil_img)
+    # Save output if path provided
+    if output_path:
+        Image.fromarray(output_img).save(output_path)
+    return output_img
+def batch_create_overlays(
+    rgb_dir: Path,
+    output_dir: Path,
+    av_dir: Optional[Path] = None,
+    disc_dir: Optional[Path] = None,
+    fovea_data: Optional[Dict[str, Tuple[int, int]]] = None,
+) -> None:
+    """
+    Create visualization overlays for a batch of images.
+    Args:
+        rgb_dir: Directory containing RGB fundus images
+        output_dir: Directory to save visualization images
+        av_dir: Optional directory containing AV segmentations
+        disc_dir: Optional directory containing disc segmentations
+        fovea_data: Optional dictionary mapping image IDs to fovea coordinates
+    Returns:
+        None. Each overlay is written to output_dir as <image_id>.png
+    """
+    # Create output directory if it doesn't exist
+    output_dir.mkdir(exist_ok=True, parents=True)
+    # Get all RGB images
+    rgb_files = list(rgb_dir.glob("*.png"))
+    if not rgb_files:
+        return
+    # Process each image
+    for rgb_file in rgb_files:
+        image_id = rgb_file.stem
+        # Check for corresponding AV segmentation
+        av_file = None
+        if av_dir:
+            av_file_path = av_dir / f"{image_id}.png"
+            if av_file_path.exists():
+                av_file = str(av_file_path)
+        # Check for corresponding disc segmentation
+        disc_file = None
+        if disc_dir:
+            disc_file_path = disc_dir / f"{image_id}.png"
+            if disc_file_path.exists():
+                disc_file = str(disc_file_path)
+        # Get fovea location if available
+        fovea_location = None
+        if fovea_data and image_id in fovea_data:
+            fovea_location = fovea_data[image_id]
+        # Create output path
+        output_file = output_dir / f"{image_id}.png"
+        # Create and save overlay
+        create_fundus_overlay(
+            rgb_path=str(rgb_file),
+            av_path=av_file,
+            disc_path=disc_file,
+            fovea_location=fovea_location,
+            output_path=str(output_file),
+        )
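
The artery-vein coloring in `create_fundus_overlay` can be checked on a tiny array without loading any images. The sketch below mirrors the label convention from the docstring (1 = artery, 2 = vein, 3 = intersection) and the order of assignments in the function; the helper name `colorize_av` and the demo data are assumptions made for illustration.

```python
import numpy as np


def colorize_av(rgb: np.ndarray, av_mask: np.ndarray) -> np.ndarray:
    """Paint arteries red and veins blue on a copy of an (H, W, 3) image."""
    out = rgb.copy()
    crossing = av_mask == 3
    artery = (av_mask == 1) | crossing
    vein = (av_mask == 2) | crossing
    out[artery] = (255, 0, 0)  # arteries (and crossings) in red
    out[vein] = (0, 0, 255)  # veins in blue; this overwrites crossings
    return out


# Demo: 2x2 gray image with one pixel of each label
demo_rgb = np.full((2, 2, 3), 100, dtype=np.uint8)
demo_av = np.array([[0, 1], [2, 3]], dtype=np.uint8)
demo_overlay = colorize_av(demo_rgb, demo_av)
```

Since the vein assignment runs last, intersection pixels end up blue. Giving crossings their own color would require a third assignment after the vein one.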