franzi2505 committed on
Commit
3359d6e
1 Parent(s): e599283

adapt detection metrics to fit standard payload

Files changed (2)
  1. README.md +112 -77
  2. det-metrics.py +100 -97
README.md CHANGED
@@ -20,28 +20,17 @@ This metric can be used to calculate object detection metrics. It has an option
20
 
21
  ## How to Use
22
  ```
23
- >>> module = evaluate.load("SEA-AI/det-metrics")
24
- # shape: (n_images, m_predicted_bboxes, xywh)
25
- >>> predictions = [
26
- [
27
- [10, 15, 5, 9],
28
- [45, 30, 10, 10]
29
- ],[
30
- [14, 25, 6, 6],
31
- [10, 16, 6, 10]
32
- ],
33
- ]
34
- # shape: (n_images, m_gt_bboxes, xywh)
35
- >>> references = [
36
- [[10, 16, 6, 10]],
37
- [[30, 30, 5, 6]]
38
- ]
39
- >>> module.add_batch(
40
- predictions=predictions,
41
- references=references,
42
- predictions_scores=[[0.5,0.1], [0.8, 0.2]]
43
  )
44
- >>> module.compute()
 
 
 
45
  ```
46
 
47
  ### Metric Settings
@@ -53,10 +42,97 @@ When loading module: `module = evaluate.load("SEA-AI/det-metrics", **params)`, m
53
 
54
 
55
  ### Input Values
56
- Add predictions to the metric with the function `module.add_batch(predictions, references)` with the following parameters:
57
- - **predictions** *List[List[List[int]]]*: predicted bounding boxes in shape `n x m x 4` with `n` being the number of images that are evaluated, `m` the number of predicted bounding boxes for the n-th image and the four co-ordinates specifying the bounding box (by default: x y width height).
58
- - **references** *List[List[List[int]]]*: ground truth bounding boxes in shape `n x l x 4` with `l` being the number of ground truth bounding boxes for the n-th image.
59
 
 
60
 
61
  ### Output Values
62
  The metric outputs a dictionary that contains sub-dictionaries for each name of the specified area ranges.
@@ -77,66 +153,23 @@ Each sub-dictionary holds performance metrics at the specific area range level:
77
 
78
 
79
  ### Examples
80
- #### Example 1
81
- Basic usage example. Add predictions and references via the `module.add_batch(predictions, references)` function. Finally, compute the metrics across predictions and ground truths over different images via `module.compute()`.
82
- ```
83
- >>> module = evaluate.load("SEA-AI/det-metrics", iou_thresholds=0.9)
84
- >>> predictions = [
85
- [
86
- [10, 15, 20, 25],
87
- [45, 30, 10, 10]
88
- ],[
89
- [14, 25, 6, 6],
90
- [10, 16, 6, 10]
91
- ]
92
- ]
93
- >>> references = [
94
- [[10, 15, 20, 20]],
95
- [[30, 30, 5, 6]]
96
- ]
97
- >>> module.add_batch(predictions=predictions, references=references, predictions_scores=[[0.5,0.3],[0.8, 0.1]])
98
- >>> result = module.compute()
99
- >>> print(result)
100
- {'all': {
101
- 'range': [0, 10000000000.0],
102
- 'iouThr': '0.00',
103
- 'maxDets': 100,
104
- 'tp': 1,
105
- 'fp': 3,
106
- 'fn': 1,
107
- 'duplicates': 0,
108
- 'precision': 0.25,
109
- 'recall': 0.5,
110
- 'f1': 0.3333333333333333,
111
- 'support': 2,
112
- 'fpi': 0,
113
- 'nImgs': 2
114
- }
115
- }
116
- ```
117
- #### Example 2
118
- We can specify different area range levels at which we would like to compute the metrics. Further note that the references contain an empty list for the first image because it does not include any ground truth bounding boxes. We still need to include it so that we can map the false positive predictions to the reference boxes correctly.
119
  ```
 
 
120
  >>> area_ranges_tuples = [
121
  ("all", [0, 1e5 ** 2]),
122
  ("small", [0 ** 2, 6 ** 2]),
123
  ("medium", [6 ** 2, 12 ** 2]),
124
  ("large", [12 ** 2, 1e5 ** 2])
125
  ]
126
- >>> module = evaluate.load("SEA-AI/det-metrics", area_ranges_tuples=area_ranges_tuples)
127
- >>> predictions = [
128
- [
129
- [10, 15, 5, 5],
130
- [45, 30, 10, 10]
131
- ],[
132
- [50, 50, 6, 10]
133
- ],
134
- ]
135
- >>> references = [
136
- [],
137
- [[10, 15, 5, 5]]
138
- ]
139
- >>> module.add_batch(predictions=predictions, references=references)
140
  >>> result = module.compute()
141
  >>> print(result)
142
  {'all':
@@ -202,6 +235,8 @@ We can specify different area range levels, at which we would like to compute th
202
  ```
203
 
204
  ## Further References
 
 
205
  Calculating metrics is based on pycoco tools: https://github.com/cocodataset/cocoapi/tree/master/PythonAPI/pycocotools
206
 
207
  Further info about metrics: https://www.analyticsvidhya.com/blog/2020/09/precision-recall-machine-learning/
 
20
 
21
  ## How to Use
22
  ```
23
+ >>> import evaluate
24
+ >>> from seametrics.fo_to_payload.utils import fo_to_payload
25
+ >>> payload = fo_to_payload(
26
+ dataset=dataset,
27
+ gt_field=gt_field,
28
+ models=model_list
29
  )
30
+ >>> for model in payload["models"]:
31
+ >>> module = evaluate.load("./detection_metric.py", iou_thresholds=0.9)
32
+ >>> module.add_batch(payload, model=model)
33
+ >>> result = module.compute()
34
  ```
35
 
36
  ### Metric Settings
 
42
 
43
 
44
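For illustration, here is a minimal sketch of loading the metric with custom settings; the parameter names `iou_thresholds` and `area_ranges_tuples` are taken from the examples in this README, and the concrete values below are placeholders:

```
>>> import evaluate
>>> area_ranges_tuples = [
        ("all", [0, 1e5 ** 2]),
        ("small", [0 ** 2, 6 ** 2]),
    ]
>>> module = evaluate.load(
        "SEA-AI/det-metrics",
        iou_thresholds=0.5,  # placeholder IoU threshold
        area_ranges_tuples=area_ranges_tuples,
    )
```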
  ### Input Values
45
+ Add predictions and ground truths to the metric with the function `module.add_batch(payload)`.
46
+ The payload format should be as returned by the `fo_to_payload()` function defined in the seametrics library.
47
+ An example of what a payload might look like:
48
+
49
+ ```
50
+ test_payload = {
51
+ 'dataset': 'SAILING_DATASET_QA',
52
+ 'models': ['yolov5n6_RGB_D2304-v1_9C'],
53
+ 'gt_field_name': 'ground_truth_det',
54
+ 'sequences': {
55
+ # sequence 1, 1 frame with 1 pred and 1 gt
56
+ 'Trip_14_Seq_1': {
57
+ 'resolution': (720, 1280),
58
+ 'yolov5n6_RGB_D2304-v1_9C': [[fo.Detection(
59
+ label='FAR_AWAY_OBJECT',
60
+ bounding_box=[0.35107421875, 0.274658203125, 0.0048828125, 0.009765625], # tp nr1
61
+ confidence=0.153076171875
62
+ )]],
63
+ 'ground_truth_det': [[fo.Detection(
64
+ label='FAR_AWAY_OBJECT',
65
+ bounding_box=[0.35107421875, 0.274658203125, 0.0048828125, 0.009765625]
66
+ )]]
67
+ },
68
+ # sequence 2, 2 frames with frame 1: 2 pred, 1 gt; frame 2: 1 pred 1 gt
69
+ 'Trip_14_Seq_2': {
70
+ 'resolution': (720, 1280),
71
+ 'yolov5n6_RGB_D2304-v1_9C': [
72
+ [
73
+ fo.Detection(
74
+ label='FAR_AWAY_OBJECT',
75
+ bounding_box=[0.389404296875,0.306640625,0.005126953125,0.0146484375], # tp nr 2
76
+ confidence=0.153076171875
77
+ ),
78
+ fo.Detection(
79
+ label='FAR_AWAY_OBJECT',
80
+ bounding_box=[0.50390625, 0.357666015625, 0.0048828125, 0.00976562], # fp nr 1
81
+ confidence=0.153076171875
82
+ ),
83
+ fo.Detection(
84
+ label='FAR_AWAY_OBJECT',
85
+ bounding_box=[0.455078125, 0.31494140625, 0.00390625, 0.0087890625], # fp nr 2
86
+ confidence=0.153076171875
87
+ )
88
+ ],
89
+ [
90
+ fo.Detection(
91
+ label='FAR_AWAY_OBJECT',
92
+ bounding_box=[0.455078125, 0.31494140625, 0.00390625, 0.0087890625], # tp nr 3
93
+ confidence=0.153076171875
94
+ )
95
+ ],
96
+ [
97
+ fo.Detection(
98
+ label='FAR_AWAY_OBJECT',
99
+ bounding_box=[0.455078125, 0.31494140625, 0.00390625, 0.0087890625], # fp nr 3
100
+ confidence=0.153076171875
101
+ )
102
+ ]
103
+ ],
104
+ 'ground_truth_det': [
105
+ # frame nr 1
106
+ [
107
+ fo.Detection(
108
+ label='FAR_AWAY_OBJECT',
109
+ bounding_box=[0.389404296875,0.306640625,0.005126953125,0.0146484375],
110
+ )
111
+ ],
112
+ # frame nr 2
113
+ [
114
+ fo.Detection(
115
+ label='FAR_AWAY_OBJECT',
116
+ bounding_box=[0.455078125, 0.31494140625, 0.00390625, 0.0087890625],
117
+ confidence=0.153076171875
118
+ ),
119
+ fo.Detection(
120
+ label='FAR_AWAY_OBJECT',
121
+ bounding_box=[0.35107421875, 0.274658203125, 0.0048828125, 0.009765625], # missed nr 1
122
+ confidence=0.153076171875
123
+ )
124
+ ],
125
+ # frame nr 3
126
+ [
127
+ ],
128
+ ]
129
+ }
130
+ },
131
+ "sequence_list": ["Trip_14_Seq_1", 'Trip_14_Seq_2']
132
+ }
133
+ ```
134
 
135
+ Optionally, you can pass the name of the model that should be evaluated as a string via `model=model_str`. By default, the first model is evaluated, i.e. `model = payload["models"][0]`.
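As a sketch (using the example payload above and a module loaded as shown in the How to Use section):

```
>>> model_str = payload["models"][0]  # default choice; any entry of payload["models"] works
>>> module.add_batch(payload, model=model_str)
>>> result = module.compute()
```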
136
 
137
  ### Output Values
138
  The metric outputs a dictionary that contains sub-dictionaries for each name of the specified area ranges.
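As a sketch of reading the result, the area range names are the keys of the output dictionary, and the metric keys are those shown in the examples below:

```
>>> result = module.compute()
>>> for range_name, metrics in result.items():
...     print(range_name, metrics["precision"], metrics["recall"], metrics["f1"])
```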
 
153
 
154
 
155
  ### Examples
156
+ We can specify different area range levels at which we would like to compute the metrics.
157
  ```
158
+ >>> import evaluate
159
+ >>> from seametrics.fo_to_payload.utils import fo_to_payload
160
  >>> area_ranges_tuples = [
161
  ("all", [0, 1e5 ** 2]),
162
  ("small", [0 ** 2, 6 ** 2]),
163
  ("medium", [6 ** 2, 12 ** 2]),
164
  ("large", [12 ** 2, 1e5 ** 2])
165
  ]
166
+ >>> payload = fo_to_payload(
167
+ dataset=dataset,
168
+ gt_field=gt_field,
169
+ models=model_list
170
+ )
171
+ >>> module = evaluate.load("./detection_metric.py", iou_thresholds=0.9, area_ranges_tuples=area_ranges_tuples)
172
+ >>> module.add_batch(payload)
173
  >>> result = module.compute()
174
  >>> print(result)
175
  {'all':
 
235
  ```
236
 
237
  ## Further References
238
+ *seametrics* library: https://github.com/SEA-AI/seametrics/tree/main
239
+
240
  Calculating metrics is based on pycoco tools: https://github.com/cocodataset/cocoapi/tree/master/PythonAPI/pycocotools
241
 
242
  Further info about metrics: https://www.analyticsvidhya.com/blog/2020/09/precision-recall-machine-learning/
det-metrics.py CHANGED
@@ -13,7 +13,7 @@
13
  # limitations under the License.
14
  """TODO: Add a description here."""
15
 
16
- from typing import List, Tuple, Optional, Literal
17
 
18
  import evaluate
19
  import datasets
@@ -21,7 +21,6 @@ import numpy as np
21
 
22
  from seametrics.detection import PrecisionRecallF1Support
23
 
24
-
25
  _CITATION = """\
26
  @InProceedings{coco:2020,
27
  title = {Microsoft {COCO:} Common Objects in Context},
@@ -82,39 +81,30 @@ Returns:
82
  'fpi': number of images with no ground truth but false positive predictions,
83
  'nImgs': number of images considered in evaluation
84
  Examples:
85
- >>> module = evaluate.load("./detection_metric.py", iou_thresholds=0.9)
86
- >>> predictions = [
87
- [
88
- [10, 15, 20, 25],
89
- [45, 30, 10, 10]
90
- ],[
91
- [14, 25, 6, 6],
92
- [10, 16, 6, 10]
93
- ]
94
- ]
95
- >>> references = [
96
- [[10, 15, 20, 20]],
97
- [[30, 30, 5, 6]]
98
- ]
99
- >>> module.add_batch(predictions=predictions, references=references, predictions_scores=[[0.5,0.3],[0.8, 0.1]])
100
- >>> result = module.compute()
101
- >>> print(result)
102
- {'all': {
103
- 'range': [0, 10000000000.0],
104
- 'iouThr': '0.00',
105
- 'maxDets': 100,
106
- 'tp': 1,
107
- 'fp': 3,
108
- 'fn': 1,
109
- 'duplicates': 0,
110
- 'precision': 0.25,
111
- 'recall': 0.5,
112
- 'f1': 0.3333333333333333,
113
- 'support': 2,
114
- 'fpi': 0,
115
- 'nImgs': 2
116
- }
117
- }
118
  """
119
 
120
 
@@ -164,38 +154,43 @@ class DetectionMetric(evaluate.Metric):
164
 
165
  def add_batch(
166
  self,
167
- predictions,
168
- references,
169
- predictions_labels: Optional[np.ndarray] = None,
170
- predictions_scores: Optional[np.ndarray] = None,
171
- references_labels: Optional[np.ndarray] = None
172
  ):
173
  """Add predictions and ground truths of a single image to update the metric.
174
 
175
  Args:
176
- predictions (List[List[List[int]]]): predicted bounding boxes, shape: (n_images, m_pred_boxes, 4)
177
- references (List[List[List[int]]]): ground truth bounding boxes, shape: (n_images, l_gt_boxes, 4)
178
- predictions_labels (Optional[np.ndarray], optional): Labels of predicted bounding boxes, shape: (n_images, m_pred_boxes).
179
- Defaults to None.
180
- predictions_scores (Optional[np.ndarray], optional): Scores of predicted bounding boxes, shape: (n_images, m_pred_boxes).
181
- Defaults to None.
182
- references_labels (Optional[np.ndarray], optional): Labels of predicted bounding boxes, shape: (n_images, l_pred_boxes).
183
- Defaults to None.
184
  """
185
- if predictions_labels is None:
186
- predictions_labels = [None]*len(predictions)
187
- if predictions_scores is None:
188
- predictions_scores = [None]*len(predictions)
189
- if references_labels is None:
190
- references_labels = [None]*len(references)
191
- for pred, ref, pred_score, pred_l, ref_l in zip(predictions,
192
- references,
193
- predictions_scores,
194
- predictions_labels,
195
- references_labels):
196
- preds, targets = self.process_preds_references(pred, ref, pred_l, pred_score, ref_l)
197
- self.coco_metric.update(preds, targets)
198
- super(evaluate.Metric, self).add_batch(predictions=predictions, references=references)
199
 
200
  def _compute(
201
  self,
@@ -205,42 +200,50 @@ class DetectionMetric(evaluate.Metric):
205
  """Returns the scores"""
206
  result = self.coco_metric.compute()["metrics"]
207
  return result
208
-
209
  @staticmethod
210
- def process_preds_references(
211
- predictions,
212
- references,
213
- predictions_labels: Optional[np.ndarray] = None,
214
- predictions_scores: Optional[np.ndarray] = None,
215
- references_labels: Optional[np.ndarray] = None
216
- ):
217
- if predictions_scores is None:
218
- predictions_scores = np.ones(shape=len(predictions), dtype=np.float32)
219
- else:
220
- predictions_scores = np.array(predictions_scores, dtype=np.float32)
221
- if predictions_labels is None:
222
- if references_labels is not None:
223
- print("Warning: Providing no prediction labels, but ground truth labels!")
224
- predictions_labels = np.zeros(shape=len(predictions), dtype=np.int16)
225
- else:
226
- predictions_labels = np.array(predictions_labels)
227
- if references_labels is None:
228
- references_labels = np.zeros(shape=len(references), dtype=np.int16)
229
- else:
230
- references_labels = np.array(references_labels)
231
-
232
- preds = [
233
- dict(
234
- boxes=np.array(predictions),
235
- scores=predictions_scores,
236
- labels=predictions_labels
237
  )
238
- ]
239
- target = [
 
 
 
240
  dict(
241
- boxes=np.array(references),
242
- labels=references_labels
 
243
  )
244
- ]
245
-
246
- return preds, target
 
13
  # limitations under the License.
14
  """TODO: Add a description here."""
15
 
16
+ from typing import List, Tuple, Dict, Literal
17
 
18
  import evaluate
19
  import datasets
 
21
 
22
  from seametrics.detection import PrecisionRecallF1Support
23
 
 
24
  _CITATION = """\
25
  @InProceedings{coco:2020,
26
  title = {Microsoft {COCO:} Common Objects in Context},
 
81
  'fpi': number of images with no ground truth but false positive predictions,
82
  'nImgs': number of images considered in evaluation
83
  Examples:
84
+ >>> import evaluate
85
+ >>> from seametrics.fo_to_payload.utils import fo_to_payload
86
+ >>> payload = fo_to_payload(..., models=model_list)
87
+ >>> for model in payload["models"]:
88
+ >>> module = evaluate.load("./detection_metric.py", iou_thresholds=0.9)
89
+ >>> module.add_batch(payload, model=model)
90
+ >>> result = module.compute()
91
+ >>> print(result)
92
+ {'all': {
93
+ 'range': [0, 10000000000.0],
94
+ 'iouThr': '0.00',
95
+ 'maxDets': 100,
96
+ 'tp': 1,
97
+ 'fp': 3,
98
+ 'fn': 1,
99
+ 'duplicates': 0,
100
+ 'precision': 0.25,
101
+ 'recall': 0.5,
102
+ 'f1': 0.3333333333333333,
103
+ 'support': 2,
104
+ 'fpi': 0,
105
+ 'nImgs': 2
106
+ }
107
+ }
108
  """
109
 
110
 
 
154
 
155
  def add_batch(
156
  self,
157
+ data: dict,
158
+ model: str = None
 
 
 
159
  ):
160
  """Add predictions and ground truths of a single image to update the metric.
161
 
162
  Args:
163
+ data (dict): standard payload containing the data that should be evaluated;
164
+ format should be as returned by the `fo_to_payload()` function of the seametrics library
165
+ model (str): should be one of the values given in data["models"];
166
+ if not given, defaults to data["models"][0], as only one model can be evaluated at a time.
167
  """
168
+ # populate two empty lists in the format expected by the Hugging Face evaluate base class
169
+ # nothing is computed from them, but passing them prevents an error in the evaluate framework
170
+ predictions, references = [], []
171
+
172
+ if model is None:
173
+ model = data["models"][0]
174
+
175
+ for sequence in data["sequence_list"]:
176
+ seq_data = data["sequences"][sequence]
177
+ gt_normalized = seq_data[data["gt_field_name"]] # shape: (n_frames, m_gts)
178
+ pred_normalized = seq_data[model] # shape: (n_frames, l_preds)
179
+ img_res = seq_data["resolution"] # (h, w)
180
+ for gt_frame, pred_frame in zip(gt_normalized, pred_normalized): # iterate over all frames
181
+ processed_pred = self._fo_dets_to_metrics_dict(pred_frame, w=img_res[1], h=img_res[0])
182
+ processed_gt = self._fo_dets_to_metrics_dict(gt_frame, w=img_res[1], h=img_res[0])
183
+ predictions.append(processed_pred[0]["boxes"].tolist())
184
+ references.append(processed_gt[0]["boxes"].tolist())
185
+
186
+ # update the underlying metric with the data of the current frame
187
+ self.coco_metric.update(processed_pred, processed_gt)
188
+
189
+ # required by the Hugging Face evaluate API; does not affect the metric computation
190
+ super(evaluate.Metric, self).add_batch(
191
+ predictions=predictions,
192
+ references=references
193
+ )
194
 
195
  def _compute(
196
  self,
 
200
  """Returns the scores"""
201
  result = self.coco_metric.compute()["metrics"]
202
  return result
203
+
204
  @staticmethod
205
+ def _fo_dets_to_metrics_dict(fo_dets: list,
206
+ w: int,
207
+ h: int) -> List[Dict[str, np.ndarray]]:
208
+ """Convert list of fiftyone detections to format that is
209
+ required by PrecisionRecallF1Support() function of seametrics library
210
+
211
+ Args:
212
+ fo_dets (list): list of fiftyone detections (or an empty list if the frame has no detections);
213
+ note: bounding boxes in fiftyone detections are in normalized xywhn format
214
+ w (int): image width in pixels
215
+ h (int): image height in pixels
216
+
217
+ Returns:
218
+ List[Dict[str, np.ndarray]]: list holding a single dict with items:
219
+ "boxes": denormalized bounding boxes of the whole frame as a numpy array (shape: (n_bboxes, 4))
220
+ "scores": confidence scores as a numpy array (shape: (n_bboxes,))
221
+ "labels": labels as a numpy array (shape: (n_bboxes,))
222
+ """
223
+ detections = []
224
+ scores = []
225
+ labels = [] #TODO: map to numbers
226
+ if len(fo_dets) == 0:
227
+ return [
228
+ dict(
229
+ boxes=np.array([]),
230
+ scores=np.array([]),
231
+ labels=np.array([])
232
+ )
233
+ ]
234
+ for det in fo_dets:
235
+ bbox = det["bounding_box"]
236
+ detections.append(
237
+ [bbox[0]*w, bbox[1]*h, bbox[2]*w, bbox[3]*h]
238
  )
239
+ scores.append(det["confidence"]) # confidence is None for ground-truth detections
240
+ # labels.append(det["label"])
241
+ labels.append(1)
242
+
243
+ return [
244
  dict(
245
+ boxes=np.array(detections),
246
+ scores=np.array(scores),
247
+ labels=np.array(labels)
248
  )
249
+ ]