---
title: Detection Metric
tags:
- evaluate
- metric
description: >-
Compute multiple object detection metrics at different bounding box area
levels.
sdk: gradio
sdk_version: 3.19.1
app_file: app.py
pinned: false
---

# Metric Card for Detection Metric
## Metric Description

This metric computes common object detection metrics. It can optionally report the metrics at different bounding box area levels, providing more insight into performance on objects of different sizes. It is adapted from the pycocotools metrics.
## How to Use

```python
>>> import evaluate
>>> module = evaluate.load("./detection_metric.py")
# shape: (n_images, m_predicted_bboxes, xywh)
>>> predictions = [
    [
        [10, 15, 5, 9],
        [45, 30, 10, 10]
    ],
    [
        [14, 25, 6, 6],
        [10, 16, 6, 10]
    ],
]
# shape: (n_images, m_gt_bboxes, xywh)
>>> references = [
    [[10, 16, 6, 10]],
    [[30, 30, 5, 6]]
]
>>> module.add_batch(
    predictions=predictions,
    references=references,
    predictions_scores=[[0.5, 0.1], [0.8, 0.2]]
)
>>> module.compute()
```
## Metric Settings

Multiple parameters can be specified when loading the module via `module = evaluate.load("./detection_metric.py", **params)`:

- `area_ranges_tuples` (`List[Tuple[str, List[int]]]`): area range levels at which the metrics should be calculated. It is a list of tuples, where the first element of each tuple names the area range and the second element is a list giving the lower and upper limit of that range. Defaults to `[("all", [0, 1e5 ** 2])]`.
- `bbox_format` (`Literal["xyxy", "xywh", "cxcywh"]`): bounding box format of predictions and ground truths. Defaults to `"xywh"`.
- `iou_threshold` (`Optional[float]`): IoU threshold at which the metrics should be calculated. The IoU threshold defines the minimal overlap between a ground truth and a predicted bounding box for the prediction to count as correct. Defaults to `1e-10`.
- `class_agnostic` (`bool`): defaults to `True`. Non-class-agnostic metrics are currently not supported.
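For illustration, here is a sketch of a loading call that sets every parameter explicitly (the concrete area cutoffs are made up for this example):

```python
import evaluate

module = evaluate.load(
    "./detection_metric.py",
    area_ranges_tuples=[
        ("all", [0, 1e5 ** 2]),          # no size restriction
        ("small", [0, 32 ** 2]),         # boxes up to 32x32 pixels (illustrative cutoff)
        ("large", [32 ** 2, 1e5 ** 2]),  # everything bigger
    ],
    bbox_format="xywh",    # boxes given as x, y, width, height
    iou_threshold=0.5,     # a prediction must overlap a ground truth by IoU >= 0.5
    class_agnostic=True,   # class labels are ignored (the only supported mode)
)
```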
## Input Values

Add predictions and references to the metric with the `module.add_batch(predictions, references)` function, which accepts the following parameters:

- `predictions` (`List[List[List[int]]]`): predicted bounding boxes of shape `n x m x 4`, with `n` being the number of evaluated images, `m` the number of predicted bounding boxes for the n-th image, and the four coordinates specifying the bounding box (by default: x, y, width, height).
- `references` (`List[List[List[int]]]`): ground truth bounding boxes of shape `n x l x 4`, with `l` being the number of ground truth bounding boxes for the n-th image.
- `predictions_scores` (`List[List[float]]`, optional): confidence scores of shape `n x m`, one per predicted bounding box, as used in the examples below.
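Since the default `bbox_format` is `"xywh"`, detections that come in corner format need converting first. A minimal sketch (the conversion helper is illustrative, not part of the module):

```python
def xyxy_to_xywh(box):
    """Convert [x_min, y_min, x_max, y_max] to [x, y, width, height]."""
    x_min, y_min, x_max, y_max = box
    return [x_min, y_min, x_max - x_min, y_max - y_min]

# Corner-format detections for two images.
raw_predictions = [
    [[10, 15, 15, 24], [45, 30, 55, 40]],
    [[14, 25, 20, 31]],
]
predictions = [[xyxy_to_xywh(box) for box in image] for image in raw_predictions]
```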
## Output Values

The metric outputs a dictionary that contains a sub-dictionary for each specified area range name. Each sub-dictionary holds the performance metrics at that area range level:

- `range`: corresponding area range
- `iouThr`: IoU threshold used in calculating the metric
- `maxDets`: maximum number of detections considered in calculating the metric
- `tp`: number of true positive predictions
- `fp`: number of false positive predictions
- `fn`: number of false negative predictions
- `duplicates`: number of duplicated bounding box predictions
- `precision`: ratio of true positive predictions to all positive predictions (`tp / (tp + fp)`)
- `recall`: ratio of true positive predictions to all ground truths (`tp / (tp + fn)`)
- `f1`: harmonic mean of precision and recall (`2 * (precision * recall) / (precision + recall)`)
- `support`: number of ground truth bounding boxes considered in the metric
- `fpi`: number of images with predictions but no ground truths
- `nImgs`: total number of images considered in calculating the metric
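The per-range results can be read straight out of the returned dictionary. A small sketch, assuming the range names configured via `area_ranges_tuples`:

```python
result = module.compute()

# Print the headline metrics for every configured area range.
for range_name, metrics in result.items():
    print(
        f"{range_name}: precision={metrics['precision']:.3f}, "
        f"recall={metrics['recall']:.3f}, f1={metrics['f1']:.3f}"
    )
```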
## Examples

### Example 1

Basic usage example. Add predictions and references via the `module.add_batch(predictions, references)` function, then compute the metrics across predictions and ground truths over the different images via `module.compute()`.
```python
>>> module = evaluate.load("./detection_metric.py")
>>> predictions = [
    [
        [10, 15, 20, 25],
        [45, 30, 10, 10]
    ],
    [
        [14, 25, 6, 6],
        [10, 16, 6, 10]
    ]
]
>>> references = [
    [[10, 15, 20, 20]],
    [[30, 30, 5, 6]]
]
>>> module.add_batch(predictions=predictions, references=references, predictions_scores=[[0.5, 0.3], [0.8, 0.1]])
>>> result = module.compute()
>>> print(result)
{'all': {
    'range': [0, 10000000000.0],
    'iouThr': '0.00',
    'maxDets': 100,
    'tp': 1,
    'fp': 3,
    'fn': 1,
    'duplicates': 0,
    'precision': 0.25,
    'recall': 0.5,
    'f1': 0.3333333333333333,
    'support': 2,
    'fpi': 0,
    'nImgs': 2
    }
}
```
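As a quick sanity check on these numbers: with `tp = 1`, `fp = 3`, and `fn = 1`, precision is `1 / (1 + 3) = 0.25`, recall is `1 / (1 + 1) = 0.5`, and F1 is `2 * (0.25 * 0.5) / (0.25 + 0.5) ≈ 0.333`, matching the output above.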
### Example 2

We can specify different area range levels at which to compute the metrics. Further, note that the references contain an empty list for the first image because it has no ground truth bounding boxes. It must still be included so that the false positive predictions can be mapped to the correct image.
```python
>>> area_ranges_tuples = [
    ("all", [0, 1e5 ** 2]),
    ("small", [0 ** 2, 6 ** 2]),
    ("medium", [6 ** 2, 12 ** 2]),
    ("large", [12 ** 2, 1e5 ** 2])
]
>>> module = evaluate.load("./detection_metric.py", area_ranges_tuples=area_ranges_tuples)
>>> predictions = [
    [
        [10, 15, 5, 5],
        [45, 30, 10, 10]
    ],
    [
        [50, 50, 6, 10]
    ],
]
>>> references = [
    [],
    [[10, 15, 5, 5]]
]
>>> module.add_batch(predictions=predictions, references=references)
>>> result = module.compute()
>>> print(result)
{'all': {
    'range': [0, 10000000000.0],
    'iouThr': '0.00',
    'maxDets': 100,
    'tp': 0,
    'fp': 3,
    'fn': 1,
    'duplicates': 0,
    'precision': 0.0,
    'recall': 0.0,
    'f1': 0,
    'support': 1,
    'fpi': 1,
    'nImgs': 2
    },
'small': {
    'range': [0, 36],
    'iouThr': '0.00',
    'maxDets': 100,
    'tp': 0,
    'fp': 1,
    'fn': 1,
    'duplicates': 0,
    'precision': 0.0,
    'recall': 0.0,
    'f1': 0,
    'support': 1,
    'fpi': 1,
    'nImgs': 2
    },
'medium': {
    'range': [36, 144],
    'iouThr': '0.00',
    'maxDets': 100,
    'tp': 0,
    'fp': 2,
    'fn': 0,
    'duplicates': 0,
    'precision': 0.0,
    'recall': 0,
    'f1': 0,
    'support': 0,
    'fpi': 2,
    'nImgs': 2
    },
'large': {
    'range': [144, 10000000000.0],
    'iouThr': '0.00',
    'maxDets': 100,
    'tp': -1,
    'fp': -1,
    'fn': -1,
    'duplicates': -1,
    'precision': -1,
    'recall': -1,
    'f1': -1,
    'support': 0,
    'fpi': 0,
    'nImgs': 2
    }
}
```
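Note that the `large` sub-dictionary reports `-1` for its counts and metrics: no predictions or ground truths fall into that area range, so the values appear to act as "not applicable" sentinels. A small sketch for skipping such ranges when post-processing the result:

```python
# Keep only the area ranges that actually contained predictions or ground truths.
populated = {
    name: metrics
    for name, metrics in result.items()
    if metrics["tp"] != -1
}
```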
## Further References

- Metric calculation is based on pycocotools: https://github.com/cocodataset/cocoapi/tree/master/PythonAPI/pycocotools
- Further information about precision and recall: https://www.analyticsvidhya.com/blog/2020/09/precision-recall-machine-learning/