franzi2505 committed on
Commit
f965db0
·
1 Parent(s): 3ae0b30
README.md CHANGED
@@ -1,13 +1,211 @@
1
  ---
2
- title: Detection Metrics
3
- emoji: 👀
4
- colorFrom: red
5
- colorTo: red
 
6
  sdk: gradio
7
- sdk_version: 4.17.0
8
  app_file: app.py
9
  pinned: false
10
- license: agpl-3.0
11
  ---
12
 
13
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
1
  ---
2
+ title: Detection Metric
3
+ tags:
4
+ - evaluate
5
+ - metric
6
+ description: "Compute multiple object detection metrics at different bounding box area levels."
7
  sdk: gradio
8
+ sdk_version: 3.19.1
9
  app_file: app.py
10
  pinned: false
 
11
  ---
12
 
13
+ # Metric Card for Detection Metric
14
+
15
+ ## Metric Description
16
+ This metric computes object detection metrics (true/false positive and negative counts, precision, recall, F1, support). It can optionally report the metrics at different bounding-box size levels, giving more insight into performance on objects of different sizes. It is adapted from the pycocotools evaluation code.
17
+
18
+ ## How to Use
19
+ ```
20
+ >>> module = evaluate.load("./detection_metric.py")
21
+ # shape: (n_images, m_predicted_bboxes, xywh)
22
+ >>> predictions = [
23
+ [
24
+ [10, 15, 5, 9],
25
+ [45, 30, 10, 10]
26
+ ],[
27
+ [14, 25, 6, 6],
28
+ [10, 16, 6, 10]
29
+ ],
30
+ ]
31
+ # shape: (n_images, m_gt_bboxes, xywh)
32
+ >>> references = [
33
+ [[10, 16, 6, 10]],
34
+ [[30, 30, 5, 6]]
35
+ ]
36
+ >>> module.add_batch(
37
+ predictions=predictions,
38
+ references=references,
39
+ predictions_scores=[[0.5,0.1], [0.8, 0.2]]
40
+ )
41
+ >>> module.compute()
42
+ ```
43
+
44
+ ### Metric Settings
45
+ When loading the module via `module = evaluate.load("./detection_metric.py", **params)`, the following parameters can be specified (see the sketch after this list for an illustrative call):
46
+ - **area_ranges_tuples** *List[Tuple[str, List[int]]]*: Area range levels at which the metrics should be calculated. It is a list of tuples, where the first element of each tuple names the area range and the second element is a list specifying the lower and upper limit of the area range. Defaults to `[("all", [0, 1e5 ** 2])]`.
47
+ - **bbox_format** *Literal["xyxy", "xywh", "cxcywh"]*: Bounding box format of predictions and ground truth. Defaults to `"xywh"`.
48
+ - **iou_threshold** *Optional[float]*: IoU threshold at which the metrics are calculated. The IoU threshold defines the minimum overlap between a ground truth and a predicted bounding box for the prediction to be considered correct. Defaults to `1e-10`.
49
+ - **class_agnostic** *bool*: Whether to compute the metrics globally, ignoring class labels. Defaults to `True`. Non-class-agnostic metrics are currently not supported.
50
+
51
+
52
+ ### Input Values
53
+ Add predictions and ground truths to the metric with the function `module.add_batch(predictions, references)` (optionally also passing `predictions_scores`, `predictions_labels`, and `references_labels`) using the following parameters:
54
+ - **predictions** *List[List[List[int]]]*: predicted bounding boxes in shape `n x m x 4`, with `n` being the number of evaluated images, `m` the number of predicted bounding boxes for the n-th image, and the four coordinates specifying the bounding box (by default: x, y, width, height).
55
+ - **references** *List[List[List[int]]]*: ground truth bounding boxes in shape `n x l x 4` with `l` being the number of ground truth bounding boxes for the n-th image.
56
+
57
+
58
+ ### Output Values
59
+ The metric outputs a dictionary that contains one sub-dictionary per specified area range, keyed by the range's name (see the sketch after this list for how the derived scores relate to the counts).
60
+ Each sub-dictionary holds performance metrics at the specific area range level:
61
+ - **range**: corresponding area range
62
+ - **iouThr**: IOU-threshold used in calculating the metric
63
+ - **maxDets**: maximum number of detections in calculating the metrics
64
+ - **tp**: number of true positive predictions
65
+ - **fp**: number of false positive predictions
66
+ - **fn**: number of false negative predictions
67
+ - **duplicates**: number of duplicated bounding box predictions
68
+ - **precision**: ratio between true positive predictions and positive predictions (tp/(tp+fp))
69
+ - **recall**: ratio between true positive predictions and actual ground truths (tp/(tp+fn))
70
+ - **f1**: trades off precision and recall (2*(precision*recall)/(precision+recall))
71
+ - **support**: number of ground truth bounding boxes that are considered in the metric
72
+ - **fpi**: number of images with predictions but no ground truths
73
+ - **nImgs**: number of total images considered in calculating the metric
74
+
75
+
76
+ ### Examples
77
+ #### Example 1
78
+ Basic usage example. Add predictions and references via the `module.add_batch(predictions, references)` function, then compute the metrics across the predictions and ground truths of all images via `module.compute()`.
79
+ ```
80
+ >>> module = evaluate.load("./detection_metric.py", iou_threshold=0.9)
81
+
82
+ >>> predictions = [
83
+ [
84
+ [10, 15, 20, 25],
85
+ [45, 30, 10, 10]
86
+ ],[
87
+ [14, 25, 6, 6],
88
+ [10, 16, 6, 10]
89
+ ]
90
+ ]
91
+
92
+ >>> references = [
93
+ [[10, 15, 20, 20]],
94
+ [[30, 30, 5, 6]]
95
+ ]
96
+
97
+ >>> module.add_batch(predictions=predictions, references=references, predictions_scores=[[0.5,0.3],[0.8, 0.1]])
98
+ >>> result = module.compute()
99
+ >>> print(result)
100
+ {'all': {
101
+ 'range': [0, 10000000000.0],
102
+ 'iouThr': '0.00',
103
+ 'maxDets': 100,
104
+ 'tp': 1,
105
+ 'fp': 3,
106
+ 'fn': 1,
107
+ 'duplicates': 0,
108
+ 'precision': 0.25,
109
+ 'recall': 0.5,
110
+ 'f1': 0.3333333333333333,
111
+ 'support': 2,
112
+ 'fpi': 0,
113
+ 'nImgs': 2
114
+ }
115
+ }
116
+ ```
117
+ #### Example 2
118
+ We can specify different area-range levels at which to compute the metrics. Note that the references contain an empty list for the first image because it has no ground truth bounding boxes; the empty list still has to be included so that false positive predictions are mapped to the correct image.
119
+ ```
120
+ >>> area_ranges_tuples = [
121
+ ("all", [0, 1e5 ** 2]),
122
+ ("small", [0 ** 2, 6 ** 2]),
123
+ ("medium", [6 ** 2, 12 ** 2]),
124
+ ("large", [12 ** 2, 1e5 ** 2])
125
+ ]
126
+
127
+ >>> module = evaluate.load("./detection_metric.py", area_ranges_tuples=area_ranges_tuples)
128
+
129
+ >>> predictions = [
130
+ [
131
+ [10, 15, 5, 5],
132
+ [45, 30, 10, 10]
133
+ ],[
134
+ [50, 50, 6, 10]
135
+ ],
136
+ ]
137
+
138
+ >>> references = [
139
+ [],
140
+ [[10, 15, 5, 5]]
141
+ ]
142
+
143
+ >>> module.add_batch(predictions=predictions, references=references)
144
+ >>> result = module.compute()
145
+ >>> print(result)
146
+ {'all':
147
+ {'range': [0, 10000000000.0],
148
+ 'iouThr': '0.00',
149
+ 'maxDets': 100,
150
+ 'tp': 0,
151
+ 'fp': 3,
152
+ 'fn': 1,
153
+ 'duplicates': 0,
154
+ 'precision': 0.0,
155
+ 'recall': 0.0,
156
+ 'f1': 0,
157
+ 'support': 1,
158
+ 'fpi': 1,
159
+ 'nImgs': 2
160
+ },
161
+ 'small': {
162
+ 'range': [0, 36],
163
+ 'iouThr': '0.00',
164
+ 'maxDets': 100,
165
+ 'tp': 0,
166
+ 'fp': 1,
167
+ 'fn': 1,
168
+ 'duplicates': 0,
169
+ 'precision': 0.0,
170
+ 'recall': 0.0,
171
+ 'f1': 0,
172
+ 'support': 1,
173
+ 'fpi': 1,
174
+ 'nImgs': 2
175
+ },
176
+ 'medium': {
177
+ 'range': [36, 144],
178
+ 'iouThr': '0.00',
179
+ 'maxDets': 100,
180
+ 'tp': 0,
181
+ 'fp': 2,
182
+ 'fn': 0,
183
+ 'duplicates': 0,
184
+ 'precision': 0.0,
185
+ 'recall': 0,
186
+ 'f1': 0,
187
+ 'support': 0,
188
+ 'fpi': 2,
189
+ 'nImgs': 2
190
+ }, 'large': {
191
+ 'range': [144, 10000000000.0],
192
+ 'iouThr': '0.00',
193
+ 'maxDets': 100,
194
+ 'tp': -1,
195
+ 'fp': -1,
196
+ 'fn': -1,
197
+ 'duplicates': -1,
198
+ 'precision': -1,
199
+ 'recall': -1,
200
+ 'f1': -1,
201
+ 'support': 0,
202
+ 'fpi': 0,
203
+ 'nImgs': 2
204
+ }
205
+ }
206
+ ```
207
+
208
+ ## Further References
209
+ Calculating metrics is based on pycoco tools: https://github.com/cocodataset/cocoapi/tree/master/PythonAPI/pycocotools
210
+
211
+ Further info about metrics: https://www.analyticsvidhya.com/blog/2020/09/precision-recall-machine-learning/
app.py ADDED
@@ -0,0 +1,6 @@
1
+ import evaluate
2
+ from evaluate.utils import launch_gradio_widget
3
+
4
+
5
+ module = evaluate.load("./detection_metric.py",)
6
+ launch_gradio_widget(module)
detection_metric.py ADDED
@@ -0,0 +1,246 @@
1
+ # Copyright 2020 The HuggingFace Datasets Authors and the current dataset script contributor.
2
+ #
3
+ # Licensed under the Apache License, Version 2.0 (the "License");
4
+ # you may not use this file except in compliance with the License.
5
+ # You may obtain a copy of the License at
6
+ #
7
+ # http://www.apache.org/licenses/LICENSE-2.0
8
+ #
9
+ # Unless required by applicable law or agreed to in writing, software
10
+ # distributed under the License is distributed on an "AS IS" BASIS,
11
+ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12
+ # See the License for the specific language governing permissions and
13
+ # limitations under the License.
14
+ """Object detection metric (precision, recall, F1, support) based on a modified version of the pycocotools evaluation."""
15
+
16
+ from typing import List, Tuple, Optional, Literal
17
+
18
+ import evaluate
19
+ import datasets
20
+ import numpy as np
21
+
22
+ from modified_coco.pr_rec_f1 import PrecisionRecallF1Support
23
+
24
+
25
+ _CITATION = """\
26
+ @InProceedings{coco:2020,
27
+ title = {Microsoft {COCO:} Common Objects in Context},
28
+ authors={Tsung{-}Yi Lin and
29
+ Michael Maire and
30
+ Serge J. Belongie and
31
+ James Hays and
32
+ Pietro Perona and
33
+ Deva Ramanan and
34
+ Piotr Dollar and
35
+ C. Lawrence Zitnick},
36
+ booktitle = {Computer Vision - {ECCV} 2014 - 13th European Conference, Zurich,
37
+ Switzerland, September 6-12, 2014, Proceedings, Part {V}},
38
+ series = {Lecture Notes in Computer Science},
39
+ volume = {8693},
40
+ pages = {740--755},
41
+ publisher = {Springer},
42
+ year={2014}
43
+ }
44
+ """
45
+
46
+ _DESCRIPTION = """\
47
+ This evaluation metric is designed to provide object detection metrics at different object size levels.
48
+ It is based on a modified version of the commonly used COCO-evaluation metrics.
49
+ """
50
+
51
+
52
+ _KWARGS_DESCRIPTION = """
53
+ Calculates object detection metrics given predicted and ground truth bounding boxes for a batch of images.
+ Args:
+ predictions: list with one entry per image. Each entry is a list of predicted bounding
+ boxes, each given by the four coordinates that specify the box.
+ Coordinate format is as defined when instantiating the metric
+ (parameter: bbox_format, defaults to xywh).
+ references: list with one entry per image. Each entry is a list of ground truth bounding
+ boxes, each given by the four coordinates that specify the box.
+ Bounding box format should be the same as for the predictions.
62
+ Returns:
63
+ dict containing dicts for each specified area range with following items:
64
+ 'range': specified area range as [min_px_area, max_px_area]
65
+ 'iouThr': min. IOU-threshold of a prediction with a ground truth box
66
+ to be considered a correct prediction
67
+ 'maxDets': maximum number of detections
68
+ 'tp': number of true positive (correct) predictions
69
+ 'fp': number of false positive (incorrect) predictions
70
+ 'fn': number of false negative (missed) predictions
71
+ 'duplicates': number of duplicate predictions
72
+ 'precision': best possible score = 1, worst possible score = 0
73
+ large if few false positive predictions
74
+ formula: tp/(fp+tp)
75
+ 'recall' best possible score = 1, worst possible score = 0
76
+ large if few missed predictions
77
+ formula: tp/(tp+fn)
78
+ 'f1': best possible score = 1, worst possible score = 0
79
+ trades off precision and recall
80
+ formula: 2*(precision*recall)/(precision+recall)
81
+ 'support': number of ground truth bounding boxes considered in the evaluation,
82
+ 'fpi': number of images with no ground truth but false positive predictions,
83
+ 'nImgs': number of images considered in evaluation
84
+ Examples:
85
+ >>> module = evaluate.load("./detection_metric.py", iou_threshold=0.9)
86
+ >>> predictions = [
87
+ [
88
+ [10, 15, 20, 25],
89
+ [45, 30, 10, 10]
90
+ ],[
91
+ [14, 25, 6, 6],
92
+ [10, 16, 6, 10]
93
+ ]
94
+ ]
95
+ >>> references = [
96
+ [[10, 15, 20, 20]],
97
+ [[30, 30, 5, 6]]
98
+ ]
99
+ >>> module.add_batch(predictions=predictions, references=references, predictions_scores=[[0.5,0.3],[0.8, 0.1]])
100
+ >>> result = module.compute()
101
+ >>> print(result)
102
+ {'all': {
103
+ 'range': [0, 10000000000.0],
104
+ 'iouThr': '0.00',
105
+ 'maxDets': 100,
106
+ 'tp': 1,
107
+ 'fp': 3,
108
+ 'fn': 1,
109
+ 'duplicates': 0,
110
+ 'precision': 0.25,
111
+ 'recall': 0.5,
112
+ 'f1': 0.3333333333333333,
113
+ 'support': 2,
114
+ 'fpi': 0,
115
+ 'nImgs': 2
116
+ }
117
+ }
118
+ """
119
+
120
+
121
+ @evaluate.utils.file_utils.add_start_docstrings(_DESCRIPTION, _KWARGS_DESCRIPTION)
122
+ class DetectionMetric(evaluate.Metric):
123
+ def __init__(
124
+ self,
125
+ area_ranges_tuples: List[Tuple[str, List[int]]] = [("all", [0, 1e5 ** 2])],
126
+ iou_threshold: float = 1e-10,
127
+ class_agnostic: bool = True,
128
+ bbox_format: str = "xywh",
129
+ iou_type: Literal["bbox", "segm"] = "bbox",
130
+ **kwargs
131
+ ):
132
+ super().__init__(**kwargs)
133
+ area_ranges = [v for _, v in area_ranges_tuples]
134
+ area_ranges_labels = [k for k, _ in area_ranges_tuples]
135
+
136
+ metric_params = dict(
137
+ iou_thresholds=[iou_threshold],
138
+ area_ranges=area_ranges,
139
+ area_ranges_labels=area_ranges_labels,
140
+ class_agnostic=class_agnostic,
141
+ iou_type=iou_type,
142
+ box_format=bbox_format
143
+ )
144
+ self.coco_metric = PrecisionRecallF1Support(**metric_params)
145
+
146
+ def _info(self):
147
+ return evaluate.MetricInfo(
148
+ # This is the description that will appear on the modules page.
149
+ module_type="metric",
150
+ description=_DESCRIPTION,
151
+ citation=_CITATION,
152
+ inputs_description=_KWARGS_DESCRIPTION,
153
+ # This defines the format of each prediction and reference
154
+ features=datasets.Features(
155
+ {
156
+ 'predictions': datasets.Sequence(feature=datasets.Sequence(datasets.Value("float"))),
157
+ 'references': datasets.Sequence(feature=datasets.Sequence(datasets.Value("float"))),
158
+ }
159
+ ),
160
+ # Additional links to the codebase or references
161
+ codebase_urls=["https://github.com/SEA-AI/metrics/tree/main",
162
+ "https://github.com/cocodataset/cocoapi/tree/master"]
163
+ )
164
+
165
+ def add_batch(
166
+ self,
167
+ predictions,
168
+ references,
169
+ predictions_labels: Optional[np.ndarray] = None,
170
+ predictions_scores: Optional[np.ndarray] = None,
171
+ references_labels: Optional[np.ndarray] = None
172
+ ):
173
+ """Add predictions and ground truths of a batch of images to update the metric.
174
+
175
+ Args:
176
+ predictions (List[List[List[int]]]): predicted bounding boxes, shape: (n_images, m_pred_boxes, 4)
177
+ references (List[List[List[int]]]): ground truth bounding boxes, shape: (n_images, l_gt_boxes, 4)
178
+ predictions_labels (Optional[np.ndarray], optional): Labels of predicted bounding boxes, shape: (n_images, m_pred_boxes).
179
+ Defaults to None.
180
+ predictions_scores (Optional[np.ndarray], optional): Scores of predicted bounding boxes, shape: (n_images, m_pred_boxes).
181
+ Defaults to None.
182
+ references_labels (Optional[np.ndarray], optional): Labels of ground truth bounding boxes, shape: (n_images, l_gt_boxes).
183
+ Defaults to None.
184
+ """
185
+ if predictions_labels is None:
186
+ predictions_labels = [None]*len(predictions)
187
+ if predictions_scores is None:
188
+ predictions_scores = [None]*len(predictions)
189
+ if references_labels is None:
190
+ references_labels = [None]*len(references)
191
+ for pred, ref, pred_score, pred_l, ref_l in zip(predictions,
192
+ references,
193
+ predictions_scores,
194
+ predictions_labels,
195
+ references_labels):
196
+ preds, targets = self.process_preds_references(pred, ref, pred_l, pred_score, ref_l)
197
+ self.coco_metric.update(preds, targets)
198
+ super(evaluate.Metric, self).add_batch(predictions=predictions, references=references)
199
+
200
+ def _compute(
201
+ self,
202
+ predictions,
203
+ references
204
+ ):
205
+ """Returns the scores"""
206
+ result = self.coco_metric.compute()["metrics"]
207
+ return result
208
+
209
+ @staticmethod
210
+ def process_preds_references(
211
+ predictions,
212
+ references,
213
+ predictions_labels: Optional[np.ndarray] = None,
214
+ predictions_scores: Optional[np.ndarray] = None,
215
+ references_labels: Optional[np.ndarray] = None
216
+ ):
217
+ if predictions_scores is None:
218
+ predictions_scores = np.ones(shape=len(predictions), dtype=np.float32)
219
+ else:
220
+ predictions_scores = np.array(predictions_scores, dtype=np.float32)
221
+ if predictions_labels is None:
222
+ if references_labels is not None:
223
+ print("Warning: Providing no prediction labels, but ground truth labels!")
224
+ predictions_labels = np.zeros(shape=len(predictions), dtype=np.int16)
225
+ else:
226
+ predictions_labels = np.array(predictions_labels)
227
+ if references_labels is None:
228
+ references_labels = np.zeros(shape=len(references), dtype=np.int16)
229
+ else:
230
+ references_labels = np.array(references_labels)
231
+
232
+ preds = [
233
+ dict(
234
+ boxes=np.array(predictions),
235
+ scores=predictions_scores,
236
+ labels=predictions_labels
237
+ )
238
+ ]
239
+ target = [
240
+ dict(
241
+ boxes=np.array(references),
242
+ labels=references_labels
243
+ )
244
+ ]
245
+
246
+ return preds, target
modified_coco/cocoeval.py ADDED
@@ -0,0 +1,693 @@
1
+ __author__ = 'tsungyi, [email protected]'
2
+
3
+ # This is a modified version of the original cocoeval.py
4
+ # In this version we are able to return the TP, FP, and FN values
5
+ # along with the other default metrics.
6
+
7
+ import numpy as np
8
+ import datetime
9
+ import time
10
+ from collections import defaultdict
11
+ from pycocotools import mask as maskUtils
12
+ import copy
13
+
14
+ class COCOeval:
15
+ # Interface for evaluating detection on the Microsoft COCO dataset.
16
+ #
17
+ # The usage for CocoEval is as follows:
18
+ # cocoGt=..., cocoDt=... # load dataset and results
19
+ # E = CocoEval(cocoGt,cocoDt); # initialize CocoEval object
20
+ # E.params.recThrs = ...; # set parameters as desired
21
+ # E.evaluate(); # run per image evaluation
22
+ # E.accumulate(); # accumulate per image results
23
+ # E.summarize(); # display summary metrics of results
24
+ # For example usage see evalDemo.m and http://mscoco.org/.
25
+ #
26
+ # The evaluation parameters are as follows (defaults in brackets):
27
+ # imgIds - [all] N img ids to use for evaluation
28
+ # catIds - [all] K cat ids to use for evaluation
29
+ # iouThrs - [.5:.05:.95] T=10 IoU thresholds for evaluation
30
+ # recThrs - [0:.01:1] R=101 recall thresholds for evaluation
31
+ # areaRng - [...] A=4 object area ranges for evaluation
32
+ # maxDets - [1 10 100] M=3 thresholds on max detections per image
33
+ # iouType - ['segm'] set iouType to 'segm', 'bbox' or 'keypoints'
34
+ # iouType replaced the now DEPRECATED useSegm parameter.
35
+ # useCats - [1] if true use category labels for evaluation
36
+ # Note: if useCats=0 category labels are ignored as in proposal scoring.
37
+ # Note: multiple areaRngs [Ax2] and maxDets [Mx1] can be specified.
38
+ #
39
+ # evaluate(): evaluates detections on every image and every category and
40
+ # concats the results into the "evalImgs" with fields:
41
+ # dtIds - [1xD] id for each of the D detections (dt)
42
+ # gtIds - [1xG] id for each of the G ground truths (gt)
43
+ # dtMatches - [TxD] matching gt id at each IoU or 0
44
+ # gtMatches - [TxG] matching dt id at each IoU or 0
45
+ # dtScores - [1xD] confidence of each dt
46
+ # gtIgnore - [1xG] ignore flag for each gt
47
+ # dtIgnore - [TxD] ignore flag for each dt at each IoU
48
+ #
49
+ # accumulate(): accumulates the per-image, per-category evaluation
50
+ # results in "evalImgs" into the dictionary "eval" with fields:
51
+ # params - parameters used for evaluation
52
+ # date - date evaluation was performed
53
+ # counts - [T,R,K,A,M] parameter dimensions (see above)
54
+ # precision - [TxRxKxAxM] precision for every evaluation setting
55
+ # recall - [TxKxAxM] max recall for every evaluation setting
56
+ # TP - [TxKxAxM] number of true positives for every eval setting [NEW]
57
+ # FP - [TxKxAxM] number of false positives for every eval setting [NEW]
58
+ # FN - [TxKxAxM] number of false negatives for every eval setting [NEW]
59
+ # Note: precision and recall==-1 for settings with no gt objects.
60
+ #
61
+ # See also coco, mask, pycocoDemo, pycocoEvalDemo
62
+ #
63
+ # Microsoft COCO Toolbox. version 2.0
64
+ # Data, paper, and tutorials available at: http://mscoco.org/
65
+ # Code written by Piotr Dollar and Tsung-Yi Lin, 2015.
66
+ # Licensed under the Simplified BSD License [see coco/license.txt]
67
+ def __init__(self, cocoGt=None, cocoDt=None, iouType='segm'):
68
+ '''
69
+ Initialize CocoEval using coco APIs for gt and dt
70
+ :param cocoGt: coco object with ground truth annotations
71
+ :param cocoDt: coco object with detection results
72
+ :return: None
73
+ '''
74
+ if not iouType:
75
+ print('iouType not specified. use default iouType segm')
76
+ self.cocoGt = cocoGt # ground truth COCO API
77
+ self.cocoDt = cocoDt # detections COCO API
78
+ self.evalImgs = defaultdict(list) # per-image per-category evaluation results [KxAxI] elements
79
+ self.eval = {} # accumulated evaluation results
80
+ self._gts = defaultdict(list) # gt for evaluation
81
+ self._dts = defaultdict(list) # dt for evaluation
82
+ self.params = Params(iouType=iouType) # parameters
83
+ self._paramsEval = {} # parameters for evaluation
84
+ self.stats = [] # result summarization
85
+ self.ious = {} # ious between all gts and dts
86
+ if not cocoGt is None:
87
+ self.params.imgIds = sorted(cocoGt.getImgIds())
88
+ self.params.catIds = sorted(cocoGt.getCatIds())
89
+
90
+
91
+ def _prepare(self):
92
+ '''
93
+ Prepare ._gts and ._dts for evaluation based on params
94
+ :return: None
95
+ '''
96
+ def _toMask(anns, coco):
97
+ # modify ann['segmentation'] by reference
98
+ for ann in anns:
99
+ rle = coco.annToRLE(ann)
100
+ ann['segmentation'] = rle
101
+ p = self.params
102
+ if p.useCats:
103
+ gts=self.cocoGt.loadAnns(self.cocoGt.getAnnIds(imgIds=p.imgIds, catIds=p.catIds))
104
+ dts=self.cocoDt.loadAnns(self.cocoDt.getAnnIds(imgIds=p.imgIds, catIds=p.catIds))
105
+ else:
106
+ gts=self.cocoGt.loadAnns(self.cocoGt.getAnnIds(imgIds=p.imgIds))
107
+ dts=self.cocoDt.loadAnns(self.cocoDt.getAnnIds(imgIds=p.imgIds))
108
+
109
+ # convert ground truth to mask if iouType == 'segm'
110
+ if p.iouType == 'segm':
111
+ _toMask(gts, self.cocoGt)
112
+ _toMask(dts, self.cocoDt)
113
+ # set ignore flag
114
+ for gt in gts:
115
+ gt['ignore'] = gt['ignore'] if 'ignore' in gt else 0
116
+ gt['ignore'] = 'iscrowd' in gt and gt['iscrowd']
117
+ if p.iouType == 'keypoints':
118
+ gt['ignore'] = (gt['num_keypoints'] == 0) or gt['ignore']
119
+ self._gts = defaultdict(list) # gt for evaluation
120
+ self._dts = defaultdict(list) # dt for evaluation
121
+ for gt in gts:
122
+ self._gts[gt['image_id'], gt['category_id']].append(gt)
123
+ for dt in dts:
124
+ self._dts[dt['image_id'], dt['category_id']].append(dt)
125
+ self.evalImgs = defaultdict(list) # per-image per-category evaluation results
126
+ self.eval = {} # accumulated evaluation results
127
+
128
+ def evaluate(self):
129
+ '''
130
+ Run per image evaluation on given images and store results (a list of dict) in self.evalImgs
131
+ :return: None
132
+ '''
133
+ tic = time.time()
134
+ print('Running per image evaluation...')
135
+ p = self.params
136
+ # add backward compatibility if useSegm is specified in params
137
+ if not p.useSegm is None:
138
+ p.iouType = 'segm' if p.useSegm == 1 else 'bbox'
139
+ print('useSegm (deprecated) is not None. Running {} evaluation'.format(p.iouType))
140
+ print('Evaluate annotation type *{}*'.format(p.iouType))
141
+ p.imgIds = list(np.unique(p.imgIds))
142
+ if p.useCats:
143
+ p.catIds = list(np.unique(p.catIds))
144
+ p.maxDets = sorted(p.maxDets)
145
+ self.params=p
146
+
147
+ self._prepare()
148
+ # loop through images, area range, max detection number
149
+ catIds = p.catIds if p.useCats else [-1]
150
+
151
+ if p.iouType == 'segm' or p.iouType == 'bbox':
152
+ computeIoU = self.computeIoU
153
+ elif p.iouType == 'keypoints':
154
+ computeIoU = self.computeOks
155
+ self.ious = {(imgId, catId): computeIoU(imgId, catId) \
156
+ for imgId in p.imgIds
157
+ for catId in catIds}
158
+
159
+ evaluateImg = self.evaluateImg
160
+ maxDet = p.maxDets[-1]
161
+ self.evalImgs = [evaluateImg(imgId, catId, areaRng, maxDet)
162
+ for catId in catIds
163
+ for areaRng in p.areaRng
164
+ for imgId in p.imgIds
165
+ ]
166
+ self._paramsEval = copy.deepcopy(self.params)
167
+ toc = time.time()
168
+ print('DONE (t={:0.2f}s).'.format(toc-tic))
169
+
170
+ def computeIoU(self, imgId, catId):
171
+ p = self.params
172
+ if p.useCats:
173
+ gt = self._gts[imgId,catId]
174
+ dt = self._dts[imgId,catId]
175
+ else:
176
+ gt = [_ for cId in p.catIds for _ in self._gts[imgId,cId]]
177
+ dt = [_ for cId in p.catIds for _ in self._dts[imgId,cId]]
178
+ if len(gt) == 0 and len(dt) ==0:
179
+ return []
180
+ inds = np.argsort([-d['score'] for d in dt], kind='mergesort')
181
+ dt = [dt[i] for i in inds]
182
+ if len(dt) > p.maxDets[-1]:
183
+ dt=dt[0:p.maxDets[-1]]
184
+
185
+ if p.iouType == 'segm':
186
+ g = [g['segmentation'] for g in gt]
187
+ d = [d['segmentation'] for d in dt]
188
+ elif p.iouType == 'bbox':
189
+ g = [g['bbox'] for g in gt]
190
+ d = [d['bbox'] for d in dt]
191
+ else:
192
+ raise Exception('unknown iouType for iou computation')
193
+
194
+ # compute iou between each dt and gt region
195
+ iscrowd = [int(o['iscrowd']) for o in gt]
196
+ ious = maskUtils.iou(d,g,iscrowd)
197
+ return ious
198
+
199
+ def computeOks(self, imgId, catId):
200
+ p = self.params
201
+ # dimension here should be Nxm
202
+ gts = self._gts[imgId, catId]
203
+ dts = self._dts[imgId, catId]
204
+ inds = np.argsort([-d['score'] for d in dts], kind='mergesort')
205
+ dts = [dts[i] for i in inds]
206
+ if len(dts) > p.maxDets[-1]:
207
+ dts = dts[0:p.maxDets[-1]]
208
+ # if len(gts) == 0 and len(dts) == 0:
209
+ if len(gts) == 0 or len(dts) == 0:
210
+ return []
211
+ ious = np.zeros((len(dts), len(gts)))
212
+ sigmas = p.kpt_oks_sigmas
213
+ vars = (sigmas * 2)**2
214
+ k = len(sigmas)
215
+ # compute oks between each detection and ground truth object
216
+ for j, gt in enumerate(gts):
217
+ # create bounds for ignore regions(double the gt bbox)
218
+ g = np.array(gt['keypoints'])
219
+ xg = g[0::3]; yg = g[1::3]; vg = g[2::3]
220
+ k1 = np.count_nonzero(vg > 0)
221
+ bb = gt['bbox']
222
+ x0 = bb[0] - bb[2]; x1 = bb[0] + bb[2] * 2
223
+ y0 = bb[1] - bb[3]; y1 = bb[1] + bb[3] * 2
224
+ for i, dt in enumerate(dts):
225
+ d = np.array(dt['keypoints'])
226
+ xd = d[0::3]; yd = d[1::3]
227
+ if k1>0:
228
+ # measure the per-keypoint distance if keypoints visible
229
+ dx = xd - xg
230
+ dy = yd - yg
231
+ else:
232
+ # measure minimum distance to keypoints in (x0,y0) & (x1,y1)
233
+ z = np.zeros((k))
234
+ dx = np.max((z, x0-xd),axis=0)+np.max((z, xd-x1),axis=0)
235
+ dy = np.max((z, y0-yd),axis=0)+np.max((z, yd-y1),axis=0)
236
+ e = (dx**2 + dy**2) / vars / (gt['area']+np.spacing(1)) / 2
237
+ if k1 > 0:
238
+ e=e[vg > 0]
239
+ ious[i, j] = np.sum(np.exp(-e)) / e.shape[0]
240
+ return ious
241
+
242
+ def is_bbox1_inside_bbox2(self, bbox1, bbox2):
243
+ '''
244
+ Check if bbox1 is inside bbox2. Bbox is in the format [x, y, w, h]
245
+ Returns:
246
+ - True if bbox1 is inside bbox2, False otherwise
247
+ - How much bbox1 is inside bbox2 (number between 0 and 1)
248
+ '''
249
+ x1_1, y1_1, w1_1, h1_1 = bbox1
250
+ x1_2, y1_2, w1_2, h1_2 = bbox2
251
+
252
+ # Convert xywh to (x, y, x2, y2) format
253
+ x2_1, y2_1 = x1_1 + w1_1, y1_1 + h1_1
254
+ x2_2, y2_2 = x1_2 + w1_2, y1_2 + h1_2
255
+
256
+ # Calculate the coordinates of the intersection rectangle
257
+ x_left, y_top = max(x1_1, x1_2), max(y1_1, y1_2)
258
+ x_right, y_bottom = min(x2_1, x2_2), min(y2_1, y2_2)
259
+ print(f"{x_left=}, {x_right=}, {y_top=}, {y_bottom=}")
260
+ if x_right < x_left or y_bottom < y_top:
261
+ return False, 0
262
+
263
+ intersection_area = (x_right - x_left) * (y_bottom - y_top)
264
+ print(f"{intersection_area=}")
265
+ return True, intersection_area / (w1_1 * h1_1)
266
+
267
+ def evaluateImg(self, imgId, catId, aRng, maxDet):
268
+ '''
269
+ perform evaluation for single category and image
270
+ :return: dict (single image results)
271
+ '''
272
+ p = self.params
273
+ if p.useCats:
274
+ gt = self._gts[imgId,catId]
275
+ dt = self._dts[imgId,catId]
276
+ else:
277
+ gt = [_ for cId in p.catIds for _ in self._gts[imgId,cId]]
278
+ dt = [_ for cId in p.catIds for _ in self._dts[imgId,cId]]
279
+ if len(gt) == 0 and len(dt) ==0:
280
+ return None
281
+
282
+ for g in gt:
283
+ if g['ignore'] or (g['area']<aRng[0] or g['area']>aRng[1]):
284
+ g['_ignore'] = 1
285
+ else:
286
+ g['_ignore'] = 0
287
+
288
+ # sort dt highest score first, sort gt ignore last
289
+ gtind = np.argsort([g['_ignore'] for g in gt], kind='mergesort')
290
+ gt = [gt[i] for i in gtind]
291
+ dtind = np.argsort([-d['score'] for d in dt], kind='mergesort')
292
+ dt = [dt[i] for i in dtind[0:maxDet]]
293
+ iscrowd = [int(o['iscrowd']) for o in gt]
294
+ # load computed ious
295
+ ious = self.ious[imgId, catId][:, gtind] if len(self.ious[imgId, catId]) > 0 else self.ious[imgId, catId]
296
+
297
+ T = len(p.iouThrs)
298
+ G = len(gt)
299
+ D = len(dt)
300
+ gtm = np.zeros((T,G))
301
+ dtm = np.zeros((T,D))
302
+ gtIg = np.array([g['_ignore'] for g in gt])
303
+ dtIg = np.zeros((T,D))
304
+ dtDup = np.zeros((T,D))
305
+
306
+ if not len(ious)==0:
307
+ for tind, t in enumerate(p.iouThrs):
308
+ for dind, d in enumerate(dt):
309
+ # information about best match so far (m=-1 -> unmatched)
310
+ iou = min([t,1-1e-10])
311
+ m = -1
312
+ for gind, g in enumerate(gt):
313
+ # if this gt already matched, iou>iouThr, and not a crowd
314
+ # store detection as duplicate
315
+ if gtm[tind,gind]>0 and ious[dind,gind]>t and not iscrowd[gind]:
316
+ dtDup[tind, dind] = d['id']
317
+ # if this gt already matched, and not a crowd, continue
318
+ if gtm[tind,gind]>0 and not iscrowd[gind]:
319
+ continue
320
+ # if dt matched to reg gt, and on ignore gt, stop
321
+ if m > -1 and gtIg[m]==0 and gtIg[gind]==1:
322
+ break
323
+ # continue to next gt unless better match made
324
+ if ious[dind,gind] < iou:
325
+ continue
326
+ # if match successful and best so far, store appropriately
327
+ iou=ious[dind,gind]
328
+ m=gind
329
+ # if match made store id of match for both dt and gt
330
+ if m ==-1:
331
+ continue
332
+ dtIg[tind,dind] = gtIg[m]
333
+ dtm[tind,dind] = gt[m]['id']
334
+ gtm[tind,m] = d['id']
335
+ # set unmatched detections outside of area range to ignore
336
+ a = np.array([d['area']<aRng[0] or d['area']>aRng[1] for d in dt]).reshape((1, len(dt)))
337
+ dtIg = np.logical_or(dtIg, np.logical_and(dtm==0, np.repeat(a,T,0)))
338
+ # only consider duplicates if dets are inside the area range
339
+ dtDup = np.logical_and(dtDup, np.logical_and(dtm==0, np.logical_not(np.repeat(a,T,0))))
340
+ # false positive img (fpi) when all gt are ignored and there remain detections
341
+ fpi = (gtIg.sum() == G) and np.any(dtIg == 0)
342
+
343
+ # store results for given image and category
344
+ return {
345
+ 'image_id': imgId,
346
+ 'category_id': catId,
347
+ 'aRng': aRng,
348
+ 'maxDet': maxDet,
349
+ 'dtIds': [d['id'] for d in dt],
350
+ 'gtIds': [g['id'] for g in gt],
351
+ 'dtMatches': dtm,
352
+ 'gtMatches': gtm,
353
+ 'dtScores': [d['score'] for d in dt],
354
+ 'gtIgnore': gtIg,
355
+ 'dtIgnore': dtIg,
356
+ 'dtDuplicates': dtDup,
357
+ 'fpi': fpi,
358
+ }
359
+
360
+ def accumulate(self, p = None):
361
+ '''
362
+ Accumulate per image evaluation results and store the result in self.eval
363
+ :param p: input params for evaluation
364
+ :return: None
365
+ '''
366
+ print('Accumulating evaluation results...')
367
+ tic = time.time()
368
+ if not self.evalImgs:
369
+ print('Please run evaluate() first')
370
+ # allows input customized parameters
371
+ if p is None:
372
+ p = self.params
373
+ p.catIds = p.catIds if p.useCats == 1 else [-1]
374
+ T = len(p.iouThrs)
375
+ R = len(p.recThrs)
376
+ K = len(p.catIds) if p.useCats else 1
377
+ A = len(p.areaRng)
378
+ M = len(p.maxDets)
379
+ precision = -np.ones((T,R,K,A,M)) # -1 for the precision of absent categories
380
+ recall = -np.ones((T,K,A,M))
381
+ scores = -np.ones((T,R,K,A,M))
382
+ TP = -np.ones((T,K,A,M))
383
+ FP = -np.ones((T,K,A,M))
384
+ FN = -np.ones((T,K,A,M))
385
+ duplicates = -np.ones((T,K,A,M))
386
+ FPI = -np.ones((T,K,A,M))
387
+
388
+ # matrix of arrays
389
+ TPC = np.empty((T,K,A,M), dtype=object)
390
+ FPC = np.empty((T,K,A,M), dtype=object)
391
+ sorted_conf = np.empty((K,A,M), dtype=object)
392
+
393
+ # create dictionary for future indexing
394
+ _pe = self._paramsEval
395
+ catIds = _pe.catIds if _pe.useCats else [-1]
396
+ setK = set(catIds)
397
+ setA = set(map(tuple, _pe.areaRng))
398
+ setM = set(_pe.maxDets)
399
+ setI = set(_pe.imgIds)
400
+ # get inds to evaluate
401
+ k_list = [n for n, k in enumerate(p.catIds) if k in setK]
402
+ m_list = [m for n, m in enumerate(p.maxDets) if m in setM]
403
+ a_list = [n for n, a in enumerate(map(lambda x: tuple(x), p.areaRng)) if a in setA]
404
+ i_list = [n for n, i in enumerate(p.imgIds) if i in setI]
405
+ I0 = len(_pe.imgIds)
406
+ A0 = len(_pe.areaRng)
407
+ # retrieve E at each category, area range, and max number of detections
408
+ for k, k0 in enumerate(k_list):
409
+ Nk = k0*A0*I0
410
+ for a, a0 in enumerate(a_list):
411
+ Na = a0*I0
412
+ for m, maxDet in enumerate(m_list):
413
+ E = [self.evalImgs[Nk + Na + i] for i in i_list]
414
+ E = [e for e in E if not e is None]
415
+ if len(E) == 0:
416
+ continue
417
+ dtScores = np.concatenate([e['dtScores'][0:maxDet] for e in E])
418
+
419
+ # different sorting method generates slightly different results.
420
+ # mergesort is used to be consistent as Matlab implementation.
421
+ inds = np.argsort(-dtScores, kind='mergesort')
422
+ dtScoresSorted = dtScores[inds]
423
+ sorted_conf[k,a,m] = dtScoresSorted.copy()
424
+
425
+ dtm = np.concatenate([e['dtMatches'][:,0:maxDet] for e in E], axis=1)[:,inds]
426
+ dtIg = np.concatenate([e['dtIgnore'][:,0:maxDet] for e in E], axis=1)[:,inds]
427
+ dtDups = np.concatenate([e['dtDuplicates'][:,0:maxDet] for e in E], axis=1)[:,inds]
428
+ gtIg = np.concatenate([e['gtIgnore'] for e in E])
429
+ npig = np.count_nonzero(gtIg==0) # number of not ignored gt objects
430
+ fpi = np.array([e['fpi'] for e in E]) # false positive image (no gt objects)
431
+ # if npig == 0:
432
+ # print("No ground truth objects, continuing...")
433
+ # continue
434
+ tps = np.logical_and( dtm, np.logical_not(dtIg) )
435
+ fps = np.logical_and(np.logical_not(dtm), np.logical_not(dtIg) )
436
+
437
+ tp_sum = np.cumsum(tps, axis=1).astype(dtype=float)
438
+ fp_sum = np.cumsum(fps, axis=1).astype(dtype=float)
439
+ fpi_sum = np.cumsum(fpi).astype(dtype=int)
440
+ for t, (tp, fp) in enumerate(zip(tp_sum, fp_sum)):
441
+ tp = np.array(tp)
442
+ fp = np.array(fp)
443
+ fn = npig - tp # difference between gt and tp
444
+ nd = len(tp)
445
+ rc = tp / npig if npig else [0]
446
+ pr = tp / (fp+tp+np.spacing(1))
447
+ q = np.zeros((R,))
448
+ ss = np.zeros((R,)) #
449
+
450
+ if nd:
451
+ recall[t,k,a,m] = rc[-1]
452
+ else:
453
+ recall[t,k,a,m] = 0
454
+
455
+ TP[t,k,a,m] = tp[-1] if nd else 0
456
+ FP[t,k,a,m] = fp[-1] if nd else 0
457
+ FN[t,k,a,m] = fn[-1] if nd else npig
458
+ duplicates[t,k,a,m] = np.sum(dtDups[t, :])
459
+ FPI[t,k,a,m] = fpi_sum[-1]
460
+ TPC[t,k,a,m] = tp.copy()
461
+ FPC[t,k,a,m] = fp.copy()
462
+
463
+ # numpy is slow without cython optimization for accessing elements
464
+ # use python array gets significant speed improvement
465
+ pr = pr.tolist(); q = q.tolist()
466
+
467
+ for i in range(nd-1, 0, -1):
468
+ if pr[i] > pr[i-1]:
469
+ pr[i-1] = pr[i]
470
+
471
+ inds = np.searchsorted(rc, p.recThrs, side='left')
472
+ try:
473
+ for ri, pi in enumerate(inds):
474
+ q[ri] = pr[pi]
475
+ ss[ri] = dtScoresSorted[pi]
476
+ except:
477
+ pass
478
+ precision[t,:,k,a,m] = np.array(q)
479
+ scores[t,:,k,a,m] = np.array(ss)
480
+ self.eval = {
481
+ 'params': p,
482
+ 'counts': [T, R, K, A, M],
483
+ 'date': datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S'),
484
+ 'precision': precision,
485
+ 'recall': recall,
486
+ 'scores': scores,
487
+ 'TP': TP,
488
+ 'FP': FP,
489
+ 'FN': FN,
490
+ 'duplicates': duplicates,
491
+ 'support': TP + FN,
492
+ 'FPI': FPI,
493
+ 'TPC': TPC,
494
+ 'FPC': FPC,
495
+ 'sorted_conf': sorted_conf,
496
+ }
497
+ toc = time.time()
498
+ print('DONE (t={:0.2f}s).'.format( toc-tic))
499
+
500
+ def summarize(self):
501
+ results = {}
502
+ max_dets = self.params.maxDets[-1]
503
+ min_iou = self.params.iouThrs[0]
504
+
505
+ results['params'] = self.params
506
+ results['eval'] = self.eval
507
+ results['metrics'] = {}
508
+
509
+ # for area_lbl in self.params.areaRngLbl:
510
+ # results.append(self._summarize('ap', iouThr=min_iou,
511
+ # areaRng=area_lbl, maxDets=max_dets))
512
+
513
+ # for area_lbl in self.params.areaRngLbl:
514
+ # results.append(self._summarize('ar', iouThr=min_iou,
515
+ # areaRng=area_lbl, maxDets=max_dets))
516
+
517
+ metrics_str = f"{'tp':>6}, {'fp':>6}, {'fn':>6}, {'dup':>6}, "
518
+ metrics_str += f"{'pr':>5.2}, {'rec':>5.2}, {'f1':>5.2}, {'supp':>6}"
519
+ metrics_str += f", {'fpi':>6}, {'nImgs':>6}"
520
+ print('{:>51} {}'.format('METRIC', metrics_str))
521
+ for area_lbl in self.params.areaRngLbl:
522
+ results['metrics'][area_lbl] = self._summarize(
523
+ 'pr_rec_f1',
524
+ iouThr=min_iou,
525
+ areaRng=area_lbl,
526
+ maxDets=max_dets
527
+ )
528
+
529
+ return results
530
+
531
+ def _summarize(self, metric_type='ap', iouThr=None, areaRng='all', maxDets=100):
532
+ """
533
+ Helper function to print and obtain metrics of types:
534
+ - ap: average precision
535
+ - ar: average recall
536
+ - cf: tp, fp, fn, precision, recall, f1
537
+ values from COCOeval object
538
+ """
539
+ def _summarize_ap_ar(ap=1, iouThr=None, areaRng='all', maxDets=100):
540
+ iStr = ' {:<18} {} @[ IoU={:<9} | area={:>6s} | maxDets={:>3d} ] = {:0.3f}'
541
+ titleStr = 'Average Precision' if ap == 1 else 'Average Recall'
542
+ typeStr = '(AP)' if ap == 1 else '(AR)'
543
+ iouStr = '{:0.2f}:{:0.2f}'.format(p.iouThrs[0], p.iouThrs[-1]) \
544
+ if iouThr is None else '{:0.2f}'.format(iouThr)
545
+
546
+ aind = [i for i, aRng in enumerate(
547
+ p.areaRngLbl) if aRng == areaRng]
548
+ mind = [i for i, mDet in enumerate(p.maxDets) if mDet == maxDets]
549
+
550
+ if ap == 1:
551
+ # dimension of precision: [TxRxKxAxM]
552
+ s = self.eval['precision']
553
+ # IoU
554
+ if iouThr is not None:
555
+ t = np.where(iouThr == p.iouThrs)[0]
556
+ s = s[t]
557
+ s = s[:, :, :, aind, mind]
558
+ else:
559
+ # dimension of recall: [TxKxAxM]
560
+ s = self.eval['recall']
561
+ if iouThr is not None:
562
+ t = np.where(iouThr == p.iouThrs)[0]
563
+ s = s[t]
564
+ s = s[:, :, aind, mind]
565
+ if len(s[s > -1]) == 0:
566
+ mean_s = -1
567
+ else:
568
+ mean_s = np.mean(s[s > -1])
569
+ print(iStr.format(titleStr, typeStr, iouStr, areaRng, maxDets, mean_s))
570
+ return mean_s
571
+
572
+ def _summarize_pr_rec_f1(iouThr=None, areaRng='all', maxDets=100):
573
+ aind = [i for i, aRng in enumerate(p.areaRngLbl) if aRng == areaRng]
574
+ mind = [i for i, mDet in enumerate(p.maxDets) if mDet == maxDets]
575
+
576
+ # dimension of TP, FP, FN [TxKxAxM]
577
+ tp = self.eval['TP']
578
+ fp = self.eval['FP']
579
+ fn = self.eval['FN']
580
+ dup = self.eval['duplicates']
581
+ fpi = self.eval['FPI']
582
+ nImgs = len(p.imgIds)
583
+
584
+ # filter by IoU
585
+ if iouThr is not None:
586
+ t = np.where(iouThr == p.iouThrs)[0]
587
+ tp, fp, fn = tp[t], fp[t], fn[t]
588
+ dup = dup[t]
589
+ fpi = fpi[t]
590
+
591
+ # filter by area and maxDets
592
+ tp = tp[:, :, aind, mind].squeeze()
593
+ fp = fp[:, :, aind, mind].squeeze()
594
+ fn = fn[:, :, aind, mind].squeeze()
595
+ dup = dup[:, :, aind, mind].squeeze()
596
+ fpi = fpi[:, :, aind, mind].squeeze()
597
+
598
+ # handle case where tp, fp, fn and dup are empty (no gt and no dt)
599
+ if all([not np.any(m) for m in [tp, fp, fn, dup, fpi]]):
600
+ tp, fp, fn, dup, fpi =[-1] * 5
601
+ else:
602
+ tp, fp, fn, dup, fpi = [e.item() for e in [tp, fp, fn, dup, fpi]]
603
+
604
+ # compute precision, recall, f1
605
+ if tp == -1 and fp == -1 and fn == -1:
606
+ pr, rec, f1 = -1, -1, -1
607
+ support, fpi = 0, 0
608
+ else:
609
+ pr = 0 if tp + fp == 0 else tp / (tp + fp)
610
+ rec = 0 if tp + fn == 0 else tp / (tp + fn)
611
+ f1 = 0 if pr + rec == 0 else 2 * pr * rec / (pr + rec)
612
+ support = tp + fn
613
+ # print(f"{tp=}, {fp=}, {fn=}, {dup=}, {pr=}, {rec=}, {f1=}, {support=}, {fpi=}")
614
+
615
+ iStr = '@[ IoU={:<9} | area={:>9s} | maxDets={:>3d} ] = {}'
616
+ iouStr = '{:0.2f}:{:0.2f}'.format(p.iouThrs[0], p.iouThrs[-1]) \
617
+ if iouThr is None else '{:0.2f}'.format(iouThr)
618
+ metrics_str = f"{tp:>6.0f}, {fp:>6.0f}, {fn:>6.0f}, {dup:>6.0f}, "
619
+ metrics_str += f"{pr:>5.2f}, {rec:>5.2f}, {f1:>5.2f}, {support:>6.0f}, "
620
+ metrics_str += f"{fpi:>6.0f}, {nImgs:>6.0f}"
621
+ print(iStr.format(iouStr, areaRng, maxDets, metrics_str))
622
+
623
+ return {
624
+ 'range': p.areaRng[aind[0]],
625
+ 'iouThr': iouStr,
626
+ 'maxDets': maxDets,
627
+ 'tp': int(tp),
628
+ 'fp': int(fp),
629
+ 'fn': int(fn),
630
+ 'duplicates': int(dup),
631
+ 'precision': pr,
632
+ 'recall': rec,
633
+ 'f1': f1,
634
+ 'support': int(support),
635
+ 'fpi': int(fpi),
636
+ 'nImgs': nImgs,
637
+ }
638
+
639
+ p = self.params
640
+ if metric_type in ['ap', 'ar']:
641
+ ap = 1 if metric_type == 'ap' else 0
642
+ return _summarize_ap_ar(ap, iouThr=iouThr, areaRng=areaRng, maxDets=maxDets)
643
+
644
+ # return tp, fp, fn, pr, rec, f1, support, fpi, nImgs
645
+ return _summarize_pr_rec_f1(iouThr=iouThr, areaRng=areaRng, maxDets=maxDets)
646
+
647
+ def __str__(self):
648
+ self.summarize()
649
+
650
+ class Params:
651
+ '''
652
+ Params for coco evaluation api
653
+ '''
654
+ def setDetParams(self):
655
+ self.imgIds = []
656
+ self.catIds = []
657
+ # np.arange causes trouble. the data point on arange is slightly larger than the true value
658
+ self.iouThrs = np.linspace(.5, 0.95, int(np.round((0.95 - .5) / .05)) + 1, endpoint=True)
659
+ self.recThrs = np.linspace(.0, 1.00, int(np.round((1.00 - .0) / .01)) + 1, endpoint=True)
660
+ self.maxDets = [1, 10, 100]
661
+ self.areaRng = [[0 ** 2, 1e5 ** 2], [0 ** 2, 32 ** 2], [32 ** 2, 96 ** 2], [96 ** 2, 1e5 ** 2]]
662
+ self.areaRngLbl = ['all', 'small', 'medium', 'large']
663
+ self.useCats = 1
664
+
665
+ def setKpParams(self):
666
+ self.imgIds = []
667
+ self.catIds = []
668
+ # np.arange causes trouble. the data point on arange is slightly larger than the true value
669
+ self.iouThrs = np.linspace(.5, 0.95, int(np.round((0.95 - .5) / .05)) + 1, endpoint=True)
670
+ self.recThrs = np.linspace(.0, 1.00, int(np.round((1.00 - .0) / .01)) + 1, endpoint=True)
671
+ self.maxDets = [20]
672
+ self.areaRng = [[0 ** 2, 1e5 ** 2], [32 ** 2, 96 ** 2], [96 ** 2, 1e5 ** 2]]
673
+ self.areaRngLbl = ['all', 'medium', 'large']
674
+ self.useCats = 1
675
+ self.kpt_oks_sigmas = np.array([.26, .25, .25, .35, .35, .79, .79, .72, .72, .62,.62, 1.07, 1.07, .87, .87, .89, .89])/10.0
676
+
677
+ def __init__(self, iouType='segm'):
678
+ if iouType == 'segm' or iouType == 'bbox':
679
+ self.setDetParams()
680
+ elif iouType == 'keypoints':
681
+ self.setKpParams()
682
+ else:
683
+ raise Exception('iouType not supported')
684
+ self.iouType = iouType
685
+ # useSegm is deprecated
686
+ self.useSegm = None
687
+
688
+ def __repr__(self) -> str:
689
+ return str(self.__dict__)
690
+
691
+ def __iter__(self):
692
+ return iter(self.__dict__.items())
693
+
modified_coco/pr_rec_f1.py ADDED
@@ -0,0 +1,620 @@
1
+ # Copyright The PyTorch Lightning team.
2
+ #
3
+ # Licensed under the Apache License, Version 2.0 (the "License");
4
+ # you may not use this file except in compliance with the License.
5
+ # You may obtain a copy of the License at
6
+ #
7
+ # http://www.apache.org/licenses/LICENSE-2.0
8
+ #
9
+ # Unless required by applicable law or agreed to in writing, software
10
+ # distributed under the License is distributed on an "AS IS" BASIS,
11
+ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12
+ # See the License for the specific language governing permissions and
13
+ # limitations under the License.
14
+
15
+ # NOTE: This metric is based on torchmetrics.detection.mean_ap and
16
+ # then modified to support the evaluation of precision, recall, f1 and support
17
+ # for object detection. It can also be used to evaluate the mean average precision
18
+ # but some modifications are needed. Additionally, numpy is used instead of torch
19
+
20
+ import contextlib
21
+ import io
22
+ import json
23
+ from typing import Any, Callable, Dict, List, Optional, Tuple, Union
24
+ from typing_extensions import Literal
25
+ import numpy as np
26
+ from modified_coco.utils import _fix_empty_arrays, _input_validator, box_convert
27
+
28
+ try:
29
+ import pycocotools.mask as mask_utils
30
+ from pycocotools.coco import COCO
31
+ # from pycocotools.cocoeval import COCOeval
32
+ from modified_coco.cocoeval import COCOeval # use our own version of COCOeval
33
+ except ImportError:
34
+ raise ModuleNotFoundError(
+ "`MAP` metric requires that `pycocotools` is installed."
+ " Please install with `pip install pycocotools`"
37
+ )
38
+
39
+ class PrecisionRecallF1Support:
40
+ r"""Compute the Precision, Recall, F1 and Support scores for object detection.
41
+
42
+ - Precision = :math:`\frac{TP}{TP + FP}`
43
+ - Recall = :math:`\frac{TP}{TP + FN}`
44
+ - F1 = :math:`\frac{2 * Precision * Recall}{Precision + Recall}`
45
+ - Support = :math:`TP + FN`
46
+
47
+ As input to ``forward`` and ``update`` the metric accepts the following input:
48
+
49
+ - ``preds`` (:class:`~List`): A list consisting of dictionaries each containing the key-values
50
+ (each dictionary corresponds to a single image). Parameters that should be provided per dict:
51
+ - boxes: (:class:`~np.ndarray`) of shape ``(num_boxes, 4)`` containing ``num_boxes``
52
+ detection boxes of the format specified in the constructor. By default, this method expects
53
+ ``(xmin, ymin, xmax, ymax)`` in absolute image coordinates.
54
+ - scores: :class:`~np.ndarray` of shape ``(num_boxes)`` containing detection scores
55
+ for the boxes.
56
+ - labels: :class:`~np.ndarray` of shape ``(num_boxes)`` containing 0-indexed detection
57
+ classes for the boxes.
58
+ - masks: :class:`~torch.bool` of shape ``(num_boxes, image_height, image_width)`` containing
59
+ boolean masks. Only required when `iou_type="segm"`.
60
+
61
+ - ``target`` (:class:`~List`) A list consisting of dictionaries each containing the key-values
62
+ (each dictionary corresponds to a single image). Parameters that should be provided per dict:
63
+ - boxes: :class:`~np.ndarray` of shape ``(num_boxes, 4)`` containing ``num_boxes``
64
+ ground truth boxes of the format specified in the constructor. By default, this method
65
+ expects ``(xmin, ymin, xmax, ymax)`` in absolute image coordinates.
66
+ - labels: :class:`~np.ndarray` of shape ``(num_boxes)`` containing 0-indexed ground
67
+ truth classes for the boxes.
68
+ - masks: :class:`~torch.bool` of shape ``(num_boxes, image_height, image_width)``
69
+ containing boolean masks. Only required when `iou_type="segm"`.
70
+ - iscrowd: :class:`~np.ndarray` of shape ``(num_boxes)`` containing 0/1 values
71
+ indicating whether the bounding box/masks indicate a crowd of objects. Value is optional,
72
+ and if not provided it will automatically be set to 0.
73
+ - area: :class:`~np.ndarray` of shape ``(num_boxes)`` containing the area of the
74
+ object. Value if optional, and if not provided will be automatically calculated based
75
+ on the bounding box/masks provided. Only affects when 'area_ranges' is provided.
76
+
77
+ As output of ``forward`` and ``compute`` the metric returns the following output:
78
+
79
+ - ``results``: A dictionary containing the following key-values:
80
+
81
+ - ``params``: COCOeval parameters object
82
+ - ``eval``: output of COCOeval.accumuate()
83
+ - ``metrics``: A dictionary containing the following key-values for each area range:
84
+ - ``area_range``: str containing the area range
85
+ - ``iouThr``: str containing the IoU threshold
86
+ - ``maxDets``: int containing the maximum number of detections
87
+ - ``tp``: int containing the number of true positives
88
+ - ``fp``: int containing the number of false positives
89
+ - ``fn``: int containing the number of false negatives
90
+ - ``precision``: float containing the precision
91
+ - ``recall``: float containing the recall
92
+ - ``f1``: float containing the f1 score
93
+ - ``support``: int containing the support (tp + fn)
94
+
95
+ .. note::
96
+ This metric utilizes the official `pycocotools` implementation as its backend. This means that the metric
97
+ requires you to have `pycocotools` installed. In addition we require `torchvision` version 0.8.0 or newer.
98
+ Please install with ``pip install torchmetrics[detection]``.
99
+
100
+ Args:
101
+ box_format:
102
+ Input format of given boxes. Supported formats are ``[xyxy, xywh, cxcywh]``.
103
+ iou_type:
104
+ Type of input (either masks or bounding-boxes) used for computing IOU.
105
+ Supported IOU types are ``["bbox", "segm"]``. If using ``"segm"``, masks should be provided in input.
106
+ iou_thresholds:
107
+ IoU thresholds for evaluation. If set to ``None`` it corresponds to the stepped range ``[0.5,...,0.95]``
108
+ with step ``0.05``. Else provide a list of floats.
109
+ rec_thresholds:
110
+ Recall thresholds for evaluation. If set to ``None`` it corresponds to the stepped range ``[0,...,1]``
111
+ with step ``0.01``. Else provide a list of floats.
112
+ max_detection_thresholds:
113
+ Thresholds on max detections per image. If set to `None` will use thresholds ``[100]``.
114
+ Else, please provide a list of ints.
115
+ area_ranges:
116
+ Area ranges for evaluation. If set to ``None`` it corresponds to the ranges ``[[0^2, 1e5^2]]``.
117
+ Else, please provide a list of lists of length 2.
118
+ area_ranges_labels:
119
+ Labels for the area ranges. If set to ``None`` it corresponds to the labels ``["all"]``.
120
+ Else, please provide a list of strings of the same length as ``area_ranges``.
121
+ class_agnostic:
122
+ If ``True`` will compute metrics globally. If ``False`` will compute metrics per class.
123
+ Default: ``True`` (per class metrics are not supported yet)
124
+ debug:
125
+ If ``True`` will print the COCOEval summary to stdout.
126
+ kwargs: Additional keyword arguments, see :ref:`Metric kwargs` for more info.
127
+
128
+ Raises:
129
+ ValueError:
130
+ If ``box_format`` is not one of ``"xyxy"``, ``"xywh"`` or ``"cxcywh"``
131
+ ValueError:
132
+ If ``iou_type`` is not one of ``"bbox"`` or ``"segm"``
133
+ ValueError:
134
+ If ``iou_thresholds`` is not None or a list of floats
135
+ ValueError:
136
+ If ``rec_thresholds`` is not None or a list of floats
137
+ ValueError:
138
+ If ``max_detection_thresholds`` is not None or a list of ints
139
+ ValueError:
140
+ If ``area_ranges`` is not None or a list of lists of length 2
141
+ ValueError:
142
+ If ``area_ranges_labels`` is not None or a list of strings
143
+
144
+ Example:
145
+ >>> import numpy as np
146
+ >>> from metrics.detection import MeanAveragePrecision
147
+ >>> preds = [
148
+ ... dict(
149
+ ... boxes=np.array([[258.0, 41.0, 606.0, 285.0]]),
150
+ ... scores=np.array([0.536]),
151
+ ... labels=np.array([0]),
152
+ ... )
153
+ ... ]
154
+ >>> target = [
155
+ ... dict(
156
+ ... boxes=np.array([[214.0, 41.0, 562.0, 285.0]]),
157
+ ... labels=np.array([0]),
158
+ ... )
159
+ ... ]
160
+ >>> metric = PrecisionRecallF1Support()
161
+ >>> metric.update(preds, target)
162
+ >>> print(metric.compute())
163
+ {'params': <metrics.detection.cocoeval.Params at 0x16dc99150>,
164
+ 'eval': ... output of COCOeval.accumulate(),
165
+ 'metrics': {'all': {'range': [0, 10000000000.0],
166
+ 'iouThr': '0.50',
167
+ 'maxDets': 100,
168
+ 'tp': 1,
169
+ 'fp': 0,
170
+ 'fn': 0,
171
+ 'precision': 1.0,
172
+ 'recall': 1.0,
173
+ 'f1': 1.0,
174
+ 'support': 1}}}
175
+ """
176
+ is_differentiable: bool = False
177
+ higher_is_better: Optional[bool] = True
178
+ full_state_update: bool = True
179
+ plot_lower_bound: float = 0.0
180
+ plot_upper_bound: float = 1.0
181
+
182
+ detections: List[np.ndarray]
183
+ detection_scores: List[np.ndarray]
184
+ detection_labels: List[np.ndarray]
185
+ groundtruths: List[np.ndarray]
186
+ groundtruth_labels: List[np.ndarray]
187
+ groundtruth_crowds: List[np.ndarray]
188
+ groundtruth_area: List[np.ndarray]
189
+
190
+ def __init__(
191
+ self,
192
+ box_format: str = "xyxy",
193
+ iou_type: Literal["bbox", "segm"] = "bbox",
194
+ iou_thresholds: Optional[List[float]] = None,
195
+ rec_thresholds: Optional[List[float]] = None,
196
+ max_detection_thresholds: Optional[List[int]] = None,
197
+ area_ranges: Optional[List[List[int]]] = None,
198
+ area_ranges_labels: Optional[List[str]] = None,
199
+ class_agnostic: bool = True,
200
+ debug: bool = False,
201
+ **kwargs: Any,
202
+ ) -> None:
203
+
204
+ allowed_box_formats = ("xyxy", "xywh", "cxcywh")
205
+ if box_format not in allowed_box_formats:
206
+ raise ValueError(
207
+ f"Expected argument `box_format` to be one of {allowed_box_formats} but got {box_format}")
208
+ self.box_format = box_format
209
+
210
+ allowed_iou_types = ("segm", "bbox")
211
+ if iou_type not in allowed_iou_types:
212
+ raise ValueError(
213
+ f"Expected argument `iou_type` to be one of {allowed_iou_types} but got {iou_type}")
214
+ self.iou_type = iou_type
215
+
216
+ if iou_thresholds is not None and not isinstance(iou_thresholds, list):
217
+ raise ValueError(
218
+ f"Expected argument `iou_thresholds` to either be `None` or a list of floats but got {iou_thresholds}"
219
+ )
220
+ self.iou_thresholds = iou_thresholds or np.linspace(
221
+ 0.5, 0.95, round((0.95 - 0.5) / 0.05) + 1).tolist()
222
+
223
+ if rec_thresholds is not None and not isinstance(rec_thresholds, list):
224
+ raise ValueError(
225
+ f"Expected argument `rec_thresholds` to either be `None` or a list of floats but got {rec_thresholds}"
226
+ )
227
+ self.rec_thresholds = rec_thresholds or np.linspace(
228
+ 0.0, 1.00, round(1.00 / 0.01) + 1).tolist()
229
+
230
+ if max_detection_thresholds is not None and not isinstance(max_detection_thresholds, list):
231
+ raise ValueError(
232
+ f"Expected argument `max_detection_thresholds` to either be `None` or a list of ints"
233
+ f" but got {max_detection_thresholds}"
234
+ )
235
+ max_det_thr = np.sort(np.array(
236
+ max_detection_thresholds or [100], dtype=np.uint))
237
+ self.max_detection_thresholds = max_det_thr.tolist()
238
+
239
+ # check area ranges
240
+ if area_ranges is not None:
241
+ if not isinstance(area_ranges, list):
242
+ raise ValueError(
243
+ f"Expected argument `area_ranges` to either be `None` or a list of lists but got {area_ranges}"
244
+ )
245
+ for area_range in area_ranges:
246
+ if not isinstance(area_range, list) or len(area_range) != 2:
247
+ raise ValueError(
248
+ f"Expected argument `area_ranges` to be a list of lists of length 2 but got {area_ranges}"
249
+ )
250
+ self.area_ranges = area_ranges if area_ranges is not None else [
251
+ [0**2, 1e5**2]]
252
+
253
+ if area_ranges_labels is not None:
254
+ if area_ranges is None:
255
+ raise ValueError(
256
+ "Expected argument `area_ranges_labels` to be `None` if `area_ranges` is not provided"
257
+ )
258
+ if not isinstance(area_ranges_labels, list):
259
+ raise ValueError(
260
+ f"Expected argument `area_ranges_labels` to either be `None` or a list of strings"
261
+ f" but got {area_ranges_labels}"
262
+ )
263
+ if len(area_ranges_labels) != len(area_ranges):
264
+ raise ValueError(
265
+ f"Expected argument `area_ranges_labels` to be a list of length {len(area_ranges)}"
266
+ f" but got {area_ranges_labels}"
267
+ )
268
+ self.area_ranges_labels = area_ranges_labels if area_ranges_labels is not None else [
269
+ "all"]
270
+
271
+ # if not isinstance(class_metrics, bool):
272
+ # raise ValueError(
273
+ # "Expected argument `class_metrics` to be a boolean")
274
+ # self.class_metrics = class_metrics
275
+
276
+ if not isinstance(class_agnostic, bool):
277
+ raise ValueError(
278
+ "Expected argument `class_agnostic` to be a boolean")
279
+ self.class_agnostic = class_agnostic
280
+
281
+ if not isinstance(debug, bool):
282
+ raise ValueError("Expected argument `debug` to be a boolean")
283
+ self.debug = debug
284
+
285
+ self.detections = []
286
+ self.detection_scores = []
287
+ self.detection_labels = []
288
+ self.groundtruths = []
289
+ self.groundtruth_labels = []
290
+ self.groundtruth_crowds = []
291
+ self.groundtruth_area = []
292
+
293
+ # self.add_state("detections", default=[], dist_reduce_fx=None)
294
+ # self.add_state("detection_scores", default=[], dist_reduce_fx=None)
295
+ # self.add_state("detection_labels", default=[], dist_reduce_fx=None)
296
+ # self.add_state("groundtruths", default=[], dist_reduce_fx=None)
297
+ # self.add_state("groundtruth_labels", default=[], dist_reduce_fx=None)
298
+ # self.add_state("groundtruth_crowds", default=[], dist_reduce_fx=None)
299
+ # self.add_state("groundtruth_area", default=[], dist_reduce_fx=None)
300
+
301
+ def update(self, preds: List[Dict[str, np.ndarray]], target: List[Dict[str, np.ndarray]]) -> None:
302
+ """Update metric state.
303
+
304
+ Raises:
305
+ ValueError:
306
+ If ``preds`` is not of type ``List[Dict[str, np.ndarray]]``
307
+ ValueError:
308
+ If ``target`` is not of type ``List[Dict[str, np.ndarray]]``
309
+ ValueError:
310
+ If ``preds`` and ``target`` are not of the same length
311
+ ValueError:
312
+ If any of ``preds.boxes``, ``preds.scores`` and ``preds.labels`` are not of the same length
313
+ ValueError:
314
+ If any of ``target.boxes`` and ``target.labels`` are not of the same length
315
+ ValueError:
316
+ If any box is not of type float and of length 4
317
+ ValueError:
318
+ If any class is not of type int and of length 1
319
+ ValueError:
320
+ If any score is not of type float and of length 1
321
+ """
322
+ _input_validator(preds, target, iou_type=self.iou_type)
323
+
324
+ for item in preds:
325
+ detections = self._get_safe_item_values(item)
326
+
327
+ self.detections.append(detections)
328
+ self.detection_labels.append(item["labels"])
329
+ self.detection_scores.append(item["scores"])
330
+
331
+ for item in target:
332
+ groundtruths = self._get_safe_item_values(item)
333
+ self.groundtruths.append(groundtruths)
334
+ self.groundtruth_labels.append(item["labels"])
335
+ self.groundtruth_crowds.append(
336
+ item.get("iscrowd", np.zeros_like(item["labels"])))
337
+ self.groundtruth_area.append(
338
+ item.get("area", np.zeros_like(item["labels"])))
339
+
340
+ def compute(self) -> dict:
341
+ """Computes the metric."""
342
+ coco_target, coco_preds = COCO(), COCO()
343
+
344
+ coco_target.dataset = self._get_coco_format(
345
+ self.groundtruths, self.groundtruth_labels, crowds=self.groundtruth_crowds, area=self.groundtruth_area
346
+ )
347
+ coco_preds.dataset = self._get_coco_format(
348
+ self.detections, self.detection_labels, scores=self.detection_scores)
349
+
350
+ with contextlib.redirect_stdout(io.StringIO()) as f:
351
+ coco_target.createIndex()
352
+ coco_preds.createIndex()
353
+
354
+ coco_eval = COCOeval(coco_target, coco_preds,
355
+ iouType=self.iou_type)
356
+ coco_eval.params.iouThrs = np.array(
357
+ self.iou_thresholds, dtype=np.float64)
358
+ coco_eval.params.recThrs = np.array(
359
+ self.rec_thresholds, dtype=np.float64)
360
+ coco_eval.params.maxDets = self.max_detection_thresholds
361
+ coco_eval.params.areaRng = self.area_ranges
362
+ coco_eval.params.areaRngLbl = self.area_ranges_labels
363
+ coco_eval.params.useCats = 0 if self.class_agnostic else 1
364
+
365
+ coco_eval.evaluate()
366
+ coco_eval.accumulate()
367
+
368
+ if self.debug:
369
+ print(f.getvalue())
370
+
371
+ metrics = coco_eval.summarize()
372
+ return metrics
373
+
374
+ @staticmethod
375
+ def coco_to_np(
376
+ coco_preds: str,
377
+ coco_target: str,
378
+ iou_type: Literal["bbox", "segm"] = "bbox",
379
+ ) -> Tuple[List[Dict[str, np.ndarray]], List[Dict[str, np.ndarray]]]:
380
+ """Utility function for converting .json coco format files to the input format of this metric.
381
+
382
+ The function accepts a file for the predictions and a file for the target in coco format and converts them to
383
+ a list of dictionaries containing the boxes, labels and scores in the input format of this metric.
384
+
385
+ Args:
386
+ coco_preds: Path to the json file containing the predictions in coco format
387
+ coco_target: Path to the json file containing the targets in coco format
388
+ iou_type: Type of input, either `bbox` for bounding boxes or `segm` for segmentation masks
389
+
390
+ Returns:
391
+ preds: List of dictionaries containing the predictions in the input format of this metric
392
+ target: List of dictionaries containing the targets in the input format of this metric
393
+
394
+ Example:
395
+ >>> # File formats are defined at https://cocodataset.org/#format-data
396
+ >>> # Example files can be found at
397
+ >>> # https://github.com/cocodataset/cocoapi/tree/master/results
398
+ >>> from metrics.detection import PrecisionRecallF1Support
399
+ >>> preds, target = PrecisionRecallF1Support.coco_to_np(
400
+ ... "instances_val2014_fakebbox100_results.json.json",
401
+ ... "val2014_fake_eval_res.txt.json"
402
+ ... iou_type="bbox"
403
+ ... ) # doctest: +SKIP
404
+
405
+ """
406
+ with contextlib.redirect_stdout(io.StringIO()):
407
+ gt = COCO(coco_target)
408
+ dt = gt.loadRes(coco_preds)
409
+
410
+ gt_dataset = gt.dataset["annotations"]
411
+ dt_dataset = dt.dataset["annotations"]
412
+
413
+ target = {}
414
+ for t in gt_dataset:
415
+ if t["image_id"] not in target:
416
+ target[t["image_id"]] = {
417
+ "boxes" if iou_type == "bbox" else "masks": [],
418
+ "labels": [],
419
+ "iscrowd": [],
420
+ "area": [],
421
+ }
422
+ if iou_type == "bbox":
423
+ target[t["image_id"]]["boxes"].append(t["bbox"])
424
+ else:
425
+ target[t["image_id"]]["masks"].append(gt.annToMask(t))
426
+ target[t["image_id"]]["labels"].append(t["category_id"])
427
+ target[t["image_id"]]["iscrowd"].append(t["iscrowd"])
428
+ target[t["image_id"]]["area"].append(t["area"])
429
+
430
+ preds = {}
431
+ for p in dt_dataset:
432
+ if p["image_id"] not in preds:
433
+ preds[p["image_id"]] = {
434
+ "boxes" if iou_type == "bbox" else "masks": [], "scores": [], "labels": []}
435
+ if iou_type == "bbox":
436
+ preds[p["image_id"]]["boxes"].append(p["bbox"])
437
+ else:
438
+ preds[p["image_id"]]["masks"].append(gt.annToMask(p))
439
+ preds[p["image_id"]]["scores"].append(p["score"])
440
+ preds[p["image_id"]]["labels"].append(p["category_id"])
441
+ for k in target: # add empty predictions for images without predictions
442
+ if k not in preds:
443
+ preds[k] = {"boxes" if iou_type ==
444
+ "bbox" else "masks": [], "scores": [], "labels": []}
445
+
446
+ batched_preds, batched_target = [], []
447
+ for key in target:
448
+ name = "boxes" if iou_type == "bbox" else "masks"
449
+ batched_preds.append(
450
+ {
451
+ name: np.array(
452
+ np.array(preds[key]["boxes"]), dtype=np.float32)
453
+ if iou_type == "bbox"
454
+ else np.array(np.array(preds[key]["masks"]), dtype=np.uint8),
455
+ "scores": np.array(preds[key]["scores"], dtype=np.float32),
456
+ "labels": np.array(preds[key]["labels"], dtype=np.int32),
457
+ }
458
+ )
459
+ batched_target.append(
460
+ {
461
+ name: np.array(
462
+ target[key]["boxes"], dtype=np.float32)
463
+ if iou_type == "bbox"
464
+ else np.array(np.array(target[key]["masks"]), dtype=np.uint8),
465
+ "labels": np.array(target[key]["labels"], dtype=np.int32),
466
+ "iscrowd": np.array(target[key]["iscrowd"], dtype=np.int32),
467
+ "area": np.array(target[key]["area"], dtype=np.float32),
468
+ }
469
+ )
470
+
471
+ return batched_preds, batched_target
472
+
473
+ def np_to_coco(self, name: str = "np_map_input") -> None:
474
+ """Utility function for converting the input for this metric to coco format and saving it to a json file.
475
+
476
+ This function should be used after calling `.update(...)` on all data that should be written
477
+ to the file, as the input is then internally cached. The function then converts the information to coco format
478
+ and writes it to json files.
479
+
480
+ Args:
481
+ name: Name of the output file, which will be appended with "_preds.json" and "_target.json"
482
+
483
+ Example:
484
+ >>> import numpy as np
485
+ >>> from metrics.detection import PrecisionRecallF1Support
486
+ >>> preds = [
487
+ ... dict(
488
+ ... boxes=np.array([[258.0, 41.0, 606.0, 285.0]]),
489
+ ... scores=np.array([0.536]),
490
+ ... labels=np.array([0]),
491
+ ... )
492
+ ... ]
493
+ >>> target = [
494
+ ... dict(
495
+ ... boxes=np.array([[214.0, 41.0, 562.0, 285.0]]),
496
+ ... labels=np.array([0]),
497
+ ... )
498
+ ... ]
499
+ >>> metric = PrecisionRecallF1Support()
500
+ >>> metric.update(preds, target)
501
+ >>> metric.np_to_coco("np_map_input") # doctest: +SKIP
502
+
503
+ """
504
+ target_dataset = self._get_coco_format(
505
+ self.groundtruths, self.groundtruth_labels)
506
+ preds_dataset = self._get_coco_format(
507
+ self.detections, self.detection_labels, self.detection_scores)
508
+
509
+ preds_json = json.dumps(preds_dataset["annotations"], indent=4)
510
+ target_json = json.dumps(target_dataset, indent=4)
511
+
512
+ with open(f"{name}_preds.json", "w") as f:
513
+ f.write(preds_json)
514
+
515
+ with open(f"{name}_target.json", "w") as f:
516
+ f.write(target_json)
517
+
518
+ def _get_safe_item_values(self, item: Dict[str, Any]) -> Union[np.ndarray, Tuple]:
519
+ """Convert and return the boxes or masks from the item depending on the iou_type.
520
+
521
+ Args:
522
+ item: input dictionary containing the boxes or masks
523
+
524
+ Returns:
525
+ boxes or masks depending on the iou_type
526
+
527
+ """
528
+ if self.iou_type == "bbox":
529
+ boxes = _fix_empty_arrays(item["boxes"])
530
+ if boxes.size > 0:
531
+ boxes = box_convert(
532
+ boxes, in_fmt=self.box_format, out_fmt="xywh")
533
+ return boxes
534
+ if self.iou_type == "segm":
535
+ masks = []
536
+ for i in item["masks"]:
537
+ rle = mask_utils.encode(np.asfortranarray(i))
538
+ masks.append((tuple(rle["size"]), rle["counts"]))
539
+ return tuple(masks)
540
+ raise Exception(f"IOU type {self.iou_type} is not supported")
541
+
542
+ def _get_classes(self) -> List:
543
+ """Return a list of unique classes found in ground truth and detection data."""
544
+ all_labels = np.concatenate(
545
+ self.detection_labels + self.groundtruth_labels)
546
+ unique_classes = np.unique(all_labels)
547
+ return unique_classes.tolist()
548
+
549
+ def _get_coco_format(
550
+ self,
551
+ boxes: List[np.ndarray],
552
+ labels: List[np.ndarray],
553
+ scores: Optional[List[np.ndarray]] = None,
554
+ crowds: Optional[List[np.ndarray]] = None,
555
+ area: Optional[List[np.ndarray]] = None,
556
+ ) -> Dict:
557
+ """Transforms and returns all cached targets or predictions in COCO format.
558
+
559
+ Format is defined at https://cocodataset.org/#format-data
560
+ """
561
+ images = []
562
+ annotations = []
563
+ annotation_id = 1 # has to start with 1, otherwise COCOEval results are wrong
564
+
565
+ for image_id, (image_boxes, image_labels) in enumerate(zip(boxes, labels)):
566
+ if self.iou_type == "segm" and len(image_boxes) == 0:
567
+ continue
568
+
569
+ if self.iou_type == "bbox":
570
+ image_boxes = image_boxes.tolist()
571
+ image_labels = image_labels.tolist()
572
+
573
+ images.append({"id": image_id})
574
+ if self.iou_type == "segm":
575
+ images[-1]["height"], images[-1]["width"] = image_boxes[0][0][0], image_boxes[0][0][1]
576
+
577
+ for k, (image_box, image_label) in enumerate(zip(image_boxes, image_labels)):
578
+ if self.iou_type == "bbox" and len(image_box) != 4:
579
+ raise ValueError(
580
+ f"Invalid input box of sample {image_id}, element {k} (expected 4 values, got {len(image_box)})"
581
+ )
582
+
583
+ if not isinstance(image_label, int):
584
+ raise ValueError(
585
+ f"Invalid input class of sample {image_id}, element {k}"
586
+ f" (expected value of type integer, got type {type(image_label)})"
587
+ )
588
+
589
+ stat = image_box if self.iou_type == "bbox" else {
590
+ "size": image_box[0], "counts": image_box[1]}
591
+
592
+ if area is not None and area[image_id][k].tolist() > 0:
593
+ area_stat = area[image_id][k].tolist()
594
+ else:
595
+ area_stat = image_box[2] * \
596
+ image_box[3] if self.iou_type == "bbox" else mask_utils.area(
597
+ stat)
598
+
599
+ annotation = {
600
+ "id": annotation_id,
601
+ "image_id": image_id,
602
+ "bbox" if self.iou_type == "bbox" else "segmentation": stat,
603
+ "area": area_stat,
604
+ "category_id": image_label,
605
+ "iscrowd": crowds[image_id][k].tolist() if crowds is not None else 0,
606
+ }
607
+
608
+ if scores is not None:
609
+ score = scores[image_id][k].tolist()
610
+ if not isinstance(score, float):
611
+ raise ValueError(
612
+ f"Invalid input score of sample {image_id}, element {k}"
613
+ f" (expected value of type float, got type {type(score)})"
614
+ )
615
+ annotation["score"] = score
616
+ annotations.append(annotation)
617
+ annotation_id += 1
618
+
619
+ classes = [{"id": i, "name": str(i)} for i in self._get_classes()]
620
+ return {"images": images, "annotations": annotations, "categories": classes}
modified_coco/utils.py ADDED
@@ -0,0 +1,220 @@
1
+ import numpy as np
2
+
3
+ def box_denormalize(boxes: np.ndarray, img_w: int, img_h: int) -> np.ndarray:
4
+ """
5
+ Denormalizes boxes from [0, 1] to [0, img_w] and [0, img_h].
6
+ Args:
7
+ boxes (ndarray[N, 4]): boxes which will be denormalized.
8
+ img_w (int): Width of image.
9
+ img_h (int): Height of image.
10
+
11
+ Returns:
12
+ ndarray[N, 4]: Denormalized boxes.
13
+ """
14
+ if boxes.size == 0:
15
+ return boxes
16
+
17
+ # check if boxes are normalized
18
+ if np.any(boxes > 1.0):
19
+ return boxes
20
+
21
+ boxes[:, 0::2] *= img_w
22
+ boxes[:, 1::2] *= img_h
23
+ return boxes
24
+
25
+
26
+ def box_convert(boxes: np.ndarray, in_fmt: str, out_fmt: str) -> np.ndarray:
27
+ """
28
+ Converts boxes from given in_fmt to out_fmt.
29
+ Supported in_fmt and out_fmt are:
30
+
31
+ 'xyxy': boxes are represented via corners, x1, y1 being top left and x2, y2 being bottom right.
32
+ This is the format that torchvision utilities expect.
33
+
34
+ 'xywh' : boxes are represented via corner, width and height, x1, y1 being top left, w, h being width and height.
35
+
36
+ 'cxcywh' : boxes are represented via centre, width and height, cx, cy being center of box, w, h
37
+ being width and height.
38
+
39
+ Args:
40
+ boxes (ndarray[N, 4]): boxes which will be converted.
41
+ in_fmt (str): Input format of given boxes. Supported formats are ['xyxy', 'xywh', 'cxcywh'].
42
+ out_fmt (str): Output format of given boxes. Supported formats are ['xyxy', 'xywh', 'cxcywh']
43
+
44
+ Returns:
45
+ ndarray[N, 4]: Boxes in the converted format.
46
+ """
47
+ if boxes.size == 0:
48
+ return boxes
49
+
50
+ allowed_fmts = ("xyxy", "xywh", "cxcywh")
51
+ if in_fmt not in allowed_fmts or out_fmt not in allowed_fmts:
52
+ raise ValueError(
53
+ "Unsupported Bounding Box Conversions for given in_fmt and out_fmt")
54
+
55
+ if in_fmt == out_fmt:
56
+ return boxes.copy()
57
+
58
+ if in_fmt != "xyxy" and out_fmt != "xyxy":
59
+ # convert to xyxy and change in_fmt to xyxy
60
+ if in_fmt == "xywh":
61
+ boxes = _box_xywh_to_xyxy(boxes)
62
+ elif in_fmt == "cxcywh":
63
+ boxes = _box_cxcywh_to_xyxy(boxes)
64
+ in_fmt = "xyxy"
65
+
66
+ if in_fmt == "xyxy":
67
+ if out_fmt == "xywh":
68
+ boxes = _box_xyxy_to_xywh(boxes)
69
+ elif out_fmt == "cxcywh":
70
+ boxes = _box_xyxy_to_cxcywh(boxes)
71
+ elif out_fmt == "xyxy":
72
+ if in_fmt == "xywh":
73
+ boxes = _box_xywh_to_xyxy(boxes)
74
+ elif in_fmt == "cxcywh":
75
+ boxes = _box_cxcywh_to_xyxy(boxes)
76
+ return boxes
77
+
78
+
79
+ def _box_xywh_to_xyxy(boxes):
80
+ """
81
+ Converts bounding boxes from (x, y, w, h) format to (x1, y1, x2, y2) format.
82
+ (x, y) refers to top left of bounding box.
83
+ (w, h) refers to width and height of box.
84
+ Args:
85
+ boxes (ndarray[N, 4]): boxes in (x, y, w, h) which will be converted.
86
+
87
+ Returns:
88
+ boxes (ndarray[N, 4]): boxes in (x1, y1, x2, y2) format.
89
+ """
90
+ x, y, w, h = np.split(boxes, 4, axis=-1)
91
+ x1 = x
92
+ y1 = y
93
+ x2 = x + w
94
+ y2 = y + h
95
+ converted_boxes = np.concatenate([x1, y1, x2, y2], axis=-1)
96
+ return converted_boxes
97
+
98
+
99
+ def _box_cxcywh_to_xyxy(boxes):
100
+ """
101
+ Converts bounding boxes from (cx, cy, w, h) format to (x1, y1, x2, y2) format.
102
+ (cx, cy) refers to center of bounding box
103
+ (w, h) are width and height of bounding box
104
+ Args:
105
+ boxes (ndarray[N, 4]): boxes in (cx, cy, w, h) format which will be converted.
106
+
107
+ Returns:
108
+ boxes (ndarray[N, 4]): boxes in (x1, y1, x2, y2) format.
109
+ """
110
+ cx, cy, w, h = np.split(boxes, 4, axis=-1)
111
+ x1 = cx - 0.5 * w
112
+ y1 = cy - 0.5 * h
113
+ x2 = cx + 0.5 * w
114
+ y2 = cy + 0.5 * h
115
+ converted_boxes = np.concatenate([x1, y1, x2, y2], axis=-1)
116
+ return converted_boxes
117
+
118
+
119
+ def _box_xyxy_to_xywh(boxes):
120
+ """
121
+ Converts bounding boxes from (x1, y1, x2, y2) format to (x, y, w, h) format.
122
+ (x1, y1) refer to top left of bounding box
123
+ (x2, y2) refer to bottom right of bounding box
124
+ Args:
125
+ boxes (ndarray[N, 4]): boxes in (x1, y1, x2, y2) which will be converted.
126
+
127
+ Returns:
128
+ boxes (ndarray[N, 4]): boxes in (x, y, w, h) format.
129
+ """
130
+ x1, y1, x2, y2 = np.split(boxes, 4, axis=-1)
131
+ w = x2 - x1
132
+ h = y2 - y1
133
+ converted_boxes = np.concatenate([x1, y1, w, h], axis=-1)
134
+ return converted_boxes
135
+
136
+
137
+ def _box_xyxy_to_cxcywh(boxes):
138
+ """
139
+ Converts bounding boxes from (x1, y1, x2, y2) format to (cx, cy, w, h) format.
140
+ (x1, y1) refer to top left of bounding box
141
+ (x2, y2) refer to bottom right of bounding box
142
+ Args:
143
+ boxes (ndarray[N, 4]): boxes in (x1, y1, x2, y2) format which will be converted.
144
+
145
+ Returns:
146
+ boxes (ndarray[N, 4]): boxes in (cx, cy, w, h) format.
147
+ """
148
+ x1, y1, x2, y2 = np.split(boxes, 4, axis=-1)
149
+ cx = (x1 + x2) / 2
150
+ cy = (y1 + y2) / 2
151
+ w = x2 - x1
152
+ h = y2 - y1
153
+ converted_boxes = np.concatenate([cx, cy, w, h], axis=-1)
154
+ return converted_boxes
155
+
156
+ def _fix_empty_arrays(boxes: np.ndarray) -> np.ndarray:
157
+ """Empty tensors can cause problems, this methods corrects them."""
158
+ if boxes.size == 0 and boxes.ndim == 1:
159
+ return np.expand_dims(boxes, axis=0)
160
+ return boxes
161
+
162
+ def _input_validator(preds, targets, iou_type="bbox"):
163
+ """Ensure the correct input format of `preds` and `targets`."""
164
+ if iou_type == "bbox":
165
+ item_val_name = "boxes"
166
+ elif iou_type == "segm":
167
+ item_val_name = "masks"
168
+ else:
169
+ raise Exception(f"IOU type {iou_type} is not supported")
170
+
171
+ if not isinstance(preds, (list, tuple)):
172
+ raise ValueError(
173
+ f"Expected argument `preds` to be of type list or tuple, but got {type(preds)}")
174
+ if not isinstance(targets, (list, tuple)):
175
+ raise ValueError(
176
+ f"Expected argument `targets` to be of type list or tuple, but got {type(targets)}")
177
+ if len(preds) != len(targets):
178
+ raise ValueError(
179
+ f"Expected argument `preds` and `targets` to have the same length, but got {len(preds)} and {len(targets)}"
180
+ )
181
+
182
+ for k in [item_val_name, "scores", "labels"]:
183
+ if any(k not in p for p in preds):
184
+ raise ValueError(
185
+ f"Expected all dicts in `preds` to contain the `{k}` key")
186
+
187
+ for k in [item_val_name, "labels"]:
188
+ if any(k not in p for p in targets):
189
+ raise ValueError(
190
+ f"Expected all dicts in `targets` to contain the `{k}` key")
191
+
192
+ if any(type(pred[item_val_name]) is not np.ndarray for pred in preds):
193
+ raise ValueError(
194
+ f"Expected all {item_val_name} in `preds` to be of type ndarray")
195
+ if any(type(pred["scores"]) is not np.ndarray for pred in preds):
196
+ raise ValueError(
197
+ "Expected all scores in `preds` to be of type ndarray")
198
+ if any(type(pred["labels"]) is not np.ndarray for pred in preds):
199
+ raise ValueError(
200
+ "Expected all labels in `preds` to be of type ndarray")
201
+ if any(type(target[item_val_name]) is not np.ndarray for target in targets):
202
+ raise ValueError(
203
+ f"Expected all {item_val_name} in `targets` to be of type ndarray")
204
+ if any(type(target["labels"]) is not np.ndarray for target in targets):
205
+ raise ValueError(
206
+ "Expected all labels in `targets` to be of type ndarray")
207
+
208
+ for i, item in enumerate(targets):
209
+ if item[item_val_name].shape[0] != item["labels"].shape[0]:
210
+ raise ValueError(
211
+ f"Input {item_val_name} and labels of sample {i} in targets have a"
212
+ f" different length (expected {item[item_val_name].shape[0]} labels, got {item['labels'].shape[0]})"
213
+ )
214
+ for i, item in enumerate(preds):
215
+ if not (item[item_val_name].shape[0] == item["labels"].shape[0] == item["scores"].shape[0]):
216
+ raise ValueError(
217
+ f"Input {item_val_name}, labels and scores of sample {i} in predictions have a"
218
+ f" different length (expected {item[item_val_name].shape[0]} labels and scores,"
219
+ f" got {item['labels'].shape[0]} labels and {item['scores'].shape[0]})"
220
+ )
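As a quick reference for the helpers above, here is a small sketch of `box_convert`; the `modified_coco.utils` import path is an assumption based on the file layout, and the box values are illustrative:

```
import numpy as np
from modified_coco.utils import box_convert  # import path assumed from the repository layout

boxes_xywh = np.array([[10.0, 15.0, 5.0, 9.0]])  # x, y, width, height

print(box_convert(boxes_xywh, in_fmt="xywh", out_fmt="xyxy"))
# [[10. 15. 15. 24.]]    -> corners x1, y1, x2, y2
print(box_convert(boxes_xywh, in_fmt="xywh", out_fmt="cxcywh"))
# [[12.5 19.5  5.   9. ]] -> centre x, centre y, width, height
```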
requirements.txt ADDED
@@ -0,0 +1,3 @@
1
+ git+https://github.com/huggingface/evaluate@main
2
+ numpy==1.24.3
3
+ pycocotools==2.0.6
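With these dependencies installed, one possible end-to-end flow is to convert existing COCO-format json files with `coco_to_np` and feed the result to the metric. This is a sketch only; the import path and file paths are placeholders, and `box_format="xywh"` is chosen because boxes in COCO json files are stored as x, y, width, height:

```
from metrics.detection import PrecisionRecallF1Support  # import path assumed from the docstring examples

# Placeholder paths to COCO-format prediction and ground-truth files.
preds, target = PrecisionRecallF1Support.coco_to_np(
    "path/to/coco_predictions.json",
    "path/to/coco_ground_truth.json",
    iou_type="bbox",
)

metric = PrecisionRecallF1Support(box_format="xywh")  # COCO json boxes are already in xywh format
metric.update(preds, target)
print(metric.compute()["metrics"])
```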