franzi2505 committed on
Commit
3359d6e
1 Parent(s): e599283

adapt detection metrics to fit standard payload

Files changed (2)
  1. README.md +112 -77
  2. det-metrics.py +100 -97
README.md CHANGED
@@ -20,28 +20,17 @@ This metric can be used to calculate object detection metrics. It has an option
20
 
21
  ## How to Use
22
  ```
23
- >>> module = evaluate.load("SEA-AI/det-metrics")
24
- # shape: (n_images, m_predicted_bboxes, xywh)
25
- >>> predictions = [
26
- [
27
- [10, 15, 5, 9],
28
- [45, 30, 10, 10]
29
- ],[
30
- [14, 25, 6, 6],
31
- [10, 16, 6, 10]
32
- ],
33
- ]
34
- # shape: (n_images, m_gt_bboxes, xywh)
35
- >>> references = [
36
- [[10, 16, 6, 10]],
37
- [[30, 30, 5, 6]]
38
- ]
39
- >>> module.add_batch(
40
- predictions=predictions,
41
- references=references,
42
- predictions_scores=[[0.5,0.1], [0.8, 0.2]]
43
  )
44
- >>> module.compute()
 
 
 
45
  ```
46
 
47
  ### Metric Settings
@@ -53,10 +42,97 @@ When loading module: `module = evaluate.load("SEA-AI/det-metrics", **params)`, m
53
 
54
 
55
  ### Input Values
56
- Add predictions to the metric with the function `module.add_batch(predictions, references)` with the following parameters:
57
- - **predictions** *List[List[List[int]]]*: predicted bounding boxes in shape `n x m x 4` with `n` being the number of images that are evaluated, `m` the number of predicted bounding boxes for the n-th image and the four co-ordinates specifying the bounding box (by default: x y width height).
58
- - **references** *List[List[List[int]]]*: ground truth bounding boxes in shape `n x l x 4` with `l` being the number of ground truth bounding boxes for the n-th image.
59
 
 
60
 
61
  ### Output Values
62
  The metric outputs a dictionary that contains sub-dictionaries for each name of the specified area ranges.
@@ -77,66 +153,23 @@ Each sub-dictionary holds performance metrics at the specific area range level:
77
 
78
 
79
  ### Examples
80
- #### Example 1
81
- Basic usage example. Add predictions and references via the `module.add_batch(predictions, references)` function. Finally, compute the metrics across predictions and ground truths over different images via `module.compute()`.
82
- ```
83
- >>> module = evaluate.load("SEA-AI/det-metrics", iou_thresholds=0.9)
84
- >>> predictions = [
85
- [
86
- [10, 15, 20, 25],
87
- [45, 30, 10, 10]
88
- ],[
89
- [14, 25, 6, 6],
90
- [10, 16, 6, 10]
91
- ]
92
- ]
93
- >>> references = [
94
- [[10, 15, 20, 20]],
95
- [[30, 30, 5, 6]]
96
- ]
97
- >>> module.add_batch(predictions=predictions, references=references, predictions_scores=[[0.5,0.3],[0.8, 0.1]])
98
- >>> result = module.compute()
99
- >>> print(result)
100
- {'all': {
101
- 'range': [0, 10000000000.0],
102
- 'iouThr': '0.00',
103
- 'maxDets': 100,
104
- 'tp': 1,
105
- 'fp': 3,
106
- 'fn': 1,
107
- 'duplicates': 0,
108
- 'precision': 0.25,
109
- 'recall': 0.5,
110
- 'f1': 0.3333333333333333,
111
- 'support': 2,
112
- 'fpi': 0,
113
- 'nImgs': 2
114
- }
115
- }
116
- ```
117
- #### Example 2
118
- We can specify different area range levels at which we would like to compute the metrics. Further note that the references contain an empty list for the first image because it does not include any ground truth bounding boxes. We still need to include it so that we can map the false positive predictions to the reference boxes correctly.
119
  ```
 
 
120
  >>> area_ranges_tuples = [
121
  ("all", [0, 1e5 ** 2]),
122
  ("small", [0 ** 2, 6 ** 2]),
123
  ("medium", [6 ** 2, 12 ** 2]),
124
  ("large", [12 ** 2, 1e5 ** 2])
125
  ]
126
- >>> module = evaluate.load("SEA-AI/det-metrics", area_ranges_tuples=area_ranges_tuples)
127
- >>> predictions = [
128
- [
129
- [10, 15, 5, 5],
130
- [45, 30, 10, 10]
131
- ],[
132
- [50, 50, 6, 10]
133
- ],
134
- ]
135
- >>> references = [
136
- [],
137
- [[10, 15, 5, 5]]
138
- ]
139
- >>> module.add_batch(predictions=predictions, references=references)
140
  >>> result = module.compute()
141
  >>> print(result)
142
  {'all':
@@ -202,6 +235,8 @@ We can specify different area range levels, at which we would like to compute th
202
  ```
203
 
204
  ## Further References
 
 
205
  Calculating metrics is based on pycoco tools: https://github.com/cocodataset/cocoapi/tree/master/PythonAPI/pycocotools
206
 
207
  Further info about metrics: https://www.analyticsvidhya.com/blog/2020/09/precision-recall-machine-learning/
 
20
 
21
  ## How to Use
22
  ```
23
+ >>> import evaluate
24
+ >>> from seametrics.fo_to_payload.utils import fo_to_payload
25
+ >>> payload = fo_to_payload(
26
+ dataset=dataset,
27
+ gt_field=gt_field,
28
+ models=model_list
29
  )
30
+ >>> for model in payload["models"]:
31
+ >>> module = evaluate.load("./detection_metric.py", iou_thresholds=0.9)
32
+ >>> module.add_batch(payload, model=model)
33
+ >>> result = module.compute()
34
  ```
35
 
36
  ### Metric Settings
 
42
 
43
 
44
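For illustration, here is a minimal sketch of loading the metric with custom settings; the parameter names `iou_thresholds` and `area_ranges_tuples` are taken from the examples in this README, and the concrete values below are placeholders:

```
>>> import evaluate
>>> area_ranges_tuples = [
        ("all", [0, 1e5 ** 2]),
        ("small", [0 ** 2, 6 ** 2]),
    ]
>>> module = evaluate.load(
        "SEA-AI/det-metrics",
        iou_thresholds=0.5,  # placeholder IoU threshold
        area_ranges_tuples=area_ranges_tuples,
    )
```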
  ### Input Values
45
+ Add predictions and ground truths to the metric with the function `module.add_batch(payload)`.
46
+ The payload format should be as returned by the `fo_to_payload()` function defined in the seametrics library.
47
+ An example of what a payload might look like:
48
+
49
+ ```
50
+ test_payload = {
51
+ 'dataset': 'SAILING_DATASET_QA',
52
+ 'models': ['yolov5n6_RGB_D2304-v1_9C'],
53
+ 'gt_field_name': 'ground_truth_det',
54
+ 'sequences': {
55
+ # sequence 1, 1 frame with 1 pred and 1 gt
56
+ 'Trip_14_Seq_1': {
57
+ 'resolution': (720, 1280),
58
+ 'yolov5n6_RGB_D2304-v1_9C': [[fo.Detection(
59
+ label='FAR_AWAY_OBJECT',
60
+ bounding_box=[0.35107421875, 0.274658203125, 0.0048828125, 0.009765625], # tp nr1
61
+ confidence=0.153076171875
62
+ )]],
63
+ 'ground_truth_det': [[fo.Detection(
64
+ label='FAR_AWAY_OBJECT',
65
+ bounding_box=[0.35107421875, 0.274658203125, 0.0048828125, 0.009765625]
66
+ )]]
67
+ },
68
+ # sequence 2, 2 frames with frame 1: 2 pred, 1 gt; frame 2: 1 pred 1 gt
69
+ 'Trip_14_Seq_2': {
70
+ 'resolution': (720, 1280),
71
+ 'yolov5n6_RGB_D2304-v1_9C': [
72
+ [
73
+ fo.Detection(
74
+ label='FAR_AWAY_OBJECT',
75
+ bounding_box=[0.389404296875,0.306640625,0.005126953125,0.0146484375], # tp nr 2
76
+ confidence=0.153076171875
77
+ ),
78
+ fo.Detection(
79
+ label='FAR_AWAY_OBJECT',
80
+ bounding_box=[0.50390625, 0.357666015625, 0.0048828125, 0.00976562], # fp nr 1
81
+ confidence=0.153076171875
82
+ ),
83
+ fo.Detection(
84
+ label='FAR_AWAY_OBJECT',
85
+ bounding_box=[0.455078125, 0.31494140625, 0.00390625, 0.0087890625], # fp nr 2
86
+ confidence=0.153076171875
87
+ )
88
+ ],
89
+ [
90
+ fo.Detection(
91
+ label='FAR_AWAY_OBJECT',
92
+ bounding_box=[0.455078125, 0.31494140625, 0.00390625, 0.0087890625], # tp nr 3
93
+ confidence=0.153076171875
94
+ )
95
+ ],
96
+ [
97
+ fo.Detection(
98
+ label='FAR_AWAY_OBJECT',
99
+ bounding_box=[0.455078125, 0.31494140625, 0.00390625, 0.0087890625], # fp nr 3
100
+ confidence=0.153076171875
101
+ )
102
+ ]
103
+ ],
104
+ 'ground_truth_det': [
105
+ # frame nr 1
106
+ [
107
+ fo.Detection(
108
+ label='FAR_AWAY_OBJECT',
109
+ bounding_box=[0.389404296875,0.306640625,0.005126953125,0.0146484375],
110
+ )
111
+ ],
112
+ # frame nr 2
113
+ [
114
+ fo.Detection(
115
+ label='FAR_AWAY_OBJECT',
116
+ bounding_box=[0.455078125, 0.31494140625, 0.00390625, 0.0087890625],
117
+ confidence=0.153076171875
118
+ ),
119
+ fo.Detection(
120
+ label='FAR_AWAY_OBJECT',
121
+ bounding_box=[0.35107421875, 0.274658203125, 0.0048828125, 0.009765625], # missed nr 1
122
+ confidence=0.153076171875
123
+ )
124
+ ],
125
+ # frame nr 3
126
+ [
127
+ ],
128
+ ]
129
+ }
130
+ },
131
+ "sequence_list": ["Trip_14_Seq_1", 'Trip_14_Seq_2']
132
+ }
133
+ ```
134
 
135
+ Optionally, you can pass the name of the model that should be evaluated as a string via `model=model_str`. By default, the first model is evaluated, i.e. `model = payload["models"][0]`.
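As a sketch (using the example payload above and a module loaded as shown in the How to Use section):

```
>>> model_str = payload["models"][0]  # default choice; any entry of payload["models"] works
>>> module.add_batch(payload, model=model_str)
>>> result = module.compute()
```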
136
 
137
  ### Output Values
138
  The metric outputs a dictionary that contains sub-dictionaries for each name of the specified area ranges.
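As a sketch of reading the result, the area range names are the keys of the output dictionary, and the metric keys are those shown in the examples below:

```
>>> result = module.compute()
>>> for range_name, metrics in result.items():
...     print(range_name, metrics["precision"], metrics["recall"], metrics["f1"])
```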
 
153
 
154
 
155
  ### Examples
156
+ We can specify different area range levels at which we would like to compute the metrics.
157
  ```
158
+ >>> import evaluate
159
+ >>> from seametrics.fo_to_payload.utils import fo_to_payload
160
  >>> area_ranges_tuples = [
161
  ("all", [0, 1e5 ** 2]),
162
  ("small", [0 ** 2, 6 ** 2]),
163
  ("medium", [6 ** 2, 12 ** 2]),
164
  ("large", [12 ** 2, 1e5 ** 2])
165
  ]
166
+ >>> payload = fo_to_payload(
167
+ dataset=dataset,
168
+ gt_field=gt_field,
169
+ models=model_list
170
+ )
171
+ >>> module = evaluate.load("./detection_metric.py", iou_thresholds=0.9, area_ranges_tuples=area_ranges_tuples)
172
+ >>> module.add_batch(payload)
173
  >>> result = module.compute()
174
  >>> print(result)
175
  {'all':
 
235
  ```
236
 
237
  ## Further References
238
+ *seametrics* library: https://github.com/SEA-AI/seametrics/tree/main
239
+
240
  Calculating metrics is based on pycoco tools: https://github.com/cocodataset/cocoapi/tree/master/PythonAPI/pycocotools
241
 
242
  Further info about metrics: https://www.analyticsvidhya.com/blog/2020/09/precision-recall-machine-learning/
det-metrics.py CHANGED
@@ -13,7 +13,7 @@
13
  # limitations under the License.
14
  """TODO: Add a description here."""
15
 
16
- from typing import List, Tuple, Optional, Literal
17
 
18
  import evaluate
19
  import datasets
@@ -21,7 +21,6 @@ import numpy as np
21
 
22
  from seametrics.detection import PrecisionRecallF1Support
23
 
24
-
25
  _CITATION = """\
26
  @InProceedings{coco:2020,
27
  title = {Microsoft {COCO:} Common Objects in Context},
@@ -82,39 +81,30 @@ Returns:
82
  'fpi': number of images with no ground truth but false positive predictions,
83
  'nImgs': number of images considered in evaluation
84
  Examples:
85
- >>> module = evaluate.load("./detection_metric.py", iou_thresholds=0.9)
86
- >>> predictions = [
87
- [
88
- [10, 15, 20, 25],
89
- [45, 30, 10, 10]
90
- ],[
91
- [14, 25, 6, 6],
92
- [10, 16, 6, 10]
93
- ]
94
- ]
95
- >>> references = [
96
- [[10, 15, 20, 20]],
97
- [[30, 30, 5, 6]]
98
- ]
99
- >>> module.add_batch(predictions=predictions, references=references, predictions_scores=[[0.5,0.3],[0.8, 0.1]])
100
- >>> result = module.compute()
101
- >>> print(result)
102
- {'all': {
103
- 'range': [0, 10000000000.0],
104
- 'iouThr': '0.00',
105
- 'maxDets': 100,
106
- 'tp': 1,
107
- 'fp': 3,
108
- 'fn': 1,
109
- 'duplicates': 0,
110
- 'precision': 0.25,
111
- 'recall': 0.5,
112
- 'f1': 0.3333333333333333,
113
- 'support': 2,
114
- 'fpi': 0,
115
- 'nImgs': 2
116
- }
117
- }
118
  """
119
 
120
 
@@ -164,38 +154,43 @@ class DetectionMetric(evaluate.Metric):
164
 
165
  def add_batch(
166
  self,
167
- predictions,
168
- references,
169
- predictions_labels: Optional[np.ndarray] = None,
170
- predictions_scores: Optional[np.ndarray] = None,
171
- references_labels: Optional[np.ndarray] = None
172
  ):
173
  """Add predictions and ground truths of a single image to update the metric.
174
 
175
  Args:
176
- predictions (List[List[List[int]]]): predicted bounding boxes, shape: (n_images, m_pred_boxes, 4)
177
- references (List[List[List[int]]]): ground truth bounding boxes, shape: (n_images, l_gt_boxes, 4)
178
- predictions_labels (Optional[np.ndarray], optional): Labels of predicted bounding boxes, shape: (n_images, m_pred_boxes).
179
- Defaults to None.
180
- predictions_scores (Optional[np.ndarray], optional): Scores of predicted bounding boxes, shape: (n_images, m_pred_boxes).
181
- Defaults to None.
182
- references_labels (Optional[np.ndarray], optional): Labels of predicted bounding boxes, shape: (n_images, l_pred_boxes).
183
- Defaults to None.
184
  """
185
- if predictions_labels is None:
186
- predictions_labels = [None]*len(predictions)
187
- if predictions_scores is None:
188
- predictions_scores = [None]*len(predictions)
189
- if references_labels is None:
190
- references_labels = [None]*len(references)
191
- for pred, ref, pred_score, pred_l, ref_l in zip(predictions,
192
- references,
193
- predictions_scores,
194
- predictions_labels,
195
- references_labels):
196
- preds, targets = self.process_preds_references(pred, ref, pred_l, pred_score, ref_l)
197
- self.coco_metric.update(preds, targets)
198
- super(evaluate.Metric, self).add_batch(predictions=predictions, references=references)
199
 
200
  def _compute(
201
  self,
@@ -205,42 +200,50 @@ class DetectionMetric(evaluate.Metric):
205
  """Returns the scores"""
206
  result = self.coco_metric.compute()["metrics"]
207
  return result
208
-
209
  @staticmethod
210
- def process_preds_references(
211
- predictions,
212
- references,
213
- predictions_labels: Optional[np.ndarray] = None,
214
- predictions_scores: Optional[np.ndarray] = None,
215
- references_labels: Optional[np.ndarray] = None
216
- ):
217
- if predictions_scores is None:
218
- predictions_scores = np.ones(shape=len(predictions), dtype=np.float32)
219
- else:
220
- predictions_scores = np.array(predictions_scores, dtype=np.float32)
221
- if predictions_labels is None:
222
- if references_labels is not None:
223
- print("Warning: Providing no prediction labels, but ground truth labels!")
224
- predictions_labels = np.zeros(shape=len(predictions), dtype=np.int16)
225
- else:
226
- predictions_labels = np.array(predictions_labels)
227
- if references_labels is None:
228
- references_labels = np.zeros(shape=len(references), dtype=np.int16)
229
- else:
230
- references_labels = np.array(references_labels)
231
-
232
- preds = [
233
- dict(
234
- boxes=np.array(predictions),
235
- scores=predictions_scores,
236
- labels=predictions_labels
237
  )
238
- ]
239
- target = [
 
 
 
240
  dict(
241
- boxes=np.array(references),
242
- labels=references_labels
 
243
  )
244
- ]
245
-
246
- return preds, target
 
13
  # limitations under the License.
14
  """TODO: Add a description here."""
15
 
16
+ from typing import List, Tuple, Dict, Literal
17
 
18
  import evaluate
19
  import datasets
 
21
 
22
  from seametrics.detection import PrecisionRecallF1Support
23
 
 
24
  _CITATION = """\
25
  @InProceedings{coco:2020,
26
  title = {Microsoft {COCO:} Common Objects in Context},
 
81
  'fpi': number of images with no ground truth but false positive predictions,
82
  'nImgs': number of images considered in evaluation
83
  Examples:
84
+ >>> import evaluate
85
+ >>> from seametrics.fo_to_payload.utils import fo_to_payload
86
+ >>> payload = fo_to_payload(..., models=model_list)
87
+ >>> for model in payload["models"]:
88
+ >>> module = evaluate.load("./detection_metric.py", iou_thresholds=0.9)
89
+ >>> module.add_batch(payload, model=model)
90
+ >>> result = module.compute()
91
+ >>> print(result)
92
+ {'all': {
93
+ 'range': [0, 10000000000.0],
94
+ 'iouThr': '0.00',
95
+ 'maxDets': 100,
96
+ 'tp': 1,
97
+ 'fp': 3,
98
+ 'fn': 1,
99
+ 'duplicates': 0,
100
+ 'precision': 0.25,
101
+ 'recall': 0.5,
102
+ 'f1': 0.3333333333333333,
103
+ 'support': 2,
104
+ 'fpi': 0,
105
+ 'nImgs': 2
106
+ }
107
+ }
108
  """
109
 
110
 
 
154
 
155
  def add_batch(
156
  self,
157
+ data: dict,
158
+ model: str = None
 
 
 
159
  ):
160
  """Add predictions and ground truths of a single image to update the metric.
161
 
162
  Args:
163
+ data (dict): standard payload containing the data that should be evaluated;
164
+ format should be as returned by the `fo_to_payload()` function of the seametrics library
165
+ model (str): should be one of the values given in data["models"];
166
+ if not given, defaults to data["models"][0], as only one model can be evaluated at a time.
167
  """
168
+ # populate two empty lists in the format expected by the Hugging Face evaluate base class
169
+ # nothing is computed from them, but passing them prevents an error in the evaluate framework
170
+ predictions, references = [], []
171
+
172
+ if model is None:
173
+ model = data["models"][0]
174
+
175
+ for sequence in data["sequence_list"]:
176
+ seq_data = data["sequences"][sequence]
177
+ gt_normalized = seq_data[data["gt_field_name"]] # shape: (n_frames, m_gts)
178
+ pred_normalized = seq_data[model] # shape: (n_frames, l_preds)
179
+ img_res = seq_data["resolution"] # (h, w)
180
+ for gt_frame, pred_frame in zip(gt_normalized, pred_normalized): # iterate over all frames
181
+ processed_pred = self._fo_dets_to_metrics_dict(pred_frame, w=img_res[1], h=img_res[0])
182
+ processed_gt = self._fo_dets_to_metrics_dict(gt_frame, w=img_res[1], h=img_res[0])
183
+ predictions.append(processed_pred[0]["boxes"].tolist())
184
+ references.append(processed_gt[0]["boxes"].tolist())
185
+
186
+ # update the underlying metric with the data of the current frame
187
+ self.coco_metric.update(processed_pred, processed_gt)
188
+
189
+ # required by the Hugging Face evaluate API; does not affect the metric computation
190
+ super(evaluate.Metric, self).add_batch(
191
+ predictions=predictions,
192
+ references=references
193
+ )
194
 
195
  def _compute(
196
  self,
 
200
  """Returns the scores"""
201
  result = self.coco_metric.compute()["metrics"]
202
  return result
203
+
204
  @staticmethod
205
+ def _fo_dets_to_metrics_dict(fo_dets: list,
206
+ w: int,
207
+ h: int) -> List[Dict[str, np.ndarray]]:
208
+ """Convert list of fiftyone detections to format that is
209
+ required by PrecisionRecallF1Support() function of seametrics library
210
+
211
+ Args:
212
+ fo_dets (list): list of fiftyone detections (or an empty list if the frame has no detections);
213
+ note: bounding boxes in fiftyone detections are in normalized xywhn format
214
+ w (int): image width in pixels
215
+ h (int): image height in pixels
216
+
217
+ Returns:
218
+ List[Dict[str, np.ndarray]]: list holding a single dict with items:
219
+ "boxes": denormalized bounding boxes of the whole frame as a numpy array (shape: (n_bboxes, 4))
220
+ "scores": confidence scores as a numpy array (shape: (n_bboxes,))
221
+ "labels": labels as a numpy array (shape: (n_bboxes,))
222
+ """
223
+ detections = []
224
+ scores = []
225
+ labels = [] #TODO: map to numbers
226
+ if len(fo_dets) == 0:
227
+ return [
228
+ dict(
229
+ boxes=np.array([]),
230
+ scores=np.array([]),
231
+ labels=np.array([])
232
+ )
233
+ ]
234
+ for det in fo_dets:
235
+ bbox = det["bounding_box"]
236
+ detections.append(
237
+ [bbox[0]*w, bbox[1]*h, bbox[2]*w, bbox[3]*h]
238
  )
239
+ scores.append(det["confidence"]) # confidence is None for ground-truth detections
240
+ # labels.append(det["label"])
241
+ labels.append(1)
242
+
243
+ return [
244
  dict(
245
+ boxes=np.array(detections),
246
+ scores=np.array(scores),
247
+ labels=np.array(labels)
248
  )
249
+ ]