Victoria Oberascher committed
Commit 6d18e2a · 1 Parent(s): 5b395e2

add description

Files changed (2)
  1. README.md +60 -21
  2. horizon-metrics.py +92 -56
README.md CHANGED
@@ -1,5 +1,5 @@
 ---
-title: horizonmetrics
+title: horizon-metrics
 tags:
 - evaluate
 - metric
@@ -10,48 +10,87 @@ app_file: app.py
 pinned: false
 ---

-# Metric Card for horizonmetrics
+# Metric Card for horizon-metrics

 **_Module Card Instructions:_** _Fill out the following subsections. Feel free to take a look at existing metric cards if you'd like examples._

-## Metric Description
-
-_Give a brief overview of this metric, including what task(s) it is usually used for, if any._
+## SEA-AI/horizon-metrics
+
+This Hugging Face metric uses `seametrics.horizon.HorizonMetrics` under the hood to calculate the slope and midpoint errors.

 ## How to Use

-_Give general statement of how to use the metric_
-
-_Provide simplest possible example for using the metric_
-
-### Inputs
-
-_List all input arguments in the format below_
-
-- **input_field** _(type): Definition of input, with explanation if necessary. State any default value(s)._
-
-### Output Values
-
-_Explain what this metric outputs and provide an example of what the metric output looks like. Modules should return a dictionary with one or multiple key-value pairs, e.g. {"bleu" : 6.02}_
-
-_State the range of possible values that the metric's output can take, as well as what in that range is considered good. For example: "This metric can take on any value between 0 and 100, inclusive. Higher scores are better."_
-
-#### Values from Popular Papers
-
-_Give examples, preferrably with links to leaderboards or publications, to papers that have reported this metric, along with the values they have reported._
-
-### Examples
-
-_Give code examples of the metric being used. Try to include examples that clear up any potential ambiguity left from the metric description above. If possible, provide a range of examples that show both typical and atypical results, as well as examples where a variety of input parameters are passed._
-
-## Limitations and Bias
-
-_Note any known limitations or biases that the metric has, with links and references if possible._
-
-## Citation
-
-_Cite the source where this metric was introduced._
+To use horizon-metrics, install the dependencies with the pip command below, import the `evaluate` library into your Python environment, and load the SEA-AI/horizon-metrics metric. Make sure the ground-truth and prediction points are correctly formatted before computing the result, then inspect the returned dictionary to assess the performance of your horizon prediction model.
+
+### Getting Started
+
+To get started with horizon-metrics, make sure you have the necessary dependencies installed. This metric relies on the `evaluate` and `seametrics` libraries.
+
+### Installation
+
+```bash
+pip install evaluate git+https://github.com/SEA-AI/seametrics@develop
+```
+
+### Basic Usage
+
+This is how you can quickly evaluate your horizon prediction model using SEA-AI/horizon-metrics:
+
+```python
+import evaluate
+
+ground_truth_points = [[[0.0, 0.5384765625], [1.0, 0.4931640625]],
+                       [[0.0, 0.53796875], [1.0, 0.4928515625]],
+                       [[0.0, 0.5374609375], [1.0, 0.4925390625]],
+                       [[0.0, 0.536953125], [1.0, 0.4922265625]],
+                       [[0.0, 0.5364453125], [1.0, 0.4919140625]]]
+
+prediction_points = [[[0.0, 0.5428930956049597], [1.0, 0.4642497615378973]],
+                     [[0.0, 0.5428930956049597], [1.0, 0.4642497615378973]],
+                     [[0.0, 0.523573113510805], [1.0, 0.47642688648919496]],
+                     [[0.0, 0.5200016849393765], [1.0, 0.4728554579177664]],
+                     [[0.0, 0.523573113510805], [1.0, 0.47642688648919496]]]
+
+module = evaluate.load("SEA-AI/horizon-metrics")
+module.add(predictions=prediction_points, references=ground_truth_points)
+result = module.compute()
+
+print(result)
+```
+
+This outputs the evaluation metrics for your horizon prediction model:
+
+```python
+{
+    'average_slope_error': 0.014823194839790999,
+    'average_midpoint_error': 0.014285714285714301,
+    'stddev_slope_error': 0.01519178791378349,
+    'stddev_midpoint_error': 0.0022661781575342445,
+    'max_slope_error': 0.033526146567062376,
+    'max_midpoint_error': 0.018161272321428612,
+    'num_slope_error_jumps': 1,
+    'num_midpoint_error_jumps': 1
+}
+```
+
+### Output Values
+
+SEA-AI/horizon-metrics provides the following performance metrics for horizon prediction:
+
+- **average_slope_error**: Average difference in slope between the predicted and ground-truth horizon.
+- **average_midpoint_error**: Average difference in midpoint position between the predicted and ground-truth horizon.
+- **stddev_slope_error**: Standard deviation of the slope errors, indicating their variability across frames.
+- **stddev_midpoint_error**: Standard deviation of the midpoint errors, indicating their variability across frames.
+- **max_slope_error**: Maximum difference in slope between the predicted and ground-truth horizon.
+- **max_midpoint_error**: Maximum difference in midpoint position between the predicted and ground-truth horizon.
+- **num_slope_error_jumps**: Number of frames in which the slope error changes by more than a specified threshold relative to the previous frame.
+- **num_midpoint_error_jumps**: Number of frames in which the midpoint error changes by more than a specified threshold relative to the previous frame.

 ## Further References

-_Add any useful further references._
+Explore the [seametrics GitHub repository](https://github.com/SEA-AI/seametrics/tree/main) for more details on the underlying library.
+
+## Contribution
+
+Contributions are welcome! If you'd like to improve SEA-AI/horizon-metrics or add new features, feel free to fork the repository, make your changes, and submit a pull request.
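The `*_error_jumps` values above are described as thresholded frame-to-frame changes, and all errors are derived from the `[[x1, y1], [x2, y2]]` endpoint format. Below is a minimal sketch of one plausible reading of those definitions; `slope_and_midpoint` and `count_jumps` are hypothetical helpers, not part of `seametrics`, whose actual implementation may differ in detail:

```python
import numpy as np

def slope_and_midpoint(points):
    """Derive slope and midpoint height from [[x1, y1], [x2, y2]] endpoints."""
    (x1, y1), (x2, y2) = points
    slope = (y2 - y1) / (x2 - x1)  # rise over run in normalised image coordinates
    midpoint = (y1 + y2) / 2.0     # horizon height at the image centre
    return slope, midpoint

def count_jumps(errors, threshold):
    """Count frames whose error changes by more than `threshold` vs. the previous frame."""
    diffs = np.abs(np.diff(np.asarray(errors, dtype=float)))
    return int(np.sum(diffs > threshold))

# Per-frame slope / midpoint error for one made-up frame:
gt = [[0.0, 0.52], [1.0, 0.48]]
pred = [[0.0, 0.55], [1.0, 0.45]]
gt_slope, gt_mid = slope_and_midpoint(gt)
pr_slope, pr_mid = slope_and_midpoint(pred)
print(abs(pr_slope - gt_slope), abs(pr_mid - gt_mid))

# Jump count over a sequence of made-up per-frame slope errors:
print(count_jumps([0.001, 0.002, 0.035, 0.034, 0.002], threshold=0.01))  # -> 2
```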
horizon-metrics.py CHANGED
@@ -11,65 +11,109 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-"""TODO: Add a description here."""

 import evaluate
 import datasets
-import numpy as np

 from seametrics.horizon.utils import *

-# TODO: Add BibTeX citation
 _CITATION = """\
 @InProceedings{huggingface:module,
-  title = {A great new module},
+  title = {Horizon Metrics},
   authors={huggingface, Inc.},
-  year={2020}
+  year={2024}
 }
 """

 # TODO: Add description of the module here
 _DESCRIPTION = """\
-This new module is designed to solve this great ML task and is crafted with a lot of care.
-"""
+This metric is intended to calculate horizon prediction metrics."""

 # TODO: Add description of the arguments of the module here
 _KWARGS_DESCRIPTION = """
 Calculates how good are predictions given some references, using certain scores
 Args:
-    predictions: list of predictions to score. Each predictions
-        should be a string with tokens separated by spaces.
-    references: list of reference for each prediction. Each
-        reference should be a string with tokens separated by spaces.
-Returns:
-    accuracy: description of the first score,
-    another_score: description of the second score,
-Examples:
-    Examples should be written in doctest format, and should illustrate how
-    to use the function.
-
-    >>> my_new_module = evaluate.load("my_new_module")
-    >>> results = my_new_module.compute(references=[0, 1], predictions=[0, 1])
-    >>> print(results)
-    {'accuracy': 1.0}
-"""
+    predictions: list of predictions for each image. Each prediction
+        should be a nested array like this:
+        - [[x1, y1], [x2, y2]]
+
+    references: list of references for each image. Each reference
+        should be a nested array like this:
+        - [[x1, y1], [x2, y2]]
+Returns:
+    dict containing the following metrics:
+        'average_slope_error': average difference in slope between the predicted and ground-truth horizon.
+        'average_midpoint_error': average difference in midpoint position between the predicted and ground-truth horizon.
+        'stddev_slope_error': standard deviation of the slope errors.
+        'stddev_midpoint_error': standard deviation of the midpoint errors.
+        'max_slope_error': maximum difference in slope between the predicted and ground-truth horizon.
+        'max_midpoint_error': maximum difference in midpoint position between the predicted and ground-truth horizon.
+        'num_slope_error_jumps': number of frames in which the slope error changes by more than a specified threshold relative to the previous frame.
+        'num_midpoint_error_jumps': number of frames in which the midpoint error changes by more than a specified threshold relative to the previous frame.
+
+Examples:
+    >>> ground_truth_points = [[[0.0, 0.5384765625], [1.0, 0.4931640625]],
+    ...                        [[0.0, 0.53796875], [1.0, 0.4928515625]],
+    ...                        [[0.0, 0.5374609375], [1.0, 0.4925390625]],
+    ...                        [[0.0, 0.536953125], [1.0, 0.4922265625]],
+    ...                        [[0.0, 0.5364453125], [1.0, 0.4919140625]]]
+    >>> prediction_points = [[[0.0, 0.5428930956049597], [1.0, 0.4642497615378973]],
+    ...                      [[0.0, 0.5428930956049597], [1.0, 0.4642497615378973]],
+    ...                      [[0.0, 0.523573113510805], [1.0, 0.47642688648919496]],
+    ...                      [[0.0, 0.5200016849393765], [1.0, 0.4728554579177664]],
+    ...                      [[0.0, 0.523573113510805], [1.0, 0.47642688648919496]]]
+    >>> module = evaluate.load("SEA-AI/horizon-metrics")
+    >>> module.add(predictions=prediction_points, references=ground_truth_points)
+    >>> module.compute()
+    {'average_slope_error': 0.014823194839790999,
+     'average_midpoint_error': 0.014285714285714301,
+     'stddev_slope_error': 0.01519178791378349,
+     'stddev_midpoint_error': 0.0022661781575342445,
+     'max_slope_error': 0.033526146567062376,
+     'max_midpoint_error': 0.018161272321428612,
+     'num_slope_error_jumps': 1,
+     'num_midpoint_error_jumps': 1}
+"""


 @evaluate.utils.file_utils.add_start_docstrings(_DESCRIPTION,
                                                 _KWARGS_DESCRIPTION)
 class HorizonMetrics(evaluate.Metric):
-    """TODO: Short description of my evaluation module."""
+    """
+    HorizonMetrics is a metric class that calculates horizon prediction metrics.
+
+    Args:
+        roll_threshold (float, optional): Threshold for the roll angle. Defaults to 0.5.
+        pitch_threshold (float, optional): Threshold for the pitch angle. Defaults to 0.1.
+        vertical_fov_degrees (float, optional): Vertical field of view in degrees. Defaults to 25.6.
+        **kwargs: Additional keyword arguments.
+
+    Attributes:
+        slope_threshold (float): Slope threshold derived from the roll threshold.
+        midpoint_threshold (float): Midpoint threshold derived from the pitch threshold.
+        predictions (list): List of predicted horizons.
+        ground_truth_det (list): List of ground truth horizons.
+        slope_error_list (list): List of slope errors.
+        midpoint_error_list (list): List of midpoint errors.
+
+    Methods:
+        _info(): Returns the metric information.
+        add(predictions, references, **kwargs): Updates the predictions and ground truth detections.
+        _compute(predictions, references, **kwargs): Computes the horizon error across the sequence.
+    """

     def __init__(self,
                  roll_threshold=0.5,
                  pitch_threshold=0.1,
                  vertical_fov_degrees=25.6,
                  **kwargs):
-        super().__init__(**kwargs)

+        super().__init__(**kwargs)
         self.slope_threshold = roll_to_slope(roll_threshold)
         self.midpoint_threshold = pitch_to_midpoint(pitch_threshold,
                                                     vertical_fov_degrees)
@@ -79,7 +123,12 @@ class HorizonMetrics(evaluate.Metric):
         self.midpoint_error_list = None

     def _info(self):
-        # TODO: Specifies the evaluate.EvaluationModuleInfo object
+        """
+        Returns the metric information.
+
+        Returns:
+            MetricInfo: The metric information.
+        """
         return evaluate.MetricInfo(
             # This is the description that will appear on the modules page.
             module_type="metric",
@@ -88,30 +137,25 @@ class HorizonMetrics(evaluate.Metric):
             inputs_description=_KWARGS_DESCRIPTION,
             # This defines the format of each prediction and reference
             features=datasets.Features({
-                'predictions': datasets.Value('int64'),
-                'references': datasets.Value('int64'),
+                'predictions':
+                datasets.Sequence(datasets.Value("float")),
+                'references':
+                datasets.Sequence(datasets.Value("float")),
             }),
-            # Homepage of the module for documentation
-            homepage="http://module.homepage",
-            # Additional links to the codebase or references
-            codebase_urls=["http://github.com/path/to/codebase/of/new_module"],
-            reference_urls=["http://path.to.reference.url/new_module"])
+            codebase_urls=["http://github.com/path/to/codebase/of/new_module"])

     def add(self, *, predictions, references, **kwargs):
         """
-        Update the predictions and ground truth detections.
-
-        Parameters
-        ----------
-        predictions : list
-            List of predicted horizons.
-        ground_truth_det : list
-            List of ground truth horizons.
-
+        Updates the predictions and ground truth detections.
+
+        Parameters:
+            predictions (list): List of predicted horizons.
+            references (list): List of ground truth horizons.
+            **kwargs: Additional keyword arguments.
         """
-
-        # does not impact the metric, but is required for the interface x_x
-        super(evaluate.Metric, self).add(prediction=0, references=0, **kwargs)
+        super(evaluate.Metric, self).add(prediction=predictions,
+                                         references=references,
+                                         **kwargs)

         self.predictions = predictions
         self.ground_truth_det = references
@@ -127,19 +171,11 @@ class HorizonMetrics(evaluate.Metric):

     def _compute(self, *, predictions, references, **kwargs):
         """
-        Compute the horizon error across the sequence.
-
-        Returns
-        -------
-        float
-            The computed horizon error.
-
+        Computes the horizon error across the sequence.
+
+        Returns:
+            float: The computed horizon error.
         """
         return calculate_horizon_error_across_sequence(
             self.slope_error_list, self.midpoint_error_list,
             self.slope_threshold, self.midpoint_threshold)
-
-    def _download_and_prepare(self, dl_manager):
-        """Optional: download external resources useful to compute the scores"""
-        # TODO: Download external resources if needed
-        pass
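For completeness, a small usage sketch with non-default thresholds. It assumes `evaluate.load` forwards extra keyword arguments to the metric constructor (standard `evaluate` behaviour); the point lists and threshold values are made up for illustration:

```python
import evaluate

# Two-frame toy sequence; each frame's horizon is given as [[x1, y1], [x2, y2]].
ground_truth_points = [[[0.0, 0.52], [1.0, 0.48]],
                       [[0.0, 0.52], [1.0, 0.48]]]
prediction_points = [[[0.0, 0.55], [1.0, 0.45]],
                     [[0.0, 0.54], [1.0, 0.46]]]

# Extra keyword arguments are passed to HorizonMetrics.__init__, where they are
# converted internally via roll_to_slope / pitch_to_midpoint before jump counting.
module = evaluate.load(
    "SEA-AI/horizon-metrics",
    roll_threshold=1.0,         # illustrative value
    pitch_threshold=0.2,        # illustrative value
    vertical_fov_degrees=25.6,  # camera's vertical field of view
)

module.add(predictions=prediction_points, references=ground_truth_points)
print(module.compute())
```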