Victoria Oberascher committed
Commit 6d18e2a · 1 Parent(s): 5b395e2

add description

Files changed (2)
  1. README.md +60 -21
  2. horizon-metrics.py +92 -56
README.md CHANGED
@@ -1,5 +1,5 @@
 ---
-title: horizonmetrics
+title: horizon-metrics
 tags:
 - evaluate
 - metric
@@ -10,48 +10,87 @@ app_file: app.py
 pinned: false
 ---

-# Metric Card for horizonmetrics
+# Metric Card for horizon-metrics

 **_Module Card Instructions:_** _Fill out the following subsections. Feel free to take a look at existing metric cards if you'd like examples._

-## Metric Description
-
-_Give a brief overview of this metric, including what task(s) it is usually used for, if any._
+## SEA-AI/horizon-metrics
+
+This Hugging Face metric uses `seametrics.horizon.HorizonMetrics` under the hood to calculate the slope and midpoint errors.

 ## How to Use

-_Give general statement of how to use the metric_
-
-_Provide simplest possible example for using the metric_
-
-### Inputs
-
-_List all input arguments in the format below_
-
-- **input_field** _(type): Definition of input, with explanation if necessary. State any default value(s)._
-
-### Output Values
-
-_Explain what this metric outputs and provide an example of what the metric output looks like. Modules should return a dictionary with one or multiple key-value pairs, e.g. {"bleu" : 6.02}_
-
-_State the range of possible values that the metric's output can take, as well as what in that range is considered good. For example: "This metric can take on any value between 0 and 100, inclusive. Higher scores are better."_
-
-#### Values from Popular Papers
-
-_Give examples, preferrably with links to leaderboards or publications, to papers that have reported this metric, along with the values they have reported._
-
-### Examples
-
-_Give code examples of the metric being used. Try to include examples that clear up any potential ambiguity left from the metric description above. If possible, provide a range of examples that show both typical and atypical results, as well as examples where a variety of input parameters are passed._
-
-## Limitations and Bias
-
-_Note any known limitations or biases that the metric has, with links and references if possible._
-
-## Citation
-
-_Cite the source where this metric was introduced._
+To use horizon-metrics, install the dependencies with the pip command below, import the `evaluate` library into your Python environment, and load the SEA-AI/horizon-metrics metric. Make sure the ground-truth and prediction points are correctly formatted before computing the result, then inspect the returned dictionary to assess the performance of your horizon prediction model.
+
+### Getting Started
+
+To get started with horizon-metrics, make sure you have the necessary dependencies installed. This metric relies on the `evaluate` and `seametrics` libraries.
+
+### Installation
+
+```bash
+pip install evaluate git+https://github.com/SEA-AI/seametrics@develop
+```
+
+### Basic Usage
+
+This is how you can quickly evaluate your horizon prediction model using SEA-AI/horizon-metrics:
+
+```python
+import evaluate
+
+ground_truth_points = [[[0.0, 0.5384765625], [1.0, 0.4931640625]],
+                       [[0.0, 0.53796875], [1.0, 0.4928515625]],
+                       [[0.0, 0.5374609375], [1.0, 0.4925390625]],
+                       [[0.0, 0.536953125], [1.0, 0.4922265625]],
+                       [[0.0, 0.5364453125], [1.0, 0.4919140625]]]
+
+prediction_points = [[[0.0, 0.5428930956049597], [1.0, 0.4642497615378973]],
+                     [[0.0, 0.5428930956049597], [1.0, 0.4642497615378973]],
+                     [[0.0, 0.523573113510805], [1.0, 0.47642688648919496]],
+                     [[0.0, 0.5200016849393765], [1.0, 0.4728554579177664]],
+                     [[0.0, 0.523573113510805], [1.0, 0.47642688648919496]]]
+
+module = evaluate.load("SEA-AI/horizon-metrics")
+module.add(predictions=prediction_points, references=ground_truth_points)
+result = module.compute()
+
+print(result)
+```
+
+This outputs the evaluation metrics for your horizon prediction model:
+
+```python
+{
+    'average_slope_error': 0.014823194839790999,
+    'average_midpoint_error': 0.014285714285714301,
+    'stddev_slope_error': 0.01519178791378349,
+    'stddev_midpoint_error': 0.0022661781575342445,
+    'max_slope_error': 0.033526146567062376,
+    'max_midpoint_error': 0.018161272321428612,
+    'num_slope_error_jumps': 1,
+    'num_midpoint_error_jumps': 1
+}
+```
+
+### Output Values
+
+SEA-AI/horizon-metrics provides the following performance metrics for horizon prediction:
+
+- **average_slope_error**: Average difference in slope between the predicted and ground-truth horizon.
+- **average_midpoint_error**: Average difference in midpoint position between the predicted and ground-truth horizon.
+- **stddev_slope_error**: Standard deviation of the slope errors, indicating their variability across frames.
+- **stddev_midpoint_error**: Standard deviation of the midpoint errors, indicating their variability across frames.
+- **max_slope_error**: Maximum difference in slope between the predicted and ground-truth horizon.
+- **max_midpoint_error**: Maximum difference in midpoint position between the predicted and ground-truth horizon.
+- **num_slope_error_jumps**: Number of frames in which the slope error changes by more than a specified threshold relative to the previous frame.
+- **num_midpoint_error_jumps**: Number of frames in which the midpoint error changes by more than a specified threshold relative to the previous frame.

 ## Further References

-_Add any useful further references._
+Explore the [seametrics GitHub repository](https://github.com/SEA-AI/seametrics/tree/main) for more details on the underlying library.
+
+## Contribution
+
+Contributions are welcome! If you'd like to improve SEA-AI/horizon-metrics or add new features, feel free to fork the repository, make your changes, and submit a pull request.
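The `*_error_jumps` values above are described as thresholded frame-to-frame changes, and all errors are derived from the `[[x1, y1], [x2, y2]]` endpoint format. Below is a minimal sketch of one plausible reading of those definitions; `slope_and_midpoint` and `count_jumps` are hypothetical helpers, not part of `seametrics`, whose actual implementation may differ in detail:

```python
import numpy as np

def slope_and_midpoint(points):
    """Derive slope and midpoint height from [[x1, y1], [x2, y2]] endpoints."""
    (x1, y1), (x2, y2) = points
    slope = (y2 - y1) / (x2 - x1)  # rise over run in normalised image coordinates
    midpoint = (y1 + y2) / 2.0     # horizon height at the image centre
    return slope, midpoint

def count_jumps(errors, threshold):
    """Count frames whose error changes by more than `threshold` vs. the previous frame."""
    diffs = np.abs(np.diff(np.asarray(errors, dtype=float)))
    return int(np.sum(diffs > threshold))

# Per-frame slope / midpoint error for one made-up frame:
gt = [[0.0, 0.52], [1.0, 0.48]]
pred = [[0.0, 0.55], [1.0, 0.45]]
gt_slope, gt_mid = slope_and_midpoint(gt)
pr_slope, pr_mid = slope_and_midpoint(pred)
print(abs(pr_slope - gt_slope), abs(pr_mid - gt_mid))

# Jump count over a sequence of made-up per-frame slope errors:
print(count_jumps([0.001, 0.002, 0.035, 0.034, 0.002], threshold=0.01))  # -> 2
```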
horizon-metrics.py CHANGED
@@ -11,65 +11,109 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-"""TODO: Add a description here."""

 import evaluate
 import datasets
-import numpy as np

 from seametrics.horizon.utils import *

-# TODO: Add BibTeX citation
 _CITATION = """\
 @InProceedings{huggingface:module,
-  title = {A great new module},
+  title = {Horizon Metrics},
   authors={huggingface, Inc.},
-  year={2020}
+  year={2024}
 }
 """

 # TODO: Add description of the module here
 _DESCRIPTION = """\
-This new module is designed to solve this great ML task and is crafted with a lot of care.
-"""
+This metric is intended to calculate horizon prediction metrics."""

 # TODO: Add description of the arguments of the module here
 _KWARGS_DESCRIPTION = """
 Calculates how good are predictions given some references, using certain scores
 Args:
-    predictions: list of predictions to score. Each predictions
-        should be a string with tokens separated by spaces.
-    references: list of reference for each prediction. Each
-        reference should be a string with tokens separated by spaces.
-Returns:
-    accuracy: description of the first score,
-    another_score: description of the second score,
-Examples:
-    Examples should be written in doctest format, and should illustrate how
-    to use the function.
-
-    >>> my_new_module = evaluate.load("my_new_module")
-    >>> results = my_new_module.compute(references=[0, 1], predictions=[0, 1])
-    >>> print(results)
-    {'accuracy': 1.0}
-"""
+    predictions: list of predictions for each image. Each prediction
+        should be a nested array like this:
+        - [[x1, y1], [x2, y2]]
+
+    references: list of references for each image. Each reference
+        should be a nested array like this:
+        - [[x1, y1], [x2, y2]]
+Returns:
+    dict containing the following metrics:
+        'average_slope_error': average difference in slope between the predicted and ground-truth horizon.
+        'average_midpoint_error': average difference in midpoint position between the predicted and ground-truth horizon.
+        'stddev_slope_error': standard deviation of the slope errors.
+        'stddev_midpoint_error': standard deviation of the midpoint errors.
+        'max_slope_error': maximum difference in slope between the predicted and ground-truth horizon.
+        'max_midpoint_error': maximum difference in midpoint position between the predicted and ground-truth horizon.
+        'num_slope_error_jumps': number of frames in which the slope error changes by more than a specified threshold relative to the previous frame.
+        'num_midpoint_error_jumps': number of frames in which the midpoint error changes by more than a specified threshold relative to the previous frame.
+
+Examples:
+    >>> ground_truth_points = [[[0.0, 0.5384765625], [1.0, 0.4931640625]],
+    ...                        [[0.0, 0.53796875], [1.0, 0.4928515625]],
+    ...                        [[0.0, 0.5374609375], [1.0, 0.4925390625]],
+    ...                        [[0.0, 0.536953125], [1.0, 0.4922265625]],
+    ...                        [[0.0, 0.5364453125], [1.0, 0.4919140625]]]
+    >>> prediction_points = [[[0.0, 0.5428930956049597], [1.0, 0.4642497615378973]],
+    ...                      [[0.0, 0.5428930956049597], [1.0, 0.4642497615378973]],
+    ...                      [[0.0, 0.523573113510805], [1.0, 0.47642688648919496]],
+    ...                      [[0.0, 0.5200016849393765], [1.0, 0.4728554579177664]],
+    ...                      [[0.0, 0.523573113510805], [1.0, 0.47642688648919496]]]
+    >>> module = evaluate.load("SEA-AI/horizon-metrics")
+    >>> module.add(predictions=prediction_points, references=ground_truth_points)
+    >>> module.compute()
+    {'average_slope_error': 0.014823194839790999,
+     'average_midpoint_error': 0.014285714285714301,
+     'stddev_slope_error': 0.01519178791378349,
+     'stddev_midpoint_error': 0.0022661781575342445,
+     'max_slope_error': 0.033526146567062376,
+     'max_midpoint_error': 0.018161272321428612,
+     'num_slope_error_jumps': 1,
+     'num_midpoint_error_jumps': 1}
+"""


 @evaluate.utils.file_utils.add_start_docstrings(_DESCRIPTION,
                                                 _KWARGS_DESCRIPTION)
 class HorizonMetrics(evaluate.Metric):
-    """TODO: Short description of my evaluation module."""
+    """
+    HorizonMetrics is a metric class that calculates horizon prediction metrics.
+
+    Args:
+        roll_threshold (float, optional): Threshold for the roll angle. Defaults to 0.5.
+        pitch_threshold (float, optional): Threshold for the pitch angle. Defaults to 0.1.
+        vertical_fov_degrees (float, optional): Vertical field of view in degrees. Defaults to 25.6.
+        **kwargs: Additional keyword arguments.
+
+    Attributes:
+        slope_threshold (float): Slope threshold derived from the roll threshold.
+        midpoint_threshold (float): Midpoint threshold derived from the pitch threshold.
+        predictions (list): List of predicted horizons.
+        ground_truth_det (list): List of ground truth horizons.
+        slope_error_list (list): List of slope errors.
+        midpoint_error_list (list): List of midpoint errors.
+
+    Methods:
+        _info(): Returns the metric information.
+        add(predictions, references, **kwargs): Updates the predictions and ground truth detections.
+        _compute(predictions, references, **kwargs): Computes the horizon error across the sequence.
+    """

     def __init__(self,
                  roll_threshold=0.5,
                  pitch_threshold=0.1,
                  vertical_fov_degrees=25.6,
                  **kwargs):
-        super().__init__(**kwargs)

+        super().__init__(**kwargs)
         self.slope_threshold = roll_to_slope(roll_threshold)
         self.midpoint_threshold = pitch_to_midpoint(pitch_threshold,
                                                     vertical_fov_degrees)
@@ -79,7 +123,12 @@ class HorizonMetrics(evaluate.Metric):
         self.midpoint_error_list = None

     def _info(self):
-        # TODO: Specifies the evaluate.EvaluationModuleInfo object
+        """
+        Returns the metric information.
+
+        Returns:
+            MetricInfo: The metric information.
+        """
         return evaluate.MetricInfo(
             # This is the description that will appear on the modules page.
             module_type="metric",
@@ -88,30 +137,25 @@ class HorizonMetrics(evaluate.Metric):
             inputs_description=_KWARGS_DESCRIPTION,
             # This defines the format of each prediction and reference
             features=datasets.Features({
-                'predictions': datasets.Value('int64'),
-                'references': datasets.Value('int64'),
+                'predictions':
+                datasets.Sequence(datasets.Value("float")),
+                'references':
+                datasets.Sequence(datasets.Value("float")),
             }),
-            # Homepage of the module for documentation
-            homepage="http://module.homepage",
-            # Additional links to the codebase or references
-            codebase_urls=["http://github.com/path/to/codebase/of/new_module"],
-            reference_urls=["http://path.to.reference.url/new_module"])
+            codebase_urls=["http://github.com/path/to/codebase/of/new_module"])

     def add(self, *, predictions, references, **kwargs):
         """
-        Update the predictions and ground truth detections.
-
-        Parameters
-        ----------
-        predictions : list
-            List of predicted horizons.
-        ground_truth_det : list
-            List of ground truth horizons.
-
+        Updates the predictions and ground truth detections.
+
+        Parameters:
+            predictions (list): List of predicted horizons.
+            references (list): List of ground truth horizons.
+            **kwargs: Additional keyword arguments.
         """
-
-        # does not impact the metric, but is required for the interface x_x
-        super(evaluate.Metric, self).add(prediction=0, references=0, **kwargs)
+        super(evaluate.Metric, self).add(prediction=predictions,
+                                         references=references,
+                                         **kwargs)

         self.predictions = predictions
         self.ground_truth_det = references
@@ -127,19 +171,11 @@ class HorizonMetrics(evaluate.Metric):

     def _compute(self, *, predictions, references, **kwargs):
         """
-        Compute the horizon error across the sequence.
-
-        Returns
-        -------
-        float
-            The computed horizon error.
-
+        Computes the horizon error across the sequence.
+
+        Returns:
+            float: The computed horizon error.
         """
         return calculate_horizon_error_across_sequence(
             self.slope_error_list, self.midpoint_error_list,
             self.slope_threshold, self.midpoint_threshold)
-
-    def _download_and_prepare(self, dl_manager):
-        """Optional: download external resources useful to compute the scores"""
-        # TODO: Download external resources if needed
-        pass
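For completeness, a small usage sketch with non-default thresholds. It assumes `evaluate.load` forwards extra keyword arguments to the metric constructor (standard `evaluate` behaviour); the point lists and threshold values are made up for illustration:

```python
import evaluate

# Two-frame toy sequence; each frame's horizon is given as [[x1, y1], [x2, y2]].
ground_truth_points = [[[0.0, 0.52], [1.0, 0.48]],
                       [[0.0, 0.52], [1.0, 0.48]]]
prediction_points = [[[0.0, 0.55], [1.0, 0.45]],
                     [[0.0, 0.54], [1.0, 0.46]]]

# Extra keyword arguments are passed to HorizonMetrics.__init__, where they are
# converted internally via roll_to_slope / pitch_to_midpoint before jump counting.
module = evaluate.load(
    "SEA-AI/horizon-metrics",
    roll_threshold=1.0,         # illustrative value
    pitch_threshold=0.2,        # illustrative value
    vertical_fov_degrees=25.6,  # camera's vertical field of view
)

module.add(predictions=prediction_points, references=ground_truth_points)
print(module.compute())
```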