metadata

app_file: app.py
colorFrom: yellow
colorTo: green
description: 'TODO: add a description here'
emoji: 🤑
pinned: false
runme:
  id: 01HPS3ASFJXVQR88985QNSXVN1
  version: v3
sdk: gradio
sdk_version: 4.36.0
tags:
  - evaluate
  - metric
title: user-friendly-metrics

How to Use

import evaluate
from seametrics.payload.processor import PayloadProcessor

payload = PayloadProcessor(
    dataset_name="SENTRY_VIDEOS_DATASET_QA",
    gt_field="ground_truth_det_fused_id",
    models=["ahoy_IR_b2_engine_3_7_0_757_g8765b007_oversea"],
    sequence_list=["Sentry_2023_02_08_PROACT_CELADON_@6m_MOB_2023_02_08_14_41_51"],
    # tags=["GT_ID_FUSION"],
    tracking_mode=True
).payload

module = evaluate.load("SEA-AI/user-friendly-metrics")
res = module._compute(payload, max_iou=0.5, recognition_thresholds=[0.3, 0.5, 0.8])
print(res)

{
    "ahoy_IR_b2_engine_3_6_0_49_gd81d3b63_oversea": {
        "overall": {
            "all": {
                "f1": 0.15967351103175614,
                "fn": 2923.0,
                "fp": 3666.0,
                "num_gt_ids": 10,
                "precision": 0.14585274930102515,
                "recall": 0.1763877148492533,
                "recognition_0.3": 0.1,
                "recognition_0.5": 0.1,
                "recognition_0.8": 0.1,
                "recognized_0.3": 1,
                "recognized_0.5": 1,
                "recognized_0.8": 1,
                "tp": 626.0
            }
        },
        "per_sequence": {
            "Sentry_2023_02_08_PROACT_CELADON_@6m_MOB_2023_02_08_12_51_49": {
                "all": {
                    "f1": 0.15967351103175614,
                    "fn": 2923.0,
                    "fp": 3666.0,
                    "num_gt_ids": 10,
                    "precision": 0.14585274930102515,
                    "recall": 0.1763877148492533,
                    "recognition_0.3": 0.1,
                    "recognition_0.5": 0.1,
                    "recognition_0.8": 0.1,
                    "recognized_0.3": 1,
                    "recognized_0.5": 1,
                    "recognized_0.8": 1,
                    "tp": 626.0
                }
            }
        }
    }
}
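
The nested layout above (model, then "overall" or "per_sequence", then "all") can be read with ordinary dictionary access. The following is a minimal sketch on a trimmed stand-in dictionary; the model and sequence names, the rounded values, and the helper `overall_metric` are illustrative assumptions and not part of the module's API.

```python
# Trimmed stand-in for the result dictionary shown above.
# Names and rounded values are hypothetical placeholders.
res = {
    "model_a": {
        "overall": {"all": {"f1": 0.16, "precision": 0.15, "recall": 0.18}},
        "per_sequence": {
            "sequence_1": {"all": {"f1": 0.16, "precision": 0.15, "recall": 0.18}},
        },
    }
}


def overall_metric(results, metric):
    """Map each model name to one of its overall metric values."""
    return {model: entry["overall"]["all"][metric] for model, entry in results.items()}


f1_by_model = overall_metric(res, "f1")
print(f1_by_model)
```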

Metric Settings

The max_iou parameter controls which pairs of ground-truth and predicted bounding boxes are eligible for association. The default value is 0.5: with this default, a predicted bounding box whose IoU with a ground-truth box is less than 0.5 is not considered for association with that box. The higher the max_iou value, the more predicted bounding boxes are considered for association.
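
As a minimal sketch of the gating described for the default setting, the snippet below computes the IoU of two boxes and keeps a pair only if the IoU reaches 0.5. It assumes boxes in (x, y, width, height) format; the helpers `iou` and `is_candidate` are hypothetical and are not the module's or motmetrics' implementation.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x, y, width, height) boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the overlapping rectangle (zero if disjoint).
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def is_candidate(gt_box, pred_box, iou_threshold=0.5):
    """Hypothetical gate: keep the pair only if its IoU reaches the threshold."""
    return iou(gt_box, pred_box) >= iou_threshold


gt = (0, 0, 10, 10)
close_pred = (1, 1, 10, 10)  # large overlap with gt
far_pred = (8, 8, 10, 10)    # small overlap with gt

print(is_candidate(gt, close_pred), is_candidate(gt, far_pred))
```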

Output

The output is a dictionary containing the following metrics:

Name	Description
recall	Number of detections over number of objects.
precision	Number of detected objects over sum of detected and false positives.
f1	F1 score
num_gt_ids	Number of unique objects in the ground truth
fn	Number of false negatives
fp	Number of false positives
tp	Number of true positives
recognized_th	Number of unique ground-truth objects that were seen more than a fraction th of the time (e.g. th = 0.5 means more than 50% of their appearances)
recognition_th	recognized_th divided by the number of unique objects in the ground truth (num_gt_ids)
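
The precision, recall and F1 values follow from the tp, fp and fn counts through the standard definitions. The sketch below applies those textbook formulas to the counts from the example output (tp = 626, fp = 3666, fn = 2923) and gives roughly 0.146, 0.176 and 0.160. The function name `detection_scores` is an assumption for illustration; the module itself obtains these values through motmetrics.

```python
def detection_scores(tp, fp, fn):
    """Standard precision, recall and F1 from detection counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Counts taken from the example output in "How to Use".
scores = detection_scores(tp=626, fp=3666, fn=2923)
print(scores)
```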

How it Works

We leverage one of the internal variables of the motmetrics MOTAccumulator class, events, which keeps track of detection hits and misses. These values are then processed by the track_ratios function, which computes, for each unique object ID, the ratio of assigned appearances to total appearances. We then define the recognition function, which counts how many objects have been seen more often than the desired threshold.
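
The ratio-then-threshold logic can be illustrated with plain dictionaries. This is a simplified sketch, not the module's code: the two functions below reuse the names track_ratios and recognition for readability but take hypothetical per-object counts instead of the motmetrics events table, and all numbers are made up.

```python
def track_ratios(assigned, total):
    """Ratio of assigned appearances to total appearances per object ID."""
    return {
        obj_id: assigned.get(obj_id, 0) / count
        for obj_id, count in total.items()
        if count > 0
    }


def recognition(ratios, threshold):
    """Count the objects whose ratio is strictly above the threshold."""
    return sum(1 for ratio in ratios.values() if ratio > threshold)


# Hypothetical counts: frames in which each ground-truth object appears,
# and frames in which it was matched to a prediction.
total = {1: 10, 2: 20, 3: 8}
assigned = {1: 9, 2: 8}  # object 3 was never matched

ratios = track_ratios(assigned, total)

summary = {}
for th in [0.3, 0.5, 0.8]:
    recognized = recognition(ratios, th)
    summary[f"recognized_{th}"] = recognized
    summary[f"recognition_{th}"] = recognized / len(ratios)

print(summary)
```

With these counts, two of the three objects pass the 0.3 threshold and only one passes 0.5 and 0.8.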

W&B logging

Calling module.wandb() logs the user-friendly metrics values to Weights & Biases (W&B). The W&B key is stored as a Secret in this repository.

Params

wandb_project (str, optional): Name of the W&B project. Defaults to 'user_freindly_metrics'
log_plots (bool, optional): Generates categorized bar charts for global metrics. Defaults to True
debug (bool, optional): Logs everything to the console and to the W&B Logs page. Defaults to False

import evaluate
import logging
from seametrics.payload.processor import PayloadProcessor

logging.basicConfig(level=logging.WARNING)

# Configure your dataset and model details
payload = PayloadProcessor(
    dataset_name="SENTRY_VIDEOS_DATASET_QA",
    gt_field="ground_truth_det_fused_id",
    models=["ahoy_IR_b2_engine_3_7_0_757_g8765b007_oversea"],
    sequence_list=["Sentry_2023_02_08_PROACT_CELADON_@6m_MOB_2023_02_08_14_41_51"],
    tracking_mode=True
).payload


# Evaluate using SEA-AI/user-friendly-metrics
module = evaluate.load("SEA-AI/user-friendly-metrics")
res = module._compute(payload, max_iou=0.5, recognition_thresholds=[0.3, 0.5, 0.8])

module.wandb(res, log_plots=True, debug=True)

If log_plots is True, the W&B logging function generates four bar plots:
- User_Friendly Metrics (mostly_tracked_score_%), mainly for non-developer users
- User_Friendly Metrics (mostly_tracked_count_%), for developers
- Evaluation Metrics (F1, precision, recall)
- Prediction Summary (false negatives, false positives, true positives)
If debug is True, the function logs the global metrics plus the per-sequence evaluation metrics in descending order of F1 score under the Logs section of the run page.
If both log_plots and debug are False, the function logs the metrics to the Summary.
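
The ordering used in debug mode (per-sequence metrics in descending order of F1 score) can be reproduced on a result dictionary as sketched below. This is only an illustration of the ordering, not the module's logging code; the helper `rank_sequences_by_f1`, the sequence names and the F1 values are hypothetical.

```python
def rank_sequences_by_f1(per_sequence):
    """Return (sequence name, F1) pairs sorted from highest to lowest F1."""
    pairs = ((name, metrics["all"]["f1"]) for name, metrics in per_sequence.items())
    return sorted(pairs, key=lambda item: item[1], reverse=True)


# Hypothetical "per_sequence" entry of one model.
per_sequence = {
    "seq_a": {"all": {"f1": 0.42}},
    "seq_b": {"all": {"f1": 0.87}},
    "seq_c": {"all": {"f1": 0.10}},
}

ranking = rank_sequences_by_f1(per_sequence)
for name, f1 in ranking:
    print(f"{name}: {f1:.2f}")
```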

Citations

@article{milan2016mot16,
title={MOT16: A benchmark for multi-object tracking},
author={Milan, Anton and Leal-Taix{\'e}, Laura and Reid, Ian and Roth, Stefan and Schindler, Konrad},
journal={arXiv preprint arXiv:1603.00831},
year={2016}}

Further References

GitHub repository: py-motmetrics