---
title: Expected Calibration Error (ECE)
emoji: 🧮
colorFrom: yellow
colorTo: blue
tags:
- evaluate
- metric
description: Expected Calibration Error (ECE)
sdk: gradio
sdk_version: 5.16.0
app_file: app.py
pinned: false
---

# Metric Card for the Expected Calibration Error (ECE)

## Metric Description

This metric computes the expected calibration error (ECE). ECE evaluates how well a model is calibrated, i.e. how closely its output probabilities match the actual ground truth distribution. It measures the $L^p$ norm of the difference between a model’s posterior and the true likelihood of being correct.
This module directly calls the [torchmetrics package implementation](https://torchmetrics.readthedocs.io/en/stable/classification/calibration_error.html), so its arguments can be passed through unchanged.
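
As a rough guide to what the norm choice means, the binned estimators are commonly written as below. The symbols are assumptions introduced here for illustration: $B$ is the number of confidence bins, $n_b$ the number of samples in bin $b$, $N$ the total number of samples, and $\mathrm{acc}(b)$ and $\mathrm{conf}(b)$ the accuracy and mean confidence within bin $b$. Check the torchmetrics documentation for the exact definitions it implements.

$$
\mathrm{ECE}_{L^1} = \sum_{b=1}^{B} \frac{n_b}{N} \, \bigl| \mathrm{acc}(b) - \mathrm{conf}(b) \bigr|
$$

$$
\mathrm{ECE}_{L^2} = \sqrt{\sum_{b=1}^{B} \frac{n_b}{N} \, \bigl( \mathrm{acc}(b) - \mathrm{conf}(b) \bigr)^2}
$$

$$
\mathrm{ECE}_{\max} = \max_{b \in \{1, \dots, B\}} \bigl| \mathrm{acc}(b) - \mathrm{conf}(b) \bigr|
$$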

## How to Use

### Inputs

- **predictions** *(float32)*: predicted probabilities (after softmax), with shape (N, C) for multiclass or (N, ...) for binary;
- **references** *(int64)*: reference label for each prediction, with shape (N, ...);
- **kwargs**: additional arguments passed to the [calibration error](https://torchmetrics.readthedocs.io/en/stable/classification/calibration_error.html) method, such as `num_classes`, `n_bins` and `norm`.

### Output Values

ECE as a float number.

### Examples

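```python
import evaluate
import numpy as np

ece = evaluate.load("Natooz/ece")
results = ece.compute(
    # Predicted class probabilities (after softmax), shape (N, C)
    predictions=np.array([[0.25, 0.20, 0.55],
                          [0.55, 0.05, 0.40],
                          [0.10, 0.30, 0.60],
                          [0.90, 0.05, 0.05]]),
    # Ground-truth class indices, shape (N,)
    references=np.array([0, 1, 2, 0]),
    num_classes=3,
    n_bins=3,
    norm="l1",
)
print(results)
```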
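
For intuition about what a binned L1 calibration error measures, here is a minimal pure-Python sketch. It is not the torchmetrics implementation: the helper name `binned_ece_l1` is hypothetical, and the equal-width binning with half-open intervals is an assumption that may treat bin boundaries differently from the library.

```python
def binned_ece_l1(probs, labels, n_bins):
    """Binned L1 calibration error over top-label confidences (illustrative sketch)."""
    n = len(labels)
    conf_sum = [0.0] * n_bins   # sum of confidences per bin
    hits = [0] * n_bins         # number of correct predictions per bin
    counts = [0] * n_bins       # number of samples per bin

    for row, label in zip(probs, labels):
        conf = max(row)                 # top-label confidence
        pred = row.index(conf)          # predicted class index
        # Assumed binning: equal-width bins [k/n_bins, (k+1)/n_bins),
        # with confidence 1.0 placed in the last bin.
        b = min(int(conf * n_bins), n_bins - 1)
        conf_sum[b] += conf
        hits[b] += int(pred == label)
        counts[b] += 1

    ece = 0.0
    for b in range(n_bins):
        if counts[b] == 0:
            continue                    # empty bins contribute nothing
        accuracy = hits[b] / counts[b]
        mean_conf = conf_sum[b] / counts[b]
        ece += (counts[b] / n) * abs(accuracy - mean_conf)
    return ece


# Toy data: four samples, three classes.
probs = [[0.25, 0.20, 0.55],
         [0.55, 0.05, 0.40],
         [0.10, 0.30, 0.60],
         [0.90, 0.05, 0.05]]
labels = [0, 1, 2, 0]

ece_value = binned_ece_l1(probs, labels, n_bins=3)
print(ece_value)  # roughly 0.2 for this toy data
```

With three bins, three samples fall in the middle bin (mean confidence about 0.567, accuracy 1/3) and one in the top bin (confidence 0.9, accuracy 1), giving 0.75 × 0.233 + 0.25 × 0.1 ≈ 0.2.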

## Citation

```bibtex
@InProceedings{pmlr-v70-guo17a,
      title = 	 {On Calibration of Modern Neural Networks},
      author =       {Chuan Guo and Geoff Pleiss and Yu Sun and Kilian Q. Weinberger},
      booktitle = 	 {Proceedings of the 34th International Conference on Machine Learning},
      pages = 	 {1321--1330},
      year = 	 {2017},
      editor = 	 {Precup, Doina and Teh, Yee Whye},
      volume = 	 {70},
      series = 	 {Proceedings of Machine Learning Research},
      month = 	 {06--11 Aug},
      publisher =    {PMLR},
      pdf = 	 {http://proceedings.mlr.press/v70/guo17a/guo17a.pdf},
      url = 	 {https://proceedings.mlr.press/v70/guo17a.html},
}

```

```bibtex
@inproceedings{NEURIPS2019_f8c0c968,
     author = {Kumar, Ananya and Liang, Percy S and Ma, Tengyu},
     booktitle = {Advances in Neural Information Processing Systems},
     editor = {H. Wallach and H. Larochelle and A. Beygelzimer and F. d\textquotesingle Alch\'{e}-Buc and E. Fox and R. Garnett},
     publisher = {Curran Associates, Inc.},
     title = {Verified Uncertainty Calibration},
     url = {https://papers.nips.cc/paper_files/paper/2019/hash/f8c0c968632845cd133308b1a494967f-Abstract.html},
     volume = {32},
     year = {2019}
}
```

```bibtex
@InProceedings{Nixon_2019_CVPR_Workshops,
    author = {Nixon, Jeremy and Dusenberry, Michael W. and Zhang, Linchuan and Jerfel, Ghassen and Tran, Dustin},
    title = {Measuring Calibration in Deep Learning},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month = {June},
    year = {2019},
    url = {https://openaccess.thecvf.com/content_CVPRW_2019/html/Uncertainty_and_Robustness_in_Deep_Visual_Learning/Nixon_Measuring_Calibration_in_Deep_Learning_CVPRW_2019_paper.html},
}
```