---
title: Directional Bias Amplification
emoji: 🌴
colorFrom: purple
colorTo: blue
sdk: gradio
sdk_version: 3.0.12
app_file: app.py
pinned: false
tags:
- evaluate
- metric
description: >-
Directional Bias Amplification is a metric that captures the amount of bias (i.e., a conditional probability) that is amplified. This metric was introduced in the ICML 2021 paper ["Directional Bias Amplification"](https://arxiv.org/abs/2102.12594) for fairness evaluation.
---
# Metric Card for Directional Bias Amplification
## Metric Description
Directional Bias Amplification is a metric that captures the amount of bias (i.e., a conditional probability) that is amplified. This metric was introduced in the ICML 2021 paper ["Directional Bias Amplification"](https://arxiv.org/abs/2102.12594) for fairness evaluation.
## How to Use
This metric operates on multi-label (including binary) classification settings where each sample has one or more associated sensitive attributes.
This metric requires three sets of inputs:
- Predictions representing the model output on the task (predictions)
- Ground-truth labels on the task (references)
- Ground-truth labels on the sensitive attribute of interest (attributes)
### Inputs
- **predictions** (`array` of `int`): Predicted task labels. Array of size n x |T|. n is number of samples, |T| is number of task labels. All values are binary 0 or 1.
- **references** (`array` of `int`): Ground truth task labels. Array of size n x |T|. n is number of samples, |T| is number of task labels. All values are binary 0 or 1.
- **attributes** (`array` of `int`): Ground truth attribute labels. Array of size n x |A|. n is number of samples, |A| is number of attribute labels. All values are binary 0 or 1.
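As a sketch of the expected shapes, the three inputs for the worked example below can be laid out as NumPy arrays (nested Python lists, as in the example, work as well):

```python
import numpy as np

n = 8          # number of samples
num_tasks = 1  # |T|, number of task labels
num_attrs = 2  # |A|, number of attribute labels (here: one-hot group membership)

# Binary task predictions and ground truth, shape (n, |T|)
predictions = np.array([[1], [1], [1], [0], [0], [0], [0], [0]])
references = np.array([[0], [1], [1], [0], [0], [0], [0], [1]])

# Binary attribute labels, shape (n, |A|); the first 3 samples are Group A
attributes = np.array([[0, 1]] * 3 + [[1, 0]] * 5)

assert predictions.shape == (n, num_tasks)
assert references.shape == (n, num_tasks)
assert attributes.shape == (n, num_attrs)
```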
### Output Values
- **bias_amplification** (`float`): Bias amplification value. Values range from -1.0 to 1.0. The higher the value, the more "bias" is amplified; a negative value indicates the model reduces the bias present in the data.
- **disagg_bias_amplification** (`array` of `float`): Array of size |A| x |T|, where |A| is the number of attribute labels and |T| is the number of task labels. Each entry is the bias amplification for that particular task given that particular attribute.
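As a sanity check on the two outputs, in the example below the overall score is the mean of the disaggregated entries (this assumes a simple average over attribute-task pairs, which matches the reported numbers):

```python
# Disaggregated values as returned in the example below: shape |A| x |T| = 2 x 1
disagg = [[0.2], [0.3333]]

# Averaging over all attribute-task pairs recovers the overall score
overall = sum(v for row in disagg for v in row) / sum(len(row) for row in disagg)
print(overall)  # ~0.2667, the reported bias_amplification
```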
### Examples
Imagine a scenario with 3 individuals in Group A and 5 individuals in Group B. Task label `1` is biased because 2 of the 3 individuals in Group A have it, whereas only 1 of the 5 individuals in Group B do. The model amplifies this bias, and predicts all members of Group A to have task label `1`, and no members of Group B to have task label `1`.
```python
>>> bias_amp_metric = evaluate.load("directional_bias_amplification")
>>> results = bias_amp_metric.compute(references=[[0], [1], [1], [0], [0], [0], [0], [1]], predictions=[[1], [1], [1], [0], [0], [0], [0], [0]], attributes=[[0, 1], [0, 1], [0, 1], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0]])
>>> print(results)
{'bias_amplification': 0.2667, 'disagg_bias_amplification': [[0.2], [0.3333]]}
```
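To see why this example exhibits amplification, the conditional probabilities can be checked by hand (an illustrative sketch, not part of the metric's API):

```python
import numpy as np

references = np.array([0, 1, 1, 0, 0, 0, 0, 1])   # task label per sample
predictions = np.array([1, 1, 1, 0, 0, 0, 0, 0])
group_a = np.array([1, 1, 1, 0, 0, 0, 0, 0], dtype=bool)  # first 3 samples

# Dataset bias: Group A carries the label far more often than Group B
p_true_a = references[group_a].mean()    # 2/3
p_true_b = references[~group_a].mean()   # 1/5

# Model behavior: predictions push these probabilities further apart
p_pred_a = predictions[group_a].mean()   # 3/3 = 1.0
p_pred_b = predictions[~group_a].mean()  # 0/5 = 0.0

# The gap between groups widened, so the dataset bias was amplified
assert p_pred_a - p_pred_b > p_true_a - p_true_b
```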
## Limitations and Bias
A strong assumption made by this metric is that ground truth labels exist, are known, and are agreed upon. Further, a model can achieve zero bias amplification while still perpetuating the biases already present in the data: a perfectly accurate model reproduces the ground truth, biases included.
Please refer to Sec. 5.3 "Limitations of Bias Amplification" of ["Directional Bias Amplification"](https://arxiv.org/abs/2102.12594) for a more detailed discussion.
## Citation(s)
```
@inproceedings{wang2021biasamp,
author = {Angelina Wang and Olga Russakovsky},
title = {Directional Bias Amplification},
booktitle = {International Conference on Machine Learning (ICML)},
year = {2021}
}
```
## Further References