|
--- |
|
title: Directional Bias Amplification |
|
emoji: 🌴 |
|
colorFrom: purple |
|
colorTo: blue |
|
sdk: gradio |
|
sdk_version: 3.0.12 |
|
app_file: app.py |
|
pinned: false |
|
tags: |
|
- evaluate |
|
- metric |
|
description: >- |
|
Directional Bias Amplification is a metric that captures the amount of bias (i.e., a conditional probability) that is amplified. This metric was introduced in the ICML 2021 paper ["Directional Bias Amplification"](https://arxiv.org/abs/2102.12594) for fairness evaluation. |
|
--- |
|
|
|
# Metric Card for Directional Bias Amplification |
|
|
|
## Metric Description |
|
Directional Bias Amplification is a metric that captures the amount of bias (i.e., a conditional probability) that is amplified. This metric was introduced in the ICML 2021 paper ["Directional Bias Amplification"](https://arxiv.org/abs/2102.12594) for fairness evaluation. |
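For orientation, the attribute-to-task (A→T) variant from the paper can be sketched as follows (see the paper for the exact definition and the T→A variant):

$$ \text{BiasAmp}_{A \rightarrow T} = \frac{1}{|A||T|} \sum_{a \in A,\, t \in T} y_{at}\,\Delta_{at} + (1 - y_{at})(-\Delta_{at}) $$

where $y_{at} = \mathbb{1}\left[ P(A_a = 1, T_t = 1) > P(A_a = 1)\,P(T_t = 1) \right]$ indicates whether task $t$ is positively correlated with attribute $a$ in the ground truth, and $\Delta_{at} = P(\hat{T}_t = 1 \mid A_a = 1) - P(T_t = 1 \mid A_a = 1)$ measures how the model's conditional prediction rate departs from the data.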
|
|
|
## How to Use |
|
This metric operates in multi-label (including binary) classification settings where each sample has one or more associated sensitive attributes. |
|
This metric requires three sets of inputs: |
|
- Predictions representing the model output on the task (predictions) |
|
- Ground-truth labels on the task (references) |
|
- Ground-truth labels on the sensitive attribute of interest (attributes) |
|
|
|
### Inputs |
|
- **predictions** (`array` of `int`): Predicted task labels. Array of size n x |T|. n is number of samples, |T| is number of task labels. All values are binary 0 or 1. |
|
- **references** (`array` of `int`): Ground truth task labels. Array of size n x |T|. n is number of samples, |T| is number of task labels. All values are binary 0 or 1. |
|
- **attributes** (`array` of `int`): Ground truth attribute labels. Array of size n x |A|. n is number of samples, |A| is number of attribute labels. All values are binary 0 or 1. |
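If your raw data stores an integer group id per sample rather than binary indicator columns, the required `n x |A|` layout can be produced with a one-hot encoding. A minimal sketch (the variable names and data here are illustrative, not part of the metric's API):

```python
import numpy as np

# Hypothetical raw data: a binary task label and a group id per sample.
task = [0, 1, 1, 0]   # ground-truth task label per sample
group = [1, 1, 0, 0]  # sensitive-attribute group id per sample

# references (and predictions) must be n x |T|; here |T| = 1.
references = np.array(task).reshape(-1, 1)

# attributes must be n x |A| with binary values; one-hot encode the
# group ids by indexing rows of the |A| x |A| identity matrix.
attributes = np.eye(2, dtype=int)[group]
```

Here `attributes` becomes `[[0, 1], [0, 1], [1, 0], [1, 0]]`, matching the binary layout described above.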
|
|
|
### Output Values |
|
- **bias_amplification** (`float`): Bias amplification value. Minimum possible value is 0, and maximum possible value is 1.0. The higher the value, the more bias is amplified. |
|
- **disagg_bias_amplification** (`array` of `float`): Array of size (number of unique attribute label values) x (number of unique task label values). Each array value represents the bias amplification of that particular task given that particular attribute. |
|
|
|
### Examples |
|
|
|
Imagine a scenario with 3 individuals in Group A and 5 individuals in Group B. Task label `1` is biased because 2 of the 3 individuals in Group A have it, whereas only 1 of the 5 individuals in Group B does. The model amplifies this bias, and predicts all members of Group A to have task label `1`, and no members of Group B to have task label `1`. |
|
|
|
```python |
|
>>> import evaluate |
>>> bias_amp_metric = evaluate.load("directional_bias_amplification") |
>>> results = bias_amp_metric.compute( |
...     references=[[0], [1], [1], [0], [0], [0], [0], [1]], |
...     predictions=[[1], [1], [1], [0], [0], [0], [0], [0]], |
...     attributes=[[0, 1], [0, 1], [0, 1], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0]], |
... ) |
|
>>> print(results) |
|
{'bias_amplification': 0.2667, 'disagg_bias_amplification': [[0.2], [0.3333]]} |
|
``` |
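To make the numbers above concrete, the A→T computation can be sketched in a few lines of NumPy. This is an illustrative re-derivation of the quantity for this example, not the library's implementation:

```python
import numpy as np

def biasamp_a_to_t(predictions, references, attributes):
    """Sketch of A->T directional bias amplification (Wang & Russakovsky, 2021).
    Illustrative only; not the reference implementation."""
    preds = np.asarray(predictions, dtype=float)   # n x |T|
    refs = np.asarray(references, dtype=float)     # n x |T|
    attrs = np.asarray(attributes, dtype=float)    # n x |A|
    n = refs.shape[0]

    # Joint and marginal probabilities estimated from the ground truth.
    p_at = attrs.T @ refs / n                      # |A| x |T|: P(A=a, T=t)
    p_a = attrs.mean(axis=0)[:, None]              # |A| x 1:   P(A=a)
    p_t = refs.mean(axis=0)[None, :]               # 1 x |T|:   P(T=t)

    # y_at: is task t positively correlated with attribute a in the data?
    y = (p_at > p_a @ p_t).astype(float)

    # Delta_at: change in P(T=t | A=a) from ground truth to predictions.
    group_sizes = attrs.sum(axis=0)[:, None]
    delta = attrs.T @ preds / group_sizes - attrs.T @ refs / group_sizes

    disagg = y * delta + (1 - y) * (-delta)        # per-(a, t) amplification
    return disagg.mean(), disagg

value, disagg = biasamp_a_to_t(
    predictions=[[1], [1], [1], [0], [0], [0], [0], [0]],
    references=[[0], [1], [1], [0], [0], [0], [0], [1]],
    attributes=[[0, 1], [0, 1], [0, 1], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0]],
)
```

Run on the inputs above, `value` rounds to `0.2667` and `disagg` to `[[0.2], [0.3333]]`, matching the example output.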
|
|
|
## Limitations and Bias |
|
A strong assumption made by this metric is that ground-truth labels exist, are known, and are agreed upon. Further, a perfectly accurate model achieves zero bias amplification yet still perpetuates whatever biases exist in the data. |
|
|
|
Please refer to Sec. 5.3 "Limitations of Bias Amplification" of ["Directional Bias Amplification"](https://arxiv.org/abs/2102.12594) for a more detailed discussion. |
|
|
|
## Citation(s) |
|
``` |
|
@inproceedings{wang2021biasamp, |
|
author = {Angelina Wang and Olga Russakovsky}, |
|
title = {Directional Bias Amplification}, |
|
booktitle = {International Conference on Machine Learning (ICML)}, |
|
year = {2021} |
|
} |
|
``` |
|
|
|
## Further References |
|
|