This demo makes use of the English section of the CrowS-Pairs dataset of Névéol et al. (2022), which is adapted from the original version by Nangia et al. (2020).
### References:
[CrowS-Pairs: A Challenge Dataset for Measuring Social Biases in Masked Language Models](https://aclanthology.org/2020.emnlp-main.154) (Nangia et al., EMNLP 2020)

[French CrowS-Pairs: Extending a challenge dataset for measuring social bias in masked language models to a language other than English](https://aclanthology.org/2022.acl-long.583) (Névéol et al., ACL 2022)
### Note: Measuring bias in language models is hard!
Measuring bias in language models is not trivial, and it remains an active area of research.
First of all, what is bias? As you may have noticed, stereotypes vary across languages and cultures.
What is problematic in the USA may not be relevant in the Netherlands; each cultural context requires its own careful evaluation.
Furthermore, defining good ways to measure bias is also difficult.
For example, [Blodgett et al. (2021)](https://aclanthology.org/2021.acl-long.81/) find that typos, nonsensical examples, and other mistakes threaten the validity of CrowS-Pairs, the dataset we show above (these issues are partially addressed by Névéol et al., 2022).