About

About IndieLabel

The IndieLabel deployment on this Hugging Face Space is a collaboration between researchers at the Stanford HCI Group and ARVA (AI Risk and Vulnerability Alliance). The IndieLabel tool is a research prototype designed to empower everyday users to lead large-scale investigations of harmful algorithmic behavior, and this Space is intended as a public-facing deployment of this prototype.

IndieLabel & the End-User Audits paper

The IndieLabel tool was first introduced as a research prototype in the paper End-User Audits: A System Empowering Communities to Lead Large-Scale Investigations of Harmful Algorithmic Behavior, published at CSCW 2022. The full paper is available on the ACM Digital Library or via Stanford HCI.

The key question underlying the End-User Audits paper is: how do we enable everyday users to lead large-scale algorithm audits? IndieLabel uses a personalized recommender model to help users rapidly identify where they disagree with a system's behavior.

Our paper abstract summarizes our core motivation, approach, and findings: "Because algorithm audits are conducted by technical experts, audits are necessarily limited to the hypotheses that experts think to test. End users hold the promise to expand this purview, as they inhabit spaces and witness algorithmic impacts that auditors do not. In pursuit of this goal, we propose end-user audits—system-scale audits led by non-technical users—and present an approach that scaffolds end users in hypothesis generation, evidence identification, and results communication. Today, performing a system-scale audit requires substantial user effort to label thousands of system outputs, so we introduce a collaborative filtering technique that leverages the algorithmic system's own disaggregated training data to project from a small number of end user labels onto the full test set. Our end-user auditing tool, IndieLabel, employs these predicted labels so that users can rapidly explore where their opinions diverge from the algorithmic system's outputs. By highlighting topic areas where the system is under-performing for the user and surfacing sets of likely error cases, the tool guides the user in authoring an audit report. In an evaluation of end-user audits on a popular comment toxicity model with 17 non-technical participants, participants both replicated issues that formal audits had previously identified and also raised previously underreported issues such as under-flagging on veiled forms of hate that perpetuate stigma and over-flagging of slurs that have been reclaimed by marginalized communities."
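To make the idea in the abstract concrete, here is a minimal sketch of projecting a handful of user labels onto unlabeled items and then surfacing likely disagreements with a system's scores. It is a simple user-based collaborative filter written for illustration, not the model used in the paper or in IndieLabel. The function names (predict_user_labels, likely_disagreements), the 0-4 rating scale, the similarity weighting, and all of the data are assumptions made up for this example.

```python
def predict_user_labels(annotator_ratings, user_labels):
    """Estimate the user's rating for every item from a few labels.

    annotator_ratings: {annotator_id: {item_id: rating}} (hypothetical data)
    user_labels:       {item_id: rating} provided by the end user
    """
    # Weight each annotator by how closely they agree with the user
    # on the items both have rated (assumed weighting: 1 / (1 + mean abs diff)).
    weights = {}
    for annotator, ratings in annotator_ratings.items():
        shared = [item for item in user_labels if item in ratings]
        if not shared:
            continue
        mean_abs_diff = sum(
            abs(ratings[item] - user_labels[item]) for item in shared
        ) / len(shared)
        weights[annotator] = 1.0 / (1.0 + mean_abs_diff)

    all_items = {item for ratings in annotator_ratings.values() for item in ratings}
    predictions = {}
    for item in all_items:
        if item in user_labels:
            # Keep the user's own label where one exists.
            predictions[item] = float(user_labels[item])
            continue
        weighted_sum = 0.0
        total_weight = 0.0
        for annotator, weight in weights.items():
            if item in annotator_ratings[annotator]:
                weighted_sum += weight * annotator_ratings[annotator][item]
                total_weight += weight
        if total_weight > 0:
            predictions[item] = weighted_sum / total_weight
    return predictions


def likely_disagreements(predictions, system_scores, threshold=1.0):
    """Return items where the predicted user rating differs from the
    system score by at least `threshold`, largest gap first."""
    gaps = {
        item: abs(predictions[item] - system_scores[item])
        for item in predictions
        if item in system_scores
    }
    flagged = [item for item, gap in gaps.items() if gap >= threshold]
    return sorted(flagged, key=lambda item: -gaps[item])


# Hypothetical toy data: two annotators, four comments, ratings from 0 to 4.
annotator_ratings = {
    "a1": {"c1": 4, "c2": 0, "c3": 4, "c4": 0},
    "a2": {"c1": 0, "c2": 0, "c3": 0, "c4": 4},
}
user_labels = {"c1": 4, "c2": 0}
system_scores = {"c1": 4, "c2": 0, "c3": 1, "c4": 1}

predictions = predict_user_labels(annotator_ratings, user_labels)
flagged = likely_disagreements(predictions, system_scores)
print(flagged)
```

In this toy setup the user agrees with annotator a1 on both labeled comments, so a1's ratings dominate the predictions for the unlabeled ones. A real deployment would need a learned model and far more annotator data than this sketch assumes.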

The End-User Audits work was led by Stanford PhD student Michelle Lam along with co-authors Mitchell Gordon, Danaë Metaxa, Jeffrey Hancock, James Landay, and Michael Bernstein.

Michelle Lam
PhD Student, Stanford CS

Mitchell Gordon
Incoming Asst Professor, MIT EECS

Danaë Metaxa
Asst Professor, UPenn CIS

Jeffrey Hancock
Professor, Stanford Comm

James Landay
Professor, Stanford CS

Michael Bernstein
Assoc Professor, Stanford CS

ARVA

To fill in: information about ARVA and AVID.

The Team

The IndieLabel Hugging Face deployment was made possible by a wonderful team of volunteers who worked on adapting IndieLabel for use by a general audience, connecting its reports to AVID (AI Vulnerability Database), and deploying it on Hugging Face Spaces. The team includes:

Michelle Lam

Carol Anderson

Christina Pan

Nathan Butters