The IndieLabel deployment on this Hugging Face Space is a collaboration between researchers at the Stanford HCI Group and ARVA (AI Risk and Vulnerability Alliance). The IndieLabel tool is a research prototype designed to empower everyday users to lead large-scale investigations of harmful algorithmic behavior, and this Space is intended as a public-facing deployment of this prototype.
IndieLabel & the End-User Audits paper
The IndieLabel tool was first introduced as a research prototype in the paper End-User Audits: A System Empowering Communities to Lead Large-Scale Investigations of Harmful Algorithmic Behavior, published at CSCW 2022. The full paper is available on the ACM Digital Library or via Stanford HCI.
The key question underlying the End-User Audits paper is: how do we enable everyday users to lead large-scale algorithm audits? IndieLabel aids users in rapidly identifying where they disagree with a system's behavior with the help of a personalized recommender system model.
Our paper abstract summarizes our core motivation, approach, and findings: "Because algorithm audits are conducted by technical experts, audits are necessarily limited to the hypotheses that experts think to test. End users hold the promise to expand this purview, as they inhabit spaces and witness algorithmic impacts that auditors do not. In pursuit of this goal, we propose end-user audits—system-scale audits led by non-technical users—and present an approach that scaffolds end users in hypothesis generation, evidence identification, and results communication. Today, performing a system-scale audit requires substantial user effort to label thousands of system outputs, so we introduce a collaborative filtering technique that leverages the algorithmic system's own disaggregated training data to project from a small number of end user labels onto the full test set. Our end-user auditing tool, IndieLabel, employs these predicted labels so that users can rapidly explore where their opinions diverge from the algorithmic system's outputs. By highlighting topic areas where the system is under-performing for the user and surfacing sets of likely error cases, the tool guides the user in authoring an audit report. In an evaluation of end-user audits on a popular comment toxicity model with 17 non-technical participants, participants both replicated issues that formal audits had previously identified and also raised previously underreported issues such as under-flagging on veiled forms of hate that perpetuate stigma and over-flagging of slurs that have been reclaimed by marginalized communities."
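To make the collaborative filtering idea in the abstract more concrete, here is a minimal sketch of projecting a few user labels onto a full set of items and then surfacing likely disagreements with a system. It uses a simple nearest-neighbor approach, which is a stand-in chosen for illustration and not necessarily the model IndieLabel uses. The function names, the 0 to 4 rating scale, the toy data, and the disagreement threshold are all assumptions.

```python
import numpy as np


def predict_user_labels(annotator_ratings, user_labels, k=2):
    """Predict a new user's rating for every item from a few labels.

    annotator_ratings: (n_annotators, n_items) array of per-annotator ratings
                       (a stand-in for disaggregated training data).
    user_labels: dict mapping item index -> the user's own rating.
    """
    labeled = np.array(sorted(user_labels))
    user_vec = np.array([user_labels[i] for i in labeled], dtype=float)

    # Mean absolute distance to each annotator, on the labeled items only.
    dists = np.abs(annotator_ratings[:, labeled] - user_vec).mean(axis=1)

    # Keep the k closest annotators and weight them by inverse distance.
    nearest = np.argsort(dists)[:k]
    weights = 1.0 / (dists[nearest] + 1e-6)

    # Weighted average of the neighbors' ratings for every item.
    preds = weights @ annotator_ratings[nearest] / weights.sum()
    # The user's own labels override the predictions.
    preds[labeled] = user_vec
    return preds


def likely_disagreements(preds, system_scores, threshold=1.0):
    """Item indices where the predicted user rating and the system score
    differ by at least `threshold`."""
    return np.flatnonzero(np.abs(preds - system_scores) >= threshold)


# Hypothetical toy data: 4 annotators x 5 comments, toxicity rated 0 to 4.
annotator_ratings = np.array([
    [4, 4, 0, 0, 3],
    [4, 3, 0, 1, 4],
    [0, 1, 4, 4, 0],
    [1, 0, 3, 4, 1],
], dtype=float)
system_scores = np.array([4, 4, 0, 0, 0], dtype=float)

# The user labels only two comments; the rest are predicted.
user_labels = {0: 4, 2: 0}

preds = predict_user_labels(annotator_ratings, user_labels, k=2)
flagged = likely_disagreements(preds, system_scores)
print("Predicted user ratings:", preds)
print("Likely disagreements at item indices:", flagged.tolist())
```

In this toy example the user's two labels match the first two annotators, so their ratings drive the predictions, and the one comment where those annotators rate toxicity much higher than the system is flagged as a likely error case for the user to review.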
The End-User Audits work was led by Stanford PhD student Michelle Lam along with co-authors Mitchell Gordon, Danaë Metaxa, Jeffrey Hancock, James Landay, and Michael Bernstein.
Michelle Lam
PhD Student, Stanford CS
Mitchell Gordon
Incoming Asst Professor, MIT EECS
Danaë Metaxa
Asst Professor, UPenn CIS
Jeffrey Hancock
Professor, Stanford Comm
James Landay
Professor, Stanford CS
Michael Bernstein
Assoc Professor, Stanford CS
ARVA
To fill in: information about ARVA and AVID.
The Team
The IndieLabel Hugging Face deployment was made possible by a wonderful team of volunteers who worked on adapting IndieLabel for use by a general audience, connecting its reports to AVID (AI Vulnerability Database), and deploying it on Hugging Face Spaces. The team includes:
Michelle Lam
Carol Anderson
Christina Pan
Nathan Butters