About

IndieLabel Tutorial

Check out this 5-minute video tutorial for a quick overview of the IndieLabel tool. If you prefer a written document, you can also find a help article here.

Note: IndieLabel is a research prototype. It is primarily designed for educational purposes, to demonstrate our end-user auditing method applied to a real-world system.

About IndieLabel

The IndieLabel deployment on this Hugging Face Space is a collaboration between researchers at the Stanford HCI Group and ARVA (AI Risk and Vulnerability Alliance). The IndieLabel tool is a research prototype designed to empower everyday users to lead large-scale investigations of harmful algorithmic behavior, and this Space serves as a public-facing deployment of that prototype.

IndieLabel & the End-User Audits paper

The IndieLabel tool was initially presented as a research prototype in an academic publication called End-User Audits: A System Empowering Communities to Lead Large-Scale Investigations of Harmful Algorithmic Behavior. With the help of a personalized recommender model, IndieLabel helps users rapidly identify where they disagree with a system's behavior.

The key question underlying the End-User Audits paper is: how do we enable everyday users to lead large-scale algorithm audits? The full paper was presented at CSCW 2022 and is available on the ACM Digital Library or via Stanford HCI. The End-User Audits paper abstract summarizes our core motivation, approach, and findings:

"Because algorithm audits are conducted by technical experts, audits are necessarily limited to the hypotheses that experts think to test. End users hold the promise to expand this purview, as they inhabit spaces and witness algorithmic impacts that auditors do not. In pursuit of this goal, we propose end-user audits—system-scale audits led by non-technical users—and present an approach that scaffolds end users in hypothesis generation, evidence identification, and results communication. Today, performing a system-scale audit requires substantial user effort to label thousands of system outputs, so we introduce a collaborative filtering technique that leverages the algorithmic system's own disaggregated training data to project from a small number of end user labels onto the full test set. Our end-user auditing tool, IndieLabel, employs these predicted labels so that users can rapidly explore where their opinions diverge from the algorithmic system's outputs. By highlighting topic areas where the system is under-performing for the user and surfacing sets of likely error cases, the tool guides the user in authoring an audit report. In an evaluation of end-user audits on a popular comment toxicity model with 17 non-technical participants, participants both replicated issues that formal audits had previously identified and also raised previously underreported issues such as under-flagging on veiled forms of hate that perpetuate stigma and over-flagging of slurs that have been reclaimed by marginalized communities."

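To make the label-projection technique described in the abstract concrete, the sketch below shows one way a handful of end-user labels could be pooled with a system's disaggregated per-rater training data and fed to an off-the-shelf collaborative filtering model. This is an illustrative sketch only: the toy data, the column names, and the choice of the surprise library's SVD recommender are assumptions made for this example, not a description of IndieLabel's actual implementation.

# Illustrative sketch: project a few end-user labels onto all comments with
# collaborative filtering. The data and the surprise/SVD choice are assumptions
# for this example, not IndieLabel's actual pipeline.
import pandas as pd
from surprise import Dataset, Reader, SVD

# Disaggregated training data: one row per (rater, comment, toxicity label).
ratings = pd.DataFrame({
    "rater_id":   ["r1", "r1", "r2", "r2", "r3", "r3"],
    "comment_id": ["c1", "c2", "c1", "c3", "c2", "c3"],
    "toxicity":   [0, 3, 1, 4, 2, 4],
})

# A small number of labels provided by the auditing end user ("me").
my_labels = pd.DataFrame({
    "rater_id":   ["me", "me"],
    "comment_id": ["c1", "c2"],
    "toxicity":   [0, 4],
})

# Fit the collaborative filtering model on the existing raters plus the new user.
reader = Reader(rating_scale=(0, 4))
data = Dataset.load_from_df(
    pd.concat([ratings, my_labels])[["rater_id", "comment_id", "toxicity"]],
    reader,
)
model = SVD(n_factors=8)
model.fit(data.build_full_trainset())

# Predict the end user's label for every comment in the set.
for comment_id in ["c1", "c2", "c3"]:
    print(comment_id, round(model.predict("me", comment_id).est, 2))

In the full tool, per-comment predictions like these are compared against the toxicity model's own scores to highlight topic areas where the system appears to under- or over-flag for that particular user.
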
The End-User Audits work was led by Stanford PhD student Michelle Lam along with co-authors Mitchell Gordon, Danaë Metaxa, Jeffrey Hancock, James Landay, and Michael Bernstein.

Michelle Lam
PhD Candidate, Stanford CS

Mitchell Gordon
Incoming Asst Professor, MIT EECS

Danaë Metaxa
Asst Professor, UPenn CIS

Jeffrey Hancock
Professor, Stanford Comm

James Landay
Professor, Stanford CS

Michael Bernstein
Assoc Professor, Stanford CS

ARVA

The AI Risk and Vulnerability Alliance (ARVA) is a nonprofit organization focused on making AI safer for everyone. Our mission is to empower communities to recognize, diagnose, and manage vulnerabilities in the AI that affects them. Our flagship project, the AI Vulnerability Database (AVID), is an open-source knowledge base of failure modes for AI models, datasets, and systems.

The Team

The IndieLabel Hugging Face deployment was made possible by a wonderful team of volunteers who worked on adapting IndieLabel for use by a general audience, connecting its reports to AVID (AI Vulnerability Database), and deploying it on Hugging Face Spaces. The team includes:

Michelle Lam
Michelle Lam is a PhD Candidate at Stanford University in the HCI Group. Her research focuses on building systems that empower everyday users to surface their expertise to design and evaluate AI systems.

Carol Anderson
Carol Anderson is a data scientist and machine learning practitioner with expertise in natural language processing (NLP), biological data, and AI ethics. She serves as AVID’s machine learning lead.

Christina A. Pan
Christina A. Pan started her career building machine learning (ML) models at Google, which inspired her passion for design thinking and AI ethics.

Nathan Butters
Nathan Butters is a product manager in the Office of Ethical and Humane Use at Salesforce. He is a cofounder of the AI Risk and Vulnerability Alliance (ARVA).

Borhane Blili-Hamelin
Borhane Blili-Hamelin is an ethicist, researcher, and AI risk management consultant. He is an officer at the AI Risk and Vulnerability Alliance (ARVA), an affiliate at Data & Society, and a senior consultant at BABL AI.