arxiv:2105.03287

Order in the Court: Explainable AI Methods Prone to Disagreement

Published on May 7, 2021

Authors:

Stefan F. Schouten,

Maurits J. R. Bleeker,

Ana Lucic

Abstract

By computing the rank correlation between attention weights and feature-additive explanation methods, previous analyses either invalidate or support the role of attention-based explanations as a faithful and plausible measure of salience. To investigate whether this approach is appropriate, we compare LIME, Integrated Gradients, DeepLIFT, Grad-SHAP, Deep-SHAP, and attention-based explanations, applied to two neural architectures trained on single- and pair-sequence language tasks. In most cases, we find that none of our chosen methods agree. Based on our empirical observations and theoretical objections, we conclude that rank correlation does not measure the quality of feature-additive methods. Practitioners should instead use the numerous and rigorous diagnostic methods proposed by the community.
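
As a rough illustration of the comparison the abstract refers to, the sketch below computes a rank correlation (Kendall's tau, in its simple tau-a form) between two per-token score vectors. The function name `kendall_tau` and the score values are hypothetical and made up for illustration. The abstract does not say which rank-correlation statistic the paper uses or how it preprocesses scores, so this is not the authors' pipeline.

```python
from itertools import combinations


def kendall_tau(a, b):
    """Kendall tau-a between two equal-length score lists.

    Tied pairs count as neither concordant nor discordant.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    concordant = discordant = 0
    for i, j in combinations(range(len(a)), 2):
        # Positive product: both methods order tokens i and j the same way.
        s = (a[i] - a[j]) * (b[i] - b[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n_pairs = len(a) * (len(a) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical per-token scores for one 5-token input (made-up numbers).
attention = [0.05, 0.40, 0.10, 0.30, 0.15]      # e.g. attention weights
attribution = [0.02, 0.35, 0.20, 0.25, -0.10]   # e.g. a feature-additive method

tau = kendall_tau(attention, attribution)
print(f"Kendall tau between the two explanations: {tau:.2f}")
```

A value near 1 means the two methods rank tokens almost identically, and a value near -1 means the rankings are reversed. The paper's conclusion is that such agreement scores should not be read as a measure of the quality of feature-additive methods.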
