When to Speak, When to Abstain: Contrastive Decoding with Abstention
Abstract
Large Language Models (LLMs) demonstrate exceptional performance across diverse tasks by leveraging both pre-trained knowledge (i.e., parametric knowledge) and external knowledge (i.e., contextual knowledge). While substantial efforts have been made to exploit both forms of knowledge, scenarios in which the model lacks any relevant knowledge remain underexplored. Such gaps can result in hallucination, reducing reliability and posing risks in high-stakes applications. To address this limitation, this paper extends the task scope to encompass cases where the user's request cannot be fulfilled due to the lack of relevant knowledge. To this end, we introduce Contrastive Decoding with Abstention (CDA), a training-free decoding method that empowers LLMs to generate responses when relevant knowledge is available and to abstain otherwise. CDA evaluates the relevance of each knowledge source to a given query, adaptively determining which knowledge to prioritize and which to ignore entirely. Extensive experiments with four LLMs on three question-answering datasets demonstrate that CDA can effectively perform accurate generation and abstention simultaneously. These findings highlight CDA's potential to broaden the applicability of LLMs, enhancing reliability and preserving user trust.
Community
A contrastive decoding method that enables language models to abstain when necessary.
We consider two types of knowledge—parametric and contextual—and measure their relevance to a given query.
If any knowledge source is relevant to the query, the model attends to it while decoding; otherwise, the model abstains rather than hallucinate an answer.
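The summary above maps naturally onto a two-stage decoding loop: score each knowledge source, drop the irrelevant ones, and either decode from the surviving sources or abstain. The Python sketch below illustrates that flow under stated assumptions; it is not the paper's implementation. The relevance proxy (query log-likelihood given the knowledge), the threshold `tau`, the relevance-weighted mixture of next-token distributions, the abstention message, and the placeholder model name are all illustrative stand-ins for CDA's actual contrastive formulation and abstention criterion.

```python
# Minimal sketch of "decode when relevant knowledge exists, abstain otherwise".
# All scoring and mixing details are assumptions, not the paper's exact method.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "meta-llama/Llama-2-7b-hf"  # placeholder model choice
tok = AutoTokenizer.from_pretrained(MODEL_NAME)
lm = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
lm.eval()

ABSTAIN = "I don't have enough knowledge to answer this question."

@torch.no_grad()
def relevance(query: str, knowledge: str) -> float:
    """Assumed relevance proxy: mean log-likelihood of the query tokens given the knowledge."""
    ctx = tok(knowledge, return_tensors="pt").input_ids
    full = tok(knowledge + "\n" + query, return_tensors="pt").input_ids
    logits = lm(full).logits[0, :-1]          # predictions for tokens 1..L-1
    targets = full[0, 1:]
    logp = torch.log_softmax(logits, dim=-1).gather(1, targets.unsqueeze(1)).squeeze(1)
    start = max(ctx.shape[1] - 1, 0)          # score only the query portion
    return logp[start:].mean().item()

@torch.no_grad()
def cda_generate(query: str, contexts: list[str], tau: float = -4.0, max_new: int = 64) -> str:
    # Candidate knowledge sources: the bare query (parametric knowledge) plus
    # each retrieved passage (contextual knowledge).
    sources = [""] + contexts
    scores = [relevance(query, s) for s in sources]
    kept = [(s, r) for s, r in zip(sources, scores) if r >= tau]
    if not kept:                              # no relevant knowledge -> abstain
        return ABSTAIN
    # Relevance-weighted mixture of next-token distributions (a stand-in for
    # the paper's contrastive combination).
    weights = torch.softmax(torch.tensor([r for _, r in kept]), dim=0)
    prompts = [tok((s + "\n" if s else "") + query, return_tensors="pt").input_ids for s, _ in kept]
    out = []
    for _ in range(max_new):
        probs = sum(w * torch.softmax(lm(p).logits[0, -1], dim=-1)
                    for w, p in zip(weights, prompts))
        nxt = int(torch.argmax(probs))        # greedy decoding for simplicity
        if nxt == tok.eos_token_id:
            break
        out.append(nxt)
        prompts = [torch.cat([p, torch.tensor([[nxt]])], dim=1) for p in prompts]
    return tok.decode(out, skip_special_tokens=True)
```

Greedy decoding and full forward passes at every step keep the sketch short; a practical implementation would cache key/value states and use the paper's contrastive weighting rather than a plain mixture.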
This is an automated message from the Librarian Bot. The following similar papers were recommended by the Semantic Scholar API:
- Augmenting Multimodal LLMs with Self-Reflective Tokens for Knowledge-based Visual Question Answering (2024)
- Delve into Visual Contrastive Decoding for Hallucination Mitigation of Large Vision-Language Models (2024)
- DecoPrompt : Decoding Prompts Reduces Hallucinations when Large Language Models Meet False Premises (2024)
- CATCH: Complementary Adaptive Token-level Contrastive Decoding to Mitigate Hallucinations in LVLMs (2024)
- From Uncertainty to Trust: Enhancing Reliability in Vision-Language Models with Uncertainty-Guided Dropout Decoding (2024)
- LLM Hallucination Reasoning with Zero-shot Knowledge Test (2024)
- Layer Importance and Hallucination Analysis in Large Language Models via Enhanced Activation Variance-Sparsity (2024)