|
--- |
|
license: cc-by-nc-sa-4.0 |
|
language: |
|
- en |
|
base_model: |
|
- intfloat/e5-base-v2 |
|
tags: |
|
- retrieval |
|
- question answering |
|
--- |
|
|
|
<div align="center"> |
|
<img src="https://github.com/SapienzaNLP/zebra/blob/master/assets/zebra.png?raw=true" width="100" height="100"> |
|
</div> |
|
|
|
<div align="center"> |
|
<h1>ZEBRA: Zero-Shot Example-Based Retrieval Augmentation for Commonsense Question Answering</h1> |
|
</div> |
|
|
|
<div style="display:flex; justify-content: center; align-items: center; flex-direction: row;"> |
|
<a href="https://2024.emnlp.org/"><img src="https://img.shields.io/badge/EMNLP-2024-4b44ce"></a> |
|
<a href="https://arxiv.org/abs/2410.05077"><img src="https://img.shields.io/badge/arXiv-paper-b31b1b.svg"></a> |
|
<a href="https://creativecommons.org/licenses/by-nc-sa/4.0/"><img src="https://img.shields.io/badge/License-CC%20BY--NC--SA%204.0-lightgrey.svg"></a> |
|
<a href="https://huggingface.co/collections/sapienzanlp/zebra-66e3ec50c8ce415ea7572d0e"><img src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Collection-FCD21D"></a> |
|
<a href="https://github.com/SapienzaNLP/zebra"><img src="https://img.shields.io/badge/GitHub-Repo-121013?logo=github&logoColor=white"></a> |
|
</div> |
|
|
|
<div align="center"> A retrieval augmentation framework for zero-shot commonsense question answering with LLMs. </div> |
|
|
|
## 🛠️ Installation |
|
|
|
Install from PyPI: |
|
|
|
```bash |
|
pip install zebra-qa |
|
``` |
|
|
|
Install from source: |
|
|
|
```bash |
|
git clone https://github.com/sapienzanlp/zebra.git |
|
cd zebra |
|
conda create -n zebra python=3.10 |
|
conda activate zebra |
|
pip install -e . |
|
``` |
|
|
|
## 🚀 Quick Start |
|
|
|
ZEBRA is a plug-and-play retrieval augmentation framework for **Commonsense Question Answering**. \ |
|
It is composed of three pipeline stages: *example retrieval*, *knowledge generation* and *informed reasoning*. |
|
|
|
- Example retrieval: given a question, we retrieve relevant examples of question-knowledge pairs from a large collection. |
|
- Knowledge generation: we prompt an LLM to generate useful explanations for the given input question by leveraging the relationships in the retrieved question-knowledge pairs. |
|
- Informed reasoning: we prompt the same LLM for the question answering task by taking advantage of the previously generated explanations. |
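
The control flow of the three stages can be pictured with a toy sketch. Nothing below is ZEBRA's actual implementation: the word-overlap retriever, the `fake_llm` stub, the tiny `KB`, the prompt wording, and all function names are hypothetical stand-ins, chosen only so the example runs without downloading a model.

```python
from typing import Callable, List, Tuple

# Hypothetical knowledge base of (question, knowledge) pairs.
KB: List[Tuple[str, str]] = [
    ("Where do you keep milk?", "Milk spoils unless it is kept cold."),
    ("What do you use to cut paper?", "Scissors are designed to cut thin materials."),
    ("Where would you store ice cream?", "Frozen food must stay below freezing."),
]


def tokens(text: str) -> set:
    """Lowercase a sentence and split it into punctuation-free words."""
    return {w.strip("?.,!").lower() for w in text.split()}


def retrieve_examples(question: str, kb: List[Tuple[str, str]], k: int = 2):
    """Stage 1: rank KB pairs by word overlap with the question.

    Word overlap is an assumption made for brevity; it stands in for a
    learned retriever.
    """
    q = tokens(question)
    ranked = sorted(kb, key=lambda pair: len(q & tokens(pair[0])), reverse=True)
    return ranked[:k]


def generate_knowledge(question: str, choices: List[str],
                       examples: List[Tuple[str, str]],
                       llm: Callable[[str], str]) -> str:
    """Stage 2: use the retrieved pairs as demonstrations in the prompt."""
    demos = "\n".join(f"Q: {q}\nKnowledge: {kn}" for q, kn in examples)
    prompt = f"{demos}\nQ: {question}\nChoices: {', '.join(choices)}\nKnowledge:"
    return llm(prompt)


def informed_reasoning(question: str, choices: List[str], knowledge: str,
                       llm: Callable[[str], str]) -> str:
    """Stage 3: ask the same LLM to answer, given the generated knowledge."""
    prompt = f"Knowledge: {knowledge}\nQ: {question}\nChoices: {', '.join(choices)}\nAnswer:"
    return llm(prompt)


def fake_llm(prompt: str) -> str:
    """Stub LLM with canned replies so the sketch is deterministic."""
    if prompt.rstrip().endswith("Answer:"):
        return "In the fridge."
    return "Yogurt is a dairy product that spoils unless kept cold."


question = "Where do you keep yogurt?"
choices = ["In the fridge.", "In the oven.", "On the shelf."]

examples = retrieve_examples(question, KB)
knowledge = generate_knowledge(question, choices, examples, fake_llm)
answer = informed_reasoning(question, choices, knowledge, fake_llm)
print(answer)
```

The point of the sketch is the data flow: retrieved pairs feed the knowledge prompt, and the generated knowledge feeds the answering prompt sent to the same model.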
|
|
|
Here is an example of how to use ZEBRA for question answering: |
|
|
|
```python |
|
from zebra import Zebra |
|
|
|
# Load Zebra with language model, retriever, document index and explanations. |
|
zebra = Zebra( |
|
model="meta-llama/Meta-Llama-3-8B-Instruct", |
|
retriever="sapienzanlp/zebra-retriever-e5-base-v2", |
|
document_index="sapienzanlp/zebra-kb" |
|
) |
|
|
|
# Provide a question and answer choices. |
|
questions = [ |
|
"What should you do if you see someone hurt and in need of help?", |
|
"If your friend is upset, what is the best way to support them?", |
|
"What should you do if your phone battery is running low in a public place?", |
|
"What should you do if you are running late for an important meeting?", |
|
] |
|
|
|
choices = [ |
|
["Walk away.", "Call for help.", "Take a photo for social media."], |
|
["Listen to them and offer comfort.", "Tell them they are overreacting.", "Ignore them and walk away."], |
|
["Borrow a stranger's phone.", "Use public charging station.", "Leave your phone unattended while it charges."], |
|
["Rush through traffic.", "Call and inform them you will be late.", "Do not show up at all."], |
|
] |
|
|
|
# Generate knowledge and perform question answering. |
|
zebra_output = zebra.pipeline(questions=questions, choices=choices) |
|
``` |
|
|
|
The output contains, for each question, a list of generated explanations and the predicted answer: |
|
|
|
```python |
|
ZebraOutput( |
|
explanations=[ |
|
[ |
|
"Walking away would be neglecting the person's need for help and potentially putting them in danger.", |
|
'Calling for help, such as 911, is the most effective way to get the person the assistance they need.', |
|
"Taking a photo for social media might spread awareness, but it's not a direct way to help the person in need." |
|
], |
|
[ |
|
'Listening and offering comfort shows empathy and understanding.', |
|
"Telling someone they're overreacting can be dismissive and unhelpful.", |
|
'Ignoring someone in distress can be hurtful and unkind.' |
|
], |
|
[ |
|
"Borrow a stranger's phone: Unwise, as it's a security risk and may lead to theft or damage.", |
|
"Use public charging station: Safe and convenient, as it's a designated charging area.", |
|
'Leave your phone unattended while it charges: Not recommended, as it may be stolen or damaged.' |
|
], |
|
[ |
|
'Rush through traffic: This option is risky and may lead to accidents or stress.', |
|
'Call and inform them you will be late: This is the most likely option, as it shows respect for the meeting and allows for adjustments.', |
|
'Do not show up at all: This is unacceptable, as it shows disrespect for the meeting and may damage relationships.' |
|
], |
|
], |
|
answers=[ |
|
"Call for help.", |
|
"Listen to them and offer comfort.", |
|
"Use public charging station.", |
|
"Call and inform them you will be late." |
|
], |
|
) |
|
``` |
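
One way to consume this output is to line up each question with its explanations and predicted answer. The sketch below uses a hypothetical stand-in dataclass, `FakeZebraOutput`, that mirrors the two fields shown above so that it runs without loading a model. It assumes the real output object exposes `explanations` and `answers` as attributes, which the printed representation suggests but this README does not state explicitly.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FakeZebraOutput:
    """Hypothetical stand-in mirroring the fields printed above."""
    explanations: List[List[str]]
    answers: List[str]


def format_report(questions: List[str], output) -> str:
    """Build a readable per-question report from parallel lists."""
    blocks = []
    for question, expls, answer in zip(questions, output.explanations, output.answers):
        lines = [f"Q: {question}"]
        lines += [f"  - {e}" for e in expls]
        lines.append(f"A: {answer}")
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks)


demo = FakeZebraOutput(
    explanations=[[
        "Calling for help, such as 911, is the most effective way "
        "to get the person the assistance they need."
    ]],
    answers=["Call for help."],
)
report = format_report(
    ["What should you do if you see someone hurt and in need of help?"], demo
)
print(report)
```

If the attribute assumption holds, passing `zebra_output` in place of `demo` should work the same way.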
|
|
|
You can also call `zebra.pipeline` with `return_dict=True` to have ZEBRA additionally return the retrieved examples along with their explanations. |
|
|
|
## Models and Data |
|
|
|
Models and data can be found in the following [Hugging Face Collection 🤗](https://huggingface.co/collections/sapienzanlp/zebra-66e3ec50c8ce415ea7572d0e). |
|
|
|
## 📊 Performance |
|
|
|
We evaluate ZEBRA on eight well-established commonsense question answering datasets. The following table reports the accuracy of each model without / with ZEBRA; the better score in each cell is in bold. |
|
|
|
| Model | CSQA | ARC-C | ARC-E | OBQA | PIQA | QASC | CSQA2 | WG | AVG | |
|
| ------------------------ | --------------- | --------------- | --------------- | --------------- | --------------- | --------------- | --------------- | --------------- | --------------- | |
|
| Mistral-7B-Instruct-v0.2 | 68.2 / **73.3** | 72.4 / **75.2** | 85.8 / **87.4** | 68.8 / **75.8** | 76.1 / **80.2** | 66.1 / **68.3** | 58.5 / **67.5** | 55.8 / **60.7** | 68.9 / **73.5** | |
|
| Phi3-small-8k-Instruct | 77.2 / **80.9** | 90.4 / **91.6** | 96.9 / **97.7** | 90.4 / **91.2** | 86.6 / **88.1** | **83.5** / 81.0 | 68.0 / **74.6** | 79.1 / **81.0** | 84.0 / **85.8** | |
|
| Meta-Llama-3-8b-Instruct | 73.9 / **78.7** | 79.4 / **83.5** | 91.7 / **92.9** | 73.4 / **79.6** | 78.3 / **84.0** | 78.2 / **79.1** | 64.3 / **69.4** | 56.2 / **63.2** | 74.4 / **78.8** | |
|
| Phi3-mini-128k-Instruct | 73.4 / **74.8** | 85.7 / **88.0** | 95.4 / **96.0** | 82.8 / **87.8** | 80.4 / **84.2** | **74.7** / 73.9 | 59.3 / **64.6** | 67.3 / **72.9** | 77.4 / **80.5** | |
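
To read the table as gains rather than raw scores, the improvement in average accuracy can be computed directly from the AVG column. The snippet below is plain arithmetic on the numbers above; the variable names are illustrative only.

```python
# (without ZEBRA, with ZEBRA) average accuracy, copied from the AVG column.
avg_accuracy = {
    "Mistral-7B-Instruct-v0.2": (68.9, 73.5),
    "Phi3-small-8k-Instruct": (84.0, 85.8),
    "Meta-Llama-3-8b-Instruct": (74.4, 78.8),
    "Phi3-mini-128k-Instruct": (77.4, 80.5),
}

# Round to one decimal to match the table's precision and hide float noise.
gains = {
    model: round(after - before, 1)
    for model, (before, after) in avg_accuracy.items()
}

for model, gain in gains.items():
    print(f"{model}: +{gain} points")
```

On these figures the average gain ranges from 1.8 points (Phi3-small-8k-Instruct) to 4.6 points (Mistral-7B-Instruct-v0.2).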
|
|
|
You can also download the official paper results from this [Google Drive link](https://drive.google.com/file/d/1l7bY-TkqnmVQn5M5ynQfT-0upMcRlMnT/view?usp=drive_link). |
|
|
|
## Cite this work |
|
|
|
If you use any part of this work, please consider citing the paper as follows: |
|
|
|
```bibtex |
|
@inproceedings{molfese-etal-2024-zebra, |
|
title = "{ZEBRA}: Zero-Shot Example-Based Retrieval Augmentation for Commonsense Question Answering", |
|
author = "Molfese, Francesco Maria and |
|
Conia, Simone and |
|
Orlando, Riccardo and |
|
Navigli, Roberto", |
|
editor = "Al-Onaizan, Yaser and |
|
Bansal, Mohit and |
|
Chen, Yun-Nung", |
|
booktitle = "Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing", |
|
month = nov, |
|
year = "2024", |
|
address = "Miami, Florida, USA", |
|
publisher = "Association for Computational Linguistics", |
|
url = "https://aclanthology.org/2024.emnlp-main.1251", |
|
doi = "10.18653/v1/2024.emnlp-main.1251", |
|
pages = "22429--22444" |
|
} |
|
``` |
|
|
|
## 🪪 License |
|
|
|
The data and software are licensed under [Creative Commons Attribution-NonCommercial-ShareAlike 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/). |
|
|
|
## Acknowledgements |
|
We gratefully acknowledge CREATIVE (CRoss-modal understanding and gEnerATIon of Visual and tExtual content) for supporting this work. Simone Conia gratefully acknowledges the support of Future AI Research ([PNRR MUR project PE0000013-FAIR](https://fondazione-fair.it/en/)), which has fully funded his fellowship at Sapienza University of Rome since October 2023.