---
title: Rquge
emoji: 🏢
colorFrom: gray
colorTo: blue
sdk: gradio
sdk_version: 3.34.0
app_file: app.py
pinned: false
---
# Metric Card for RQUGE Score
## Metric Description
RQUGE is an evaluation metric for assessing the quality of generated questions. It evaluates a candidate question without the need to compare it to a reference question. Given the relevant context and answer span, it applies a general question-answering module followed by a span-scoring mechanism to produce an acceptability score.
## How to Use
The RQUGE score takes three main inputs: `generated_questions` (list of generated questions), `contexts` (list of related contexts), and `answers` (list of reference answers). Additionally, `qa_model` and `sp_model` provide the paths to the question-answering and span-scorer modules, and `device` is an optional input.
```python
from evaluate import load

rqugescore = load("alirezamsh/rquge")
generated_questions = ["how is the weather?"]
contexts = ["the weather is sunny"]
answers = ["sunny"]

results = rqugescore.compute(generated_questions=generated_questions, contexts=contexts, answers=answers)
print(results["mean_score"])
>>> [5.05]
```
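If you want to use your own question-answering and span-scorer checkpoints, the optional arguments can be passed explicitly. The snippet below is a minimal sketch; the model paths and the `cuda:0` device string are placeholders, not documented defaults.

```python
from evaluate import load

rqugescore = load("alirezamsh/rquge")

# Placeholder paths; substitute the QA and span-scorer checkpoints you actually use.
results = rqugescore.compute(
    generated_questions=["how is the weather?"],
    contexts=["the weather is sunny"],
    answers=["sunny"],
    qa_model="path/to/qa_model",     # question-answering module
    sp_model="path/to/span_scorer",  # span-scoring module
    device="cuda:0",                 # optional (assumption: omit to run on CPU)
)
print(results["mean_score"])
```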
## Output Values
The RQUGE score outputs a dictionary with the following values:

`mean_score`: the average RQUGE score over the input texts, ranging from 1 to 5

`instance_score`: the individual RQUGE score of each instance in the input, ranging from 1 to 5
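As a quick illustration of the two fields, the following sketch pairs each generated question with its per-instance score, assuming the `generated_questions` list and `results` dictionary from the example above.

```python
# Assumes `generated_questions` and `results` from the example above.
for question, score in zip(generated_questions, results["instance_score"]):
    print(f"{question}: {score}")

print("mean:", results["mean_score"])
```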
## Citation
```bibtex
@misc{mohammadshahi2022rquge,
  title={RQUGE: Reference-Free Metric for Evaluating Question Generation by Answering the Question},
  author={Alireza Mohammadshahi and Thomas Scialom and Majid Yazdani and Pouya Yanki and Angela Fan and James Henderson and Marzieh Saeidi},
  year={2022},
  eprint={2211.01482},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
```