metadata

title: Rquge
emoji: 🏢
colorFrom: gray
colorTo: blue
sdk: gradio
sdk_version: 3.34.0
app_file: app.py
pinned: false

Metric Card for RQUGE Score

Metric Description

RQUGE is a reference-free evaluation metric for assessing the quality of generated questions: it scores a candidate question without comparing it to a reference question. Given the relevant context and answer span, it uses a general question-answering module followed by a span scorer to produce an acceptability score.

How to Use

RQUGE score takes three main inputs: "generated_questions" (a list of generated questions), "contexts" (a list of related contexts), and "answers" (a list of reference answers). The optional "qa_model" and "sp_model" arguments provide the paths to the QA and span scorer modules. "device" is also an optional input.

from evaluate import load
rqugescore = load("alirezamsh/rquge")
generated_questions = ["how is the weather?"]
contexts = ["the weather is sunny"]
answers = ["sunny"]
results = rqugescore.compute(generated_questions=generated_questions, contexts=contexts, answers=answers)
print(results["mean_score"])
>>> [5.05]
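
Since the three main inputs are parallel lists, it can help to check that they line up before calling the metric. The following is a minimal sketch: `build_rquge_inputs` is a hypothetical helper that is not part of the metric, and the equal-length requirement is an assumption inferred from the input description above.

```python
def build_rquge_inputs(generated_questions, contexts, answers):
    """Hypothetical helper: check the parallel lists and pack them as keyword arguments."""
    lengths = {len(generated_questions), len(contexts), len(answers)}
    # Assumption: item i of each list describes the same instance,
    # so all three lists must have the same length.
    if len(lengths) != 1:
        raise ValueError(
            "generated_questions, contexts and answers must have the same length"
        )
    return {
        "generated_questions": list(generated_questions),
        "contexts": list(contexts),
        "answers": list(answers),
    }


inputs = build_rquge_inputs(
    ["how is the weather?"],
    ["the weather is sunny"],
    ["sunny"],
)
# The packed arguments could then be passed on to the loaded metric:
# results = rqugescore.compute(**inputs)
```

A mismatch, such as two questions with only one context, raises a `ValueError` before any model is loaded.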

Output Values

RQUGE score outputs a dictionary with the following values:

mean_score: The average RQUGE score over the input texts, ranging from 1 to 5

instance_score: Individual RQUGE score of each instance in the input, ranging from 1 to 5
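
As a sketch of how the two output values might be read together, the snippet below works on a hand-written placeholder dictionary rather than real metric output. It assumes that `instance_score` is a list of floats and that `mean_score` is their arithmetic mean; `lowest_scoring` is a hypothetical helper introduced only for illustration.

```python
from statistics import mean

# Placeholder values for illustration only; these were not produced by the metric.
results = {"mean_score": 3.5, "instance_score": [4.0, 3.0]}


def lowest_scoring(instance_scores, k=1):
    """Hypothetical helper: return the indices of the k lowest-scoring instances."""
    order = sorted(range(len(instance_scores)), key=lambda i: instance_scores[i])
    return order[:k]


# Assumption: the mean score is the arithmetic mean of the per-instance scores.
recomputed_mean = mean(results["instance_score"])

# Indices of the questions that may be worth inspecting by hand.
worst = lowest_scoring(results["instance_score"], k=1)
```

Looking at the lowest-scoring instances is often more informative than the mean alone when only a few generated questions are poor.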

Citation

@misc{mohammadshahi2022rquge,
    title={RQUGE: Reference-Free Metric for Evaluating Question Generation by Answering the Question},
    author={Alireza Mohammadshahi and Thomas Scialom and Majid Yazdani and Pouya Yanki and Angela Fan and James Henderson and Marzieh Saeidi},
    year={2022},
    eprint={2211.01482},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}